Skip to content


Subversion checkout URL

You can clone with
Download ZIP


#count doesn't honor distinct attributes in select clause #5554

mtalcott opened this Issue · 15 comments

7 participants


Calling count on an ActiveRecord::Relation with more than 1 distinct attribute in the select clause doesn't honor the select conditions. For example, this works:'distinct username').all    #=> SELECT distinct username FROM "users"'distinct username').count  #=> SELECT COUNT(distinct username) FROM "users"

However, the distinct clause disappears from the count SQL when another distinct attribute is specified:'distinct username, email').all     #=> SELECT distinct username, email FROM "users"'distinct username, email').count   #=> SELECT count(*) FROM "users"

Same thing with distinct users.*:'distinct users.*').all     #=> SELECT distinct users.* FROM "users"'distinct users.*').count   #=> SELECT count(*) FROM "users"

That means for a relation with multiple distinct columns, relation.count is not necessarily the same as relation.all.count.

For distinct users.*, count(:distinct => true) works (it generates SELECT COUNT(DISTINCT "users"."id") FROM "users"), but that won't work for distinct username, email.

I've verified this behavior on Postgres and SQLite.


FYI, as a workaround you can use the :select option to replace the "*" with the columns, &c. of your choosing.

User.count(select: 'distinct users.*') #=> SELECT count(distinct users.*) FROM "users"

I'm seeing the same behavior with that workaround as well:

User.count(select: 'distinct users.*') #=> SELECT COUNT(*) FROM "users"


Ah you're right. Unfortunately I don't have a record of what I originally ran that set me astray.


It seems Kaminari is working around this bug, with massive performance consequences for us. I've written a post describing the issue here:

I've replicated the issue here:

It seems that #sum is also broken by DISTINCT, as are probably the #min, #max, #avg and other calculations. This could have dire consequences for anyone generating mission-critical reports and wondering why the numbers don't come out right. Heck, it'd be nicer if AR just raised a "too hard" exception, rather than give incorrect results.

BTW, this may be related to #1003.


Nice writeup of the issue and performance in different cases. It seems that for now, if you need an accurate count/sum/etc. and there is a distinct clause on the query, you have to instantiate the AR object array. Not very performant, but that's better than getting inaccurate results.

1003 is definitely related.


Hi, i think this is expected as you see here.

If you remove that validation, you will get:

>>'distinct username, email').count
   (0.2ms)  SELECT COUNT(distinct username, email) FROM "users" 
SQLite3::SQLException: wrong number of arguments to function COUNT(): SELECT COUNT(distinct username, email) FROM "users" 

>>'distinct users.*').count
   (0.2ms)  SELECT COUNT(distinct users.*) FROM "users" 
SQLite3::SQLException: near "*": syntax error: SELECT COUNT(distinct users.*) FROM "users" 

/cc @carlosantoniodasilva @rafaelfranca I'm not sure about this, but i'd prefer to raise an error to avoid this kind of issues.


Something like:

def select_for_count
  if select_values.present?
    select = select_values.join(", ")
    raise IDontFigureOutAnErrorForThisYet if select = ~/[,*]/

What do you think?


Yes, raising an exception is much more preferable to returning an incorrect result.

@frodsan frodsan referenced this issue from a commit
Francesco Rodriguez fix calculation for a relation with multiple selected columns
Before this patch, `count` returns an incorrect result when is
executed with multiple columns:'distinct username, email').count
    # => SELECT count(*) FROM "users"'distinct users.*').count
    # => SELECT count(*) FROM "users"

It proposes to raise an ActiveRecordError instead.

    >>'distinct username, email').count
    # => ActiveRecord::ActiveRecordError: calculation of multiple columns is not
    #    supported

    >>'distinct users.*').count
    # => ActiveRecord::ActiveRecordError: calculation of multiple columns is not
    #    supported

Fixes #5554.

Related with #6865

@anthonyalberto anthonyalberto referenced this issue in mislav/will_paginate

has_many association with uniq #251


Is this still happening? I might have fixed this by working on a different issue. Let me know if it's still a problem on master and I'll take a look.


I've updated to work with rails master, and the tests still fail, so would be great if you can fix this @senny.

You should also note that this bug affects everything in ActiveRecord::Calculations (so sum, average, minimum, maximum and so on, as well as count).


I'll take a look


not every database supports counting on multiple distinct columns. The fallback is to use subqueries.

The code that silently falls back to count(*) feels hacky at best. I think we should let the user know what's happening and if a count can't be performed.


@mtalcott, would you mind to show an example of an AR object array that would work around this issue?

Following is generating correct count for me:
User.find(:all, :select => "DISTINCT username, email").count

However, since I need this in a model scope it wont work since this returns an array instead of an AR relation.

@senny senny referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.

I submitted a PR to solve this issue: #10710

@jonleighton jonleighton closed this in #10710
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.