Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.group and .having of little usefulness with PostgreSQL #1515

Closed
jaroslawr opened this issue Jun 6, 2011 · 7 comments
Closed

.group and .having of little usefulness with PostgreSQL #1515

jaroslawr opened this issue Jun 6, 2011 · 7 comments

Comments

@jaroslawr
Copy link
Contributor

When using MySQL, .group and .having can be used to accomplish the very common task of finding records from SQL table table_a by specifying some conditions on some tables joining table_a, i.e. table_b, table_c etc. So, if I want to find all the users that have paid more then four times, I can do:

User.joins(:payments).group('users.id').having('count(payments.id) > 4')

This however doesn't work with PostgreSQL, which will in this case throw an error message similar to this one:

ActiveRecord::StatementInvalid: PGError: ERROR: column "users.email" must 
appear in the GROUP BY clause or be used in an aggregate function

This is because in PostgreSQL the columns that appear in the FROM clause have to be the same ones that appear in the GROUP by clause. Correct me if I'm wrong, but from what I see you can fix it only by using .select('users.id,users.email,..."), but in this case you either have to manually list all the columns or you get crippled objects returned with only some attributes present. This forces PostgreSQL users to either come up with some ugly workarounds or to use find_by_sql for any but the simplest queries concerning only one table. I don't know what's the most "Railsy" way of fixing it, but I think it would at least be helpful to have a method that would return all the column names qualified by the table name, like this:

module ActiveRecord
  class Base
    def all_columns
      column_names.collect { |c| "#{table_name}.#{c}" }.join(",")
    end
  end
end

That would make .group and .having usable in PostgreSQL like this:

User.joins(:payments).group(User.all_columns).having('count(payments.id) > 4')

In the end I think it would be great if Rails provided some shorthand notation for doing the same thing, like this:

User.joins(:payments).group(:users).having('count(payments.id) > 4')

Edit:

Maybe .group with no arguments could do just that?

jake3030 pushed a commit to jake3030/rails that referenced this issue Jun 28, 2011
… has been sent to an association [rails#1515 state:resolved]

Signed-off-by: Joshua Peek <josh@joshpeek.com>
@willbryant
Copy link
Contributor

This is really just postgresql's behavior though. You can and should list all the non-aggregate columns in the group call as that is how they define group as working in current versions of postgresql.

Your patch would ease the pain here, but this is being changed in postgresql itself in 9.2 - see http://developer.postgresql.org/pgdocs/postgres/sql-select.html:

When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or if the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.

Which in plain english means that if you group by the users.id, you'll also be grouping by the other users columns and won't need to list them all.

Will that make the issue moot?

@jaroslawr
Copy link
Contributor Author

I know this is supposed to be fixed in PostgreSQL in the future, but lots of people already use Rails with PostgreSQL 8.x and they're not all going to upgrade anytime soon to PostgreSQL 9.2. Of course one can just live with this, but this issue makes ActiveRecord crippled for all those people, as it makes any queries but the simplest one very cubersome to build, so I still think it would be a good idea to fix this in some way.

@willbryant
Copy link
Contributor

Yeah, it'd definitely be a useful plugin.

@carlosantoniodasilva
Copy link
Member

I have to agree that this is how postgresql works. Given it's been a goal of postgresql to support such queries, I think it'd not be Rails concern to handle that. This might be a good place for a plugin indeed, if that's still required. Thanks!

@russ
Copy link
Contributor

russ commented May 14, 2013

I would like to add my comments to this. It sounds like postgresql was going to make this change but still has yet to do it. Every Rails project that I've worked on with postgresql runs into this issue. If you look for a solution on how to work around it, you find the same answers all over StackOverflow. I would like to see something like @jaroslawr mentioned above.

User.joins(:payments).group(:users).having('count(payments.id) > 4')

That solution would make the headache a lot worse and the code quite a bit cleaner. I'm willing to implement said code, but I wanted to receive feedback before diving into that. Or hopefully, I somehow missed the memo on where postgresql supports these types of queries.

@jaroslawr
Copy link
Contributor Author

With recent PostgreSQL versions it's possible to do User.select('users.*').joins(:payments).group('users.id').having('count(payments.id) > 4')

@fatuhoku
Copy link

Run into this myself. Can't really group at all

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants