Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Incorrect SQL generated #530

Closed
ketema opened this Issue Jun 7, 2012 · 15 comments

Comments

Projects
None yet
4 participants

ketema commented Jun 7, 2012

When issuing a Model::find with the following options:
array(
'with' => array( ... )
'conditions' => array( ... )
'order' => 'Model.column asc'
'offset' => string '0'
'limit' => string '10'
)

The generated SQL contains an un-neccessary GROUP BY clause that incorrectly does not include the order by field. The group by should either not be present, or all fields from the ORDER BY clause should be included in the group by.

ketema commented Jun 7, 2012

Discussion in irc has lead me to believe this is specific to the Pg adapter as the generated SQL works for MySQL(according to irc conversation). As such trying to track down how to override this correctly in the Pg adapter.

Member

Howard3 commented Jun 8, 2012

Hello ketema, is the 'with' specifying a relationship that is "hasMany"?

Contributor

jails commented Jun 8, 2012

The extra query with the GROUP BY is necessary to make the 'limit' option works.
But indeed, the GROUP BY query fail with the PostgreSQL adapter because the way MySQL manage the GROUP BY clause seems to be unconventional.

Fail with PostgreSQL (but works with MySQL):

SELECT id FROM table GROUP BY id ORDER BY column;

The valid syntax on PostgreSQL seems to be :

SELECT id FROM table GROUP BY id, column ORDER BY column;

Same for the following which works only with MySQL :

SELECT id, column FROM table GROUP BY id;

The valid syntax with PostgreSQL

SELECT id, column FROM table GROUP BY id, column;

With PostgreSQL any column on which there's no aggregate function (i.e. COUNT, MIN, MAX, etc.) should appear in the GROUP BY clause.

Since I don't use PostgreSQL, a PostgreSQL expert is welcome to validate the right syntax on this DBMS.

ketema commented Jun 9, 2012

The GROUP BY clause in Postgresql is defined as: "GROUP BY will condense into a single row all selected rows that share the same values for the grouped expressions. expression can be an input column name, or the name or ordinal number of an output column (SELECT list item), or an arbitrary expression formed from input-column values. In case of ambiguity, a GROUP BY name will be interpreted as an input-column name rather than an output column name.

Aggregate functions, if any are used, are computed across all rows making up each group, producing a separate value for each group (whereas without GROUP BY, an aggregate produces a single value computed across all the selected rows). When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or if the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column."

I was able to solve my particular issue simply by overriding Database::read in the Pg adapter and in the sub query section eliminate the group by clause. This is a short sighted solution. A better one would have been to ensure that the group by generation took into account the full select list and order by column list. An even better approach would be to determine if a group by is even needed, since it is inefficient to include it if you are not performing any sort of aggregation.

Contributor

jails commented Jun 9, 2012

Well the problem here is that adding "fields" in the GROUP BY clause to make it syntactically valid for "PostgreSQL" will not produce a workable result for the extra query (which aims to gather only unique IDs for the main query for making the 'limit' option works has expected).

I come to the conclusion that the GROUP BY clause in PostgreSQL can't be used the way we need to use it for the extra query.

Imo the following workaround, whithout GROUP BY, would work for all drivers :

SELECT DISCTINCT(table.id) FROM table LEFT JOIN ... WHERE ... ORDER BY ...

ketema commented Jun 11, 2012

I agree. If all that sub query is for is to get the unique ids for the limit then the DISTINCT option is a better solution.

Member

marcghorayeb commented Jun 11, 2012

Furthermore, the subquery is made even if a find is called without any 'with' demands. This shouldn't be done i think.

Contributor

jails commented Jun 13, 2012

Well the real problem is that DISCTINCT doesn't seem to work the same way on PostgreSQL either...

ketema commented Jun 13, 2012

Can you clarify? What was the SQL your patch outputted?

Contributor

jails commented Jun 13, 2012

It's an attempt to replace the GROUP BY clause to a DISTINCT clause but i suddenly realize that the DISTINCT clause doesn't exists on PostgreSQL... should be a DISTINCT ON clause. So I close the "outputed" PR because if this alternative works on MySQL, it won't work on PostgreSQL neither.

ketema commented Jun 15, 2012

The ON part of the Pg distinct clause is optional. http://www.postgresql.org/docs/current/static/sql-select.html
It is used for when you have an expression as the distinct criteria. In this Use case you would just have the bare keyword DISTINCT and the column name.

Contributor

jails commented Jun 15, 2012

Ok so the patch may work, Maybe you can test it before I reopen it ?

ketema commented Jun 15, 2012

sure, update your patch above or post a new one and I'll pull into a local copy of li3 and give it a shot.

Contributor

jails commented Jun 15, 2012

This is my fork : https://github.com/jails/lithium.git, the patch is on the branch pr23. You can checkout the whole branch or cherry-pick the last commit from this branch to your local copy.

Contributor

jails commented Dec 12, 2012

This should have been solved in #705. Feel free to reopen if this issue is not completly solved.

@jails jails closed this Dec 12, 2012

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment