DISTINCT does not work with ORDER BY in some queries #1287
A fix for issue #408.
Only plain selects are fixed,
I don't know what modern SQL standards say about queries like
Also such support for unrelated columns was requested too in the comments.
MySQL is not good guidance for reasonable SQL functionality :) I wouldn't be surprised if MySQL just ordered rows randomly if the
The SQL standard is quite wordy about this part. I'm looking at ISO/IEC 9075-2:2016(E), 7.17 <query expression>, Syntax Rules 28) d) i) 9) B) II), which says
Which makes total sense when you think about it. Assuming this data set:
-- It is easy to see how the ordering is defined on this query. It will be 2, 1, 1, 2
SELECT A ORDER BY B
-- It is impossible to see how the ordering is defined on this query. Is it 1, 2 or 2, 1?
SELECT DISTINCT A ORDER BY B
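The ambiguity can be reproduced outside a database. The following is a minimal Python sketch, not H2 code. The rows are a hypothetical data set, chosen only so that the non-distinct query yields 2, 1, 1, 2 as stated in the comment above; the function names are illustrative as well.

```python
# Hypothetical (A, B) rows; assumed for illustration, not taken from the thread.
rows = [(2, 1), (1, 2), (1, 3), (2, 4)]


def select_a_order_by_b(rows):
    """Emulate SELECT A ... ORDER BY B: sort on B, then project A."""
    return [a for a, _b in sorted(rows, key=lambda r: r[1])]


def sort_key_candidates(rows):
    """For each distinct A, collect every B value that could serve as its sort key."""
    candidates = {}
    for a, b in rows:
        candidates.setdefault(a, []).append(b)
    return candidates


plain = select_a_order_by_b(rows)
candidates = sort_key_candidates(rows)

# After DISTINCT, each A value stands for several rows, so there is no single B
# to sort by. Two equally defensible choices give opposite orderings.
order_using_min_b = sorted(candidates, key=lambda a: min(candidates[a]))
order_using_max_b = sorted(candidates, key=lambda a: max(candidates[a]))

print(plain)              # [2, 1, 1, 2]
print(order_using_min_b)  # [2, 1]
print(order_using_max_b)  # [1, 2]
```

Neither of the last two results is more correct than the other, which is one way to see why a sort key that is not derived from the select list is rejected for `DISTINCT` queries.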
I would definitely go with the SQL standard / PostgreSQL here. Optionally, support PostgreSQL's
Yes, to my understanding,
While the SQL standard is extremely hard to read, I think this article I wrote some time ago might explain it better:
logically, the order of operations in
When looking at
This shows that
But without this change such valid queries were rejected too.
It's not easy to check whether an expression uses only valid columns, because of how such selects are implemented internally in H2.
H2 builds a complete list of expressions. Some of them are the select expressions (which may or may not be treated as distinct); others are used only for sorting and take no part in distinct operations. The expressions are not linked to each other, and the intermediate result holds only the evaluated values in its rows.
I guess that
SELECT UPPER(A) FROM TEST ORDER BY LOWER(A)
should be accepted too?
Should be accepted, because without
Heh. That's what the MySQL folks probably thought as well. Now everyone criticises them for that, and yet, they cannot get rid of it easily because no one can turn on strict mode on a legacy database :-)
Actually, we can enable such unrestricted queries only in MySQL mode. But I also want to fix the problem with valid queries that are currently not accepted in regular mode. (They are not accepted by PostgreSQL either, but they are accepted by Oracle, MySQL, and maybe some others.)
No, this shouldn't be valid. The rule is always the same. The expression
Did you mean that in
SELECT DISTINCT A FROM TEST ORDER BY LOWER(A)
Or query like
SELECT DISTINCT A, B FROM TEST ORDER BY (A + B)
should be valid and query
SELECT DISTINCT (A - B) FROM TEST ORDER BY (A + B)
should not be valid?
Let's look again at:
Try it with example data if formalism doesn't help :)
What would be the expected result of this query? Clearly,
Now, at this point, what does
Yes that works, because the expression in
No. Try again with data:
How would you want to order these 2 rows by
Thank you for the detailed explanation!
To perform correct check for
In MySQL mode this additional check should be skipped.
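One way to read the rule discussed in this thread is that, for a `DISTINCT` query, every `ORDER BY` expression must either be one of the select expressions or be built entirely from them. The sketch below illustrates that reading in Python under simplifying assumptions; it is not H2's implementation. The tuple-based expression encoding, the helper names, and the `mysql_mode` flag are all hypothetical.

```python
# Hypothetical expression encoding: ("col", name), ("const", value),
# or ("call", operator_or_function_name, arg1, arg2, ...).
def col(name):
    return ("col", name)


def call(name, *args):
    return ("call", name) + args


def is_built_from(expr, select_list):
    """True if expr is a select expression or is composed only of select expressions."""
    if expr in select_list:
        return True
    kind = expr[0]
    if kind == "col":
        return False  # a bare column that DISTINCT did not keep
    if kind == "const":
        return True
    return all(is_built_from(arg, select_list) for arg in expr[2:])


def order_by_allowed(select_list, order_by, distinct, mysql_mode=False):
    """Apply the extra check only to DISTINCT queries outside the assumed MySQL mode."""
    if not distinct or mysql_mode:
        return True
    return all(is_built_from(e, select_list) for e in order_by)


A, B = col("A"), col("B")

# SELECT DISTINCT A FROM TEST ORDER BY LOWER(A)
distinct_a_by_lower = order_by_allowed([A], [call("LOWER", A)], distinct=True)
# SELECT DISTINCT A, B FROM TEST ORDER BY (A + B)
distinct_ab_by_sum = order_by_allowed([A, B], [call("+", A, B)], distinct=True)
# SELECT DISTINCT (A - B) FROM TEST ORDER BY (A + B)
distinct_diff_by_sum = order_by_allowed([call("-", A, B)], [call("+", A, B)], distinct=True)
# SELECT UPPER(A) FROM TEST ORDER BY LOWER(A), no DISTINCT involved
plain_upper_by_lower = order_by_allowed([call("UPPER", A)], [call("LOWER", A)], distinct=False)
# Same query as the third one, with the check skipped
mysql_diff_by_sum = order_by_allowed(
    [call("-", A, B)], [call("+", A, B)], distinct=True, mysql_mode=True
)
```

Under these assumptions the first two queries pass, the third is rejected because `A` and `B` are no longer available once only `A - B` is kept, and the last two pass because the check does not apply.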