-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-11431: [Rust][DataFusion] Support the HAVING clause. #9364
Conversation
This is looking good @drusso ! I think it can use some tests with example (tabular) input as well? To make sure the results are as expected. There are some more end to end tests in the tests directory. |
Thanks! I will definitely add some more tests. There are also a couple of clippy errors for me to address. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I quickly skimmed this PR @drusso and it is looking really cool! Thank you very much! Please ping me when you think it is ready for review and I will give it a close read.
@alamb @Dandandan I've updated the PR, let me know if there's anything outstanding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through this PR carefully, understood all the changes and have no comments. Great work, @drusso 💯
I think that we should add it to the README as part of the SQL features. :)
I plan to review this carefully tomorrow |
Thanks @jorgecarleitao and @alamb! I added a note in the README. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is beautiful @drusso -- I think it should have one more test (for a query with both HAVING and WHERE) and get rebased against master and it is ready to merge. 🥇
Thanks @drusso ! |
I am just waiting for the CI to finish on this PR and then I plan to merge it! |
Looks like github is operating in degraded fashion: https://www.githubstatus.com/ |
All the rust tests are passing. Merging this in even though Travis is still running. |
🎉 |
🎉 |
🚀 |
This commit adds support for the SQL `HAVING` clause. For example, the following queries are supported: * Filtering on an aggregate: ``` SELECT c1, MAX(c2) FROM t GROUP BY c1 HAVING MAX(c2) > 100; ``` * Filtering on an aliased aggregate: ``` SELECT c1, MAX(c2) AS m FROM t GROUP BY c1 HAVING m > 100; ``` * Filtering on an aggregate that does not appear in the SELECT: ``` SELECT c1, MAX(c2) FROM t GROUP BY c1 HAVING MIN(c2) > 100; ``` * Filtering on a complex aggregates: ``` SELECT c1, MAX(c2) FROM t GROUP BY c1 HAVING MAX(c2 / 2) + 1 > 100; ``` * Filtering on a non-aggregate column: ``` SELECT c1, MAX(c2) FROM t GROUP BY c1 HAVING c1 > 100; ``` * Filtering without a `GROUP BY`: ``` SELECT MAX(c1) FROM t HAVING MAX(c1) > 100; ``` Closes apache#9364 from drusso/ARROW-11431 Authored-by: Daniel Russo <danrusso@gmail.com> Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
This commit adds support for the SQL `HAVING` clause. For example, the following queries are supported: * Filtering on an aggregate: ``` SELECT c1, MAX(c2) FROM t GROUP BY c1 HAVING MAX(c2) > 100; ``` * Filtering on an aliased aggregate: ``` SELECT c1, MAX(c2) AS m FROM t GROUP BY c1 HAVING m > 100; ``` * Filtering on an aggregate that does not appear in the SELECT: ``` SELECT c1, MAX(c2) FROM t GROUP BY c1 HAVING MIN(c2) > 100; ``` * Filtering on a complex aggregates: ``` SELECT c1, MAX(c2) FROM t GROUP BY c1 HAVING MAX(c2 / 2) + 1 > 100; ``` * Filtering on a non-aggregate column: ``` SELECT c1, MAX(c2) FROM t GROUP BY c1 HAVING c1 > 100; ``` * Filtering without a `GROUP BY`: ``` SELECT MAX(c1) FROM t HAVING MAX(c1) > 100; ``` Closes apache#9364 from drusso/ARROW-11431 Authored-by: Daniel Russo <danrusso@gmail.com> Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
This commit adds support for the SQL `HAVING` clause. For example, the following queries are supported: * Filtering on an aggregate: ``` SELECT c1, MAX(c2) FROM t GROUP BY c1 HAVING MAX(c2) > 100; ``` * Filtering on an aliased aggregate: ``` SELECT c1, MAX(c2) AS m FROM t GROUP BY c1 HAVING m > 100; ``` * Filtering on an aggregate that does not appear in the SELECT: ``` SELECT c1, MAX(c2) FROM t GROUP BY c1 HAVING MIN(c2) > 100; ``` * Filtering on a complex aggregates: ``` SELECT c1, MAX(c2) FROM t GROUP BY c1 HAVING MAX(c2 / 2) + 1 > 100; ``` * Filtering on a non-aggregate column: ``` SELECT c1, MAX(c2) FROM t GROUP BY c1 HAVING c1 > 100; ``` * Filtering without a `GROUP BY`: ``` SELECT MAX(c1) FROM t HAVING MAX(c1) > 100; ``` Closes apache#9364 from drusso/ARROW-11431 Authored-by: Daniel Russo <danrusso@gmail.com> Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
This commit adds support for the SQL
HAVING
clause.For example, the following queries are supported:
Filtering on an aggregate:
Filtering on an aliased aggregate:
Filtering on an aggregate that does not appear in the SELECT:
Filtering on a complex aggregates:
Filtering on a non-aggregate column:
Filtering without a
GROUP BY
: