Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: Support queries with HAVING over SELECT #46709

Merged
merged 4 commits into from Sep 17, 2019
Merged

Conversation

costin
Copy link
Member

@costin costin commented Sep 13, 2019

Handle queries with implicit GROUP BY where the aggregation is not in
the projection/SELECT but inside the filter/HAVING such as:

SELECT 1 FROM x HAVING COUNT(*) > 0

The engine now properly identifies the case

Fix #37051

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

Handle queries with implicit GROUP BY where the aggregation is not in
the projection/SELECT but inside the filter/HAVING such as:

SELECT 1 FROM x HAVING COUNT(*) > 0

The engine now properly identifies the case

Fix elastic#37051
Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left some minor comments.
I would add a couple of unit tests too (QueryTranslatorTests most probably).

@@ -174,10 +174,12 @@ public QueryContainer(Query query,
*/
public BitSet columnMask(List<Attribute> columns) {
BitSet mask = new BitSet(fields.size());
for (Attribute column : columns) {
for (int columnIndex = 0; columnIndex < columns.size(); columnIndex++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this? I don't see columnIndex to be used.

@@ -227,7 +230,7 @@ public int limit() {

public boolean isAggsOnly() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rename the method since with anyMatch the name seems misleading now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is a bit misleading but the meaning holds, the method indicates whether the query is an agg-only or scroll-any.
The issue with some expressions is that they can appear in both contexts (column name or literals), hence the anyMatch.
This might be addressed better through FieldExtraction directly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe hasAggsOnly is better?

@@ -140,6 +140,10 @@ protected PhysicalPlan rule(ProjectExec project) {
ScalarFunction f = (ScalarFunction) pj;
processors.put(f.toAttribute(), Expressions.pipe(f));
}
//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove if not needed.

@@ -1002,6 +1003,42 @@ protected LogicalPlan rule(Project p) {
}
}

//
// Detect implicit grouping with filtering and convert them into aggregates.
// SELECT 1 FROM x WHERE COUNT(*) > 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WHERE -> HAVING

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I like all the integration test queries added.

+ "(InternalSqlScriptUtils.add(InternalSqlScriptUtils.docValue(doc,params.v0),"
+ "InternalSqlScriptUtils.intervalYearMonth(params.v1,params.v2)),params.v3)\",\"lang\":\"painless\","
+ "\"params\":{\"v0\":\"date\",\"v1\":\"P1Y\",\"v2\":\"INTERVAL_YEAR\",\"v3\":\"Z\"}},\"missing_bucket\":true,"));
}


public void testHavingWithLiteralImplicitGrouping() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@costin costin merged commit fa53ca0 into elastic:master Sep 17, 2019
@costin costin deleted the fix/37051 branch September 17, 2019 08:05
costin added a commit that referenced this pull request Sep 17, 2019
Handle queries with implicit GROUP BY where the aggregation is not in
the projection/SELECT but inside the filter/HAVING such as:

SELECT 1 FROM x HAVING COUNT(*) > 0

The engine now properly identifies the case and handles it accordingly.

Fix #37051

(cherry picked from commit fa53ca0)
costin added a commit that referenced this pull request Sep 17, 2019
Handle queries with implicit GROUP BY where the aggregation is not in
the projection/SELECT but inside the filter/HAVING such as:

SELECT 1 FROM x HAVING COUNT(*) > 0

The engine now properly identifies the case and handles it accordingly.

Fix #37051

(cherry picked from commit fa53ca0)
(cherry picked from commit 683b5fd)
@colings86 colings86 added v7.4.0 and removed v7.4.1 labels Sep 20, 2019
jkakavas pushed a commit to jkakavas/elasticsearch that referenced this pull request Sep 25, 2019
Handle queries with implicit GROUP BY where the aggregation is not in
the projection/SELECT but inside the filter/HAVING such as:

SELECT 1 FROM x HAVING COUNT(*) > 0

The engine now properly identifies the case and handles it accordingly.

Fix elastic#37051
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SQL: HAVING without GROUP BY
6 participants