[SPARK-41985][SQL][FOLLOWUP] Remove alias in GROUP BY only when the expr is resolved

### What changes were proposed in this pull request?

This is a followup of apache#39508 to fix a regression. We should not remove aliases from grouping expressions that are not yet resolved, because the alias may be needed to resolve the expression, for example the name placeholder in `CreateNamedStruct`.
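
The idea can be illustrated with a toy model. This is only a sketch under stated assumptions: the types `Column`, `Alias`, `Struct` and the helpers `resolveNames` and `trimIfResolved` are hypothetical stand-ins, not Spark's Catalyst classes, and `resolveNames` merely imitates a rule that derives a struct field name from an alias. It shows why trimming before resolution loses the name, and how the alias still gets removed on a later pass.

```scala
// Toy model only: none of these types are Spark's Catalyst classes.
object AliasTrimSketch {
  sealed trait Expr { def resolved: Boolean }

  final case class Column(name: String) extends Expr { val resolved = true }

  final case class Alias(child: Expr, name: String) extends Expr {
    def resolved: Boolean = child.resolved
  }

  // A field name of None plays the role of a name placeholder:
  // the struct is unresolved until every field has a name.
  final case class Struct(fields: List[(Option[String], Expr)]) extends Expr {
    def resolved: Boolean = fields.forall { case (n, v) => n.isDefined && v.resolved }
  }

  // Strip alias wrappers everywhere in the expression.
  def trimAliases(e: Expr): Expr = e match {
    case Alias(child, _) => trimAliases(child)
    case Struct(fields)  => Struct(fields.map { case (n, v) => (n, trimAliases(v)) })
    case other           => other
  }

  // Hypothetical resolution step: a placeholder name can only be
  // filled in while the field value is still wrapped in an alias.
  def resolveNames(e: Expr): Expr = e match {
    case Struct(fields) =>
      Struct(fields.map {
        case (None, a @ Alias(_, name)) => (Option(name), a: Expr)
        case field                      => field
      })
    case other => other
  }

  // The guarded version: leave unresolved expressions untouched.
  def trimIfResolved(e: Expr): Expr = if (e.resolved) trimAliases(e) else e

  def main(args: Array[String]): Unit = {
    val grouping: Expr = Struct(List((None, Alias(Column("a"), "aa"))))

    // Unconditional trimming: the alias is gone, so the name can never be filled in.
    println(resolveNames(trimAliases(grouping)).resolved)

    // Guarded trimming: pass 1 keeps the alias, resolution fills in the name,
    // and pass 2 removes the alias now that the expression is resolved.
    val afterPass1 = resolveNames(trimIfResolved(grouping))
    println(trimIfResolved(afterPass1))
  }
}
```

The second pass in `main` mirrors the point that the rule runs again on later iterations, so the alias is still removed once the expression has been resolved.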

### Why are the changes needed?

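To fix a regression introduced by apache#39508, which removed aliases from grouping expressions even when they were not yet resolved.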

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

A new test case in `group-by.sql`.

Closes apache#39867 from cloud-fan/column.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
2 people authored and MaxGekk committed Feb 3, 2023
1 parent a916a05 commit 02b39f0
Showing 3 changed files with 21 additions and 1 deletion.
@@ -96,7 +96,13 @@ object ResolveReferencesInAggregate extends SQLConfHelper
         // can't find the grouping expressions via `semanticEquals` and the analysis will fail.
         // Example rules: ResolveGroupingAnalytics (See SPARK-31670 for more details) and
         // ResolveLateralColumnAliasReference.
-        groupingExpressions = resolvedGroupExprs.map(trimAliases),
+        groupingExpressions = resolvedGroupExprs.map { e =>
+          // Only trim the alias if the expression is resolved, as the alias may be needed to resolve
+          // the expression, such as `NamePlaceHolder` in `CreateNamedStruct`.
+          // Note: this rule will be invoked even if the Aggregate is fully resolved. So alias in
+          // GROUP BY will be removed eventually, by following iterations.
+          if (e.resolved) trimAliases(e) else e
+        },
         aggregateExpressions = resolvedAggExprsWithOuter)
     }

3 changes: 3 additions & 0 deletions sql/core/src/test/resources/sql-tests/inputs/group-by.sql
@@ -34,6 +34,9 @@ SELECT a + b, COUNT(b) FROM testData GROUP BY a + b;
 SELECT a + 2, COUNT(b) FROM testData GROUP BY a + 1;
 SELECT a + 1 + 1, COUNT(b) FROM testData GROUP BY a + 1;

+-- struct() in group by
+SELECT count(1) FROM testData GROUP BY struct(a + 0.1 AS aa);
+
 -- Aggregate with nulls.
 SELECT SKEWNESS(a), KURTOSIS(a), MIN(a), MAX(a), AVG(a), VARIANCE(a), STDDEV(a), SUM(a), COUNT(a)
 FROM testData;
11 changes: 11 additions & 0 deletions sql/core/src/test/resources/sql-tests/results/group-by.sql.out
@@ -145,6 +145,17 @@ struct<((a + 1) + 1):int,count(b):bigint>
 NULL 1


+-- !query
+SELECT count(1) FROM testData GROUP BY struct(a + 0.1 AS aa)
+-- !query schema
+struct<count(1):bigint>
+-- !query output
+2
+2
+2
+3
+
+
 -- !query
 SELECT SKEWNESS(a), KURTOSIS(a), MIN(a), MAX(a), AVG(a), VARIANCE(a), STDDEV(a), SUM(a), COUNT(a)
 FROM testData