[SPARK-41228][SQL] Rename & Improve error message for `COLUMN_NOT_IN_GROUP_BY_CLAUSE`. #38769

itholic · 2022-11-23T07:07:24Z

What changes were proposed in this pull request?

This PR proposes to rename COLUMN_NOT_IN_GROUP_BY_CLAUSE to MISSING_AGGREGATION.

Also, improve its error message.

Why are the changes needed?

The current error class name and its error message don't illustrate the error cause and resolution correctly.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*"

core/src/main/resources/error/error-classes.json

itholic · 2022-11-24T06:04:21Z

Fixed the Python test first, since we still have an unresolved discussion: #38769 (comment)

MaxGekk · 2022-11-28T09:21:14Z

@itholic Please rebase on the latest master.

MaxGekk · 2022-11-29T18:35:06Z

@itholic Could you fix the test failure? It seems to be related to your changes:

python/pyspark/sql/tests/pandas/test_pandas_udf_grouped_agg.py.test_invalid_args
"COLUMN_NOT_IN_GROUP_BY_CLAUSE" does not match "[MISSING_AGGREGATION] The non-aggregating expression "v" is based on columns which are not participating in the GROUP BY clause.
Add the columns or the expression to the GROUP BY, aggregate the expression, or use "any_value("v")" if you do not care which of the values within a group is returned.;
Aggregate [id#668L], [id#668L, plus_one(v#674)#687 AS plus_one(v)#688]
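
The failure above is what one would expect when a test still searches for the old class name in a message that now carries the new one. A minimal sketch of that, assuming the message is rendered from a template with angle-bracket placeholders; the `render` helper and the placeholder names `expression` and `expressionAnyValue` are hypothetical, and only the wording is taken from the quoted output:

```python
import re

# Assumed template: the placeholder names are hypothetical, the wording
# follows the message quoted in the test failure.
TEMPLATE = (
    "The non-aggregating expression <expression> is based on columns which "
    "are not participating in the GROUP BY clause.\n"
    "Add the columns or the expression to the GROUP BY, aggregate the "
    "expression, or use <expressionAnyValue> if you do not care which of the "
    "values within a group is returned."
)


def render(error_class, template, params):
    """Fill <name> placeholders and prefix the bracketed error class."""
    body = re.sub(r"<(\w+)>", lambda m: params[m.group(1)], template)
    return f"[{error_class}] {body}"


rendered = render(
    "MISSING_AGGREGATION",
    TEMPLATE,
    {"expression": '"v"', "expressionAnyValue": '"any_value("v")"'},
)

# A test that still looks for the old class name finds nothing in the
# rendered message, which is why the assertion fails.
old_name_found = re.search("COLUMN_NOT_IN_GROUP_BY_CLAUSE", rendered) is not None
```

Keeping the class name in a bracketed prefix means a test can key on the name alone and stay stable when the wording of the message changes.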

itholic · 2022-11-30T05:56:57Z

python/pyspark/sql/tests/pandas/test_pandas_udf_grouped_agg.py


        with QuietTest(self.sc):
-            with self.assertRaisesRegex(AnalysisException, "nor.*aggregate function"):
+            with self.assertRaisesRegex(AnalysisException, "[MISSING_AGGREGATION]"):


I think we should test this with logic similar to checkError on the Scala side.

Let me create a JIRA ticket and update this soon after getting some feedback from the community.
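
One reason a checkError-style comparison may be preferable: `assertRaisesRegex` treats its second argument as a regular expression, so unescaped square brackets form a character class rather than a literal tag. A minimal sketch of the difference; the `check_error` helper is hypothetical and is not the Scala `checkError` API:

```python
import re

# Abbreviated form of the message quoted in the failing test output.
message = (
    '[MISSING_AGGREGATION] The non-aggregating expression "v" is based on '
    "columns which are not participating in the GROUP BY clause."
)

# Unescaped, the brackets form a character class: the pattern matches any
# single character from the set, so it also matches unrelated messages.
loose = "[MISSING_AGGREGATION]"
unrelated_matches = re.search(loose, "SOME OTHER ERROR") is not None

# Escaping the brackets makes the pattern match the literal tag only.
strict = re.escape("[MISSING_AGGREGATION]")
strict_on_message = re.search(strict, message) is not None
strict_on_unrelated = re.search(strict, "SOME OTHER ERROR") is not None


def check_error(message, error_class):
    """Hypothetical helper: compare the bracketed tag at the start of the message."""
    found = re.match(r"\[([A-Z_.]+)\]", message)
    return found is not None and found.group(1) == error_class
```

Comparing the extracted class name directly avoids regex pitfalls altogether, at the cost of assuming that every message starts with a bracketed tag.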

MaxGekk

@itholic Please, construct AnyValue.

core/src/main/resources/error/error-classes.json

MaxGekk · 2022-12-01T06:17:39Z

+1, LGTM. Merging to master.
Thank you, @itholic.

…GROUP_BY_CLAUSE`

### What changes were proposed in this pull request?

This PR proposes to rename `COLUMN_NOT_IN_GROUP_BY_CLAUSE` to `MISSING_AGGREGATION`.

Also, improve its error message.

### Why are the changes needed?

The current error class name and its error message don't illustrate the error cause and resolution correctly.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

```
./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*"
```

Closes apache#38769 from itholic/SPARK-41128.

Authored-by: itholic <haejoon.lee@databricks.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>

[SPARK-41228][SQL] Rename COLUMN_NOT_IN_GROUP_BY_CLAUSE to MISSING_AGGREGATION

35dd6e9

github-actions bot added CORE SQL labels Nov 23, 2022

MaxGekk reviewed Nov 23, 2022

core/src/main/resources/error/error-classes.json (outdated, resolved)

Fix Python test

55a2969

github-actions bot added the PYTHON label Nov 24, 2022

Merge branch 'master' of https://github.com/apache/spark into SPARK-41128

b78e186

fix test

cf468e3

itholic commented Nov 30, 2022

MaxGekk requested changes Nov 30, 2022

core/src/main/resources/error/error-classes.json (outdated, resolved)

itholic added 2 commits December 1, 2022 09:13

Merge branch 'master' of https://github.com/apache/spark into SPARK-41128

265f14b

add expressionAnyValue

a939c81

MaxGekk approved these changes Dec 1, 2022

MaxGekk closed this in 5badb24 Dec 1, 2022

itholic deleted the SPARK-41128 branch April 22, 2023 05:45
