Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-47102][SQL] Add the COLLATION_ENABLED config flag #45285

Closed
wants to merge 19 commits into from

Conversation

mihailom-db
Copy link
Contributor

@mihailom-db mihailom-db commented Feb 27, 2024

What changes were proposed in this pull request?

This PR adds COLLATION_ENABLED config to SQLConf and introduces new error class UNSUPPORTED_FEATURE.COLLATION to appropriately report error on usage of feature under development.

Closes #45218

Why are the changes needed?

We want to make collations configurable on this flag. These changes disable usage of collate and collation functions, along with any COLLATE syntax when the flag is set to false. By default, the flag is set to false.

Does this PR introduce any user-facing change?

Yes. It introduces new error along with an appropriate message.

How was this patch tested?

./build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.errors.QueryCompilationErrorsSuite test
./build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.catalyst.expressions.CollationExpressionSuite test
./build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.CollationSuite test

Was this patch authored or co-authored using generative AI tooling?

No.

Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

introduces new error class COLLATION_SUPPORT_NOT_ENABLED to appropriately report error

@mihailom-db Could update PR's description according to your recent changes. In particular, change the name of the error class.

Also, please add Closes #45218 to PR's description to close the PR automatically when this PR be merged.

@MaxGekk
Copy link
Member

MaxGekk commented Mar 4, 2024

@mihailom-db The test failure is related to your changes:

[info]   Cause: org.scalatest.exceptions.TestFailedException: Function 'collate', Expression class 'org.apache.spark.sql.catalyst.expressions.CollateExpressionBuilder' "...ql.collation.enabled[	]true" did not equal "...ql.collation.enabled[ ]true"

Please, fix it.

@MaxGekk MaxGekk changed the title [SPARK-47102][SQL][COLLATION] Add COLLATION_ENABLED config flag [SPARK-47102][SQL] Add the COLLATION_ENABLED config flag Mar 5, 2024
@MaxGekk
Copy link
Member

MaxGekk commented Mar 5, 2024

The failed GA Run / Build modules: pyspark-connect has been passed already a couple comments before. Highly likely, it is not related to the changes.

@MaxGekk
Copy link
Member

MaxGekk commented Mar 5, 2024

+1, LGTM. Merging to master.
Thank you, @mihailom-db and @cloud-fan @dbatomic @mkaravel for review.

@MaxGekk MaxGekk closed this in 6534a33 Mar 5, 2024
jpcorreia99 pushed a commit to jpcorreia99/spark that referenced this pull request Mar 12, 2024
### What changes were proposed in this pull request?
This PR adds `COLLATION_ENABLED` config to `SQLConf` and introduces new error class `UNSUPPORTED_FEATURE.COLLATION` to appropriately report error on usage of feature under development.

Closes apache#45218

### Why are the changes needed?
We want to make collations configurable on this flag. These changes disable usage of `collate` and `collation` functions, along with any `COLLATE` syntax when the flag is set to false. By default, the flag is set to false.

### Does this PR introduce _any_ user-facing change?
Yes. It introduces new error along with an appropriate message.

### How was this patch tested?
```
./build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.errors.QueryCompilationErrorsSuite test
./build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.catalyst.expressions.CollationExpressionSuite test
./build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.CollationSuite test
```

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#45285 from mihailom-db/SPARK-47102.

Authored-by: Mihailo Milosevic <mihailo.milosevic@databricks.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants