Skip to content

[SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config#45218

Closed
mihailomilosevic2001 wants to merge 14 commits intoapache:masterfrom
mihailomilosevic2001:SPARK-47102
Closed

[SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config#45218
mihailomilosevic2001 wants to merge 14 commits intoapache:masterfrom
mihailomilosevic2001:SPARK-47102

Conversation

@mihailomilosevic2001
Copy link
Contributor

@mihailomilosevic2001 mihailomilosevic2001 commented Feb 22, 2024

What changes were proposed in this pull request?

This PR adds COLLATION_ENABLED config to SQLConf and introduces new error class COLLATION_SUPPORT_NOT_ENABLED to appropriately report error on usage of feature under development.

Why are the changes needed?

We want to make collations configurable on this flag. These changes disable usage of collate and collation functions, along with any COLLATE syntax when the flag is set to false. By default, the flag is set to false.

Does this PR introduce any user-facing change?

Yes. It introduces new error along with an appropriate message.

How was this patch tested?

./build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.errors.QueryCompilationErrorsSuite test
./build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.catalyst.expressions.CollationExpressionSuite test
./build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.CollationSuite test

Was this patch authored or co-authored using generative AI tooling?

No.

@mihailomilosevic2001 mihailomilosevic2001 changed the title [SPARK-47102][SQL] Adding COLLATION_ENABLED config [WIP] [SPARK-47102][SQL] Adding COLLATION_ENABLED config Feb 22, 2024
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Collation feature is under development and not all features are supported. To enable existing features toggle `COLLATION_ENABLED` flag to true."
"Collation support is under development and not all features are supported. To enable existing features set `spark.sql.collation.enabled` to `true`."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity: do we have or want any mechanism to disallow the enablement of the feature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understood correctly, we want to make sure that customer is prevented from trying to use partially built feature and getting in trouble because some functionality was not implemented at the time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good question. I am not sure what is the policy here - @cloud-fan and @MaxGekk - can you guide us here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preventing the use of a partially baked feature is a good reason here.

@mkaravel
Copy link
Contributor

One more thing: it would be nice to add the [Collation] tag in the title of the PR.

@mihailomilosevic2001 mihailomilosevic2001 changed the title [SPARK-47102][SQL] Adding COLLATION_ENABLED config [SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config Feb 23, 2024
@mihailomilosevic2001
Copy link
Contributor Author

PR should be ready now, but until I get GitHub Actions fixed for my account, CI runs can't be run before merging. Will ping when that gets fixed.

@mihailomilosevic2001
Copy link
Contributor Author

Switched to #45285 as GitHub Actions are not enabled on my account. Please provide review on that PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants