[SPARK-49543][SQL] Add SHOW COLLATIONS command by viirya · Pull Request #55099 · apache/spark

viirya · 2026-03-30T21:38:27Z

What changes were proposed in this pull request?

Add SHOW COLLATIONS SQL syntax to list all Spark built-in collations. Supports optional LIKE pattern filtering (e.g. SHOW COLLATIONS LIKE 'UNICODE*').

Output schema: NAME, LANGUAGE, COUNTRY, ACCENT_SENSITIVITY, CASE_SENSITIVITY, PAD_ATTRIBUTE, ICU_VERSION — matching the existing collations() TVF but without the constant CATALOG/SCHEMA columns.

Implementation follows the ShowCatalogsCommand pattern as collations are engine-global and not tied to any catalog or namespace.

Why are the changes needed?

SHOW COLLATIONS is a SQL command supported by MySQL and its derivatives (MariaDB, TiDB) for listing available collations. Spark currently only exposes this information via a table-valued function (SELECT * FROM collations()), which is inconsistent with how other catalog objects are queried (SHOW CATALOGS, SHOW TABLES, etc.) and unfamiliar to users coming from MySQL-compatible databases. This change adds a more intuitive SQL syntax consistent with Spark's existing SHOW command family.

Does this PR introduce any user-facing change?

Yes, this adds the SHOW COLLATIONS command.

How was this patch tested?

Unit tests

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Sonnet 4.6

Add SHOW COLLATIONS SQL syntax to list all Spark built-in collations. Supports optional LIKE pattern filtering (e.g. SHOW COLLATIONS LIKE 'UNICODE%'). Output schema: NAME, LANGUAGE, COUNTRY, ACCENT_SENSITIVITY, CASE_SENSITIVITY, PAD_ATTRIBUTE, ICU_VERSION — matching the existing collations() TVF but without the constant CATALOG/SCHEMA columns. Implementation follows the ShowCatalogsCommand pattern as collations are engine-global and not tied to any catalog or namespace. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dongjoon-hyun

+1, LGTM (Pending CIs).

…NS token Add COLLATIONS to SQL keyword golden files and hardcoded keyword lists in ThriftServer and SparkConnect JDBC tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…LATIONS Add COLLATIONS to reserved keyword list in keywords-enforced.sql.out and add COLLATIONS documentation entry in sql-ref-ansi-compliance.md. COLLATIONS is reserved in ANSI mode (ansiNonReserved) and non-reserved in non-ANSI mode; it is not part of SQL-2016 standard. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

filterPattern uses * (not %) as the wildcard character, consistent with other SHOW commands like SHOW NAMESPACES and SHOW FUNCTIONS. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dongjoon-hyun

Although the test passed, the last test code change commit looks suspicious to me.

dongjoon-hyun · 2026-03-31T13:12:26Z

+    assert(utf8Row.getString(3) == "ACCENT_SENSITIVE")
+    assert(utf8Row.getString(4) == "CASE_SENSITIVE")
+
+    val likeResult = sql("SHOW COLLATIONS LIKE 'UNICODE*'").collect()


Could you double-check this, @viirya ? LIKE should use % instead of *.

* is used for another syntax like REGEXP.

Thanks for catching this!

For context: other existing SHOW commands like SHOW NAMESPACES and SHOW FUNCTIONS also use *. This SHOW COLLATIONS fix makes it consistent with SQL LIKE convention, unlike those commands.

ShowNamespacesSuiteBase.scala:88 — SHOW NAMESPACES LIKE '1'
ShowFunctionsParserSuite.scala:50 — SHOW FUNCTIONS LIKE 'funct*'
ShowTablesParserSuite.scala:36 — SHOW TABLES LIKE 'test'
ShowTablesParserSuite.scala:54 — SHOW TABLE EXTENDED LIKE 'test'
ShowTablesSuiteBase.scala:338 — SHOW TABLE EXTENDED LIKE '$viewName*'

I also took a look at our documentation for SHOW TABLES LIKE: https://spark.apache.org/docs/latest/sql-ref-syntax-aux-show-tables.html. It says the command takes a regex_pattern after LIKE:

Syntax SHOW TABLES [ { FROM | IN } database_name ] [ LIKE regex_pattern ]
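The wildcard behavior discussed in this thread can be illustrated with a minimal Python sketch. It assumes the SHOW-command filter treats `*` as "any run of characters", treats `|` as a separator between alternatives, and matches case-insensitively. The function name `filter_pattern` and the sample list are hypothetical and are not Spark's API; the sketch also escapes every other character, which is a simplification of a regex-based pattern.

```python
import re

def filter_pattern(names, pattern):
    """Hypothetical glob-style filter (an assumption, not Spark's code):
    '*' matches any run of characters, '|' separates alternatives,
    and matching is case-insensitive."""
    matched = set()
    for sub in pattern.strip().split("|"):
        # Escape everything except '*', which becomes the regex '.*'.
        regex = ".*".join(re.escape(part) for part in sub.split("*"))
        matched.update(
            n for n in names if re.fullmatch(regex, n, re.IGNORECASE)
        )
    return sorted(matched)

# Sample collation names, chosen for illustration only.
collations = ["UTF8_BINARY", "UTF8_LCASE", "UNICODE", "UNICODE_CI"]

print(filter_pattern(collations, "UNICODE*"))       # '*' acts as the wildcard
print(filter_pattern(collations, "UNICODE%"))       # '%' is a literal character here
print(filter_pattern(collations, "utf8*|unicode"))  # alternatives, case-insensitive
```

Under these assumptions, `'UNICODE*'` selects the names starting with UNICODE, while `'UNICODE%'` selects nothing because `%` has no wildcard meaning in this style of pattern.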

~~Oh. So, we support both % and *?~~

Thank you, @viirya. After reading the doc once more, I realized my misunderstanding. Sorry for the confusion.

BTW, can we have a new documentation page for SHOW COLLATIONS, like the one for SHOW TABLES at https://spark.apache.org/docs/latest/sql-ref-syntax-aux-show-tables.html?

Yes, let me add a document for it. Thanks for the reminder.

pan3793 · Apr 1, 2026

There was a ticket to make it follow the standard SQL LIKE pattern, SPARK-45880 (#43751), but it did not land.

Convert SQL LIKE wildcard % to glob * before passing to filterPattern, so SHOW COLLATIONS LIKE 'UNICODE%' works correctly. Revert test to use % per SQL LIKE convention. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… pattern" This reverts commit c8a3010.

Add sql-ref-syntax-aux-show-collations.md following the same structure as other SHOW command docs (description, syntax, parameters, output schema, examples, related statements). Also add entry to the SQL syntax index in sql-ref-syntax.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dongjoon-hyun · 2026-03-31T17:15:58Z

+      parser.parsePlan("SHOW COLLATIONS"),
+      ShowCollationsCommand(None))
+    comparePlans(
+      parser.parsePlan("SHOW COLLATIONS LIKE 'UNICODE%'"),


If we don't support %, shall we avoid % in the test case, @viirya ?

Good catch! Let me remove it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dongjoon-hyun · 2026-03-31T18:09:57Z

I updated the PR description too.

-Add SHOW COLLATIONS SQL syntax to list all Spark built-in collations. Supports optional LIKE pattern filtering (e.g. SHOW COLLATIONS LIKE 'UNICODE%').
+Add SHOW COLLATIONS SQL syntax to list all Spark built-in collations. Supports optional LIKE pattern filtering (e.g. SHOW COLLATIONS LIKE 'UNICODE*').

viirya · 2026-03-31T18:26:54Z

Thank you @dongjoon-hyun

dongjoon-hyun · 2026-03-31T20:27:15Z

Merged to master for Apache Spark 4.2.0.

### What changes were proposed in this pull request?

Add SHOW COLLATIONS SQL syntax to list all Spark built-in collations. Supports optional LIKE pattern filtering (e.g. SHOW COLLATIONS LIKE 'UNICODE*').

Output schema: NAME, LANGUAGE, COUNTRY, ACCENT_SENSITIVITY, CASE_SENSITIVITY, PAD_ATTRIBUTE, ICU_VERSION, matching the existing collations() TVF but without the constant CATALOG/SCHEMA columns.

Implementation follows the ShowCatalogsCommand pattern as collations are engine-global and not tied to any catalog or namespace.

### Why are the changes needed?

SHOW COLLATIONS is a SQL command supported by MySQL and its derivatives (MariaDB, TiDB) for listing available collations. Spark currently only exposes this information via a table-valued function (SELECT * FROM collations()), which is inconsistent with how other catalog objects are queried (SHOW CATALOGS, SHOW TABLES, etc.) and unfamiliar to users coming from MySQL-compatible databases. This change adds a more intuitive SQL syntax consistent with Spark's existing SHOW command family.

### Does this PR introduce _any_ user-facing change?

Yes, this adds `SHOW COLLATIONS` command.

### How was this patch tested?

Unit tests

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Sonnet 4.6

Closes apache#55099 from viirya/SPARK-49543-show-collations.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

viirya · 2026-04-01T04:56:51Z

Thank you @dongjoon-hyun

dongjoon-hyun approved these changes Mar 30, 2026

viirya and others added 3 commits March 30, 2026 17:29

[SPARK-49543][SQL] Update keyword golden files and tests for COLLATIO…

dab0682

[SPARK-49543][SQL] Fix SHOW COLLATIONS LIKE pattern to use * wildcard

ba0a781

dongjoon-hyun requested changes Mar 31, 2026

viirya and others added 3 commits March 31, 2026 09:15

Revert "[SPARK-49543][SQL] Support % wildcard in SHOW COLLATIONS LIKE…

9d117a2

dongjoon-hyun reviewed Mar 31, 2026

[SPARK-49543][SQL] Fix SHOW COLLATIONS parser test to use * wildcard

6e0373f

dongjoon-hyun approved these changes Mar 31, 2026

dongjoon-hyun closed this in d580b65 Mar 31, 2026

viirya deleted the SPARK-49543-show-collations branch April 1, 2026 04:56
