Support multi-column aliases in SELECT items #2289

Open
funcpp wants to merge 1 commit into apache:main from funcpp:support-multi-column-alias
funcpp commented Mar 31, 2026

Summary

Add support for parenthesized multi-column aliases in SELECT items, as defined in the Spark SQL grammar:

namedExpression
    : expression (AS? (name=errorCapturingIdentifier | identifierList))?
    ;

identifierList
    : LEFT_PAREN identifierSeq RIGHT_PAREN
    ;

This enables syntax like:

SELECT stack(2, 'a', 'b', 'c', 'd') AS (col1, col2)

Changes

  • Add SelectItem::ExprWithAliases variant for multi-column aliases
  • Add Dialect::supports_select_item_multi_column_alias(), enabled for Databricks and Generic dialects
  • Parse AS (ident, ident, ...) when the dialect supports it
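
As a rough illustration of the first bullet, the sketch below shows how a multi-column alias variant might sit next to the existing single-alias form and render back to SQL. It is not the code in this PR: only the name `SelectItem::ExprWithAliases` comes from the description above, while the `String` fields, the `aliases` field name, and the other variants are simplified, hypothetical stand-ins for the crate's real AST types.

```rust
use std::fmt;

// Hypothetical, simplified stand-in for the crate's SelectItem AST node.
// Expressions and identifiers are plain strings here to keep the sketch small.
#[derive(Debug, Clone, PartialEq)]
pub enum SelectItem {
    /// `expr`
    UnnamedExpr(String),
    /// `expr AS alias`
    ExprWithAlias { expr: String, alias: String },
    /// `expr AS (alias1, alias2, ...)`; field names are assumptions.
    ExprWithAliases { expr: String, aliases: Vec<String> },
}

impl fmt::Display for SelectItem {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SelectItem::UnnamedExpr(expr) => write!(f, "{}", expr),
            SelectItem::ExprWithAlias { expr, alias } => write!(f, "{} AS {}", expr, alias),
            // Render the aliases as a parenthesized, comma-separated list
            // so that parsing and printing can round-trip.
            SelectItem::ExprWithAliases { expr, aliases } => {
                write!(f, "{} AS ({})", expr, aliases.join(", "))
            }
        }
    }
}
```

Keeping the multi-column form as its own variant, rather than widening the single alias into a list, leaves existing matches on the single-alias variant untouched.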

Context

While not documented in the Databricks SQL reference, this syntax is part of the Spark SQL grammar that Databricks implements. I verified that the following query executes successfully on Databricks Runtime:

SELECT stack(2, 'a', 'b', 'c', 'd') AS (col1, col2)
-- Returns:
-- col1  col2
-- a     b
-- c     d

Test plan

  • Round-trip test: SELECT stack(...) AS (col1, col2) with and without FROM
  • Negative test: non-supporting dialects reject the syntax
  • Existing tests unaffected
  • cargo fmt, cargo clippy, full test suite pass
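
The positive and negative cases in the test plan can be illustrated with a self-contained toy model of the dialect gate. This is only a sketch under stated assumptions: the method name `supports_select_item_multi_column_alias` is taken from the description above, but the trait shape, `StrictDialect`, the `Alias` enum, and `parse_alias` are hypothetical helpers that work on raw strings and do not use the crate's tokenizer or parser.

```rust
// Hypothetical, minimal dialect trait; only the method name comes from the PR.
pub trait Dialect {
    /// Whether `expr AS (a, b, ...)` is accepted. Defaults to false.
    fn supports_select_item_multi_column_alias(&self) -> bool {
        false
    }
}

/// Stand-in for a dialect that enables the feature.
pub struct GenericDialect;
/// Hypothetical dialect that keeps the default (unsupported).
pub struct StrictDialect;

impl Dialect for GenericDialect {
    fn supports_select_item_multi_column_alias(&self) -> bool {
        true
    }
}
impl Dialect for StrictDialect {}

#[derive(Debug, PartialEq)]
pub enum Alias {
    Single(String),
    Multi(Vec<String>),
}

// Accepts ASCII identifiers only: a letter or underscore, then letters,
// digits, or underscores.
fn is_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_')
        && chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Parses the text that follows `AS`: either `ident` or `(ident, ident, ...)`.
pub fn parse_alias(dialect: &dyn Dialect, text: &str) -> Result<Alias, String> {
    let text = text.trim();
    if let Some(inner) = text.strip_prefix('(').and_then(|rest| rest.strip_suffix(')')) {
        // The parenthesized form is only accepted when the dialect opts in.
        if !dialect.supports_select_item_multi_column_alias() {
            return Err("multi-column alias is not supported by this dialect".to_string());
        }
        let names: Vec<String> = inner.split(',').map(|s| s.trim().to_string()).collect();
        if names.iter().any(|n| !is_identifier(n)) {
            return Err(format!("invalid identifier list: ({})", inner));
        }
        return Ok(Alias::Multi(names));
    }
    if is_identifier(text) {
        Ok(Alias::Single(text.to_string()))
    } else {
        Err(format!("invalid alias: {}", text))
    }
}
```

With these definitions, `parse_alias(&GenericDialect, "(col1, col2)")` yields `Alias::Multi`, while the same input under `StrictDialect` returns an error, mirroring the round-trip and negative tests listed above.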

Spark SQL grammar allows parenthesized identifier lists as SELECT
item aliases:

  namedExpression: expression (AS? (identifier | identifierList))?
  identifierList: '(' identifier (',' identifier)* ')'

This enables syntax like:
  SELECT stack(2, 'a', 'b', 'c', 'd') AS (col1, col2)