
fix(optimizer)!: qualify UNPIVOT on CTE sources #7550

Merged: georgesittas merged 1 commit into main from jo/improve_unpivot_support on Apr 23, 2026

@georgesittas
Collaborator

Two fixes, plus a follow-up cleanup, that together let `SELECT * FROM cte UNPIVOT(...)` qualify
correctly in BigQuery and other dialects.

  - parser: `UNPIVOT`'s pre-`FOR` value column(s) and the `FOR` field are now
    parsed as `Identifier` (was `Column`). Those names don't reference
    existing columns; they're new output names. The `IN`-list items stay as
    `Column` since they do reference source-table columns. `PIVOT` is
    unchanged.

  - optimizer/resolver: when a CTE is pivoted, the scope stores an
    `exp.Table` under the pivot alias rather than the CTE's `Scope`, so
    column resolution couldn't see the CTE's columns. The resolver now falls
    back to `scope.cte_sources` in that case. The fallback is guarded on
    `not source.db` and `source.args.get("pivots")` so a real `db.x` that
    happens to share a name with a CTE doesn't misroute through the CTE scope.

  - optimizer/qualify_columns: removes the post-hoc filter in
    `validate_qualify_columns` that excluded unpivot output names from
    `scope.unqualified_columns`. Since they're `Identifier`s now, they never
    land there. `_unpivot_columns` yields `Identifier`s; `output_name` still
    works.

Test: new assertion in `test_qualify_columns` covering the CTE + `UNPIVOT` shape.
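
The resolver fallback described above can be sketched with stand-in types. This is a minimal illustration only: `ToyTable`, `ToyScope`, `ToyQueryScope`, and `source_columns` are hypothetical names invented here, not sqlglot classes or functions, and the real resolver code is not shown in this PR page. The only parts taken from the description are the two guard conditions (no `db`, has `pivots`) and the lookup in the CTE sources.

```python
from dataclasses import dataclass, field
from typing import Optional


# Stand-in types for illustration only; these are NOT sqlglot's real classes.
@dataclass
class ToyTable:
    name: str
    db: str = ""  # non-empty for a qualified name such as db.x
    pivots: list = field(default_factory=list)


@dataclass
class ToyScope:
    columns: list  # output columns of a CTE


@dataclass
class ToyQueryScope:
    sources: dict  # alias -> ToyTable or ToyScope
    cte_sources: dict  # CTE name -> ToyScope


def source_columns(scope: ToyQueryScope, alias: str) -> Optional[list]:
    """Return the columns visible through `alias`, or None if unknown here."""
    source = scope.sources[alias]
    if isinstance(source, ToyScope):
        return source.columns
    # Fallback: an unqualified, pivoted table whose name matches a CTE.
    if not source.db and source.pivots:
        cte = scope.cte_sources.get(source.name)
        if cte is not None:
            return cte.columns
    # A real table: in this toy model its columns are simply unknown.
    return None


cte = ToyScope(columns=["product", "q1", "q2"])
scope = ToyQueryScope(
    sources={
        # Pivoted CTE stored as a table under the pivot alias.
        "_q": ToyTable(name="cte", pivots=["UNPIVOT(...)"]),
        # A real db.cte that merely shares the CTE's name.
        "t": ToyTable(name="cte", db="db", pivots=["UNPIVOT(...)"]),
    },
    cte_sources={"cte": cte},
)

print(source_columns(scope, "_q"))  # ['product', 'q1', 'q2']
print(source_columns(scope, "t"))   # None
```

The second lookup shows why the `db` guard matters: without it, the qualified table `db.cte` would pick up the CTE's columns just because the names collide.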

Comment thread sqlglot/parser.py
Comment on lines +5154 to +5158
    if unpivot:
        pivot.set("expressions", [_unpivot_target(e) for e in pivot.expressions])
        for pivot_field in pivot.fields:
            if isinstance(pivot_field, exp.In):
                pivot_field.set("this", _unpivot_target(pivot_field.this))
Collaborator Author


See https://docs.cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#unpivot_operator for reference.

The Produce table contains:

/*---------+----+----+----+----+
 | product | Q1 | Q2 | Q3 | Q4 |
 +---------+----+----+----+----+
 | Kale    | 51 | 23 | 45 | 3  |
 | Apple   | 77 | 0  | 25 | 2  |
 +---------+----+----+----+----*/

So, for this query:

SELECT * FROM Produce
UNPIVOT(sales FOR quarter IN (Q1, Q2, Q3, Q4))

/*---------+-------+---------+
 | product | sales | quarter |
 +---------+-------+---------+
 | Kale    | 51    | Q1      |
 | Kale    | 23    | Q2      |
 | Kale    | 45    | Q3      |
 | Kale    | 3     | Q4      |
 | Apple   | 77    | Q1      |
 | Apple   | 0     | Q2      |
 | Apple   | 25    | Q3      |
 | Apple   | 2     | Q4      |
 +---------+-------+---------*/

We can't treat `sales` and `quarter` as `Column` nodes: neither name exists in `Produce`. They're really `Identifier` nodes that name the new columns in the `UNPIVOT` operator's output schema.
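
To make the distinction concrete, here is a small pure-Python sketch of what the `UNPIVOT` in the example computes. The `unpivot` function and its parameter names are made up for illustration and are not part of sqlglot or BigQuery. It only reads the `IN`-list names from each input row, while `sales` and `quarter` appear solely as keys of the output rows. The default exclusion of NULL values is deliberately not modeled, since the sample data has none.

```python
def unpivot(rows, value_name, name_column, in_columns):
    """Turn each listed input column into its own output row.

    `in_columns` are looked up in the input rows (they reference real columns);
    `value_name` and `name_column` are only used to name new output columns.
    """
    out = []
    for row in rows:
        # Columns not listed in IN (...) pass through unchanged.
        kept = {k: v for k, v in row.items() if k not in in_columns}
        for col in in_columns:
            out.append({**kept, value_name: row[col], name_column: col})
    return out


produce = [
    {"product": "Kale", "Q1": 51, "Q2": 23, "Q3": 45, "Q4": 3},
    {"product": "Apple", "Q1": 77, "Q2": 0, "Q3": 25, "Q4": 2},
]

result = unpivot(produce, "sales", "quarter", ["Q1", "Q2", "Q3", "Q4"])

print(list(result[0]))  # ['product', 'sales', 'quarter']
print(result[0])        # {'product': 'Kale', 'sales': 51, 'quarter': 'Q1'}
print(len(result))      # 8
```

Renaming `sales` or `quarter` changes only the output keys, whereas misspelling an `IN`-list name raises a `KeyError`, which mirrors why only the `IN`-list items need to resolve against the source table.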

@github-actions
Contributor

SQLGlot Integration Test Results

Comparing:

  • this branch (sqlglot:jo/improve_unpivot_support, sqlglot version: jo/improve_unpivot_support)
  • baseline (main, sqlglot version: 0.0.1.dev1)

By Dialect

| dialect | main | sqlglot:jo/improve_unpivot_support | transitions | links |
| --- | --- | --- | --- | --- |
| bigquery -> bigquery | 24645/24650 passed (100.0%) | 23495/23495 passed (100.0%) | No change | full result / delta |
| bigquery -> duckdb | 867/1154 passed (75.1%) | 0/0 passed (0.0%) | Results not found | full result / delta |
| duckdb -> duckdb | 5823/5823 passed (100.0%) | 0/0 passed (0.0%) | Results not found | full result / delta |
| snowflake -> duckdb | 1063/1961 passed (54.2%) | 0/0 passed (0.0%) | Results not found | full result / delta |
| snowflake -> snowflake | 65133/65133 passed (100.0%) | 63027/63027 passed (100.0%) | No change | full result / delta |
| databricks -> databricks | 1370/1370 passed (100.0%) | 1370/1370 passed (100.0%) | No change | full result / delta |
| postgres -> postgres | 6042/6042 passed (100.0%) | 6042/6042 passed (100.0%) | No change | full result / delta |
| redshift -> redshift | 7101/7101 passed (100.0%) | 7101/7101 passed (100.0%) | No change | full result / delta |

Overall

main: 113234 total, 112044 passed (pass rate: 98.9%), sqlglot version: 0.0.1.dev1

sqlglot:jo/improve_unpivot_support: 101035 total, 101035 passed (pass rate: 100.0%), sqlglot version: jo/improve_unpivot_support

Transitions:
No change

Dialect pair changes: 0 previous results not found, 3 current results not found

✅ 34 test(s) passed

@georgesittas merged commit 8f572f8 into main on Apr 23, 2026
8 checks passed
@georgesittas deleted the jo/improve_unpivot_support branch on April 23, 2026 17:14