branch-4.1: [fix](analyze) Preserve variant subfields in view definitions to fix select view result wrong when view select has variant field #62907 #63151

Open
github-actions[bot] wants to merge 1 commit into branch-4.1 from auto-pick-62907-branch-4.1

Conversation

@github-actions
Contributor

Cherry-picked from #62907

…select view result wrong when view select has variant field (#62907)

### What problem does this PR solve?

Related PR: #57532 

Problem Summary:

`CREATE VIEW` with Nereids could persist wrong SQL when the view
definition used dotted `VARIANT` subfield access such as
`event_value.video_id`. The buggy persisted SQL replaced the whole
subfield expression with the base column `event_value`, so querying the
view could return `0` rows while the same CTE queried directly returned
`1` row.

Reproduction steps:

1. Create an `events` table with `event_value
VARIANT<'video_id':largeint, 'duration':bigint>` and `user_connect_info
VARIANT<'user_client':text>`.
2. Insert one `watch_time` row where `event_value.video_id = 100`,
`event_value.duration = 15000`, and `user_connect_info.user_client =
'ios'`.
3. Create a view with a CTE that projects `CAST(event_value.video_id AS
LARGEINT)`, `TRY_CAST(event_value.duration AS INT)`, joins `video_meta`,
and aggregates the result.
4. Query the view with `SELECT COUNT(*) FROM v_bug` and `SELECT COUNT(*)
FROM v_bug WHERE event_day >= '2026-04-20'`.
5. On a buggy build both view queries return `0`; on the fixed build
both return `1`. `SHOW CREATE VIEW` also keeps
``event_value`.`video_id`` instead of losing the subfield path.

Root cause: In `ExpressionAnalyzer.bindExpressionByColumn()` and its
qualified counterparts (`bindExpressionByTableColumn()`,
`bindExpressionByDbTableColumn()`,
`bindExpressionByCatalogDbTableColumn()`), the analyzer attached
SQL-index rewrite metadata to the base `SlotReference` before
`bindNestedFields()` handled the dotted subfield path. After binding
`event_value.video_id` into `ElementAt(SlotReference(event_value),
'video_id')`, the base slot still owned the original SQL span for the
whole `event_value.video_id` text. The create-view SQL rewrite later
replaced that span with the fully qualified base column only, changing
the view definition semantics.
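
The span replacement described above can be illustrated with a small toy model. This is only a sketch, not the Doris implementation: `SpanRewriteSketch`, `SpanRewrite`, `apply`, and `rewriteReference` are hypothetical names, and the backtick-quoted `` `events`.`event_value` `` form of the qualified name is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy model of span-based SQL rewriting; names are illustrative, not Doris APIs. */
public class SpanRewriteSketch {

    /** Half-open [start, end) range of the original SQL text plus its replacement. */
    public static final class SpanRewrite {
        final int start;
        final int end;
        final String replacement;

        public SpanRewrite(int start, int end, String replacement) {
            this.start = start;
            this.end = end;
            this.replacement = replacement;
        }
    }

    /** Applies rewrites from right to left so that earlier offsets stay valid. */
    public static String apply(String sql, List<SpanRewrite> rewrites) {
        List<SpanRewrite> sorted = new ArrayList<>(rewrites);
        sorted.sort(Comparator.comparingInt((SpanRewrite r) -> r.start).reversed());
        StringBuilder out = new StringBuilder(sql);
        for (SpanRewrite r : sorted) {
            out.replace(r.start, r.end, r.replacement);
        }
        return out.toString();
    }

    /** Rewrites the first occurrence of reference in sql; returns sql unchanged if absent. */
    public static String rewriteReference(String sql, String reference, String replacement) {
        int start = sql.indexOf(reference);
        if (start < 0) {
            return sql;
        }
        SpanRewrite rewrite = new SpanRewrite(start, start + reference.length(), replacement);
        return apply(sql, List.of(rewrite));
    }

    public static void main(String[] args) {
        String sql = "SELECT CAST(event_value.video_id AS LARGEINT) FROM events";
        // Buggy shape: the span covers the whole dotted text, but the replacement
        // names only the base column, so the subfield path is lost.
        System.out.println(rewriteReference(sql, "event_value.video_id",
                "`events`.`event_value`"));
        // Intended shape: the replacement carries the base column and the nested field.
        System.out.println(rewriteReference(sql, "event_value.video_id",
                "`events`.`event_value`.`video_id`"));
    }
}
```

In this model the first call drops `video_id` from the persisted text, which mirrors how the stored view definition ended up selecting the whole `event_value` column.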

| File | Change Description |
|------|-------------------|
| `fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/ExpressionAnalyzer.java` | Delays `addSqlIndexInfo()` until the analyzer knows the reference is an ordinary slot path. For successfully bound nested fields, records a fully qualified dotted replacement that includes the base column and all nested field names. |
| `regression-test/suites/ddl_p0/create_view_nereids/test_create_view_variant_nested_field.groovy` | Adds regression coverage for `CREATE VIEW`, `ALTER VIEW`, bracket-style access, and the CTE/join/aggregation shape from the JIRA reproduction. The test queries the created view directly and verifies the row count remains `1`. |

Design rationale: This is fixed in expression analysis instead of
`BaseViewInfo.SlotDealer` because `ExpressionAnalyzer` is where the
dotted name is resolved into a nested-field expression and where the
original SQL span is still available with semantic context. `SlotDealer`
only sees later slot references during view SQL rewrite, so fixing it
there would require guessing whether a base slot originated from a
nested-field path. Delaying base-slot rewrite metadata and adding
explicit nested-field rewrite metadata keeps ordinary column behavior
unchanged while preserving `VARIANT` subfield semantics in view
definitions.
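
The "decide late" idea in this rationale can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in `ExpressionAnalyzer`: `DelayedRewriteSketch` and `replacementFor` are hypothetical names, and the backtick quoting is assumed. The replacement text is computed only once it is known which name parts, if any, remain after the base column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/** Toy model of choosing the rewrite text after nested-field binding; not Doris code. */
public class DelayedRewriteSketch {

    /** Quotes every name part with backticks and joins the parts with dots. */
    static String quoteAndJoin(List<String> parts) {
        return parts.stream()
                .map(p -> "`" + p + "`")
                .collect(Collectors.joining("."));
    }

    /**
     * Builds the replacement for one reference.
     *
     * @param tableQualifier hypothetical qualifier of the bound table, e.g. [events]
     * @param column         the bound base column
     * @param nestedFields   name parts left after the column; empty for an ordinary column
     */
    public static String replacementFor(List<String> tableQualifier, String column,
                                        List<String> nestedFields) {
        List<String> parts = new ArrayList<>(tableQualifier);
        parts.add(column);
        // An ordinary column has no nested fields, so its replacement is unchanged.
        // A nested reference keeps every subfield name in the replacement.
        parts.addAll(nestedFields);
        return quoteAndJoin(parts);
    }

    public static void main(String[] args) {
        System.out.println(replacementFor(List.of("events"), "event_day", List.of()));
        System.out.println(replacementFor(List.of("events"), "event_value", List.of("video_id")));
    }
}
```

A later pass that sees only the base slot has no `nestedFields` to consult, which is the guessing problem the rationale attributes to `SlotDealer`.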

### Release note

Fixed an issue where `CREATE VIEW` or `ALTER VIEW` could lose dotted
`VARIANT` subfield paths in persisted view SQL, causing later queries on
the view to return incorrect results.

github-actions bot requested a review from yiguolei as a code owner on May 12, 2026, 02:12.

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hello-stephen
Contributor

run buildall

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 76.67% (23/30) 🎉
Increment coverage report
Complete coverage report
