branch-4.1: [fix](analyze) Preserve variant subfields in view definitions to fix select view result wrong when view select has variant field #62907 #63151
Open
github-actions[bot] wants to merge 1 commit into
…select view result wrong when view select has variant field (#62907)

### What problem does this PR solve?

Related PR: #57532

Problem Summary:

`CREATE VIEW` with Nereids could persist wrong SQL when the view definition used dotted `VARIANT` subfield access such as `event_value.video_id`. The buggy persisted SQL replaced the whole subfield expression with the base column `event_value`, so querying the view could return `0` rows while the same CTE queried directly returned `1` row.

Reproduction steps:

1. Create an `events` table with `event_value VARIANT<'video_id':largeint, 'duration':bigint>` and `user_connect_info VARIANT<'user_client':text>`.
2. Insert one `watch_time` row where `event_value.video_id = 100`, `event_value.duration = 15000`, and `user_connect_info.user_client = 'ios'`.
3. Create a view with a CTE that projects `CAST(event_value.video_id AS LARGEINT)`, `TRY_CAST(event_value.duration AS INT)`, joins `video_meta`, and aggregates the result.
4. Query the view with `SELECT COUNT(*) FROM v_bug` and `SELECT COUNT(*) FROM v_bug WHERE event_day >= '2026-04-20'`.
5. On a buggy build both view queries return `0`; on the fixed build both return `1`. `SHOW CREATE VIEW` also keeps ``event_value`.`video_id`` instead of losing the subfield path.

Root cause:

In `ExpressionAnalyzer.bindExpressionByColumn()` and the qualified variants (`bindExpressionByTableColumn()`, `bindExpressionByDbTableColumn()`, `bindExpressionByCatalogDbTableColumn()`), the analyzer attached SQL-index rewrite metadata to the base `SlotReference` before `bindNestedFields()` handled the dotted subfield path. After binding `event_value.video_id` into `ElementAt(SlotReference(event_value), 'video_id')`, the base slot still owned the original SQL span for the whole `event_value.video_id` text. The create-view SQL rewrite later replaced that span with the fully qualified base column only, changing the view definition semantics.
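The span mix-up described in the root cause can be illustrated with a small standalone sketch. This is not Doris code: `ViewSqlRewriteSketch`, `Rewrite`, `apply`, `buggy`, and `fixed` are hypothetical names, and the qualifier is simplified to a single table name, whereas the real rewrite may also add catalog and database parts. It only shows how replacing the span of the whole dotted text with the base column drops the subfield, and how recording a dotted replacement for the same span keeps it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Apache Doris.
final class ViewSqlRewriteSketch {

    // A replacement for the character span [start, end) of the original SQL text.
    static final class Rewrite {
        final int start;
        final int end;
        final String replacement;

        Rewrite(int start, int end, String replacement) {
            this.start = start;
            this.end = end;
            this.replacement = replacement;
        }
    }

    // Applies rewrites from right to left so earlier offsets stay valid.
    public static String apply(String sql, List<Rewrite> rewrites) {
        List<Rewrite> ordered = new ArrayList<>(rewrites);
        ordered.sort(Comparator.comparingInt((Rewrite r) -> r.start).reversed());
        StringBuilder out = new StringBuilder(sql);
        for (Rewrite r : ordered) {
            out.replace(r.start, r.end, r.replacement);
        }
        return out.toString();
    }

    // Buggy shape: the span covers "base.field", but the replacement names only the base column.
    public static Rewrite buggy(String sql, String dottedRef, String table) {
        int start = sql.indexOf(dottedRef);
        String base = dottedRef.substring(0, dottedRef.indexOf('.'));
        return new Rewrite(start, start + dottedRef.length(), "`" + table + "`.`" + base + "`");
    }

    // Fixed shape: same span, but the replacement keeps the base column and every nested field name.
    public static Rewrite fixed(String sql, String dottedRef, String table) {
        int start = sql.indexOf(dottedRef);
        StringBuilder qualified = new StringBuilder("`" + table + "`");
        for (String part : dottedRef.split("\\.")) {
            qualified.append(".`").append(part).append('`');
        }
        return new Rewrite(start, start + dottedRef.length(), qualified.toString());
    }

    public static void main(String[] args) {
        String sql = "SELECT CAST(event_value.video_id AS LARGEINT) FROM events";
        String ref = "event_value.video_id";
        // Subfield is lost: CAST(`events`.`event_value` AS LARGEINT)
        System.out.println(apply(sql, Collections.singletonList(buggy(sql, ref, "events"))));
        // Subfield is kept: CAST(`events`.`event_value`.`video_id` AS LARGEINT)
        System.out.println(apply(sql, Collections.singletonList(fixed(sql, ref, "events"))));
    }
}
```

In this toy model the only difference between the two paths is the replacement string attached to the span, which is why a view built from the buggy output silently casts the whole `event_value` column instead of `video_id`.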
| File | Change Description |
|------|-------------------|
| `fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/ExpressionAnalyzer.java` | Delays `addSqlIndexInfo()` until the analyzer knows the reference is an ordinary slot path. For successfully bound nested fields, records a fully qualified dotted replacement that includes the base column and all nested field names. |
| `regression-test/suites/ddl_p0/create_view_nereids/test_create_view_variant_nested_field.groovy` | Adds regression coverage for `CREATE VIEW`, `ALTER VIEW`, bracket-style access, and the CTE/join/aggregation shape from the JIRA reproduction. The test queries the created view directly and verifies the row count remains `1`. |

Design rationale:

This is fixed in expression analysis instead of `BaseViewInfo.SlotDealer` because `ExpressionAnalyzer` is where the dotted name is resolved into a nested-field expression and where the original SQL span is still available with semantic context. `SlotDealer` only sees later slot references during view SQL rewrite, so fixing it there would require guessing whether a base slot originated from a nested-field path. Delaying base-slot rewrite metadata and adding explicit nested-field rewrite metadata keeps ordinary column behavior unchanged while preserving `VARIANT` subfield semantics in view definitions.

### Release note

Fixed an issue where `CREATE VIEW` or `ALTER VIEW` could lose dotted `VARIANT` subfield paths in persisted view SQL, causing later queries on the view to return incorrect results.
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:

Contributor
run buildall

Contributor
FE Regression Coverage Report: Increment line coverage
Cherry-picked from #62907