[fix](match) Allow MATCH on aliased variant subcolumns #63772
Force-pushed from 4bab929 to 18da2b7.
run buildall
Pull request overview
Note: Copilot was unable to run its full agentic suite in this review.
Fixes MATCH predicate validation to accept VARIANT dot subcolumn access (e.g., msg.trace_id) that produces an Alias wrapping the pruned subcolumn slot, matching the existing behavior for bracket subcolumn access.
Changes:
- Extend getSlotFromSlotOrCastChain to also unwrap Alias nodes and rename it accordingly.
- Add unit tests in CheckMatchExpressionTest covering alias/cast chains over variant subcolumns and rejection cases.
- Add an integration test in VariantPruningLogicTest verifying the scan predicate uses a SlotRef with the expected sub-column path, and refactor helpers.
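For readers who do not have the diff open, the unwrap change can be sketched roughly as follows. This is a minimal illustration, not the code in CheckMatchExpression: the classes Slot, Cast, Alias and Literal are simplified hypothetical stand-ins for the Nereids expression nodes, and unwrapToSlot with its allowAlias flag is an invented name used only to contrast the cast-only behavior with the cast-or-alias behavior described above.

```java
import java.util.Optional;

/**
 * Sketch only: hypothetical, simplified stand-ins for Nereids expression nodes.
 * None of these classes are the real Doris types.
 */
class AliasCastUnwrapSketch {

    interface Expr {}

    /** Leaf that refers to a column or a pruned variant subcolumn. */
    static final class Slot implements Expr {
        final String name;
        Slot(String name) { this.name = name; }
    }

    /** Single-child wrapper modelling CAST(child AS targetType). */
    static final class Cast implements Expr {
        final Expr child;
        final String targetType;
        Cast(Expr child, String targetType) { this.child = child; this.targetType = targetType; }
    }

    /** Single-child wrapper modelling "child AS name". */
    static final class Alias implements Expr {
        final Expr child;
        final String name;
        Alias(Expr child, String name) { this.child = child; this.name = name; }
    }

    /** A non-slot leaf, used to show the rejection path. */
    static final class Literal implements Expr {
        final String value;
        Literal(String value) { this.value = value; }
    }

    /**
     * Peels Cast wrappers, and Alias wrappers too when allowAlias is true,
     * then returns the slot underneath if there is one.
     * allowAlias=false models the cast-only chain; allowAlias=true models the fix.
     */
    static Optional<Slot> unwrapToSlot(Expr expr, boolean allowAlias) {
        Expr current = expr;
        while (true) {
            if (current instanceof Cast) {
                current = ((Cast) current).child;
            } else if (allowAlias && current instanceof Alias) {
                current = ((Alias) current).child;
            } else {
                break;
            }
        }
        return current instanceof Slot ? Optional.of((Slot) current) : Optional.empty();
    }

    public static void main(String[] args) {
        Slot traceId = new Slot("msg.trace_id");
        // cast(msg.trace_id as string), where dot access left an Alias around the slot
        Expr dotAccess = new Cast(new Alias(traceId, "trace_id"), "string");
        // An alias over something that is not a slot must still be rejected
        Expr nonSlot = new Alias(new Literal("abc"), "x");

        System.out.println(unwrapToSlot(dotAccess, false).map(s -> s.name).orElse("<rejected>")); // <rejected>
        System.out.println(unwrapToSlot(dotAccess, true).map(s -> s.name).orElse("<rejected>"));  // msg.trace_id
        System.out.println(unwrapToSlot(nonSlot, true).map(s -> s.name).orElse("<rejected>"));    // <rejected>
    }
}
```

The loop treats Cast and Alias the same way because both are single-child wrappers that do not change which column is read, so they can be interleaved in any order without affecting the result.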
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/CheckMatchExpression.java | Allow Alias in the slot/cast unwrap chain for MATCH validation. |
| fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/CheckMatchExpressionTest.java | Add tests for alias/cast chains over variant subcolumns and non-slot alias rejection. |
| fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/VariantPruningLogicTest.java | Add end-to-end test for MATCH on dot variant subcolumn; refactor scan-node collection helpers. |
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: CIR-20398 reports that MATCH predicates fail for VARIANT dot subcolumn access such as cast(msg.trace_id as string), while the equivalent bracket access msg['trace_id'] works. Dot access can leave an Alias around the pruned subcolumn slot, and CheckMatchExpression rejected the aliased slot.
### Release note
Fix MATCH predicates on VARIANT dot subcolumn access such as msg.trace_id so they are accepted like equivalent bracket subcolumn access.
### Check List (For Author)
- Test: Unit Test
  - ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.VariantPruningLogicTest#testMatchOnDotVariantSubColumnUsesSlotRefInScanPredicate
  - ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.CheckMatchExpressionTest
- Behavior changed: Yes. MATCH validation now accepts alias/cast chains that resolve to a SlotReference, while still rejecting aliases over non-slot expressions and root VARIANT MATCH predicates.
- Does this need documentation: No
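The accept/reject behavior listed under "Behavior changed" can be summarized with a small decision sketch. Everything here is an assumption made for illustration: the Slot model (a type name plus a subcolumn path), the isRootVariant check, and the use of IllegalArgumentException in place of the Nereids AnalysisException are hypothetical and are not taken from the Doris source.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch only: hypothetical model of the MATCH left-operand validation outcome. */
class MatchLeftOperandCheckSketch {

    interface Expr {}

    /** Assumed slot model: a type name plus the variant subcolumn path (empty for a root column). */
    static final class Slot implements Expr {
        final String typeName;
        final List<String> subColumnPath;
        Slot(String typeName, List<String> subColumnPath) {
            this.typeName = typeName;
            this.subColumnPath = subColumnPath;
        }
        boolean isRootVariant() {
            return "VARIANT".equals(typeName) && subColumnPath.isEmpty();
        }
    }

    /** Models either a Cast or an Alias: one child, no change to the column being read. */
    static final class Wrapper implements Expr {
        final String kind;
        final Expr child;
        Wrapper(String kind, Expr child) { this.kind = kind; this.child = child; }
    }

    /** Any expression that is neither a slot nor a Cast/Alias wrapper, e.g. a function call. */
    static final class Other implements Expr {}

    /** Throws if the left operand does not resolve to a usable slot. */
    static void checkMatchLeft(Expr left) {
        Expr current = left;
        while (current instanceof Wrapper) {
            current = ((Wrapper) current).child;
        }
        if (!(current instanceof Slot)) {
            throw new IllegalArgumentException("MATCH left operand must resolve to a slot");
        }
        if (((Slot) current).isRootVariant()) {
            throw new IllegalArgumentException("MATCH on a root VARIANT column is rejected");
        }
    }

    static boolean accepted(Expr left) {
        try {
            checkMatchLeft(left);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Slot subColumn = new Slot("VARIANT", Arrays.asList("trace_id"));
        Slot rootVariant = new Slot("VARIANT", Collections.<String>emptyList());

        System.out.println(accepted(new Wrapper("alias", subColumn)));                      // true
        System.out.println(accepted(new Wrapper("cast", new Wrapper("alias", subColumn)))); // true
        System.out.println(accepted(new Wrapper("alias", new Other())));                    // false
        System.out.println(accepted(new Wrapper("alias", rootVariant)));                    // false
    }
}
```

The four cases correspond to the ones named in the description: an aliased subcolumn, an alias plus cast chain, an alias over a non-slot expression, and a root VARIANT column.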
Force-pushed from 18da2b7 to d0cfb94.
run buildall
/review
Review result: no blocking issues found.
Checkpoint conclusions:
- Goal/test: The PR fixes MATCH validation for VARIANT dot subcolumn access where the left side may retain an Alias. The implementation now unwraps Cast/Alias chains to validate the underlying SlotReference, and the added tests cover root VARIANT rejection, aliased subcolumn acceptance, alias+cast chains, non-slot alias rejection, and the scan predicate shape for dot subcolumns.
- Scope/focus: The change is small and localized to CheckMatchExpression plus targeted tests. No unrelated rewrite behavior was changed.
- Concurrency/lifecycle: Not applicable; this is a stateless FE rewrite validation rule and test refactor with no shared mutable state or lifecycle-sensitive objects.
- Configuration/compatibility: No new configs, protocol fields, persisted formats, or rolling-upgrade compatibility concerns.
- Parallel paths: The existing cast-unwrapping behavior is preserved while adding Alias unwrapping for the same validation path. Translator behavior already strips Alias during expression translation, so this remains consistent with downstream scan predicate construction.
- Error handling: Existing Nereids AnalysisException behavior is preserved. Invalid root VARIANT MATCH and aliases over non-slot expressions still fail.
- Test coverage: Coverage is appropriate for this focused fix. I attempted to run ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.CheckMatchExpressionTest locally, but the runner is missing thirdparty/installed/bin/protoc, so the FE UT could not start in this environment.
- Observability/performance: No new runtime observability is needed. The added validation loop remains trivial and only runs in rewrite analysis.
- User focus: No additional user-provided review focus was specified.
FE UT Coverage Report: Increment line coverage
TPC-H: Total hot run time: 31433 ms
TPC-DS: Total hot run time: 170928 ms
FE Regression Coverage Report: Increment line coverage
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: MATCH predicates fail for VARIANT dot subcolumn access such as cast(msg.trace_id as string), while the equivalent bracket access msg['trace_id'] works. Dot access can leave an Alias around the pruned subcolumn slot, and CheckMatchExpression rejected the aliased slot.
### Release note
Fix MATCH predicates on VARIANT dot subcolumn access such as msg.trace_id so they are accepted like equivalent bracket subcolumn access.
### Check List (For Author)
- cd fe && mvn clean checkstyle:check
- ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.VariantPruningLogicTest#testMatchOnDotVariantSubColumnUsesSlotRefInScanPredicate
- ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.CheckMatchExpressionTest