[fix](search) Add session variable to allow MATCH without index metadata on alias slots #60839
Open
airborne12 wants to merge 3 commits into apache:master from
… expressions

When Alias wraps non-SlotReference expressions like Cast(ElementAt(SlotRef, Literal)) (variant subcolumn access), toSlot() was losing originalTable, originalColumn, oneLevelTable, oneLevelColumn, and subPath metadata. This caused ExpressionTranslator.visitMatch() to crash with "SlotReference in Match failed to get Column" when MATCH was inside OR predicates (where pushdown through project doesn't happen).

Fix: Use getInputSlots() to find the unique underlying SlotReference through any expression wrapper depth. Also fixed pre-existing bug where oneLevelColumn parameter was incorrectly using getOriginalColumn() instead of getOneLevelColumn().
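The difference between the direct-child rule and the input-slot lookup described in this commit can be illustrated with a toy expression tree. This is only a sketch: `AliasSlotSketch`, `Expr`, `SlotRef`, `Node`, `toSlotDirect`, and `toSlotViaInputSlots` are hypothetical names invented for illustration and are not the Doris Nereids classes or their signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: these types are hypothetical stand-ins, not the Doris Nereids classes.
final class AliasSlotSketch {
    interface Expr {
        List<Expr> children();
    }

    // A column reference; originalColumn is null when the metadata is missing.
    static final class SlotRef implements Expr {
        final String name;
        final String originalColumn;

        SlotRef(String name, String originalColumn) {
            this.name = name;
            this.originalColumn = originalColumn;
        }

        @Override
        public List<Expr> children() {
            return Collections.emptyList();
        }
    }

    // Any wrapper or leaf that is not a slot, e.g. Cast, ElementAt, or a literal.
    static final class Node implements Expr {
        final String op;
        final List<Expr> children;

        Node(String op, Expr... children) {
            this.op = op;
            this.children = Arrays.asList(children);
        }

        @Override
        public List<Expr> children() {
            return children;
        }
    }

    // Direct-child rule: metadata survives only when the alias child is itself a SlotRef.
    static SlotRef toSlotDirect(String alias, Expr child) {
        String column = (child instanceof SlotRef) ? ((SlotRef) child).originalColumn : null;
        return new SlotRef(alias, column);
    }

    // Depth-first collection of every SlotRef under an expression.
    static void collectSlots(Expr expr, List<SlotRef> out) {
        if (expr instanceof SlotRef) {
            out.add((SlotRef) expr);
        }
        for (Expr child : expr.children()) {
            collectSlots(child, out);
        }
    }

    // Input-slot rule: inherit metadata only when exactly one SlotRef sits under the alias.
    static SlotRef toSlotViaInputSlots(String alias, Expr child) {
        List<SlotRef> slots = new ArrayList<>();
        collectSlots(child, slots);
        String column = slots.size() == 1 ? slots.get(0).originalColumn : null;
        return new SlotRef(alias, column);
    }

    public static void main(String[] args) {
        // Models Cast(ElementAt(SlotRef, Literal)) from the commit message.
        Expr child = new Node("Cast",
                new Node("ElementAt", new SlotRef("v", "v"), new Node("Literal")));
        System.out.println(toSlotDirect("a", child).originalColumn);
        System.out.println(toSlotViaInputSlots("a", child).originalColumn);
    }
}
```

In this toy model the direct-child rule yields a slot with no column metadata for the wrapped expression, while the input-slot rule recovers the column of the single underlying slot.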
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Member
Author
run buildall
TPC-H: Total hot run time: 28777 ms
TPC-DS: Total hot run time: 183883 ms
morrySnow
requested changes
Feb 26, 2026
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/Alias.java
Outdated
…as slots

Revert the Alias.toSlot() approach (searching child nodes for SlotReference violates MySQL protocol metadata semantics). Instead, fix the crash in ExpressionTranslator.visitMatch() by gracefully handling slots that lack originalColumn/originalTable metadata.

When MATCH references an alias output slot from a CTE/subquery project whose child is a non-SlotReference expression (e.g., Cast(ElementAt(...))), the slot lacks column metadata. This happens when MATCH is inside an OR predicate, preventing pushdown through the project.

Fix: When column or table metadata is missing, create the MatchPredicate with null invertedIndex (already supported by the constructor). The BE evaluates MATCH using the actual index metadata from the scan, so FE-side index info is not required for correctness. An explicit USING ANALYZER clause still requires column metadata for validation.
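The handling described in this commit might look roughly like the following. It is a minimal sketch under stated assumptions: `MatchTranslateSketch`, `Slot`, `MatchPredicate`, `lookupIndex`, and `translate` are hypothetical names, the index lookup is a placeholder, and none of this is the actual `ExpressionTranslator.visitMatch()` code.

```java
// Toy model only: hypothetical types, not the Doris ExpressionTranslator or MatchPredicate.
final class MatchTranslateSketch {
    // A slot whose column/table metadata may be missing (null).
    static final class Slot {
        final String name;
        final String originalTable;
        final String originalColumn;

        Slot(String name, String originalTable, String originalColumn) {
            this.name = name;
            this.originalTable = originalTable;
            this.originalColumn = originalColumn;
        }
    }

    // The translated predicate; invertedIndex may be null when FE-side index info is unknown.
    static final class MatchPredicate {
        final String slotName;
        final String invertedIndex;

        MatchPredicate(String slotName, String invertedIndex) {
            this.slotName = slotName;
            this.invertedIndex = invertedIndex;
        }
    }

    // Placeholder for an index lookup; the naming scheme here is an assumption.
    static String lookupIndex(String table, String column) {
        return table + "." + column + "_idx";
    }

    // explicitAnalyzer is null when the query has no USING ANALYZER clause.
    static MatchPredicate translate(Slot slot, String explicitAnalyzer) {
        boolean hasMetadata = slot.originalTable != null && slot.originalColumn != null;
        if (!hasMetadata) {
            if (explicitAnalyzer != null) {
                // An explicit analyzer still needs column metadata for validation.
                throw new IllegalStateException(
                        "USING ANALYZER requires column metadata for slot " + slot.name);
            }
            // No metadata: build the predicate without FE-side index info.
            return new MatchPredicate(slot.name, null);
        }
        return new MatchPredicate(slot.name, lookupIndex(slot.originalTable, slot.originalColumn));
    }

    public static void main(String[] args) {
        MatchPredicate p = translate(new Slot("alias_col", null, null), null);
        System.out.println(p.invertedIndex == null);
    }
}
```

The design choice sketched here is that a missing index reference is treated as a valid state rather than an error, with the one exception that an explicit analyzer cannot be validated without column metadata.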
Member
Author
run buildall
Member
Author
run buildall
TPC-H: Total hot run time: 28870 ms
TPC-DS: Total hot run time: 183411 ms
…ata on alias slots

When MATCH is inside an OR predicate with a LEFT JOIN, the optimizer cannot push MATCH down to the scan level. The remaining post-join filter references an alias output slot that lacks column metadata (because Alias.toSlot() only preserves metadata when child is a direct SlotReference, not for variant subcolumn access like Cast(ElementAt(...))).

This adds a new session variable `enable_match_without_index_check` (default false) that when set to true, allows MATCH predicates to fall back to function-based matching instead of throwing an error when inverted index metadata is unavailable on the slot.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
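The gating behavior of the session variable can be summarized as a small decision function. This is a sketch, not the PR's code: `MatchIndexCheckSketch`, `Strategy`, and `choose` are hypothetical names, and the exception text is an illustrative guess at the wording rather than the message Doris emits.

```java
// Toy model only: hypothetical names, not the Doris session-variable or translator code.
final class MatchIndexCheckSketch {
    enum Strategy { INDEX_MATCH, FUNCTION_MATCH }

    // hasIndexMetadata: whether the slot carries column/table metadata.
    // enableMatchWithoutIndexCheck: models the enable_match_without_index_check setting.
    static Strategy choose(boolean hasIndexMetadata, boolean enableMatchWithoutIndexCheck) {
        if (hasIndexMetadata) {
            return Strategy.INDEX_MATCH;
        }
        if (enableMatchWithoutIndexCheck) {
            // Opt-in fallback: evaluate MATCH as a function instead of failing.
            return Strategy.FUNCTION_MATCH;
        }
        // Default (variable is false): keep failing, but point at the workaround.
        throw new IllegalStateException(
                "SlotReference in Match failed to get Column; "
                + "consider setting enable_match_without_index_check = true");
    }

    public static void main(String[] args) {
        System.out.println(choose(true, false));
        System.out.println(choose(false, true));
    }
}
```

Because the default is false, existing queries keep their current behavior, and the fallback only applies to sessions that explicitly enable the variable.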
fbccbba to c578ff8
Member
Author
run buildall
TPC-H: Total hot run time: 28466 ms
TPC-DS: Total hot run time: 183419 ms
Contributor
FE UT Coverage Report
Increment line coverage
What problem does this PR solve?
Issue Number: close #xxx
Problem Summary:
When MATCH is inside an OR predicate with a JOIN (e.g., `(col MATCH_ALL 'hello' AND t2.id IS NOT NULL) OR col2 > 200`), the optimizer cannot fully push MATCH down to the scan level. The remaining post-join filter references an alias output slot that lacks column metadata, causing `ExpressionTranslator.visitMatch()` to crash with "SlotReference in Match failed to get Column".

Root cause:
`Alias.toSlot()` only preserves `originalColumn`/`originalTable` metadata when the alias child is a direct `SlotReference`. For variant subcolumn access (`Cast(ElementAt(SlotRef, Literal))`) or explicit `Cast(SlotRef)`, the metadata is lost.

Why OR triggers the bug:

Fix: Add a new session variable `enable_match_without_index_check` (default `false`). When set to `true`, MATCH predicates on slots without inverted index metadata will fall back to function-based matching instead of throwing an error. The error message now also suggests this workaround.

Release note
Add session variable `enable_match_without_index_check` to allow MATCH expressions on alias columns (from CTE/subquery with variant subcolumns) in OR predicates to fall back to function-based matching when inverted index metadata is unavailable.

Check List (For Author)
Test
Behavior changed:
`enable_match_without_index_check` (default false). When enabled, MATCH on alias slots without column metadata falls back to function-based matching instead of throwing an error.

Does this need documentation?
Check List (For Reviewer who merge this PR)