[fix](search) Fix implicit conjunction incorrectly modifying preceding term in lucene mode #60814
Merged
airborne12 merged 1 commit into apache:master on Feb 25, 2026
…g term in lucene mode

In Lucene's QueryParserBase.addClause(), only explicit CONJ_AND/CONJ_OR modify the preceding term's occur. Implicit conjunction (CONJ_NONE) only affects the current term via default_operator, without modifying the preceding term.

The FE SearchDslParser incorrectly treated implicit conjunction the same as explicit AND when default_operator=AND, causing hasExplicitAndBefore() to return true. This made queries like "a OR b c" with default_operator=AND produce SHOULD(a) MUST(b) MUST(c) instead of the correct SHOULD(a) SHOULD(b) MUST(c), diverging from ES behavior.

Fix: hasExplicitAndBefore() now returns false when no explicit AND token is found, regardless of default_operator. Only explicit AND tokens trigger the "introduced by AND" logic that modifies preceding terms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TPC-H: Total hot run time: 28906 ms
TPC-DS: Total hot run time: 184094 ms
zhiqiang-hhhh approved these changes on Feb 25, 2026
What problem does this PR solve?
Issue Number: close #DORIS-24545
Problem Summary:
In the `search()` function's lucene mode, queries with mixed explicit and implicit operators produce different results from Elasticsearch. For example: `"Sumer" OR Ptolemaic\ dynasty Limonene` with `default_operator=AND`.

Root cause: In Lucene's `QueryParserBase.addClause()`, only explicit `CONJ_AND`/`CONJ_OR` modify the preceding term's occur. Implicit conjunction (`CONJ_NONE`, i.e., space-separated terms without an explicit operator) only affects the current term via `default_operator`, without modifying the preceding term.

The FE `SearchDslParser.hasExplicitAndBefore()` incorrectly returned `true` (based on `default_operator`) when no explicit AND token was found. This caused implicit conjunction to be treated identically to explicit AND, making it modify the preceding term's occur and diverge from Lucene/ES semantics.

Example of the bug:
For `a OR b c` with `default_operator=AND`:
- Before the fix: `SHOULD(a) MUST(b) MUST(c)` (wrong: the implicit space before `c` incorrectly upgraded `b` from SHOULD to MUST)
- After the fix: `SHOULD(a) SHOULD(b) MUST(c)` (correct, matches ES behavior: only `c` gets MUST from `default_operator`, while `b` retains SHOULD from the preceding OR)

Fix:
`hasExplicitAndBefore()` now returns `false` when no explicit AND token is found, regardless of `default_operator`. Only explicit AND tokens trigger the "introduced by AND" logic that modifies preceding terms.

Release note
Fix search() lucene mode producing incorrect results when queries mix explicit operators (OR/AND) with implicit conjunction (space-separated terms).
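
To make the root cause above concrete, here is a minimal, self-contained sketch of the occur-assignment rules as this PR describes them. It is not the Doris `SearchDslParser` code and not Lucene's actual `addClause()` implementation: the class `OccurSketch`, the method `assign`, and the `implicitActsAsAnd` flag are hypothetical names introduced only for illustration. Modifiers (`+`, `-`, `NOT`), phrases, and escaping are deliberately left out, and terms are assumed to be separated by whitespace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: a simplified model of Lucene-style
// conjunction handling, not the actual Doris or Lucene source.
final class OccurSketch {
    enum Conj { NONE, AND, OR }
    enum Occur { MUST, SHOULD }

    /**
     * Assigns an occur to each whitespace-separated term.
     *
     * @param defaultAnd        true when default_operator=AND
     * @param implicitActsAsAnd true reproduces the pre-fix behavior, where an
     *                          implicit conjunction is treated like explicit AND
     */
    static String assign(String query, boolean defaultAnd, boolean implicitActsAsAnd) {
        List<String> terms = new ArrayList<>();
        List<Occur> occurs = new ArrayList<>();
        Conj pending = Conj.NONE;

        for (String tok : query.trim().split("\\s+")) {
            if (tok.equals("AND")) { pending = Conj.AND; continue; }
            if (tok.equals("OR"))  { pending = Conj.OR;  continue; }

            Conj conj = pending;
            pending = Conj.NONE;

            // Pre-fix behavior: implicit conjunction masquerades as explicit AND.
            if (implicitActsAsAnd && defaultAnd && conj == Conj.NONE) {
                conj = Conj.AND;
            }

            int last = occurs.size() - 1;
            // Only an explicit AND upgrades the preceding term to MUST.
            if (last >= 0 && conj == Conj.AND) {
                occurs.set(last, Occur.MUST);
            }
            // With default AND, an explicit OR downgrades the preceding term.
            if (last >= 0 && defaultAnd && conj == Conj.OR) {
                occurs.set(last, Occur.SHOULD);
            }

            // The current term: required unless introduced by OR (default AND),
            // or only when introduced by AND (default OR).
            boolean required = defaultAnd ? conj != Conj.OR : conj == Conj.AND;
            terms.add(tok);
            occurs.add(required ? Occur.MUST : Occur.SHOULD);
        }

        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < terms.size(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(occurs.get(i)).append('(').append(terms.get(i)).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Post-fix semantics: prints SHOULD(a) SHOULD(b) MUST(c)
        System.out.println(assign("a OR b c", true, false));
        // Pre-fix semantics: prints SHOULD(a) MUST(b) MUST(c)
        System.out.println(assign("a OR b c", true, true));
    }
}
```

The two calls in `main` differ only in the `implicitActsAsAnd` flag, which isolates the single decision this PR changes: whether a space-separated term is allowed to rewrite the occur of the term before it.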