[fix](variant) Bind Variant search to nested indexes #63660
Draft
eldenmoon wants to merge 2 commits into
Split Variant inverted-index search helpers out of function_search and bind scalar/nested Variant search paths through the segment reader path. Preserve null bitmap behavior for empty bitset query results and add focused BE UT coverage for nested search binding and bitset null handling.
26cbef3 to 8310d28
Member
Author
run buildall
Member
Author
/review
Contributor
Review completed. I did not find blocking correctness issues in this PR. The split of Variant-specific search binding into variant_inverted_index_search keeps the generic search path clearer, and the updated resolver/nested mapping paths appear consistent with the segment reader and query_v2 null-bitmap semantics.
Critical checkpoint conclusions:
- Goal/test coverage: The PR addresses Variant scalar and nested inverted-index binding, including direct/inherited subcolumn index selection and nested child-to-parent bitmap mapping. Focused BE UT coverage was added for direct binding, nested mapping, and empty truth bitmap with null bitmap preservation.
- Scope/clarity: The change is relatively large but mostly isolates Variant-specific logic into new files and keeps storage-side changes focused on candidate index diagnostics/type inference.
- Concurrency/lifecycle: No new shared mutable state requiring additional locking was found. Variant reader lifetime for nested-group iterators is explicitly retained via ReaderOwnedColumnIterator, and existing ColumnReader/segment call-once patterns are preserved.
- Configuration/compatibility: No new configs or storage/protocol format changes were introduced. Existing nested-group provider gating remains in place.
- Parallel paths: Both normal search and top-level nested search paths use the new FieldReaderResolver context, and storage iterator discovery handles direct and inherited Variant indexes.
- Error handling: Status-returning paths are checked; exceptions raised inside query_v2 scorer construction are still under the existing VExprContext RETURN_IF_CATCH_EXCEPTION boundary.
- Data correctness: The true/null bitmap handling for missing Variant leaves and BitSetQuery preserves three-valued logic before final WHERE masking. Nested child hits are mapped back through the active nested-group chain before row-level filtering.
- Observability: Added diagnostics are capped and routed through existing inverted-index stats/profile reporting.
- Performance: No obvious hot-path regression beyond bounded diagnostics and necessary reader resolution was found.
User focus: No additional user-provided review focus was present.
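The nested child-to-parent mapping mentioned under "Data correctness" can be illustrated with a minimal sketch. It is not the PR's implementation: the function name, the use of sorted vectors instead of bitmaps, and the offsets layout (offsets[i] is the exclusive end child index of parent row i) are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper, not taken from the PR.
// Assumption: offsets[i] is the exclusive end child index of parent row i,
// so row i owns the children in [offsets[i - 1], offsets[i]).
std::vector<uint32_t> map_child_hits_to_parent_rows(
        const std::vector<uint32_t>& child_hits, // sorted child element ids
        const std::vector<uint32_t>& offsets) {
    std::vector<uint32_t> parent_rows;
    for (uint32_t child : child_hits) {
        // The first row whose end offset is greater than the child id owns it.
        auto it = std::upper_bound(offsets.begin(), offsets.end(), child);
        if (it == offsets.end()) {
            continue; // child id beyond the last row: ignored in this sketch
        }
        auto row = static_cast<uint32_t>(it - offsets.begin());
        // Several hits inside one row collapse into a single parent hit.
        if (parent_rows.empty() || parent_rows.back() != row) {
            parent_rows.push_back(row);
        }
    }
    return parent_rows;
}
```

For a chain of nested groups, the same mapping could be applied once per level, innermost first, since the output is again a sorted list of ids.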
Contributor
TPC-H: Total hot run time: 32227 ms
Contributor
TPC-DS: Total hot run time: 172787 ms
Contributor
BE Regression && UT Coverage Report
Increment line coverage / Increment coverage report
Member
Author
run buildall
Contributor
TPC-H: Total hot run time: 32450 ms
Contributor
TPC-DS: Total hot run time: 172103 ms
Member
Author
run buildall
Contributor
TPC-H: Total hot run time: 31264 ms
Contributor
TPC-DS: Total hot run time: 172213 ms
What Problem Does This PR Solve?
This PR fixes Variant inverted-index search binding for scalar Variant paths and nested Variant paths.
Before this change, Variant search had several correctness gaps in how a logical search field was bound to the physical segment/index structures:
- BitSetQuery treated an empty truth bitmap as an empty scorer even when a null bitmap was present, which dropped the null-bitmap semantics that Variant search predicates need.
What Changed
- Split Variant inverted-index search helpers out of function_search into variant_inverted_index_search.{h,cpp} so the generic search function is smaller and Variant binding logic is isolated.
- Updated FieldReaderResolver to track whether a field is actually bound, missing in the segment, or handled through a direct Variant index reader.
- Preserved null-bitmap behavior in BitSetQuery/BitSetWeight when the truth bitmap is empty but the null bitmap is not.
Testing
Ran focused Variant-related BE unit tests:
./run-be-ut.sh --run --filter='*Variant*:FunctionSearchTest.TestBuildLeafQueryDirectUnknownClauseUsesLeafMapper:FunctionSearchNestedTest.*:BitSetQueryTest.EmptyTruthBitmapPreservesNullBitmap'
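The case targeted by BitSetQueryTest.EmptyTruthBitmapPreservesNullBitmap can be illustrated with a small model of three-valued logic. This is a sketch under stated assumptions, not the Doris code: TriResult, negate, and where_mask are hypothetical names, and std::set stands in for the real bitmaps.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <set>

// Hypothetical model of a predicate result under three-valued logic:
// rows in `truth` are TRUE, rows in `nulls` are NULL, all other rows are FALSE.
struct TriResult {
    std::set<uint32_t> truth;
    std::set<uint32_t> nulls;
};

// NOT keeps NULL rows NULL and flips TRUE/FALSE among the remaining rows.
TriResult negate(const TriResult& in, uint32_t num_rows) {
    TriResult out;
    out.nulls = in.nulls;
    for (uint32_t row = 0; row < num_rows; ++row) {
        if (in.truth.count(row) == 0 && in.nulls.count(row) == 0) {
            out.truth.insert(row);
        }
    }
    return out;
}

// Final WHERE masking: only TRUE rows survive; NULL rows are filtered out.
std::set<uint32_t> where_mask(const TriResult& r) {
    std::set<uint32_t> rows;
    std::set_difference(r.truth.begin(), r.truth.end(),
                        r.nulls.begin(), r.nulls.end(),
                        std::inserter(rows, rows.end()));
    return rows;
}
```

In this model, take four rows with an empty truth set and nulls {1, 3}. Negating the result with the null set kept selects rows {0, 2}. If the empty truth set had caused the null set to be discarded, the same negation would select all four rows, wrongly including the two NULL rows.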