branch-4.1: [fix](inverted index) resolve variant sub-column indexes for score() #62992 #63078

Merged
yiguolei merged 1 commit into branch-4.1 from auto-pick-62992-branch-4.1 on May 9, 2026

@github-actions github-actions Bot commented May 8, 2026

Cherry-picked from #62992

…62992)

### What problem does this PR solve?

Issue Number: N/A

Related PR: 

Problem Summary:

Fix `score()` query failing on variant sub-columns with:

```text
Index statistics collection failed: Score query is not supported without inverted index for column=<variant.subcolumn>
```

`MatchPredicateCollector::collect` previously used
`TabletSchema::inverted_indexs(int32_t col_unique_id, const std::string&
suffix_path)`, which only consults `_col_id_suffix_to_index`. Variant
sub-column indexes can also live in:

1. `_path_set_info_map` (`subcolumn_indexes` / `typed_path_set`), when
the parent variant index is inherited by sub-columns or a typed path
index is materialized.
2. `_index_by_unique_id_with_pattern`, when the inverted index is
created on the parent variant column with
`PROPERTIES("field_pattern"="...")`.
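The gap described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToySchema`, `old_lookup`, and `wider_lookup` are hypothetical names, the maps are simplified stand-ins for `_col_id_suffix_to_index` and `_path_set_info_map`, and the `field_pattern` case is not modeled here. It is not the actual `TabletSchema` code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical toy schema; the real TabletSchema stores richer metadata.
struct ToySchema {
    using Key = std::pair<int32_t, std::string>;  // (column unique id, sub-column path)
    // Stand-in for the only map the old lookup consulted.
    std::map<Key, std::string> by_col_id_suffix;
    // Stand-in for inherited / typed-path sub-column indexes.
    std::map<Key, std::string> by_path_set;
};

// Mirrors the old behaviour: consult only the (id, suffix) map.
std::optional<std::string> old_lookup(const ToySchema& s, int32_t uid,
                                      const std::string& path) {
    auto it = s.by_col_id_suffix.find(std::make_pair(uid, path));
    if (it == s.by_col_id_suffix.end()) return std::nullopt;
    return it->second;
}

// Wider lookup: fall back to the path-set map when the first map misses.
std::optional<std::string> wider_lookup(const ToySchema& s, int32_t uid,
                                        const std::string& path) {
    assert(!path.empty());  // a sub-column lookup needs a non-empty path
    if (auto hit = old_lookup(s, uid, path)) return hit;
    auto it = s.by_path_set.find(std::make_pair(uid, path));
    if (it == s.by_path_set.end()) return std::nullopt;
    return it->second;
}
```

In this model, an index registered only in `by_path_set` is invisible to `old_lookup`, which corresponds to the situation that produced the "not supported without inverted index" error.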

This PR fixes both paths:

- Use the column-aware `TabletSchema::inverted_indexs(const
TabletColumn&)` lookup so inherited / typed-path variant sub-column
indexes are resolved.
- Reuse the parent variant sub-column pattern matching logic from
`variant_util` to resolve `MATCH_NAME` / `MATCH_NAME_GLOB` templates,
then look up `inverted_index_by_field_pattern(parent_uid,
matched_pattern)`.
- Clone matched field-pattern index metadata and set the actual Lucene
field suffix so score collection uses keys such as
`<parent_uid>.<parent_col>.user.name`, while
`CollectInfo::owned_index_meta` keeps the cloned metadata alive.
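The pattern-matching and cloning steps can be sketched roughly as follows. All names here (`ToyIndexMeta`, `simple_glob_match`, `resolve_by_pattern`, `field_key`) are hypothetical, and the matcher only understands `*`; the real `variant_util` logic and its `MATCH_NAME` / `MATCH_NAME_GLOB` handling may differ. The sketch treats an exact pattern as a glob without wildcards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified index metadata; not the real Doris class.
struct ToyIndexMeta {
    std::string index_name;
    std::string field_pattern;  // e.g. "host" or "user.*"
    std::string suffix_path;    // concrete field suffix; empty on the template
};

// Minimal wildcard match where '*' matches any run of characters.
// Assumption: real glob semantics may be richer than this.
bool simple_glob_match(const std::string& pattern, const std::string& text) {
    std::size_t p = 0, t = 0, mark = 0;
    std::size_t star = std::string::npos;
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;  // remember the star and try matching zero characters
            mark = t;
        } else if (p < pattern.size() && pattern[p] == text[t]) {
            ++p;
            ++t;
        } else if (star != std::string::npos) {
            p = star + 1;  // backtrack: let the star absorb one more character
            t = ++mark;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*') ++p;
    return p == pattern.size();
}

// Find the first template whose pattern matches the sub-column path, clone it,
// and point the clone at the concrete path. The caller owns the returned copy.
std::optional<ToyIndexMeta> resolve_by_pattern(const std::vector<ToyIndexMeta>& templates,
                                               const std::string& sub_path) {
    for (const auto& tmpl : templates) {
        if (simple_glob_match(tmpl.field_pattern, sub_path)) {
            ToyIndexMeta owned = tmpl;     // clone; the template stays untouched
            owned.suffix_path = sub_path;  // actual field suffix, not the pattern
            return owned;
        }
    }
    return std::nullopt;
}

// Build a key shaped like <parent_uid>.<parent_col>.<sub_path>.
std::string field_key(int32_t parent_uid, const std::string& parent_col,
                      const ToyIndexMeta& meta) {
    assert(!meta.suffix_path.empty());
    return std::to_string(parent_uid) + "." + parent_col + "." + meta.suffix_path;
}
```

Returning the clone by value plays the role that `CollectInfo::owned_index_meta` has in the PR: the adjusted metadata needs an owner that outlives score collection, while the shared template keeps its original pattern.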

Behaviour after this PR:

| Scenario | Behaviour |
| --- | --- |
| Variant sub-column index is inherited or materialized in `_path_set_info_map` | Schema lookup succeeds through `inverted_indexs(const TabletColumn&)`; score collection uses the real sub-column Lucene field. |
| Parent variant index uses exact `field_pattern`, e.g. `host` | Pattern lookup resolves the index and score collection uses `<parent_uid>.<parent_col>.host`. |
| Parent variant index uses glob `field_pattern`, e.g. `user.*`, and slot path is `user.name` | Parent template matching returns `user.*`; index lookup uses the matched pattern; score collection uses `<parent_uid>.<parent_col>.user.name`. |
| No index matches the sub-column | Existing unsupported-index error is preserved; BM25 score still requires an inverted index. |

### Release note

Fix `score()` queries on variant sub-columns whose inverted index is
inherited, typed-path based, or defined through `field_pattern`.

github-actions Bot requested a review from yiguolei as a code owner May 8, 2026 09:17

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hello-stephen
Contributor

run buildall

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 17.14% (6/35) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
| --- | --- |
| Function Coverage | 71.55% (26387/36878) |
| Line Coverage | 54.53% (279682/512849) |
| Region Coverage | 51.91% (232854/448541) |
| Branch Coverage | 53.30% (100680/188906) |

@yiguolei yiguolei merged commit 21b213e into branch-4.1 May 9, 2026
28 of 32 checks passed