branch-4.1: [feature](score) support BM25 scoring in inverted index query_v2 #59847 #61472
Force-pushed from 4a5d6d0 to 094e120.
…M25 scoring

1. Update contrib/clucene from 8b57674 to c51b5cc to include:
   - ac9475a: block max WAND algorithm with BM25 similarity
   - aef5c9c: Fix GCC -Werror compilation errors
   - c51b5cc: Fix GCC -Werror=overloaded-virtual in FieldForMerge
   Required for the readBlock, getMaxBlockFreq, getMaxBlockNorm and getLastDocInBlock APIs used by segment_postings.h.
2. Fix doc_set_collector: when context.readers is empty (e.g., AllQuery for MATCH_ALL_DOCS), for_each_index_segment returns immediately without iterating. Added a fallback that creates the scorer directly from the weight, which allows AllScorer to work without an IndexReader.
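The doc_set_collector fix described above can be sketched roughly as follows. This is a minimal illustration, not the Doris code: `Reader`, `Scorer`, `Weight`, `Context`, `for_each_segment` and `collect_scorers` are hypothetical stand-ins, and the real signatures may differ. It only shows the shape of the fallback: if the per-segment loop never runs because there are no readers, a scorer is built directly from the weight.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; not the Doris classes.
struct Reader {
    int num_docs = 0;
};

struct Scorer {
    int num_docs = 0;  // how many docs this scorer would cover
};

struct Weight {
    // Builds a scorer. `reader` may be null, which models a query
    // (such as a match-all query) that needs no index reader.
    std::unique_ptr<Scorer> scorer(const Reader* reader) const {
        auto s = std::make_unique<Scorer>();
        s->num_docs = (reader != nullptr) ? reader->num_docs : 0;
        return s;
    }
};

struct Context {
    std::vector<Reader> readers;
};

// Invokes `fn` once per reader; returns immediately when there are none.
inline void for_each_segment(const Context& ctx,
                             const std::function<void(const Reader&)>& fn) {
    for (const Reader& r : ctx.readers) {
        fn(r);
    }
}

inline std::vector<std::unique_ptr<Scorer>> collect_scorers(const Context& ctx,
                                                            const Weight& weight) {
    std::vector<std::unique_ptr<Scorer>> scorers;
    for_each_segment(ctx, [&](const Reader& r) {
        scorers.push_back(weight.scorer(&r));
    });
    // Fallback: with no readers the loop above never ran, so create the
    // scorer directly from the weight instead of returning nothing.
    if (ctx.readers.empty()) {
        scorers.push_back(weight.scorer(nullptr));
    }
    assert(!scorers.empty());
    return scorers;
}
```

With an empty `Context`, `collect_scorers` yields exactly one scorer built without a reader; with two readers it yields two, one per segment.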
Force-pushed from 46d977d to 6cb69d6.
The branch-4.0 stub (static, returns false) conflicts with the real implementation in the anonymous namespace added by the BM25 scoring PR. On branch-4.1 the real implementation is correct, so remove the stub to fix the ambiguous call compilation error.
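For readers unfamiliar with this kind of conflict, here is a minimal sketch of why such a stub produces an ambiguous call. The function name `has_scoring_support` and its signature are hypothetical, since the commit message does not name the function. The stub is shown only in a comment so that the snippet compiles; re-enabling it next to the anonymous-namespace definition makes the unqualified call in `caller` ambiguous, because both declarations are visible from the enclosing scope with the same signature.

```cpp
#include <cassert>

// Hypothetical stub, as it might appear after a branch-4.0 pick.
// Uncommenting it makes the call in caller() fail to compile with an
// "ambiguous call" error, since it has the same signature as the
// function in the anonymous namespace below:
//
//   static bool has_scoring_support(int /*segment_count*/) { return false; }

namespace {
// Stand-in for the real implementation in the anonymous namespace.
bool has_scoring_support(int segment_count) {
    return segment_count > 0;
}
}  // namespace

// Caller outside the anonymous namespace: unqualified lookup sees every
// matching declaration visible from the enclosing scope.
inline bool caller(int segment_count) {
    return has_scoring_support(segment_count);
}
```

Removing the stub, as this commit does, leaves a single candidate and the call resolves to the real implementation.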
Proposed changes
Cherry-pick #59847 to branch-4.1.
Original PR: #59847
Further comments
Resolved cherry-pick conflicts (minor differences from branch-4.0 pick):
- IndexQueryContextPtr member and set/get_index_query_context to IndexExecContext
- evaluate_inverted_index_with_search_param with index_query_context parameter
- enable_inverted_index_wand_query to thrift and SessionVariable (field 203)
- single_backend_query (field 202) intact