[opt](be) Optimize row store lookup to support batch row reading per segment#63434
[opt](be) Optimize row store lookup to support batch row reading per segment#63434HappenLee wants to merge 1 commit into
Conversation
…segment Refactor `lookup_row_data` to accept multiple row IDs for the same segment, reading them in a single batch via `ColumnString` instead of one-by-one with per-row `std::string` buffers. This reduces the per-row overhead of column iterator calls and string copy for multi-row fetch scenarios (e.g. batch point queries and index lookup). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
|
/review |
There was a problem hiding this comment.
I found one blocking correctness issue in the row-store batch lookup path. The PR goal is clear and focused, but the new batched call changes a per-row lookup into ColumnIterator::read_by_rowids() without preserving that iterator's ordering assumptions. Tests are also missing for unordered row ids, which is the scenario that exposes this regression.
Critical checkpoint conclusions: goal is to optimize row-store lookup by batching row ids per segment, but the current implementation does not safely accomplish it for arbitrary request order; scope is small and focused; no new concurrency, lifecycle, config, persistence, or FE/BE protocol compatibility concerns found; no transaction/delete-bitmap changes; rowset/segment lifetimes follow existing shared_ptr/cache-handle patterns; error handling mostly follows existing style; observability is unchanged and sufficient for this refactor; performance intent is good, but it must not rely on unsorted row ids; test coverage is insufficient because no case covers non-monotonic row ids in the batch materialization path.
User focus points: no additional user-provided review focus.
| RETURN_IF_ERROR(scope_timer_run( | ||
| [&]() { | ||
| return tablet->lookup_row_data({}, segment_id, row_ids, rowset, stats, | ||
| *row_store_rows); |
There was a problem hiding this comment.
This batched row-store read can return wrong rows when row_ids are not monotonically increasing. read_batch_doris_format_row() groups only contiguous equal file_ids from the request; it does not sort by row id. The request is built in result-row order (MaterializationSharedState::create_muiltget_result() appends row_location.row_id as rows arrive), so a batch for the same segment can be in arbitrary order such as [100, 1]. FileColumnIterator::read_by_rowids() seeks to rowids[total_read_count] and then advances within the current page, so it assumes the remaining row ids are ordered within/after that page; the old per-row loop did not have this assumption. Please either preserve the old one-by-one path for unordered input, or sort (row_id, original_index) for the iterator and restore the original output order before jsonb_to_block().
|
PR approved by at least one committer and no changes requested. |
TPC-H: Total hot run time: 31506 ms |
TPC-DS: Total hot run time: 176469 ms |
ClickBench: Total hot run time: 24.64 s |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
What problem does this PR solve?
Refactor
lookup_row_datato accept multiple row IDs for the same segment, reading them in a single batch viaColumnStringinstead of one-by-one with per-rowstd::stringbuffers. This reduces the per-row overhead of column iterator calls and string copy for multi-row fetch scenarios (e.g. batch point queries and index lookup).None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)