[opt](be) Batch row_id reads in seek_and_read_by_rowid to reduce column iterator overhead#63436
[opt](be) Batch row_id reads in seek_and_read_by_rowid to reduce column iterator overhead#63436HappenLee wants to merge 2 commits into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
…mn iterator overhead
Change seek_and_read_by_rowid to accept a batch of row_ids instead of a single
row_id, allowing the underlying column iterator's read_by_rowids to process all
rows in one call. This eliminates per-row iterator re-initialization overhead
in multi-row fetch paths (point query, batch index lookup).
|
run buildall |
TPC-H: Total hot run time: 31264 ms |
TPC-DS: Total hot run time: 168451 ms |
|
/review |
There was a problem hiding this comment.
Review completed. I did not find blocking correctness issues in the PR diff.
Critical checkpoint conclusions:
- Goal/test: The code changes the row-id reader API to accept sorted batches and updates both batch index lookup and point query callers. Existing tests may cover some paths, but this is not purely refactor-only behavior because batching, sorting, deduplication, and scatter-back logic changed; no new test evidence is provided.
- Scope: The PR diff is small and focused on batched row-id column reads.
- Concurrency: No new shared mutable concurrency is introduced in the internal Doris-format path. External row-id fetch concurrency was not changed by this PR.
- Lifecycle/static initialization: No special lifecycle or static initialization risk found.
- Config/compatibility: No config, storage format, or FE-BE protocol compatibility change found.
- Parallel code paths: The direct multi-get, batch index lookup, and point query callers were updated to the new API.
- Conditions/invariants: The new sorted/unique row_id precondition is enforced at Segment entry; current PR callers satisfy it.
- Test coverage: No tests were run by me, and the PR declares no testing. This remains the main residual risk for the new scatter/dedup behavior.
- Observability/transactions/data writes: Not applicable.
- Performance: The batching direction removes per-row iterator calls and appears aligned with the PR goal.
User focus: No additional user-provided review focus was present.
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
### What problem does this PR solve? Issue Number: N/A Problem Summary: The latest commit introduced batch row_id reading in the v2 multiget path (read_batch_doris_format_row / read_batch_external_row). The scatter phase used Block::get_columns() to obtain raw IColumn* pointers, but get_columns() returns a temporary std::vector<ColumnPtr> that is destroyed at the end of the expression. For columns like ColumnMap, convert_to_full_column_if_const() allocates a new object into that temporary vector; the raw pointer extracted via .get() immediately becomes dangling, causing an ASAN heap-use-after-free at rowid_fetcher.cpp:750 when insert_from_multi_column() dereferences it. Fix: - Replace both scatter loops with a shared helper scatter_scan_blocks_to_result_block() that uses Block::get_by_position(i).column.get() for stable pointer access (the column is owned by the Block itself, not a temporary vector). - The commit also inadvertently changed the v1 (PMultiGetRequest) path and point_query_executor.cpp to wrap single row_ids in vectors and call the batch overload. Revert those callers to pass a single-element vector to the single vector overload instead, keeping the v1 path behaviour unchanged. - Remove the redundant uint32_t single-row overload of seek_and_read_by_rowid; callers that read one row now pass a one-element std::vector<rowid_t>. ### Release note None ### Check List (For Author) - Test: Manual ASAN build compilation verified (./build.sh --be -j 64) - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/review |
|
run buildall |
There was a problem hiding this comment.
Automated review completed. I did not find any critical blocking issues in the changed code.
Checkpoint conclusions:
- Goal/test: The change batches internal Doris row-id column reads per segment, preserves output order via scatter mapping, and includes a manual ASAN build note. I did not see new automated coverage for interleaved same-segment row ids, duplicate row ids, row-store fetch, variant columns, or the external scatter helper reuse, so residual risk is mainly test coverage.
- Scope: The implementation is focused on
seek_and_read_by_rowidbatching and the two multiget call paths. - Concurrency: No new shared mutable state was introduced in the internal path. The external path keeps the existing scan-task concurrency model; the helper is called after producer completion, so no new concurrent mutation of result columns was found.
- Lifecycle: Segment, rowset, iterator, and scan block lifetimes appear preserved. The previous temporary
get_columns()pointer hazard is avoided by usingget_by_position(). - Configuration/compatibility: No new configs, storage formats, RPC fields, or FE-BE protocol compatibility concerns found.
- Parallel paths: Single-row point query and v1 multiget callers were updated to the vector signature; v2 internal batching and existing external scatter both use the shared helper.
- Conditions/invariants: The strict sorted/no-duplicate row-id contract is enforced before calling column iterators; batching code sorts and deduplicates before calling it.
- Data correctness: The scatter map restores original request order after sorted segment reads, including duplicate row ids within a segment. No version/delete-bitmap behavior was changed.
- Observability/performance: Existing stats/logging are retained. The main tradeoff is temporary per-segment scan blocks before scattering; no correctness issue found.
- User focus: No additional user-provided review focus was present.
No inline comments submitted.
TPC-H: Total hot run time: 31139 ms |
TPC-DS: Total hot run time: 169382 ms |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
What problem does this PR solve?
Change seek_and_read_by_rowid to accept a batch of row_ids instead of a single row_id, allowing the underlying column iterator's read_by_rowids to process all rows in one call. This eliminates per-row iterator re-initialization overhead in multi-row fetch paths (point query, batch index lookup).
about 10% speed up
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?