[opt](be) Batch row_id reads in seek_and_read_by_rowid to reduce column iterator overhead #63436

Open
HappenLee wants to merge 2 commits into apache:master from HappenLee:function

@HappenLee HappenLee commented May 20, 2026

What problem does this PR solve?

Change seek_and_read_by_rowid to accept a batch of row_ids instead of a single row_id, allowing the underlying column iterator's read_by_rowids to process all rows in one call. This eliminates per-row iterator re-initialization overhead in multi-row fetch paths (point query, batch index lookup).

This yields a speedup of about 10%.
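
As a rough illustration of the change, the sketch below contrasts one iterator call per row id with a single batched call. `ToyColumnIterator`, its `init_calls` counter, and the two free functions are hypothetical stand-ins, not Doris code; only the method name `read_by_rowids` comes from the description above, and the signature used here is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using rowid_t = uint32_t;

// Hypothetical stand-in for a column iterator; not the Doris class.
// Every read_by_rowids() call pays a fixed setup cost, modelled by init_calls.
struct ToyColumnIterator {
    explicit ToyColumnIterator(const std::vector<int>* d) : data(d) {}

    // Assumed signature: reads `count` rows identified by `rowids` into `out`.
    void read_by_rowids(const rowid_t* rowids, size_t count, std::vector<int>* out) {
        ++init_calls;  // per-call setup such as seeking or page lookup
        for (size_t i = 0; i < count; ++i) {
            out->push_back((*data)[rowids[i]]);
        }
    }

    const std::vector<int>* data;  // column values indexed by row id
    size_t init_calls = 0;
};

// Old shape: one iterator call per row id.
inline std::vector<int> read_one_by_one(ToyColumnIterator& it,
                                        const std::vector<rowid_t>& ids) {
    std::vector<int> out;
    for (rowid_t id : ids) {
        it.read_by_rowids(&id, 1, &out);
    }
    return out;
}

// New shape: the whole batch goes through a single iterator call.
inline std::vector<int> read_batched(ToyColumnIterator& it,
                                     const std::vector<rowid_t>& ids) {
    std::vector<int> out;
    out.reserve(ids.size());
    it.read_by_rowids(ids.data(), ids.size(), &out);
    return out;
}
```

Both functions return the same values; the batched form pays the modelled setup cost once rather than once per row.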

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@HappenLee
Contributor Author

run buildall

BiteTheDDDDt previously approved these changes May 20, 2026
Contributor

@BiteTheDDDDt BiteTheDDDDt left a comment

LGTM

@github-actions github-actions Bot added the "approved" label (indicates a PR has been approved by one committer) May 20, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

…mn iterator overhead

    Change seek_and_read_by_rowid to accept a batch of row_ids instead of a single
    row_id, allowing the underlying column iterator's read_by_rowids to process all
    rows in one call. This eliminates per-row iterator re-initialization overhead
    in multi-row fetch paths (point query, batch index lookup).
@HappenLee
Contributor Author

run buildall

@github-actions github-actions Bot removed the "approved" label (indicates a PR has been approved by one committer) May 20, 2026
@hello-stephen
Contributor

TPC-H: Total hot run time: 31264 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e0a41d9a131dd086ba57fd89e37bad2070d8d8a4, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17884	3860	3863	3860
q2	q3	10774	1398	792	792
q4	4704	477	349	349
q5	7872	2311	2098	2098
q6	326	173	144	144
q7	942	764	635	635
q8	9358	1778	1622	1622
q9	7045	4924	4909	4909
q10	6466	2150	1828	1828
q11	433	275	242	242
q12	699	440	296	296
q13	18225	3496	2824	2824
q14	259	255	234	234
q15	q16	823	776	702	702
q17	1011	1010	998	998
q18	6918	5764	5443	5443
q19	1203	1539	1073	1073
q20	544	429	270	270
q21	5867	2770	2628	2628
q22	446	368	317	317
Total cold run time: 101799 ms
Total hot run time: 31264 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4571	4498	4741	4498
q2	q3	4839	5237	4594	4594
q4	2130	2197	1400	1400
q5	4807	4785	4720	4720
q6	228	184	133	133
q7	1837	1768	1550	1550
q8	2370	1925	1897	1897
q9	7257	7317	7191	7191
q10	4490	4403	3989	3989
q11	531	383	352	352
q12	714	721	511	511
q13	3054	3353	2767	2767
q14	275	273	247	247
q15	q16	677	703	615	615
q17	1267	1238	1238	1238
q18	7650	6937	6968	6937
q19	1103	1101	1080	1080
q20	2212	2204	1922	1922
q21	5328	4613	4446	4446
q22	514	456	394	394
Total cold run time: 55854 ms
Total hot run time: 50481 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 168451 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e0a41d9a131dd086ba57fd89e37bad2070d8d8a4, data reload: false

query5	4338	644	522	522
query6	333	219	203	203
query7	4239	578	329	329
query8	326	227	219	219
query9	8819	4045	4015	4015
query10	464	332	299	299
query11	5859	2369	2247	2247
query12	189	130	126	126
query13	1336	582	426	426
query14	6006	5393	5036	5036
query14_1	4372	4353	4360	4353
query15	214	205	188	188
query16	996	455	436	436
query17	1193	758	618	618
query18	2459	476	371	371
query19	225	216	173	173
query20	141	133	132	132
query21	213	139	123	123
query22	13629	13500	13324	13324
query23	17214	16402	16009	16009
query23_1	16125	16155	16139	16139
query24	7451	1755	1286	1286
query24_1	1315	1359	1278	1278
query25	545	465	414	414
query26	1308	332	172	172
query27	2692	561	326	326
query28	4457	1976	1958	1958
query29	976	631	509	509
query30	301	243	193	193
query31	1112	1061	934	934
query32	93	72	78	72
query33	536	340	303	303
query34	1160	1134	636	636
query35	777	790	662	662
query36	1348	1323	1174	1174
query37	152	105	90	90
query38	3188	3162	3002	3002
query39	939	926	900	900
query39_1	877	873	874	873
query40	232	150	126	126
query41	66	66	61	61
query42	111	108	109	108
query43	334	327	291	291
query44	
query45	206	197	197	197
query46	1078	1153	721	721
query47	2376	2344	2299	2299
query48	380	416	284	284
query49	633	490	382	382
query50	1059	358	257	257
query51	4310	4346	4193	4193
query52	107	101	94	94
query53	246	274	203	203
query54	313	269	262	262
query55	94	91	85	85
query56	293	291	329	291
query57	1440	1408	1313	1313
query58	304	270	273	270
query59	1526	1608	1461	1461
query60	331	321	300	300
query61	157	153	155	153
query62	662	605	573	573
query63	251	208	205	205
query64	2422	788	633	633
query65	
query66	1744	480	362	362
query67	29896	30014	29698	29698
query68	
query69	453	340	312	312
query70	1045	976	979	976
query71	298	275	258	258
query72	2938	2955	2478	2478
query73	844	785	425	425
query74	5075	4926	4700	4700
query75	2675	2586	2256	2256
query76	2302	1140	723	723
query77	393	404	357	357
query78	12167	12192	11548	11548
query79	1423	1003	712	712
query80	648	548	451	451
query81	443	274	239	239
query82	1383	156	122	122
query83	357	289	252	252
query84	267	135	113	113
query85	907	524	458	458
query86	390	329	305	305
query87	3381	3437	3199	3199
query88	3547	2647	2634	2634
query89	431	390	338	338
query90	1896	183	179	179
query91	176	172	133	133
query92	81	76	71	71
query93	1524	1476	845	845
query94	542	336	320	320
query95	678	479	341	341
query96	1030	768	337	337
query97	2769	2697	2577	2577
query98	236	240	242	240
query99	1105	1098	981	981
Total cold run time: 252962 ms
Total hot run time: 168451 ms

@HappenLee
Contributor Author

/review

Contributor

@github-actions github-actions Bot left a comment

Review completed. I did not find blocking correctness issues in the PR diff.

Critical checkpoint conclusions:

  • Goal/test: The code changes the row-id reader API to accept sorted batches and updates both batch index lookup and point query callers. Existing tests may cover some paths, but this is not purely refactor-only behavior because batching, sorting, deduplication, and scatter-back logic changed; no new test evidence is provided.
  • Scope: The PR diff is small and focused on batched row-id column reads.
  • Concurrency: No new shared mutable concurrency is introduced in the internal Doris-format path. External row-id fetch concurrency was not changed by this PR.
  • Lifecycle/static initialization: No special lifecycle or static initialization risk found.
  • Config/compatibility: No config, storage format, or FE-BE protocol compatibility change found.
  • Parallel code paths: The direct multi-get, batch index lookup, and point query callers were updated to the new API.
  • Conditions/invariants: The new sorted/unique row_id precondition is enforced at Segment entry; current PR callers satisfy it.
  • Test coverage: No tests were run by me, and the PR declares no testing. This remains the main residual risk for the new scatter/dedup behavior.
  • Observability/transactions/data writes: Not applicable.
  • Performance: The batching direction removes per-row iterator calls and appears aligned with the PR goal.

User focus: No additional user-provided review focus was present.
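
The sorting, deduplication, and scatter-back steps named in the checkpoints above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `SortedReader` and `fetch_in_request_order` are hypothetical names, and the reader is assumed to accept only strictly increasing row ids and to return one value per id.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using rowid_t = uint32_t;

// Hypothetical reader type: given strictly increasing row ids,
// returns one value per id in the same order.
using SortedReader = std::function<std::vector<int>(const std::vector<rowid_t>&)>;

// Reads `requested` (any order, duplicates allowed) through a reader that only
// accepts sorted, unique ids, then restores the original request order.
inline std::vector<int> fetch_in_request_order(const std::vector<rowid_t>& requested,
                                               const SortedReader& read_sorted_unique) {
    // 1. Sort and deduplicate a copy of the requested ids.
    std::vector<rowid_t> unique_ids(requested);
    std::sort(unique_ids.begin(), unique_ids.end());
    unique_ids.erase(std::unique(unique_ids.begin(), unique_ids.end()), unique_ids.end());

    // 2. One batched read over the sorted, unique ids.
    std::vector<int> scanned = read_sorted_unique(unique_ids);
    assert(scanned.size() == unique_ids.size());

    // 3. Scatter back: for each requested id, find its slot in the batch result.
    std::vector<int> result;
    result.reserve(requested.size());
    for (rowid_t id : requested) {
        auto pos = std::lower_bound(unique_ids.begin(), unique_ids.end(), id);
        result.push_back(scanned[static_cast<size_t>(pos - unique_ids.begin())]);
    }
    return result;
}
```

A request such as `{5, 1, 5, 3}` is read as the batch `{1, 3, 5}` and then expanded back so that both occurrences of row 5 appear in their original positions.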

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/62) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.52% (20733/38736)
Line Coverage 37.17% (196153/527687)
Region Coverage 33.51% (153782/458924)
Branch Coverage 34.52% (66997/194078)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 85.48% (53/62) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 63.42% (24060/37939)
Line Coverage 47.18% (248335/526335)
Region Coverage 44.26% (205075/463346)
Branch Coverage 45.42% (88478/194805)

### What problem does this PR solve?

Issue Number: N/A

Problem Summary:
The latest commit introduced batch row_id reading in the v2 multiget path
(read_batch_doris_format_row / read_batch_external_row). The scatter phase
used Block::get_columns() to obtain raw IColumn* pointers, but get_columns()
returns a temporary std::vector<ColumnPtr> that is destroyed at the end of
the expression. For columns like ColumnMap, convert_to_full_column_if_const()
allocates a new object into that temporary vector; the raw pointer extracted
via .get() immediately becomes dangling, causing an ASAN heap-use-after-free
at rowid_fetcher.cpp:750 when insert_from_multi_column() dereferences it.

Fix:
- Replace both scatter loops with a shared helper
  scatter_scan_blocks_to_result_block() that uses
  Block::get_by_position(i).column.get() for stable pointer access (the
  column is owned by the Block itself, not a temporary vector).
- The commit also inadvertently changed the v1 (PMultiGetRequest) path and
  point_query_executor.cpp to wrap single row_ids in vectors and call the
  batch overload. Revert those callers so that they pass a single-element
  vector to the sole remaining vector overload, keeping the v1 path
  behaviour unchanged.
- Remove the redundant uint32_t single-row overload of seek_and_read_by_rowid;
  callers that read one row now pass a one-element std::vector<rowid_t>.
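
The lifetime problem and the fix can be illustrated with a small sketch. `ToyColumn`, `ToyEntry`, `ToyBlock`, and `stable_column` are hypothetical stand-ins, not the Doris types; the toy `get_columns()` always copies each column, which is an assumption made to mimic a conversion that allocates a new object into the returned vector.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the real Block and IColumn are much richer.
struct ToyColumn {
    std::vector<int> values;
};
using ToyColumnPtr = std::shared_ptr<const ToyColumn>;

struct ToyEntry {
    ToyColumnPtr column;
};

class ToyBlock {
public:
    explicit ToyBlock(std::vector<ToyEntry> entries) : _entries(std::move(entries)) {}

    // Returns a NEW vector by value. Each element is a fresh copy of the column,
    // so the copies are destroyed together with the returned vector.
    std::vector<ToyColumnPtr> get_columns() const {
        std::vector<ToyColumnPtr> cols;
        for (const ToyEntry& e : _entries) {
            cols.push_back(std::make_shared<ToyColumn>(*e.column));
        }
        return cols;
    }

    // Returns a reference to storage owned by the block itself.
    const ToyEntry& get_by_position(size_t i) const { return _entries[i]; }

private:
    std::vector<ToyEntry> _entries;
};

// Hazard (do not do this): the temporary vector dies at the end of the
// full-expression, so the raw pointer dangles immediately.
//   const ToyColumn* bad = block.get_columns()[0].get();

// Safe: the pointee is owned by the block and lives as long as the block does.
inline const ToyColumn* stable_column(const ToyBlock& block, size_t i) {
    return block.get_by_position(i).column.get();
}
```

If the vector returned by `get_columns()` is stored in a named local variable instead, the copies stay alive for that variable's scope, which is another way to avoid the dangling pointer.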

### Release note

None

### Check List (For Author)

- Test: Manual ASAN build compilation verified (./build.sh --be -j 64)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@HappenLee
Contributor Author

/review

@HappenLee
Contributor Author

run buildall

Contributor

@github-actions github-actions Bot left a comment

Automated review completed. I did not find any critical blocking issues in the changed code.

Checkpoint conclusions:

  • Goal/test: The change batches internal Doris row-id column reads per segment, preserves output order via scatter mapping, and includes a manual ASAN build note. I did not see new automated coverage for interleaved same-segment row ids, duplicate row ids, row-store fetch, variant columns, or the external scatter helper reuse, so residual risk is mainly test coverage.
  • Scope: The implementation is focused on seek_and_read_by_rowid batching and the two multiget call paths.
  • Concurrency: No new shared mutable state was introduced in the internal path. The external path keeps the existing scan-task concurrency model; the helper is called after producer completion, so no new concurrent mutation of result columns was found.
  • Lifecycle: Segment, rowset, iterator, and scan block lifetimes appear preserved. The previous temporary get_columns() pointer hazard is avoided by using get_by_position().
  • Configuration/compatibility: No new configs, storage formats, RPC fields, or FE-BE protocol compatibility concerns found.
  • Parallel paths: Single-row point query and v1 multiget callers were updated to the vector signature; v2 internal batching and existing external scatter both use the shared helper.
  • Conditions/invariants: The strict sorted/no-duplicate row-id contract is enforced before calling column iterators; batching code sorts and deduplicates before calling it.
  • Data correctness: The scatter map restores original request order after sorted segment reads, including duplicate row ids within a segment. No version/delete-bitmap behavior was changed.
  • Observability/performance: Existing stats/logging are retained. The main tradeoff is temporary per-segment scan blocks before scattering; no correctness issue found.
  • User focus: No additional user-provided review focus was present.

No inline comments submitted.
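
For reference, a strict sorted/no-duplicate contract like the one described in the checkpoints can be checked in a single pass. The helper name `is_strictly_increasing` is hypothetical, and this sketch does not claim to match how the PR enforces the precondition.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

using rowid_t = uint32_t;

// Returns true when ids are strictly increasing, i.e. sorted with no duplicates.
inline bool is_strictly_increasing(const std::vector<rowid_t>& ids) {
    // adjacent_find with greater_equal locates the first neighbouring pair (a, b)
    // where a >= b; if no such pair exists, every id is smaller than its successor.
    return std::adjacent_find(ids.begin(), ids.end(), std::greater_equal<rowid_t>()) ==
           ids.end();
}
```

Empty and single-element inputs satisfy the contract trivially, so a caller that reads one row can pass a one-element vector without extra handling.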

@hello-stephen
Contributor

TPC-H: Total hot run time: 31139 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3b9bcc4f754698820f79d6d2e3ac40ac87994837, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17633	3881	3879	3879
q2	q3	10745	1455	807	807
q4	4700	482	355	355
q5	7687	2284	2149	2149
q6	241	176	141	141
q7	956	779	640	640
q8	9418	1764	1718	1718
q9	5303	4951	4891	4891
q10	6369	2088	1770	1770
q11	430	271	244	244
q12	642	427	287	287
q13	18108	3401	2789	2789
q14	274	262	234	234
q15	q16	818	775	711	711
q17	972	1025	905	905
q18	6930	5685	5620	5620
q19	1288	1362	1141	1141
q20	526	393	262	262
q21	6058	2645	2292	2292
q22	436	353	304	304
Total cold run time: 99534 ms
Total hot run time: 31139 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4159	4163	4086	4086
q2	q3	4543	4909	4317	4317
q4	2106	2236	1388	1388
q5	4421	4273	4287	4273
q6	224	175	131	131
q7	2125	2040	1687	1687
q8	2514	2230	2152	2152
q9	7936	7840	7892	7840
q10	4540	4477	4087	4087
q11	576	419	377	377
q12	774	747	514	514
q13	3250	3579	2941	2941
q14	308	290	278	278
q15	q16	703	720	659	659
q17	1331	1312	1289	1289
q18	8058	7296	7355	7296
q19	1149	1193	1119	1119
q20	2203	2214	1948	1948
q21	5274	4603	4448	4448
q22	510	457	405	405
Total cold run time: 56704 ms
Total hot run time: 51235 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 169382 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3b9bcc4f754698820f79d6d2e3ac40ac87994837, data reload: false

query5	4325	657	510	510
query6	342	222	197	197
query7	4310	588	309	309
query8	341	235	234	234
query9	8849	3996	3992	3992
query10	458	361	298	298
query11	5759	2425	2164	2164
query12	183	131	135	131
query13	1271	604	476	476
query14	5988	5379	5075	5075
query14_1	4377	4395	4399	4395
query15	215	208	183	183
query16	999	458	403	403
query17	965	723	605	605
query18	2450	489	367	367
query19	227	211	172	172
query20	140	133	131	131
query21	220	142	116	116
query22	13676	13595	13312	13312
query23	17192	16430	16101	16101
query23_1	15971	16194	16288	16194
query24	7408	1768	1284	1284
query24_1	1321	1311	1336	1311
query25	581	494	443	443
query26	1346	340	181	181
query27	2634	557	344	344
query28	4499	1968	1946	1946
query29	1017	660	532	532
query30	303	247	202	202
query31	1118	1075	944	944
query32	93	82	76	76
query33	546	373	307	307
query34	1148	1135	639	639
query35	800	772	687	687
query36	1356	1336	1244	1244
query37	159	103	91	91
query38	3231	3144	3067	3067
query39	937	931	895	895
query39_1	893	891	874	874
query40	230	147	129	129
query41	66	63	63	63
query42	114	108	109	108
query43	327	325	290	290
query44	
query45	210	202	196	196
query46	1057	1223	715	715
query47	2320	2378	2225	2225
query48	399	433	289	289
query49	625	517	399	399
query50	1000	356	249	249
query51	4364	4310	4242	4242
query52	106	109	94	94
query53	265	286	201	201
query54	308	289	254	254
query55	93	88	93	88
query56	300	307	302	302
query57	1404	1409	1300	1300
query58	297	269	263	263
query59	1583	1644	1434	1434
query60	325	318	315	315
query61	168	162	162	162
query62	671	616	563	563
query63	240	197	218	197
query64	2420	831	619	619
query65	
query66	1754	472	355	355
query67	30030	30016	29851	29851
query68	
query69	461	336	349	336
query70	991	1029	970	970
query71	304	289	272	272
query72	2937	2750	2413	2413
query73	850	724	394	394
query74	5076	4964	4784	4784
query75	2675	2591	2244	2244
query76	2281	1134	763	763
query77	397	409	335	335
query78	11969	12120	11571	11571
query79	1454	1019	746	746
query80	830	550	457	457
query81	480	285	243	243
query82	1354	158	121	121
query83	359	276	253	253
query84	305	144	116	116
query85	920	531	474	474
query86	430	345	320	320
query87	3398	3389	3232	3232
query88	3503	2651	2621	2621
query89	443	386	341	341
query90	1802	182	180	180
query91	179	172	143	143
query92	80	80	76	76
query93	1477	1461	835	835
query94	623	355	314	314
query95	677	497	365	365
query96	980	783	327	327
query97	2714	2718	2555	2555
query98	235	229	229	229
query99	1127	1111	960	960
Total cold run time: 252654 ms
Total hot run time: 169382 ms

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 91.43% (64/70) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.55% (27906/37940)
Line Coverage 57.53% (302822/526327)
Region Coverage 54.86% (254183/463311)
Branch Coverage 56.29% (109653/194797)

3 participants