[fix](be) Preserve borrowed block data for http send #63258

Draft
mrhhsg wants to merge 3 commits into apache:master from mrhhsg:fix_protobuf_http_borrowed_block

@mrhhsg
Member

mrhhsg commented May 14, 2026

What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: Broadcast exchange sends can borrow a shared PBlock through set_allocated_block. When such a request takes the HTTP attachment path, the helper must remove column_values from the protobuf request before serialization, but it must not permanently consume data owned by the shared broadcast block. This PR adds an explicit restore mode for borrowed block storage while keeping the default owned-request path unchanged.

It also documents intentional protobuf set_allocated_* ownership patterns in exchange sink buffer and rowset meta code, and adds focused unit coverage for the attachment helper's owned vs borrowed behavior.
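The detach-and-restore idea described above can be sketched as a small RAII guard. This is a minimal illustration under stated assumptions, not the PR's code: `MockBlock`, `ScopedDetach`, and `send_borrowed` are hypothetical names, and `MockBlock` only imitates the `release_*` / `set_allocated_*` accessors that protobuf generates for a bytes field. The sketch also assumes only one caller touches the block at a time.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a protobuf message with one bytes field.
struct MockBlock {
    std::unique_ptr<std::string> column_values;

    bool has_column_values() const { return column_values != nullptr; }
    // Imitates protobuf's release_*: ownership moves to the caller.
    std::string* release_column_values() { return column_values.release(); }
    // Imitates protobuf's set_allocated_*: the message takes ownership.
    void set_allocated_column_values(std::string* v) { column_values.reset(v); }
};

// Hypothetical guard: detaches column_values on construction and
// reattaches the very same string on destruction.
class ScopedDetach {
public:
    explicit ScopedDetach(MockBlock& block)
            : _block(block), _values(block.release_column_values()) {
        assert(_values != nullptr);  // precondition: the field was set
    }
    ~ScopedDetach() { _block.set_allocated_column_values(_values); }
    ScopedDetach(const ScopedDetach&) = delete;
    ScopedDetach& operator=(const ScopedDetach&) = delete;

    const std::string& payload() const { return *_values; }

private:
    MockBlock& _block;
    std::string* _values;
};

// Returns the bytes that would go into the attachment and reports whether
// the block looked empty while the request would have been serialized.
std::string send_borrowed(MockBlock& block, bool* detached_during_send) {
    ScopedDetach guard(block);
    *detached_during_send = !block.has_column_values();
    return guard.payload();  // copied out before the guard restores the field
}
```

After `send_borrowed` returns, the block holds its original `column_values` again, so the owner can reuse it for a later send.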

Release note

None

Check List (For Author)

  • Test: Unit Test
    • ./run-be-ut.sh --run --filter=ProtoUtilTest.*
  • Behavior changed: Yes (broadcast exchange HTTP sends preserve borrowed column_values for later channel sends)
  • Does this need documentation: No

### What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: Prevent broadcast exchange HTTP sends from consuming column_values owned by shared broadcast blocks. Keep the default owned-request attachment path moving column_values out of the request, and add a restore mode for borrowed block storage. Document intentional protobuf set_allocated ownership patterns to avoid unsafe future rewrites.

### Release note

None

### Check List (For Author)

- Test: Unit Test
    - ./run-be-ut.sh --run --filter=ProtoUtilTest.*
- Behavior changed: Yes (broadcast exchange HTTP sends preserve borrowed column_values for later channel sends)
- Does this need documentation: No
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

### What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: Apply clang-format to the proto attachment ownership guard added for borrowed column values.

### Release note

None

### Check List (For Author)

- Test: No need to test (format-only change)
- Behavior changed: No
- Does this need documentation: No
@mrhhsg
Member Author

mrhhsg commented May 14, 2026

run buildall

@mrhhsg
Member Author

mrhhsg commented May 14, 2026

/review

Contributor

@github-actions github-actions Bot left a comment

Summary: I found one blocking issue in the HTTP borrowed-block fix. The new restore path can mutate a shared broadcast PBlock concurrently across destination RPC instances, which can crash or race when multiple queued broadcast sends enter the HTTP helper at the same time.

Critical checkpoint conclusions:

  • Goal/test: The PR targets preserving broadcast column_values for HTTP large-block sends and adds focused unit tests for owned vs borrowed helper behavior, but the tests do not cover the multi-destination concurrent broadcast lifecycle where the borrowed object is shared.
  • Scope: The change is small and focused, but the helper-level restore flag hides an important ownership/concurrency precondition.
  • Concurrency: Blocking issue found. BroadcastPBlockHolder is shared across remote channels/RpcInstances; different brpc callback threads can process the same holder concurrently after queued prior RPCs complete. The new release/restore mutates the shared protobuf without a common lock.
  • Lifecycle: The scoped restore handles single-call lifetime, but not shared-holder concurrent lifetime.
  • Configuration/compatibility: No new config or protocol/storage format incompatibility observed.
  • Parallel paths: Non-broadcast and non-HTTP paths remain unchanged; the issue is specific to broadcast HTTP borrowed block handling.
  • Tests: Unit tests cover helper behavior in isolation, but miss the concurrent broadcast exchange scenario.
  • Observability: No new observability required for the intended fix.
  • Transaction/persistence/data visibility: Not applicable to this PR.
  • Performance: The intended no-copy behavior is preserved, but the fix must avoid shared mutation or serialize access.
  • User focus: No additional user-provided review focus was specified.

Comment thread be/src/util/proto_util.h Outdated
// column_values field so the serialized request stays small, then restore it before
// returning so the real owner can still be reused by later sends.
auto* column_values = block->release_column_values();
DORIS_CHECK(column_values != nullptr);
Contributor

This restore mode mutates the borrowed PBlock by calling release_column_values(), but broadcast blocks are shared across remote channels and can be processed concurrently by different RpcInstance callback threads. A concrete path is: an unpartitioned exchange enqueues the same BroadcastPBlockHolder to multiple remote channels while those destinations already have in-flight RPCs; when those prior RPCs complete around the same time, each callback calls _send_rpc() for its destination and reaches this HTTP helper for the same holder. The first thread temporarily removes column_values; the second thread races on the same protobuf and can see nullptr here (hitting DORIS_CHECK) or otherwise race with the restore. Please avoid mutating the shared PBlock during attachment embedding, or add synchronization that covers all users of the shared holder.
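The interleaving described in this comment can be replayed deterministically on a single thread. The sketch below is an illustration only: `MockBlock` and `second_sender_sees_null` are hypothetical names, and `MockBlock` merely imitates protobuf's `release_*` / `set_allocated_*` accessors. It does not use real threads, so it shows the ordering problem rather than the data race itself.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for the shared broadcast block.
struct MockBlock {
    std::unique_ptr<std::string> column_values;

    bool has_column_values() const { return column_values != nullptr; }
    std::string* release_column_values() { return column_values.release(); }
    void set_allocated_column_values(std::string* v) { column_values.reset(v); }
};

// Replays "sender A detaches, sender B arrives before A restores".
// Returns true if sender B observed a missing payload.
bool second_sender_sees_null(MockBlock& shared) {
    std::string* a = shared.release_column_values();  // sender A detaches
    std::string* b = shared.release_column_values();  // sender B arrives early
    bool b_saw_null = (b == nullptr);                 // a null check would fire here
    shared.set_allocated_column_values(a);            // sender A restores
    return b_saw_null;
}
```

In this replay the second release always returns null, which corresponds to the failed null check mentioned above. With real concurrent callbacks the outcome would additionally depend on timing.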

@hello-stephen
Contributor

TPC-H: Total hot run time: 29606 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit cdde6d13200718593069301b60edc5624d1ea419, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17627	3790	3788	3788
q2	q3	10715	872	598	598
q4	4665	457	341	341
q5	7458	1330	1136	1136
q6	211	169	136	136
q7	915	933	752	752
q8	9716	1387	1308	1308
q9	6078	5372	5345	5345
q10	6308	2064	1814	1814
q11	478	265	256	256
q12	686	424	285	285
q13	18170	3322	2713	2713
q14	296	285	263	263
q15	q16	892	872	796	796
q17	1049	979	786	786
q18	6449	5642	5693	5642
q19	1651	1202	1083	1083
q20	492	399	267	267
q21	4742	2353	1968	1968
q22	468	383	329	329
Total cold run time: 99066 ms
Total hot run time: 29606 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4673	4546	4516	4516
q2	q3	4654	4777	4260	4260
q4	2085	2165	1410	1410
q5	5009	4988	5233	4988
q6	201	172	140	140
q7	2074	1786	1702	1702
q8	3383	3103	3129	3103
q9	8431	8550	8368	8368
q10	4464	4485	4233	4233
q11	629	410	406	406
q12	724	740	528	528
q13	3194	3565	2907	2907
q14	310	312	275	275
q15	q16	763	770	707	707
q17	1347	1318	1279	1279
q18	8091	7026	7046	7026
q19	1167	1159	1156	1156
q20	2255	2242	1987	1987
q21	6115	5435	4748	4748
q22	528	481	395	395
Total cold run time: 60097 ms
Total hot run time: 54134 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 170079 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit cdde6d13200718593069301b60edc5624d1ea419, data reload: false

query5	4308	647	508	508
query6	329	213	205	205
query7	4267	593	300	300
query8	330	224	217	217
query9	8821	4008	3961	3961
query10	451	336	304	304
query11	5789	2398	2194	2194
query12	185	129	127	127
query13	1275	588	440	440
query14	6452	5340	5077	5077
query14_1	4353	4363	4341	4341
query15	219	207	188	188
query16	997	468	374	374
query17	1170	728	608	608
query18	2723	494	346	346
query19	208	193	164	164
query20	139	130	126	126
query21	214	134	116	116
query22	13562	13541	13309	13309
query23	17092	16318	16059	16059
query23_1	16113	16174	16176	16174
query24	7358	1738	1326	1326
query24_1	1339	1344	1357	1344
query25	563	499	444	444
query26	1320	296	173	173
query27	2702	573	338	338
query28	4426	1969	1950	1950
query29	979	630	509	509
query30	302	238	197	197
query31	1177	1068	941	941
query32	87	77	72	72
query33	556	349	289	289
query34	1187	1150	638	638
query35	755	777	677	677
query36	1325	1307	1123	1123
query37	152	103	88	88
query38	3215	3103	3040	3040
query39	915	934	926	926
query39_1	895	871	864	864
query40	232	158	137	137
query41	81	64	60	60
query42	111	111	106	106
query43	323	323	281	281
query44	
query45	217	205	201	201
query46	1082	1175	709	709
query47	2281	2261	2191	2191
query48	383	415	335	335
query49	647	536	413	413
query50	716	281	221	221
query51	4325	4293	4239	4239
query52	113	105	95	95
query53	254	285	199	199
query54	313	272	266	266
query55	95	89	89	89
query56	307	308	304	304
query57	1529	1375	1301	1301
query58	296	274	272	272
query59	1524	1642	1380	1380
query60	340	334	322	322
query61	162	157	158	157
query62	671	644	571	571
query63	252	202	208	202
query64	2365	808	681	681
query65	
query66	1696	509	404	404
query67	30027	29941	29866	29866
query68	
query69	455	341	289	289
query70	1030	970	920	920
query71	301	263	269	263
query72	2993	2740	2403	2403
query73	817	779	423	423
query74	5091	4893	4748	4748
query75	2777	2694	2346	2346
query76	2314	1134	752	752
query77	428	432	354	354
query78	12842	12900	12361	12361
query79	1489	963	713	713
query80	730	612	529	529
query81	452	277	244	244
query82	1347	165	123	123
query83	367	287	260	260
query84	264	153	121	121
query85	926	585	532	532
query86	399	324	316	316
query87	3506	3347	3200	3200
query88	3553	2677	2646	2646
query89	453	388	340	340
query90	1909	183	181	181
query91	217	165	139	139
query92	76	74	70	70
query93	952	944	559	559
query94	514	331	307	307
query95	654	369	356	356
query96	1013	813	347	347
query97	2701	2667	2549	2549
query98	239	229	227	227
query99	1129	1111	995	995
Total cold run time: 253499 ms
Total hot run time: 170079 ms

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: apache#63258

Problem Summary: Avoid release/restore mutation of a shared broadcast PBlock when embedding HTTP attachment data. The broadcast HTTP path now sends a lightweight request containing only block metadata and borrows column_values as read-only attachment data.
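A minimal sketch of the non-mutating approach, assuming simplified types: the shared block is taken by const reference, only its metadata is copied into a per-request object, and the payload is referenced through a read-only view. `MockBlock`, `LightweightRequest`, `make_http_request`, and the two metadata fields are hypothetical and do not reflect the actual Doris proto schema or the brpc attachment API.

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical stand-in for the shared block: two metadata fields plus payload.
struct MockBlock {
    int32_t column_metas_count = 0;  // hypothetical metadata field
    bool compressed = false;         // hypothetical metadata field
    std::string column_values;       // large payload
};

// Hypothetical per-RPC request: owns a metadata-only block copy and
// borrows the payload without taking ownership.
struct LightweightRequest {
    MockBlock block_meta;         // column_values intentionally left empty
    std::string_view attachment;  // read-only view into the shared payload
};

// Takes the shared block by const reference, so it cannot be mutated here.
// The shared block must outlive the returned request.
LightweightRequest make_http_request(const MockBlock& shared) {
    LightweightRequest req;
    req.block_meta.column_metas_count = shared.column_metas_count;
    req.block_meta.compressed = shared.compressed;
    req.attachment = shared.column_values;  // no copy of the payload bytes
    return req;
}
```

Because each sender builds its own `LightweightRequest` and only reads from the shared block, several senders can call `make_http_request` on the same block without coordinating, provided nothing else writes to it during the send.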

### Release note

None

### Check List (For Author)

- Test: Unit Test
    - ./run-be-ut.sh --run --filter=ProtoUtilTest.*
    - build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN (fails on existing be/src/core/types.h unmatched NOLINTEND and toolchain/system stddef.h diagnostic)
- Behavior changed: No
- Does this need documentation: No
@mrhhsg
Member Author

mrhhsg commented May 14, 2026

/review

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 73.33% (11/15) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.50% (20645/38587)
Line Coverage 37.16% (195113/525081)
Region Coverage 33.55% (152792/455422)
Branch Coverage 34.56% (66576/192633)

Contributor

@github-actions github-actions Bot left a comment

Review completed. I did not find additional blocking issues beyond the already-known inline thread, and the current head no longer mutates the shared broadcast PBlock in the HTTP path.

Critical checkpoint conclusions:

  • Goal/test: The change targets preserving borrowed broadcast column_values during HTTP attachment sends. The current code accomplishes that by sending copied request/block metadata and read-only attachment data; focused ProtoUtil unit coverage is present for owned vs borrowed attachment embedding.
  • Scope: The implementation is limited to the exchange HTTP broadcast path plus helper refactoring and ownership comments.
  • Concurrency: The previously raised release/restore mutation race is avoided on the updated path. The per-RpcInstance send state machine and mutex usage remain unchanged; no new shared mutable state was introduced.
  • Lifecycle/ownership: AutoReleaseClosure owns the lightweight HTTP request for the RPC lifetime, while BroadcastPBlockHolder remains the owner of the shared PBlock. The scoped protobuf set_allocated ownership comments match the release calls in the same function.
  • Configuration/compatibility: No new config or protocol field was added; copied PTransmitDataParams/PBlock fields cover the current proto schema.
  • Parallel paths: Non-broadcast and non-HTTP paths keep the existing behavior. Multi-block broadcast behavior remains unchanged because the HTTP helper is only used when the single block field is present.
  • Error handling: Status results from attachment embedding and HTTP client setup continue to be propagated.
  • Performance/memory: The extra metadata copy happens only on the large HTTP broadcast path and avoids copying column_values; no new untracked large allocation was introduced beyond the existing attachment construction.
  • Tests: I did not run tests in this review. The PR states ./run-be-ut.sh --run --filter=ProtoUtilTest.* was run.
  • User focus: No additional user-provided review focus was present.

@mrhhsg
Member Author

mrhhsg commented May 14, 2026

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 29615 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a5ae6300f8122b1ec603a2727913f39eac342273, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17607	3800	3748	3748
q2	q3	10700	865	611	611
q4	4659	465	348	348
q5	7445	1320	1132	1132
q6	185	163	134	134
q7	924	950	754	754
q8	9301	1427	1313	1313
q9	5587	5356	5318	5318
q10	6247	2086	1819	1819
q11	459	258	263	258
q12	627	403	295	295
q13	18080	3212	2753	2753
q14	290	282	261	261
q15	q16	897	857	786	786
q17	1020	1002	764	764
q18	6466	5688	5677	5677
q19	1186	1233	1027	1027
q20	511	407	251	251
q21	5357	2399	2022	2022
q22	488	396	344	344
Total cold run time: 98036 ms
Total hot run time: 29615 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4696	4697	4808	4697
q2	q3	4620	4818	4179	4179
q4	2106	2151	1400	1400
q5	4939	4961	5222	4961
q6	203	163	133	133
q7	2107	1780	1598	1598
q8	3309	3062	3111	3062
q9	8474	8397	8458	8397
q10	4497	4539	4262	4262
q11	581	426	383	383
q12	688	746	524	524
q13	3237	3583	2903	2903
q14	314	313	286	286
q15	q16	760	760	674	674
q17	1475	1310	1241	1241
q18	7923	7106	6935	6935
q19	1127	1160	1145	1145
q20	2217	2233	1940	1940
q21	6118	5325	4837	4837
q22	546	512	421	421
Total cold run time: 59937 ms
Total hot run time: 53978 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171837 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a5ae6300f8122b1ec603a2727913f39eac342273, data reload: false

query5	4303	659	520	520
query6	344	216	210	210
query7	4252	559	300	300
query8	325	252	221	221
query9	8805	4024	4014	4014
query10	451	335	288	288
query11	5791	2347	2213	2213
query12	186	140	133	133
query13	1264	615	445	445
query14	5928	5376	5052	5052
query14_1	4318	4416	4348	4348
query15	209	207	184	184
query16	1045	490	444	444
query17	1145	768	646	646
query18	2604	507	375	375
query19	224	212	166	166
query20	137	132	129	129
query21	221	146	118	118
query22	13617	13969	14580	13969
query23	17126	16615	16224	16224
query23_1	16156	16271	16284	16271
query24	7395	1721	1324	1324
query24_1	1391	1360	1379	1360
query25	582	474	414	414
query26	1305	309	163	163
query27	2705	612	327	327
query28	4347	1929	1943	1929
query29	985	666	520	520
query30	290	237	194	194
query31	1102	1059	928	928
query32	81	74	69	69
query33	541	345	285	285
query34	1143	1142	627	627
query35	781	781	662	662
query36	1324	1333	1167	1167
query37	148	101	85	85
query38	3161	3113	3073	3073
query39	926	903	892	892
query39_1	869	862	880	862
query40	237	159	132	132
query41	63	60	61	60
query42	106	112	107	107
query43	320	327	282	282
query44	
query45	209	198	193	193
query46	1035	1175	721	721
query47	2288	2243	2235	2235
query48	395	423	299	299
query49	632	524	416	416
query50	691	284	216	216
query51	4264	4297	4178	4178
query52	103	103	94	94
query53	254	276	200	200
query54	311	275	267	267
query55	96	91	85	85
query56	300	307	300	300
query57	1405	1375	1260	1260
query58	286	269	262	262
query59	1521	1612	1375	1375
query60	334	339	323	323
query61	164	157	165	157
query62	670	616	545	545
query63	237	204	212	204
query64	2397	835	701	701
query65	
query66	1697	520	402	402
query67	29933	29844	29849	29844
query68	
query69	452	327	296	296
query70	1037	910	989	910
query71	305	273	275	273
query72	2937	2714	2376	2376
query73	858	754	422	422
query74	5046	4893	4701	4701
query75	2747	2693	2329	2329
query76	2295	1133	778	778
query77	407	424	355	355
query78	12955	12940	12446	12446
query79	1448	933	745	745
query80	673	575	488	488
query81	456	280	238	238
query82	1342	158	122	122
query83	366	278	270	270
query84	256	137	113	113
query85	868	524	451	451
query86	397	347	356	347
query87	3383	3362	3212	3212
query88	3548	2730	2686	2686
query89	447	386	339	339
query90	1897	190	188	188
query91	196	179	152	152
query92	81	80	75	75
query93	960	954	548	548
query94	563	364	279	279
query95	685	381	355	355
query96	1027	810	342	342
query97	2699	2674	2538	2538
query98	239	224	224	224
query99	1138	1111	965	965
Total cold run time: 251955 ms
Total hot run time: 171837 ms

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 5.00% (3/60) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.50% (20647/38590)
Line Coverage 37.17% (195210/525119)
Region Coverage 33.54% (152737/455454)
Branch Coverage 34.57% (66595/192650)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 6.67% (4/60) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.62% (27820/37790)
Line Coverage 57.53% (301306/523742)
Region Coverage 54.66% (251346/459869)
Branch Coverage 56.24% (108763/193380)
