[fix](cloud) Skip wait for async rowset warmup #62764

Open
bobhan1 wants to merge 4 commits into apache:master from bobhan1:fix-warm-up-sync-wait

Conversation

@bobhan1
Contributor

@bobhan1 bobhan1 commented Apr 23, 2026

What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: CloudWarmUpManager::warm_up_rowset still waited on a bthread condition variable when sync_wait_timeout_ms was non-positive. Submit those warmup tasks asynchronously and return immediately, while keeping the rowset meta alive for the background task and logging if rowset meta initialization fails.

Release note

None

Check List (For Author)

  • Test:
  • Behavior changed: Yes. Non-positive warm_up_rowset sync wait timeout now submits the warmup task without waiting on the condition variable.
  • Does this need documentation: No
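
The lifetime point in the problem summary can be illustrated with a minimal sketch. It assumes a plain struct in place of RowsetMetaPB and a std::function in place of the thread pool task; MetaSketch and make_async_task are hypothetical names used only for illustration, not Doris code. The task captures a shared_ptr copy, so it stays valid after the caller's object is gone.

```cpp
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-in for RowsetMetaPB; not the real protobuf type.
struct MetaSketch {
    std::string rowset_id;
};

// Builds a deferred task that owns its own copy of `meta`.
// The caller may destroy its object right after this returns.
std::function<std::string()> make_async_task(const MetaSketch& meta) {
    // Copy into shared ownership so the background task never
    // dereferences the caller's (possibly stack-allocated) object.
    auto owned = std::make_shared<MetaSketch>(meta);
    return [owned] { return owned->rowset_id; };
}
```

Capturing the caller's reference instead of the shared copy would leave a dangling reference once the fire-and-forget caller returns, which is the problem the copy is meant to avoid.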

@Thearas
Contributor

Thearas commented Apr 23, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@bobhan1
Contributor Author

bobhan1 commented Apr 24, 2026

run buildall

@liaoxin01
Contributor

/review

Contributor

@github-actions github-actions Bot left a comment

Requesting changes because the new async non-positive-timeout path removes the only producer backpressure on event-driven rowset warm-up submission while the shared warm-up thread pool still has a bounded worker count and an unbounded queue.

Critical checkpoints:

  • Goal of current task: The patch does make sync_wait_timeout_ms <= 0 return immediately, but it is not safe as written because the new queueing behavior can overload the warm-up pipeline under load.
  • Modification size/focus: Small and focused to one function.
  • Concurrency: Applicable. warm_up_rowset() submits onto _thread_pool_token; get_replica_info() / set_event() / clear_job() serialize on _mtx. The new fire-and-forget branch allows many more queued tasks contending on the same state, and the same token is also used by recycle_cache().
  • Lifecycle/static init: Applicable. Copying RowsetMetaPB avoids the caller-stack lifetime problem for rs_meta; I did not find a new static-init issue.
  • Configuration: No new config. Existing warm_up_manager_thread_pool_size remains the only concurrency cap, and the queue is still unbounded.
  • Compatibility/incompatible changes: No FE/BE protocol or storage-format compatibility change.
  • Parallel code paths: Applicable. The positive-timeout path still waits; only the non-positive path changed.
  • Special conditional checks: The new sync_wait_timeout_ms <= 0 branch is understandable.
  • Test coverage: No unit/regression test covers the new async lifetime/backpressure behavior.
  • Test results changed: None in this PR.
  • Observability: Existing logs/metrics are still enough to see failures, but they do not prevent backlog growth.
  • Transaction/persistence: No EditLog or persistence change.
  • Data writes/modifications: Rowset commit persistence is unchanged; the risk is the post-commit warm-up scheduling path.
  • FE-BE variable passing: Unchanged.
  • Performance: Lower caller latency is good, but removing backpressure from an unbounded queue can grow memory and delay synchronous recycle_cache() work on the same token.
  • Other issues: See inline comment.

User focus points:

  • No additional user-provided review focus.


void CloudWarmUpManager::warm_up_rowset(RowsetMeta& rs_meta, int64_t sync_wait_timeout_ms) {
    if (sync_wait_timeout_ms <= 0) {
        auto rs_meta_pb = std::make_shared<RowsetMetaPB>(rs_meta.get_rowset_pb());

warm_up_rowset() is on the commit path for every load-generated rowset. Before this change each caller waited for its submitted task to finish, which implicitly bounded the number of outstanding warm-up jobs. With this fire-and-forget branch, callers can enqueue unlimited RowsetMetaPB copies while CloudWarmUpManagerThreadPool still only has warm_up_manager_thread_pool_size workers and its builder never sets max_queue_size, so the queue stays effectively unbounded. Under an active LOAD warm-up job, a slow FE replica lookup or slow target-BE RPC will now let this backlog grow without bound, and recycle_cache() submissions share the same token, so cache-recycle work can get stuck behind it. Please keep some form of backpressure or add a real queue bound here.
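
One possible form of the requested backpressure is a cap on outstanding fire-and-forget tasks. The sketch below is an assumption for illustration only: InflightLimiter, try_acquire, and release are hypothetical names, not existing Doris APIs, and setting a real queue bound on the thread pool would be the alternative the comment mentions.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical helper: caps the number of outstanding async warm-up tasks.
class InflightLimiter {
public:
    explicit InflightLimiter(std::size_t max_inflight) : _max(max_inflight) {}

    // Reserves a slot and returns true if fewer than _max tasks are
    // outstanding; returns false without blocking otherwise.
    bool try_acquire() {
        std::size_t cur = _inflight.load(std::memory_order_relaxed);
        while (cur < _max) {
            // On failure, `cur` is reloaded and the bound is rechecked.
            if (_inflight.compare_exchange_weak(cur, cur + 1, std::memory_order_acq_rel)) {
                return true;
            }
        }
        return false;
    }

    // Called by the task when it finishes, whether it succeeded or failed.
    void release() { _inflight.fetch_sub(1, std::memory_order_acq_rel); }

    std::size_t inflight() const { return _inflight.load(std::memory_order_acquire); }

private:
    const std::size_t _max;
    std::atomic<std::size_t> _inflight{0};
};
```

A caller on the commit path could check try_acquire() before submitting and, when it returns false, either skip the warm-up or fall back to the waiting path. Which fallback is appropriate is a design decision this sketch does not settle.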

@bobhan1
Contributor Author

bobhan1 commented Apr 29, 2026

run beut

@bobhan1
Contributor Author

bobhan1 commented Apr 30, 2026

run buildall

liaoxin01 previously approved these changes Apr 30, 2026
Contributor

@liaoxin01 liaoxin01 left a comment

LGTM

@github-actions github-actions Bot added the approved label (Indicates a PR has been approved by one committer.) Apr 30, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@bobhan1
Contributor Author

bobhan1 commented Apr 30, 2026

run p0

@bobhan1
Contributor Author

bobhan1 commented Apr 30, 2026

run p0

bobhan1 added 3 commits May 8, 2026 16:43
### What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: CloudWarmUpManager::warm_up_rowset still waited on a bthread condition variable when sync_wait_timeout_ms was non-positive. Submit those warmup tasks asynchronously and return immediately, while keeping the rowset meta alive for the background task and logging if rowset meta initialization fails.

### Release note

None

### Check List (For Author)

- Test: Manual test
    - Ran build-support/clang-format.sh
    - Ran build-support/check-format.sh
    - Ran git diff --check
    - Attempted build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN, blocked by existing compile environment errors including missing /usr/include/stddef.h and pre-existing tidy diagnostics
    - Attempted ./build.sh --be, blocked before compilation by sandboxed submodule metadata write and network access
- Behavior changed: Yes. Non-positive warm_up_rowset sync wait timeout now submits the warmup task without waiting on the condition variable.
- Does this need documentation: No

### What problem does this PR solve?

Issue Number: N/A

Related PR: N/A

Problem Summary: Synchronous rowset warmup waited on a condition variable without a completion predicate, so a spurious wakeup could let the caller return before the warmup task finished.
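
A minimal sketch of the predicate-guarded wait this summary describes. It assumes std::condition_variable in place of the bthread condition variable used in the actual code; Completion, mark_done, and wait_for_ms are hypothetical names for illustration only.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical completion latch; the real code uses bthread primitives.
struct Completion {
    std::mutex mu;
    std::condition_variable cv;
    bool done = false;

    // Called by the warm-up task when it has finished.
    void mark_done() {
        {
            std::lock_guard<std::mutex> lock(mu);
            done = true;
        }
        cv.notify_all();
    }

    // Returns true only if `done` was set before the timeout expired.
    // The predicate makes a spurious or stray wakeup re-check `done`
    // and go back to waiting instead of returning early.
    bool wait_for_ms(long timeout_ms) {
        std::unique_lock<std::mutex> lock(mu);
        return cv.wait_for(lock, std::chrono::milliseconds(timeout_ms),
                           [this] { return done; });
    }
};
```

Without the predicate, any wakeup ends the wait, so the caller cannot tell a finished task from a spurious notification.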

### Release note

None

### Check List (For Author)

- Test: Build and style check
    - build-support/check-format.sh be/src/cloud/cloud_warm_up_manager.cpp
    - ./build.sh --be
    - clang-tidy skipped per user request
- Behavior changed: No
- Does this need documentation: No

### What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: Add BE unit coverage for CloudWarmUpManager rowset warmup wait behavior, including fire-and-forget non-positive timeout ownership of RowsetMetaPB and positive-timeout resistance to spurious condition-variable wakeups.

### Release note

None

### Check List (For Author)

- Test: Unit Test
    - ./run-be-ut.sh --run --filter=CloudWarmUpManagerTest.* -j100
    - build-support/check-format.sh be/src/cloud/cloud_warm_up_manager.cpp be/test/cloud/cloud_warm_up_manager_test.cpp
    - git diff --check
    - build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN (blocked by existing clang-tidy analysis errors: missing stddef.h and core/types.h NOLINTEND)
- Behavior changed: No
- Does this need documentation: No

@bobhan1 bobhan1 force-pushed the fix-warm-up-sync-wait branch from 4f28fd3 to f33fa2b May 8, 2026 11:01
@bobhan1 bobhan1 requested a review from w41ter as a code owner May 8, 2026 11:01
@github-actions github-actions Bot removed the approved label (Indicates a PR has been approved by one committer.) May 8, 2026
@bobhan1
Contributor Author

bobhan1 commented May 8, 2026

run buildall

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 76.00% (19/25) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.82% (27788/37643)
Line Coverage 57.67% (300873/521687)
Region Coverage 54.81% (250358/456744)
Branch Coverage 56.42% (108439/192197)

@bobhan1
Contributor Author

bobhan1 commented May 9, 2026

run cloud_p0

@bobhan1
Contributor Author

bobhan1 commented May 9, 2026

run nonConcurrent

### What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: Add BE unit coverage for asynchronous rowset warmup when RowsetMetaPB initialization fails, ensuring the background task returns without entering rowset warmup.

### Release note

None

### Check List (For Author)

- Test: Unit Test
    - ./run-be-ut.sh --run --filter=CloudWarmUpManagerTest.* -j100
    - build-support/check-format.sh be/src/cloud/cloud_warm_up_manager.cpp be/test/cloud/cloud_warm_up_manager_test.cpp
    - git diff --check
    - build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN (blocked by existing clang-tidy analysis errors: missing stddef.h, unmatched NOLINTEND in be/src/core/types.h, and pre-existing tidy warnings in cloud_warm_up_manager.cpp)
- Behavior changed: No
- Does this need documentation: No

@bobhan1
Contributor Author

bobhan1 commented May 9, 2026

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 29444 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3ba26952a55038c73499a9715b1f483ea201e6d5, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17671	3871	3843	3843
q2	q3	10732	895	604	604
q4	4660	462	341	341
q5	7447	1323	1154	1154
q6	190	175	138	138
q7	924	961	760	760
q8	9493	1380	1321	1321
q9	6141	5429	5340	5340
q10	6305	2093	1783	1783
q11	481	265	253	253
q12	691	427	294	294
q13	18194	3313	2725	2725
q14	298	288	262	262
q15	q16	907	880	796	796
q17	973	1084	682	682
q18	6530	5765	5593	5593
q19	1205	1167	1123	1123
q20	510	394	266	266
q21	4575	2295	1860	1860
q22	416	354	306	306
Total cold run time: 98343 ms
Total hot run time: 29444 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4189	4086	4074	4074
q2	q3	4690	4777	4211	4211
q4	2093	2169	1394	1394
q5	4995	4919	5217	4919
q6	192	162	130	130
q7	2047	1870	1776	1776
q8	3447	3289	3235	3235
q9	8558	8477	8441	8441
q10	4521	4506	4294	4294
q11	602	413	403	403
q12	702	752	517	517
q13	3241	3549	2980	2980
q14	305	306	292	292
q15	q16	788	800	694	694
q17	1334	1318	1261	1261
q18	8305	7159	7023	7023
q19	1154	1146	1153	1146
q20	2295	2262	1952	1952
q21	6174	5418	4904	4904
q22	568	514	427	427
Total cold run time: 60200 ms
Total hot run time: 54073 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171066 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3ba26952a55038c73499a9715b1f483ea201e6d5, data reload: false

query5	4320	664	518	518
query6	338	223	199	199
query7	4228	597	303	303
query8	334	227	217	217
query9	8845	4059	4054	4054
query10	480	361	306	306
query11	5841	2388	2218	2218
query12	189	128	126	126
query13	1332	616	444	444
query14	6614	5374	5105	5105
query14_1	4431	4369	4378	4369
query15	215	204	183	183
query16	1009	494	447	447
query17	1160	779	646	646
query18	2749	500	365	365
query19	242	208	167	167
query20	143	138	132	132
query21	218	137	117	117
query22	13662	13954	14642	13954
query23	17299	16566	16074	16074
query23_1	16389	16461	16286	16286
query24	8052	1829	1416	1416
query24_1	1398	1417	1392	1392
query25	641	494	433	433
query26	1315	316	173	173
query27	2758	605	341	341
query28	4322	1952	1946	1946
query29	979	614	523	523
query30	302	235	202	202
query31	1141	1064	935	935
query32	84	70	72	70
query33	527	336	284	284
query34	1161	1152	671	671
query35	776	786	663	663
query36	1296	1289	1108	1108
query37	161	97	89	89
query38	3235	3112	3106	3106
query39	989	949	914	914
query39_1	896	910	917	910
query40	235	155	133	133
query41	62	62	59	59
query42	113	111	107	107
query43	328	331	298	298
query44	
query45	211	209	193	193
query46	1115	1223	721	721
query47	2268	2272	2200	2200
query48	388	398	304	304
query49	631	516	444	444
query50	696	285	224	224
query51	4282	4260	4227	4227
query52	105	107	98	98
query53	251	275	205	205
query54	311	293	259	259
query55	94	91	81	81
query56	304	346	302	302
query57	1400	1368	1282	1282
query58	294	278	283	278
query59	1551	1647	1421	1421
query60	346	338	328	328
query61	153	150	157	150
query62	684	624	582	582
query63	253	202	209	202
query64	2372	831	688	688
query65	
query66	1697	505	402	402
query67	30077	30045	29951	29951
query68	
query69	475	349	308	308
query70	1057	1006	971	971
query71	304	287	285	285
query72	2987	2701	2433	2433
query73	867	773	439	439
query74	5113	4899	4732	4732
query75	2792	2639	2331	2331
query76	2275	1141	799	799
query77	422	433	357	357
query78	12876	12928	12338	12338
query79	1515	971	736	736
query80	1387	560	502	502
query81	534	271	245	245
query82	947	165	122	122
query83	353	280	246	246
query84	258	141	119	119
query85	911	521	447	447
query86	457	352	332	332
query87	3432	3346	3225	3225
query88	3557	2680	2657	2657
query89	436	372	340	340
query90	1956	182	177	177
query91	181	188	139	139
query92	80	77	74	74
query93	1100	939	569	569
query94	718	359	292	292
query95	659	468	346	346
query96	1074	784	314	314
query97	2720	2702	2585	2585
query98	239	231	224	224
query99	1117	1126	983	983
Total cold run time: 256078 ms
Total hot run time: 171066 ms

@bobhan1
Contributor Author

bobhan1 commented May 9, 2026

run vault_p0

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 89.29% (25/28) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.76% (27770/37650)
Line Coverage 57.61% (300577/521762)
Region Coverage 54.80% (250352/456816)
Branch Coverage 56.37% (108345/192213)

Contributor

@liaoxin01 liaoxin01 left a comment

LGTM

@github-actions github-actions Bot added the approved label (Indicates a PR has been approved by one committer.) May 9, 2026
@github-actions
Contributor

github-actions Bot commented May 9, 2026

PR approved by at least one committer and no changes requested.

@gavinchou
Contributor

/review

Contributor

@github-actions github-actions Bot left a comment

Review conclusion: request changes. I found a distinct blocking issue in the newly added test: it directly accesses a private CloudWarmUpManager member, so the BE unit test target will fail to compile. I did not duplicate the existing inline thread about unbounded fire-and-forget warm-up queueing; that concern remains already raised in the PR review context.

Critical checkpoint conclusions:

  • Goal/test proof: the implementation adds async non-positive timeout warm-up and tests for copy lifetime, init failure, and spurious wakeups, but the new test currently cannot compile.
  • Scope/minimality: production changes are focused on CloudWarmUpManager; tests add targeted coverage.
  • Concurrency: the synchronous wait now uses a completion predicate; the existing review thread already covers the new async branch's queue/backpressure risk.
  • Lifecycle: async rowset metadata lifetime is handled by copying RowsetMetaPB and reconstructing RowsetMeta in the task; manager/thread-pool lifetime appears covered by token shutdown semantics.
  • Config/compatibility: no new config, storage format, or protocol compatibility changes.
  • Parallel paths: recycle_cache still uses the same token synchronously; no new issue beyond the existing queueing thread.
  • Tests: added BE tests are relevant but blocked by private member access.
  • Observability: submission/init failures are logged and failed submissions increment the existing metric.
  • User focus: no additional user-provided review focus was specified.

bool release_blocker = false;

ASSERT_TRUE(manager._thread_pool_token
                    ->submit_func([&] {

This test directly accesses CloudWarmUpManager::_thread_pool_token, but that member is private in cloud_warm_up_manager.h and this file does not use a #define private public test hack or a friend/test hook. BE_TEST only exposes consumer_job(), so the new BE unit test will fail to compile. Please expose a small test hook or drive the behavior through public/sync-point observable state instead of accessing the private token directly.
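
One way to provide the small test hook suggested here is a friend "test peer" class. This is a sketch under stated assumptions: WarmUpManagerSketch and WarmUpManagerSketchTestPeer are hypothetical names, and a simple deque of callables stands in for the private thread pool token. It is not the layout of cloud_warm_up_manager.h.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical manager with a private task queue.
class WarmUpManagerSketch {
public:
    // Public entry point: queues one unit of warm-up work.
    void warm_up(int rowset_id) {
        _queue.emplace_back([this, rowset_id] { _last_warmed = rowset_id; });
    }
    int last_warmed() const { return _last_warmed; }

private:
    // The only door opened for tests; production code cannot use it
    // unless it defines a class with this exact name.
    friend class WarmUpManagerSketchTestPeer;

    std::deque<std::function<void()>> _queue;
    int _last_warmed = -1;
};

// Defined in the test file; forwards to the manager's private state.
class WarmUpManagerSketchTestPeer {
public:
    // Lets a test enqueue a blocker ahead of real work.
    static void submit(WarmUpManagerSketch& m, std::function<void()> fn) {
        m._queue.emplace_back(std::move(fn));
    }
    // Runs queued tasks in order and returns how many ran.
    static std::size_t drain(WarmUpManagerSketch& m) {
        std::size_t ran = 0;
        while (!m._queue.empty()) {
            auto fn = std::move(m._queue.front());
            m._queue.pop_front();
            fn();
            ++ran;
        }
        return ran;
    }
};
```

Compared with a `#define private public` hack, the peer keeps the header's access control intact and limits test access to the operations the peer chooses to forward.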

Labels

approved (Indicates a PR has been approved by one committer.), dev/3.1.x, dev/4.0.x, dev/4.1.x

5 participants