
[fix](warmup): fix passive cancellation of event-driven jobs #54962

Merged
liaoxin01 merged 1 commit into apache:master from kaijchen:warmup-case-1
Aug 21, 2025

Conversation

@kaijchen
Member

@kaijchen kaijchen commented Aug 18, 2025

What problem does this PR solve?

Problem Summary:

Previously, the code assumed that a cancelled job no longer exists on the FE.
However, in practice the FE may still retain the job in a cancelled state.

This PR fixes the case where an event-driven job must be cancelled passively,
i.e., when the BE misses the cancel RPC from the FE.
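
To make the distinction concrete, here is a minimal sketch of the idea, not the code from this PR. It assumes the stop decision boils down to looking the job up on the FE. Every name in it (`FakeFrontend`, `JobState`, `should_stop_old`, `should_stop_fixed`) is hypothetical and does not correspond to actual Doris classes or RPCs.

```python
from enum import Enum
from typing import Dict, Optional


class JobState(Enum):
    RUNNING = "RUNNING"
    CANCELLED = "CANCELLED"


class FakeFrontend:
    """Hypothetical stand-in for the FE's job registry (not a Doris API)."""

    def __init__(self) -> None:
        self._jobs: Dict[int, JobState] = {}

    def add_job(self, job_id: int) -> None:
        self._jobs[job_id] = JobState.RUNNING

    def cancel_job(self, job_id: int) -> None:
        # The FE keeps the entry and only flips its state to CANCELLED.
        self._jobs[job_id] = JobState.CANCELLED

    def lookup(self, job_id: int) -> Optional[JobState]:
        # Returns None when the FE has no record of the job.
        return self._jobs.get(job_id)


def should_stop_old(fe: FakeFrontend, job_id: int) -> bool:
    """Old assumption: a cancelled job has disappeared from the FE."""
    return fe.lookup(job_id) is None


def should_stop_fixed(fe: FakeFrontend, job_id: int) -> bool:
    """Corrected check: a missing job or a retained CANCELLED job both mean stop."""
    state = fe.lookup(job_id)
    return state is None or state is JobState.CANCELLED


if __name__ == "__main__":
    fe = FakeFrontend()
    fe.add_job(1)
    fe.cancel_job(1)  # imagine the cancel RPC to the BE was lost
    print(should_stop_old(fe, 1), should_stop_fixed(fe, 1))
```

Under the old assumption, a job that the FE still lists as cancelled would keep running on a BE that missed the cancel RPC. The corrected check stops it in both situations.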

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@kaijchen
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33721 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ebf902dc7662af7cf2ae709232b59b169e36ea91, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	min hot run (ms)
q1	17652	5205	5098	5098
q2	1914	286	188	188
q3	10301	1246	723	723
q4	10226	1000	511	511
q5	7510	2395	2338	2338
q6	181	156	132	132
q7	880	739	601	601
q8	9312	1260	1122	1122
q9	6969	5586	5051	5051
q10	6906	2351	1985	1985
q11	466	281	280	280
q12	346	345	212	212
q13	17783	3549	3016	3016
q14	250	234	213	213
q15	558	464	469	464
q16	410	416	372	372
q17	603	851	359	359
q18	7422	6971	7083	6971
q19	1093	937	570	570
q20	334	339	221	221
q21	4136	3217	2326	2326
q22	1071	1045	968	968
Total cold run time: 106323 ms
Total hot run time: 33721 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	min hot run (ms)
q1	5210	5158	5115	5115
q2	239	307	217	217
q3	2169	2666	2290	2290
q4	1345	1762	1311	1311
q5	4200	4491	4507	4491
q6	221	165	128	128
q7	2031	1953	1774	1774
q8	2675	2593	2533	2533
q9	7302	7235	7285	7235
q10	3066	3321	2863	2863
q11	585	540	486	486
q12	893	767	605	605
q13	3469	3939	3322	3322
q14	294	317	300	300
q15	546	467	470	467
q16	611	523	443	443
q17	1173	1488	1376	1376
q18	7657	7551	7707	7551
q19	871	912	991	912
q20	2045	2062	1868	1868
q21	4810	4343	4294	4294
q22	1060	1041	1020	1020
Total cold run time: 52472 ms
Total hot run time: 50601 ms
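
The reported totals can be reproduced from the per-query rows. Judging from the numbers alone (an inference, not documented behaviour of the benchmark scripts), the cold total is the sum of the first timing column and the hot total is the sum of the faster of the two hot runs for each query. The sketch below checks this against the Round 1 rows; the `totals` helper is illustrative only.

```python
# Round 1 rows copied from the TPC-H report: (query, cold, hot run 1, hot run 2), in ms.
ROUND_1 = [
    ("q1", 17652, 5205, 5098), ("q2", 1914, 286, 188),
    ("q3", 10301, 1246, 723), ("q4", 10226, 1000, 511),
    ("q5", 7510, 2395, 2338), ("q6", 181, 156, 132),
    ("q7", 880, 739, 601), ("q8", 9312, 1260, 1122),
    ("q9", 6969, 5586, 5051), ("q10", 6906, 2351, 1985),
    ("q11", 466, 281, 280), ("q12", 346, 345, 212),
    ("q13", 17783, 3549, 3016), ("q14", 250, 234, 213),
    ("q15", 558, 464, 469), ("q16", 410, 416, 372),
    ("q17", 603, 851, 359), ("q18", 7422, 6971, 7083),
    ("q19", 1093, 937, 570), ("q20", 334, 339, 221),
    ("q21", 4136, 3217, 2326), ("q22", 1071, 1045, 968),
]


def totals(rows):
    """Return (total cold, total hot), taking the faster hot run per query."""
    cold = sum(cold_ms for _, cold_ms, _, _ in rows)
    hot = sum(min(hot1, hot2) for _, _, hot1, hot2 in rows)
    return cold, hot


if __name__ == "__main__":
    print(totals(ROUND_1))  # (106323, 33721)
```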

@doris-robot

TPC-DS: Total hot run time: 183822 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ebf902dc7662af7cf2ae709232b59b169e36ea91, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	min hot run (ms)
query1	986	377	396	377
query2	6515	1707	1736	1707
query3	6750	220	223	220
query4	26127	23760	22807	22807
query5	4356	632	521	521
query6	306	213	187	187
query7	4620	498	280	280
query8	265	219	218	218
query9	8614	2849	2835	2835
query10	457	331	293	293
query11	15788	14904	14779	14779
query12	163	117	110	110
query13	1653	557	416	416
query14	8995	5773	5738	5738
query15	213	188	175	175
query16	7278	633	497	497
query17	1204	717	588	588
query18	2007	434	312	312
query19	196	197	176	176
query20	133	119	112	112
query21	214	121	166	121
query22	4216	4171	4108	4108
query23	34095	33299	33182	33182
query24	8231	2324	2299	2299
query25	544	461	403	403
query26	1264	265	160	160
query27	2756	491	336	336
query28	4419	2207	2213	2207
query29	784	598	438	438
query30	284	229	190	190
query31	893	770	700	700
query32	80	78	72	72
query33	555	361	333	333
query34	803	847	503	503
query35	796	855	753	753
query36	972	1035	915	915
query37	122	114	93	93
query38	4134	3986	3911	3911
query39	1471	1431	1411	1411
query40	215	123	110	110
query41	58	57	52	52
query42	120	114	109	109
query43	508	482	464	464
query44	1375	847	852	847
query45	177	166	170	166
query46	855	1003	635	635
query47	1751	1772	1724	1724
query48	384	418	309	309
query49	713	481	375	375
query50	662	682	393	393
query51	4045	4084	4111	4084
query52	116	113	104	104
query53	232	258	195	195
query54	596	584	519	519
query55	88	85	84	84
query56	304	312	294	294
query57	1168	1197	1111	1111
query58	273	258	267	258
query59	2583	2678	2582	2582
query60	329	333	318	318
query61	129	119	120	119
query62	782	698	653	653
query63	226	187	190	187
query64	4371	1011	674	674
query65	4254	4186	4193	4186
query66	1241	416	314	314
query67	15538	15111	14960	14960
query68	8026	896	573	573
query69	463	308	274	274
query70	1209	1139	1091	1091
query71	457	317	313	313
query72	5308	4632	4671	4632
query73	713	576	347	347
query74	9066	8927	8884	8884
query75	3791	3144	2584	2584
query76	3658	1116	733	733
query77	772	392	315	315
query78	9554	9566	8866	8866
query79	2506	820	593	593
query80	608	523	465	465
query81	479	254	220	220
query82	456	141	103	103
query83	280	246	238	238
query84	296	104	87	87
query85	789	367	339	339
query86	396	324	295	295
query87	4209	4266	4164	4164
query88	3480	2184	2178	2178
query89	392	306	289	289
query90	1832	225	209	209
query91	135	146	110	110
query92	82	71	114	71
query93	1849	978	650	650
query94	661	400	302	302
query95	393	318	303	303
query96	486	569	268	268
query97	2647	2663	2551	2551
query98	247	223	213	213
query99	1435	1406	1253	1253
Total cold run time: 273027 ms
Total hot run time: 183822 ms

@doris-robot

ClickBench: Total hot run time: 32.33 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ebf902dc7662af7cf2ae709232b59b169e36ea91, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.03	0.04
query2	0.08	0.04	0.04
query3	0.25	0.08	0.07
query4	1.61	0.12	0.11
query5	0.40	0.42	0.40
query6	1.16	0.63	0.66
query7	0.03	0.02	0.02
query8	0.05	0.03	0.05
query9	0.61	0.53	0.52
query10	0.59	0.56	0.57
query11	0.16	0.11	0.11
query12	0.14	0.12	0.11
query13	0.63	0.61	0.61
query14	0.80	0.82	0.83
query15	0.88	0.84	0.86
query16	0.39	0.40	0.39
query17	1.08	1.05	1.02
query18	0.22	0.19	0.19
query19	1.91	1.82	1.81
query20	0.01	0.01	0.01
query21	15.43	0.97	0.57
query22	0.81	1.20	0.69
query23	14.86	1.38	0.60
query24	6.76	0.61	1.78
query25	0.51	0.34	0.17
query26	0.63	0.14	0.14
query27	0.06	0.06	0.05
query28	9.79	0.94	0.42
query29	12.60	3.86	3.22
query30	3.09	3.02	2.99
query31	2.82	0.58	0.39
query32	3.22	0.55	0.48
query33	3.08	3.11	3.10
query34	16.08	5.42	4.85
query35	4.93	4.89	4.93
query36	0.69	0.50	0.50
query37	0.10	0.07	0.07
query38	0.06	0.04	0.04
query39	0.04	0.02	0.02
query40	0.18	0.15	0.14
query41	0.08	0.03	0.03
query42	0.03	0.02	0.02
query43	0.03	0.03	0.02
Total cold run time: 106.92 s
Total hot run time: 32.33 s

Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) Aug 19, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

Contributor

@liaoxin01 liaoxin01 left a comment

LGTM

@liaoxin01 liaoxin01 merged commit ba02a0b into apache:master Aug 21, 2025
31 of 33 checks passed
github-actions bot pushed a commit that referenced this pull request Aug 21, 2025
Previously, the code assumed that a cancelled job no longer exists on
the FE.
However, in practice the FE may still retain the job in a cancelled
state.

This PR fixes the case where an event-driven job must be cancelled
passively —
i.e., when the BE misses the cancel RPC from the FE.
github-actions bot pushed a commit that referenced this pull request Aug 21, 2025
morrySnow pushed a commit that referenced this pull request Aug 21, 2025
…obs #54962 (#55088)

Cherry-picked from #54962

Co-authored-by: Kaijie Chen <chenkaijie@selectdb.com>
dataroaring pushed a commit that referenced this pull request Aug 24, 2025
…obs #54962 (#55087)

Cherry-picked from #54962

Co-authored-by: Kaijie Chen <chenkaijie@selectdb.com>
@gavinchou gavinchou mentioned this pull request Sep 1, 2025

Labels

  • approved: Indicates a PR has been approved by one committer.
  • dev/3.0.8-merged
  • dev/3.1.0-merged
  • p0_test
  • reviewed

6 participants