[fix](tablet scheduler) fix decommission sched exception #36415

Merged
dataroaring merged 2 commits into apache:master from yujun777:decommission-sched-exception
Jun 18, 2024

Conversation

@yujun777
Copy link
Contributor

BUG: When there are running transactions, decommissioning a replica is expected to throw a "sched failed" exception and enter the waiting-for-decommission state. PR #30117 changed this behavior: it causes a "sched unrecoverable" exception to be thrown instead, so the replica decommission fails immediately.
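To illustrate the difference between the two outcomes, here is a minimal, self-contained sketch. It is not the Doris tablet scheduler code: the class, enum, and method names (`DecommissionSketch`, `Status`, `Outcome`, `schedule`) are hypothetical and only model the behavior described above, assuming the scheduler retries on a "sched failed" status and gives up on an "unrecoverable" one.

```java
// Hypothetical, simplified model of the behavior described in this PR.
// None of these names are taken from the Doris code base.
class DecommissionSketch {
    // The two failure kinds mentioned in the description.
    enum Status { SCHEDULE_FAILED, UNRECOVERABLE }

    // What happens to the decommission task after one scheduling attempt.
    enum Outcome { DONE, WAITING, FAILED }

    static class SchedException extends Exception {
        final Status status;

        SchedException(Status status, String message) {
            super(message);
            this.status = status;
        }
    }

    // Refuses to decommission while transactions are still running.
    // statusWhenBlocked selects which kind of exception is raised, so the
    // intended behavior and the regression can both be modeled.
    static void tryDecommission(int runningTxns, Status statusWhenBlocked)
            throws SchedException {
        if (runningTxns > 0) {
            throw new SchedException(statusWhenBlocked,
                    "replica still has " + runningTxns + " running txns");
        }
    }

    // Assumed scheduler policy: a SCHEDULE_FAILED task stays queued and is
    // retried later; an UNRECOVERABLE task is dropped immediately.
    static Outcome schedule(int runningTxns, Status statusWhenBlocked) {
        try {
            tryDecommission(runningTxns, statusWhenBlocked);
            return Outcome.DONE;
        } catch (SchedException e) {
            return e.status == Status.SCHEDULE_FAILED ? Outcome.WAITING : Outcome.FAILED;
        }
    }

    public static void main(String[] args) {
        // Intended behavior: wait until the running txns finish.
        System.out.println(schedule(2, Status.SCHEDULE_FAILED));
        // Regressed behavior: fail immediately.
        System.out.println(schedule(2, Status.UNRECOVERABLE));
        // No running txns: the decommission proceeds.
        System.out.println(schedule(0, Status.SCHEDULE_FAILED));
    }
}
```

In this model the fix amounts to raising the retryable status again when running transactions block the decommission, so the task waits instead of failing.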

Proposed changes

Issue Number: close #xxx

@yujun777
Copy link
Contributor Author

run buildall

Copy link
Contributor

@deardeng deardeng left a comment

LGTM

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@yujun777
Copy link
Contributor Author

run buildall

1 similar comment
@yujun777
Copy link
Contributor Author

run buildall

Copy link
Contributor

@deardeng deardeng left a comment

LGTM

@yujun777
Copy link
Contributor Author

run buildall

1 similar comment
@yujun777
Copy link
Contributor Author

run buildall

@yujun777
Copy link
Contributor Author

It compiles fine locally; the build on the CI pipeline seems to be acting up again.

@doris-robot
Copy link

TPC-H: Total hot run time: 39886 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0e3bcd6594b13767e4953f8862cd5743d43ddea8, data reload: false

------ Round 1 ----------------------------------
q1	17629	4303	4335	4303
q2	2023	189	197	189
q3	10457	1113	1093	1093
q4	10920	871	828	828
q5	7510	2694	2638	2638
q6	224	144	136	136
q7	976	624	626	624
q8	9284	2076	2115	2076
q9	8965	6599	6595	6595
q10	9797	3671	3765	3671
q11	440	236	236	236
q12	432	232	223	223
q13	17842	2954	2957	2954
q14	271	213	221	213
q15	526	477	478	477
q16	520	393	376	376
q17	965	696	657	657
q18	8066	7409	7310	7310
q19	5985	1489	1437	1437
q20	640	320	313	313
q21	4911	3207	3253	3207
q22	389	330	331	330
Total cold run time: 118772 ms
Total hot run time: 39886 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4411	4289	4226	4226
q2	368	265	261	261
q3	2957	2714	2715	2714
q4	1898	1609	1626	1609
q5	5268	5260	5281	5260
q6	213	125	126	125
q7	2104	1667	1708	1667
q8	3186	3283	3309	3283
q9	8391	8349	8352	8349
q10	3875	3686	3662	3662
q11	573	491	473	473
q12	760	604	591	591
q13	16480	2987	2974	2974
q14	287	250	259	250
q15	520	465	471	465
q16	470	408	414	408
q17	1775	1448	1460	1448
q18	7550	7552	7487	7487
q19	1730	1627	1548	1548
q20	1985	1759	1773	1759
q21	4840	4555	4660	4555
q22	599	532	551	532
Total cold run time: 70240 ms
Total hot run time: 53646 ms

Copy link
Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jun 17, 2024
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@doris-robot
Copy link

TPC-DS: Total hot run time: 169998 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0e3bcd6594b13767e4953f8862cd5743d43ddea8, data reload: false

query1	926	379	375	375
query2	6477	2325	2309	2309
query3	6644	207	211	207
query4	19661	17428	17294	17294
query5	4149	471	462	462
query6	274	156	156	156
query7	4594	299	288	288
query8	317	278	277	277
query9	8642	2386	2358	2358
query10	603	295	275	275
query11	10711	10077	10001	10001
query12	141	89	81	81
query13	1640	363	368	363
query14	9895	7764	7541	7541
query15	237	196	195	195
query16	7808	274	275	274
query17	1894	561	555	555
query18	1968	276	279	276
query19	199	159	157	157
query20	98	88	87	87
query21	216	139	131	131
query22	4204	4104	4038	4038
query23	33838	33130	33140	33130
query24	12177	2871	2770	2770
query25	725	367	357	357
query26	1749	151	147	147
query27	2960	317	313	313
query28	7499	2034	2036	2034
query29	1106	636	611	611
query30	277	150	149	149
query31	955	750	729	729
query32	93	57	55	55
query33	774	300	278	278
query34	948	473	469	469
query35	741	620	610	610
query36	1068	976	903	903
query37	273	74	68	68
query38	2877	2736	2754	2736
query39	855	797	789	789
query40	289	125	124	124
query41	48	56	52	52
query42	125	95	103	95
query43	541	530	540	530
query44	1180	716	720	716
query45	193	167	172	167
query46	1074	747	722	722
query47	1839	1777	1766	1766
query48	375	288	293	288
query49	1195	403	429	403
query50	762	375	385	375
query51	6824	6641	6530	6530
query52	104	91	99	91
query53	363	287	288	287
query54	999	438	456	438
query55	73	72	69	69
query56	279	253	249	249
query57	1170	1036	1091	1036
query58	248	240	240	240
query59	3425	3190	3151	3151
query60	316	265	272	265
query61	96	93	95	93
query62	646	438	449	438
query63	317	291	282	282
query64	9866	2221	1716	1716
query65	3152	3118	3104	3104
query66	1386	330	330	330
query67	15385	15141	14856	14856
query68	4876	534	539	534
query69	534	294	310	294
query70	1154	1152	1133	1133
query71	409	269	308	269
query72	7940	2790	2621	2621
query73	732	320	323	320
query74	5980	5563	5447	5447
query75	3574	2618	2636	2618
query76	2967	921	948	921
query77	451	292	290	290
query78	10386	9931	9737	9737
query79	2287	510	509	509
query80	965	458	463	458
query81	555	221	225	221
query82	802	106	103	103
query83	250	176	165	165
query84	250	86	84	84
query85	1989	287	267	267
query86	474	300	333	300
query87	3278	3126	3045	3045
query88	4045	2349	2349	2349
query89	467	369	378	369
query90	1869	191	191	191
query91	134	102	99	99
query92	66	51	51	51
query93	2309	511	493	493
query94	1266	197	186	186
query95	415	315	315	315
query96	604	271	262	262
query97	3190	3056	3119	3056
query98	219	200	193	193
query99	1179	862	837	837
Total cold run time: 278099 ms
Total hot run time: 169998 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 30.73 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 0e3bcd6594b13767e4953f8862cd5743d43ddea8, data reload: false

query1	0.04	0.03	0.03
query2	0.09	0.04	0.04
query3	0.22	0.05	0.05
query4	1.68	0.07	0.08
query5	0.50	0.49	0.50
query6	1.12	0.72	0.72
query7	0.02	0.01	0.02
query8	0.05	0.05	0.05
query9	0.54	0.48	0.47
query10	0.55	0.55	0.54
query11	0.14	0.11	0.12
query12	0.15	0.11	0.12
query13	0.59	0.58	0.60
query14	0.79	0.80	0.77
query15	0.82	0.81	0.81
query16	0.37	0.36	0.36
query17	1.05	1.01	1.03
query18	0.23	0.24	0.24
query19	1.79	1.66	1.69
query20	0.02	0.01	0.01
query21	15.44	0.68	0.66
query22	3.86	7.11	2.20
query23	18.28	1.28	1.24
query24	2.11	0.23	0.22
query25	0.15	0.08	0.09
query26	0.26	0.19	0.18
query27	0.08	0.09	0.08
query28	13.22	1.02	0.99
query29	12.59	3.19	3.24
query30	0.27	0.07	0.05
query31	2.88	0.38	0.38
query32	3.28	0.47	0.46
query33	2.92	2.94	2.86
query34	17.23	4.45	4.44
query35	4.46	4.51	4.52
query36	0.64	0.47	0.47
query37	0.17	0.15	0.16
query38	0.14	0.15	0.14
query39	0.04	0.04	0.03
query40	0.17	0.14	0.15
query41	0.10	0.05	0.05
query42	0.05	0.05	0.04
query43	0.05	0.04	0.04
Total cold run time: 109.15 s
Total hot run time: 30.73 s

@dataroaring dataroaring merged commit 2f5d6fa into apache:master Jun 18, 2024
dataroaring pushed a commit that referenced this pull request Jun 21, 2024
BUG: When there are running transactions, decommissioning a replica is
expected to throw a "sched failed" exception and enter the
waiting-for-decommission state. PR #30117 changed this behavior: it causes a
"sched unrecoverable" exception to be thrown instead, so the replica
decommission fails immediately.
Labels

approved Indicates a PR has been approved by one committer. dev/3.0.0-merged reviewed