[fix](routine load) write edit log when rescheduled job #40728

Merged
merged 1 commit into from
Sep 14, 2024

@sollhui sollhui commented Sep 12, 2024

2024-09-11 20:00:53,079 ERROR (replayer|105) [RoutineLoadManager.replayChangeRoutineLoadJob():836] should not happened
org.apache.doris.common.DdlException: errCode = 2, detailMessage = Could not transform PAUSED to PAUSED
	at org.apache.doris.load.routineload.RoutineLoadJob.checkStateTransform(RoutineLoadJob.java:855) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.load.routineload.RoutineLoadJob.unprotectUpdateState(RoutineLoadJob.java:1407) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.load.routineload.RoutineLoadJob.updateState(RoutineLoadJob.java:1394) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.load.routineload.RoutineLoadManager.replayChangeRoutineLoadJob(RoutineLoadManager.java:834) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.persist.EditLog.loadJournal(EditLog.java:717) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.catalog.Env.replayJournal(Env.java:2913) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.catalog.Env$4.runOneCycle(Env.java:2675) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.common.util.Daemon.run(Daemon.java:116) ~[doris-fe.jar:1.2-SNAPSHOT]

`unprotectNeedReschedule()` changes the job state to `JobState.NEED_SCHEDULE` without calling `logOpRoutineLoadJob`, so that transition is never written to the edit log. If a job is paused, then rescheduled, and finally paused again, the edit log contains two consecutive `PAUSED` records, and replay fails because `PAUSED` -> `PAUSED` is not a valid transition. The correct replay sequence is: `PAUSED` -> `NEED_SCHEDULE` -> `PAUSED`.

Therefore, an edit log entry must be written when a job is rescheduled.
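
The failure can be reproduced in miniature. The following is a minimal sketch and not the Doris implementation: `Journal`, `Job`, `runOnMaster`, and `replaySucceeds` are hypothetical names, and the transition check is reduced to rejecting a transition to the current state, which is the only rule the stack trace above confirms. It compares a journal that omits the `NEED_SCHEDULE` record with one that includes it.

```java
import java.util.ArrayList;
import java.util.List;

public class RescheduleReplaySketch {
    enum JobState { NEED_SCHEDULE, RUNNING, PAUSED }

    // Hypothetical stand-in for the edit log: an ordered list of recorded states.
    static final class Journal {
        final List<JobState> records = new ArrayList<>();

        void log(JobState state) {
            records.add(state);
        }
    }

    // Hypothetical, heavily simplified job. The real state check has more rules.
    static final class Job {
        JobState state = JobState.NEED_SCHEDULE;

        void updateState(JobState next) {
            if (state == next) {
                throw new IllegalStateException(
                        "Could not transform " + state + " to " + next);
            }
            state = next;
        }
    }

    // Performs pause -> reschedule -> pause and returns the journal that was written.
    // logReschedule = false models the behavior before the fix.
    static Journal runOnMaster(boolean logReschedule) {
        Journal journal = new Journal();
        Job job = new Job();

        job.updateState(JobState.PAUSED);
        journal.log(JobState.PAUSED);

        job.updateState(JobState.NEED_SCHEDULE);
        if (logReschedule) {
            journal.log(JobState.NEED_SCHEDULE);
        }

        job.updateState(JobState.PAUSED);
        journal.log(JobState.PAUSED);
        return journal;
    }

    // Replays the journal on a fresh job, as a follower or a restarted node would.
    static boolean replaySucceeds(Journal journal) {
        Job replica = new Job();
        try {
            for (JobState recorded : journal.records) {
                replica.updateState(recorded);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("replay without reschedule record succeeds: "
                + replaySucceeds(runOnMaster(false)));
        System.out.println("replay with reschedule record succeeds: "
                + replaySucceeds(runOnMaster(true)));
    }
}
```

In this sketch the in-memory job on the writing node never sees an invalid transition in either case. Only the replaying node does, which is why the problem surfaces in the replayer thread rather than at the time of the pause.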

sollhui commented Sep 12, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 39963 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 12398ff87c6eb3560861780f76f27011391f683e, data reload: false

------ Round 1 ----------------------------------
q1	18034	4402	4288	4288
q2	2492	197	194	194
q3	11013	1444	1387	1387
q4	10282	967	1020	967
q5	8110	3227	3182	3182
q6	228	141	142	141
q7	1056	641	647	641
q8	9841	1981	2014	1981
q9	6811	6287	6284	6284
q10	7029	2525	2484	2484
q11	441	242	245	242
q12	404	224	228	224
q13	17752	3030	3022	3022
q14	281	256	261	256
q15	545	486	482	482
q16	511	418	425	418
q17	967	954	940	940
q18	7378	6903	6809	6809
q19	1391	1235	1228	1228
q20	605	326	311	311
q21	3906	3546	3504	3504
q22	1052	978	987	978
Total cold run time: 110129 ms
Total hot run time: 39963 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4193	4190	4181	4181
q2	351	228	233	228
q3	2899	2874	2930	2874
q4	1982	1966	1968	1966
q5	5414	5370	5435	5370
q6	216	129	130	129
q7	2053	1671	1677	1671
q8	3229	3289	3305	3289
q9	8422	8384	8393	8384
q10	3361	3416	3439	3416
q11	574	456	475	456
q12	768	559	561	559
q13	5903	3037	3055	3037
q14	303	265	260	260
q15	531	484	493	484
q16	482	443	449	443
q17	1795	1697	1726	1697
q18	8035	7742	7673	7673
q19	1721	1691	1688	1688
q20	2036	1792	1788	1788
q21	5583	5414	5439	5414
q22	1124	994	1009	994
Total cold run time: 60975 ms
Total hot run time: 56001 ms

@doris-robot

TPC-DS: Total hot run time: 193876 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 12398ff87c6eb3560861780f76f27011391f683e, data reload: false

query1	927	391	382	382
query2	6496	1682	1676	1676
query3	6663	211	225	211
query4	25988	24023	23547	23547
query5	4915	531	535	531
query6	254	177	166	166
query7	4598	297	303	297
query8	273	227	223	223
query9	8492	2457	2474	2457
query10	454	282	284	282
query11	16220	15859	15690	15690
query12	162	103	110	103
query13	1688	401	375	375
query14	11569	6743	6486	6486
query15	214	173	184	173
query16	7586	440	493	440
query17	1550	582	571	571
query18	1902	303	295	295
query19	199	160	150	150
query20	122	111	112	111
query21	209	112	107	107
query22	4420	4402	4195	4195
query23	34400	33539	33673	33539
query24	9533	3166	3094	3094
query25	671	419	440	419
query26	1213	164	168	164
query27	2892	286	287	286
query28	6916	2024	2006	2006
query29	960	432	432	432
query30	297	155	156	155
query31	1015	799	823	799
query32	105	57	62	57
query33	737	301	322	301
query34	903	468	485	468
query35	887	723	757	723
query36	1063	922	923	922
query37	167	83	83	83
query38	4020	3952	3895	3895
query39	1476	1411	1416	1411
query40	209	118	116	116
query41	51	53	47	47
query42	117	99	100	99
query43	479	446	440	440
query44	1259	777	756	756
query45	202	173	173	173
query46	1122	816	805	805
query47	1895	1781	1788	1781
query48	365	291	294	291
query49	1124	465	470	465
query50	921	441	456	441
query51	6852	7022	6774	6774
query52	104	91	88	88
query53	255	182	186	182
query54	791	471	460	460
query55	80	84	78	78
query56	306	275	266	266
query57	1212	1111	1096	1096
query58	275	242	254	242
query59	2628	2462	2541	2462
query60	310	294	289	289
query61	123	227	101	101
query62	929	695	698	695
query63	226	190	188	188
query64	5308	724	666	666
query65	3242	3179	3166	3166
query66	1386	346	353	346
query67	16080	15796	15543	15543
query68	3440	878	863	863
query69	441	323	333	323
query70	1152	1160	1143	1143
query71	355	340	347	340
query72	6130	3404	3402	3402
query73	587	585	581	581
query74	9358	9057	9041	9041
query75	3171	3027	3057	3027
query76	2058	851	844	844
query77	444	410	399	399
query78	9396	9275	9225	9225
query79	937	912	881	881
query80	854	818	814	814
query81	454	265	276	265
query82	267	261	268	261
query83	193	192	194	192
query84	232	108	105	105
query85	690	404	399	399
query86	337	298	320	298
query87	4304	4360	4422	4360
query88	4116	4044	4033	4033
query89	381	367	369	367
query90	1320	312	321	312
query91	125	125	124	124
query92	79	78	76	76
query93	1048	1068	1041	1041
query94	619	390	369	369
query95	462	419	416	416
query96	474	469	468	468
query97	3098	3142	3088	3088
query98	233	228	232	228
query99	1556	1316	1313	1313
Total cold run time: 278318 ms
Total hot run time: 193876 ms

@doris-robot

ClickBench: Total hot run time: 31.45 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 12398ff87c6eb3560861780f76f27011391f683e, data reload: false

query1	0.04	0.05	0.04
query2	0.08	0.04	0.04
query3	0.23	0.05	0.04
query4	1.69	0.07	0.06
query5	0.50	0.51	0.51
query6	1.12	0.73	0.73
query7	0.02	0.01	0.01
query8	0.05	0.04	0.05
query9	0.56	0.52	0.52
query10	0.57	0.58	0.57
query11	0.15	0.12	0.12
query12	0.15	0.13	0.13
query13	0.62	0.61	0.60
query14	1.47	1.53	1.47
query15	0.89	0.88	0.87
query16	0.36	0.37	0.36
query17	0.99	1.03	1.05
query18	0.22	0.21	0.20
query19	1.92	1.82	1.83
query20	0.02	0.01	0.01
query21	15.42	0.68	0.67
query22	3.91	7.14	1.70
query23	17.93	1.33	1.32
query24	2.27	0.22	0.21
query25	0.18	0.07	0.08
query26	0.30	0.19	0.18
query27	0.07	0.08	0.08
query28	13.18	1.02	0.98
query29	12.55	3.36	3.33
query30	0.24	0.05	0.05
query31	2.89	0.42	0.41
query32	3.24	0.48	0.49
query33	3.04	3.05	3.05
query34	15.48	4.32	4.34
query35	4.33	4.34	4.36
query36	0.70	0.49	0.47
query37	0.19	0.17	0.16
query38	0.18	0.15	0.17
query39	0.05	0.04	0.04
query40	0.16	0.14	0.14
query41	0.09	0.05	0.05
query42	0.05	0.04	0.04
query43	0.04	0.04	0.04
Total cold run time: 108.14 s
Total hot run time: 31.45 s

sollhui commented Sep 12, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 40176 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d5ee4b37b9b5d75d01514c189070c6e4eb8e6377, data reload: false

------ Round 1 ----------------------------------
q1	18191	4349	4328	4328
q2	2898	214	205	205
q3	12381	1345	1461	1345
q4	10626	1020	1036	1020
q5	8649	3189	3153	3153
q6	229	137	143	137
q7	1045	634	654	634
q8	9463	2023	1993	1993
q9	6831	6381	6364	6364
q10	7018	2530	2523	2523
q11	432	254	254	254
q12	424	233	223	223
q13	17760	2993	3038	2993
q14	281	243	250	243
q15	554	515	483	483
q16	512	424	424	424
q17	981	955	943	943
q18	7571	6822	6947	6822
q19	1398	1242	1233	1233
q20	617	333	330	330
q21	3914	3508	3519	3508
q22	1099	1018	1019	1018
Total cold run time: 112874 ms
Total hot run time: 40176 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4205	4206	4224	4206
q2	346	241	234	234
q3	2896	2888	2896	2888
q4	1967	1941	1949	1941
q5	5444	5409	5470	5409
q6	218	131	130	130
q7	2059	1675	1677	1675
q8	3251	3317	3290	3290
q9	8437	8441	8472	8441
q10	3386	3446	3453	3446
q11	573	461	470	461
q12	791	577	576	576
q13	4680	3061	3052	3052
q14	303	276	273	273
q15	526	495	492	492
q16	495	455	458	455
q17	1771	1723	1683	1683
q18	7908	7534	7604	7534
q19	1737	1705	1693	1693
q20	2044	1809	1811	1809
q21	5527	5353	5442	5353
q22	1109	1020	1004	1004
Total cold run time: 59673 ms
Total hot run time: 56045 ms

@doris-robot

TPC-DS: Total hot run time: 194957 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d5ee4b37b9b5d75d01514c189070c6e4eb8e6377, data reload: false

query1	908	397	384	384
query2	6492	1751	1723	1723
query3	6650	207	220	207
query4	25972	23959	24254	23959
query5	5476	532	533	532
query6	269	170	161	161
query7	4596	313	297	297
query8	276	216	225	216
query9	8487	2465	2458	2458
query10	504	287	289	287
query11	16238	15390	15704	15390
query12	157	102	109	102
query13	1686	393	374	374
query14	11538	6747	6723	6723
query15	211	172	176	172
query16	7597	485	484	484
query17	1547	584	578	578
query18	1997	304	294	294
query19	199	152	155	152
query20	120	113	113	113
query21	212	108	105	105
query22	4702	4280	4297	4280
query23	34493	33694	33951	33694
query24	9285	3055	3095	3055
query25	670	410	413	410
query26	1380	160	161	160
query27	2891	284	280	280
query28	6922	2051	2016	2016
query29	979	441	436	436
query30	300	159	155	155
query31	1028	765	810	765
query32	106	57	61	57
query33	735	312	307	307
query34	915	477	472	472
query35	895	766	729	729
query36	1057	910	896	896
query37	174	90	79	79
query38	4009	3912	3871	3871
query39	1468	1421	1543	1421
query40	281	115	117	115
query41	53	49	51	49
query42	121	102	98	98
query43	508	461	467	461
query44	1228	777	755	755
query45	199	174	172	172
query46	1127	845	812	812
query47	1891	1840	1817	1817
query48	370	291	285	285
query49	1117	461	450	450
query50	920	435	445	435
query51	7152	6992	6953	6953
query52	101	91	92	91
query53	262	193	187	187
query54	765	481	477	477
query55	77	78	75	75
query56	301	280	294	280
query57	1226	1132	1102	1102
query58	248	241	254	241
query59	2840	2680	2734	2680
query60	309	290	279	279
query61	123	123	218	123
query62	901	663	666	663
query63	240	183	184	183
query64	5335	688	678	678
query65	3240	3157	3180	3157
query66	1435	337	336	336
query67	16080	15631	15478	15478
query68	3452	873	865	865
query69	430	326	326	326
query70	1166	1137	1141	1137
query71	358	356	343	343
query72	6142	3353	3371	3353
query73	603	588	580	580
query74	9279	9157	9034	9034
query75	3165	3000	3029	3000
query76	1885	852	862	852
query77	442	401	401	401
query78	9446	9310	9346	9310
query79	920	910	890	890
query80	857	851	836	836
query81	443	261	270	261
query82	264	268	274	268
query83	192	198	197	197
query84	241	107	105	105
query85	662	464	402	402
query86	338	325	298	298
query87	4394	4368	4427	4368
query88	4157	4102	4075	4075
query89	374	374	375	374
query90	1258	318	313	313
query91	125	127	123	123
query92	78	75	108	75
query93	1062	1095	1062	1062
query94	596	368	409	368
query95	469	424	420	420
query96	478	476	476	476
query97	3099	3110	3138	3110
query98	223	232	219	219
query99	1572	1328	1290	1290
Total cold run time: 279761 ms
Total hot run time: 194957 ms

@doris-robot

ClickBench: Total hot run time: 31.15 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d5ee4b37b9b5d75d01514c189070c6e4eb8e6377, data reload: false

query1	0.04	0.05	0.04
query2	0.08	0.04	0.04
query3	0.22	0.04	0.04
query4	1.67	0.06	0.06
query5	0.50	0.50	0.50
query6	1.16	0.72	0.72
query7	0.02	0.01	0.02
query8	0.05	0.04	0.04
query9	0.56	0.52	0.50
query10	0.57	0.57	0.56
query11	0.16	0.12	0.12
query12	0.15	0.12	0.12
query13	0.62	0.62	0.61
query14	1.45	1.49	1.46
query15	0.90	0.87	0.88
query16	0.36	0.36	0.36
query17	1.01	1.05	1.03
query18	0.21	0.20	0.20
query19	1.96	1.83	1.85
query20	0.01	0.01	0.00
query21	15.41	0.66	0.66
query22	3.59	6.75	1.58
query23	17.77	1.30	1.32
query24	2.27	0.23	0.22
query25	0.18	0.08	0.08
query26	0.29	0.19	0.18
query27	0.08	0.08	0.07
query28	13.17	1.00	0.98
query29	12.52	3.29	3.26
query30	0.25	0.05	0.05
query31	2.89	0.42	0.42
query32	3.23	0.49	0.50
query33	3.08	3.04	3.02
query34	15.44	4.32	4.31
query35	4.35	4.34	4.31
query36	0.69	0.48	0.49
query37	0.19	0.16	0.16
query38	0.17	0.15	0.15
query39	0.05	0.05	0.04
query40	0.16	0.14	0.15
query41	0.10	0.05	0.05
query42	0.06	0.05	0.05
query43	0.05	0.04	0.04
Total cold run time: 107.69 s
Total hot run time: 31.15 s

sollhui commented Sep 12, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 43271 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit fd42d4be3e74da95646bded1e91ad0d44357640a, data reload: false

------ Round 1 ----------------------------------
q1	17607	7324	7304	7304
q2	2028	198	187	187
q3	10463	1340	1357	1340
q4	10160	957	1001	957
q5	7733	3254	3188	3188
q6	242	150	150	150
q7	1045	637	632	632
q8	9470	2062	2040	2040
q9	6874	6346	6385	6346
q10	7028	2539	2498	2498
q11	442	253	253	253
q12	413	224	227	224
q13	17760	3016	3022	3016
q14	290	264	256	256
q15	572	534	509	509
q16	528	431	427	427
q17	1007	967	978	967
q18	7426	6907	6884	6884
q19	1384	1259	1253	1253
q20	617	323	333	323
q21	3941	3576	3524	3524
q22	1103	993	1014	993
Total cold run time: 108133 ms
Total hot run time: 43271 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7194	7187	7175	7175
q2	343	235	239	235
q3	2913	2924	2897	2897
q4	1978	1958	1964	1958
q5	5480	5431	5443	5431
q6	234	143	148	143
q7	2049	1677	1679	1677
q8	3281	3350	3330	3330
q9	8435	8378	8421	8378
q10	3409	3463	3469	3463
q11	567	464	479	464
q12	777	561	600	561
q13	8059	3020	3035	3020
q14	321	280	268	268
q15	576	517	518	517
q16	495	450	470	450
q17	1779	1725	1724	1724
q18	7962	7669	7799	7669
q19	1758	1741	1731	1731
q20	2041	1774	1810	1774
q21	5765	5472	5491	5472
q22	1125	996	985	985
Total cold run time: 66541 ms
Total hot run time: 59322 ms

@doris-robot

TPC-DS: Total hot run time: 196041 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit fd42d4be3e74da95646bded1e91ad0d44357640a, data reload: false

query1	941	386	391	386
query2	6488	1853	1860	1853
query3	6649	208	222	208
query4	26053	24162	24036	24036
query5	5528	541	530	530
query6	265	175	164	164
query7	4605	300	319	300
query8	294	238	219	219
query9	8518	2567	2590	2567
query10	500	284	279	279
query11	16260	15422	15514	15422
query12	161	100	100	100
query13	1687	411	382	382
query14	11070	7158	7084	7084
query15	234	166	178	166
query16	7629	493	467	467
query17	1571	596	586	586
query18	2071	295	298	295
query19	196	152	149	149
query20	130	119	115	115
query21	214	109	108	108
query22	4650	4511	4582	4511
query23	34676	33973	33783	33783
query24	9386	3189	3120	3120
query25	644	405	423	405
query26	723	157	158	157
query27	2150	286	287	286
query28	5769	2132	2091	2091
query29	901	441	437	437
query30	293	171	156	156
query31	1009	761	825	761
query32	105	59	57	57
query33	728	308	315	308
query34	930	488	479	479
query35	896	743	727	727
query36	1036	913	907	907
query37	138	85	83	83
query38	4111	3878	3834	3834
query39	1468	1405	1411	1405
query40	205	119	120	119
query41	50	51	48	48
query42	120	97	99	97
query43	492	441	447	441
query44	1260	819	782	782
query45	202	175	174	174
query46	1128	832	816	816
query47	1904	1779	1809	1779
query48	371	305	297	297
query49	1083	464	459	459
query50	929	440	458	440
query51	7206	6901	7016	6901
query52	102	90	91	90
query53	260	188	182	182
query54	741	469	469	469
query55	81	78	75	75
query56	297	302	272	272
query57	1248	1091	1085	1085
query58	261	250	244	244
query59	2959	2843	2627	2627
query60	306	290	281	281
query61	127	216	98	98
query62	928	686	680	680
query63	221	187	185	185
query64	5441	696	702	696
query65	3267	3183	3170	3170
query66	1409	294	297	294
query67	16131	15677	15582	15582
query68	3200	872	851	851
query69	439	322	336	322
query70	1146	1172	1153	1153
query71	349	346	347	346
query72	5958	3401	3394	3394
query73	598	589	589	589
query74	9241	9074	9042	9042
query75	3118	3045	2986	2986
query76	1897	854	849	849
query77	444	411	407	407
query78	9557	9325	9324	9324
query79	925	887	868	868
query80	853	839	803	803
query81	456	271	273	271
query82	264	261	256	256
query83	196	194	189	189
query84	231	110	107	107
query85	701	410	398	398
query86	329	307	314	307
query87	4393	4324	4340	4324
query88	4206	4135	4128	4128
query89	369	366	372	366
query90	1249	320	318	318
query91	159	125	125	125
query92	87	76	77	76
query93	1037	1041	1036	1036
query94	619	380	360	360
query95	457	424	422	422
query96	472	474	472	472
query97	3139	3121	3120	3120
query98	223	224	230	224
query99	1609	1302	1302	1302
Total cold run time: 277327 ms
Total hot run time: 196041 ms

@doris-robot

ClickBench: Total hot run time: 31.5 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit fd42d4be3e74da95646bded1e91ad0d44357640a, data reload: false

query1	0.04	0.04	0.04
query2	0.07	0.04	0.04
query3	0.23	0.04	0.04
query4	1.68	0.07	0.07
query5	0.50	0.50	0.50
query6	1.14	0.72	0.72
query7	0.02	0.01	0.01
query8	0.06	0.05	0.04
query9	0.55	0.52	0.51
query10	0.58	0.60	0.56
query11	0.16	0.12	0.12
query12	0.15	0.12	0.12
query13	0.63	0.62	0.59
query14	1.47	1.46	1.46
query15	0.89	0.87	0.86
query16	0.36	0.36	0.38
query17	1.01	1.00	1.02
query18	0.22	0.20	0.20
query19	1.88	1.84	1.80
query20	0.01	0.01	0.01
query21	15.40	0.66	0.66
query22	3.82	7.45	1.75
query23	18.05	1.34	1.38
query24	2.30	0.21	0.22
query25	0.18	0.09	0.08
query26	0.29	0.19	0.17
query27	0.08	0.09	0.07
query28	13.18	1.12	1.09
query29	12.57	3.34	3.34
query30	0.27	0.06	0.06
query31	2.86	0.41	0.42
query32	3.23	0.49	0.49
query33	3.06	3.04	3.05
query34	15.44	4.30	4.30
query35	4.36	4.34	4.39
query36	0.69	0.49	0.49
query37	0.19	0.16	0.17
query38	0.16	0.15	0.14
query39	0.05	0.04	0.04
query40	0.16	0.13	0.14
query41	0.10	0.05	0.05
query42	0.06	0.05	0.05
query43	0.05	0.04	0.04
Total cold run time: 108.2 s
Total hot run time: 31.5 s

@liaoxin01 liaoxin01 left a comment

LGTM

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Sep 12, 2024
PR approved by anyone and no changes requested.

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit 10d9c50 into apache:master Sep 14, 2024
25 of 28 checks passed
dataroaring pushed a commit that referenced this pull request Sep 23, 2024
sollhui added a commit to sollhui/doris that referenced this pull request Sep 23, 2024
yiguolei pushed a commit that referenced this pull request Sep 24, 2024
pick (#40728)
Labels
approved Indicates a PR has been approved by one committer. dev/2.1.7-merged dev/3.0.2-merged reviewed
5 participants