@JNSimba
Member

@JNSimba JNSimba commented Jan 12, 2026

What problem does this PR solve?

Fix the streaming job not being paused when fetching remote metadata fails.

Related PR: #58898

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Jan 12, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@JNSimba
Member Author

JNSimba commented Jan 12, 2026

run buildall

Copilot AI left a comment

Pull request overview

This PR aims to fix an issue where streaming jobs fail to pause correctly when fetching remote metadata fails. The fix addresses the auto-resume mechanism that could incorrectly wake up manually paused jobs.

Changes:

  • Added volatile modifier to the failureReason field for thread-safe access
  • Modified error handling in fetchMeta() to check existing failure reasons before setting new ones
  • Added logic to pause the job and trigger auto-resume when metadata fetch fails
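For readers who have not opened the diff, the three changes listed above could look roughly like the following. This is a minimal sketch, not the Doris implementation: the class `StreamingJobSketch`, the `Cause` and `Status` enums, and the methods `pauseManually()` and `tryAutoResume()` are hypothetical names chosen for illustration. Only the `failureReason` field and the `fetchMeta()` method name come from the review summary.

```java
// Hypothetical sketch of the pause / auto-resume interaction described above.
// All names except failureReason and fetchMeta are assumptions for illustration.
class StreamingJobSketch {
    enum Status { RUNNING, PAUSED }

    enum Cause { MANUAL_PAUSE, META_FETCH_ERROR }

    static final class FailureReason {
        final Cause cause;
        final String message;

        FailureReason(Cause cause, String message) {
            this.cause = cause;
            this.message = message;
        }
    }

    // volatile so that a scheduler thread sees the latest value written by
    // the thread that fetched metadata (or by a user-triggered pause).
    private volatile Status status = Status.RUNNING;
    private volatile FailureReason failureReason;

    Status getStatus() {
        return status;
    }

    FailureReason getFailureReason() {
        return failureReason;
    }

    // A user-initiated pause records its own reason so it can be told apart
    // from a pause caused by an error.
    void pauseManually() {
        failureReason = new FailureReason(Cause.MANUAL_PAUSE, "paused by user");
        status = Status.PAUSED;
    }

    // Runs the remote metadata call. On failure the job is paused, and the
    // existing failure reason is checked first so that a manual pause is not
    // overwritten by a retryable error.
    void fetchMeta(Runnable remoteCall) {
        try {
            remoteCall.run();
        } catch (RuntimeException e) {
            FailureReason existing = failureReason;
            if (existing == null || existing.cause != Cause.MANUAL_PAUSE) {
                failureReason = new FailureReason(
                        Cause.META_FETCH_ERROR, String.valueOf(e.getMessage()));
            }
            status = Status.PAUSED;
        }
    }

    // Auto-resume only wakes jobs that were paused by a metadata fetch error.
    // Returns true if the job was resumed.
    boolean tryAutoResume() {
        FailureReason reason = failureReason;
        if (status == Status.PAUSED
                && reason != null
                && reason.cause == Cause.META_FETCH_ERROR) {
            failureReason = null;
            status = Status.RUNNING;
            return true;
        }
        return false;
    }
}
```

In this sketch the two volatile writes are not atomic as a pair, so a real implementation would likely need a lock or a single state object if status and reason must change together.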

@JNSimba
Member Author

JNSimba commented Jan 12, 2026

run buildall

@doris-robot

TPC-H: Total hot run time: 32111 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit c5fab237661c009795a182678477e8020d954c09, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17621	4293	4164	4164
q2	2135	356	243	243
q3	10085	1263	735	735
q4	10228	915	325	325
q5	7549	2064	1845	1845
q6	193	170	140	140
q7	946	818	669	669
q8	9269	1395	1227	1227
q9	4849	4592	4504	4504
q10	6751	1791	1424	1424
q11	514	297	292	292
q12	699	739	585	585
q13	17765	3879	3118	3118
q14	283	296	274	274
q15	594	513	515	513
q16	681	699	623	623
q17	664	804	508	508
q18	6550	6467	6865	6467
q19	1330	1085	634	634
q20	440	400	268	268
q21	3272	2688	2493	2493
q22	1118	1094	1060	1060
Total cold run time: 103536 ms
Total hot run time: 32111 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4291	4499	4369	4369
q2	321	399	336	336
q3	2228	2841	2395	2395
q4	1461	1847	1418	1418
q5	4410	4339	4262	4262
q6	215	171	140	140
q7	2018	2104	1715	1715
q8	2563	2335	2410	2335
q9	7272	7202	7067	7067
q10	2514	2757	2269	2269
q11	550	464	475	464
q12	756	759	619	619
q13	3707	4053	3375	3375
q14	312	323	291	291
q15	553	506	527	506
q16	664	700	631	631
q17	1183	1322	1346	1322
q18	7372	7178	7230	7178
q19	832	795	805	795
q20	1904	1993	1787	1787
q21	4571	4325	4122	4122
q22	1040	1008	969	969
Total cold run time: 50737 ms
Total hot run time: 48365 ms

@doris-robot

TPC-DS: Total hot run time: 173551 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c5fab237661c009795a182678477e8020d954c09, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	5652	638	471	471
query6	371	245	222	222
query7	4211	455	269	269
query8	350	252	245	245
query9	8811	2654	2660	2654
query10	565	381	333	333
query11	15418	15243	14977	14977
query12	194	128	118	118
query13	1258	485	378	378
query14	7776	3131	2975	2975
query14_1	2872	2807	2858	2807
query15	261	197	179	179
query16	999	478	461	461
query17	1266	687	550	550
query18	2697	425	331	331
query19	301	224	198	198
query20	131	126	123	123
query21	217	139	127	127
query22	3986	4032	3829	3829
query23	16030	15645	15243	15243
query23_1	15157	15398	15342	15342
query24	6907	1568	1192	1192
query24_1	1191	1169	1173	1169
query25	530	456	395	395
query26	1225	260	155	155
query27	2698	464	290	290
query28	4498	2131	2119	2119
query29	782	550	457	457
query30	322	246	226	226
query31	799	632	573	573
query32	77	69	73	69
query33	537	363	286	286
query34	893	902	535	535
query35	757	776	703	703
query36	860	851	852	851
query37	131	98	82	82
query38	2817	2749	2708	2708
query39	787	759	723	723
query39_1	706	735	713	713
query40	219	136	118	118
query41	67	66	63	63
query42	108	108	110	108
query43	440	449	428	428
query44	1349	725	727	725
query45	191	182	179	179
query46	861	980	606	606
query47	1387	1442	1320	1320
query48	316	327	236	236
query49	603	409	335	335
query50	660	278	203	203
query51	3866	3806	3837	3806
query52	110	114	97	97
query53	307	339	278	278
query54	285	262	256	256
query55	85	79	72	72
query56	315	301	301	301
query57	1030	985	875	875
query58	271	257	267	257
query59	2027	2111	2147	2111
query60	331	331	298	298
query61	165	168	155	155
query62	387	347	309	309
query63	315	272	281	272
query64	4939	1433	1119	1119
query65	3858	3749	3741	3741
query66	1379	434	323	323
query67	15216	15703	14539	14539
query68	2756	1038	755	755
query69	494	369	334	334
query70	1049	853	1005	853
query71	350	318	340	318
query72	6004	3396	3465	3396
query73	581	733	301	301
query74	8807	8669	8607	8607
query75	2812	2869	2505	2505
query76	2667	1072	668	668
query77	524	369	285	285
query78	9731	9953	9097	9097
query79	971	932	599	599
query80	1096	582	497	497
query81	551	274	240	240
query82	217	148	116	116
query83	403	270	243	243
query84	258	115	109	109
query85	1047	517	468	468
query86	404	323	321	321
query87	2905	2982	2826	2826
query88	3200	2222	2218	2218
query89	382	359	334	334
query90	1814	158	161	158
query91	176	171	144	144
query92	75	71	66	66
query93	950	898	529	529
query94	580	323	279	279
query95	562	385	307	307
query96	601	462	205	205
query97	2334	2408	2302	2302
query98	217	202	208	202
query99	599	589	530	530
Total cold run time: 250548 ms
Total hot run time: 173551 ms

@JNSimba
Member Author

JNSimba commented Jan 12, 2026

run cloud_p0

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 0.00% (0/5) 🎉
Increment coverage report
Complete coverage report

@JNSimba
Member Author

JNSimba commented Jan 12, 2026

run buildall

@JNSimba
Member Author

JNSimba commented Jan 12, 2026

run buildall

@doris-robot

TPC-H: Total hot run time: 31701 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 6bc3e07b9eb4ff07790931bf67bab037e00c8226, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17609	4210	4032	4032
q2	2036	355	243	243
q3	10135	1323	685	685
q4	10221	919	333	333
q5	7548	2097	1887	1887
q6	192	171	140	140
q7	918	792	658	658
q8	9289	1475	1203	1203
q9	4976	4601	4577	4577
q10	6806	1790	1399	1399
q11	542	310	291	291
q12	691	747	595	595
q13	17808	3846	3083	3083
q14	287	300	285	285
q15	600	523	505	505
q16	691	666	642	642
q17	670	770	569	569
q18	6530	6461	6474	6461
q19	1097	967	602	602
q20	396	361	251	251
q21	2973	2484	2327	2327
q22	1034	1009	933	933
Total cold run time: 103049 ms
Total hot run time: 31701 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4125	4040	4038	4038
q2	335	391	332	332
q3	2094	2613	2207	2207
q4	1298	1751	1339	1339
q5	4077	3990	4086	3990
q6	214	171	130	130
q7	1868	1830	1700	1700
q8	2816	2548	2491	2491
q9	7157	7214	7045	7045
q10	2492	2743	2309	2309
q11	558	468	453	453
q12	717	800	663	663
q13	3631	4137	3274	3274
q14	289	318	280	280
q15	561	641	513	513
q16	651	677	625	625
q17	1331	1282	1347	1282
q18	8075	7827	7659	7659
q19	889	873	950	873
q20	2026	2112	1925	1925
q21	4921	4588	4303	4303
q22	1060	988	982	982
Total cold run time: 51185 ms
Total hot run time: 48413 ms

@doris-robot

TPC-DS: Total hot run time: 172533 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6bc3e07b9eb4ff07790931bf67bab037e00c8226, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	4369	587	443	443
query6	349	226	221	221
query7	4207	449	265	265
query8	334	240	233	233
query9	8791	2671	2657	2657
query10	502	367	327	327
query11	15367	15151	14992	14992
query12	187	115	117	115
query13	1260	496	372	372
query14	6175	3098	2855	2855
query14_1	2692	2707	2721	2707
query15	208	194	174	174
query16	1013	469	439	439
query17	1116	678	586	586
query18	2432	451	347	347
query19	227	222	197	197
query20	127	116	116	116
query21	216	139	120	120
query22	3934	3970	3880	3880
query23	15994	15565	15148	15148
query23_1	15433	15492	15309	15309
query24	7360	1542	1190	1190
query24_1	1185	1181	1189	1181
query25	568	481	421	421
query26	1242	270	176	176
query27	2748	461	303	303
query28	4555	2142	2129	2129
query29	801	569	467	467
query30	317	237	209	209
query31	799	638	557	557
query32	83	73	76	73
query33	547	342	297	297
query34	898	876	536	536
query35	725	774	735	735
query36	856	874	775	775
query37	129	92	81	81
query38	2760	2750	2697	2697
query39	779	746	726	726
query39_1	707	696	714	696
query40	214	131	112	112
query41	64	60	61	60
query42	105	101	107	101
query43	438	497	417	417
query44	1331	713	717	713
query45	193	184	173	173
query46	849	998	588	588
query47	1400	1481	1277	1277
query48	319	328	242	242
query49	616	410	330	330
query50	637	266	196	196
query51	3755	3794	3666	3666
query52	106	104	93	93
query53	298	325	281	281
query54	287	257	258	257
query55	76	73	71	71
query56	282	289	293	289
query57	1031	1017	930	930
query58	268	251	286	251
query59	2175	2218	2205	2205
query60	325	310	317	310
query61	159	162	162	162
query62	396	359	321	321
query63	310	263	272	263
query64	4921	1305	1011	1011
query65	3789	3690	3748	3690
query66	1455	425	302	302
query67	15013	14540	14592	14540
query68	5478	1002	704	704
query69	510	351	314	314
query70	1056	919	904	904
query71	365	301	290	290
query72	6074	3503	3414	3414
query73	777	754	304	304
query74	8802	8875	8633	8633
query75	2851	2812	2437	2437
query76	3458	1063	655	655
query77	520	383	284	284
query78	9748	9797	9185	9185
query79	1269	922	608	608
query80	663	564	494	494
query81	517	258	227	227
query82	207	142	113	113
query83	263	260	245	245
query84	259	128	100	100
query85	933	529	459	459
query86	382	334	340	334
query87	2922	2935	2773	2773
query88	3242	2223	2199	2199
query89	404	352	336	336
query90	2062	164	153	153
query91	180	165	141	141
query92	77	67	63	63
query93	1101	928	543	543
query94	572	328	301	301
query95	573	377	302	302
query96	591	471	207	207
query97	2341	2371	2308	2308
query98	237	201	207	201
query99	601	577	500	500
Total cold run time: 250964 ms
Total hot run time: 172533 ms

@JNSimba
Member Author

JNSimba commented Jan 12, 2026

run cloud_p0

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 0.00% (0/5) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/5) 🎉
Increment coverage report
Complete coverage report

@JNSimba
Member Author

JNSimba commented Jan 12, 2026

run cloud_p0

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/5) 🎉
Increment coverage report
Complete coverage report

Contributor

@sollhui sollhui left a comment

LGTM

@github-actions
Contributor

PR approved by anyone and no changes requested.

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Jan 13, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@JNSimba JNSimba merged commit 4bf66fb into apache:master Jan 13, 2026
28 of 29 checks passed
github-actions bot pushed a commit that referenced this pull request Jan 13, 2026
…59760)

### What problem does this PR solve?

Fix the streaming job not being paused when fetching remote metadata fails.

Related PR: #58898
yiguolei pushed a commit that referenced this pull request Jan 13, 2026
…aming job #59760 (#59808)

Cherry-picked from #59760

Co-authored-by: wudi <wudi@selectdb.com>
zzzxl1993 pushed a commit to zzzxl1993/doris that referenced this pull request Jan 13, 2026
…pache#59760)

### What problem does this PR solve?

Fix the streaming job not being paused when fetching remote metadata fails.

Related PR: apache#58898
Labels

approved (Indicates a PR has been approved by one committer.), dev/4.0.3-merged, reviewed


7 participants