[fix](s3) Add anonymous credential fallback for S3 TVF and Broker Load on public buckets #60515

Open
dataroaring wants to merge 2 commits into master from fix/s3-tvf-anonymous-fallback-public-buckets

@dataroaring dataroaring commented Feb 5, 2026

Summary

  • Add retry-with-anonymous logic in S3TableValuedFunction (TVF) and BrokerLoadPendingTask (Broker Load): when parseFile() fails with 403 and no explicit S3 credentials were provided, switch to ANONYMOUS credentials and retry
  • For Broker Load, update both the pending task's and parent job's brokerDesc so the downstream BE scan phase also uses anonymous access
  • Extract shared 403 detection and anonymous BrokerDesc creation into BrokerDesc.isS3AccessDeniedWithoutExplicitCredentials() and BrokerDesc.withAnonymousCredentials(), consolidating logic used by both paths
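
The retry flow summarized above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `AnonymousRetrySketch`, `ListingException`, `FileLister`, and the `credentials_provider_type` key are hypothetical stand-ins, and the property keys are simplified from the names listed in the summary.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of retry-with-anonymous; names and keys are placeholders, not Doris code. */
final class AnonymousRetrySketch {

    /** Stand-in for the listing failure; carries the HTTP status reported by S3. */
    static final class ListingException extends RuntimeException {
        final int httpStatus;

        ListingException(int httpStatus, String message) {
            super(message);
            this.httpStatus = httpStatus;
        }
    }

    /** Stand-in for parseFile(): returns the number of files listed. */
    interface FileLister {
        int list(Map<String, String> props);
    }

    /** True when the user supplied any of the credential properties named in the summary. */
    static boolean hasExplicitCredentials(Map<String, String> props) {
        return props.containsKey("access_key")
                || props.containsKey("secret_key")
                || props.containsKey("role_arn");
    }

    static int listWithAnonymousFallback(FileLister lister, Map<String, String> props) {
        try {
            return lister.list(props);
        } catch (ListingException original) {
            // Only a 403 without user-supplied credentials is eligible for fallback.
            if (original.httpStatus != 403 || hasExplicitCredentials(props)) {
                throw original;
            }
            Map<String, String> anonymous = new HashMap<>(props);
            anonymous.put("credentials_provider_type", "ANONYMOUS"); // assumed key name
            try {
                return lister.list(anonymous);
            } catch (ListingException retryFailure) {
                // If the anonymous attempt also fails, surface the first error.
                original.addSuppressed(retryFailure);
                throw original;
            }
        }
    }
}
```

The sketch copies the property map instead of mutating it, so a failed retry leaves the caller's configuration untouched. Whether the real implementation does the same is not stated here.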

Test plan

  • S3TableValuedFunctionTest — TVF anonymous fallback tests (403 no creds, explicit creds, both fail, non-403)
  • BrokerLoadPendingTaskTest.testAnonymousFallbackOn403NoCredentials — 403 with no credentials triggers anonymous fallback, both task and job brokerDesc updated
  • BrokerLoadPendingTaskTest.testNoFallbackWhenExplicitCredentials — 403 with explicit access_key/secret_key does not trigger fallback
  • BrokerLoadPendingTaskTest.testOriginalErrorThrownWhenBothAttemptsFail — when anonymous retry also fails, original error is thrown
  • BrokerLoadPendingTaskTest.testNoFallbackOnNon403Error — non-403 errors (e.g. 404) do not trigger fallback
  • Manual test: Broker Load from public S3 bucket on instance with IAM role

🤖 Generated with Claude Code

When Doris runs on an instance with an IAM role, the default AWS credential
chain picks up instance profile credentials before reaching the anonymous
fallback. If that role lacks s3:ListBucket on a public bucket, the S3 TVF
query fails with 403.
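
To illustrate why the anonymous fallback is never reached, here is a small model of an ordered credential chain. It is a sketch under assumptions: `CredentialChainSketch` and `Provider` are hypothetical and do not mirror the AWS SDK or Doris classes. The only behavior taken from the description above is that providers are tried in order and the first one that yields credentials wins.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical model of an ordered credential chain; not the AWS SDK or Doris API. */
final class CredentialChainSketch {

    /** A named provider that may or may not be able to supply credentials. */
    static final class Provider {
        final String name;
        final Supplier<Optional<String>> credentials;

        Provider(String name, Supplier<Optional<String>> credentials) {
            this.name = name;
            this.credentials = credentials;
        }
    }

    /** Returns the name of the first provider in the chain that yields credentials. */
    static String resolve(List<Provider> chain) {
        for (Provider provider : chain) {
            if (provider.credentials.get().isPresent()) {
                return provider.name;
            }
        }
        throw new IllegalStateException("no provider in the chain produced credentials");
    }
}
```

With a chain of `[instance-profile, anonymous]`, `resolve` returns `instance-profile` whenever the instance has a role attached, regardless of whether that role is allowed to list the bucket. The anonymous entry is only selected on machines where the instance profile yields nothing.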

Add retry-with-anonymous logic in S3TableValuedFunction: when parseFile()
fails with 403 and no explicit credentials (access_key, secret_key, or
role_arn) were provided, switch to ANONYMOUS credentials and retry. All
three property maps (storageProperties, backendConnectProperties,
processedParams) are updated so both FE listing and BE data reading use
anonymous access.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 5, 2026 02:22
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Copilot AI left a comment

Pull request overview

This PR adds automatic fallback to anonymous S3 credentials when accessing public S3 buckets from Doris instances running with IAM roles. The issue occurs because the AWS credential chain picks up instance profile credentials before trying anonymous access, causing 403 errors on public buckets.

Changes:

  • Adds retry-with-anonymous logic in S3TableValuedFunction that triggers on 403 errors when no explicit credentials are provided
  • Updates all three property maps (storageProperties, backendConnectProperties, processedParams) to use anonymous credentials during retry
  • Includes comprehensive unit tests covering all edge cases (explicit credentials, role ARN, retry failures, non-403 errors)

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
fe/fe-core/src/main/java/org/apache/doris/tablefunction/S3TableValuedFunction.java	Implements the anonymous credential fallback logic with retry mechanism and credential checks
fe/fe-core/src/test/java/org/apache/doris/tablefunction/S3TableValuedFunctionTest.java	Adds comprehensive test coverage for all anonymous fallback scenarios

        parseFile();
    } catch (AnalysisException e) {
        if (shouldRetryWithAnonymous(e)) {
            LOG.info("S3 TVF got 403 with no explicit credentials, retrying with anonymous access");

Copilot AI Feb 5, 2026

The code uses LOG (at lines 67 and 71) which is inherited from the parent class ExternalFileTableValuedFunction. This means log messages will be associated with the parent class name rather than S3TableValuedFunction. For consistency with other table-valued functions in the codebase (e.g., HdfsTableValuedFunction defines its own LOG at line 38), consider adding a static LOG field specific to this class.
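
The difference this comment describes can be shown with a small sketch. It uses `java.util.logging` only to stay self-contained. The logging library the project actually uses is not shown on this page, and the class names below are hypothetical stand-ins for the parent and child table-valued functions.

```java
import java.util.logging.Logger;

/** Hypothetical parent; stands in for a base class that declares a shared LOG. */
class ParentTvfSketch {
    protected static final Logger LOG = Logger.getLogger(ParentTvfSketch.class.getName());

    String loggerName() {
        return LOG.getName();
    }
}

/** Inherits LOG, so its messages carry the parent's logger name. */
class InheritingTvfSketch extends ParentTvfSketch {
}

/** Declares its own LOG, which hides the inherited field inside this class. */
class OwnLoggerTvfSketch extends ParentTvfSketch {
    private static final Logger LOG = Logger.getLogger(OwnLoggerTvfSketch.class.getName());

    @Override
    String loggerName() {
        return LOG.getName();
    }
}
```

Here `new InheritingTvfSketch().loggerName()` reports the parent's class name, while `new OwnLoggerTvfSketch().loggerName()` reports its own, which is the behavior the suggested class-specific field would give.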

@dataroaring dataroaring changed the title from "[fix](s3) Add anonymous credential fallback for S3 TVF on public buckets" to "[fix](s3) Add anonymous credential fallback for S3 TVF and Broker Load on public buckets" Feb 5, 2026
@dataroaring
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 30972 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 282d5f83ada214bf6859ec41a627c460c25e80c7, data reload: false

------ Round 1 ----------------------------------
q1	17601	4465	4308	4308
q2	2047	358	243	243
q3	10126	1343	736	736
q4	10219	907	321	321
q5	7536	2158	1900	1900
q6	194	176	146	146
q7	871	721	585	585
q8	9261	1338	1080	1080
q9	5118	4952	4791	4791
q10	6783	1958	1561	1561
q11	531	306	274	274
q12	336	384	230	230
q13	17786	4092	3254	3254
q14	232	230	213	213
q15	893	836	815	815
q16	697	656	619	619
q17	653	834	453	453
q18	6739	6608	6368	6368
q19	1235	1026	617	617
q20	380	358	244	244
q21	2604	1943	1942	1942
q22	359	309	272	272
Total cold run time: 102201 ms
Total hot run time: 30972 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4785	4333	4333	4333
q2	259	341	236	236
q3	2106	2593	2210	2210
q4	1356	1736	1328	1328
q5	4289	4223	4251	4223
q6	210	179	138	138
q7	1855	1780	1705	1705
q8	2788	2544	2467	2467
q9	7453	7632	7511	7511
q10	2799	3205	2665	2665
q11	556	476	476	476
q12	705	808	618	618
q13	3844	4368	3503	3503
q14	320	315	306	306
q15	903	786	806	786
q16	691	747	706	706
q17	1173	1392	1405	1392
q18	8182	8064	7946	7946
q19	893	892	878	878
q20	2083	2134	1986	1986
q21	4852	4401	4518	4401
q22	599	560	533	533
Total cold run time: 52701 ms
Total hot run time: 50347 ms

@doris-robot

ClickBench: Total hot run time: 28.63 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 282d5f83ada214bf6859ec41a627c460c25e80c7, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.04	0.05
query3	0.26	0.09	0.08
query4	1.61	0.11	0.11
query5	0.26	0.24	0.25
query6	1.16	0.67	0.68
query7	0.03	0.02	0.03
query8	0.05	0.04	0.04
query9	0.58	0.51	0.49
query10	0.55	0.54	0.54
query11	0.15	0.09	0.10
query12	0.14	0.11	0.10
query13	0.63	0.62	0.62
query14	1.07	1.07	1.05
query15	0.88	0.85	0.87
query16	0.40	0.42	0.41
query17	1.09	1.15	1.18
query18	0.24	0.21	0.21
query19	2.12	1.93	2.04
query20	0.02	0.01	0.01
query21	15.40	0.29	0.15
query22	5.23	0.05	0.05
query23	15.94	0.29	0.11
query24	2.49	0.65	0.58
query25	0.09	0.05	0.08
query26	0.14	0.14	0.14
query27	0.07	0.05	0.04
query28	4.80	1.15	0.97
query29	12.55	3.89	3.17
query30	0.29	0.14	0.12
query31	2.82	0.66	0.41
query32	3.25	0.61	0.50
query33	3.20	3.29	3.31
query34	16.58	5.40	4.69
query35	4.81	4.80	4.79
query36	0.65	0.50	0.50
query37	0.11	0.07	0.08
query38	0.08	0.04	0.04
query39	0.04	0.04	0.04
query40	0.20	0.18	0.14
query41	0.08	0.04	0.03
query42	0.04	0.02	0.02
query43	0.05	0.04	0.03
Total cold run time: 100.29 s
Total hot run time: 28.63 s

@dataroaring dataroaring force-pushed the fix/s3-tvf-anonymous-fallback-public-buckets branch from 282d5f8 to f35e2df Compare February 5, 2026 03:53
@dataroaring
Contributor Author

run buildall

… S3 buckets

When Doris runs on an instance with an IAM role, the default AWS
credential chain picks up instance profile credentials before reaching
the anonymous fallback. If that role lacks s3:ListBucket on a public
bucket, Broker Load fails with 403 during the pending task file listing.

Add retry-with-anonymous logic in BrokerLoadPendingTask.getAllFileStatus():
when BrokerUtil.parseFile() fails with 403 and no explicit credentials
(access_key, secret_key, or role_arn) were provided, switch to ANONYMOUS
credentials and retry. Both the pending task's and parent job's brokerDesc
are updated so the BE scan phase also uses anonymous access.

Extract the shared 403 detection and anonymous BrokerDesc creation logic
into BrokerDesc.isS3AccessDeniedWithoutExplicitCredentials() and
BrokerDesc.withAnonymousCredentials(), consolidating the duplicate code
from S3TableValuedFunction.
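
A rough sketch of how the two extracted helpers and the task/job update might fit together is shown below. Only the method names `isS3AccessDeniedWithoutExplicitCredentials` and `withAnonymousCredentials` come from this commit message. Their signatures, the property keys, the 403 check on the exception message, and the `BrokerDescSketch` and `BrokerDescHolderSketch` classes are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for BrokerDesc; signatures and property keys are assumptions. */
final class BrokerDescSketch {
    private final String storageType;
    private final Map<String, String> properties;

    BrokerDescSketch(String storageType, Map<String, String> properties) {
        this.storageType = storageType;
        this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
    }

    Map<String, String> getProperties() {
        return properties;
    }

    /** Assumed detection rule: S3 storage, "403" in the message, no user-supplied credentials. */
    boolean isS3AccessDeniedWithoutExplicitCredentials(Exception e) {
        String message = e.getMessage();
        boolean denied = message != null && message.contains("403");
        boolean explicit = properties.containsKey("access_key")
                || properties.containsKey("secret_key")
                || properties.containsKey("role_arn");
        return "S3".equals(storageType) && denied && !explicit;
    }

    /** Returns a copy configured for anonymous access; the key name is a placeholder. */
    BrokerDescSketch withAnonymousCredentials() {
        Map<String, String> copy = new HashMap<>(properties);
        copy.put("credentials_provider_type", "ANONYMOUS");
        return new BrokerDescSketch(storageType, copy);
    }
}

/** Hypothetical holder mirroring the pending task and its parent job. */
final class BrokerDescHolderSketch {
    BrokerDescSketch brokerDesc;

    BrokerDescHolderSketch(BrokerDescSketch brokerDesc) {
        this.brokerDesc = brokerDesc;
    }

    /** Points both holders at the same anonymous descriptor so later phases agree. */
    static void switchToAnonymous(BrokerDescHolderSketch task, BrokerDescHolderSketch job) {
        BrokerDescSketch anonymous = task.brokerDesc.withAnonymousCredentials();
        task.brokerDesc = anonymous;
        job.brokerDesc = anonymous;
    }
}
```

Returning a new descriptor rather than mutating the existing one keeps the original available if the retry fails. Updating both holders reflects the point made above that the BE scan phase reads the job's descriptor, not only the pending task's.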

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@dataroaring dataroaring force-pushed the fix/s3-tvf-anonymous-fallback-public-buckets branch from f35e2df to 0babf23 Compare February 5, 2026 04:16
@doris-robot

TPC-H: Total hot run time: 30812 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f35e2df947d0bd2db0904bc0595107ca688f1419, data reload: false

------ Round 1 ----------------------------------
q1	17605	4413	4280	4280
q2	2014	327	225	225
q3	10201	1288	711	711
q4	10203	812	305	305
q5	7495	2121	1889	1889
q6	192	175	149	149
q7	849	711	587	587
q8	9261	1333	1097	1097
q9	5278	4839	4860	4839
q10	6810	1959	1506	1506
q11	530	289	271	271
q12	331	377	220	220
q13	17806	4022	3243	3243
q14	223	234	216	216
q15	886	829	796	796
q16	686	677	615	615
q17	636	808	480	480
q18	6698	6357	6443	6357
q19	1242	982	632	632
q20	380	352	250	250
q21	2600	2046	1869	1869
q22	356	316	275	275
Total cold run time: 102282 ms
Total hot run time: 30812 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4332	4347	4363	4347
q2	267	333	254	254
q3	2110	2578	2264	2264
q4	1353	1722	1286	1286
q5	4275	4141	4273	4141
q6	208	180	135	135
q7	1867	1805	1700	1700
q8	2703	2558	2469	2469
q9	7783	7740	7625	7625
q10	2965	3057	2664	2664
q11	544	492	465	465
q12	712	797	717	717
q13	3870	4352	3545	3545
q14	281	312	319	312
q15	897	810	831	810
q16	696	714	682	682
q17	1141	1331	1408	1331
q18	8298	7970	8017	7970
q19	844	837	819	819
q20	2066	2194	2039	2039
q21	4910	4351	4128	4128
q22	581	563	502	502
Total cold run time: 52703 ms
Total hot run time: 50205 ms

@doris-robot

ClickBench: Total hot run time: 28.03 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f35e2df947d0bd2db0904bc0595107ca688f1419, data reload: false

query1	0.05	0.04	0.04
query2	0.10	0.05	0.05
query3	0.26	0.09	0.08
query4	1.61	0.11	0.10
query5	0.26	0.24	0.25
query6	1.17	0.69	0.68
query7	0.03	0.03	0.02
query8	0.05	0.04	0.04
query9	0.55	0.49	0.50
query10	0.56	0.55	0.55
query11	0.14	0.09	0.09
query12	0.14	0.10	0.10
query13	0.64	0.61	0.61
query14	1.07	1.08	1.04
query15	0.87	0.84	0.86
query16	0.41	0.38	0.43
query17	1.15	1.15	1.09
query18	0.23	0.20	0.21
query19	2.14	1.97	2.08
query20	0.02	0.02	0.02
query21	15.50	0.27	0.15
query22	5.23	0.05	0.06
query23	15.94	0.29	0.11
query24	1.40	0.32	0.18
query25	0.12	0.07	0.06
query26	0.15	0.15	0.13
query27	0.08	0.08	0.05
query28	3.24	1.14	0.96
query29	12.57	3.87	3.14
query30	0.27	0.14	0.11
query31	2.82	0.63	0.40
query32	3.24	0.61	0.50
query33	3.26	3.25	3.21
query34	16.63	5.42	4.75
query35	4.80	4.81	4.75
query36	0.64	0.51	0.49
query37	0.11	0.07	0.07
query38	0.07	0.05	0.05
query39	0.04	0.02	0.02
query40	0.19	0.17	0.14
query41	0.09	0.03	0.03
query42	0.05	0.03	0.02
query43	0.05	0.04	0.03
Total cold run time: 97.94 s
Total hot run time: 28.03 s

@dataroaring
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31330 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0babf23a89c119475ecdb9e5266e9c7dd43a1c19, data reload: false

------ Round 1 ----------------------------------
q1	17649	4478	4294	4294
q2	2046	355	224	224
q3	10156	1327	717	717
q4	10203	840	320	320
q5	7667	2180	1966	1966
q6	231	176	145	145
q7	914	718	590	590
q8	9272	1356	1165	1165
q9	5412	4932	4871	4871
q10	6875	1945	1575	1575
q11	526	299	280	280
q12	399	376	220	220
q13	17790	4102	3209	3209
q14	237	232	219	219
q15	904	828	804	804
q16	702	674	627	627
q17	668	805	507	507
q18	6962	6582	6507	6507
q19	1371	1018	624	624
q20	387	360	245	245
q21	2648	2061	1947	1947
q22	369	316	274	274
Total cold run time: 103388 ms
Total hot run time: 31330 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4408	4337	4366	4337
q2	266	341	256	256
q3	2087	2667	2240	2240
q4	1373	1755	1292	1292
q5	4281	4174	4201	4174
q6	228	180	135	135
q7	1881	1778	1999	1778
q8	2602	2501	2499	2499
q9	7679	7518	7436	7436
q10	2916	3009	2580	2580
q11	557	486	494	486
q12	711	761	600	600
q13	3886	4373	3647	3647
q14	294	319	278	278
q15	858	827	791	791
q16	686	764	674	674
q17	1157	1351	1346	1346
q18	8000	8041	8061	8041
q19	890	844	873	844
q20	2095	2353	1928	1928
q21	4844	4473	4196	4196
q22	583	558	494	494
Total cold run time: 52282 ms
Total hot run time: 50052 ms

@doris-robot

ClickBench: Total hot run time: 28.4 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 0babf23a89c119475ecdb9e5266e9c7dd43a1c19, data reload: false

query1	0.05	0.04	0.04
query2	0.09	0.04	0.04
query3	0.26	0.08	0.09
query4	1.61	0.12	0.11
query5	0.28	0.24	0.26
query6	1.17	0.68	0.66
query7	0.03	0.03	0.02
query8	0.06	0.05	0.04
query9	0.57	0.50	0.50
query10	0.56	0.54	0.55
query11	0.14	0.10	0.10
query12	0.14	0.10	0.10
query13	0.63	0.62	0.62
query14	1.06	1.07	1.04
query15	0.87	0.87	0.88
query16	0.41	0.40	0.40
query17	1.19	1.16	1.06
query18	0.22	0.21	0.21
query19	1.99	2.04	2.10
query20	0.03	0.02	0.02
query21	15.39	0.28	0.15
query22	5.04	0.07	0.06
query23	15.94	0.29	0.11
query24	1.04	0.34	0.51
query25	0.08	0.12	0.07
query26	0.14	0.13	0.13
query27	0.06	0.06	0.08
query28	4.15	1.14	0.97
query29	12.55	3.89	3.17
query30	0.27	0.15	0.12
query31	2.82	0.64	0.40
query32	3.24	0.59	0.50
query33	3.25	3.18	3.25
query34	16.13	5.45	4.74
query35	4.79	4.83	4.81
query36	0.66	0.49	0.50
query37	0.11	0.07	0.07
query38	0.08	0.04	0.04
query39	0.04	0.02	0.02
query40	0.19	0.16	0.15
query41	0.09	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 97.5 s
Total hot run time: 28.4 s

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 43.62% (41/94) 🎉
Increment coverage report
Complete coverage report
