Skip to content

branch-4.1: [improvement](cloud) Support configurable S3 credentials providers #62788#63680

Open
wyxxxcat wants to merge 1 commit into
apache:branch-4.1from
wyxxxcat:pick_62788_4_1
Open

branch-4.1: [improvement](cloud) Support configurable S3 credentials providers #62788#63680
wyxxxcat wants to merge 1 commit into
apache:branch-4.1from
wyxxxcat:pick_62788_4_1

Conversation

@wyxxxcat
Copy link
Copy Markdown
Collaborator

pick: #62788

…pache#62788)

Support configuring the AWS S3 credentials provider type for cloud
storage vaults and S3 storage parameters. This allows IAM role based
access to use providers other than instance profile, such as
environment, system
properties, web identity, container, default chain, and anonymous
credentials. The credential provider type is propagated through FE
properties, thrift/proto definitions, meta service storage vault
handling, and cloud
  recycler S3 accessor creation.

```
CREATE storage vault IF NOT EXISTS s3_vault_2
    PROPERTIES (
    "type"="S3",
    "s3.endpoint"="xxx",
    "s3.role_arn" = "xxx",
    "s3.region" = "us-west-1",
    "s3.root.path" = "xxx",
    "s3.bucket" = "xxx",
    "provider" = "S3",
    "use_path_style" = "false",
    "s3.credentials_provider_type" = "WEB_IDENTITY",
    "s3_validity_check" = "false"
);

```

```
curl "127.0.0.1:5678/MetaService/http/get_obj_store_info
```
```
{
    "id": "3",
    "name": "s3_vault_2",
    "obj_info": {
        "ctime": "1777011933",
        "mtime": "1777011933",
        "id": "3",
        "bucket": "xxx",
        "prefix": "wyx_test",
        "endpoint": "xxx",
        "region": "us-west-1",
        "provider": "S3",
        "external_endpoint": "",
        "sse_enabled": true,
        "use_path_style": false,
        "cred_provider_type": "WEB_IDENTITY",
        "role_arn": "xxx",
        "external_id": ""
    }
}
```
@wyxxxcat wyxxxcat requested a review from yiguolei as a code owner May 26, 2026 10:00
@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@wyxxxcat
Copy link
Copy Markdown
Collaborator Author

run buildall

@wyxxxcat
Copy link
Copy Markdown
Collaborator Author

/review

@hello-stephen
Copy link
Copy Markdown
Contributor

Cloud UT Coverage Report

Increment line coverage 31.82% (14/44) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.08% (1849/2368)
Line Coverage 64.82% (33237/51275)
Region Coverage 65.32% (16457/25193)
Branch Coverage 55.84% (8791/15744)

@hello-stephen
Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 54.84% (17/31) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 0.84% (1/119) 🎉
Increment coverage report
Complete coverage report

@wyxxxcat
Copy link
Copy Markdown
Collaborator Author

/review

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 0.84% (1/119) 🎉
Increment coverage report
Complete coverage report

1 similar comment
@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 0.84% (1/119) 🎉
Increment coverage report
Complete coverage report

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review summary:

I found blocking issues in the new configurable S3 credentials provider support. The AWS_* alias is only wired for the static cloud/thrift conversion path, so FE object-bound paths can ignore AWS_CREDENTIALS_PROVIDER_TYPE. Also, the newly exposed CONTAINER provider only handles AWS_CONTAINER_CREDENTIALS_RELATIVE_URI, while ECS/Fargate can use AWS_CONTAINER_CREDENTIALS_FULL_URI plus AWS_CONTAINER_AUTHORIZATION_TOKEN; this affects both the cloud recycler path and the BE thrift path made reachable by this PR.

Critical checkpoints:

  • Existing inline threads: none were present, so no duplicate comments were submitted.
  • User focus: no additional focus was provided.
  • Repository AGENTS/code-review guidance: the only applicable AGENTS.md found is for Nereids expression legality checks; this PR does not touch those files, so that checkpoint is not applicable.
  • Backward compatibility / runtime behavior: request changes because valid documented AWS credential-provider configurations can be silently ignored or fail at runtime.

@@ -468,6 +469,7 @@ public static class Env {

public static final String ROLE_ARN = "AWS_ROLE_ARN";
public static final String EXTERNAL_ID = "AWS_EXTERNAL_ID";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds the AWS_* spelling for the static cloud/thrift conversion helpers, but the object-bound S3Properties path still does not bind that key because the @ConnectorProperty for awsCredentialsProviderType only lists s3.credentials_provider_type and glue.credentials_provider_type. A user creating S3 properties with AWS_ROLE_ARN plus AWS_CREDENTIALS_PROVIDER_TYPE=env goes through StorageProperties.createPrimary, leaves awsCredentialsProviderType at DEFAULT, and getBackendConfigProperties()/FE STS validation use the wrong provider even though getObjStoreInfoPB() would accept it. Please add this alias to the connector property binding (or normalize before binding) and cover the createPrimary/backend-config path with an AWS_* test.

return std::make_shared<Aws::Auth::ProfileConfigFileAWSCredentialsProvider>();
case CredProviderType::WebIdentity:
return std::make_shared<Aws::Auth::STSAssumeRoleWebIdentityCredentialsProvider>();
case CredProviderType::Container:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explicit CONTAINER provider only looks at AWS_CONTAINER_CREDENTIALS_RELATIVE_URI. ECS/Fargate can provide credentials via AWS_CONTAINER_CREDENTIALS_FULL_URI (with optional AWS_CONTAINER_AUTHORIZATION_TOKEN), and the existing CustomAwsCredentialsProviderChain in this repo already handles both forms. With this implementation, a vault configured with s3.credentials_provider_type=CONTAINER on a FULL_URI-only task constructs TaskRoleCredentialsProvider with an empty relative path and the recycler cannot obtain credentials. Please mirror the relative/full URI selection used by CustomAwsCredentialsProviderChain here.

Comment thread be/src/util/s3_util.cpp
return CredProviderType::SystemProperties;
case TCredProviderType::WEB_IDENTITY:
return CredProviderType::WebIdentity;
case TCredProviderType::CONTAINER:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposing TCredProviderType::CONTAINER makes FE able to send this mode to BE, but BE's S3ClientFactory::_create_credentials_provider container branch only supports AWS_CONTAINER_CREDENTIALS_RELATIVE_URI and ignores AWS_CONTAINER_CREDENTIALS_FULL_URI/AWS_CONTAINER_AUTHORIZATION_TOKEN. That means the new thrift value fails for ECS/Fargate deployments that use the full-URI credential endpoint. Please update the BE provider creation before enabling this enum mapping, and add a full-URI test alongside the relative-URI case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants