fix: cache CLI AWS credentials until expiry #23325
Open
ametel01 wants to merge 4 commits into
Which issue does this PR close?
Rationale for this change
The CLI's S3 credential provider wrapped AWS SDK credentials but called `SharedCredentialsProvider::provide_credentials()` for every object-store credential request. That bypasses the AWS SDK request pipeline identity cache and can repeatedly invoke expensive providers such as `credential_process` or SSO-backed providers even when the returned credentials have not expired.
What changes are included in this PR?
- `object_store::aws::AwsCredential` inside `S3CredentialProvider`.
- `credential_process` flow that fails on `main` without this fix.
Are these changes tested?
Yes:
- `cargo test -p datafusion-cli s3_object_store_reuses_fetched_credentials_until_expiry -- --nocapture`
- `cargo test -p datafusion-cli object_storage`
- `cargo test -p datafusion-cli`
- `cargo clippy --all-targets --all-features -- -D warnings`
I also verified the new public-flow regression test against detached `main`; it fails there because the credential process runs multiple times instead of once.
Are there any user-facing changes?
No API or configuration changes. The CLI should make fewer redundant AWS credential-provider calls for S3 object stores while still refreshing expiring credentials.
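The cache-until-expiry behaviour described above, as an added example that is not code from this PR or from DataFusion. `Credential`, `CachingProvider`, `demo_fetch`, the synchronous closure and the 60-second refresh margin are all assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::time::{Duration, SystemTime};

// Hypothetical stand-in for an object-store credential with an optional expiry.
#[derive(Clone, Debug, PartialEq)]
pub struct Credential {
    pub key_id: String,
    pub expires_at: Option<SystemTime>,
}

// Wraps an expensive fetch (e.g. credential_process or SSO) and reuses its result.
pub struct CachingProvider<F: Fn() -> Credential> {
    fetch: F,
    margin: Duration,
    cached: Mutex<Option<Credential>>,
}

impl<F: Fn() -> Credential> CachingProvider<F> {
    pub fn new(fetch: F, margin: Duration) -> Self {
        Self { fetch, margin, cached: Mutex::new(None) }
    }

    // `now` is injected so expiry handling can be exercised deterministically.
    pub fn get(&self, now: SystemTime) -> Credential {
        // Lock is held across the fetch: concurrent callers share one refresh.
        let mut slot = self.cached.lock().unwrap();
        if let Some(c) = slot.as_ref() {
            // No expiry means static credentials; otherwise refresh `margin` early.
            if c.expires_at.map_or(true, |exp| now + self.margin < exp) {
                return c.clone();
            }
        }
        let fresh = (self.fetch)();
        *slot = Some(fresh.clone());
        fresh
    }
}

// Constructed demo provider: counts invocations; credential n expires at t0 + n hours.
pub fn demo_fetch(calls: &AtomicUsize, t0: SystemTime) -> Credential {
    let n = calls.fetch_add(1, Ordering::SeqCst) as u64 + 1;
    Credential { key_id: format!("KEY{}", n), expires_at: Some(t0 + Duration::from_secs(3600 * n)) }
}

fn main() {
    let (calls, t0) = (AtomicUsize::new(0), SystemTime::UNIX_EPOCH);
    let p = CachingProvider::new(|| demo_fetch(&calls, t0), Duration::from_secs(60));
    for s in [0, 10, 3590, 3600] { println!("{}s -> {}", s, p.get(t0 + Duration::from_secs(s)).key_id); }
    println!("provider invoked {} times", calls.load(Ordering::SeqCst));
}
```

Calling `fetch` directly on every request, as the description says the old code did, would invoke the provider once per lookup instead of once per credential lifetime.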