Skip to content

[Enhancement] Support refresh AWS credentials#16553

Merged
xiangfu0 merged 2 commits intoapache:masterfrom
hongkunxu:enhance/support-s3-access-key-hotreload
Aug 11, 2025
Merged

[Enhancement] Support refresh AWS credentials#16553
xiangfu0 merged 2 commits intoapache:masterfrom
hongkunxu:enhance/support-s3-access-key-hotreload

Conversation

@hongkunxu
Copy link
Copy Markdown
Contributor

@hongkunxu hongkunxu commented Aug 8, 2025

During our production operations, all controller/server/minion suddenly experienced S3 credential failures, which caused Minion jobs and Backfill jobs to fail. The only way to resolve the issue was to restart the servers.
This PR aims to refresh the AWS credentials automatically upon S3 data retrieval failure, enabling recovery without requiring a restart.

@xiangfu0 xiangfu0 requested review from Copilot and xiangfu0 August 8, 2025 03:04
@xiangfu0 xiangfu0 added pinot-filesystem Related to PinotFS abstraction (S3, GCS, ADLS, HDFS) AWS Related to AWS-specific features or deployment security Related to security hardening labels Aug 8, 2025

This comment was marked as outdated.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Aug 8, 2025

Codecov Report

❌ Patch coverage is 38.54167% with 59 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.27%. Comparing base (642bf00) to head (13dc1cf).

Files with missing lines Patch % Lines
.../org/apache/pinot/plugin/filesystem/S3PinotFS.java 38.54% 58 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #16553      +/-   ##
============================================
+ Coverage     63.26%   63.27%   +0.01%     
- Complexity     1362     1379      +17     
============================================
  Files          3012     3012              
  Lines        174477   174521      +44     
  Branches      26724    26729       +5     
============================================
+ Hits         110381   110432      +51     
+ Misses        55658    55654       -4     
+ Partials       8438     8435       -3     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.24% <38.54%> (+<0.01%) ⬆️
java-21 63.22% <38.54%> (+0.01%) ⬆️
temurin 63.27% <38.54%> (+0.01%) ⬆️
unittests 63.27% <38.54%> (+0.01%) ⬆️
unittests1 56.34% <ø> (-0.02%) ⬇️
unittests2 33.37% <38.54%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@hongkunxu hongkunxu force-pushed the enhance/support-s3-access-key-hotreload branch 2 times, most recently from 10bb00e to 4d3ab9a Compare August 8, 2025 05:43
@hongkunxu hongkunxu marked this pull request as ready for review August 8, 2025 05:47
@hongkunxu hongkunxu force-pushed the enhance/support-s3-access-key-hotreload branch from 4d3ab9a to 808db80 Compare August 8, 2025 06:07
@xiangfu0 xiangfu0 requested a review from Copilot August 8, 2025 19:45

This comment was marked as outdated.

LOGGER.info("Copy {} to local {}", srcUri, dstFile.getAbsolutePath());
copyToLocalFileInternal(srcUri, dstFile);
} catch (Exception e) {
LOGGER.warn("Caught exception during S3 copy, attempting to refresh credentials and retry");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we also need to handle the refresh logic in other method? E.g. mkdir, list files, delete file etc.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your advice. I've handled the refresh logic in other method. Please help to review again. @xiangfu0

@xiangfu0 xiangfu0 self-requested a review August 8, 2025 22:35
@hongkunxu hongkunxu force-pushed the enhance/support-s3-access-key-hotreload branch 2 times, most recently from ddce63c to 3417651 Compare August 9, 2025 15:16
@xiangfu0 xiangfu0 requested a review from Copilot August 9, 2025 21:47
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces automatic refresh functionality for AWS credentials in the S3PinotFS component to handle production credential failures without requiring server restarts. The implementation adds credential refresh and retry logic when S3 operations fail with authentication/authorization errors.

  • Extracts S3 client initialization into a reusable method for credential refresh
  • Implements retry mechanism that refreshes credentials on 401/403 errors
  • Adds credential logging and debugging capabilities for troubleshooting

_s3Client.deleteObject(deleteObjectRequest));

deleteSucceeded &= deleteObjectResponse.sdkHttpResponse().isSuccessful();
LOGGER.error("delete response result {}", deleteSucceeded);
Copy link

Copilot AI Aug 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This debug log statement uses ERROR level inappropriately and appears to be leftover debug code. It should be removed or changed to DEBUG level with a more descriptive message.

Suggested change
LOGGER.error("delete response result {}", deleteSucceeded);
LOGGER.debug("Delete operation for object '{}' in bucket '{}' succeeded: {}", s3Object.key(), segmentUri.getHost(), deleteObjectResponse.sdkHttpResponse().isSuccessful());

Copilot uses AI. Check for mistakes.
_s3Client.deleteObject(deleteObjectRequest));

deleteSucceeded &= deleteObjectResponse.sdkHttpResponse().isSuccessful();
LOGGER.error("delete response result {}", deleteSucceeded);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this log?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed, thanks for your reminding.

Signed-off-by: Hongkun Xu <xuhongkun666@163.com>
@hongkunxu hongkunxu force-pushed the enhance/support-s3-access-key-hotreload branch from 3417651 to 13dc1cf Compare August 10, 2025 03:14
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@xiangfu0 xiangfu0 merged commit 662be93 into apache:master Aug 11, 2025
17 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AWS Related to AWS-specific features or deployment pinot-filesystem Related to PinotFS abstraction (S3, GCS, ADLS, HDFS) security Related to security hardening

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants