[Enhancement] Support refresh AWS credentials #16553
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #16553 +/- ##
============================================
+ Coverage 63.26% 63.27% +0.01%
- Complexity 1362 1379 +17
============================================
Files 3012 3012
Lines 174477 174521 +44
Branches 26724 26729 +5
============================================
+ Hits 110381 110432 +51
+ Misses 55658 55654 -4
+ Partials 8438 8435 -3
10bb00e to 4d3ab9a
4d3ab9a to 808db80
  LOGGER.info("Copy {} to local {}", srcUri, dstFile.getAbsolutePath());
  copyToLocalFileInternal(srcUri, dstFile);
} catch (Exception e) {
  LOGGER.warn("Caught exception during S3 copy, attempting to refresh credentials and retry");
I think we also need to handle the refresh logic in the other methods, e.g. mkdir, list files, delete file, etc.
Thanks for the advice. I've added the refresh logic to the other methods. Please review again. @xiangfu0
ddce63c to 3417651
Pull Request Overview
This PR introduces automatic refresh functionality for AWS credentials in the S3PinotFS component to handle production credential failures without requiring server restarts. The implementation adds credential refresh and retry logic when S3 operations fail with authentication/authorization errors.
- Extracts S3 client initialization into a reusable method for credential refresh
- Implements retry mechanism that refreshes credentials on 401/403 errors
- Adds credential logging and debugging capabilities for troubleshooting
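The refresh-and-retry behavior listed above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the PR's actual code: `StorageException`, `CredentialRefreshRetry`, `isAuthFailure`, and `callWithRefresh` are hypothetical names introduced here. In the AWS SDK for Java v2, service exceptions expose the HTTP status through `statusCode()`, so a real implementation would presumably inspect that instead of the stand-in exception type.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a storage service error that carries an HTTP status code. */
class StorageException extends RuntimeException {
    private final int statusCode;

    StorageException(int statusCode, String message) {
        super(message);
        this.statusCode = statusCode;
    }

    int statusCode() {
        return statusCode;
    }
}

/** Hypothetical helper: retry an operation once after refreshing credentials on 401/403. */
final class CredentialRefreshRetry {
    private CredentialRefreshRetry() {
    }

    /** Returns true only for authentication/authorization failures (HTTP 401 or 403). */
    static boolean isAuthFailure(Throwable t) {
        if (!(t instanceof StorageException)) {
            return false;
        }
        int status = ((StorageException) t).statusCode();
        return status == 401 || status == 403;
    }

    /**
     * Runs the operation. If it fails with an auth error, invokes refreshClient
     * (assumed to rebuild the client with fresh credentials) and retries exactly once.
     * Any other failure, or a failure on the retry, propagates to the caller.
     */
    static <T> T callWithRefresh(Supplier<T> operation, Runnable refreshClient) {
        try {
            return operation.get();
        } catch (RuntimeException e) {
            if (!isAuthFailure(e)) {
                throw e;
            }
            refreshClient.run();
            return operation.get();
        }
    }
}
```

Under these assumptions, each file-system operation (copy, list, delete, mkdir) could be routed through one helper, for example `callWithRefresh(() -> listFilesInternal(uri), this::initClient)`, where both method names are placeholders. Limiting the retry to a single attempt avoids looping when the refreshed credentials are also rejected.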
...s/pinot-file-system/pinot-s3/src/main/java/org/apache/pinot/plugin/filesystem/S3PinotFS.java
      _s3Client.deleteObject(deleteObjectRequest));

  deleteSucceeded &= deleteObjectResponse.sdkHttpResponse().isSuccessful();
  LOGGER.error("delete response result {}", deleteSucceeded);
This debug log statement uses ERROR level inappropriately and appears to be leftover debug code. It should be removed or changed to DEBUG level with a more descriptive message.
- LOGGER.error("delete response result {}", deleteSucceeded);
+ LOGGER.debug("Delete operation for object '{}' in bucket '{}' succeeded: {}", s3Object.key(), segmentUri.getHost(), deleteObjectResponse.sdkHttpResponse().isSuccessful());
Removed, thanks for the reminder.
Signed-off-by: Hongkun Xu <xuhongkun666@163.com>
3417651 to 13dc1cf
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
During production operations, all controllers, servers, and minions suddenly experienced S3 credential failures, which caused Minion jobs and Backfill jobs to fail. The only way to resolve the issue was to restart the servers.
This PR aims to refresh the AWS credentials automatically upon S3 data retrieval failure, enabling recovery without requiring a restart.