Retry on interrupted connections when reading from S3 in native FS library #22895

Merged
merged 2 commits into from
Aug 1, 2024

nineinchnick
Member

Description

Read the response inside a transformer lambda and throw a retryable
exception on IO errors to use the already configured SDK retry mechanism.
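
A minimal, self-contained sketch of this pattern follows. It is not the code from this PR: `RetryableReadException`, `execute`, `readFully`, and `interrupted` are hypothetical stand-ins for the SDK's retryable exception type, its retry loop, the transformer lambda, and a broken connection, so that the example runs on the JDK alone. In the AWS SDK for Java v2 the corresponding pieces are, as far as I know, `ResponseTransformer` and `RetryableException`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

class RetryOnReadSketch {
    // Hypothetical stand-in for the SDK's retryable exception type.
    static class RetryableReadException extends RuntimeException {
        RetryableReadException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Plays the role of the transformer lambda: the body is consumed here, so an
    // interrupted connection surfaces inside the retried call, not after it.
    static byte[] readFully(InputStream body) {
        try (InputStream in = body) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new RetryableReadException("Error reading response body", e);
        }
    }

    // Hypothetical stand-in for the SDK retry loop: only the retryable type is retried.
    static <T> T execute(Supplier<InputStream> request, Function<InputStream, T> transformer, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RetryableReadException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return transformer.apply(request.get());
            } catch (RetryableReadException e) {
                last = e;
            }
        }
        throw last;
    }

    // A response body whose connection breaks before any byte arrives.
    static InputStream interrupted() {
        return new InputStream() {
            @Override
            public int read() throws IOException {
                throw new IOException("connection reset");
            }
        };
    }

    public static void main(String[] args) {
        AtomicInteger requests = new AtomicInteger();
        // The first two requests break mid-read; the third returns the payload.
        byte[] data = execute(
                () -> requests.getAndIncrement() < 2
                        ? interrupted()
                        : new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8)),
                RetryOnReadSketch::readFully,
                3);
        System.out.println(new String(data, StandardCharsets.UTF_8) + " after " + requests.get() + " requests");
    }
}
```

The point of the design is where the read happens: if the stream were handed back to the caller and read later, an `IOException` would occur outside the retry loop and could not be retried by it.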

Additional context and related issues

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@nineinchnick
Member Author

The solution is based on aws/aws-sdk-java#856 (comment)

Contributor

@SemionPar SemionPar left a comment

LGTM!

Member

@electrum electrum left a comment

Testing this kind of stuff is hard. I like the usage of Toxiproxy. It's very cool that we can wrap this around an arbitrary service. The test verifies that retries are happening, which is good enough, so the below is just a thought/suggestion.

It would be nice to test that reads succeed after the retry. Given that S3 is simple and the request/response pattern is also simple, using MockWebServer might be a good alternative. You could enqueue multiple failure responses for the failure case, and failure+success responses for the success-with-retry case.
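
The two cases suggested here can be sketched without any HTTP library. This is only an illustration under stated assumptions: `ScriptedServer`, `enqueueFailure`, `enqueueSuccess`, and `readWithRetries` are hypothetical names, the "server" is an in-memory queue, and the retry loop is a stand-in for the SDK's. With MockWebServer, the enqueue calls would presumably map to enqueued mock responses.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

class ScriptedResponsesSketch {
    // Hypothetical scripted server: each request consumes the next enqueued outcome.
    static class ScriptedServer {
        private final Deque<Optional<String>> scripted = new ArrayDeque<>();
        int requests;

        ScriptedServer enqueueSuccess(String body) {
            scripted.add(Optional.of(body));
            return this;
        }

        // An empty entry stands for a connection that breaks mid-response.
        ScriptedServer enqueueFailure() {
            scripted.add(Optional.empty());
            return this;
        }

        String handle() throws IOException {
            requests++;
            Optional<String> next = scripted.poll();
            if (next == null) {
                throw new IllegalStateException("no scripted response left");
            }
            if (!next.isPresent()) {
                throw new IOException("connection reset");
            }
            return next.get();
        }
    }

    // Stand-in for the client under test: retries IO errors up to maxAttempts times.
    static String readWithRetries(ScriptedServer server, int maxAttempts) {
        IOException last = new IOException("no attempts were made");
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return server.handle();
            } catch (IOException e) {
                last = e;
            }
        }
        throw new UncheckedIOException(last);
    }

    public static void main(String[] args) {
        // Success-with-retry case: one failure, then a good response.
        ScriptedServer flaky = new ScriptedServer().enqueueFailure().enqueueSuccess("data");
        System.out.println(readWithRetries(flaky, 3) + " in " + flaky.requests + " requests");

        // Failure case: every attempt fails, so the error surfaces after all retries.
        ScriptedServer down = new ScriptedServer().enqueueFailure().enqueueFailure().enqueueFailure();
        try {
            readWithRetries(down, 3);
        } catch (UncheckedIOException expected) {
            System.out.println("failed after " + down.requests + " requests");
        }
    }
}
```

Asserting on the request count as well as the returned body checks both that retries happened and that the read succeeded afterwards.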

Contributor

@mayankvadariya mayankvadariya left a comment

🎉

Read the response inside a transformer lambda and throw a retryable
exception on IO errors to use the already configured SDK retry
mechanism.
@nineinchnick
Member Author

@electrum I first tried doing this with the MockSyncHttpClient from the AWS SDK testing services library, but without constructing a complete/proper request, the SDK was doing retries automatically before calling the response transformer. I'd much rather implement a custom toxic that'd only drop the first N connections. WDYT?

Maybe that can be a follow-up. I applied all comments, and this should be ready to go.

@wendigo wendigo merged commit eb36dfc into trinodb:master Aug 1, 2024
124 checks passed
@github-actions github-actions bot added this to the 454 milestone Aug 1, 2024
@nineinchnick nineinchnick deleted the native-s3-retries branch August 6, 2024 12:18

5 participants