New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Copy S3 file GCP fallback to buffer copy #60164
Copy S3 file GCP fallback to buffer copy #60164
Conversation
Sometimes during
And this looks right according to documentation https://cloud.google.com/storage/docs/json_api/v1/status-codes#413_Payload_Too_Large. But sometimes error can be like this, and in documentation it is not specified that
Maybe we can somehow better understand this specific case, and properly fallback to buffer copy. |
This is an automated comment for commit 25bfcdd with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page Successful checks
|
I think this is related to this issue #59660 (comment) |
Totally agree, we need to support |
src/IO/S3/copyS3File.cpp
Outdated
if (outcome.GetError().GetExceptionName() == "EntityTooLarge" || | ||
outcome.GetError().GetExceptionName() == "InvalidRequest" || | ||
outcome.GetError().GetExceptionName() == "InvalidArgument" || | ||
(outcome.GetError().GetExceptionName() == "InternalError" && outcome.GetError().GetResponseCode() == Aws::Http::HttpResponseCode::GATEWAY_TIMEOUT)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking for GATEWAY_TIMEOUT
doesn't seem completely correct because it can mean something else.
Can we check something else to identify that kind of response?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, we could search in the response for this text use the Rewrite method in the JSON API
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. I think we need to change this check a bit.
But I am not sure if check for some specific substring in error message is okay.
@antonio2368 What do you think ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I agree with @vitlibar, we would risk fallback to heavy operation instead of simply retrying the copy if the error can be avoided by a simple retry.
and both the code and exception name seem generic so we would have to search in the respones
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
6e141ee
to
25bfcdd
Compare
…b3765c03a2196b0fcb02ac97be39b Cherry pick #60164 to 23.8: Copy S3 file GCP fallback to buffer copy
…eb3765c03a2196b0fcb02ac97be39b Cherry pick #60164 to 23.11: Copy S3 file GCP fallback to buffer copy
…eb3765c03a2196b0fcb02ac97be39b Cherry pick #60164 to 23.12: Copy S3 file GCP fallback to buffer copy
…b3765c03a2196b0fcb02ac97be39b Cherry pick #60164 to 24.1: Copy S3 file GCP fallback to buffer copy
Backport #60164 to 23.12: Copy S3 file GCP fallback to buffer copy
Backport #60164 to 24.1: Copy S3 file GCP fallback to buffer copy
Backport #60164 to 23.11: Copy S3 file GCP fallback to buffer copy
Backport #60164 to 23.8: Copy S3 file GCP fallback to buffer copy
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Copy S3 file GCP fallback to buffer copy in case GCP returned
Internal Error
withGATEWAY_TIMEOUT
HTTP error code.