Increase tenacity of S3RetryingInputStream #87243
Labels
>bug
:Distributed/Snapshot/Restore (anything directly related to the `_snapshot/*` APIs)
Team:Distributed (meta label for distributed team)
The `S3RetryingInputStream` hides cases where S3 closes a connection partway through downloading a blob. By default it retries 3 times before failing the download. However, the number of failures tends to increase with blob size, and 3 retries are often not enough to complete a multi-GB download when S3 is suffering from a cluster of failures, as sometimes happens. Even in this state we typically make tens to hundreds of MBs of progress between failures.

I think we should increase the tenacity of `S3RetryingInputStream` when downloading larger blobs. For instance, we could avoid counting a partial download towards the retry limit if it makes significant progress before failing.
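
To make the proposal concrete, here is a minimal sketch of retry accounting that ignores failures which followed significant progress. It is not the existing `S3RetryingInputStream` code: the class name `ProgressAwareRetryBudget`, the `significantProgressBytes` threshold, and the method names are all hypothetical, and the right threshold value is an open question.

```java
/**
 * Hypothetical sketch, not the Elasticsearch implementation.
 * Tracks failed download attempts, but only counts a failure against the
 * retry limit if the attempt read fewer than a threshold number of bytes.
 */
final class ProgressAwareRetryBudget {
    private final int maxRetries;                 // e.g. 3, the default mentioned above
    private final long significantProgressBytes;  // assumed threshold, to be tuned
    private int failuresCounted = 0;

    ProgressAwareRetryBudget(int maxRetries, long significantProgressBytes) {
        if (maxRetries < 0 || significantProgressBytes <= 0) {
            throw new IllegalArgumentException("invalid retry budget settings");
        }
        this.maxRetries = maxRetries;
        this.significantProgressBytes = significantProgressBytes;
    }

    /**
     * Records a failed attempt.
     *
     * @param bytesReadThisAttempt bytes read since the last (re)open of the stream
     * @return true if the caller may reopen the stream and retry
     */
    boolean onFailure(long bytesReadThisAttempt) {
        // Attempts that made significant progress do not consume the budget.
        if (bytesReadThisAttempt < significantProgressBytes) {
            failuresCounted++;
        }
        return failuresCounted <= maxRetries;
    }

    int failuresCounted() {
        return failuresCounted;
    }
}
```

With this accounting, a multi-GB download that keeps advancing by tens of MBs between failures could retry indefinitely, while a download that repeatedly fails without progress would still give up after the configured limit. Whether an unbounded number of "productive" retries is acceptable, or whether a second, larger cap is needed, is a design choice left open here.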