backport: 3425 (archival: Make upload loop more resilient) #3441
Cover letter
Handle all exceptions in the upload loop. If one of the archivers throws
an exception that is not of the expected kind, just log the message. On
the next iteration we proceed with a different set of archivers, so a
broken archiver won't cause unavailability for all partitions.
This change is meant to make transient errors less severe. For
instance, the log can get truncated while the archiver is trying to
upload it. It's safe to assume that this archiver can retry after a
while. The archiver is pushed to the back of the upload queue, so it
won't be retried right away.
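
The behavior described above can be illustrated with a minimal Python sketch. It is not the code from this change: `FlakyArchiver`, `ExpectedShutdown`, `run_upload_loop`, and the `batch_size` parameter are hypothetical names chosen for illustration, and the real upload loop may select archivers differently. The sketch only shows the idea: an unexpected exception is logged, the archiver goes to the back of the queue, and the other archivers keep uploading.

```python
import logging
from collections import deque

logger = logging.getLogger("archival_sketch")


class ExpectedShutdown(Exception):
    """Hypothetical stand-in for the 'expected' exception kind that stops the loop."""


class FlakyArchiver:
    """Hypothetical archiver that fails a fixed number of times before succeeding."""

    def __init__(self, name, failures=0):
        self.name = name
        self.failures = failures

    def upload_next(self):
        if self.failures > 0:
            self.failures -= 1
            # Models a transient error, e.g. the log was truncated mid-upload.
            raise RuntimeError(f"{self.name}: log truncated during upload")
        return self.name


def run_upload_loop(archivers, iterations, batch_size=2):
    """Run a fixed number of iterations over a queue of archivers.

    Each iteration takes up to `batch_size` archivers from the front of the
    queue. Every archiver is pushed to the back after its attempt, so a
    failing one is not retried right away.
    """
    queue = deque(archivers)
    uploaded = []
    for _ in range(iterations):
        batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        for archiver in batch:
            try:
                uploaded.append(archiver.upload_next())
            except ExpectedShutdown:
                # The expected kind of exception still terminates the loop.
                raise
            except Exception as exc:
                # Anything else is only logged; the loop keeps running.
                logger.warning("upload failed, will retry later: %s", exc)
            queue.append(archiver)
    return uploaded
```

In this sketch, an archiver that fails once does not block the others in its batch, and it gets another attempt only after the archivers ahead of it in the queue have had their turn.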
Backport: #3425
Release notes