backport: 3425 (archival: Make upload loop more resilient) #3441

Merged (1 commit) on Jan 11, 2022

Conversation

Lazin
Contributor

@Lazin Lazin commented Jan 11, 2022

Cover letter

Handle all exceptions in the upload loop. If one of the archivers throws
an exception that is not of an expected kind, just log the message. On
the next iteration we proceed with a different set of archivers, so a
broken archiver won't cause unavailability for all partitions.

This change is meant to make transient errors less severe. For
instance, the log can get truncated while the archiver is trying to
upload it. It's safe to assume that this archiver can retry after a
while. The archiver is pushed to the back of the upload queue, so it
won't be retried right away.
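
The behaviour described above can be sketched roughly as follows. This is
an illustration in Python, not the code from this PR. The names
`run_upload_iteration`, `ExpectedUploadError` and `FakeArchiver` are
hypothetical, and the batch-per-iteration queue handling is an assumption
about how the loop selects archivers.

```python
import logging
from collections import deque

log = logging.getLogger("archival")


class ExpectedUploadError(Exception):
    """Hypothetical stand-in for the error kinds the loop already expected."""


class FakeArchiver:
    """Hypothetical archiver used only to exercise the loop below."""

    def __init__(self, name, error=None):
        self.name = name
        self.error = error
        self.uploads = 0

    def upload(self):
        if self.error is not None:
            raise self.error
        self.uploads += 1


def run_upload_iteration(queue, batch_size):
    """Run one iteration over the archivers at the front of the queue.

    Returns the names of the archivers whose upload failed.
    """
    failed = []
    for _ in range(min(batch_size, len(queue))):
        archiver = queue.popleft()
        try:
            archiver.upload()
        except ExpectedUploadError as e:
            log.warning("upload failed for %s: %s", archiver.name, e)
            failed.append(archiver.name)
        except Exception as e:
            # Catch-all: an unexpected error is logged instead of
            # escaping and stopping the loop for every partition.
            log.error("unexpected error in %s: %s", archiver.name, e)
            failed.append(archiver.name)
        finally:
            # Assumption: every archiver goes to the back of the queue,
            # so a failing one is not retried right away.
            queue.append(archiver)
    return failed


queue = deque([
    FakeArchiver("p0", RuntimeError("log truncated")),
    FakeArchiver("p1"),
    FakeArchiver("p2"),
])
first = run_upload_iteration(queue, batch_size=2)
second = run_upload_iteration(queue, batch_size=2)
```

In this sketch the failing archiver `p0` is reported in both iterations,
while `p1` and `p2` still get their uploads done, which is the property
the change is after.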

Backport: #3425

Release notes

  • none

(cherry picked from commit 3c7fcda)
@Lazin Lazin merged commit 67b09f4 into redpanda-data:v21.11.x Jan 11, 2022