backup: fail instead of pausing when out of retries #96673

Merged (1 commit) on Feb 7, 2023

Conversation

Member

@dt dt commented Feb 6, 2023

This changes backups to fail with the most recent error instead of pausing when they run out of retries while hitting non-permanent errors. This is desired because a failed backup is more likely to be noticed in monitoring (the number of failed jobs increments, failed jobs show up red on the jobs page, etc.), and because a backup schedule is then able to trigger subsequent jobs instead of waiting on an incomplete paused job that will never resume or complete without human intervention.

Release note (ops change): a BACKUP which encounters too many retryable errors will now fail instead of pausing, allowing subsequent backups the chance to succeed.

Epic: CRDB-21953

@dt dt requested review from a team as code owners February 6, 2023 19:20
@blathers-crl

blathers-crl bot commented Feb 6, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@dt
Member Author

dt commented Feb 7, 2023

TFTRs!

bors r+

@craig
Contributor

craig bot commented Feb 7, 2023

Build failed (retrying...):

@joshimhoff
Collaborator

Thanks for making this change! CC @dasrirez

@craig craig bot merged commit 9444613 into cockroachdb:master Feb 7, 2023
@craig
Contributor

craig bot commented Feb 7, 2023

Build succeeded:
