Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple the readinessProbe of etcd and Snapshotter #411

Merged
merged 5 commits into from
Feb 16, 2022

Conversation

ishan16696
Copy link
Member

@ishan16696 ishan16696 commented Dec 27, 2021

What this PR does / why we need it:
Currently, snapshotter fails the readiness probe of etcd if it fails to take/upload the backup, thus cutting off incoming traffic to etcd until snapshotter becomes healthy again.
This PR decouples the readinessProbe of etcd and snapshotter health of backup-restore.So, Backup-restore's snapshotter failure won't fails the readinessProbe of etcd and hence etcd will allow to keep serving the incoming traffic.
This PR also introduced a Exponential-Backoff mechanism in Snapshotter(if Snapshotter fails.)

Which issue(s) this PR fixes:
Fixes #325

Special notes for your reviewer:

Release note:

Failed backups won't fail the readinessProbe thus etcd is allowed to keep serving the incoming traffic.
Added CLI flags `--backoff-multiplier`, `--backoff-attempt-limit` and `--backoff-threshold-time` to configure the exponential-backoff mechanism.

@ishan16696 ishan16696 requested a review from a team as a code owner December 27, 2021 07:53
@gardener-robot gardener-robot added needs/review Needs review size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) labels Dec 27, 2021
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Dec 27, 2021
@gardener-robot-ci-3 gardener-robot-ci-3 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Dec 27, 2021
@ishan16696 ishan16696 added the reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies label Dec 27, 2021
@gardener-robot gardener-robot added size/s Size of pull request is small (see gardener-robot robot/bots/size.py) and removed size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) labels Jan 3, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 3, 2022
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 3, 2022
@gardener-robot gardener-robot added size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) and removed size/s Size of pull request is small (see gardener-robot robot/bots/size.py) labels Jan 3, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 3, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 3, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 3, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 3, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 3, 2022
@ishan16696 ishan16696 removed the reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies label Jan 4, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 4, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 4, 2022
@ishan16696
Copy link
Member Author

Hi @shreyas-s-rao,
I have added a new pkg for backoff mechanism and also expose the backoff config as a configurable flags.
I have also updated the release note and addressed all of your comments. PTAL !!

@gardener-robot gardener-robot added size/l Size of pull request is large (see gardener-robot robot/bots/size.py) needs/second-opinion Needs second review by someone else and removed size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) labels Jan 27, 2022
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 27, 2022
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 27, 2022
Copy link
Collaborator

@shreyas-s-rao shreyas-s-rao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ishan16696 Thanks for making the changes! Except for one small nitpick, LGTM 👍🏼

pkg/backoff/exponentialbackoff_test.go Outdated Show resolved Hide resolved
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 15, 2022
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 15, 2022
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 15, 2022
@gardener-robot-ci-3 gardener-robot-ci-3 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Feb 15, 2022
@ishan16696 ishan16696 added the merge/squash Should be merged via 'Squash and merge' label Feb 15, 2022
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Feb 15, 2022
Copy link
Contributor

@abdasgupta abdasgupta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Looks good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge/squash Should be merged via 'Squash and merge' needs/changes Needs (more) changes needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/review Needs review needs/second-opinion Needs second review by someone else size/l Size of pull request is large (see gardener-robot robot/bots/size.py)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Snapshotter should not proactively cut off traffic to etcd on backup failures
7 participants