Double check latest revision by starting an embedded etcd if revision check based on data dir fails during data validation #275

Merged
merged 1 commit into from
Nov 9, 2020

Conversation

ishan16696
Member

@ishan16696 ishan16696 commented Oct 23, 2020

What this PR does / why we need it:
WAL files can contain data revisions that are ahead of the revisions present in the DB file if etcd terminates abnormally and fails to flush the changes from the WAL to the Bolt DB. This leads to unnecessary data restoration.
This PR eliminates that unnecessary restoration. If the revision check based on the data dir fails, the validator starts an embedded etcd on the data dir, pings it for up to 60s to get the latest revision, and then compares that revision with the latest revision of the backup data.
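
A minimal sketch of the double check described above, assuming a poll-then-compare shape; it is not the code from this PR. `revisionGetter`, `pollLatestRevision`, `restorationNeeded` and `errNotReady` are hypothetical names, and the getter stands in for whatever call asks the embedded etcd for its current revision. The comparison direction (restore only when the data dir is behind the backup) is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// revisionGetter is a hypothetical stand-in for the call that asks the
// embedded etcd for its current revision.
type revisionGetter func() (int64, error)

// errNotReady is a hypothetical error a getter can return while the
// embedded etcd is still starting up.
var errNotReady = errors.New("embedded etcd not ready yet")

// pollLatestRevision calls get repeatedly, sleeping for interval between
// attempts, until get succeeds or the next attempt would start after the
// timeout has elapsed.
func pollLatestRevision(get revisionGetter, timeout, interval time.Duration) (int64, error) {
	deadline := time.Now().Add(timeout)
	for {
		rev, err := get()
		if err == nil {
			return rev, nil
		}
		// Give up if sleeping once more would take us past the deadline.
		if time.Now().Add(interval).After(deadline) {
			return 0, fmt.Errorf("no revision within %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

// restorationNeeded reports whether the data dir is behind the backup.
// Assumption: restoration is only required when the revision served by the
// embedded etcd is lower than the latest revision in the backup.
func restorationNeeded(etcdRevision, backupRevision int64) bool {
	return etcdRevision < backupRevision
}
```

With the 60s window mentioned above, a caller might use `pollLatestRevision(get, 60*time.Second, time.Second)` and pass the result to `restorationNeeded` together with the latest backup revision; the one-second interval is an illustrative choice.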

Which issue(s) this PR fixes:
Fixes #260

Special notes for your reviewer:

Release note:

Validator now double checks latest revision by starting an embedded etcd if DB-based revision check fails. This can potentially avoid unnecessary data restoration when etcd terminates abnormally.

@CLAassistant

CLAassistant commented Oct 23, 2020

CLA assistant check
All committers have signed the CLA.

@ishan16696 ishan16696 changed the title To avoid Unnecessary Data Restoration #260 To avoid Unnecessary Data Restoration Oct 23, 2020
Collaborator

@amshuman-kr amshuman-kr left a comment

Thanks a lot for the PR @ishan16696! Just some small suggestions.

Also, do you plan to include the test (copying old data directory and restoring it)?

BTW Could you please sign the CLA?

pkg/initializer/validator/datavalidator.go (5 review threads; outdated, resolved)
@amshuman-kr amshuman-kr changed the title To avoid Unnecessary Data Restoration Double check by starting an embedded etcd if revision check based on data dir fails during data validation Oct 23, 2020
@amshuman-kr amshuman-kr changed the title Double check by starting an embedded etcd if revision check based on data dir fails during data validation Double check latest revision by starting an embedded etcd if revision check based on data dir fails during data validation Oct 23, 2020
Collaborator

@amshuman-kr amshuman-kr left a comment

Thanks for the changes! One small change again :-)

pkg/initializer/validator/datavalidator.go (outdated, resolved)
pkg/initializer/validator/types.go (outdated, resolved)
Collaborator

@amshuman-kr amshuman-kr left a comment

One small change again 😉

pkg/initializer/validator/datavalidator.go (outdated, resolved)
Collaborator

@shreyas-s-rao shreyas-s-rao left a comment

Thanks @ishan16696 for making the changes. It looks like the change in the copyFile function is breaking one of the other test cases. Please look into this and fix it. You can run the unit tests locally before pushing changes by running make verify.

pkg/initializer/validator/types.go (outdated, resolved)
pkg/initializer/validator/datavalidator_test.go (outdated, resolved)
Collaborator

@amshuman-kr amshuman-kr left a comment

LGTM pending unit test results.

@amshuman-kr amshuman-kr added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Nov 6, 2020
…es abruptly, this can lead to unnecessary data restoration.
Collaborator

@shreyas-s-rao shreyas-s-rao left a comment

/lgtm

Labels
needs/changes Needs (more) changes needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/ok-to-test-tm Requires integration tests to be run on TestMachinery size/m Size of pull request is medium (see gardener-robot robot/bots/size.py)
Development

Successfully merging this pull request may close these issues.

[BUG] Database verification should check WALs also for latest revision
9 participants