feature: Implement backup verification command #613

Merged 1 commit on May 30, 2023

aiven-anton (Contributor) commented May 11, 2023

About this change - What it does

Adds a CLI command to allow verifying checksum integrity of data files in a V3 backup. The command exposes a --level argument to allow two different granularity levels of verification.

Running with --level=file will only inspect the bytes of data files, making sure they match the expected checksums found in the metadata. No parsing of records will happen when running in this mode.
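
To illustrate what a file-level check amounts to, here is a minimal sketch and not the code from this PR. The function name `file_matches_checksum` is hypothetical, and `hashlib.sha256` is only a stand-in, since the description does not name the checksum algorithm the V3 format uses.

```python
import hashlib
from pathlib import Path


def file_matches_checksum(path: Path, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Hash the raw bytes of a data file and compare with the expected digest.

    No records are parsed here: the file is treated as an opaque byte stream.
    """
    digest = hashlib.sha256()  # assumption: stand-in for the real algorithm
    with path.open("rb") as stream:
        # Read fixed-size chunks so large data files are never fully in memory.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex
```

Because only bytes are hashed, this kind of check is cheap, but it cannot detect a file that was already malformed when its checksum was recorded.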

Running with --level=record is closer to a dry-run of restoration, and will parse each record and run all the integrity checks we do at restore-time: verifying checksum checkpoints, metadata checksum, as well as record count checks.
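
The record-level checks can be pictured roughly as follows. This is a sketch under stated assumptions, not the actual implementation: the `Record` type, `VerificationError`, and `verify_records` are hypothetical names, the real backup format is not modelled, and `hashlib.sha256` again stands in for the unspecified checksum algorithm.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical parsed record; the real V3 record layout is not modelled."""
    payload: bytes
    # Running digest expected after this record, if it carries a checkpoint.
    checkpoint: Optional[bytes] = None


class VerificationError(Exception):
    """Raised when any integrity check fails."""


def verify_records(records: Iterable[Record], expected_count: int, expected_checksum: bytes) -> None:
    running = hashlib.sha256()  # assumption: stand-in for the real algorithm
    count = 0
    for count, record in enumerate(records, start=1):
        running.update(record.payload)
        # Checkpoint check: the running digest must match at marked records.
        if record.checkpoint is not None and running.digest() != record.checkpoint:
            raise VerificationError(f"checkpoint mismatch at record {count}")
    # Record count check against the value stored in metadata.
    if count != expected_count:
        raise VerificationError(f"expected {expected_count} records, found {count}")
    # Final checksum check against the value stored in metadata.
    if running.digest() != expected_checksum:
        raise VerificationError("final checksum mismatch")
```

Checkpoints let a mismatch be reported close to where it occurs, instead of only at the end of the file.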

Having these separate levels of verification allows running the deeper checks right after producing a backup, and building a process around that to discard a backup should it fail verification. This in turn gives greater confidence in running only the shallow file-level check before initiating a restoration.

The suggested operations are:

At backup-time

Run backup + deep check.

$ karapace_schema_backup get [...]
$ karapace_schema_backup verify --level=record [...]  # process should be set up to discard the backup if this fails
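
One way to build the "discard on failure" process around these two commands is sketched below. The helper `backup_and_verify` is hypothetical, the caller supplies the full command lines (the elided `[...]` arguments are not filled in here), and the sketch assumes a failed verification is signalled by a non-zero exit status.

```python
import subprocess
from pathlib import Path
from typing import Iterable, Sequence


def backup_and_verify(
    backup_cmd: Sequence[str],
    verify_cmd: Sequence[str],
    artifacts: Iterable[Path],
) -> bool:
    """Run the backup, then the deep check; delete the artifacts if the check fails."""
    # A failing backup command raises CalledProcessError and stops here.
    subprocess.run(list(backup_cmd), check=True)
    # Assumption: the verify command exits non-zero when verification fails.
    if subprocess.run(list(verify_cmd)).returncode != 0:
        for path in artifacts:
            path.unlink(missing_ok=True)  # discard the unverified backup
        return False
    return True
```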

At restoration-time

Run checksum check, then initiate restoration.

$ karapace_schema_backup verify --level=file [...]  # refuse to restore if this fails.
$ karapace_schema_backup restore [...]
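
The restoration-time gate could be wrapped in a similar way. This is a sketch only: `restore_if_verified` is a hypothetical helper, the command lines are passed in by the caller, and it assumes the verify command reports failure through a non-zero exit status.

```python
import subprocess
from typing import Sequence


def restore_if_verified(verify_cmd: Sequence[str], restore_cmd: Sequence[str]) -> bool:
    """Run the shallow file-level check and restore only if it passes."""
    # Assumption: the verify command exits non-zero when verification fails.
    if subprocess.run(list(verify_cmd)).returncode != 0:
        return False  # refuse to restore from a backup that fails the check
    # A failing restore raises CalledProcessError instead of passing silently.
    subprocess.run(list(restore_cmd), check=True)
    return True
```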

@aiven-anton aiven-anton force-pushed the aiven-anton/feature/backup-v3-avro-verify branch from 7ffa3d2 to 7e9fb33 Compare May 11, 2023 16:52
cloudflare-pages bot commented May 11, 2023

Deploying with Cloudflare Pages

Latest commit: 9a9e5e1
Status: ✅  Deploy successful!
Preview URL: https://c39df111.karapace.pages.dev
Branch Preview URL: https://aiven-anton-feature-backup-v-ysvz.karapace.pages.dev

karapace/backup/api.py (outdated review thread, resolved)
@aiven-anton aiven-anton force-pushed the aiven-anton/feature/backup-v3-avro-verify branch from 7e9fb33 to fd3061d Compare May 11, 2023 17:18
@aiven-anton

This comment was marked as resolved.

@aiven-anton aiven-anton force-pushed the aiven-anton/feature/backup-v3-avro branch from bf3dee0 to 130b9ca Compare May 15, 2023 14:27
@aiven-anton aiven-anton force-pushed the aiven-anton/feature/backup-v3-avro-verify branch from fd3061d to b96b5f9 Compare May 15, 2023 16:00
@aiven-anton aiven-anton changed the base branch from aiven-anton/feature/backup-v3-avro to aiven-anton/feature/backup-v3-inspect-command May 15, 2023 16:03
@aiven-anton aiven-anton force-pushed the aiven-anton/feature/backup-v3-inspect-command branch 2 times, most recently from 35dd973 to 071226f Compare May 26, 2023 13:54
@aiven-anton aiven-anton force-pushed the aiven-anton/feature/backup-v3-avro-verify branch 2 times, most recently from b2683a8 to 3003113 Compare May 26, 2023 15:31
Base automatically changed from aiven-anton/feature/backup-v3-inspect-command to main May 29, 2023 08:41
@aiven-anton aiven-anton force-pushed the aiven-anton/feature/backup-v3-avro-verify branch 5 times, most recently from 0d27c47 to 71f01fa Compare May 29, 2023 10:50
@aiven-anton aiven-anton marked this pull request as ready for review May 29, 2023 11:01
@aiven-anton aiven-anton requested review from a team as code owners May 29, 2023 11:01
@aiven-anton aiven-anton force-pushed the aiven-anton/feature/backup-v3-avro-verify branch 2 times, most recently from 8c39d4f to 92f70bd Compare May 29, 2023 13:03
@aiven-anton aiven-anton force-pushed the aiven-anton/feature/backup-v3-avro-verify branch from 92f70bd to e28ae4b Compare May 30, 2023 10:30
jjaakola-aiven previously approved these changes May 30, 2023
@jjaakola-aiven jjaakola-aiven merged commit 801e45f into main May 30, 2023
10 checks passed
@jjaakola-aiven jjaakola-aiven deleted the aiven-anton/feature/backup-v3-avro-verify branch May 30, 2023 12:43