Snapshot uploaded chunk lengths #500

Open
wants to merge 6 commits into
base: master
from

adapt0 commented Sep 22, 2018

Introduces 'upload-lengths' to snapshots: an array containing the length of each chunk after compression and encryption. These upload lengths can then be used when checking snapshot integrity, verifying not only that the chunk names are present but also that the chunks are of the expected size.

Additionally adds a '-check' flag to backup. This checks the lengths of unchanged remote chunks to ensure they match the expected lengths. Remote chunks of an unexpected size are re-uploaded (self-healing). The check is currently behind a flag because it first needs to pull down the chunk file list from the remote. I'm not sure how lengthy or costly that operation could be, so I made it optional.

Remote length checks are skipped for older, pre-existing snapshots that lack 'upload-lengths'. This means the first snapshot to introduce upload-lengths needs to recompute the uploaded chunk lengths. Those chunks do not need to be re-uploaded, however, unless the remote chunk lengths differ from the locally computed ones.
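The re-upload rule described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: `needsUpload` and its parameters are hypothetical names, and `haveExpected` stands for "an expected length is available for this chunk".

```go
package main

import "fmt"

// needsUpload is a hypothetical decision helper, not part of duplicacy.
// exists and remoteSize describe the chunk as seen in the remote listing;
// expected is the known upload length; haveExpected is false when no
// length is available (for example, an older snapshot without
// 'upload-lengths').
func needsUpload(exists bool, remoteSize, expected int64, haveExpected bool) bool {
	if !exists {
		return true // nothing on the remote yet
	}
	if !haveExpected {
		return false // no length to compare against, so assume a duplicate
	}
	return remoteSize != expected // re-upload only on a size mismatch
}

func main() {
	fmt.Println("missing chunk:   ", needsUpload(false, 0, 2858824, true))
	fmt.Println("no length known: ", needsUpload(true, 0, 0, false))
	fmt.Println("size mismatch:   ", needsUpload(true, 8, 2858824, true))
	fmt.Println("size matches:    ", needsUpload(true, 2858824, 2858824, true))
}
```

The second case is why a damaged chunk can go unnoticed when no length is available to compare against.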

CLAassistant commented Sep 22, 2018

CLA assistant check
All committers have signed the CLA.

adapt0 commented Sep 22, 2018

The primary motivation is to help address an issue where my sftp backup repository contains zero-sized chunks. I suspect this happened during a kernel panic while a backup was running. I discovered the issue while trying to copy chunks to cloud storage.

duplicacy 2.1.1 (e8b892):

duplicacy init test $HOME/tmp/d
/Users/adapt/go/src/github.com/adapt0/duplicacy/duplicacy will be backed up to /Users/adapt/tmp/d with id test

duplicacy backup
Storage set to /Users/adapt/tmp/d
No previous backup found
Indexing /Users/adapt/go/src/github.com/adapt0/duplicacy/duplicacy
Packed duplicacy (27272392)
Packed duplicacy_main.go (56974)
Backup for /Users/adapt/go/src/github.com/adapt0/duplicacy/duplicacy at revision 1 completed

duplicacy list -chunks
Storage set to /Users/adapt/tmp/d
Snapshot test revision 1 created at 2018-09-22 12:17 -hash
chunk: f0287babe8efa1a933f3993ae1493d98fde0a40cf9a4bc839f59c0d9aec491d6
chunk: a79e52a1677bb504d0c940ac8ca7f69f74f023a3f837847b6aa6295d647b356b
chunk: 9f6174e556f855ade49457a8a9a88f6defd34448215c8f9c0fe68d2ab2179430
chunk: fb75b742e850389e041031dba71e5c23a594626f09fd0fb2359af171ead35576
chunk: 0e47da6553dd94507b9096983f2c2f6f303f317f5f5ac41288fc5bc52b3dae01
chunk: 819e494c12c50e6261c0d95d4a547565d8a81de0e299e86761daeb8f11f0ec7b
chunk: 4a59cdca99f95d452127e684623be6b8a25f6285ac13e9d02460a281fb182cd3
chunk: 533780d8437889594b48857a41f71d15cdfe9f2aa4838f2d193c8576bec20054
chunk: 2292613b97db6a34a3b5a4058325c8b344f0d3396380b4056d0e2cb19e20d483
chunk: 2256883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164

Remove a chunk:

rm ~/tmp/d/chunks/22/56883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164

duplicacy check
Storage set to /Users/adapt/tmp/d
Listing all chunks
Chunk 2256883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164 referenced by snapshot test at revision 1 does not exist
Some chunks referenced by snapshot test at revision 1 are missing
Some chunks referenced by some snapshots do not exist in the storage

Replace with a zero-sized chunk:

touch ~/tmp/d/chunks/22/56883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164

duplicacy check
Storage set to /Users/adapt/tmp/d
Listing all chunks
All chunks referenced by snapshot test at revision 1 exist

duplicacy backup
Storage set to /Users/adapt/tmp/d
Last backup at revision 1 found
Indexing /Users/adapt/go/src/github.com/adapt0/duplicacy/duplicacy
Backup for /Users/adapt/go/src/github.com/adapt0/duplicacy/duplicacy at revision 2 completed

duplicacy check
Storage set to /Users/adapt/tmp/d
Listing all chunks
All chunks referenced by snapshot test at revision 1 exist
All chunks referenced by snapshot test at revision 2 exist

duplicacy check -files
Storage set to /Users/adapt/tmp/d
Listing all chunks
Failed to decrypt the chunk 2256883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164: unexpected EOF; retrying
Failed to decrypt the chunk 2256883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164: unexpected EOF; retrying
Failed to decrypt the chunk 2256883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164: unexpected EOF; retrying
Failed to decrypt the chunk 2256883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164: unexpected EOF

Now with these changes:

./duplicacy check
Storage set to /Users/adapt/tmp/d
Listing all chunks
All chunks referenced by snapshot test at revision 1 exist
All chunks referenced by snapshot test at revision 2 exist

./duplicacy backup
Storage set to /Users/adapt/tmp/d
Last backup at revision 2 found
Indexing /Users/adapt/go/src/github.com/adapt0/duplicacy/duplicacy
Packed duplicacy (27272392)
Packed duplicacy_main.go (56974)
Backup for /Users/adapt/go/src/github.com/adapt0/duplicacy/duplicacy at revision 3 completed

./duplicacy check
Storage set to /Users/adapt/tmp/d
Listing all chunks
All chunks referenced by snapshot test at revision 1 exist
All chunks referenced by snapshot test at revision 2 exist
All chunks referenced by snapshot test at revision 3 exist

./duplicacy check -files
Storage set to /Users/adapt/tmp/d
Listing all chunks
All files in snapshot test at revision 1 have been successfully verified
All files in snapshot test at revision 2 have been successfully verified
All files in snapshot test at revision 3 have been successfully verified

Silently self-healed!

So let's try changing the length again:

./duplicacy list -chunks
Storage set to /Users/adapt/tmp/d
...
Snapshot test revision 3 created at 2018-09-22 12:19
chunk: f0287babe8efa1a933f3993ae1493d98fde0a40cf9a4bc839f59c0d9aec491d6
chunk: a79e52a1677bb504d0c940ac8ca7f69f74f023a3f837847b6aa6295d647b356b
chunk: 9f6174e556f855ade49457a8a9a88f6defd34448215c8f9c0fe68d2ab2179430
chunk: fb75b742e850389e041031dba71e5c23a594626f09fd0fb2359af171ead35576 (1730839 bytes)
chunk: 0e47da6553dd94507b9096983f2c2f6f303f317f5f5ac41288fc5bc52b3dae01 (2223001 bytes)
chunk: 819e494c12c50e6261c0d95d4a547565d8a81de0e299e86761daeb8f11f0ec7b (1160200 bytes)
chunk: 4a59cdca99f95d452127e684623be6b8a25f6285ac13e9d02460a281fb182cd3 (3105890 bytes)
chunk: 533780d8437889594b48857a41f71d15cdfe9f2aa4838f2d193c8576bec20054 (537842 bytes)
chunk: 2292613b97db6a34a3b5a4058325c8b344f0d3396380b4056d0e2cb19e20d483 (3149984 bytes)
chunk: 2256883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164 (2858824 bytes)

echo corrupt > ~/tmp/d/chunks/22/56883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164

./duplicacy check
Storage set to /Users/adapt/tmp/d
Listing all chunks
All chunks referenced by snapshot test at revision 1 exist
All chunks referenced by snapshot test at revision 2 exist
Chunk 2256883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164 referenced by snapshot test at revision 3 expected size 2858824 but actual is 8
Some chunks referenced by snapshot test at revision 3 are missing and/or invalid

./duplicacy backup
Storage set to /Users/adapt/tmp/d
Last backup at revision 3 found
Indexing /Users/adapt/go/src/github.com/adapt0/duplicacy/duplicacy
Backup for /Users/adapt/go/src/github.com/adapt0/duplicacy/duplicacy at revision 4 completed

./duplicacy check
Storage set to /Users/adapt/tmp/d
Listing all chunks
All chunks referenced by snapshot test at revision 1 exist
All chunks referenced by snapshot test at revision 2 exist
Chunk 2256883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164 referenced by snapshot test at revision 3 expected size 2858824 but actual is 8
Some chunks referenced by snapshot test at revision 3 are missing and/or invalid
Chunk 2256883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164 referenced by snapshot test at revision 4 expected size 2858824 but actual is 8
Some chunks referenced by snapshot test at revision 4 are missing and/or invalid

# need to tell backup to also check remote chunk lengths
./duplicacy backup -check
Storage set to /Users/adapt/tmp/d
Last backup at revision 4 found
Indexing /Users/adapt/go/src/github.com/adapt0/duplicacy/duplicacy
Listing remote chunks
Chunk 2256883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164 referenced by snapshot test at revision 4 expected size 2858824 but actual is 8
verifyUploadLengths false
Chunk 2256883af250f202b989a0790b33fbb324fd732d16516bf704a4328bdeac7164 referenced by snapshot test at revision 4 expected size 2858824 but actual is 8
Packed duplicacy (27272392)
Packed duplicacy_main.go (56974)
Backup for /Users/adapt/go/src/github.com/adapt0/duplicacy/duplicacy at revision 5 completed

./duplicacy check -files
Storage set to /Users/adapt/tmp/d
Listing all chunks
All files in snapshot test at revision 1 have been successfully verified
All files in snapshot test at revision 2 have been successfully verified
All files in snapshot test at revision 3 have been successfully verified
All files in snapshot test at revision 4 have been successfully verified
All files in snapshot test at revision 5 have been successfully verified

adapt0 added some commits Sep 20, 2018

option to check remote chunk lengths during backup
have upload also check remote chunk length before assuming duplicate

adapt0 force-pushed the adapt0:store-upload-lengths branch from d4a66d7 to f49088a Dec 9, 2018