Pruning Vertical Backup storage gone awry #465
Comments
Which revision of CDS-Server referenced the missing chunk? Is it 67?
CDSServ (not CDS-Server) is the one I was trying to copy, from a revision (I can't remember exactly which) created since those prunes were run. But it seems the chunk isn't referenced in 67, while it is in 68 onwards (even 72, which completed a few hours ago):
Edit: I'm currently running a check command but it's taking a loooong time - stuck on listing all chunks - will see in the morning. :)
I think this was caused by a bug in determining the set of new snapshots (duplicacy/src/duplicacy_snapshotmanager.go, lines 115 to 119 in d4a65ff).
A snapshot not seen when the fossil collection was created should always be added to the set of new snapshots. I'll roll out a fix tomorrow.
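Roughly speaking, the intended logic looks like the sketch below. The names and types are illustrative only, not the actual code in duplicacy_snapshotmanager.go:

```go
type Snapshot struct {
	ID       string // repository / snapshot id, e.g. "CDSServ"
	Revision int
}

// newSnapshotsSince returns the snapshots that must be treated as new
// relative to a fossil collection. Illustrative sketch only; the real
// code is structured differently.
func newSnapshotsSince(all []Snapshot, lastRevisionAtCollection map[string]int) []Snapshot {
	var newSnapshots []Snapshot
	for _, s := range all {
		last, seen := lastRevisionAtCollection[s.ID]
		// The bug: counting a snapshot as new only when s.Revision > last,
		// which misses snapshot ids that weren't recorded at all when the
		// collection was created. An unseen id must always count as new.
		if !seen || s.Revision > last {
			newSnapshots = append(newSnapshots, s)
		}
	}
	return newSnapshots
}
```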
Excellent, thanks. How do I go about fixing the missing chunk? :) I'd rather not start from scratch (it's 480GB and mostly copied onto remote storage). It seems like the backup command in Vertical Backup needs a -hash or -force option, but I guess that would entail listing all the chunks on the storage, which takes an extraordinarily long time on my system. (I'm still trying to run a check command and it's still "Listing chunks/8f/" etc.)
The easiest way to fix this is to create a new empty storage directory on your sftp server, and copy over the
That certainly seems like the safest way to go, but I'm running out of space on the SFTP server and don't have any spare, locally at least. This sounds hacky but what if I could go back to revision 67 by temporarily removing snapshot files 68 and above? I'd probably have to remove/rename the cache. Then I could put the original 68-75 revisions back and clean up. Would this work?
My dilemma solved itself! Subsequent backups no longer reference this chunk and a check command (which took a helluva long time; to be investigated another day!) says all chunks referenced by the snapshot at revision 74 exist. My guess is that part of the .vmdk was overwritten by the running VM.
I think this is because revision 73 didn't reference that missing chunk, and when revision 74 was created, Duplicacy did a lookup on that chunk and uploaded a new copy since the chunk couldn't be found. By default, any chunks referenced by the latest revision are assumed to exist, so no lookups are needed for them. The bug has been fixed by 72dfaa8. There is also a new release 2.1.1 that includes this bug fix if you need the latest binaries.
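As a sketch of what that default means in practice (an illustrative function, not Duplicacy's actual implementation): chunk hashes from the last revision are treated as known to exist and skip the storage lookup, while anything outside that set is looked up and, if absent, uploaded again - which is why revision 74 recreated the chunk once revision 73 no longer referenced it.

```go
// uploadMissingChunks is an illustrative sketch, not Duplicacy's actual code.
// Chunks referenced by the last revision are assumed to exist, so they skip
// the storage lookup; any other chunk is checked and re-uploaded if missing.
func uploadMissingChunks(lastRevisionHashes, toBackUp []string,
	storageHas func(hash string) bool, upload func(hash string)) {
	known := make(map[string]bool, len(lastRevisionHashes))
	for _, h := range lastRevisionHashes {
		known[h] = true // assumed present; no lookup needed
	}
	for _, h := range toBackUp {
		if known[h] {
			continue
		}
		if !storageHas(h) {
			upload(h)
		}
	}
}
```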
Wonderful, thanks. I can't see a v2.1.1 on the release page for the CLI; is that for the GUI version? No matter, I wanted to get around to compiling it myself anyway...
Sorry, it was saved as a draft and I thought it was published. It should be available on the release page now.
I am still thinking about the … What if …

Or, back to my hacky idea: would it be feasible, and simply easier, to set up a new, temporary repository id and run an initial backup to the same storage? Most of it would be de-duplicated, but would that guarantee that missing chunks referenced by the damaged repository id are recreated? I can imagine it would in the case of Vertical Backup, where chunks are fixed size, but what about a normal Duplicacy file-based backup?
It is fairly easy to implement an option to avoid using the last snapshot as the cache and force a lookup on every chunk. I just couldn't find a good short and descriptive name for it. But I also wonder whether it is really needed. For Duplicacy you can just edit the
For Duplicacy, would such a procedure guarantee missing chunks are recreated? As time passes between backups, directories may get reorganised and the boundaries between files and chunks may change. For Vertical Backup, one could temporarily rename the snapshot folder on the storage out of the way (and the local cache). The majority of chunks would still exist on the storage, and an initial backup would replace the missing chunks, correct?
Right, you may not be able to recreate the missing chunks even if you still have the original files.
You're correct. Renaming the snapshot folder should work for Vertical Backup. I didn't think about this.
Some background: I'm using Vertical Backup to back up 3 VMs to local SFTP storage on a Debian VM. Then I'm using the Duplicacy copy command to copy the storage to a remote SFTP.
Because I've had to rearrange the remote storage a bit (had to buy a bigger HDD!) and the quantity of data is quite large, some time has passed since the last full copy, so I'm copying each VM one at a time (manually at the moment), from the latest revision rather than all revisions. While this was going on, I also took the opportunity to manually run prune a few times on the local storage, for the first time ever.
During the copy of one VM I received this error:
Chunk 60a85ee0721cd38cac2cc1372e822ea2d56db93d311687a827a8477c894fbc4b can't be found
Apparently, this chunk was removed after two consecutive prunes.
(There are other prune logs in there but I trimmed it for relevancy; I listed these logs mainly to show the timestamps - when the prune was started, and when it completed.)
In order to understand the chain of events, I took the emailed Vertical Backup log and edited it (here) - adjusted the timestamps (which were in UTC) to local time and inserted events according to the two log files:
prune-log-20180721-174544.txt
prune-log-20180721-212613.txt
Note CDSServ is the VM with the missing chunk and none of the removed snapshot revisions are that recent. Subsequent backups by Vertical Backup, however, haven't replaced the missing chunk!
From my understanding, it shouldn't be possible for prune to remove chunks that are still used by snapshots. A fossil collection shouldn't be deleted until at least one new backup from every repository has been made.
I don't have a wonderful understanding of the code but these comments drew my attention.
That's not what actually happens though, is it? Vertical Backup does one VM backup after another - they're not atomic.
Is Duplicacy treating the backup of the first VM as satisfying the fossil deletion step for the collected fossils?
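To make the rule I'm assuming concrete, here is a minimal sketch (my own naming, not Duplicacy's code) of the condition I'd expect prune to check before deleting a collected fossil set:

```go
// collectionDeletable is a hypothetical check, not Duplicacy's actual code:
// a fossil collection should only become deletable once every known
// repository (snapshot id) has produced at least one new snapshot since the
// collection was created. A single finished VM backup should not satisfy
// the condition on behalf of the other VMs.
func collectionDeletable(allRepositoryIDs []string, hasNewSnapshotSinceCollection map[string]bool) bool {
	for _, id := range allRepositoryIDs {
		if !hasNewSnapshotSinceCollection[id] {
			return false // this repository has no backup newer than the collection
		}
	}
	return true
}
```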