weed backup tool would miss syncing the updated fids #399
Comments
Could you help me think of a case where the current backup strategy would fail? If it can happen, we should switch to a stricter way of backing up.
OK! Suppose the weed client adds files A, B, and C to the volume above. When the weed backup tool runs at this point, it syncs A, B, and C to the backup location correctly. Then the weed client adds file D, deletes file A, and updates file B. On its next run, the weed backup tool adds file D and deletes file A, but does nothing for the updated file B. Does this case make sense?
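The scenario above can be reproduced with a small sketch. This is not the actual weed backup code: `keyOnlyDiff` is a hypothetical function that assumes the tool decides what to sync purely from which fids exist on each side, and the map values are made-up stand-ins for file contents.

```go
package main

import (
	"fmt"
	"sort"
)

// keyOnlyDiff mimics a backup pass that compares only which fids exist on
// each side. The map values stand in for file contents and are never compared.
// (Hypothetical helper for illustration, not part of the weed code base.)
func keyOnlyDiff(src, dest map[string]string) (toAdd, toDelete []string) {
	for fid := range src {
		if _, ok := dest[fid]; !ok {
			toAdd = append(toAdd, fid)
		}
	}
	for fid := range dest {
		if _, ok := src[fid]; !ok {
			toDelete = append(toDelete, fid)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(toAdd)
	sort.Strings(toDelete)
	return toAdd, toDelete
}

func main() {
	// Backup state after the first run: A, B, and C were copied.
	dest := map[string]string{"A": "a1", "B": "b1", "C": "c1"}
	// Source after the client adds D, deletes A, and updates B.
	src := map[string]string{"B": "b2", "C": "c1", "D": "d1"}

	toAdd, toDelete := keyOnlyDiff(src, dest)
	fmt.Println("add:", toAdd, "delete:", toDelete) // add: [D] delete: [A]
	// B appears in neither list, so the backup keeps the stale "b1".
}
```

Because B exists on both sides, a key-only comparison has no way to notice that its content changed.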
Yes. We will need a new, simpler algorithm.
A simpler algorithm? What's your plan?
Can you write out your algorithm in pseudocode?
Sure, I will provide the pseudocode later this week. Please help review it~
Above is the pseudocode. I look forward to your suggestions~
The fetchVolumeFileCRCChecksums function returns a map with the needle id as the key and the needle's CRC checksum as the value. To reduce network transfer cost, we could use protobuf's map feature and wrap this map together with the compactVersion field in a struct that is marshaled into the HTTP response body.
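As a rough illustration of how such a response might be consumed, here is a minimal sketch. The `VolumeChecksums` and `SyncPlan` types, the `planSync` function, and the rule that a differing compactVersion triggers a full resync are all assumptions made for this example; the CRC values are invented, and a plain Go struct stands in for the protobuf message.

```go
package main

import (
	"fmt"
	"sort"
)

// VolumeChecksums is a hypothetical stand-in for the proposed response:
// the compact version plus a needle id -> CRC checksum map.
type VolumeChecksums struct {
	CompactVersion uint32
	CRCByNeedle    map[uint64]uint32
}

// SyncPlan lists what the backup side would need to do (hypothetical type).
type SyncPlan struct {
	FullResync bool
	Fetch      []uint64 // needles that are new or whose checksum changed
	Delete     []uint64 // needles that no longer exist on the source
}

// planSync compares the source and backup checksum maps.
// Assumption: if the compact versions differ, fall back to a full resync.
func planSync(src, dest VolumeChecksums) SyncPlan {
	if src.CompactVersion != dest.CompactVersion {
		return SyncPlan{FullResync: true}
	}
	var plan SyncPlan
	for id, crc := range src.CRCByNeedle {
		if destCRC, ok := dest.CRCByNeedle[id]; !ok || destCRC != crc {
			plan.Fetch = append(plan.Fetch, id)
		}
	}
	for id := range dest.CRCByNeedle {
		if _, ok := src.CRCByNeedle[id]; !ok {
			plan.Delete = append(plan.Delete, id)
		}
	}
	sort.Slice(plan.Fetch, func(i, j int) bool { return plan.Fetch[i] < plan.Fetch[j] })
	sort.Slice(plan.Delete, func(i, j int) bool { return plan.Delete[i] < plan.Delete[j] })
	return plan
}

func main() {
	// Made-up checksums: needle 2 was updated, 1 was deleted, 4 was added.
	dest := VolumeChecksums{CompactVersion: 1, CRCByNeedle: map[uint64]uint32{1: 0xa1, 2: 0xb1, 3: 0xc1}}
	src := VolumeChecksums{CompactVersion: 1, CRCByNeedle: map[uint64]uint32{2: 0xb2, 3: 0xc1, 4: 0xd1}}

	plan := planSync(src, dest)
	fmt.Printf("fetch=%v delete=%v full=%v\n", plan.Fetch, plan.Delete, plan.FullResync)
	// fetch=[2 4] delete=[1] full=false
}
```

Unlike a key-only comparison, this picks up the updated needle 2. It would still miss an update whose new content happens to have the same CRC as the old content.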
I think this crcMap would mostly work, but I am not sure whether it is watertight in edge cases. I was thinking of just treating the remote .dat file as a change log and copying every bit of it.
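A minimal sketch of the change-log idea, under two assumptions that this thread does not confirm: that adds, updates, and deletes all show up as new bytes appended to the end of the .dat file, and that the backup is always a prefix of the source. The `appendTail` function and the `put`/`del` record strings are hypothetical, and in-memory byte slices stand in for the remote and local files.

```go
package main

import (
	"errors"
	"fmt"
)

// appendTail treats source as an append-only log and copies to the backup
// only the bytes that lie past the backup's current length.
// Assumption: the backup is a prefix of the source. If the source was
// rewritten (for example by compaction), this no longer holds and a full
// copy would be needed instead.
func appendTail(source, backup []byte) ([]byte, error) {
	if len(backup) > len(source) {
		return nil, errors.New("backup is longer than source; full copy needed")
	}
	return append(backup, source[len(backup):]...), nil
}

func main() {
	// State after the first backup run: both sides hold the same log.
	backup := []byte("put A|put B|put C|")
	// The source then records an add, a delete, and an update as new entries.
	source := []byte("put A|put B|put C|put D|del A|put B v2|")

	backup, err := appendTail(source, backup)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(backup) == string(source)) // true
}
```

Under those assumptions the update to B cannot be missed, because the tool never needs to decide which fids changed; it only needs to know how many bytes it already has.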
Chris:
The weed backup tool currently only removes the fids that dest has but src does not, and adds the fids that src has but dest does not, at the backup location.
For files that have been updated, even many times, I think the backup tool does nothing.
If this is a bug, do you have any good suggestions for fixing it? For example, add each file's CRC32 checksum to the fetchVolumeFileEntries() function (this requires scanning the whole volume and reading the checksum from each needle). Then, for fid keys present in both the src and dest indexes, skip syncing only if the file size and checksum both match.
I look forward to your suggestions. Thanks!