Add content hash to ls --json #2870
Comments
What you are basically looking for is a possibility to check if your local files match the state of a backup snapshot. I agree that this would be a nice and useful extension to restic. It is similar to #2011. However, I don't agree that listing hashes in A solution using But I would prefer to have this implemented into restic - either as extension to the
I share @aawsome's objection. The suggestion is to introduce a hash that doesn't correspond to anything in the restic object model and is also not the hash of a file on disk, so its usefulness is very limited. There must be a cleaner way to compare files that also works with a file outside the repo.
I'm also not too happy with the 'multi' solution. Ideally restic would store the full file hash in the metadata, but this would impose a performance cost during backup. I don't think this is really worth it just for this use case.
My original approach just included the list of content hashes in the output. Unfortunately these content hashes are not very useful by themselves, unless of course you want to fetch the content. In order to check any local files, you would need to either know the chunk sizes or know the split secret and apply the same algorithm yourself. Unfortunately, this size information is currently not available. Perhaps this is something we could add to the metadata and then print both
For my purposes of comparing files between different snapshots, just having the list of content hashes would suffice. Would my PR be acceptable if it simply exposed the list of content IDs in the JSON and got rid of the weird multi hash? I guess this is useful anyway, because it allows you to reconstruct the file contents. I can also open a new issue to discuss the addition of a ContentSizes slice.
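To make the point about chunk sizes concrete: if both the content IDs and the matching chunk sizes were exposed, checking a local file would not need the split secret at all. The following is only a sketch, not restic code. It assumes each content ID is the SHA-256 of the plaintext chunk, and the `chunk` type as well as `verifyChunks`, `verifyBytes` and `chunksOf` are hypothetical names made up for this example.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"io"
)

// chunk describes one blob of a file: its ID (assumed to be the SHA-256 of
// the plaintext chunk) and its length in bytes. Hypothetical type, not
// part of restic's API.
type chunk struct {
	ID   [sha256.Size]byte
	Size int
}

// verifyChunks reads r chunk by chunk, using the recorded sizes as split
// points. It returns the index of the first chunk that does not match,
// len(chunks) if r has trailing data, or -1 if everything matches.
func verifyChunks(r io.Reader, chunks []chunk) (int, error) {
	for i, c := range chunks {
		buf := make([]byte, c.Size)
		if _, err := io.ReadFull(r, buf); err != nil {
			if err == io.EOF || err == io.ErrUnexpectedEOF {
				return i, nil // local file is shorter than the stored one
			}
			return i, err
		}
		if sha256.Sum256(buf) != c.ID {
			return i, nil
		}
	}
	// One more byte means the local file is longer than the stored one.
	var extra [1]byte
	_, err := io.ReadFull(r, extra[:])
	switch err {
	case nil:
		return len(chunks), nil
	case io.EOF:
		return -1, nil
	default:
		return len(chunks), err
	}
}

// verifyBytes is a convenience wrapper for in-memory data.
func verifyBytes(data []byte, chunks []chunk) (int, error) {
	return verifyChunks(bytes.NewReader(data), chunks)
}

// chunksOf splits data at the given sizes and hashes each piece. It stands
// in for the IDs and sizes that would come from a snapshot, and panics if
// the sizes add up to more than len(data).
func chunksOf(data []byte, sizes ...int) []chunk {
	var out []chunk
	for _, n := range sizes {
		out = append(out, chunk{ID: sha256.Sum256(data[:n]), Size: n})
		data = data[n:]
	}
	return out
}
```

For a real file, `verifyChunks` could be handed an `*os.File` directly. Only one chunk is held in memory at a time, so large files are not a problem.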
The chunk size information is available to restic; it is saved in the index. Simply use something like
    for i, id := range node.Content {
        size[i] = repo.Index().Lookup(id, restic.DataBlob)[0].Length
    }
However, I would still prefer to have a
Thanks! I will try this and see how it performs.
I agree that this would be very useful. I currently do not have enough time to make any promises, but I may have a look at how much effort that would take. This does however not preclude making the
Output of restic version
restic 0.9.6 (v0.9.6-337-g0b21ec44-dirty)
What should restic do differently? Which functionality do you think we should add?
Add a content hash to the ls --json output.
When the file was stored in a single chunk, the real sha256 hash is available and the reported hash will have the format "sha256:...".
When the file was split into multiple chunks, it is not possible to show a real content hash, because restic does not currently store this information. We can however construct a hash out of the chunk hashes. To distinguish this from a real sha256 of the contents, this hash will have the format "multi:...".
This 'multi' hash can only be compared within a single repo, as different repos will split files in different locations.
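A minimal sketch of how the two formats could be derived from a file's list of chunk IDs. The issue text does not spell out how the chunk hashes are combined, so hashing the concatenated raw chunk IDs with SHA-256 is an assumption made for this example, as is the function name `contentHash`. It is not taken from the PR.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// contentHash builds the proposed hash string from a file's chunk IDs.
// Hypothetical helper; the combination scheme for "multi" is an assumption.
func contentHash(ids [][sha256.Size]byte) string {
	switch len(ids) {
	case 0:
		// An empty file has no chunks; its real hash is the SHA-256 of
		// empty input.
		sum := sha256.Sum256(nil)
		return "sha256:" + hex.EncodeToString(sum[:])
	case 1:
		// A single chunk covers the whole file, so the chunk ID is the
		// real SHA-256 of the contents.
		return "sha256:" + hex.EncodeToString(ids[0][:])
	}
	// Several chunks: hash the concatenated chunk IDs. The result depends
	// on where the file was split, so it is only comparable within a repo.
	h := sha256.New()
	for _, id := range ids {
		h.Write(id[:])
	}
	return "multi:" + hex.EncodeToString(h.Sum(nil))
}
```

Because the order of the IDs feeds into the digest, two files with the same chunks in a different order get different 'multi' values.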
I have a PR ready that I will add in a moment.
What are you trying to do?
I am trying to figure out if any original pictures have succumbed to bitrot since my initial restic backup of them.
Did restic help you today? Did it make you happy in any way?
Keeping the full backup history of my pictures in an efficient way makes it possible to recover from bitrot in the future.