Fatal: unable to save snapshot: Detected data corruption while saving blob... Corrupted blobs are either caused by hardware issues or software bugs. #4765

Open
alexxtasi opened this issue Apr 15, 2024 · 6 comments

Comments

@alexxtasi

Output of restic version

restic 0.16.4 compiled with go1.21.6 on linux/amd64

What backend/service did you use to store the repository?

A usb connected HDD

Problem description / Steps to reproduce

  • I created a restic repo: restic init --repo /run/media/alex/restic/restic-home/
  • then backed up my whole home directory: restic --repo /run/media/alex/restic/restic-home --verbose backup /home/alex/ --exclude=/home/alex/VirtualBox\ VMs/ (tried to ...)

Expected behavior

Proceed to backup

Actual behavior

The output:

~ $ restic --repo /run/media/alex/restic/restic-home --verbose backup /home/alex/ --exclude=/home/alex/VirtualBox\ VMs/
open repository
enter password for repository: 
repository fd5f60c3 opened (version 2, compression level auto)
created new cache in /home/alex/.cache/restic
lock repository
no parent snapshot found, will read all files
load index files

start scan on [/home/alex/]
start backup on [/home/alex/]
scan finished in 42.706s: 297483 files, 63.019 GiB
Fatal: unable to save snapshot: Detected data corruption while saving blob bc9e3d6a268d474a31d36ce6dd92f9d5794771260921fd5b3ed13f35261c57c6: hash mismatch
Corrupted blobs are either caused by hardware issues or software bugs. Please open an issue at https://github.com/restic/restic/issues/new/choose for further troubleshooting.

Do you have any idea what may have caused this?

No!
It has happened once before, using the same HDD to create restic repos. The same "... Corrupted blobs are either caused by hardware issues or software bugs..." error occurred, but that time I was trying to back up a large number of photo files.

Did restic help you today? Did it make you happy in any way?

@mmachner

I have the same problem with an s3/minio backend while backing up many small files.

Fatal: unable to save snapshot: Detected data corruption while writing pack-file header: pack header entry mismatch got <Blob (tree) cf6c4e56, offset 0, length 1002683982, uncompressed length 251812809> instead of <Blob (tree) cf6c4e56, offset 0, length Corrupted data is either caused by hardware issues or software bugs. Please open an issue at https://github.com/restic/restic/issues/new/choose for further troubleshooting.

"restic check --read-data" finds no errors. Also testing against a new bucket with a new repo runs into the same error.

@MichaelEischer
Member

Do those errors show up reproducibly or do they only occur at random times? Please run a memory / CPU stress test to check for problems with your hardware.

In addition, please run restic check --read-data to verify the integrity of your repository. Warning: this will have to download the whole repository.

@mmachner Your error looks like a bitflip in the memory/CPU on your host: <Blob (tree) cf6c4e56, offset 0, length 1002683982, uncompressed length 251812809>. Notice that length (the compressed size) is about 4 times larger than the uncompressed length. That shouldn't be possible. The error message also seems to be cut off: something is missing between "length" and "Corrupted data is either". Do you still have the missing part of the error message?

@alexxtasi Is that error reproducible?

@mmachner

It happens on every backup run of this job. The same server also has another restic backup job, which gets a big dbdump piped in and runs fine. Only the job with small files has the problem. I also tried with a new, clean repo: same problem. The server has ECC RAM and RAID disks.

Fatal: unable to save snapshot: Detected data corruption while writing pack-file header: pack header entry mismatch got <Blob (tree) 1bfb952e, offset 0, length 999811890, uncompressed length 240300308> instead of <Blob (tree) 1bfb952e, offset 0, length 999811890, uncompressed length 4535267604>

Fatal: unable to save snapshot: Detected data corruption while writing pack-file header: pack header entry mismatch got <Blob (tree) 800152cb, offset 0, length 1002453363, uncompressed length 250946913> instead of <Blob (tree) 800152cb, offset 0, length 1002453363, uncompressed length 4545914209>

Fatal: unable to save snapshot: Detected data corruption while writing pack-file header: pack header entry mismatch got <Blob (tree) cf6c4e56, offset 0, length 1002683982, uncompressed length 251812809> instead of <Blob (tree) cf6c4e56, offset 0, length 1002683982, uncompressed length 4546780105>

@MichaelEischer
Member

I don't have much time right now, so I'll just give a short reply.

uncompressed length 4535267604

That's about 4.2 GiB, which does not fit in a 32-bit length field and completely changes the explanation: you've hit #2446, which will only be fixed in restic 0.18.0 (https://forum.restic.net/t/roadmap-for-restic-0-17-to-0-19/7197), which hopefully still happens this year. There's unfortunately no real workaround; restic versions before 0.16.4 will just panic with "offset or length does not fit in uint32. You have packs > 4GB!".

Do you have any way to reduce the number of small files in a folder (subfolders don't count!)?
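The arithmetic behind this diagnosis can be checked directly. The sketch below is only an illustration, not restic's code: it assumes the length is stored in an unsigned 32-bit field (as the "does not fit in uint32" panic message suggests) and shows that each "got" value in the error messages above is the expected value with everything above the low 32 bits dropped. The helper name `truncate_to_uint32` is made up for this example.

```python
# Values copied from the three error messages in this thread:
# (uncompressed length reported as "got", length reported after "instead of")
PAIRS = [
    (240300308, 4535267604),
    (250946913, 4545914209),
    (251812809, 4546780105),
]

UINT32_MODULUS = 2**32  # an unsigned 32-bit field holds 0 .. 2**32 - 1


def truncate_to_uint32(value: int) -> int:
    """Keep only the low 32 bits, as storing into a uint32 field would."""
    return value % UINT32_MODULUS


for got, expected in PAIRS:
    wrapped = truncate_to_uint32(expected)
    size_gib = expected / 2**30
    print(f"{expected} bytes ({size_gib:.2f} GiB) -> {wrapped}, matches 'got': {wrapped == got}")
```

All three pairs match, which is consistent with an overflow of the length field rather than a random bitflip: a hardware fault would be unlikely to produce exactly the same offset of 2**32 on every run.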

@alexxtasi
Author

@MichaelEischer thanks for the suggestion

@alexxtasi Is that error reproducible?

It is, in the sense that whenever I attempt the backup (with the same command), the result is the same.

But...
I moved my two hard disks to another machine and ... restic succeeded !!

~ $ restic --repo /run/media/alex/restic/restic-darktable/ --verbose backup darktable
open repository
enter password for repository: 
repository e552c8b1 opened (version 2, compression level auto)
lock repository
no parent snapshot found, will read all files
load index files
[0:00]          0 index files loaded
start scan on [darktable]
start backup on [darktable]
scan finished in 6.557s: 17070 files, 275.138 GiB

Files:       17070 new,     0 changed,     0 unmodified
Dirs:          225 new,     0 changed,     0 unmodified
Data Blobs:  199255 new
Tree Blobs:    226 new
Added to the repository: 275.130 GiB (270.121 GiB stored)

processed 17070 files, 275.138 GiB in 4:15:45
snapshot 8e5395ae saved

And on the other hand... you were right!! My PC has RAM issues!! ;)
So I understand it's not restic's fault, nor a large-directory problem.

Thank you very much.

@MichaelEischer
Member

And on the other hand... you were right!! My PC has RAM issues!! ;)

restic includes several layers of integrity checks to (hopefully) detect data corruption in a backup. That means it's usually pretty effective at detecting RAM issues.
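To illustrate why a hash check catches this kind of fault, here is a minimal sketch of content-addressed verification. It is not restic's implementation; it only assumes that a blob is identified by the SHA-256 hash of its content, as the 64-hex-character ID in the "hash mismatch" error above suggests. The function names and the sample data are hypothetical.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Content-derived ID: the SHA-256 hash of the data, as hex."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted, simulating a RAM fault."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def verify(data: bytes, expected_id: str) -> bool:
    """Re-hash the data and compare against the ID computed earlier."""
    return blob_id(data) == expected_id


original = b"example file contents " * 1000
expected = blob_id(original)          # ID computed when the data was first read
damaged = flip_bit(original, 12345)   # the same buffer after a single bitflip

print(verify(original, expected))  # True
print(verify(damaged, expected))   # False
```

A single flipped bit anywhere in the buffer changes the hash, so re-checking the ID before the data is written turns silent corruption into a visible error.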
