Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LZ4F_decompress failed with code ERROR_blockChecksum_invalid #134

Closed
sbrunsch2024 opened this issue Sep 4, 2023 · 15 comments
Closed

LZ4F_decompress failed with code ERROR_blockChecksum_invalid #134

sbrunsch2024 opened this issue Sep 4, 2023 · 15 comments
Labels
restore Issue related to restore support Usage/support questions.

Comments

@sbrunsch2024
Copy link

Version used
Provide output of virtnbdbackup -V

1.9.38

Describe the bug
restore Error

Expected behavior
successful restore

Hypervisor information:

  • OS: [e.g. Alma, Debian]
  • HV type [e.g. opennebula, plain libvirt, rhev]
    libvirt kvm@ debian

virtnbdrestore -i /vmbackup/routervm/202309 -o /kvm/storage -N routervm

ERROR root virtnbdrestore - restoreData [MainThread]: LZ4F_decompress failed with code: ERROR_blockChecksum_invalid
Traceback (most recent call last):
File "/usr/bin/virtnbdrestore", line 142, in restoreData
written = chunk.read(
^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/libvirtnbdbackup/chunk.py", line 83, in read
data = lz4.decompressFrame(reader.read(blocklen))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/libvirtnbdbackup/lz4.py", line 27, in decompressFrame
return lz4.frame.decompress(data)

No workaround

@sbrunsch2024 sbrunsch2024 added the bug Something isn't working label Sep 4, 2023
@abbbi
Copy link
Owner

abbbi commented Sep 4, 2023

Hi, This Looks Like a corrupt Block in the data file. If compression was enabled during backup the lz4 layer also stores an checksum alongside the compressed block and it appears this checksum doesnt Match anymore. Any notable issues on the filesystem the data is stored on? Is it reproducible with other backups?

@sbrunsch2024
Copy link
Author

sbrunsch2024 commented Sep 4, 2023

I had 2 vms which had this error. I made a fresh backup of both and got the same error with one remaining vm (the biggest one).
Just trying a new backup without compression. But 300GB data take some time for backup and restore.

Target is a NFS-Server with TrueNAS Scale. Message-file from Server and NAS seem to be clean.

Lessons learnt: I think I will do regular restore test.

@abbbi
Copy link
Owner

abbbi commented Sep 4, 2023

Is it reproducible it you Backup/ restore to local storage on the hypervisor host? Granted i dont use compression all that often .. at least not with vm this big. The checksum is calculated by the lz4 Module automatically.

One reason could be that the complete frame read during restore is wrong But that would end in a different error..

restore log with verbose log might show more details, but its more likely the data beeing corrupted somehow.

Maybe there is a way to disable the Check during decompress, to See if its able to accomplish decompression even if the checksum is wrong, Need to Check the lz4 modules docs on that

@abbbi
Copy link
Owner

abbbi commented Sep 4, 2023

i just attempted to reproduce this with a backup i did. After backup i replaced one byte within the backup file and attempted restore:

ERROR root virtnbdrestore - restoreData [MainThread]: LZ4F_decompress failed with code: ERROR_blockChecksum_invalid

but there seems to be no way to ignore the checksum check (which is obviously OK)

Also in my case, just before the decompression the frame information for the lz4 block is shown:

[2023-09-04 21:12:46] DEBUG root lz4 - decompressFrame [MainThread]: Compressed Frame: {'block_size': 65536, 'block_size_id': 4, 'block_linked': True, 'content_checksum': True, 'block_checksum': True, 'skippable': False, 'content_size': 131072}
[2023-09-04 21:12:46] ERROR root virtnbdrestore - restoreData [MainThread]: LZ4F_decompress failed with code: ERROR_blockChecksum_invalid

which means the data read from the backup file is a correct frame, but its data contents dont match the checksum anymore. From the lz4 docs:

content_checksum (bool) – Specifies whether to enable checksumming of the uncompressed content. If True, a checksum is stored at the end of the frame, and checked during decompression. Default is False.
block_checksum (bool) –
Specifies whether to enable checksumming of the uncompressed content of each block. If True a checksum of the uncompressed data in each block in the frame is stored at

the end of each block. If present, these checksums will be used to validate the data during decompression.

both options are enabled by default during backup.

No messages in dmesg or alike where transfer issues via nfs might be the reason?

@sbrunsch2024
Copy link
Author

I made a backup to different NFS-Server and I could restore it.
Just doing the restore, but I have to backup my nextcloud data to be safe that I have a working backup. So Backup / Restore jobs now working simultanously.

Two different servers had the same problem...

Network errors? FS-errors?
I will schedule a zfs scrub.

@abbbi abbbi added Needs-testing Feature needs testing support Usage/support questions. and removed bug Something isn't working labels Sep 4, 2023
@abbbi
Copy link
Owner

abbbi commented Sep 5, 2023

ive commited a new feature into the "checksum" branch: if backup to regular directory is attempted it now computes an adler32 checksum for the complete data file contents and reports them via logfile:

[2023-09-05 09:54:28] INFO output target - checksum [vda]: Checksum for file: [/tmp/chksum/vda.full.data.partial]:[3486281336]
[2023-09-05 09:54:35] INFO output target - checksum [vda]: Checksum for file: [/tmp/chksum/vda.inc.virtnbdbackup.1.data.partial]:[1914456873

then there is a verify option that allows for verifying that checksum:

./virtnbdrestore -i /tmp/chksum/ -o verify
[2023-09-05 09:54:48] INFO lib common - printVersion [MainThread]: Version: 1.9.39 Arguments: ./virtnbdrestore -i /tmp/chksum/ -o verify
[2023-09-05 09:54:48] INFO root virtnbdrestore - verify [MainThread]: Computing checksum for: /tmp/chksum/vda.full.data
[2023-09-05 09:54:50] INFO root virtnbdrestore - verify [MainThread]: Checksum: 3486281336
[2023-09-05 09:54:50] INFO root virtnbdrestore - verify [MainThread]: Computing checksum for: /tmp/chksum/vda.inc.virtnbdbackup.1.data
[2023-09-05 09:54:50] INFO root virtnbdrestore - verify [MainThread]: Checksum: 1914456873

so altered files can be identified more easily no matter if compression is used or not.

@sbrunsch2024
Copy link
Author

In this case it seems to be a problem with the storage. Truenas found some failures in zpool.
But checksum would be very nice, so restore test has not tb done so often...

@abbbi
Copy link
Owner

abbbi commented Sep 5, 2023

thanks for the update! Closing this one ..

@abbbi abbbi closed this as completed Sep 5, 2023
@abbbi
Copy link
Owner

abbbi commented Sep 6, 2023

@sbrunsch2024 latest master now contains this feature, see:

https://github.com/abbbi/virtnbdbackup#verifying-created-backups

of course only works with new backups. Could you test? Thanks.

@sbrunsch2024
Copy link
Author

sbrunsch2024 commented Sep 6, 2023 via email

@sbrunsch2024
Copy link
Author

sbrunsch2024 commented Sep 6, 2023 via email

@sbrunsch2024
Copy link
Author

sbrunsch2024 commented Sep 6, 2023 via email

@sbrunsch2024
Copy link
Author

sbrunsch2024 commented Sep 6, 2023 via email

@abbbi
Copy link
Owner

abbbi commented Sep 6, 2023

Last questions before waiting for reaction…. How to modify a bit of a backup file? dd ?

hexeditor or dd

Is there a way to speed up checksum creation? It takes about 3 minutes for 30 GB on rotating disks.

you mean verify? Its using 4k blocks to compute the checksum during verify, that might be a bottleneck.
I just added option -B to be able to specify a bigger buffer size.
Default is now io.DEFAULT_BUFFER_SIZE (8k) but might be changed accordingly to suite the environment:

 virtnbdrestore -i /tmp/BACKUP -o verify -B 8096 -v
 DEBUG root virtnbdrestore - verify [MainThread]: Using buffer size: 8192
 [..]
 virtnbdrestore -i /tmp/BACKUP -o verify -B 16384 -v
DEBUG root virtnbdrestore - verify [MainThread]: Using buffer size: 16384

@abbbi
Copy link
Owner

abbbi commented Sep 6, 2023

or simply append something to the backup file:

# echo 1 >> /tmp/BACKUP/vda.copy.data
# ./virtnbdrestore -i /tmp/BACKUP -o verify -B 16384 -v
[2023-09-06 08:43:16] INFO lib common - printVersion [MainThread]: Version: 1.9.41 Arguments: ./virtnbdrestore -i /tmp/BACKUP -o verify -B 16384 -v
[2023-09-06 08:43:16] DEBUG lib common - getLatest [MainThread]: Sorted data files: 
['/tmp/BACKUP/vda.copy.data']
[2023-09-06 08:43:16] DEBUG root virtnbdrestore - verify [MainThread]: Using buffer size: 16384
[2023-09-06 08:43:16] INFO root virtnbdrestore - verify [MainThread]: Computing checksum for: /tmp/BACKUP/vda.copy.data
[2023-09-06 08:43:17] INFO root virtnbdrestore - verify [MainThread]: Checksum result: 4208209834
[2023-09-06 08:43:17] INFO root virtnbdrestore - verify [MainThread]: Comparing checksum with stored information
[2023-09-06 08:43:17] ERROR root virtnbdrestore - verify [MainThread]: Stored sums do not match: [2609524591]!=[4208209834]

@abbbi abbbi added restore Issue related to restore and removed Needs-testing Feature needs testing labels Dec 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
restore Issue related to restore support Usage/support questions.
Projects
None yet
Development

No branches or pull requests

2 participants