Backup: Invalid HMAC on verification #1471

Closed
andrewdavidwong opened this issue Nov 30, 2015 · 10 comments
Labels
C: core, T: bug

Comments

@andrewdavidwong
Member

Since upgrading to R3.0, backup verification occasionally fails with the error message below, but immediately doing a host reboot and reattempting the verification on the same backup file then succeeds.

ERROR: ERROR: invalid hmac for file /var/tmp/restore_[...]:
[...]
Is the passphrase correct?
Partially restored files left in /var/tmp/restore_*, investigate them and/or clean them up
@andrewdavidwong
Member Author

This is possibly related to #1124.

@marmarek added the T: bug and C: core labels on Nov 30, 2015
@marmarek added this to the Release 3.0 updates milestone on Nov 30, 2015
@andrewdavidwong
Member Author

I think the problem occurs when there is insufficient disk space (on the disk on which /var/tmp/ resides). However, I would still consider this a bug, since the verification process does not indicate that (any) spare disk space will be needed, and the error messages do not indicate that the failure occurred due to insufficient space.

How much empty space is required, at minimum, for successful verification? (Just enough to hold the largest VM in the backup? More than that?)

@andrewdavidwong
Member Author

OK, clearly a lot more space is being used than just the largest VM. Large backup-restore test failed with:

  • ~43GB left over in /var/tmp/ spread across multiple VM dirs.
  • ~70GB total free space on disk at start of restore test.
  • Largest VM in the backup was ~47GB.

@marmarek, is there any way to specify a location for qvm-backup-restore to use as its temp dir for holding temporary restore files during verification (e.g., /mnt/large-secondary-disk/tmp/)?

If not, is there a recommended Linux way to "move" /var/tmp/ to a larger secondary disk that works nicely with Qubes? (Found out the hard way that trying to use a symlink can create an endless "waiting" loop that prevents Qubes from booting. Access was needed to /var/tmp/ to mount and decrypt the secondary disk, but /var/tmp/ was on that secondary disk...)

@marmarek
Member

In the worst case it may need as much disk space as the whole backup, but in practice it should need much less (about 200-300MB). If not, it's a bug. It works this way:

  1. The backup stream is extracted from the source media (either a direct file, or some VM command/file).
  2. The external tar is extracted. Result: private.img.000, private.img.000.hmac, private.img.001, private.img.001.hmac, etc.
  3. In parallel, as soon as a file is extracted, its hmac is checked and the file is then passed to the internal tar extractor (which would extract the actual private.img in this case). As soon as a file part is passed to that second tar, it is removed from the disk.
  4. In the case of backup verification, that second tar is set to only list the archive content, not extract it.

Parts are at most 100MB. With that parallel extraction, the second tar should be quite fast (tar t ...) and only a few files should be on disk at any time. Maybe your backup storage is so fast that the first tar is much faster than the second one?
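To make the check-then-delete step concrete, here is a minimal Python sketch of it. This is not the actual qvm-backup-restore code: the function name `verify_and_consume`, the `consume` callback (standing in for the second tar), the choice of SHA-512, and the assumption that each `.hmac` file holds a bare hex digest are all hypothetical and chosen only for illustration. It also processes only the parts already on disk, rather than running in parallel with the first tar.

```python
import hashlib
import hmac
import os


def verify_and_consume(chunk_dir, passphrase, consume):
    """Verify each part against its .hmac file, hand it on, then delete it.

    Hypothetical sketch: `consume` stands in for the inner tar, and the
    .hmac file is assumed to contain a plain hex-encoded HMAC-SHA512.
    """
    part_names = sorted(
        name for name in os.listdir(chunk_dir) if not name.endswith(".hmac")
    )
    for name in part_names:
        part_path = os.path.join(chunk_dir, name)
        hmac_path = part_path + ".hmac"

        with open(part_path, "rb") as f:
            data = f.read()  # parts are small (at most ~100MB each)
        with open(hmac_path, "rb") as f:
            expected = f.read().strip()

        actual = hmac.new(passphrase, data, hashlib.sha512).hexdigest()
        # Constant-time comparison of the computed and stored digests.
        if not hmac.compare_digest(actual.encode("ascii"), expected):
            raise ValueError("invalid hmac for file " + part_path)

        consume(data)  # pass the verified part to the next stage
        # Remove the part right away so only a few files stay on disk.
        os.remove(part_path)
        os.remove(hmac_path)
```

If the consumer is slower than the producer, or a part fails verification, the unprocessed parts stay behind in the directory, which would match the leftover files reported in /var/tmp/ above.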

Currently there is no way to specify an alternative path. You can mount something there (just put it in /etc/fstab).
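After mounting something at /var/tmp/, a quick check like the following can confirm that the path really is a mount point and show how much space is free before starting a verification. It is only a sketch using the Python standard library; the helper name `temp_dir_report` is made up for this example and is not part of any Qubes tool.

```python
import os
import shutil


def temp_dir_report(path="/var/tmp"):
    """Report whether `path` is a mount point and how much space is free."""
    usage = shutil.disk_usage(path)
    return {
        "path": path,
        "is_mount_point": os.path.ismount(path),
        "free_gib": usage.free / 2**30,  # bytes to GiB
    }


if __name__ == "__main__":
    report = temp_dir_report()
    print("{path}: mount point={is_mount_point}, free={free_gib:.1f} GiB".format(**report))
```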

@andrewdavidwong
Member Author

Thanks for that explanation, @marmarek. I did another test, and the results were inconsistent with what I previously reported (same test as before, just excluded a couple of normal-size VMs, and verification succeeded this time even though the same amount of free space was available as before). So, it may be that the issue doesn't actually have anything to do with insufficient disk space. I think it was failing before because the backup was somehow corrupted upon creation, even though I created the backup twice in a row without changing any data. (I remember you saying before that sometimes particularly large backups can have "holes" due to some tar bug...)

@marmarek
Member

Excluded VMs aren't extracted (also during verification), so it still may be about disk space.

@andrewdavidwong
Member Author

Sorry, I meant that I excluded them during backup creation (i.e., didn't back them up).

@andrewdavidwong
Member Author

I just experienced the opposite issue. Verification first succeeded, then failed (and continued failing after many attempts). This is much more serious. Reported as issue #1577.

@andrewdavidwong
Member Author

Still happens on R3.2, but observed only when creating a large (80+ GB) backup and using compression. After backing up the exact same VMs without compression, verification succeeds.

@andrewdavidwong
Member Author

This issue is being closed because:

If anyone believes that this issue should be reopened, please let us know in a comment here.
