Unable to check or restore a backup #2052

Open
egonrassi opened this Issue Oct 18, 2018 · 7 comments

@egonrassi

egonrassi commented Oct 18, 2018

restic 0.9.3 compiled with go1.11.1 on linux/amd64

The backups were originally created with restic version 0.8.3, using the following command (variables quoted here to guard against paths with spaces):

for backup in $backupdirs
do
    ionice -c2 nice -n19 restic --no-cache backup --tag "$backup" "$backup"
done
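
(Note: the loop doesn't pass -r, so the repository must have been configured via environment variables. A hypothetical sketch of that setup using restic's standard variables; the bucket name is made up, and the paths are taken from the snapshots shown below:)

export RESTIC_REPOSITORY="s3:s3.amazonaws.com/my-bucket"   # hypothetical bucket
export RESTIC_PASSWORD="xxxxxxx"
backupdirs="/home/files /home/photo /home/influxdb"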

Now when I try to restore the files I get the following errors.

export RESTIC_PASSWORD="xxxxxxx"

$ restic -r restic check
password is correct
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs
error for tree [null]:
  tree 0000000000000000000000000000000000000000000000000000000000000000 not found in repository
Fatal: repository contains errors

$ restic -r restic rebuild-index
password is correct
counting files in repo
[0:00] 100.00%  11306 / 11306 packs
finding old index files
saved new indexes as [34a12efa 8cbd563c d24b20b9 f5dc29de]
remove 4 old index files

$ restic -r restic list snapshots
repository 2ad03384 opened successfully, password is correct
29a5de7d4bce696d7dae9ac8b9bde212d2857d095d9652402da3f81e3eb19183
2e1c5c336ffb53ae783b29d2ff04f270a732fe8128e7251d429849e2a3d30f0e
2ef404a61d133941baa7e1bdc901bcf7cfac3c356663f5b04ff825c577624c8a
$ restic -r restic cat snapshot 2ef404a61d133941baa7e1bdc901bcf7cfac3c356663f5b04ff825c577624c8a | jq .
{
  "time": "2018-07-13T06:40:30.455989205Z",
  "parent": "715d5f12173ac07b2d4f3f0dbcd320057d82b3403e2199301c9d4aa93573f568",
  "tree": "0000000000000000000000000000000000000000000000000000000000000000",
  "paths": [
    "/home/files"
  ],
  "hostname": "XXX",
  "tags": [
    "/home/files"
  ]
}
$ restic -r restic cat snapshot 2e1c5c336ffb53ae783b29d2ff04f270a732fe8128e7251d429849e2a3d30f0e | jq .
{
  "time": "2018-09-09T06:57:09.88842337Z",
  "parent": "f5cefb88eeceb40691f5b815d49b5b533745bee340bafdc02c119e4ee8a96190",
  "tree": "0000000000000000000000000000000000000000000000000000000000000000",
  "paths": [
    "/home/photo"
  ],
  "hostname": "XXX",
  "tags": [
    "/home/photo"
  ]
}
$ restic -r restic cat snapshot 29a5de7d4bce696d7dae9ac8b9bde212d2857d095d9652402da3f81e3eb19183 | jq .
{
  "time": "2018-07-22T06:44:51.695277455Z",
  "parent": "ddd688dcb4b7681c2904f763a91e73fd4b6e1da152c1fd6ae7a1573e76626f3e",
  "tree": "0000000000000000000000000000000000000000000000000000000000000000",
  "paths": [
    "/home/influxdb"
  ],
  "hostname": "XXX",
  "tags": [
    "influxdb"
  ]
}


The backup was originally made to S3, but I've copied the files locally to do these checks.
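
(To gauge how many snapshots are affected, each one can be checked for the all-zero tree ID using the same commands as above; a minimal sketch, assuming the local repo dir is named restic and jq is available:)

ZERO=0000000000000000000000000000000000000000000000000000000000000000
for id in $(restic -r restic list snapshots); do
    tree=$(restic -r restic cat snapshot "$id" | jq -r .tree)
    [ "$tree" = "$ZERO" ] && echo "broken snapshot: $id"
done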

@fd0


fd0 commented Oct 18, 2018

Thank you very much for the report! This is a serious bug: restic created snapshots with invalid tree IDs. You said that you used restic 0.8.3; did you create all snapshots with that version?

Did you do anything else with the snapshots afterwards, like add or remove tags?

Can you reproduce the issue? If you make a backup of one of the dirs (to the same repo) with restic 0.8.3, does it create another invalid snapshot? If that's the case, does it still happen with 0.9.0?

In 0.9.0, we've completely replaced the archiver code, so if it only happens with 0.8.3 that's probably the reason. I still want to get to the bottom of this if possible.

Last question: how important is the data to you? restic check warned you that something is wrong, so I'd advise saving the data again to a new repo with restic 0.9.3. I have some ideas on how to get the data back if necessary; please let me know if you'd like to try that.
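
(A minimal sketch of what saving to a new repo would look like; the path is just an example:)

restic init -r /srv/restic-new                 # create a fresh repository (example path)
restic -r /srv/restic-new backup /home/photo   # re-save the source data with restic 0.9.3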

@egonrassi


egonrassi commented Oct 19, 2018

Hi, thank you for the really swift response :)
a) Every snapshot was created with version 0.8.3, as far as I know.
b) I found that after the backup succeeded I pruned with the following command:
$ restic forget -q --prune --keep-last 1

c) The files are missing, so reproducing it is not possible :(
d) The data contains our family photos, so it would be REALLY nice to get your assistance retrieving whatever can be saved. So yes, I'd really like to try that.
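
(An aside on the forget invocation above: current restic versions support a --dry-run flag for forget, which prints what would be removed without touching the repo; a cautious pattern, assuming the same repo setup:)

restic forget --dry-run --keep-last 1   # preview which snapshots would be removed
restic forget --prune --keep-last 1     # then actually remove them and prune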

@fd0


fd0 commented Oct 20, 2018

Hm. I'm not sure whether prune by itself would have detected the issue. So there's a chance that it did not notice the problem and removed all data from the repo. In that case, it's gone.

What you can try is the code in the branch add-recover, which adds a new command called recover. You can just run restic recover and it'll create a new snapshot containing all directories it can find in the repo. It'll take a while, though. Afterwards, you can access the snapshot via restore or mount as usual.

I've also uploaded a precompiled binary for your convenience here: https://beta.restic.net/restic_recover_linux_amd64.bz2
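
(A minimal way to fetch and run that binary with standard tools, assuming the local repo dir is named restic:)

wget https://beta.restic.net/restic_recover_linux_amd64.bz2
bunzip2 restic_recover_linux_amd64.bz2
chmod +x restic_recover_linux_amd64
./restic_recover_linux_amd64 -r restic recover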

Please let us know how it goes!

@fd0


fd0 commented Oct 20, 2018

I've reproduced the issue manually (made a snapshot with an invalid tree ID, just like the ones in your repo). With both 0.8.3 and the latest master branch, restic prune panics:

$ ./restic prune

password is correct
counting files in repo
building new index for repo
[0:00] 100.00%  29 / 29 packs
repository contains 29 packs (2975 blobs) with 129.500 MiB
processed 2975 blobs: 0 duplicate blobs, 0B duplicate
load all snapshots
find data that is still in use for 1 snapshots
tree 0000000000000000000000000000000000000000000000000000000000000000 not found in repository
github.com/restic/restic/internal/repository.(*Repository).LoadTree
        /tmp/restic-build-717580179/src/github.com/restic/restic/internal/repository/repository.go:620
github.com/restic/restic/internal/restic.FindUsedBlobs
        /tmp/restic-build-717580179/src/github.com/restic/restic/internal/restic/find.go:11
main.pruneRepository
        src/github.com/restic/restic/cmd/restic/cmd_prune.go:191
main.runPrune
        src/github.com/restic/restic/cmd/restic/cmd_prune.go:85
main.glob..func17
        src/github.com/restic/restic/cmd/restic/cmd_prune.go:25
github.com/restic/restic/vendor/github.com/spf13/cobra.(*Command).execute
        /tmp/restic-build-717580179/src/github.com/restic/restic/vendor/github.com/spf13/cobra/command.go:698
github.com/restic/restic/vendor/github.com/spf13/cobra.(*Command).ExecuteC
        /tmp/restic-build-717580179/src/github.com/restic/restic/vendor/github.com/spf13/cobra/command.go:783
github.com/restic/restic/vendor/github.com/spf13/cobra.(*Command).Execute
        /tmp/restic-build-717580179/src/github.com/restic/restic/vendor/github.com/spf13/cobra/command.go:736
main.main
        src/github.com/restic/restic/cmd/restic/main.go:69
runtime.main
        /home/fd0/sdk/go1.11.1/src/runtime/proc.go:201
runtime.goexit
        /home/fd0/sdk/go1.11.1/src/runtime/asm_amd64.s:1333

The panic happens while prune is still collecting the blobs that are in use, before anything is removed, so I'm pretty confident now the data is still there.

If you ran restic forget -q --prune --keep-last 1, did you get an error like the one above?

Then restic recover is able to create a new snapshot with the data:

$ restic recover
repository d25a3b45 opened successfully, password is correct
load index files
load 522 trees

tree (522/522)
done
found 1 roots
saved new snapshot 744391c1

$ restic ls -l latest /
repository d25a3b45 opened successfully, password is correct
snapshot 1bd21b8b of [/recover] filtered by [/] at 2018-10-20 11:48:17.924454012 +0200 CEST):
drwxr-xr-x     0     0      0 2018-10-20 11:48:17 /01c4fab0

$ ./restic ls -l latest /01c4fab0
repository d25a3b45 opened successfully, password is correct
snapshot 1bd21b8b of [/recover] filtered by [/01c4fab0] at 2018-10-20 11:48:17.924454012 +0200 CEST):
drwxr-xr-x     0     0      0 2018-10-20 11:48:17 /01c4fab0
drwxr-xr-x  1000   100      0 2018-10-20 11:42:50 /01c4fab0/.git
drwxr-xr-x  1000   100      0 2018-10-20 11:08:03 /01c4fab0/.github
-rw-r--r--  1000   100     18 2018-10-20 11:08:03 /01c4fab0/.gitignore
-rw-r--r--  1000   100     20 2016-08-21 11:07:12 /01c4fab0/.hound.yml
-rw-r--r--  1000   100   1299 2018-10-18 21:15:51 /01c4fab0/.travis.yml
-rw-r--r--  1000   100  65568 2018-10-20 11:08:03 /01c4fab0/CHANGELOG.md
-rw-r--r--  1000   100   8405 2018-10-20 11:08:03 /01c4fab0/CONTRIBUTING.md
-rw-r--r--  1000   100   1071 2018-10-18 21:15:51 /01c4fab0/GOVERNANCE.md
[...]

@fd0 fd0 referenced this issue Oct 20, 2018: Add new command 'recover' #2056 (Merged)

@egonrassi


egonrassi commented Oct 20, 2018

Running:
restic forget -q --prune --keep-last 1
on both 0.8.3 and 0.9.3 (release, built without debug) returns exit code 0. So no error is returned, even on my current dataset, which is broken.

What makes me really happy is that the recover procedure you pointed to works like a charm :) The data is being copied without errors.

Your assistance is greatly appreciated :)

Are there any other actions you would like me to do? I guess most of this is moot anyway, like you said: most of these issues have been addressed or made obsolete by the new code.

@fd0


fd0 commented Oct 21, 2018

Thank you very much for the feedback! If forget does not remove any snapshots, prune is not run, so the error does not occur. Did it remove any snapshots when you ran it? Do you maybe still have the complete output?

> What makes me really happy is that the recover procedure you pointed to works like a charm :) The data is being copied without errors.

That's a relief! Now that you have your data back in a new snapshot, you can remove the broken snapshots with restic forget <id>, and then you should be able to use the repo normally (including prune).
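
(A sketch of that cleanup, using the short IDs of the three broken snapshots listed earlier in this thread; the repo is assumed to be set via -r or RESTIC_REPOSITORY:)

restic forget 29a5de7d 2e1c5c33 2ef404a6   # drop the snapshots with the all-zero tree
restic prune                               # now safe to reclaim unused space
restic check                               # verify the repository is consistent again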

Can you please try to reproduce creating such an invalid snapshot again, with restic 0.8.3? It'd be great to know how that could have happened...

@egonrassi


egonrassi commented Oct 24, 2018

I ran the backup again after restoring the data. No errors.
Ran prune. No errors.

$ restic check
using temporary cache in /tmp/restic-check-cache-104402917
repository 2ad03384 opened successfully, password is correct
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs
no errors were found

So I'm no closer to understanding what happened :(

@fd0 fd0 changed the title from "Unable to check ro restore a backup" to "Unable to check or restore a backup" Oct 25, 2018
