Unclear how to recover from "pack ID does not match" errors from "restic check" #2191

Closed
adsbarratt opened this issue Feb 27, 2019 · 20 comments

Comments

@adsbarratt

Output of restic version

restic 0.9.2 compiled with go1.10.3 on windows/amd64 (for initial backup and check)
restic 0.9.4 compiled with go1.11.4 on windows/amd64 (for later diagnosis)

How did you run restic exactly?

AWS_ACCESS_KEY_ID=foo
AWS_SECRET_ACCESS_KEY=foo
RESTIC_REPOSITORY=s3:https://s3.amazonaws.com/myrepo
RESTIC_PASSWORD=foo:bar

Ran restic check --check-unused --read-data against the repository after generating 3 snapshots.

Output included:

Pack ID does not match, want 7691f738, got e3b0c442

What backend/server/service did you use to store the repository?

S3

Expected behavior

A simple means of resolving the error, given that I still have the original data available locally.

Actual behavior

restic find --pack 7691f738 got me as far as

Found blob 1b0ccb7af24b6221798ee900d9f5943e56a186f3f30707e5a7366033415a3e50
... in file /f/2018-04/foo-Sun.7z
(tree 3229cbefa8f2e1b09726ef36f701516d01ae0d8c80339125508ca62e23ab7479)
... in snapshot 5c8529d9 (2019-02-06 15:58:21)
Found blob 11e01454070d44043df7536c6b9a2cde366d7aa97a066daf10d1049d4c7d1c55
... in file /f/2018-04/foo-web-Sun.7z
(tree 3229cbefa8f2e1b09726ef36f701516d01ae0d8c80339125508ca62e23ab7479)
... in snapshot 5c8529d9 (2019-02-06 15:58:21)

but it's unclear how I can resolve the issues. Will re-uploading the affected files and then running a forget on the repo simply replace the broken pack with the new version? (Admittedly in a new snapshot, but that's OK.)

Steps to reproduce the behavior

Upload ~1.5TB of data from a USB-attached SSD from a Windows server to an S3-hosted repo.

If it makes any difference, the upload was rate-limited and interrupted / restarted at a couple of points in order to adjust the rate limit. Unfortunately I'm not sure whether any of the affected files were being uploaded at the time.

Do you have any idea what may have caused this?

Possible I/O issue, as the drives are connected to a Windows server via a USB to SATA cable and at least once (although not while restic was running) the event log shows that Windows believes the drive was disconnected and reconnected (it wasn't).

Do you have an idea how to solve the issue?

Better worked example of recovering from issues raised by restic check, particularly when --read-data was used.

Did restic help you or made you happy in any way?

Generally I've been very impressed with restic so far.

@cdhowie
Contributor

cdhowie commented Mar 28, 2019

This sounds like pack 7691f738 is corrupt. What is the output of sha256sum $repo/data/76/7691f738*?
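A minimal sketch of that check for a repository stored on a local filesystem, assuming GNU `sha256sum` is available. It relies on the fact, visible in the outputs later in this thread, that a pack file's name is the SHA-256 of its content. The function name `verify_pack` is made up for this illustration.

```shell
# Succeeds if the SHA-256 of the file equals its file name.
# verify_pack is a hypothetical helper, not a restic command.
verify_pack() {
  want=$(basename "$1")
  got=$(sha256sum "$1" | cut -d' ' -f1)
  if [ "$want" = "$got" ]; then
    echo "ok: $want"
  else
    echo "mismatch: want $want, got $got"
    return 1
  fi
}
```

It would be called as `verify_pack "$repo"/data/76/7691f738*`. As a side note, the `e3b0c442` in the original report matches the first eight hex digits of the SHA-256 of empty input, which may hint that the pack was read back as zero bytes.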

@askielboe
Contributor

We've seen issues like this before. In some cases it was probably due to hardware failure. I agree it could be handled better.

We had a lengthy discussion about it here: #1999

@jay-tuckey

Is there any clarity on how to recover from a pack ID error? Would I be right in thinking I can just remove the corrupted pack?

@MichaelEischer
Member

@jay-tuckey That depends a bit on the content of the pack file. You can move the damaged pack file somewhere else and create a copy of the index folder.

Afterwards run rebuild-index to remove the damaged pack file from the index. Then check might work or complain about missing files or trees. You can try to recover those by running backup --force ... on your backup set.
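The order of these steps matters, so here is a sketch of them as a shell function. It assumes the repository is a plain local directory (for S3 or rest-server the pack would have to be moved with the backend's own tools), that the full 64-character pack ID is passed, and that `RESTIC` may be overridden for a dry run. The function name and argument layout are invented for this example.

```shell
# Sketch for a LOCAL repository directory only. RESTIC can be overridden, e.g. RESTIC=echo.
RESTIC=${RESTIC:-restic}

# usage: repair_after_bad_pack <repo-dir> <full-pack-id> <safe-dir> <paths to back up...>
repair_after_bad_pack() {
  repo=$1; pack_id=$2; safe_dir=$3; shift 3
  mkdir -p "$safe_dir" || return 1
  # 1. Keep a copy of the index before it gets rewritten.
  cp -r "$repo/index" "$safe_dir/index.bak" || return 1
  # 2. Move (do not delete) the damaged pack out of the repository.
  prefix=$(printf '%s' "$pack_id" | cut -c1-2)
  mv "$repo/data/$prefix/$pack_id" "$safe_dir/" || return 1
  # 3. Rebuild the index so the blobs of that pack no longer count as present.
  $RESTIC -r "$repo" rebuild-index || return 1
  # 4. Re-read the source data so that missing blobs are stored again.
  $RESTIC -r "$repo" backup --force "$@" || return 1
  # 5. Verify the content of every pack, not just the repository structure.
  $RESTIC -r "$repo" check --read-data
}
```

Running it once with `RESTIC=echo` against a copy of the repository prints the restic commands instead of executing them, which is a cheap way to confirm the paths before touching real data.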

@jay-tuckey

Thanks @MichaelEischer , I can confirm that the process you described worked quite nicely.

I think it would be nice to have some sort of repo-repair process built into the tool, but I'm sure there are other higher priority features to add first. 🙂

@aawsome
Contributor

aawsome commented Aug 5, 2020

After #2827 has been merged, it is sufficient to run rebuild-index and run an arbitrary backup containing the missing data (if still present on your hard drive). After that all snapshots will be corrected.

@jay-tuckey I propose you close this issue now that #2827 has been merged. Maybe open a new issue if you are not satisfied or need more / better documentation.

@aawsome
Contributor

aawsome commented Aug 5, 2020

> I think it would be nice to have some sort of repo-repair process built into the tool, but I'm sure there are other higher priority features to add first. 🙂

Regarding repair functionality, there is now #2876, but redoing the backup is always the first and much better option.

@jay-tuckey

@aawsome doesn't look like I can close it, I think @adsbarratt needs to.

@jkirk

jkirk commented Oct 9, 2020

I have a similar problem and I do not know which issue fits better: #816 (is closed, but describes my problem best), #1999 or this one.

  % restic check
  using temporary cache in /tmp/restic-check-cache-272168381
  repository cb48001c opened successfully, password is correct
  created new cache in /tmp/restic-check-cache-272168381
  create exclusive lock for repository
  load indexes
  check all packs
  pack 644ca023: not referenced in any index
  pack 3e508e03: not referenced in any index
  pack 57080c60: not referenced in any index
  pack c16adbed: not referenced in any index
  [...]
  pack 8448731d: not referenced in any index
  pack 84a99983: not referenced in any index
  451 additional files were found in the repo, which likely contain duplicate data.
  You can run `restic prune` to correct this.
  check snapshots, trees and blobs
  no errors were found

  % restic prune
  repository cb48001c opened successfully, password is correct
  counting files in repo
  building new index for repo
  [0:00] 100.00%  21388 / 21388 packs
  repository contains 21388 packs (422434 blobs) with 96.734 GiB
  processed 422434 blobs: 27972 duplicate blobs, 1.828 GiB duplicate
  load all snapshots
  find data that is still in use for 68 snapshots
  [0:44] 100.00%  68 / 68 snapshots
  found 369973 of 422434 data blobs still in use, removing 52461 blobs
  will remove 0 invalid files
  will delete 835 packs and rewrite 1762 packs, this frees 6.566 GiB
  [2:55] 47.05%  829 / 1762 packs rewritten
  hash does not match id: want 7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2, got 424b5299980e46b507eaaa8c7f74095036ac0a9633e4f4e5e4fe3c66de19d722
  github.com/restic/restic/internal/repository.Repack
          /restic/internal/repository/repack.go:37
  main.pruneRepository
          /restic/cmd/restic/cmd_prune.go:242
  main.runPrune
          /restic/cmd/restic/cmd_prune.go:62
  main.glob..func19
          /restic/cmd/restic/cmd_prune.go:27
  github.com/spf13/cobra.(*Command).execute
          /home/build/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:842
  github.com/spf13/cobra.(*Command).ExecuteC
          /home/build/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950
  github.com/spf13/cobra.(*Command).Execute
          /home/build/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887
  main.main
          /restic/cmd/restic/main.go:86
  runtime.main
          /usr/local/go/src/runtime/proc.go:204
  runtime.goexit
          /usr/local/go/src/runtime/asm_amd64.s:1374

  % ls -l /srv/restic/home/data/7d/7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2
  -rw------- 1 root root 5963916 Nov  5  2019 /srv/restic/home/data/7d/7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2
  
  % sha256sum /srv/restic/home/data/7d/7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2
  424b5299980e46b507eaaa8c7f74095036ac0a9633e4f4e5e4fe3c66de19d722  /srv/restic/home/data/7d/7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2

Backups have been performed after 2019-11-05 (the date stamp of the hash file), but it seems the blob could not be recovered.
(The backups were made with restic v0.9.6; restic check and prune were run with restic v0.10.0. The repository is hosted on ext4 on Debian/stretch and served by restic rest-server; the backups themselves were performed on a remote server.)

@MichaelEischer
Member

The standard procedure would be to create a backup of the repository index, move the damaged pack file out of the repository, then run rebuild-index. Afterwards run backup with restic 0.10.0 to try to recover missing blobs. Afterwards run check --read-data to determine whether there are still snapshots with missing data and/or further damaged pack files. Do you have any idea how the pack file might have been damaged?

I've rebased @fd0's debug-1999 branch to current master branch and added the functionality to extract all intact parts of the damaged pack file. You can use it as follows:

  • Build the debug-1999 branch using go build -tags debug ./cmd/restic
  • Run restic debug examine --extract-pack <id-of-broken-pack-file> to extract all blobs in the damaged pack file into the current folder (it's best to use a temporary folder to reduce the chaos). The pack file must still be located in its proper place inside the repository, but it does not need to be listed in the repository index (i.e., there's no need to run rebuild-index; please do not run that command right now). The option --try-repair can be used to try to repair a single bit flip, and --repair-byte tries to repair a full byte at once (but is also much slower).
  • Move the damaged pack file out of the repository, create a backup of the index and run restic rebuild-index
  • Then run restic backup . in the folder which contains the extracted blobs
  • Now with a bit of luck restic check --read-data should no longer report errors or only a few ones
  • If the missing blobs are part of a file of which you still have the original, then run restic backup <original-file>, which will recover the missing blobs.

@jkirk

jkirk commented Oct 13, 2020

Thank you for your comment and assistance.

  • Ad "create a backup of the repository index": is it enough to create a backup of $REPO_HOME/index?
  • Ad "check --read-data": I ran check --check-unused --read-data after my comment (but before yours). The output looked like this:
  % restic check --check-unused --read-data
  [...]
  unused blob <data/8ff6ad5e> 
  unused blob <tree/f136a20c>
  unused blob <tree/f3e53c14> 
  read all data
  Pack ID does not match, want 7d87533f, got 424b5299
  [35:19] 100.00%  21388 / 21388 packs
  Fatal: repository contains errors
  • Our backup server ran out of disk space, that is why I checked the integrity of the repository in the first place. But the date stamp of the damaged blob indicates that the problem occurred last year. I have no clue what happened back then.

Building @fd0's debug-1999 branch looks quite complicated (to me). If it somehow helps you and/or the restic team, I would happily try this approach. If it is risky and/or the missing data could be recovered via the "standard procedure", I think I would prefer the simpler way.

But please advise: which path should I choose? :)

@MichaelEischer
Member

Yes, it is enough to create a copy of the index folder contained in the backup repository. The experimental code in the debug-1999 branch doesn't modify the repository; it just tries to salvage what's left of the damaged pack file. These salvaged parts are then added back to the repository using a normal backup run (you can use a regular restic binary for that step; the experimental code is only needed to extract data from the damaged pack file). This additional step of extracting the still usable parts from the pack file is the only difference from the standard procedure.

You can run restic find --pack 7d87533f to find out which files in which snapshots are affected by the damaged pack file. If you still have these files in their original state or you don't need the snapshot(s) then it would also be possible to recover using the standard procedure.
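Because `restic find` prints one stanza per blob occurrence, the output can get very long. The following sketch condenses it to unique snapshot/file pairs. It assumes the stanza layout shown elsewhere in this thread ("... in file" followed later by "... in snapshot"); `summarize_find` is a made-up helper name.

```shell
# Reads `restic find` output on stdin and prints unique "<snapshot>\t<file>" pairs.
summarize_find() {
  awk '
    # Remember the path from the most recent "... in file" line (trailing blanks stripped).
    /\.\.\. in file /     { sub(/^.*\.\.\. in file /, ""); sub(/[ \t]+$/, ""); file = $0 }
    # The snapshot line closes the stanza; its fourth field is the short snapshot ID.
    /\.\.\. in snapshot / { print $4 "\t" file }
  ' | sort -u
}
```

It could be used as `restic find --pack 7d87533f | summarize_find`, which leaves one line per affected file and snapshot.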

The instructions to build the branch are as follows. Download a current version of the Go compiler (at least 1.13, use 1.15 if possible) and make sure that you can call the compiler directly as go by adding it to the PATH.

git clone https://github.com/restic/restic.git
cd restic
git checkout debug-1999
go build -tags debug ./cmd/restic

Afterwards you will have a new restic binary in the current folder.

@jkirk

jkirk commented Oct 15, 2020

@MichaelEischer Wow, building restic was easier than expected! (I used go v1.14 from Debian/buster-backports).

  • Step 0: Took a backup of the $REPO_HOME/index and the damaged pack-file $REPO_HOME/data/7d/7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2
  • Step 1: Build the debug-1999 branch using go build -tags debug ./cmd/debug
  • Step 2a: Run restic debug examine --extract-pack <id-of-broken-pack-file>:
  % mkdir tmp
  % cd tmp
  % ../restic debug examine --extract-pack 7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2
  debug enabled
  repository cb48001c opened successfully, password is correct
  examine 7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2
    file size is 5963916
    wanted hash 7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2, got 424b5299980e46b507eaaa8c7f74095036ac0a9633e4f4e5e4fe3c66de19d722                                                                                       
    ========================================
    looking for info in the indexes
      index [19aad8bb 3a3f918e 26c6eefe 94d2f652 3884d3b0 2438df87 9f16641b d91fb07a fdf84442 95377d79 3fdc5686 d8694cf2]:                                                                                                                   
        data blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d, offset 0     , raw length 2891301
        data blob 11b4dd6c7c1d645405b9caf06bb42db3124f96a8edbc3d9359cbf857adbb5941, offset 2891301, raw length 1028375
        data blob d4eafb2419ada6dbb987bf6ad32a092b0986ba74394c9ad6ddc5a78d2458fbaf, offset 3919676, raw length 2044093
        file sizes match
        loading blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d at 0 (length 2891301)
  error decrypting blob: ciphertext verification failed
  decrypt of blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d stored at damaged-97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d.bin
        loading blob 11b4dd6c7c1d645405b9caf06bb42db3124f96a8edbc3d9359cbf857adbb5941 at 2891301 (length 1028375)
           successfully decrypted blob (length 1028343), hash is 11b4dd6c7c1d645405b9caf06bb42db3124f96a8edbc3d9359cbf857adbb5941, ID matches
  decrypt of blob 11b4dd6c7c1d645405b9caf06bb42db3124f96a8edbc3d9359cbf857adbb5941 stored at correct-11b4dd6c7c1d645405b9caf06bb42db3124f96a8edbc3d9359cbf857adbb5941.bin
        loading blob d4eafb2419ada6dbb987bf6ad32a092b0986ba74394c9ad6ddc5a78d2458fbaf at 3919676 (length 2044093)
           successfully decrypted blob (length 2044061), hash is d4eafb2419ada6dbb987bf6ad32a092b0986ba74394c9ad6ddc5a78d2458fbaf, ID matches
  decrypt of blob d4eafb2419ada6dbb987bf6ad32a092b0986ba74394c9ad6ddc5a78d2458fbaf stored at correct-d4eafb2419ada6dbb987bf6ad32a092b0986ba74394c9ad6ddc5a78d2458fbaf.bin
    ========================================
    inspect the pack itself
        data blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d, offset 0     , raw length 2891301
        data blob 11b4dd6c7c1d645405b9caf06bb42db3124f96a8edbc3d9359cbf857adbb5941, offset 2891301, raw length 1028375
        data blob d4eafb2419ada6dbb987bf6ad32a092b0986ba74394c9ad6ddc5a78d2458fbaf, offset 3919676, raw length 2044093
        file sizes match
  • Step 2b: Run restic debug examine --try-repair <id-of-broken-pack-file> (still running):
% restic debug examine --try-repair 7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2
debug enabled
repository cb48001c opened successfully, password is correct
examine 7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2
  file size is 5963916
  wanted hash 7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2, got 424b5299980e46b507eaaa8c7f74095036ac0a9633e4f4e5e4fe3c66de19d722
  ========================================
  looking for info in the indexes
    index [19aad8bb 26c6eefe 3a3f918e 94d2f652 2438df87 9f16641b 3884d3b0 d91fb07a fdf84442 95377d79 3fdc5686 d8694cf2]:
      data blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d, offset 0     , raw length 2891301
      data blob 11b4dd6c7c1d645405b9caf06bb42db3124f96a8edbc3d9359cbf857adbb5941, offset 2891301, raw length 1028375
      data blob d4eafb2419ada6dbb987bf6ad32a092b0986ba74394c9ad6ddc5a78d2458fbaf, offset 3919676, raw length 2044093
      file sizes match
      loading blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d at 0 (length 2891301)
error decrypting blob: ciphertext verification failed
        trying to repair blob with single bit flip
         spinning up 4 worker functions
4398 byte of 2891301 done (0.00%), 219 byte per second, ETA 3h40m10s

I also ran restic find --pack before which brought up a "zillion" lines like these:

% restic find --pack 7d87533f
[...]
Found blob 11b4dd6c7c1d645405b9caf06bb42db3124f96a8edbc3d9359cbf857adbb5941                                     
 ... in file /srv/home/example.file.1
     (tree dba61b66d1115e9f90047cf1bea2c4e75b9ff22537d2691c8a7020ecccaca7e3)  
 ... in snapshot ffa642f5 (2020-10-09 23:00:24)                                                                        
Found blob 11b4dd6c7c1d645405b9caf06bb42db3124f96a8edbc3d9359cbf857adbb5941                           
 ... in file /srv/home/example.file.2                                                       
     (tree dba61b66d1115e9f90047cf1bea2c4e75b9ff22537d2691c8a7020ecccaca7e3)                                    
 ... in snapshot ffa642f5 (2020-10-09 23:00:24)                                                          
Found blob 11b4dd6c7c1d645405b9caf06bb42db3124f96a8edbc3d9359cbf857adbb5941                                
 ... in file /srv/home/example.file.2                                                
     (tree dba61b66d1115e9f90047cf1bea2c4e75b9ff22537d2691c8a7020ecccaca7e3)                                      
 ... in snapshot ffa642f5 (2020-10-09 23:00:24)                                                     
Found blob 11b4dd6c7c1d645405b9caf06bb42db3124f96a8edbc3d9359cbf857adbb5941                                                 
 ... in file /srv/home/example.file.3                
     (tree b15611c27a33943f9cd15c61acd2efb0192911235fc93e0fe532ec5da7172402)                                                                                                                 
 ... in snapshot ffa642f5 (2020-10-09 23:00:24)

What shall I do with this info? Shall I check whether each of the listed files still exists?
I could do so, but I cannot say whether the files have been modified in the meantime.
Is it worth the effort?

I am now waiting for --try-repair to complete and will then follow the next steps of your instructions:

  • Move the damaged pack file out of the repository, create a backup of the index and run restic rebuild-index
  • Then run restic backup . in the folder which contains the extracted blobs
  • Now with a bit of luck restic check --read-data should no longer report errors or only a few ones
  • If the missing blobs are part of a file of which you still have the original, then run restic backup <original-file>, which will recover the missing blobs.

The last one is tricky. How do I know "If the missing blobs are part of a file of which you still have the original"?
Am I doing things right so far? (And have you spotted anything new in the meantime?)

Anyway! Thank you so much for your help and effort @MichaelEischer ! Very much appreciated!

@MichaelEischer
Member

As only one of the three blobs in the pack file is damaged, you can focus the search for affected files a bit more: restic find --blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d. It's probably best to check whether you still have one of these files and just create a copy to make sure that the file doesn't change in the meantime. If one of these files is still in its original state, then that would allow recovering the damaged blob.

> The last one is tricky. How do I know "If the missing blobs are part of a file of which you still have the original"?

I probably should have worded this the other way round: Just run a normal backup at the end (using restic 0.10.0). If a file which contains the missing blob still exists, then backup will pick that blob up again and thus fix the repository. To know whether that has worked run check afterwards.

@jkirk

jkirk commented Oct 20, 2020

What a journey! I think the finish line is close! 😉

restic find --blob also gave me a 'zillion' lines. But only now did I have the 'spirit' to investigate the lines closely:

  % restic find --blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d
  debug enabled
  repository cb48001c opened successfully, password is correct
  Found blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d
   ... in file /srv/home/example.1
       (tree 37f8abb36c231afa2b9db4110bac0a6746fb872c832725aeec7d3aa981eb47c2)
   ... in snapshot 014d01eb (2020-06-03 23:00:03)
  Found blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d
   ... in file /srv/home/example.2
       (tree 423e4fdd47864497c617d098898ec06f951baa5203df96528d08b672f85f27e9)
   ... in snapshot 014d01eb (2020-06-03 23:00:03)
  Found blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d
   ... in file /srv/home/example.3
       (tree 423e4fdd47864497c617d098898ec06f951baa5203df96528d08b672f85f27e9)
   ... in snapshot 014d01eb (2020-06-03 23:00:03)
  Found blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d
   ... in file /srv/home/example.4
       (tree 423e4fdd47864497c617d098898ec06f951baa5203df96528d08b672f85f27e9)
   ... in snapshot 014d01eb (2020-06-03 23:00:03)
  [...]
  Found blob 97d3145d913e7ef270472dd1d6f93b1c7e7a6dd8c3d1e82073f6d8f735d0104d
   ... in file /srv/home/example.4
       (tree 4df9350afa47f3e318fedf8e7febea29aaa7e21b60816384537d1154bfe2f01d)
   ... in snapshot ffa642f5 (2020-10-09 23:00:24)

I put the output of restic find in a tmp file and ran the following. Indeed, only 4 files are affected:

% grep "in file" tmp | sort | uniq
     ... in file /srv/home/example.1
     ... in file /srv/home/example.2
     ... in file /srv/home/example.3
     ... in file /srv/home/example.4

All files still exist. They are not exactly the same (the sha256sum of each file differs), but their date stamps fall between 4th and 6th November 2019.

> I probably should have worded this the other way round: Just run a normal backup at the end (using restic 0.10.0). If a file which contains the missing blob still exists, then backup will pick that blob up again and thus fix the repository. To know whether that has worked run check afterwards.

But here we have a problem: the files exist (although I can not 100% guarantee that the blob is part of the files, how to check that?) and backups were taken of these files in the meanwhile but the repository has still not recovered. What to do? Just delete the pack file 7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2?

@MichaelEischer
Member

> But here we have a problem: the files exist (although I can not 100% guarantee that the blob is part of the files, how to check that?) and backups were taken of these files in the meanwhile but the repository has still not recovered.

As long as the damaged pack file is still contained in the repository index, restic thinks that its blobs are already in the repository and won't back them up again. To remove the files from the repository index, move the pack 7d87533f... to some place outside the data folder and then run rebuild-index. Afterwards restic will pick up missing blobs during a backup run.
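The effect can be illustrated with a toy model. This is not restic's real index format or code: the "index" here is just a text file of blob IDs, and `backup_chunk` and `forget_blob` are names invented for the illustration. It only shows why a blob that is still listed in the index is never stored again, and why it is stored again once the entry is gone.

```shell
# Toy model only: a text file of SHA-256 IDs stands in for the repository index.
index=$(mktemp)

# "Back up" one chunk: store it only if its ID is not yet listed in the index.
backup_chunk() {
  id=$(printf '%s' "$1" | sha256sum | cut -d' ' -f1)
  if grep -qx "$id" "$index"; then
    echo "skipped"            # index claims the blob exists, even if its pack is damaged
  else
    echo "$id" >> "$index"
    echo "stored"
  fi
}

# Stand-in for rebuild-index after the damaged pack was moved away:
# the blob that lived in that pack disappears from the index.
forget_blob() {
  id=$(printf '%s' "$1" | sha256sum | cut -d' ' -f1)
  grep -vx "$id" "$index" > "$index.new" || true
  mv "$index.new" "$index"
}
```

Calling `backup_chunk data` twice prints `stored` and then `skipped`; after `forget_blob data`, the next `backup_chunk data` prints `stored` again, which mirrors a backup run picking up the missing blobs.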

@MichaelEischer MichaelEischer added category: backup state: need investigating cause unknown, need investigating/troubleshooting category: prune and removed category: backup labels Nov 15, 2020
@jkirk

jkirk commented Dec 28, 2020

Sorry for the very long delay. (We switched to a new repository and this issue got stuck in the backlog.)

I finally took some time to bring this issue to an end. restic has been updated to 0.11 in the meantime; everything else stayed the same.

As you said, I moved the pack 7d87533f... away, but I made a mistake: I took a backup and ran prune before running rebuild-index:

  ~ % rm /srv/restic/home/data/7d/7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2
  
  # restic backup from remote #
  
  ~ % ls -l /srv/restic/home/data/7d/7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2
  ls: cannot access '/srv/restic/home/data/7d/7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2': No such file or directory
  
  ~ % restic  prune
  repository cb48001c opened successfully, password is correct
  counting files in repo
  building new index for repo
  [0:40] 100.00%  24688 / 24688 packs
  repository contains 24688 packs (607331 blobs) with 111.573 GiB
  processed 607331 blobs: 201889 duplicate blobs, 13.359 GiB duplicate
  load all snapshots
  find data that is still in use for 70 snapshots
  [0:43] 100.00%  70 / 70 snapshots
  Fatal: [<data/97d3145d> <data/11b4dd6c>] not found in the new index
  Data blobs seem to be missing, aborting prune to prevent further data loss!
  Please report this error (along with the output of the 'prune' run) at
  https://github.com/restic/restic/issues/new/choose

  ~ % restic rebuild-index
  repository cb48001c opened successfully, password is correct
  counting files in repo
  [0:00] 100.00%  24688 / 24688 packs
  finding old index files
  saved new indexes as [bf18b651 a531a9a3 b65a255c 49a6ec68 214576d7 41857df6 94f1247f 1073b230 a120bbee]
  remove 22 old index files
  [0:00] 100.00%  22 / 22 files deleted

  ~ % restic check
  using temporary cache in /tmp/restic-check-cache-839912599
  repository cb48001c opened successfully, password is correct
  created new cache in /tmp/restic-check-cache-839912599
  create exclusive lock for repository
  load indexes
  check all packs
  check snapshots, trees and blobs
  error for tree e4c67fd8:
    tree e4c67fd8: file "poster_printing.pdf" blob 13 size could not be found
    tree e4c67fd8, blob 97d3145d: not found in index
  [...]
  Fatal: repository contains errors

Analyzed the output of restic check, there were 15 files, two blobs and 28 trees affected (if I read it right):

  ~ % awk '/file/{ print $4 }' restic-home.check | sort | uniq | wc -l
  15
  
  ~ % awk '/, blob/{ print $4 }' restic-home.check | sort | uniq
  11b4dd6c:
  97d3145d:

  ~ % awk '/for tree/{ print $4 }' restic-home.check | sort | uniq | wc -l
  28

Then I did another backup, which reported errors (but "only" 4 files were reported as not found; I expected all 15 files mentioned in the restic check output above):

  error: parts of /srv/home/example.1 not found in the repository index; storing the file again
  [...]
  Warning: failed to read all source data during backup

and finally ran restic check again:

  ~ % restic check
  using temporary cache in /tmp/restic-check-cache-153638149
  repository cb48001c opened successfully, password is correct
  created new cache in /tmp/restic-check-cache-153638149
  create exclusive lock for repository
  load indexes
  check all packs
  check snapshots, trees and blobs
  no errors were found

Looks good, but the pack file 7d87533f... is still missing (which is kind of suspicious):

  ~ % sudo ls -l /srv/restic/home/data/7d/7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2
  ls: cannot access '/srv/restic/home/data/7d/7d87533f4f645e82ce38dbf95a9386aaf9cf33947501b20a62a9243134a6c0f2': No such file or directory

Now I am running restic check --check-unused --read-data (with a lot of unused blob <data/...> and unused blob <tree/...> lines already).

I did not try --try-repair (as in #2191 (comment)). Should I try that? Does it even make sense? Is there anything else I missed or should do? Did I lose data by running restic prune too early? Apart from that, restic seems to behave OK. What do you think?

@jkirk

jkirk commented Dec 28, 2020

OK, restic check --check-unused --read-data finished, and it reports errors:

restic check --check-unused --read-data
[...]
unused blob <data/865beab5>
unused blob <tree/703bb2a7>
unused blob <tree/c12f1cea>
read all data
[40:53] 100.00%  24689 / 24689 packs
Fatal: repository contains errors

What now?

@MichaelEischer
Member

MichaelEischer commented Dec 28, 2020

@jkirk The unplanned backup run shouldn't cause any additional damage, and the prune run failed during the initial integrity check, so no harm was done there either. It is possible that the damaged files had different filenames but still shared the same content; that would be a benign explanation for the lower-than-expected number of reported file recoveries.

The name of a pack file depends on the exact file content, which also includes random bytes for encryption. This effectively guarantees that restic never creates the same pack filename again. The recovered file chunks are now stored in some other pack files. That is, it is expected that the damaged pack file is not recreated.
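A rough simulation of that point, under loudly stated assumptions: this is not restic's pack format or its encryption. A random 16-byte prefix merely stands in for the per-encryption random bytes, and `pack_name` is a hypothetical helper. It shows that hashing "random bytes plus identical payload" gives a different name on every run.

```shell
# Simulation only: random prefix + payload, hashed to produce a "pack name".
pack_name() {
  { head -c 16 /dev/urandom; printf '%s' "$1"; } | sha256sum | cut -d' ' -f1
}

# Same payload, two runs: the names are expected to differ.
first=$(pack_name "same chunk of file data")
second=$(pack_name "same chunk of file data")
[ "$first" != "$second" ] && echo "names differ, as expected"
```

So a recovered blob ends up in a newly named pack, and the old name 7d87533f... staying absent is not a sign of a problem.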

The --check-unused option is rather uninteresting, it merely reports that a few blobs could be pruned, which is expected once a snapshot was forgotten or a backup run was interrupted. Please just use restic check --read-data to check the repository integrity.

@MichaelEischer MichaelEischer added state: need feedback waiting for feedback, e.g. from the submitter and removed state: need investigating cause unknown, need investigating/troubleshooting labels Jan 10, 2021
@MichaelEischer
Member

Closing in favor of #828.
