Fatal: number of used blobs is larger than number of available blobs! #2777
Please see #2663
I've tried step 1 in #2663 (comment). Skimming through the errors, I notice two things:
The usual first step is to rebuild the repository index using The backup command only adds new files to the repository and never removes anything, thus it sounds rather unlikely to be related to the missing blobs.
@MichaelEischer Today, after backing up the index, I ran
Unfortunately, after that,
Hmm, that looks like some pack files disappeared. If you still have some of the affected files, then you can instruct restic to fully back those up again using If I understand your setup correctly you have a single host using that repository and use
I'll make a copy of my backup script which runs all my
That is correct.
Okay. I ran
This time, running As before, quickly skimming the I am not sure how to proceed on fixing the remaining trees.
As the backup source data for the remaining broken trees seems to be no longer available, there are basically two options on how to proceed: Delete the snapshots with missing data. Or try whether you can remove the damaged files from the snapshots using PR #2731 (The PR is still experimental, I haven't had time to take a closer look at it, but I'm pretty sure that it won't cause additional damage to the repository. However, it might be possible that it creates a damaged rewritten snapshot or just doesn't work at all. And you should probably create a backup of the snapshots folder before, just in case).
restic freshly assembles the directory trees in each backup run. The only influence It's much more likely that the damage to the excluded files had been present for some time before these files were excluded, or that data pack files disappeared afterwards.
I see. Would there be any harm in not removing these snapshots, and simply waiting for them to be pruned out as part of my normal I don't feel comfortable running an unmerged PR on my repo, but thank you for the pointer to it. That seems like it will be a useful feature moving forward.
Thank you for the clear explanation. I appreciate your help on this. I still have no idea what might have caused this damage, but ¯\_(ツ)_/¯
Once you've run
@cdzombak Has your repository fixed itself by now? I'd like to close this issue as I don't see anything we could still investigate here.
@MichaelEischer thanks for following up. Unfortunately, it has not — in fact, for a long time now pruning has been failing every day. I can provide more information Monday, but the output reads:
I've intended for a while to write up another bug report, but other priorities have gotten in the way. The gist is, I have not changed how I am running restic, just updated it (currently on restic 0.11.0).
As mentioned previously, it is expected behavior that prune fails as long as damaged snapshots exist. And as far as I remember we didn't remove those? As the initially damaged snapshots should have been thinned out quite a bit by now, it might be worth a shot to finally get rid of those. You can use the long list reported by prune to search for affected snapshots: Do you have logs of earlier failing prune runs? It would be interesting to compare the blob list reported by the oldest prune run you can find which uses restic >=0.10.0 with the currently reported blob list. My expectation would be that no new blobs ended up in that list.
I'll start doing this, but
Unfortunately, the oldest log I have is from October 11, 2020. I think I was running restic 0.10.0 by that time. Comparing the blob list from October 11 to the one from this morning's failure, I note that no new blobs have appeared. The blob list from October 11 has 2583 blobs in it; the one from this morning (January 11, 2021) has 2573.
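The comparison described above can be sketched in a few lines of Python. This is only an illustration: it assumes the blob IDs from each prune log have already been extracted into plain text with one ID per line (the actual log format is not shown in this thread), and the names `load_ids` and `compare_blob_lists` are hypothetical helpers, not part of restic.

```python
def load_ids(text):
    """Return the set of non-empty, stripped lines (assumed: one blob ID per line)."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def compare_blob_lists(old_text, new_text):
    """Report which blob IDs appear only in the newer or only in the older list."""
    old_ids = load_ids(old_text)
    new_ids = load_ids(new_text)
    return {
        # IDs in the newer list that the older list did not contain.
        "newly_reported": sorted(new_ids - old_ids),
        # IDs in the older list that the newer list no longer contains.
        "no_longer_reported": sorted(old_ids - new_ids),
    }


if __name__ == "__main__":
    # Placeholder IDs for illustration only; real blob IDs are long hex strings.
    old = "aaa111\nbbb222\nccc333\n"
    new = "aaa111\nccc333\n"
    print(compare_blob_lists(old, new))
```

An empty `newly_reported` list corresponds to the observation that no new blobs have appeared, while `no_longer_reported` would hold the blobs that dropped off the list between the two runs.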
You can pass a list of blobs to
So there have been no further disappearing data packs in the last three months. So let's hope that the initial issue was just a one-time problem (although that makes debugging really hard).
Ah, nice! Suppose I should've checked the docs on that one :)
I think this is likely the case. I've so far searched for one blob and removed the three snapshots which used it. A subsequent search for a second missing blob didn't turn up any references, so perhaps there were just a few broken snapshots left, as you theorized. I'll go ahead and search for many more blobs at once and see what turns up. As before, @MichaelEischer, thanks for your help & patience here.
🎉 With some Bash-fu, I was able to search for around 300 of the affected blobs and remove the snapshots that used them. In total I ended up removing ~10 snapshots. Today's scheduled backup and prune job completed successfully. Pruning reported that 843.028 GiB will be freed, which ought to reduce my cloud storage bill :)
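The exact commands used for this search were not preserved in the thread. As a rough sketch only, the batching step might look like the following, assuming the search was done with `restic find --blob`, which accepts several blob IDs in one invocation. The helper name `build_find_commands` and the batch size of 50 are illustrative assumptions; the code only builds argument lists and does not run restic or touch a repository.

```python
def build_find_commands(blob_ids, batch_size=50):
    """Split blob IDs into batches and build one `restic find --blob` argv per batch.

    Assumption: `restic find --blob` is the search command; the batch size is
    arbitrary and only keeps each command line reasonably short.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Drop duplicate IDs while keeping their original order.
    unique_ids = list(dict.fromkeys(blob_ids))
    commands = []
    for start in range(0, len(unique_ids), batch_size):
        batch = unique_ids[start:start + batch_size]
        commands.append(["restic", "find", "--blob", *batch])
    return commands


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    for argv in build_find_commands(["aaa111", "bbb222", "aaa111", "ccc333"], batch_size=2):
        print(" ".join(argv))
```

Each argument list could then be run by hand or from a script, and the snapshots it reports reviewed before removing any of them.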
So I guess we can close the issue for now? If the problem reappears then it would be no problem to reopen the issue.
Agreed. 🍻
Output of `restic version`
How did you run restic exactly?
Environment:
Command:
`backup.sh`:
`.restic-cfg`:
`restic forget --prune` output:
What backend/server/service did you use to store the repository?
Wasabi
Expected behavior
I expect `forget --prune` to complete successfully.
Actual behavior
`forget --prune` reports `(MISSING)/ 594774 packs`, `(MISSING)/ 205 snapshots`, and dies with `Fatal: number of used blobs is larger than number of available blobs!`.
This backup job runs daily. This problem has occurred three times for me: on June 4, June 6, and June 9. (The run output above is from June 9.)
Note this means that `forget --prune` only fails intermittently. Looking at logs, I can confirm it completed successfully on June 3, 5, 7, and 8.
Steps to reproduce the behavior
I cannot deterministically and consistently reproduce this behavior.
Do you have any idea what may have caused this?
No. (Edited to add: not really.) However, after running `restic check` I notice that many, but not all, of the files listed are in folders I recently (probably a couple of weeks ago) added to `--exclude`.
I suspect this is related somehow: maybe a `prune` bug triggered by pruning files which had been included in old backups but excluded from more recent ones?
Do you have an idea how to solve the issue?
No.