
prune: There is not enough space on the disk. #1140

Open
stuertz opened this issue Jul 27, 2017 · 18 comments

@stuertz (Contributor) commented Jul 27, 2017

Output of restic version

restic 0.7.1
compiled with go1.8.3 on windows/amd64

How did you start restic exactly? (Include the complete command line)

d:\restic\restic -r d:\restic\ prune

What backend/server/service did you use?

Filesystem (local usb windows, ntfs partition)

Expected behavior

reduce disk usage

Actual behavior

λ d:\restic\restic -r d:\restic\ prune
counting files in repo
building new index for repo
[44:35] 100.00%  99865 / 99865 packs
incomplete pack file (will be removed): 7de8bd2c65eacc121bb689d748caa95139b1ad84e6a661bbe9704b86774949d9
repository contains 99864 packs (7063132 blobs) with 459.184 GiB bytes
processed 7063132 blobs: 203673 duplicate blobs, 14.243 GiB duplicate
load all snapshots
find data that is still in use for 30 snapshots
[33:27] 100.00%  30 / 30 snapshots
found 3338752 of 7063132 data blobs still in use, removing 3724380 blobs
will remove 1 invalid files
will delete 15626 packs and rewrite 12427 packs, this frees 106.755 GiB
write \\?\d:\restic\data\9c\9c620cef19836b6f78fb9cbcc99353756f94e13d277e3a5981f3c871b9700be5: There is not enough space on the disk.
Write
restic/backend/local.(*Local).Save
        src/restic/backend/local/local.go:110
restic/repository.(*Repository).savePacker
        src/restic/repository/packer_manager.go:122
restic/repository.(*Repository).SaveAndEncrypt
        src/restic/repository/repository.go:203
restic/repository.(*Repository).SaveBlob
        src/restic/repository/repository.go:529
restic/repository.Repack
        src/restic/repository/repack.go:103
main.pruneRepository
        src/cmds/restic/cmd_prune.go:242
main.runPrune
        src/cmds/restic/cmd_prune.go:83
main.glob..func14
        src/cmds/restic/cmd_prune.go:23
github.com/spf13/cobra.(*Command).execute
        src/github.com/spf13/cobra/command.go:647
github.com/spf13/cobra.(*Command).ExecuteC
        src/github.com/spf13/cobra/command.go:726
github.com/spf13/cobra.(*Command).Execute
        src/github.com/spf13/cobra/command.go:685
main.main
        src/cmds/restic/main.go:63
runtime.main
        /usr/local/go/src/runtime/proc.go:185
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:2197

Steps to reproduce the behavior

Having a small partition while pruning a lot of files.

What I did was exactly:

  • had some space on the partition
  • ran forget, which marked several backups for removal
  • ran prune
  • got the above error
  • removed other files on the partition
  • retried the prune
  • got the above error again
  • again, the device was completely filled

Now I'm stuck. Is there anything that can be safely removed?

Do you have any idea what may have caused this?

As mentioned in #725, in case of an error the previously written files are kept and not removed.

@middelink (Member) commented Jul 27, 2017

Set TMPDIR to a mountpoint with some room; restic generates pack files locally when compacting them.
You can also run the prune from a different machine if need be.
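That workaround can be sketched as follows (a minimal sketch based on the comment above; the repository path is hypothetical, and mktemp merely stands in for a directory on a mount with free space):

```shell
# Point restic's temporary directory at a filesystem with free space
# before pruning. mktemp here stands in for a roomy mountpoint.
export TMPDIR=$(mktemp -d)
echo "TMPDIR=$TMPDIR"
# restic -r /path/to/repo prune   # temporary pack files now land in $TMPDIR
```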

@fd0 (Member) commented Jul 30, 2017

The problem here is that restic is very conservative: it only removes files from the repo once it has moved all still-used data to other files, so removing files is the very last step. If the file system where the repo is located fills up before that, restic returns the error. We can try handling this as a special case.

@stuertz (Contributor, Author) commented Jul 31, 2017

@middelink: Setting TMPDIR to another location didn't help.

@mbiebl commented Aug 15, 2017

I've run into this issue as well (my backup location is a local disk). Tried to clean up a few older snapshots, then ran prune, which aborted with the above error as the disk was full.
check shows a lot of dangling pack files:

$ restic check
Create exclusive lock for repository
Load indexes
Check all packs
pack 0011a3706907383d629e94094a793267a8e2d1bda3b71daca760957b737f8052: not referenced in any index
[1750 lines snipped]
pack ffc304866b692599c2b8b7a51209e96a26009bd85efce2a497633527a485e0c3: not referenced in any index
Check snapshots, trees and blobs
Fatal: repository contains errors

How can I recover from this situation, or is my backup repository hosed?
Could restic be more resilient here and pre-calculate the disk space a prune operation needs, warning or aborting if the available space is not sufficient?
It looks like prune required dozens of GB for the repack; could forgetting old snapshots and reclaiming that space be made more efficient?

@fd0 (Member) commented Aug 15, 2017

I think we can improve the situation, I'll think about it. It seems users regularly run into this (thanks for the reports!).

@mbiebl commented Aug 15, 2017

To give some background: I've created a backup of my /home directory to a local disk. It's about 30GB of data. I had about 15GB of free space left on the backup location (external hard drive) when I tried to run forget+prune.

@McKael (Contributor) commented Nov 11, 2017

FWIW, I've been hit by this as well.
I've increased the backup partition size several times, but not enough... And each time it fails, the repository gets bigger.

The repository is now 5 times its usual size :/

I wish there was at least a proper way to clean up failed prunes (or redundant data).

@fd0 (Member) commented Nov 21, 2017

I'm aware of this limitation of prune. As a workaround, you could remove all files from the repo that were added after the prune run started. Since prune only ever removes data at the very end of a run, that shouldn't touch any data which isn't also stored in a file that was present before. If in doubt, don't remove the files; move them to a directory outside the repo and run restic check first.
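That workaround can be sketched on a throwaway directory tree (the paths, pack names, and cutoff time are all hypothetical, and GNU find/mv/touch are assumed):

```shell
# Stand-ins for the repository and a quarantine directory outside it.
repo=$(mktemp -d)
quarantine=$(mktemp -d)
mkdir -p "$repo/data/9c"
touch -d '2017-11-01' "$repo/data/9c/old-pack"   # present before the prune
touch -d '2017-11-21' "$repo/data/9c/new-pack"   # written by the failed prune
# Move (don't delete) everything created after the prune started:
find "$repo/data" -type f -newermt '2017-11-10' -exec mv -t "$quarantine" {} +
ls "$quarantine"                                 # → new-pack
# Before deleting the quarantined files, verify the repo is intact:
# restic -r "$repo" check
```

Only once restic check passes should the quarantined files be deleted for good.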

@stefan-as commented Oct 23, 2018

Any news on this issue? We ran into it on a 10 TB Hetzner Storage Box, where there is no way to increase the disk size :-(

@fd0 (Member) commented Oct 23, 2018

You can try the code in the prune-aggressive branch, which removes files that are completely unneeded before starting the repack process.

@stefan-as commented Oct 24, 2018

Thanks for your quick reply! I built the prune-aggressive branch, but it does not seem to solve the situation:

Remove(<data/78348004bc>) returned error, retrying after 524.954656ms: Chmod: chmod /mnt/backup/kb_restic/data/78/78348004bc72698753b0f0991b432d2dd1cfa6868362b422d10e8ccb5f91413e: no space left on device
Remove(<data/78348004bc>) returned error, retrying after 565.788962ms: Chmod: chmod /mnt/backup/kb_restic/data/78/78348004bc72698753b0f0991b432d2dd1cfa6868362b422d10e8ccb5f91413e: no space left on device
Remove(<data/78348004bc>) returned error, retrying after 843.003712ms: Chmod: chmod /mnt/backup/kb_restic/data/78/78348004bc72698753b0f0991b432d2dd1cfa6868362b422d10e8ccb5f91413e: no space left on device
Remove(<data/78348004bc>) returned error, retrying after 1.538368257s: Chmod: chmod /mnt/backup/kb_restic/data/78/78348004bc72698753b0f0991b432d2dd1cfa6868362b422d10e8ccb5f91413e: no space left on device
Remove(<data/78348004bc>) returned error, retrying after 2.141937077s: Chmod: chmod /mnt/backup/kb_restic/data/78/78348004bc72698753b0f0991b432d2dd1cfa6868362b422d10e8ccb5f91413e: no space left on device
Remove(<data/78348004bc>) returned error, retrying after 2.495749063s: Chmod: chmod /mnt/backup/kb_restic/data/78/78348004bc72698753b0f0991b432d2dd1cfa6868362b422d10e8ccb5f91413e: no space left on device
Save(<lock/028fdb788e>) returned error, retrying after 12.885220654s: OpenFile: open /mnt/backup/kb_restic/locks/028fdb788ea2622df15c230c42370fc5a850696a03827f2c36f7577669e845c0: no space left on device
Remove(<data/78348004bc>) returned error, retrying after 8.041276629s: Chmod: chmod /mnt/backup/kb_restic/data/78/78348004bc72698753b0f0991b432d2dd1cfa6868362b422d10e8ccb5f91413e: no space left on device
Remove(<data/78348004bc>) returned error, retrying after 8.552941091s: Chmod: chmod /mnt/backup/kb_restic/data/78/78348004bc72698753b0f0991b432d2dd1cfa6868362b422d10e8ccb5f91413e: no space left on device
Save(<lock/028fdb788e>) returned error, retrying after 14.277652003s: OpenFile: open /mnt/backup/kb_restic/locks/028fdb788ea2622df15c230c42370fc5a850696a03827f2c36f7577669e845c0: no space left on device
Remove(<data/78348004bc>) returned error, retrying after 8.742477617s: Chmod: chmod /mnt/backup/kb_restic/data/78/78348004bc72698753b0f0991b432d2dd1cfa6868362b422d10e8ccb5f91413e: no space left on device
Remove(<data/78348004bc>) returned error, retrying after 10.005156128s: Chmod: chmod /mnt/backup/kb_restic/data/78/78348004bc72698753b0f0991b432d2dd1cfa6868362b422d10e8ccb5f91413e: no space left on device
unable to refresh lock: OpenFile: open /mnt/backup/kb_restic/locks/028fdb788ea2622df15c230c42370fc5a850696a03827f2c36f7577669e845c0: no space left on device
unable to remove file 78348004 from the repository
[...]
[11:44:32] 0.25%  749 / 304270 packs deleted

@fd0 (Member) commented Oct 24, 2018

Hm, does it remove any files at all? I would've thought so...

@stefan-as commented Oct 24, 2018

The bottom line of the output suggests some files are being removed, but I assume it will take several more hours before we have a final answer...

@jo-so commented Nov 7, 2018

Is there a way to find the IDs of the unused indexes so I can delete them manually?

@jo-so commented Nov 7, 2018

I've solved the problem by temporarily moving some files to a different filesystem (another disk or a USB device). Beware, this uses Zsh features: $i:t == $(basename $i)

% cd …/backup/data
% mkdir /var/tmp/ba
    # check how much data can be moved by adding more characters inside the brackets:
% du -sch [ef]*
% for i in [ef]*; do mv $i /var/tmp/ba && mkdir $i && mount --bind /var/tmp/ba/$i $i; done
% restic prune
% for i in /var/tmp/ba/*; do umount $i:t && rmdir $i:t && mv $i .; done

@stefan-as commented Nov 7, 2018

@fd0 some updates on the prune run with the prune-aggressive branch: even if it does reduce disk usage in the long run, this approach does not seem usable because of the time it takes to complete.

Remove(<data/2222cd9bac>) returned error, retrying after 16.70137233s: Chmod: chmod /mnt/backup/kb_restic/data/22/2222cd9bac8304a6d080b3f969e26f1bae8648966cdb7435a6ffd59a95ac5487: no space left on device
Remove(<data/2222cd9bac>) returned error, retrying after 14.017690407s: Chmod: chmod /mnt/backup/kb_restic/data/22/2222cd9bac8304a6d080b3f969e26f1bae8648966cdb7435a6ffd59a95ac5487: no space left on device
[155:30:25] 5.80%  17263 / 297794 packs deleted

This is going to finish in spring 2019 ;-)

@Vanav commented Nov 24, 2018

I confirm the bug.

  1. So there is no workaround, and if the disk is full and no other disk is available, it's game over?
     Update: I have a 2000 GB repository and the disk is full. I had to manually delete 131 GB of repository files; only then was prune able to finish.
  2. How can I estimate the space required for prune before it's game over?
  3. Is there a way to manually delete files and repair the repository afterwards? Now I have:

$ restic check
using temporary cache in /tmp/restic-check-cache-873644429
repository 652d090b opened successfully, password is correct
created new cache in /tmp/restic-check-cache-873644429
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs
error for tree b849b624:
  tree b849b6241edda87155006f49e36e3005640026636d97847a016c9a6c557db8d5 not found in repository
...
Fatal: repository contains errors

Update 2: I was able to repair the repository with restic forget for the deleted IDs. Sure, this is not a universal way, and I've lost 131 GB of recent backups, so the questions remain.

@Kjwon15 commented Mar 9, 2019

I found a temporary solution: use overlayfs and rsync the result back:

mkdir /tmp/{upper,overlay,workdir}
mount -t overlay overlay -o upperdir=/tmp/upper,lowerdir=/mnt/backup,workdir=/tmp/workdir /tmp/overlay
restic -r /tmp/overlay prune
rsync --delete -rulHtv /tmp/overlay/ /mnt/backup/
umount /tmp/overlay
rm -r /tmp/{upper,overlay,workdir}