Backup fails with Fatal: failed to refresh lock in time is host is suspended in the meantime #4274

Closed
twaldecker opened this issue Mar 29, 2023 · 5 comments · Fixed by #4374

Comments

@twaldecker

Output of restic version

restic 0.15.1 compiled with go1.19.5 on windows/amd64

How did you run restic exactly?

I am trying to adopt restic as my backup solution and built a rather big file list, which should result in a ~6 TB backup.
All paths to be backed up are local on a Windows machine.

PS C:\Users\Thomas> restic -r X:\restic-repo backup --files-from S:\path\backuplist
enter password for repository:
repository 3f1dcb70 opened (version 2, compression level auto)
no parent snapshot found, will read all files
error: open \\?\C:\Users\Thomas\SkyDrive\.849UUID: The process cannot access the file because it is being used by another process.PXL_202XXXX.MP.jpg
Fatal: failed to refresh lock in time
Fatal: unable to save snapshot: context canceled

What backend/server/service did you use to store the repository?

The backup location is a harddisk plugged in via USB3.1.

Expected behavior

Complete the backup correctly

Actual behavior

Storage space is used, but backup failed and no snapshots are created.

Steps to reproduce the behavior

Currently reproducible on my machine. It was the third try already.

Do you have any idea what may have caused this?

The first try resulted in errors because some of the files to be backed up contain a virus or potentially unwanted software.
I think these errors come from the integrated Windows security and are correct, as the directory does indeed contain virus samples.

error: open \\?\E:\FH\ee\6\6-Network Security\Virus\dd153.zip: Operation did not complete successfully because the file contains a virus or potentially unwanted software.
error: open \\?\E:\FH\ee\6\6-Network Security\Virus\subseven-Dateien\bwg503.zip: Operation did not complete successfully because the file contains a virus or potentially unwanted software.
ly because the file contains a virus or potentially unwanted software.
Fatal: failed to refresh lock in time
error: no result
Fatal: unable to save snapshot: context canceled

There is also the Windows power management, which suspends the machine after a short period of time. I have disabled it now and will try again.

Do you have an idea how to solve the issue?

no

Did restic help you today? Did it make you happy in any way?

Not today and not yet, but I hope so in the future. Thanks for restic!

@twaldecker
Author

I disabled power management and used --use-fs-snapshot. The backup ran for 16 hours, but it worked this time.

Incremental backups now run in 4 minutes, which is quite good. So happy now!

@carns

carns commented Apr 3, 2023

Thank you @twaldecker for posting this! I was having the same problem and didn't realize that the laptop was triggering the problem by suspending itself while my initial backup was running; I was searching for other problems before reading this.

Same solution worked for me; the initial backup completed without any trouble at all once I set the laptop to never suspend while on AC power.

@castilma

castilma commented Apr 7, 2023

I had a similar problem. I ran an initial backup on Linux:

$ restic backup -r /run/media/me/mydrive/some/dir/ --exclude-caches=true --exclude=".cache" --exclude=som.big.file --exclude=some.big.file --exclude=some.big.file3 . 
enter password for repository: 
repository 0a9ce1c5 opened (version 2, compression level auto)
no parent snapshot found, will read all files
error: open archiv/hdd/repartition of my main/real-dm.undo: permission denied
Fatal: failed to refresh lock in time
error: no result
error: no result
error: no result
error: no result
error: no result
Fatal: unable to save snapshot: context canceled

This was with restic 0.15.1 compiled with go1.20 on linux/amd64. I have about 300 GB to back up. I left my PC running, and a few hours later it suspended.

This failed to refresh lock thing may need to be a bit more opportunistic in the case of a suspending system.

@MichaelEischer
Member

This failed to refresh lock thing may need to be a bit more opportunistic in the case of a suspending system.

The "failed to refresh lock in time" error is designed to cancel backups and other operations that are suspended for too long. That is, canceling a backup if the host was suspended is what is supposed to happen. The big problem here is that restic needs to be absolutely sure that no data was removed from the repository in the meantime; otherwise there is a risk of data loss. I'm not sure whether it is possible to detect that it is safe to continue a backup.

MichaelEischer changed the title from "On Windows initial backup fails with Fatal: failed to refresh lock in time" to "Backup fails with Fatal: failed to refresh lock in time is host is suspened in the meantime" on May 13, 2023
MichaelEischer changed the title from "Backup fails with Fatal: failed to refresh lock in time is host is suspened in the meantime" to "Backup fails with Fatal: failed to refresh lock in time is host is suspended in the meantime" on May 13, 2023
@MichaelEischer
Member

I think we can use the following approach:

Once restic detects that it failed to refresh the lock in time, it can create a new lock file, then check whether the old lock file still exists in the repository before cleaning it up. This checks that the old lock file was not removed in the meantime, which means that the repository stayed locked the whole time.

While this lock check takes place, all other backend operations should be paused.

For #2736 / #4262 this means that these must delete stale lock files. Just ignoring stale locks would lead to problems with the approach I've just described.

4 participants