Lock not found warnings on parallel restic runs #1523
Comments
Huh, interesting, thanks for the report. What happens in the background is that restic creates a lock file (non-exclusive, since there may be multiple clients running backup in parallel), then waits for other locks to appear. During that time a lock file was present (maybe from the other process), but it was then removed. The messages are just telling you that restic retries loading the lock file several times (just in case it appears after a while; some cloud backends may do that). So, strictly speaking, restic isn't waiting for the other backup run to finish, but for the removed lock file to appear again. That's a rare timing issue, and I suspect not many users will run into this. We could disable retrying. I agree that the messages could be better, do you have an idea for that?
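To make the timing issue above concrete, here is a minimal sketch of the race. It is not restic's code: the dict-based `store`, the `load_with_retries` helper, and the warning text are all hypothetical stand-ins for the lock folder, the retry logic, and the real messages.

```python
class LockNotFound(Exception):
    """Raised when a lock file that was listed earlier no longer exists."""


def load_with_retries(store, lock_id, max_tries=3):
    """Try to load a lock several times, recording a warning per failure.

    `store` is a plain dict standing in for the repository's lock folder
    (an assumption for this sketch, not restic's backend API).
    """
    warnings = []
    for attempt in range(1, max_tries + 1):
        if lock_id in store:
            return store[lock_id], warnings
        warnings.append(f"Load(lock/{lock_id}) failed, attempt {attempt}: not found")
    raise LockNotFound(warnings)


# Client A holds a non-exclusive lock; client B lists the folder and sees it.
store = {"a1": {"pid": 100, "exclusive": False}}
seen_by_b = list(store)

# Client A finishes and removes its lock before B gets around to loading it.
del store["a1"]

# B now retries loading a file that will never come back.
try:
    load_with_retries(store, seen_by_b[0])
except LockNotFound as err:
    retry_warnings = err.args[0]
```

Every entry in `retry_warnings` corresponds to one retry of a lock that was already gone, which is the situation the warnings in this issue describe.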
Oh, thanks for the comprehensive explanation. To give more background: I am planning to back up a bunch of Windows systems and wanted to test a few things first. Depending on how the backups are scheduled, two runs of restic might happen at the same time, and that is what I wanted to test. Based on your explanation, it might not be a good idea to show something like "waiting for parallel run to be finished", which I proposed in the initial report. But maybe something else that summarizes your explanation, such as "Short presence of lock file detected, pausing and checking". I don't know how the waiting time between the checks is calculated or how many loops are done. Maybe the message could be extended with that information, so the user knows what is happening and what will happen next? But I think this is a very low priority issue; postpone it and do more important stuff first :)
Thanks for the feedback.
Ideally restic shouldn't retry loading locks, but instead list the lock folder again to check whether the lock file was just removed in the meantime. If it still exists, then it should retry loading the lock file.
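The idea in the comment above could look roughly like the following. This is only a sketch of the proposal, not an existing restic function: `load_lock_checked`, `list_locks`, and `load_lock` are hypothetical names, and a `KeyError` stands in for whatever "not found" error a real backend would return.

```python
def load_lock_checked(list_locks, load_lock, lock_id, max_tries=3):
    """Load a lock, but re-list the lock folder before each retry.

    Returns the lock data, or None if the lock vanished in the meantime.
    """
    for _ in range(max_tries):
        try:
            return load_lock(lock_id)
        except KeyError:
            # Re-list: a lock that is no longer listed was removed by its
            # owner, so there is nothing left to wait for.
            if lock_id not in list_locks():
                return None
            # Still listed: the backend may just be slow, so try again.
    raise TimeoutError(f"lock {lock_id} is listed but could not be loaded")


# Hypothetical in-memory lock folder: "a1" was already removed, "b2" exists.
store = {"b2": {"pid": 200}}
gone = load_lock_checked(lambda: list(store), store.__getitem__, "a1")
present = load_lock_checked(lambda: list(store), store.__getitem__, "b2")
```

With this shape, a removed lock is skipped after a single extra listing instead of several failed loads, and retries are kept for the case where the file is listed but not yet readable.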
I see this "every night", too. Some 190 clients across 8 repositories and one or two show this (not always the same ofc). Interestingly with always the exact same timings down to sub-nanoseconds? The same as Michael's output in #3652 , too .. sus, innit? :)
This happens in
This is restic 0.13.1 from releases on a variety of Linux machines. The OpenBSD machines are not showing this (so far?). I can roll+run a debug version if that helps.
That sounds like the randomization didn't work as it should. But besides that, the lock retries are sort of what is to be expected at the moment when many restic instances access the same repository at the same time.
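One way identical retry timings across machines can come about is a random source that starts from the same seed everywhere. The sketch below illustrates that general effect only; `backoff_delays` and its parameters are assumptions made for this example and are not taken from restic's retry code.

```python
import random


def backoff_delays(rng, tries=5, initial=0.5, factor=1.5, jitter=0.5):
    """Return retry delays in seconds, each randomized by +/- `jitter`."""
    delays = []
    interval = initial
    for _ in range(tries):
        low = interval * (1 - jitter)
        high = interval * (1 + jitter)
        delays.append(rng.uniform(low, high))
        interval *= factor  # grow the base interval for the next retry
    return delays


# Two clients with the same fixed seed: identical "random" delays.
same_a = backoff_delays(random.Random(1))
same_b = backoff_delays(random.Random(1))

# Two clients drawing from the operating system's entropy: delays differ.
diff_a = backoff_delays(random.SystemRandom())
diff_b = backoff_delays(random.SystemRandom())
```

Here `same_a` and `same_b` match exactly, which is the kind of pattern reported above, while `diff_a` and `diff_b` spread the retries out so that clients do not all hit the repository at the same instant.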
Output of restic version
How did you run restic exactly?
As described in the title, I made two runs. The first run (started a second earlier) worked just fine:
The second run obviously waited for the lock of the first one, but gave strange messages about that:
What backend/server/service did you use to store the repository?
Linux system (Debian) with SFTP
Expected behavior
A more meaningful message like "waiting for parallel run to be finished". Best would be to include some kind of process ID, the start time of the other process, and a timeout (and how to resolve things when the timeout is hit; I don't know how the locking works...).
Actual behavior
Messages ("Load...") shown above
Steps to reproduce the behavior
Use restic on Windows. Use the SFTP backend with plink. Start two parallel runs of restic at the same time for a directory with some files/MBs (otherwise the runs end too fast).
Do you have any idea what may have caused this?
Looks like the locking is already handled well (the second run waited for the first one to finish + x and then started itself), but the lock handling is presented to the user with messages that are not really self-explanatory.
Do you have an idea how to solve the issue?
Implement an error handler for locking-related things and, in a case like this one, present a better message to the user.
Did restic help you or made you happy in any way?
The answer is 42! ...just kidding ;)
No, restic in fact makes me very happy. You won't believe how many free backup solutions I searched for and tried (for Linux + Windows). restic is the first one which fulfills my requirements AND works.
To be the killer backup app for Windows it would need a nice GUI and natively implemented VSS handling (and logging, and...), but I understand that the basics first need to work very stably, and after that one could concentrate on nice extras like the ones I described.