New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Locked files seems to cause corrupt backup #3594

Open
balchen opened this Issue Jan 9, 2019 · 4 comments

Comments

Projects
None yet
2 participants
@balchen
Copy link

balchen commented Jan 9, 2019

  • I have searched open and closed issues for duplicates.

Environment info

  • Duplicati version: 2.0.4.5_beta_2018-11-28
  • Operating system: Windows Server 2012 R2
  • Backend: IIS FTP

Description

I've configured a backup of our Atlassian Stash data. The backup failed every time with "One or more errors occurred". I clicked "Show log", but there were no log entries for "General", and "Remote" showed only successful uploads of data files. Re-running the backup without changes frequently caused error messages "Detected non-empty blockset with no associated blocks". Deleting and recreating the database failed. The only way to continue was to delete the database, delete the remote files, and start over.

Since the GUI log showed no entries, I configured file logging and was able to identify a possible reason for the issues. It all seems to start when Duplicati hits a file that Stash has locked (there are apparently several of them), which causes this:

2019-01-08 23:38:35 +01 - [Warning-Duplicati.Library.Main.Operation.Backup.FileBlockProcessor.FileEntry-PathProcessingFailed]: Failed to process path: C:\Atlassian\ApplicationData\Stash\plugins\.osgi-cache\felix\felix-cache\cache.lock
System.IO.IOException: The process cannot access the file because another process has locked a portion of the file.
[stacktrace]
2019-01-08 23:50:37 +01 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Completed: duplicati-b444ad11794f74db2a057e860451bc346.dblock.zip (499.90 MB)
2019-01-08 23:50:37 +01 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-i2b73e5165b7a41de8c5ad97a34807d47.dindex.zip (410.63 KB)
2019-01-08 23:50:39 +01 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Completed: duplicati-i2b73e5165b7a41de8c5ad97a34807d47.dindex.zip (410.63 KB)
2019-01-08 23:50:39 +01 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-b4def9402bd69477fb47bcd889109b69d.dblock.zip (499.99 MB)
2019-01-08 23:50:41 +01 - [Error-Duplicati.Library.Main.Operation.BackupHandler-FatalError]: Fatal error
System.Exception: Detected non-empty blocksets with no associated blocks!
[stacktrace]

and the backup terminates. The suggested resolution for non-empty blocksets with no associated blocks seems to be to repair or recover the local DB, so I did that:

2019-01-09 00:01:25 +01 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Repair has started
2019-01-09 00:01:25 +01 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  ()
2019-01-09 00:01:26 +01 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  (15 bytes)
2019-01-09 00:01:26 +01 - [Information-Duplicati.Library.Main.Operation.FilelistProcessor-RemoteUnwantedMissingFile]: removing file listed as Temporary: duplicati-20190108T222340Z.dlist.zip
2019-01-09 00:01:26 +01 - [Information-Duplicati.Library.Main.Operation.FilelistProcessor-Remove incomplete file]: removing incomplete remote file listed as Uploading: duplicati-b4def9402bd69477fb47bcd889109b69d.dblock.zip
2019-01-09 00:01:26 +01 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Started: duplicati-b4def9402bd69477fb47bcd889109b69d.dblock.zip ()
2019-01-09 00:01:27 +01 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b4def9402bd69477fb47bcd889109b69d.dblock.zip ()
2019-01-09 00:01:29 +01 - [Information-Duplicati.Library.Main.Operation.FilelistProcessor-RemoteUnwantedMissingFile]: removing file listed as Temporary: duplicati-bd215a83de70d4b49895925c5aa2737c0.dblock.zip
2019-01-09 00:01:30 +01 - [Information-Duplicati.Library.Main.Operation.RepairHandler-DatabaseIsSynchronized]: Destination and database are synchronized, not making any changes
2019-01-09 00:02:07 +01 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Backup has started
2019-01-09 00:02:23 +01 - [Error-Duplicati.Library.Main.Operation.BackupHandler-FatalError]: Fatal error
System.Exception: Detected non-empty blocksets with no associated blocks!
[stacktrace]

Obviously something was in an invalid state from the terminated backup, which the repair job tried to fix, but it did not resolve the issue. To recover from this issue, I had to delete the local database, delete all files in the remote location, make sure that all potentially locked files were excluded/enable shadow copy, and restart from scratch.

Steps to reproduce

  1. Configure and run a backup that will hit locked files.
  2. Attempt to recover from corrupt state by running database repair, with no success
  • Actual result:
    Locked files cause corrupt backup that cannot be recovered from
  • Expected result:
    Backup must not be corrupted by locked files. Correct way to resolve locked files is debatable, but resolution must never cause corrupt backup.

Screenshots

Debug log

@ts678

This comment has been minimized.

Copy link

ts678 commented Jan 11, 2019

I can't reproduce this with ordinary locked files, but there are a lot of ways to lock. Do you have a recipe that someone can follow with readily available equipment? We're on that chase in the forum, which includes some inspection of the database to confirm or refute guesses. Details at Fatal error: Detected non-empty blocksets with no associated blocks. Is your start-from-scratch also re-breakable by locked files? That would be a start. Possibly you'd want to use a test backup, not your production one, but cutting this down seems key to a fix.

If you're technically inclined, my locking is share-access. I'm still looking for an easy way to do byte-range, or to tell which variety some app is using. My locked files make only metadata entries in the backup. It's the file data ones that cause the problem. There's also a chance it's a latent error that needs compact or something.

Can you help find exact simple steps? I'm pretty sure I tried your steps 1 and 2 in my testing, with no luck...

@balchen

This comment has been minimized.

Copy link

balchen commented Jan 12, 2019

The error message says "has locked a portion of the file". I can see if I can write some code that does byte range locking and do a test with a different Duplicati setup to confirm.

@balchen

This comment has been minimized.

Copy link

balchen commented Jan 12, 2019

I wrote a simple console app to lock a file in every way possible (in .NET) and configured Duplicate to back up the files. All the different locks cause Duplicati to log an IOException (either the file is completely locked or parts are locked), but nothing makes the backup fail. So locking might not be the culprit after all, and it's just a coincidence that the failures stopped when I excluded the locked files.

I can provide the complete log file from the failed and successful backups (info log level) if that helps. I'll also start looking at other causes.

@balchen

This comment has been minimized.

Copy link

balchen commented Jan 13, 2019

When I was deleting and recreating the backup job to start over after database corruption, and was altering the configuration to exclude locked files, I see that I at one point changed the FTP backup from FTP (Alternate) to FTP. I chose FTP (Alternate) in the first place because it was able to report success on "Test connection"; FTP was never able to complete a test. But this seems to have other effects as well. FTP (Alternate) is much worse at connection handling (e.g. #3592) and "Channel is retired" exceptions are frequent with FTP (Alternate) and not reported at all with FTP.

The coincidence might be that I both excluded locked files and change the backend to FTP, and that in reality it was the change to FTP that made the issues disappear.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment