Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compacting a backup ends with "Error while running [backup_name] : Invalid data format." #2303

Open
1 task done
umgfoin opened this issue Feb 2, 2017 · 17 comments
Open
1 task done

Comments

@umgfoin
Copy link

umgfoin commented Feb 2, 2017

I have:

  • searched open and closed issues for duplicates

Version info

Duplicati Version:
2.0.1.44_canary_2017-01-24

Operating System:
Linux 2.6.32-042stab120.16 #1 SMP Tue Dec 13 20:58:28 MSK 2016 x86_64 x86_64 x86_64 GNU/Linux
#mono --version
Mono JIT compiler version 3.12.1 (tarball Fr 10. Apr 00:46:47 CEST 2015)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
TLS: __thread
SIGSEGV: altstack
Notifications: epoll
Architecture: amd64
Disabled: none
Misc: softdebug
LLVM: supported, not enabled.
GC: sgen

Backend:
sftp

Bug description

Backup finishes with error "Invalid data format." in compact phase.

Steps to reproduce

  • Start backup or run "Compact now" on a backup job.
  • Wait for error-message.

Actual result:
The error occurs during phase compacting.

Screenshots

debug log

Feb 2, 2017 12:13 PM: Running Backup took 00:57:27.513

Feb 2, 2017 12:13 PM: Fatal error
{"ClassName":"System.IO.InvalidDataException","Message":"Invalid data format.","InnerException":null,"HelpURL":null,"StackTraceString":" at master._7zip.Legacy.Lzma2DecoderStream.Read (System.Byte[] buffer, Int32 offset, Int32 count) [0x00000] in :0 \n at ManagedLzma._7zip.Decoder.FileBufferedDecoderStream.Fetch () [0x00000] in :0 \n at ManagedLzma._7zip.Decoder.FileBufferedDecoderStream.Read (System.Byte[] buffer, Int32 offset, Int32 count) [0x00000] in :0 \n at master._7zip.Utilities.UnpackSubStream.Read (System.Byte[] buffer, Int32 offset, Int32 count) [0x00000] in :0 \n at Duplicati.Library.Utility.Utility.ForceStreamRead (System.IO.Stream stream, System.Byte[] buf, Int32 count) [0x00000] in :0 \n at Duplicati.Library.Main.Volumes.BlockVolumeReader.ReadBlock (System.String hash, System.Byte[] blockbuffer) [0x00000] in :0 \n at Duplicati.Library.Main.Operation.CompactHandler.DoCompact (Duplicati.Library.Main.Database.LocalDeleteDatabase db, Boolean hasVerifiedBackend, IDbTransaction& transaction) [0x00000] in :0 \n at Duplicati.Library.Main.Operation.DeleteHandler.DoRun (Duplicati.Library.Main.Database.LocalDeleteDatabase db, IDbTransaction& transaction, Boolean hasVerifiedBacked, Boolean forceCompact) [0x00000] in :0 \n at Duplicati.Library.Main.Operation.BackupHandler.CompactIfRequired (Duplicati.Library.Main.BackendManager backend, Int64 lastVolumeSize) [0x00000] in :0 \n at Duplicati.Library.Main.Operation.BackupHandler.Run (System.String[] sources, IFilter filter) [0x00000] in :0 ","RemoteStackTraceString":null,"RemoteStackIndex":0,"HResult":-2146233085,"Source":"managed-lzma","ExceptionMethod":null,"Data":null}

Feb 2, 2017 12:13 PM: Operation Get with file duplicati-b908bf5f6845a4300a3b2002de1b1457e.dblock.7z.gpg attempt 1 of 5 failed with message: Thread was being aborted
{"ClassName":"System.Threading.ThreadAbortException","Message":"Thread was being aborted","InnerException":null,"HelpURL":null,"StackTraceString":" at (wrapper managed-to-native) System.Threading.WaitHandle:WaitAny_internal (System.Threading.WaitHandle[],int,bool)\n at System.Threading.WaitHandle.WaitAny (System.Threading.WaitHandle[] waitHandles, TimeSpan timeout, Boolean exitContext) [0x00000] in :0 \n at System.Threading.WaitHandle.WaitAny (System.Threading.WaitHandle[] waitHandles, TimeSpan timeout) [0x00000] in :0 \n at Renci.SshNet.SubsystemSession.WaitOnHandle (System.Threading.WaitHandle waitHandle, TimeSpan operationTimeout) [0x00000] in :0 \n at Renci.SshNet.Sftp.SftpSession.RequestRead (System.Byte[] handle, UInt64 offset, UInt32 length) [0x00000] in :0 \n at Renci.SshNet.SftpClient.InternalDownloadFile (System.String path, System.IO.Stream output, Renci.SshNet.Sftp.SftpDownloadAsyncResult asyncResult, System.Action1 downloadCallback) [0x00000] in <filename unknown>:0 \n at Renci.SshNet.SftpClient.DownloadFile (System.String path, System.IO.Stream output, System.Action1 downloadCallback) [0x00000] in :0 \n at Duplicati.Library.Backend.SSHv2.Get (System.String remotename, System.IO.Stream stream) [0x00000] in :0 \n at Duplicati.Library.Main.BackendManager.coreDoGetPiping (Duplicati.Library.Main.FileEntryItem item, IEncryption useDecrypter, System.Int64& retDownloadSize, System.String& retHashcode) [0x00000] in :0 ","RemoteStackTraceString":null,"RemoteStackIndex":0,"HResult":-2146233040,"Source":"mscorlib","ExceptionMethod":null,"Data":null}

Feb 2, 2017 12:13 PM: RemoteOperationGet took 00:00:23.942

Feb 2, 2017 12:12 PM: Backend event: Get - Started: duplicati-b908bf5f6845a4300a3b2002de1b1457e.dblock.7z.gpg (134,43 MB)

Feb 2, 2017 12:12 PM: Starting - RemoteOperationGet

Feb 2, 2017 12:12 PM: RemoteOperationGet took 00:03:20.111

Feb 2, 2017 12:12 PM: Backend event: Get - Completed: duplicati-b6aa55475c7f64eb49475fe9ad75a7d60.dblock.7z.gpg (171,91 MB)

Feb 2, 2017 12:12 PM: Downloaded and decrypted 171,91 MB in 00:03:20.1098310, 879,69 KB/s

Feb 2, 2017 12:09 PM: Backend event: Get - Started: duplicati-b6aa55475c7f64eb49475fe9ad75a7d60.dblock.7z.gpg (171,91 MB)

--- Want to back this issue? **[Post a bounty on it!](https://www.bountysource.com/issues/41596531-compacting-a-backup-ends-with-error-while-running-backup_name-invalid-data-format?utm_campaign=plugin&utm_content=tracker%2F4870652&utm_medium=issues&utm_source=github)** We accept bounties via [Bountysource](https://www.bountysource.com/?utm_campaign=plugin&utm_content=tracker%2F4870652&utm_medium=issues&utm_source=github).
@agrajaghh agrajaghh added the bug label Feb 3, 2017
@kenkendk
Copy link
Member

The error indicates that the reading of the 7z file fails.
Can you grab the file duplicati-b908bf5f6845a4300a3b2002de1b1457e.dblock.7z.gpg, decrypt it with GPG and try to open it with 7z to see if it is broken?

@umgfoin
Copy link
Author

umgfoin commented Feb 27, 2017

Hi Kenneth,
yes, the archive seems to be broken.
Should I remove the file and start a repair operation?
thanks, umgfoin.

[tadao@ando duplicati]# 7z t duplicati-b908bf5f6845a4300a3b2002de1b1457e.dblock.7z

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=de_DE.UTF-8,Utf16=on,HugeFiles=on,64 bits,24 CPUs Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (206D7),ASM,AES-NI)

Scanning the drive for archives:
1 file, 142018102 bytes (136 MiB)

Testing archive: duplicati-b908bf5f6845a4300a3b2002de1b1457e.dblock.7z
ERROR: duplicati-b908bf5f6845a4300a3b2002de1b1457e.dblock.7z
duplicati-b908bf5f6845a4300a3b2002de1b1457e.dblock.7z
Open ERROR: Can not open the file as [7z] archive


ERRORS:
Headers Error

Can't open as archive: 1
Files: 0
Size:       0
Compressed: 0

@kenkendk
Copy link
Member

That sounds really bad, because that means that the 7z module creates broken files!
You are sure you have decrypted the file with GPG first?

@umgfoin
Copy link
Author

umgfoin commented Feb 27, 2017

sure - 7z-signature (37 7A BC AF 27 1C) and -version stamp (00 03) are correct.
Relative end-header offset looks reasonable, too (but I'm not too familiar with the 7z-format)
I can provide you the archive, if it helps.

[tadao@ando duplicati]# xxd -l 256 -g 1 duplicati-b908bf5f6845a4300a3b2002de1b1457e.dblock.7z

0000000: 37 7a bc af 27 1c 00 03 f5 c6 ed a9 78 33 6c 08  7z..'.......x3l.
0000010: 00 00 00 00 9e d2 0a 00 00 00 00 00 e9 6e 4b ea  .............nK.
0000020: b4 c6 33 79 60 8f 22 b4 5d 5a 29 b6 3f 6d c6 f0  ..3y`.".]Z).?m..
0000030: 4a f7 6c 6b 0e 73 64 59 4f 8f ff 46 5c 1c 01 33  J.lk.sdYO..F\..3
0000040: ec 15 90 87 76 93 2c 5d 04 96 29 32 f0 3c f7 e9  ....v.,]..)2.<..
0000050: 9a c4 31 ec 90 7e ef e7 4a f0 19 9c 25 4d 5e 81  ..1..~..J...%M^.
0000060: 1a 2c f2 1a d1 6d ba 72 d9 d9 93 14 3a 59 91 72  .,...m.r....:Y.r
0000070: 19 40 66 18 e5 d8 64 c7 32 14 e2 c5 fd 5f a7 3c  .@f...d.2...._.<
0000080: c2 a6 d8 0a 5f 9d 98 47 7b 4a 5e 70 b4 54 54 9a  ...._..G{J^p.TT.
0000090: 41 80 92 b4 08 4e 81 35 38 a1 3e 4c 42 6f 2a e0  A....N.58.>LBo*.
00000a0: e2 14 dd 3b f0 de 01 d6 ed c3 f0 9c dc 58 a0 21  ...;.........X.!
00000b0: 19 3d f1 57 f7 3e ac 76 f1 94 a9 b0 b1 42 82 38  .=.W.>.v.....B.8
00000c0: 5d 80 c4 43 4a df 29 dc 0c 6d 90 3e 3a 70 cc 0d  ]..CJ.)..m.>:p..
00000d0: c2 b6 db 51 00 1e 63 4b a5 d9 3f 84 16 44 4a 4e  ...Q..cK..?..DJN
00000e0: d1 3a ff f6 43 a5 bf 28 56 88 ea 3f d8 03 63 03  .:..C..(V..?..c.
00000f0: 4a 07 70 af 17 77 ef 67 4e 85 f8 2a 1c 74 19 08  J.p..w.gN..*.t..

@kenkendk
Copy link
Member

That does look like a broken 7z file ...
Unfortunately, GPG does not validate the completenes of the file, so it could be simply a truncated file. But Duplicati does track all remote file hashes and the file lengths, so that should have been detected somewhere.

Did you see any errors about hash mismatches or file lengths not being correct?

Not sure how many files you have, but can you run something like:

duplicati-cli verify sftp://server/path all --full-remote-verification

This will download all files and check their hashes. The --full-remote-verification will also open each archive and check that all files are found inside, and should trigger the problem.

@umgfoin
Copy link
Author

umgfoin commented Feb 27, 2017

there's no duplicati-cli script or executeable in my duplicati-installation.
Duplicati.CommandLine.exe fails as following:

mono /usr/share/duplicati/Duplicati.CommandLine.exe verify sftp://sftp.server.fqdns.com/users/egoneiermann/duplicati all --full-remote-verification

Verschlüsselungspassphrase eingeben:

Backend not supported: sftp

@kenkendk
Copy link
Member

Sorry, my bad, it should be ssh:// even though it uses sftp as the protocol.

You need to add a username and password too.

If you are normally using the WebUI, you can use the "export" feature to get the commandline equivalent.

@umgfoin
Copy link
Author

umgfoin commented Feb 27, 2017

you can use the "export" feature to get the commandline equivalent.

Cool feature!

[tadao@ando duplicati]#  mono /usr/share/duplicati/Duplicati.CommandLine.exe verify "ssh://sftp.fqdn.com//users/egoneiermann-backup/duplicati?auth-username=.....&auth-password=....." all --full-remote-verification --ssh-accept-any-fingerprints

Verschlüsselungspassphrase eingeben:
  Listing remote folder ...
removing file listed as Deleting: duplicati-b80e1f468b3754fd7bc6bf9219ac2d07f.dblock.7z.gpg
removing file listed as Temporary: duplicati-bee9bf11507b946fbb33759892d04807c.dblock.7z.gpg
removing file listed as Temporary: duplicati-i47402f1151d14ccf8f2404adf4bdeabc.dindex.7z.gpg
Missing file: duplicati-i2cfdeea5d4414025a656acd239082764.dindex.7z.gpg
Found 1 files that are missing from the remote storage, please run repair

I'm starting a repair-operation (via GUI).
Result:

MainOperation: Repair
RecreateDatabaseResults: null
ParsedResult: Success
EndTime: 27.02.2017 13:46:15
BeginTime: 27.02.2017 13:46:05
Duration: 00:00:10.3814020
Messages: [
    removing file listed as Deleting: duplicati-b80e1f468b3754fd7bc6bf9219ac2d07f.dblock.7z.gpg,
    removing file listed as Temporary: duplicati-bee9bf11507b946fbb33759892d04807c.dblock.7z.gpg,
    removing file listed as Temporary: duplicati-i47402f1151d14ccf8f2404adf4bdeabc.dindex.7z.gpg
]
Warnings: []
Errors: []
BackendStatistics:
    RemoteCalls: 2
    BytesUploaded: 290
    BytesDownloaded: 0
    FilesUploaded: 1
    FilesDownloaded: 0
    FilesDeleted: 0
    FoldersCreated: 0
    RetryAttempts: 0
    UnknownFileSize: 0
    UnknownFileCount: 0
    KnownFileCount: 1197
    KnownFileSize: 98335673875
    LastBackupDate: 27.02.2017 00:00:00
    BackupListCount: 30
    TotalQuotaSpace: 0
    FreeQuotaSpace: 0
    AssignedQuotaSpace: 96636764160
    ParsedResult: Success
[tadao@ando duplicati]#  mono /usr/share/duplicati/Duplicati.CommandLine.exe verify "ssh://sftp.fqdn.com//users/egoneiermann-backup/duplicati?auth-username=.......&auth-password=......" all --full-remote-verification --ssh-accept-any-fingerprints

Verschlüsselungspassphrase eingeben:
  Listing remote folder ...
  Downloading file (24,59 MB) ...
  Downloading file (24,60 MB) ...

Will come back when we have a result (Backup:88,60 GB / 30 Versions).

Btw.: Do you see any relation between possibly truncated files and
TotalQuotaSpace: 0
FreeQuotaSpace: 0
AssignedQuotaSpace: 96636764160

in the above log?
The available space on the backup-target is larger - I set the quota intentionally lower than available.

++umgfoin.

@kenkendk
Copy link
Member

I updated your post, not sure if the user/pass was a dummy or real data. If it was real data, you should change the password ASAP.

If you have the defect file (and it does not contain sensitive data) I would like a copy.
I will share it with the author of the managed-lzma project which is used for the 7z compression.

If it does contain sensitive data, maybe we can extract the headers only and have a look at them.

@kenkendk
Copy link
Member

The quota space is not currently used. It is intended to start dropping old backups when the storage space is low, but that feature has not yet been added.

@umgfoin
Copy link
Author

umgfoin commented Feb 27, 2017

I updated your post, not sure if the user/pass was a dummy or real data. If it was real data, you should change the password ASAP.

Thanks for being alert - credentials and FQDN are fake-data.

If you have the defect file (and it does not contain sensitive data) I would like a copy.

Indeed, I need to verify this. Would you have a sql-query at hand to select all file-names from the backup referencing blocks in this specific 7z-archive?

@kenkendk
Copy link
Member

To get the files referencing a specific dblock file, you can do:

mono Duplicati.CommandLine.exe affected ssh://... duplicati-d....7z.gpg --verbose

The --verbose option prints all filenames to screen, instead of just reporting the backup sets.

@umgfoin
Copy link
Author

umgfoin commented Feb 28, 2017

Kenneth,
the verify-operation is still in progress. It found 3 files with issues sofar (Invalid data format, Decoded stream has been corrupted). Should I continue or is this information already sufficient? I'll send you a link to the orginally discovered corrupt archive in pm.

@kenkendk
Copy link
Member

Yes, please send the broken file and I will forward it privately to the managed-lzma author.

If you have 3 files, then it means that this error is not a single problem.

This means that you can likely not restore files from your backup.
Right now I suspect that the problem is the 7z library, so if you start over, consider not using 7z.

@umgfoin
Copy link
Author

umgfoin commented Feb 28, 2017

Kenneth, I checked the broken archive - btw. it is partially recoverable.
Unfortuantely, too much unencrypted office-data is contained. I can't disclose that.
I'll check later, if other archives are safer in this respect, but obviously, chunks of the evergrowing mail-db can be found in nearly every dataset.
Anyway, I could extract headers or specific file-ranges if this is helpful to analyze the 7z

@kenkendk
Copy link
Member

I sent you a mail

@chymian
Copy link

chymian commented Sep 21, 2018

@kenkendk
any news on the 7z-corruption front?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants