Upload throttle corrupts backup, especially OneDrive. Analyzed, with code proposed. #3787
The issue is that throttling can lose data, seemingly by an internal overwrite, resulting in a bad remote volume. The proposed fix is for ThrottledStream's Read to advance its buffer offset as it reads through its underlying Stream.
This happens on volume uploads, so the damage is permanent, unlike the "Hash mismatch" download corruption (also likely from a throttle buffer offset being reused, and resolved by removing the download throttle). This problem is also more serious because it happens at reasonable levels of throttling (at least on OneDrive), although (like the other issue) it can even happen with a local file destination when the throttle is extremely low. It's not proven that the problems at the two speeds are identical, but the symptoms certainly look the same, and they are reversed from the download case because this one comes from ThrottledStream Read, not Write. In both cases the buffer offset is not adjusted as data is read or written; here the bug is reading newer data into the initial buffer offset, whereas in the other issue it was writing the initial buffer offset repeatedly to the output.
Both issues produce a file of the right size but with scrambled content; however, the upload case may not be noticed immediately after a backup, because the correct size is enough for the volume to be marked Verified in the RemoteVolume table. The default verification of one sample trio of files might catch the issue as a "Hash mismatch", but that's not guaranteed, and it can also be disabled. If enabled, its load-balancing might favor checking new files.
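To make the mechanism concrete, here is a tiny self-contained demonstration (not Duplicati code; the 4-byte chunk size and the sample data are made up) of how a chunked read loop that never advances its buffer offset reports the right byte count while delivering scrambled content:

```csharp
// Demonstration only -- not Duplicati's ThrottledStream. A read loop that keeps
// writing every chunk to the same buffer offset: each later chunk overwrites the
// earlier ones, so the caller gets the *last* chunk's bytes at the start of the
// buffer and NULs after them, yet the reported length is correct.
using System;
using System.IO;
using System.Text;

class OffsetBugDemo
{
    static int BuggyRead(Stream source, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (count > 0)
        {
            int chunk = Math.Min(count, 4);             // small "throttle-sized" chunks
            int r = source.Read(buffer, offset, chunk); // BUG: offset never advances
            if (r == 0)
                break;
            total += r;
            count -= r;
        }
        return total;
    }

    static void Main()
    {
        var source = new MemoryStream(Encoding.ASCII.GetBytes("0123456789ABCDEF"));
        var buffer = new byte[16];
        int n = BuggyRead(source, buffer, 0, buffer.Length);

        // Reports 16 bytes "read", but the buffer starts with "CDEF" (the final
        // chunk) followed by NULs -- right size, scrambled content.
        Console.WriteLine("bytes reported: " + n);
        Console.WriteLine("buffer content: " + Encoding.ASCII.GetString(buffer, 0, n));
    }
}
```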
Seeing the data corruption is easy with a hex viewer: look for unexpected runs of NUL characters. Seeing the pattern clearly enough to confirm the code theory was done with a test file containing offset labels. That was easiest to generate with a script, so I used Strawberry Perl on Windows. An example script:
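(The Perl original isn't reproduced here; the sketch below is a hypothetical equivalent in C#, and the exact label format and 32-byte record size are assumptions. The idea is just to emit fixed-width text records that each carry their own byte offset, so any fragment of the file says where it originally came from when inspected in a hex viewer.)

```csharp
// Hypothetical stand-in for the Perl generator script (not the original).
// Prints fixed-width ASCII records to stdout, each labeled with its own offset.
using System;
using System.Text;

class MakeOffsetText
{
    const int RecordSize = 32;                        // record width, including CRLF

    static void Main()
    {
        long size = 4L * 1024 * 1024;                 // 4 MiB, as in the OneDrive test
        var stdout = Console.OpenStandardOutput();

        for (long pos = 0; pos < size; pos += RecordSize)
        {
            // e.g. "offset 0x0000003FE0" padded out to the record width
            string line = string.Format("offset 0x{0:X10}", pos).PadRight(RecordSize - 2) + "\r\n";
            byte[] bytes = Encoding.ASCII.GetBytes(line);
            stdout.Write(bytes, 0, (int)Math.Min(bytes.Length, size - pos));
        }
    }
}
```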
The script outputs text, but I put it into a file with a .pet extension to prevent compression and keep the content readable.
To avoid confusion from Duplicati's block management, set --blocksize=30MB, which also ensures that only one source-file content block will fit in the default 50MB dblock. There are still some discontinuities because the deflate algorithm works in roughly 64KB blocks, but the expectation is that an original file offset appears in the copy at only a slightly larger offset. The exact corruption varies with the parameters, but a OneDrive test with a 4 MiB file (trying to stay within the GitHub upload limit) at 2 MB/sec (exceeding my actual upload capability) wound up writing the last 0x520 bytes of the ZIP file on top of its start, leaving NULs at the end of the file and obliterating the PK signature usually seen at the start. Similar obliteration of the AES header on an encrypted file can also occur, which causes "Invalid header marker" errors.
Here's what a suspected end-overwrote-start looks like (from test results with Microsoft OneDrive v2):
The 50 4b 05 06 "PK" signature is the End of central directory record (EOCD), and when that record ends, the upload seamlessly resumes the original pattern of offset text appearing near its matching offset.
The proposed code change is to advance the buffer offset in the Read loop. Possibly this can be done by changing Read as sketched below.
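A minimal sketch of that change (this is the shape of the idea, not Duplicati's actual ThrottledStream source; m_basestream, ThrottledChunkSize and DelayForBytes stand in for the real fields and throttle bookkeeping):

```csharp
// Sketch only -- not the actual Duplicati code. The essential line is 'offset += r':
// without it, every chunk read from the underlying stream lands at the same place
// in the caller's buffer, overwriting the data already read.
public override int Read(byte[] buffer, int offset, int count)
{
    int total = 0;
    while (count > 0)
    {
        int chunk = Math.Min(count, ThrottledChunkSize());   // throttle-limited chunk (placeholder helper)
        int r = m_basestream.Read(buffer, offset, chunk);
        if (r == 0)
            break;                    // underlying stream exhausted

        offset += r;                  // proposed fix: advance past the bytes just read
        total += r;
        count -= r;

        DelayForBytes(r);             // placeholder: sleep according to the throttle rate
    }
    return total;
}
```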
A new clue: testing by @JonMikelV of the "Fatal Failed: 400 - bad_request: Sha1 did not match data received" error found problems with Backblaze B2 and a local file destination. jshipp also did much debugging, and Omnicef reported a connection to throttling. Some backends send a hash with uploads, so the received content can be integrity-checked. This is similar in plan to Duplicati keeping a hash of the remote volume, except the error can be detected immediately (Duplicati can't verify its hash without downloading). Backblaze B2 wouldn't show any misbehavior for me, though, and I'm not sure why; Google Drive didn't either, but I didn't beat on it as extensively as Backblaze. I don't plan a huge survey. The fix may be easier...
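For context on the hash idea, here is a minimal sketch (not Duplicati's B2 backend code) of what "uploads send a hash" means: the client hashes the file it intends to send, and the server rejects the upload if the bytes it actually received hash differently, which is how this corruption can surface as an immediate 400 instead of a bad remote volume discovered much later.

```csharp
// Sketch of the concept only, not Duplicati's backend implementation.
using System;
using System.IO;
using System.Security.Cryptography;

class UploadHashExample
{
    // The SHA-1 that would accompany the upload (Backblaze B2 takes it in the
    // X-Bz-Content-Sha1 header of b2_upload_file). If throttling scrambles the
    // bytes actually transmitted, the server's hash of the received data won't
    // match and it can fail fast with "Sha1 did not match data received".
    static string ComputeUploadSha1(string path)
    {
        using (var sha1 = SHA1.Create())
        using (var fs = File.OpenRead(path))
            return BitConverter.ToString(sha1.ComputeHash(fs)).Replace("-", "").ToLowerInvariant();
    }

    static void Main(string[] args)
    {
        Console.WriteLine(ComputeUploadSha1(args[0]));
    }
}
```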
Related reports:
- Error 400 - bad_request
- dblocks missing SharpAESCrypt header causing multiple other errors #3746
Steps to reproduce
The above zip file has a "OneDrive without encryption" folder with "Temporary" and "Uploaded" folders containing the original dblock (copied out of Temp as dup-2b1129b8-428b-406c-89f1-5a68d662bcdd) and the uploaded dblock b492c27886f0647a2b524ef22bf4b16e5.dblock.zip. Some earlier, less realistic but more portable tests using a local file destination with a very low --throttle-upload are in the "local_without_encryption" and "local_with_encryption" folders. Those could be done with 1MiB_offset.pet, but the OneDrive test required 4MiB_offset.pet; the corruption is easier to get with larger files (and also at lower throttles). There are some auxiliary files for things like renaming the Temp file with a .zip extension to make clear which is which, or decrypting .aes files with SharpAESCrypt.exe (where possible) to compare against the unencrypted Temp version. Sometimes copying things out of Temp was done with a .bat file that just copied whatever it found. There are also DB bug reports, mainly to show the backup structure and the Verified state.
This is old code. I verified the corruption (though it possibly took more work) in old releases as well.
I forgot to mention that I used --asynchronous-concurrent-upload-limit=1 in the original test, so that such old releases could be tested in a comparable manner. The corruption seemed somewhat easier to get with 4 concurrent uploads, though other things might have changed too. Regardless, 100 KB/sec throttling was breaking some of my testing, so I tried editing the decompiled source of Read() and adding my proposed fix. It's working fine so far...