Improve archiver performance for small files #3955

MichaelEischer · 2022-10-07T19:17:38Z

What does this PR change? What problem does it solve?

As discovered in #3917 (comment) the backup command is slower for small files (up to about 5-6 chunks) than it could be. The main bottleneck appears to be the FileSaver, as increasing the --read-concurrency also drastically improves the backup performance. The FileSaver proceeds in two steps, first it reads the file, splits it into chunks and passes it to the BlobSaver and secondly it waits for all blobs to complete saving (the upload happens asynchronously). Depending on the blob size it can take a few milliseconds to process it. During this time the FileSaver is blocked and not performing any useful work.

This PR changes the second phase of the FileSaver to work asynchronously using callbacks. For this the BlobSaver no longer returns a FutureBlob but instead calls a callback once a blob was saved. This provides the necessary means to asynchronously complete the FutureFile. Similar changes to FutureTree will have to wait at least until #3880 is merged.

The asynchronous file completion is protected using a mutex to avoid synchronization problems.

Asynchronously completing a file has a somewhat unexpected side effect. The status will now often list more than two files at the same time, even though the read concurrency is limited to two. Although this look a bit unexpected, it is actually correct. The status displays which files restic is currently working on, and not which file restic is reading from. However, I wonder whether we should change that to avoid confusing users.

The second commit in this PR changes the status output to clear no longer used status lines. With a frequently changing number of currently processed files, this ended up being confusing.

Performance measurements:
Command: restic backup -n --exclude .git linux using a checkout of Linux 6.0.
CPU: AMD Ryzen 7 PRO 5850U ; Repository stored on NVME disk (warmed up page cache).
Times reported are those printed in the backup command summary.

restic version / read concurrency	2	4	8
master `7112a13`	0:10	0:06	0:04
PR	0:04	0:03	0:03

I'm aware that the test dataset is too small to properly differentiate the different variants, but the trends should be obvious :-) .

Was the change previously discussed in an issue or on the forum?

See #3917.

Checklist

I have read the contribution guidelines.
I have enabled maintainer edits.
I have added tests for all code changes.
~~[ ] I have added documentation for relevant changes (in the manual).~~
There's a new file in changelog/unreleased/ that describes the changes for our users (see template).
I have run gofmt on the code in all commits.
All commit messages are formatted in the same style as the other commits in the repo.
I'm done! This pull request is ready for review.

JsBergbau · 2022-10-08T15:49:22Z

@MichaelEischer thanks for working on this.

I used the same folder as described there #3917

default read concurrency of 2

repository b677436b opened (repository version 2) successfully, password is correct
created new cache in C:\Users\XXX\AppData\Local\restic
no parent snapshot found, will read all files

Files:         556 new,     0 changed,     0 unmodified
Dirs:            4 new,     0 changed,     0 unmodified
Added to the repository: 2.526 GiB (2.504 GiB stored)

processed 556 files, 2.526 GiB in 1:52
snapshot d1c73cdc saved

So much faster than before. I noticed that before the PR restic showed at maximum as many files beeing processed as read concurrency was. Now the number varies a lot and about 4 to 10 files are shown simultaneously.

read concurrency of 6

repository e5410fd6 opened (repository version 2) successfully, password is correct
created new cache in C:\Users\XXX\AppData\Local\restic
no parent snapshot found, will read all files

Files:         556 new,     0 changed,     0 unmodified
Dirs:            4 new,     0 changed,     0 unmodified
Added to the repository: 2.526 GiB (2.504 GiB stored)

processed 556 files, 2.526 GiB in 1:52
snapshot 58d7deda saved

So now using read concurrency doesn't improve speed. However overall speed is decreased a bit compared to previous implementation when increasing the read concurrency.

MichaelEischer · 2022-10-08T16:16:52Z

Now the number varies a lot and about 4 to 10 files are shown simultaneously.

That is somewhat expected with this PR. However, showing more files than before does not mean that restic is concurrently reading from more files than before.

So now using read concurrency doesn't improve speed. However overall speed is decreased a bit compared to previous implementation when increasing the read concurrency.

I wouldn't consider that minimal performance difference to be too relevant. However, I have to admit that I don't have any real idea why the PR should result in a slightly higher backup time than before.

JsBergbau · 2022-10-08T17:14:16Z

I've done a test with the previous version again. With read conccurrency of 6 it is now 1:41

I've also tested the new restic version again. With read concurrency of 6 it takes 1:47

The difference for the faster speed is that the VM that was building the PR build was still running. With VM closed I guess Windows is caching the whole folder in RAM.

So this PR is about 6 % slower than the older implementation when using it with a higher read concurrency.

JsBergbau · 2022-10-09T21:22:56Z

@MichaelEischer

Did some tests again. Previously I used for the test https://beta.restic.net/latest/
This binary is about 30 MB in size. Now I've build restic directly from masterbranch, so without this PR. With read concurrency of 6 restic needs now 1:50. So this PR is even with read concurrency of 6 faster. Restic version restic 0.14.0 (v0.9.5-2107-g1a6160d15) compiled with go1.15.15 on windows/amd64 with file size of 22.7 MB.

Previously used restic version without this PR (all benchmarks before this post) restic 0.14.0-dev (compiled manually) compiled with go1.18 on windows/amd64 and file size of 29.2 MB.

So speed differences resp. faster results were because of newer go version. Sorry for not thinking about this before.

So regarding performance this PR can safely be merged.

MichaelEischer · 2022-10-21T20:04:01Z

What remains is to decide how to handle the much higher number of files displayed as part of the status message. It will likely confuse users...

JsBergbau · 2022-10-21T20:59:41Z

As far as I understood with this pull request restic still reads by default 2 files in parallel. So the question is how does this then happen that there are now displayed more files than previously?
Still it is a good question how to handle this. I have some ideas, but lets wait for your answer.

MichaelEischer · 2022-10-21T21:28:41Z

So the question is how does this then happen that there are now displayed more files than previously?

That is somewhat easy to explain. Let's assume we have a small file which only consists of a single chunk. Then restic reads the file (at this point it shows up on the status bar), hands it over to the blob saver which stores it in the repository, once that is completed the file is reported as complete and removed from the status bar. Only the first step actually involves reading the file from the disk. In particular, for maximum compression it takes quite some time to store the chunk in the repository.

Previously the read concurrency applied to this whole sequence, which means that restic cannot make use of more CPU cores that the read concurrency, and that the next file is only requested from disk when the previous one has completed processing. That way the CPU and disk are both underutilized.

With this PR, the read concurrency only applies to the step which actually reads the file. Thus, reading new files and compressing the old one happens concurrently.

The status bar currently shows all files which are in any of these processing steps. This has the consequence that all files which are currently read or saved to the repository are reported.

Still it is a good question how to handle this. I have some ideas, but lets wait for your answer.

The least confusing way is probably to hide files from the status bar once the file reading step is complete. The main complexity of that approach is that this introduces a new intermediate state in the output. The verbose output must still be able to report the final amount of data stored for a file. This means that we'd have to send an additional signal to the progress output once the file reading part is complete. But I guess that's better than confusing everyone.

JsBergbau · 2022-10-22T07:15:34Z

I think the problem is that restic users are accustomed that only the files are shown that restic reads currently from disk.
So one option would be to show a message that since restic 0.15 are lot more files are shown like "Note since restic 0.15 restic parallelizes file reading and compression so more files are shown here than before. Files read concurrently from disk: $readConcurrency"

Or the other option is the cosmetic implementation which needs quite some time for no real functional gain.

MichaelEischer · 2022-10-22T10:13:52Z

So one option would be to show a message that since restic 0.15 are lot more files are shown like "Note since restic 0.15 restic parallelizes file reading and compression so more files are shown here than before. Files read concurrently from disk: $readConcurrency"

That just feels wrong. It gives the impression that we're too lazy to implement a proper status output.

I've implemented the variant which just hides all files which already have completed reading from disk. It's actually less messy to implement than I've expected.

After reading and chunking all data in a file, the FutureFile still has to wait until the FutureBlobs are completed. This was done synchronously which results in blocking the file saver and prevents the next file from being read. By replacing the FutureBlob with a callback, it becomes possible to complete the FutureFile asynchronously.

Previously, the old status text remained until it was overwritten.

As the FileSaver is asynchronously waiting for all blobs of a file to be stored, the number of active files is higher than the number of files from which restic is reading concurrently. Thus to not confuse users, only display files in the status from which restic is currently reading.

MichaelEischer · 2022-10-30T09:32:31Z

LGTM. I've slightly tweaked a comment and added the isCompleted sanity check also to completeError.

mamoit · 2023-01-13T11:49:31Z

Damn...
Just tried out release 0.15.0 and my home dir hourly backup time was cut in half!
From around ~55s to <30s for <100MB changed in a total of 500k files and 30GB.

Hats off.

MichaelEischer mentioned this pull request Oct 7, 2022

kind restic and other tools benchmark report #3917

Closed

MichaelEischer mentioned this pull request Oct 8, 2022

backup: Use buffered channels to collect backup status #3959

Merged

6 tasks

MichaelEischer force-pushed the async-futurefile-completion branch from ac575af to 16baafb Compare October 8, 2022 20:50

MichaelEischer force-pushed the async-futurefile-completion branch from 16baafb to c4d55a9 Compare October 21, 2022 19:52

MichaelEischer force-pushed the async-futurefile-completion branch from c0c614c to 0825948 Compare October 22, 2022 10:10

MichaelEischer force-pushed the async-futurefile-completion branch from 0825948 to b7c80fe Compare October 22, 2022 20:49

MichaelEischer added 4 commits October 30, 2022 10:29

ui: Properly clear lines no longer used for status

b52a8ff

Previously, the old status text remained until it was overwritten.

add changelog for faster backups with small files

a571fc4

MichaelEischer force-pushed the async-futurefile-completion branch from b7c80fe to c0f34af Compare October 30, 2022 09:29

MichaelEischer merged commit 3499c63 into restic:master Oct 30, 2022

MichaelEischer deleted the async-futurefile-completion branch October 30, 2022 17:38

This was referenced Nov 4, 2022

restic prune: Fatal: index is not complete #3999

Closed

archiver: Check that saved file does not have null IDs in content #4010

Merged

archiver: Fix race condition resulting in files containing null IDs #4019

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve archiver performance for small files #3955

Improve archiver performance for small files #3955

MichaelEischer commented Oct 7, 2022 •

edited

Loading

JsBergbau commented Oct 8, 2022

MichaelEischer commented Oct 8, 2022

JsBergbau commented Oct 8, 2022

JsBergbau commented Oct 9, 2022 •

edited

Loading

MichaelEischer commented Oct 21, 2022

JsBergbau commented Oct 21, 2022

MichaelEischer commented Oct 21, 2022

JsBergbau commented Oct 22, 2022

MichaelEischer commented Oct 22, 2022 •

edited

Loading

MichaelEischer commented Oct 30, 2022

mamoit commented Jan 13, 2023

Improve archiver performance for small files #3955

Improve archiver performance for small files #3955

Conversation

MichaelEischer commented Oct 7, 2022 • edited Loading

What does this PR change? What problem does it solve?

Was the change previously discussed in an issue or on the forum?

Checklist

JsBergbau commented Oct 8, 2022

MichaelEischer commented Oct 8, 2022

JsBergbau commented Oct 8, 2022

JsBergbau commented Oct 9, 2022 • edited Loading

MichaelEischer commented Oct 21, 2022

JsBergbau commented Oct 21, 2022

MichaelEischer commented Oct 21, 2022

JsBergbau commented Oct 22, 2022

MichaelEischer commented Oct 22, 2022 • edited Loading

MichaelEischer commented Oct 30, 2022

mamoit commented Jan 13, 2023

MichaelEischer commented Oct 7, 2022 •

edited

Loading

JsBergbau commented Oct 9, 2022 •

edited

Loading

MichaelEischer commented Oct 22, 2022 •

edited

Loading