Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make IO-parallelism configurable #322

Open
episource opened this Issue Oct 24, 2015 · 11 comments

Comments

Projects
None yet
8 participants
@episource
Copy link
Contributor

episource commented Oct 24, 2015

Restic version: 5de36df
Observed behavior:
IO-parallelism is hard coded. E. g. defaultParallelism = 40 (checker.go:50).

The hard coded parallelism might be well chosen for solid state disks, but is not always benefical and could slow down IO-operations or cause other issues. In practice I'm observing issues checking a repository on a samba/cifs mounted network location: IO errors due to NT_STATUS_INSUFF_SERVER_RESOURCES. Reducing defaultParallelism (checker.go:50) solves these issues but requires code changes.

Expected behavior:
IO-Parallelism should be configurable with a command line parameter. E. g. restic --max-parallel-io 10 [...]. This option should be available for backup, restore and check operations.

Related: Reducing IO parallelism might also resolve #313. Exhausting resource consumption as reported in #300, #316, #298 might also be related to high IO-parallelism.

@fw42

This comment has been minimized.

Copy link
Member

fw42 commented Oct 24, 2015

Instead of exposing this to the user with a configuration parameter, I think it would be more useful to try to come up with an (automatic) way of determining an ideal value. This needs to be a sane (potentially automatic) default, but most users can't make an educated choice for this value by themselves and it will just make the command line user interface more confusing, imho.

@episource

This comment has been minimized.

Copy link
Contributor Author

episource commented Oct 24, 2015

Consider splitting help output into a reduced basic and an advanced section: restic help shows a reduced basic help text where as restic help --advanced shows all available options. Any user happy with restic's defaults reads the reduced help section, only. Could be applied to #321, too.

@fw42

This comment has been minimized.

Copy link
Member

fw42 commented Oct 24, 2015

"Advanced settings" are a total anti-pattern, in my opinion. Sane defaults and convention over configuration are much more useful and lead to better usability in the long-term, in my experience.

@fd0

This comment has been minimized.

Copy link
Member

fd0 commented Oct 25, 2015

I also don't like "advanced settings", however tempting that may be. I see you point @episource that there should be a way to either manually or automatically select the IO parallelism. #313 is not affected because the fuse backend doesn't limit the amount of parallel IO operations, that comes from the kernel.

@julianbrost

This comment has been minimized.

Copy link

julianbrost commented Feb 13, 2018

We recently into the same issue on a VM host which doesn't have too much spare I/O performance. Running restic with its default parallelism (which now is 10) results in a significant overall performance degradation.

I don't think that automatically determining a level of parallelism will ever work in all cases, as one could try to optimize for fast backups (higher parallelism as long as the storage can handle it) or minimal impact on I/O performance (maybe even parallelism = 1).

Do you have any strong objections against adding a command line flag now? It could still be combined with some auto-detect mechanism for the default value in the future. If there's no way to control the level of parallelism, we would have to decide between maintaining a patched version of restic or not using it at all.

@pvgoran

This comment has been minimized.

Copy link

pvgoran commented Feb 14, 2018

@julianbrost Did you try assigning IO priority to the restic process?

@julianbrost

This comment has been minimized.

Copy link

julianbrost commented Feb 14, 2018

@pvgoran Yes, tried to run it with ionice. Solves the issue with high I/O load but the backup takes forever (about 24 hours instead of 5).

@fd0

This comment has been minimized.

Copy link
Member

fd0 commented Feb 25, 2018

Do you have any strong objections against adding a command line flag now?

Yes: first we need to finish the new archiver code (see #1494), then we can tackle this. I'll keep it in mind while writing the code.

We're collecting samples for IO parallelism here: borgbackup/borg#3500 If you like, run the test program on your server and add it to the issue with a bit of description. Thanks!

@alphapapa

This comment has been minimized.

Copy link

alphapapa commented Aug 19, 2018

When my desktop system is busy, and my scheduled restic check job kicks off, even though I use nice -n 19 ionice -c3 restic check ..., it still affects my system's interactive responsiveness, because every core is saturated, as well as I/O.

Automatic configuration is good for default behavior, but it is not a substitute for user configuration. I need to be able to manually limit the number of threads.

@pmkane

This comment has been minimized.

Copy link
Contributor

pmkane commented Mar 1, 2019

Just a +1. The defaults for fileReadConcurrency and SaveBlobConcurrency aren't high enough for our systems.

Similarly, workerCount in restorer/filerestorer.go requires bumping up to extract the maximum performance from high throughput network interfaces and disks.

@whereisaaron

This comment has been minimized.

Copy link

whereisaaron commented Mar 3, 2019

I agree @pmkane I think the hard-coded workerCount = 8 is artificially slowing down our restores.

I don't think we should bust out any extra command line options until we are sure that they are useful and effective for everyday users. But we should be able to add various experimental --options keywords to enable runtime configuration of anything that might be useful for tuning. We could even prefix them so everyone knows the option could get deprecated, e.g. '--option restore.experimental.workercount=32`. The option names don't have to be short or pretty. Us tinkerers just want to be able experiment with tuning without recompiling every time.

@fd0 would you consider attaching tuning settings like this to --options keywords rather than hard-coding them? With just a sanity range check and a warning that these are 'at your own risk' settings?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.