
Handle large prunes much more efficiently #2162

Open
neeral85 opened this issue Feb 4, 2019 · 14 comments

@neeral85

commented Feb 4, 2019

Output of restic version

restic 0.9.3 compiled with go1.10.4 on linux/amd64

What should restic do differently? Which functionality do you think we should add?

In general, I like restic very much, and creating/restoring snapshots works perfectly fine.
But running restic with large repositories is almost impossible. I have a repository with 5 TB / 30 snapshots.
The intention was to use it like a circular buffer (remove the oldest, add the newest).

Adding and removing snapshots works perfectly, but once you inevitably come to prune your repository, it can take WEEKS just to free 1 TB (because of pack rewriting).
This makes it almost impossible to use restic anymore, as you can't create new snapshots during that time.

As you already mentioned here, you may be able to do something to improve this.

Example:
found 5967884 of 7336415 data blobs still in use, removing 1368531 blobs
will delete 144850 packs and rewrite 142751 packs, this frees 1.082 TiB (took 2 weeks!)

Especially on remote repositories where you have just bought storage (with SSH access) and CPU resources are limited, it's much faster to upload the whole repository again.

@pmkane

Contributor

commented Mar 1, 2019

With performance for restores for large repos/repos with large files now resolved by @ifedorenko's out-of-order branch, it looks like this is the next boulder for using restic in multi-terabyte environments where the repo isn't on local disk and isn't being accessed via the REST server on a loopback interface.

Currently, an empty prune (pruning a snapshot that was 100% identical to a previous snapshot) against a repo in an AZ-local S3 bucket on high-end AWS instances with 10gbit/sec theoretical bandwidth runs at:

  1. building new index -- ~160 packs/second
  2. finding data still in use -- 56 seconds total
  3. rewriting packs -- ~3 packs/second

160 packs/second is slow, but still tolerable (~80 minute run-time against a 3TB repo).

But the rewrite @ 3 packs/second will run for almost 10 hours, even for my noop prune (will delete 0 packs and rewrite 111098 packs, this frees 180.699 GiB). For a meaningful prune on a large repo, you freeze out new backups for 24+ hours.

It looks like the pack rewrites currently happen in a single thread, so allowing them to run across multiple workers might help quite a bit, even if the current copy-then-purge approach is maintained.

@ifedorenko

Contributor

commented Mar 2, 2019

Personally, I would not spend time optimizing the current blocking prune implementation; I think a non-blocking prune is the better long-term solution.

@cbane


commented Apr 24, 2019

I've been working on this recently. Here's what I've done:

  • Load the existing index instead of scanning all packs (and then scan any packs that weren't in the index)
  • Parallelized scanning the snapshots for used blobs
  • Parallelized rewriting partially used packs
  • Write the new index using the index info already in memory

I'm currently working to figure out the level of parallelism to use for rewriting partially used packs (I'm planning to base this on the configured number of connections for the backend). I also need to do a lot more testing in various error scenarios.

Here are some performance numbers (using a repository with 875 GiB of data, about 180,000 packs, and 36 snapshots, using a loopback rest server as the backend):

  • Current code:
    • 35-40 minutes (each) to build the index at the beginning and end of the prune (70-80 minutes total)
    • 4-5 minutes to find used blobs
    • A few seconds to rewrite partially used packs
  • My changes so far:
    • A few seconds to load the existing index (somewhat longer if it needs to scan unindexed packs)
    • Under 2 minutes to find used blobs
    • A few seconds to write the new index

I'm also planning to set up a generated test case that will involve a lot more pack rewriting.

@pmkane

Contributor

commented Apr 24, 2019

Courtney:

Sounds super promising! Wanted to confirm that this is the branch you are working in? I'm happy to test.

https://github.com/cbane/restic/tree/prune-aggressive

@cbane


commented Apr 25, 2019

No, that branch is part of the fork from the main repository. I haven't pushed my changes anywhere public yet. I should be able to push my work-in-progress version to GitHub in a few days so you can try it out.

@cbane


commented May 1, 2019

OK, I have a version that should be ready for other people to try out. It's on this branch: https://github.com/cbane/restic/tree/prune-speedup

Current limitations:

  • I haven't implemented any automatic setting of the number of repack workers. For now, set the environment variable RESTIC_REPACK_WORKERS to the number of workers you want to use. It will default to 4 if it's not set.
  • I need to work on the error handling when repacking. I didn't make any real changes from the existing single-threaded repacking; I need to look over the various error cases and make sure that doing the repacking in parallel doesn't cause any problems.
@mholt

Contributor

commented May 1, 2019

Um, that looks amazing. Thank you for your work!

@pmkane

Contributor

commented May 3, 2019

I have tested this a bit with a copy of 3TB+ repo in Amazon S3 and so far it looks amazing. A repack prune that would have taken weeks now completes in under an hour, and that's using relatively slow EBS as tmp space.

A real game changer here! Great work, @cbane!

@pmkane

Contributor

commented May 3, 2019

Eek, I realized I mistimed the run.

One area that is still single threaded and looks like it could benefit from parallelization is the "checking for packs not in index" step -- that can still take 3-4 hours in multi-terabyte repos -- but this is still a massive, massive improvement, thank you!

@pmkane

Contributor

commented May 4, 2019

@cbane I wasn't able to open an issue against your fork, so let me know if there's a better place for these.

During another test run, the repack failed at the very end (rewriting the last pack), running with 32 workers:


found 1396709 of 2257203 data blobs still in use, removing 860494 blobs
will remove 0 invalid files
will delete 119301 packs and rewrite 88485 packs, this frees 896.269 GiB
using 32 repack workers
Save(<data/c136027b25>) returned error, retrying after 723.31998ms: client.PutObject: Put https://ak-logical-db-backup.s3.dualstack.us-west-1.amazonaws.com/xxxxx: Connection closed by foreign host https://ak-logical-db-backup.s3.dualstack.us-west-1.amazonaws.com/xxxx. Retry again.
Save(<data/09d4692900>) returned error, retrying after 538.771816ms: client.PutObject: Put https://ak-logical-db-backup.s3.dualstack.us-west-1.amazonaws.com/xxxxx: Connection closed by foreign host https://ak-logical-db-backup.s3.dualstack.us-west-1.amazonaws.com/xxxxx. Retry again.
Save(<data/23d0d4f842>) returned error, retrying after 617.601934ms: client.PutObject: Put https://ak-logical-db-backup.s3.dualstack.us-west-1.amazonaws.com/xxxx: Connection closed by foreign host https://ak-logical-db-backup.s3.dualstack.us-west-1.amazonaws.com/xxxx. Retry again.
[10:02] 100.00%  88484 / 88485 packs rewritten
panic: reporting in a non-running Progress

goroutine 386596 [running]:
github.com/restic/restic/internal/restic.(*Progress).Report(0xc42011c2c0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0)
        internal/restic/progress.go:122 +0x242
github.com/restic/restic/internal/repository.Repack.func2(0x8, 0xe17f58)
        internal/repository/repack.go:66 +0x136
github.com/restic/restic/vendor/golang.org/x/sync/errgroup.(*Group).Go.func1(0xc4389246c0, 0xc56f509160)
        vendor/golang.org/x/sync/errgroup/errgroup.go:57 +0x57
created by github.com/restic/restic/vendor/golang.org/x/sync/errgroup.(*Group).Go
        vendor/golang.org/x/sync/errgroup/errgroup.go:54 +0x66
@cbane


commented May 16, 2019

I have a new version available, at the same branch as before. I also rebased against master.

Here are the main changes from the previous version:

  • Converted the repacking from having each worker do all stages of repacking a single pack to a pipeline.
  • Fixed the reported crash at the end of repacking.
  • Repacking now auto-scales the number of workers for the pipeline stages based on the number of CPUs and the configured connection limit for the backend. (The RESTIC_REPACK_WORKERS environment variable is no longer used.)
  • Minor tweaks to finding used blobs.
  • Parallelized the scanning of unknown packs.

I still want to do some more work on finding used blobs. Currently, each snapshot is processed by a single worker. This can leave CPU resources unused if there are fewer snapshots than CPUs, or if there are major size differences between snapshots. I would like to have it spread sub-tree processing across different workers; I think I know how to do this, I just need to actually implement it.

I'd lean toward continuing to discuss things on this issue (or the future pull request for this), so that everything stays here in the main repository instead of spread out.

@pmkane

Contributor

commented May 20, 2019

Just tested. It resolves the crash at the end of repacking and works really, really well.

The only additional place that could benefit from increased parallelism is the deletion of packs, which is currently single threaded.

This bites particularly hard during the first prune of a repo that was not previously prune-able, since there are a lot of packs that need deletion.

Even with single threaded deletion, however, a daily forget/prune against a 1.7TB, 356k pack repository that rewrites 14.7k packs and deletes 33k packs now takes just under 20 minutes.
Previously, it was impossible to prune at all.

Excellent work, thank you!

@cbane


commented Jun 5, 2019

OK, I have another version available. The only real change this time is it now deletes unused packs in parallel (plus a few minor tweaks to some previous changes). I implemented the modified snapshot scanning, but it didn't give enough of a speedup and there wasn't a good way to report progress to the user, so I removed it again.

I'm planning to open a pull request for this soon, assuming that nothing broke. (I'll clean up the history first, though.) @fd0, do you want to take a look at this first?

@pmkane

Contributor

commented Jun 5, 2019

Worked great in our test run. Rewrote 30k packs in 225 seconds, deleted 73k packs in 50 seconds.

Total runtime against a 1.74TiB repo in S3 with 32 surviving snapshots was just over 6 minutes.

Brilliant work.
