
gs: add option to set chunk size #1358

Merged: 1 commit merged into restic:master on Oct 21, 2017

Conversation

prattmic (Contributor)

By default, the GCS Go packages have an internal "chunk size" of 8MB,
used for blob uploads.

Media().Do() will buffer a full 8MB from the io.Reader (or less if EOF
is reached) then write that full 8MB to the network all at once.

This behavior does not play nicely with --limit-upload, which only
limits the Reader passed to Media. While the long-term average upload
rate will be correctly limited, the actual network bandwidth will be
very spiky.

For example, if an 8MB/s connection is limited to 1MB/s, Media().Do() will
spend 8s reading from the rate-limited reader (performing no network
requests), then 1s writing to the network at 8MB/s.

This is bad for network connections hurt by full-speed uploads,
particularly when writing 8MB will take several seconds.

Add a backend option to allow specifying the ChunkSize media option,
which sets this internal buffer size in increments of 256KB, with 256KB
as the smallest chunk size.
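
For context, here is a rough sketch of how such an option could plug into the
upload path, assuming the backend uses the google.golang.org/api/storage/v1
client that the Media().Do() call above suggests; the Backend struct, its
chunkSize field, and the insert helper are placeholders, not restic's actual
code:

```go
package gs

import (
	"io"

	"google.golang.org/api/googleapi"
	storage "google.golang.org/api/storage/v1"
)

// Backend is a placeholder for the gs backend; chunkSize holds the configured
// upload buffer size in bytes, a multiple of 256KB (googleapi.MinUploadChunkSize).
type Backend struct {
	service    *storage.Service
	bucketName string
	chunkSize  int
}

// insert uploads one object. Passing googleapi.ChunkSize overrides the
// library's default 8MB buffer used for resumable uploads.
func (be *Backend) insert(name string, rd io.Reader) error {
	info := &storage.Object{Name: name}
	_, err := be.service.Objects.Insert(be.bucketName, info).
		Media(rd, googleapi.ChunkSize(be.chunkSize)).
		Do()
	return err
}
```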

My connection is around 1.5MB/s up, with nominal ~15ms ping times to
8.8.8.8.

Without this change, --limit-upload 1024 results in several seconds of
~200ms ping times (uploading), followed by several seconds of ~15ms ping
times (reading from rate-limited reader).

With this change and a chunk size of 256KB, --limit-upload 1024 results
in relatively constant ~30-40ms ping times.

It may eventually make sense to automatically reduce the chunk size if
--limit-upload is set, but at this point I don't have a good heuristic
for what the size should be.

As far as I can tell, --limit-download is not affected by this problem,
as Get().Download() returns the real http.Response.Body without any
internal buffering.

Updates #1216
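
For comparison, the download path mentioned above can be read roughly as in
the sketch below; load and its parameters are placeholder names, but
Objects.Get(...).Download() returning the raw *http.Response matches the
behavior described:

```go
package gs

import (
	"io"

	storage "google.golang.org/api/storage/v1"
)

// load is a placeholder, not restic's actual code. Download() hands back the
// real *http.Response, so a rate-limiting wrapper around resp.Body sees the
// network stream directly, with no intermediate chunk buffer like the upload
// path has.
func load(service *storage.Service, bucket, name string) (io.ReadCloser, error) {
	resp, err := service.Objects.Get(bucket, name).Download()
	if err != nil {
		return nil, err
	}
	return resp.Body, nil
}
```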

codecov-io commented Oct 15, 2017

Codecov Report

Merging #1358 into master will decrease coverage by 5.6%.
The diff coverage is 0%.


@@            Coverage Diff             @@
##           master    #1358      +/-   ##
==========================================
- Coverage   51.18%   45.57%   -5.61%     
==========================================
  Files         136      136              
  Lines       13501    13528      +27     
==========================================
- Hits         6911     6166     -745     
- Misses       5742     6574     +832     
+ Partials      848      788      -60
Impacted Files                     Coverage      Δ
internal/backend/gs/gs.go          0% <0%>       (-65.6%)  ⬇️
internal/backend/b2/b2.go          0% <0%>       (-79.93%) ⬇️
internal/backend/azure/azure.go    0% <0%>       (-75.35%) ⬇️
internal/backend/swift/swift.go    0% <0%>       (-73.9%)  ⬇️
internal/backend/swift/config.go   34.37% <0%>   (-56.25%) ⬇️
internal/checker/checker.go        68.19% <0%>   (-3.84%)  ⬇️
internal/archiver/archiver.go      65.2% <0%>    (-0.17%)  ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 99b6163...9fa4f5e.

fd0 (Member) commented Oct 16, 2017

Oh, good point, thanks for digging into the problem!

I'd like to avoid exposing this option to end users; is there any harm in using 256KiB by default?

From the backend, it's not possible to directly see if the upload limit is configured, so we'd need to pass that in when opening the backend. The upload limit divided by four or eight should be a good heuristic.

What do you think?
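
If that route were taken, the heuristic fd0 describes above could look
something like the sketch below; chunkSizeForLimit is a hypothetical helper
(this was never implemented in the PR), and it assumes the upload limit is
available in bytes per second:

```go
// chunkSizeForLimit derives an upload chunk size from the configured upload
// limit in bytes per second: roughly limit/8, rounded down to the 256KiB
// granularity the API requires, and never below a single 256KiB chunk.
func chunkSizeForLimit(limitBytesPerSec int) int {
	const step = 256 * 1024 // googleapi.MinUploadChunkSize
	size := limitBytesPerSec / 8
	if size < step {
		return step
	}
	return size - size%step
}
```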

prattmic (Contributor, Author)

Actually, looking into it more, I think it might be reasonable to set the chunk size to 0 by default.

The GCS library does this chunking to implement resumable uploads: it uploads the object in chunks of the configured size, with automatic retry and exponential backoff if a chunk upload fails.

The main purpose of resumable uploads is to avoid re-uploading lots of data if a request fails (the maximum re-upload is one chunk). However, the default chunk size is 8MB, and if I understand correctly, the maximum pack size is 12MB (4MB min pack size + 8MB max blob size), so the largest object we'll upload isn't much bigger than the default chunk size anyway.

Setting the chunk size to 0 disables resumable uploads altogether and uploads the entire object in a single request. The passed Reader is used more directly, without explicit large buffers.

This wouldn't eliminate all buffering; the GCS libraries still io.Copy the Reader to an io.Pipe(), and net/http does a fair amount of io.Copy'ing itself, but I suspect this would get the best rate limiting behavior we can easily achieve.
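
The "chunk size 0" variant would reduce to something like this minimal sketch
(saveUnchunked and its parameters are placeholder names; the
googleapi.ChunkSize(0) media option is the only point being illustrated):

```go
package gs

import (
	"io"

	"google.golang.org/api/googleapi"
	storage "google.golang.org/api/storage/v1"
)

// saveUnchunked uploads an object in a single request. ChunkSize(0) turns off
// the resumable-upload protocol entirely, so the io.Reader is streamed into
// the request body instead of being buffered into 8MB chunks first.
func saveUnchunked(service *storage.Service, bucket, name string, rd io.Reader) error {
	info := &storage.Object{Name: name}
	_, err := service.Objects.Insert(bucket, info).
		Media(rd, googleapi.ChunkSize(0)).
		Do()
	return err
}
```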

Thoughts?

fd0 (Member) commented Oct 17, 2017

I think we can set the chunk size to zero. That'll also improve memory usage, and we can easily retry uploads because we've saved the data in a temp file.

The commit was then updated with a new commit message:

By default, the GCS Go packages have an internal "chunk size" of 8MB,
used for blob uploads.

Media().Do() will buffer a full 8MB from the io.Reader (or less if EOF
is reached) then write that full 8MB to the network all at once.

This behavior does not play nicely with --limit-upload, which only
limits the Reader passed to Media. While the long-term average upload
rate will be correctly limited, the actual network bandwidth will be
very spiky.

For example, if an 8MB/s connection is limited to 1MB/s, Media().Do() will
spend 8s reading from the rate-limited reader (performing no network
requests), then 1s writing to the network at 8MB/s.

This is bad for network connections hurt by full-speed uploads,
particularly when writing 8MB will take several seconds.

Disable resumable uploads entirely by setting the chunk size to zero.
This causes the io.Reader to be passed further down the request stack,
where there is less (but still some) buffering.

My connection is around 1.5MB/s up, with nominal ~15ms ping times to
8.8.8.8.

Without this change, --limit-upload 1024 results in several seconds of
~200ms ping times (uploading), followed by several seconds of ~15ms ping
times (reading from rate-limited reader). A bandwidth monitor reports
this as several seconds of ~1.5MB/s followed by several seconds of
0.0MB/s.

With this change, --limit-upload 1024 results in ~20ms ping times and
the bandwidth monitor reports a constant ~1MB/s.

I've elected to make this change unconditionally, rather than only when
--limit-upload is set, because resumable uploads shouldn't provide much
benefit anyway: restic already uploads mostly small blobs and already has
a retry mechanism.

--limit-download is not affected by this problem, as Get().Download()
returns the real http.Response.Body without any internal buffering.

Updates restic#1216

prattmic (Contributor, Author)

PTAL

fd0 (Member) left a comment


LGTM!

fd0 (Member) commented Oct 21, 2017

benchcmp agrees:

benchmark                                               old bytes     new bytes     delta
BenchmarkBackendGS/BenchmarkLoadFile-4                  373032        344088        -7.76%
BenchmarkBackendGS/BenchmarkLoadPartialFile-4           72112         72680         +0.79%
BenchmarkBackendGS/BenchmarkLoadPartialFileOffset-4     72336         84205         +16.41%
BenchmarkBackendGS/BenchmarkSave-4                      8581304       206912        -97.59%

fd0 merged commit 9fa4f5e into restic:master on Oct 21, 2017
fd0 added a commit that referenced this pull request Oct 21, 2017
gs: add option to set chunk size