
gs: add option to set chunk size #1358

Merged: 1 commit merged into restic:master on Oct 21, 2017

Conversation

prattmic (Contributor)

By default, the GCS Go packages have an internal "chunk size" of 8MB,
used for blob uploads.

Media().Do() will buffer a full 8MB from the io.Reader (or less if EOF
is reached) then write that full 8MB to the network all at once.

This behavior does not play nicely with --limit-upload, which only
limits the Reader passed to Media. While the long-term average upload
rate will be correctly limited, the actual network bandwidth will be
very spiky.

For example, if an 8MB/s connection is limited to 1MB/s, Media().Do() will
spend 8s reading from the rate-limited reader (performing no network
requests), then 1s writing to the network at 8MB/s.

This is bad for network connections hurt by full-speed uploads,
particularly when writing 8MB will take several seconds.

Add a backend option to allow specifying the ChunkSize media option,
which sets this internal buffer size in increments of 256KB, with 256KB
as the smallest chunk size.
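
For context, here is a rough sketch of how such an option could plug into the
upload path, assuming the backend uses the google.golang.org/api/storage/v1
client that the Media().Do() call above suggests; the Backend struct, its
chunkSize field, and the insert helper are placeholders, not restic's actual
code:

```go
package gs

import (
	"io"

	"google.golang.org/api/googleapi"
	storage "google.golang.org/api/storage/v1"
)

// Backend is a placeholder for the gs backend; chunkSize holds the configured
// upload buffer size in bytes, a multiple of 256KB (googleapi.MinUploadChunkSize).
type Backend struct {
	service    *storage.Service
	bucketName string
	chunkSize  int
}

// insert uploads one object. Passing googleapi.ChunkSize overrides the
// library's default 8MB buffer used for resumable uploads.
func (be *Backend) insert(name string, rd io.Reader) error {
	info := &storage.Object{Name: name}
	_, err := be.service.Objects.Insert(be.bucketName, info).
		Media(rd, googleapi.ChunkSize(be.chunkSize)).
		Do()
	return err
}
```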

My connection is around 1.5MB/s up, with nominal ~15ms ping times to
8.8.8.8.

Without this change, --limit-upload 1024 results in several seconds of
~200ms ping times (uploading), followed by several seconds of ~15ms ping
times (reading from rate-limited reader).

With this change and a chunk size of 256KB, --limit-upload 1024 results
in relatively constant ~30-40ms ping times.

It may eventually make sense to automatically reduce the chunk size if
--limit-upload is set, but at this point I don't have a good heuristic
for what the size should be.

As far as I can tell, --limit-download is not affected by this problem,
as Get().Download() returns the real http.Response.Body without any
internal buffering.

Updates #1216
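
For comparison, the download path mentioned above can be read roughly as in
the sketch below; load and its parameters are placeholder names, but
Objects.Get(...).Download() returning the raw *http.Response matches the
behavior described:

```go
package gs

import (
	"io"

	storage "google.golang.org/api/storage/v1"
)

// load is a placeholder, not restic's actual code. Download() hands back the
// real *http.Response, so a rate-limiting wrapper around resp.Body sees the
// network stream directly, with no intermediate chunk buffer like the upload
// path has.
func load(service *storage.Service, bucket, name string) (io.ReadCloser, error) {
	resp, err := service.Objects.Get(bucket, name).Download()
	if err != nil {
		return nil, err
	}
	return resp.Body, nil
}
```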

codecov-io commented Oct 15, 2017

Codecov Report

Merging #1358 into master will decrease coverage by 5.6%.
The diff coverage is 0%.


@@            Coverage Diff             @@
##           master    #1358      +/-   ##
==========================================
- Coverage   51.18%   45.57%   -5.61%     
==========================================
  Files         136      136              
  Lines       13501    13528      +27     
==========================================
- Hits         6911     6166     -745     
- Misses       5742     6574     +832     
+ Partials      848      788      -60
Impacted Files                     Coverage      Δ
internal/backend/gs/gs.go          0% <0%>       (-65.6%)  ⬇️
internal/backend/b2/b2.go          0% <0%>       (-79.93%) ⬇️
internal/backend/azure/azure.go    0% <0%>       (-75.35%) ⬇️
internal/backend/swift/swift.go    0% <0%>       (-73.9%)  ⬇️
internal/backend/swift/config.go   34.37% <0%>   (-56.25%) ⬇️
internal/checker/checker.go        68.19% <0%>   (-3.84%)  ⬇️
internal/archiver/archiver.go      65.2% <0%>    (-0.17%)  ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 99b6163...9fa4f5e.

fd0 (Member) commented Oct 16, 2017

Oh, good point, thanks for digging into the problem!

I'd like to avoid exposing this option to end users; is there any harm in using 256KiB by default?

From the backend, it's not possible to directly see if the upload limit is configured, so we'd need to pass that in when opening the backend. The upload limit divided by four or eight should be a good heuristic.

What do you think?
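
If that route were taken, the heuristic fd0 describes above could look
something like the sketch below; chunkSizeForLimit is a hypothetical helper
(this was never implemented in the PR), and it assumes the upload limit is
available in bytes per second:

```go
// chunkSizeForLimit derives an upload chunk size from the configured upload
// limit in bytes per second: roughly limit/8, rounded down to the 256KiB
// granularity the API requires, and never below a single 256KiB chunk.
func chunkSizeForLimit(limitBytesPerSec int) int {
	const step = 256 * 1024 // googleapi.MinUploadChunkSize
	size := limitBytesPerSec / 8
	if size < step {
		return step
	}
	return size - size%step
}
```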

prattmic (Contributor, Author)

Actually, looking into it more, I think it might be reasonable to set the chunk size to 0 by default.

The GCS library does this chunking to implement resumable uploads: it uploads the object in chunks of the configured size, with automatic retry and exponential backoff if a chunk upload fails.

The main purpose of resumable uploads is to avoid re-uploading lots of data if a request fails (the maximum re-upload is one chunk). However, the default chunk size is 8MB, and if I understand correctly, the maximum pack size is 12MB (4MB min pack size + 8MB max blob size), so the largest object we'll upload isn't much bigger than the default chunk size anyway.

Setting the chunk size to 0 disables resumable uploads altogether and uploads the entire object in a single request. The passed Reader is used more directly, without explicit large buffers.

This wouldn't eliminate all buffering; the GCS libraries still io.Copy the Reader to an io.Pipe(), and net/http does a fair amount of io.Copy'ing itself, but I suspect this would get the best rate limiting behavior we can easily achieve.
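
The "chunk size 0" variant would reduce to something like this minimal sketch
(saveUnchunked and its parameters are placeholder names; the
googleapi.ChunkSize(0) media option is the only point being illustrated):

```go
package gs

import (
	"io"

	"google.golang.org/api/googleapi"
	storage "google.golang.org/api/storage/v1"
)

// saveUnchunked uploads an object in a single request. ChunkSize(0) turns off
// the resumable-upload protocol entirely, so the io.Reader is streamed into
// the request body instead of being buffered into 8MB chunks first.
func saveUnchunked(service *storage.Service, bucket, name string, rd io.Reader) error {
	info := &storage.Object{Name: name}
	_, err := service.Objects.Insert(bucket, info).
		Media(rd, googleapi.ChunkSize(0)).
		Do()
	return err
}
```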

Thoughts?

fd0 (Member) commented Oct 17, 2017

I think we can set the chunk size to zero. That'll also improve memory usage, and we can easily retry uploads because we've saved the data in a temp file.

The commit was then updated with a new commit message:

By default, the GCS Go packages have an internal "chunk size" of 8MB,
used for blob uploads.

Media().Do() will buffer a full 8MB from the io.Reader (or less if EOF
is reached) then write that full 8MB to the network all at once.

This behavior does not play nicely with --limit-upload, which only
limits the Reader passed to Media. While the long-term average upload
rate will be correctly limited, the actual network bandwidth will be
very spiky.

For example, if an 8MB/s connection is limited to 1MB/s, Media().Do() will
spend 8s reading from the rate-limited reader (performing no network
requests), then 1s writing to the network at 8MB/s.

This is bad for network connections hurt by full-speed uploads,
particularly when writing 8MB will take several seconds.

Disable resumable uploads entirely by setting the chunk size to zero.
This causes the io.Reader to be passed further down the request stack,
where there is less (but still some) buffering.

My connection is around 1.5MB/s up, with nominal ~15ms ping times to
8.8.8.8.

Without this change, --limit-upload 1024 results in several seconds of
~200ms ping times (uploading), followed by several seconds of ~15ms ping
times (reading from rate-limited reader). A bandwidth monitor reports
this as several seconds of ~1.5MB/s followed by several seconds of
0.0MB/s.

With this change, --limit-upload 1024 results in ~20ms ping times and
the bandwidth monitor reports a constant ~1MB/s.

I've elected to make this change unconditionally, rather than only when
--limit-upload is set, because resumable uploads shouldn't provide much
benefit anyway: restic already uploads mostly small blobs and already has
a retry mechanism.

--limit-download is not affected by this problem, as Get().Download()
returns the real http.Response.Body without any internal buffering.

Updates restic#1216

prattmic (Contributor, Author)

PTAL

fd0 (Member) left a comment


LGTM!

fd0 (Member) commented Oct 21, 2017

benchcmp agrees:

benchmark                                               old bytes     new bytes     delta
BenchmarkBackendGS/BenchmarkLoadFile-4                  373032        344088        -7.76%
BenchmarkBackendGS/BenchmarkLoadPartialFile-4           72112         72680         +0.79%
BenchmarkBackendGS/BenchmarkLoadPartialFileOffset-4     72336         84205         +16.41%
BenchmarkBackendGS/BenchmarkSave-4                      8581304       206912        -97.59%

fd0 merged commit 9fa4f5e into restic:master on Oct 21, 2017
fd0 added a commit that referenced this pull request Oct 21, 2017
gs: add option to set chunk size