gs: add option to set chunk size #1358
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master    #1358      +/-   ##
==========================================
- Coverage   51.18%   45.57%    -5.61%
==========================================
  Files         136      136
  Lines       13501    13528      +27
==========================================
- Hits         6911     6166     -745
- Misses       5742     6574     +832
+ Partials      848      788      -60
Continue to review full report at Codecov.
Oh, good point, thanks for digging into the problem! I'd like to avoid exposing this option to end users. Is there any harm in using 256KiB by default? From the backend, it's not possible to directly see whether the upload limit is configured, so we'd need to pass that in when opening the backend. The upload limit divided by four or eight should be a good heuristic. What do you think?
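As a rough illustration of that heuristic (purely a sketch; none of these names exist in restic), deriving the chunk size from the --limit-upload value in KiB/s could look like this:

```go
package gcsexample

const (
	minChunkSize     = 256 * 1024      // smallest chunk size the GCS client uses
	defaultChunkSize = 8 * 1024 * 1024 // the library's default chunk size
)

// chunkSizeFor derives a GCS upload chunk size (in bytes) from the upload
// limit (in KiB/s), aiming for roughly 1/8 of one second's worth of data per
// chunk. Illustrative only.
func chunkSizeFor(limitKiBPerSec int) int {
	if limitKiBPerSec <= 0 {
		return defaultChunkSize // no upload limit configured
	}
	chunk := limitKiBPerSec * 1024 / 8
	if chunk < minChunkSize {
		return minChunkSize
	}
	return chunk - chunk%minChunkSize // keep to 256KiB increments
}
```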
Actually, looking into it more, I think it might be reasonable to set the chunk size to 0 by default. The GCS library does this chunking to implement resumable uploads: it uploads the object in chunk-sized pieces and has automatic retry with exponential backoff if a chunk upload fails. The main purpose of resumable uploads is to avoid re-uploading lots of data when a request fails (the maximum re-upload is one chunk).

However, the default chunk size is 8MB, and if I understand correctly, the maximum pack size is 12MB (4MB minimum pack size + 8MB maximum blob size), so the default chunk size isn't much more than the maximum object size we'll upload anyway.

Setting the chunk size to 0 disables resumable uploads altogether and uploads the entire object in a single request. The passed Reader is then used more directly, without explicit large buffers. This wouldn't eliminate all buffering; the GCS libraries still io.Copy the Reader to an io.Pipe(), and net/http does a fair amount of io.Copy'ing itself, but I suspect this would get the best rate-limiting behavior we can easily achieve. Thoughts?
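For reference, a minimal sketch of what a single-request upload looks like with the google.golang.org/api/storage/v1 client (the service, bucket, and object names are placeholders, not restic's actual backend code):

```go
package gcsexample

import (
	"context"
	"io"

	"google.golang.org/api/googleapi"
	storage "google.golang.org/api/storage/v1"
)

// upload saves rd as an object. With googleapi.ChunkSize(0) the client skips
// resumable uploads and streams rd in a single request; without that option,
// the default 8MB chunking applies and rd is buffered one chunk at a time.
func upload(ctx context.Context, svc *storage.Service, bucket, name string, rd io.Reader) error {
	obj := &storage.Object{Name: name}
	_, err := svc.Objects.Insert(bucket, obj).
		Media(rd, googleapi.ChunkSize(0)).
		Context(ctx).
		Do()
	return err
}
```

Dropping the googleapi.ChunkSize(0) option restores the default resumable behavior.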
I think we can set the chunk size to zero. That'll also improve memory usage, and we can easily retry uploads because we've saved the data in a temp file. |
By default, the GCS Go packages have an internal "chunk size" of 8MB, used for blob uploads. Media().Do() will buffer a full 8MB from the io.Reader (or less if EOF is reached), then write that full 8MB to the network all at once.

This behavior does not play nicely with --limit-upload, which only limits the Reader passed to Media. While the long-term average upload rate will be correctly limited, the actual network bandwidth will be very spiky. For example, if an 8MB/s connection is limited to 1MB/s, Media().Do() will spend 8s reading from the rate-limited reader (performing no network requests), then 1s writing to the network at 8MB/s. This is bad for network connections hurt by full-speed uploads, particularly when writing 8MB will take several seconds.

Disable resumable uploads entirely by setting the chunk size to zero. This causes the io.Reader to be passed further down the request stack, where there is less (but still some) buffering.

My connection is around 1.5MB/s up, with nominal ~15ms ping times to 8.8.8.8. Without this change, --limit-upload 1024 results in several seconds of ~200ms ping times (uploading), followed by several seconds of ~15ms ping times (reading from the rate-limited reader). A bandwidth monitor reports this as several seconds of ~1.5MB/s followed by several seconds of 0.0MB/s. With this change, --limit-upload 1024 results in ~20ms ping times and the bandwidth monitor reports a constant ~1MB/s.

I've elected to make this change unconditionally, regardless of --limit-upload, because resumable uploads shouldn't be providing much benefit anyway: restic already uploads mostly small blobs and already has a retry mechanism.

--limit-download is not affected by this problem, as Get().Download() returns the real http.Response.Body without any internal buffering.

Updates restic#1216
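To make the interaction concrete, here is a generic sketch of a rate-limited Reader of the kind --limit-upload wraps around the upload source (using golang.org/x/time/rate; illustrative only, not restic's actual limiter code). The limiter only paces Read calls, so a consumer that first buffers a full 8MB chunk still hits the network in full-speed bursts; only the idle gaps between bursts get stretched out.

```go
package limitexample

import (
	"context"
	"io"

	"golang.org/x/time/rate"
)

// limitedReader paces Read calls to roughly bytesPerSec. It cannot smooth the
// downstream traffic if the caller buffers a whole chunk before writing it to
// the network.
type limitedReader struct {
	r   io.Reader
	lim *rate.Limiter
}

func newLimitedReader(r io.Reader, bytesPerSec int) *limitedReader {
	return &limitedReader{r: r, lim: rate.NewLimiter(rate.Limit(bytesPerSec), bytesPerSec)}
}

func (l *limitedReader) Read(p []byte) (int, error) {
	// Cap the read so the token request never exceeds the limiter's burst.
	if len(p) > l.lim.Burst() {
		p = p[:l.lim.Burst()]
	}
	n, err := l.r.Read(p)
	if n > 0 {
		if werr := l.lim.WaitN(context.Background(), n); werr != nil && err == nil {
			err = werr
		}
	}
	return n, err
}
```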
PTAL
LGTM!
By default, the GCS Go packages have an internal "chunk size" of 8MB,
used for blob uploads.
Media().Do() will buffer a full 8MB from the io.Reader (or less if EOF
is reached) then write that full 8MB to the network all at once.
This behavior does not play nicely with --limit-upload, which only
limits the Reader passed to Media. While the long-term average upload
rate will be correctly limited, the actual network bandwidth will be
very spiky.
For example, if an 8MB/s connection is limited to 1MB/s, Media().Do() will
spend 8s reading from the rate-limited reader (performing no network
requests), then 1s writing to the network at 8MB/s.
This is bad for network connections hurt by full-speed uploads,
particularly when writing 8MB will take several seconds.
Add a backend option to allow specifying the ChunkSize media option,
which sets this internal buffer size in increments of 256KB, with 256KB
as the smallest chunk size.
My connection is around 1.5MB/s up, with nominal ~15ms ping times to
8.8.8.8.
Without this change, --limit-upload 1024 results in several seconds of
~200ms ping times (uploading), followed by several seconds of ~15ms ping
times (reading from rate-limited reader).
With this change and a chunk size of 256KB, --limit-upload 1024 results
in relatively constant ~30-40ms ping times.
It may eventually make sense to automatically reduce the chunk size if
--limit-upload is set, but at this point I don't have a good heuristic
for what the size should be.
As far as I can tell, --limit-download is not affected by this problem,
as Get().Download() returns the real http.Response.Body without any
internal buffering.
Updates #1216
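As a rough illustration of the option this commit message describes, the backend would pass a configurable chunk size through to the Media call, along the lines of the following sketch (Config and its fields are hypothetical names, not the actual restic backend configuration):

```go
package gcsexample

import (
	"context"
	"io"

	"google.golang.org/api/googleapi"
	storage "google.golang.org/api/storage/v1"
)

// Config is a hypothetical backend configuration carrying the proposed option.
// ChunkSize is in bytes; it is used in 256KB increments, with 256KB as the
// smallest chunk size.
type Config struct {
	Bucket    string
	ChunkSize int
}

// save uploads rd, overriding the library's 8MB default chunk size when the
// option is set, so a rate-limited reader is drained in smaller steps.
func save(ctx context.Context, svc *storage.Service, cfg Config, name string, rd io.Reader) error {
	var opts []googleapi.MediaOption
	if cfg.ChunkSize > 0 {
		opts = append(opts, googleapi.ChunkSize(cfg.ChunkSize))
	}
	_, err := svc.Objects.Insert(cfg.Bucket, &storage.Object{Name: name}).
		Media(rd, opts...).
		Context(ctx).
		Do()
	return err
}
```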