
Implement optional concurrent "Range" requests (refs #86) #102

Open · wants to merge 4 commits into master

Conversation

justinfx

This is an implementation of an optional feature to have a Request download the payload in multiple chunks, using a "Range" request, if supported by the server.

API updates

The Request struct gains a new field, RangeRequestMax, which, when set to a value greater than 0, controls how many chunks are downloaded in parallel using "Range" requests instead of a single request reading the full body.
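For illustration, a minimal usage sketch assuming the field lands on grab.Request as described (the URL, destination, and chunk count here are made up):

```go
package main

import (
	"fmt"
	"log"

	"github.com/cavaliergopher/grab/v3"
)

func main() {
	req, err := grab.NewRequest("archive.tar.gz", "https://example.com/archive.tar.gz")
	if err != nil {
		log.Fatal(err)
	}

	// New in this PR: download in up to 4 parallel "Range" chunks when the
	// server supports it; leaving it at 0 keeps the single-request behaviour.
	req.RangeRequestMax = 4

	resp := grab.NewClient().Do(req)
	if err := resp.Err(); err != nil {
		log.Fatal(err)
	}
	fmt.Println("saved to", resp.Filename)
}
```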

Implementation details

High level steps:

  1. RangeRequestMax is set > 0
  2. Ensure that a HEAD request is always performed, so we can verify "Range" support and get the content length
  3. Don't start the synchronous GET request in the state machine, as it will be executed in parallel later (the HEAD request is the last request performed)
  4. Transition to openWriter
  5. Use an alternate transfer implementation, called transferRanges
  6. The async copyFile works the same as before

Given the way the state machine works, it seemed easier to launch the concurrent range requests during the copy phase, instead of in the synchronous getRequest state, where a list of requests would have to be monitored.

A new transferer interface has been introduced so that a second implementation, transferRanges, can be provided.
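Roughly, the shape of that abstraction (a sketch, not the exact diff: only the method names Copy, N, and BPS come from the description below, and the signatures are guessed):

```go
// transferer abstracts the copy phase so the state machine can drive either
// the existing single-stream transfer or the new transferRanges implementation.
type transferer interface {
	Copy() (int64, error) // perform the download copy, returning bytes written
	N() int64             // bytes transferred so far
	BPS() float64         // current transfer rate in bytes per second
}
```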

transferRanges implementation

This alternate implementation handles launching the concurrent range requests, doing the copy of the data, and tracking the metrics.

It seemed easier to just pass the HEAD Response to the private constructor, as most of the needed details are present on that struct.

transferRanges.Copy() starts a number of Range requests, each with an offset and limit, in goroutines, according to the value of RangeRequestMax passed in. Each goroutine writes directly to the open file writer using WriteAt (backed by the pwrite() syscall), so chunks of data are written at their offsets to the same file descriptor in parallel. Metrics are updated atomically to keep N() and BPS() working.
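A condensed sketch of that pattern, independent of the PR's actual code (writeRange and copyRanges are hypothetical names; chunk math, cancellation, and rate metrics are simplified):

```go
package ranged

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"os"
	"sync/atomic"

	"golang.org/x/sync/errgroup"
)

// writeRange downloads bytes [off, off+size) with a Range header and writes
// them at the same offset in dst via WriteAt (pwrite under the hood).
func writeRange(ctx context.Context, c *http.Client, url string, dst *os.File, off, size int64, written *int64) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", off, off+size-1))

	resp, err := c.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusPartialContent {
		return fmt.Errorf("range not honoured: %s", resp.Status)
	}

	buf := make([]byte, 32*1024)
	for {
		n, rerr := resp.Body.Read(buf)
		if n > 0 {
			if _, werr := dst.WriteAt(buf[:n], off); werr != nil {
				return werr
			}
			off += int64(n)
			atomic.AddInt64(written, int64(n)) // feeds N()/BPS()-style metrics
		}
		if rerr == io.EOF {
			return nil
		}
		if rerr != nil {
			return rerr
		}
	}
}

// copyRanges splits length bytes into `parallel` chunks and downloads them
// concurrently, each goroutine writing to its own region of dst.
func copyRanges(ctx context.Context, c *http.Client, url string, dst *os.File, length int64, parallel int) (int64, error) {
	var written int64
	g, ctx := errgroup.WithContext(ctx)
	chunk := length / int64(parallel)
	for i := 0; i < parallel; i++ {
		off := int64(i) * chunk
		size := chunk
		if i == parallel-1 {
			size = length - off // last chunk absorbs the remainder
		}
		g.Go(func() error { return writeRange(ctx, c, url, dst, off, size, &written) })
	}
	err := g.Wait()
	return atomic.LoadInt64(&written), err
}
```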

Other details

I also updated the Client's default for "TCP_NODELAY". The Go standard library sets it to true on net.TCPConn, disabling Nagle's algorithm to favour latency-sensitive (RPC-style) workloads. In theory it makes more sense for a library dedicated to downloading files over HTTP to leave Nagle's algorithm enabled by default. It can still be controlled by the user supplying a custom HTTPClient.
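Whichever default wins, a caller can pin the behaviour themselves through a custom HTTPClient; a rough sketch (the only grab-specific piece assumed here is the existing HTTPClient field):

```go
package main

import (
	"context"
	"net"
	"net/http"
	"time"

	"github.com/cavaliergopher/grab/v3"
)

func main() {
	dialer := &net.Dialer{Timeout: 30 * time.Second}
	transport := &http.Transport{
		DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
			conn, err := dialer.DialContext(ctx, network, addr)
			if err != nil {
				return nil, err
			}
			if tcp, ok := conn.(*net.TCPConn); ok {
				// SetNoDelay(true) disables Nagle's algorithm (the Go
				// standard-library default); SetNoDelay(false) enables it.
				_ = tcp.SetNoDelay(true)
			}
			return conn, nil
		},
	}

	client := grab.NewClient()
	client.HTTPClient = &http.Client{Transport: transport}
	_ = client // use client.Do(...) as usual
}
```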

justinfx (Author) commented Jul 4, 2023

I've realised that I have left the returned Response containing the HTTPResponse from the HEAD request when performing a "Range" request. I'd be interested in feedback on this approach, since I wasn't sure of a better response to return, seeing as the parallel range requests run transparently and are written to the target file. Another option would be to clone the HEAD http response and modify it to represent the method that matches the range requests (GET, ...).

ananthb commented Aug 25, 2023

If multiple goroutines are writing to the output in parallel, will the output file have gaps in between partially written chunks? If so, how would resuming a partial download work?

justinfx (Author) commented Aug 25, 2023

@ananthb I think it is possible in the current implementation for the resume to be incorrect, in a situation where a later range happens to finish before an earlier range and the transfer is then stopped. The reason is that as each concurrent range completes writing, it atomically adds to the total bytes written. So if ranges 10, 20, and 40 finish but range 30 does not, it would report 30 total bytes written, but there would be a gap, and the file itself would look like it had 40 bytes written since it is sparse.
Maybe concurrent range failures need to truncate back to before the earliest failed range?

justinfx (Author) commented:

@ananthb I've just pushed 8b4f8d2 to address resume support with ranged requests. The file is now truncated to the end of the lowest successful contiguous range before a failure, so no gaps are left behind. This means you may lose progress on chunks that happened to finish concurrently just after the failed range.
I've updated the existing AutoResume test to also check resuming ranged requests that did not fail with a gap, and added an extra test that sets up a failure with a gap and asserts that the file is truncated (after which a normal resume works as expected).
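Conceptually, the recovery step looks something like the sketch below (all names are hypothetical; the actual change is in 8b4f8d2):

```go
package ranged

import (
	"os"
	"sort"
)

// rangeResult records the outcome of one concurrent range download
// (hypothetical type; the real bookkeeping lives inside transferRanges).
type rangeResult struct {
	off, size int64
	ok        bool
}

// truncateToContiguous trims dst so it ends at the last byte of the leading
// run of successful ranges, discarding any later chunks that completed past a
// gap, so a normal resume can safely pick up from the truncated length.
func truncateToContiguous(dst *os.File, results []rangeResult) error {
	sort.Slice(results, func(i, j int) bool { return results[i].off < results[j].off })
	var safe int64
	for _, r := range results {
		if !r.ok || r.off != safe {
			break // first failed or missing range; nothing after it is trusted
		}
		safe = r.off + r.size
	}
	return dst.Truncate(safe)
}
```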

ananthb commented Aug 30, 2023

Yep that makes sense @justinfx.

ananthb commented Aug 31, 2023

What happens if grab crashes before it can truncate the file?

If you only wrote completed chunks to the file, then you wouldn't need to truncate it in the first place. You could do that either by buffering incomplete chunks in memory or in temporary files on disk.

justinfx (Author) commented:

@ananthb yeah, if the process crashed before it could truncate, it would leave behind a file that could be corrupt and then be used for a resume. I definitely don't think we should buffer in memory, because that could have surprising resource-usage implications on large files.
The goal of using WriteAt was to avoid buffering anything and to write only a single file without any post-processing I/O. I admit I wrote this feature based on my own use case, where I don't use the resume feature at all, so I didn't think through the edge cases of supporting it.
A couple of ideas in terms of writing to files:

  • Write to a ".partial" file and rename it to the target filename after all chunks finish. This could leave behind a ".partial" file on crash, which would always be overwritten from the start on the next resume.
  • On POSIX systems, unlink the target's name and write only to the file descriptor, then hard link (linkat) the fd to a named file after all chunks finish. This would avoid leaving behind a ".partial" file on crash, since there is no longer any reference to it (see the sketch after this list).
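On Linux, that second idea could look roughly like this (a sketch only, using golang.org/x/sys/unix; it is not what the PR currently does, and the function names are made up):

```go
package ranged

import (
	"fmt"
	"os"

	"golang.org/x/sys/unix"
)

// openAnonymous opens an unnamed file in dir (Linux O_TMPFILE); it vanishes
// automatically if the process crashes before linkInto is called.
func openAnonymous(dir string) (*os.File, error) {
	fd, err := unix.Open(dir, unix.O_TMPFILE|unix.O_RDWR, 0o644)
	if err != nil {
		return nil, err
	}
	return os.NewFile(uintptr(fd), dir+"/(unnamed)"), nil
}

// linkInto gives the anonymous file a real name once all chunks have finished.
func linkInto(f *os.File, target string) error {
	src := fmt.Sprintf("/proc/self/fd/%d", f.Fd())
	return unix.Linkat(unix.AT_FDCWD, src, unix.AT_FDCWD, target, unix.AT_SYMLINK_FOLLOW)
}
```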

ananthb commented Sep 1, 2023

Yeah, in-memory buffers alone won't be enough to cover, say, large chunk sizes or many parallel chunks. I've been toying with the idea of an in-memory buffer that spills over onto disk.

My basic idea to make resume work is that the output file should always be consistent and not have any "holes". Basically write only completed chunks in order to the output file.

I could use anonymous files to buffer in-progress chunks, too.

justinfx (Author) commented Sep 1, 2023

The current implementation splits the file into chunks by the number of parallel workers, so I wonder whether your idea could avoid large memory usage; you might have to buffer a lot before a gap closes.
Maybe the single temp file that only moves on success, or the truncation approach, is easier?
But I'm interested to see what your approach produces!
