refactor read_many to take a stream of iovecs as input #416
Merged
Conversation
duarten reviewed Sep 23, 2021
Looks very nice, some small comments.
glommer approved these changes Sep 25, 2021
Looks great to me.
Once you and Duarte are on the same page about the comments, we can merge.
HippoBaro force-pushed the fair_many_read branch 2 times, most recently from e4798ec to e01758e on September 27, 2021 17:19
duarten approved these changes Sep 27, 2021
The code performs less work and doesn't require the iovec to be `Copy`. It also removes the dependency on the `itertools` crate, which is very nice.
Instead of taking an iterator of (offsets, positions), make it a stream. The reason for doing this is that using an iterator forces us to schedule all the IO synchronously at call time (or keep it all in memory, which isn't great either). While this is fine when dealing with a small number of requests (and marginally more efficient), it becomes problematic as the number of IO requests increases. Specifically, doing this may starve other tasks performing IO in the same task queue, since the reactor submits IO requests to the device in FIFO order. Therefore, if two concurrent tasks call `read_many`, they will race, and all the IO submitted by the first task will be processed before a single IO from the second reaches the IO ring. This is obviously wrong, as we want some amount of fairness even within a common task queue. Another use case is performing parallel calls to `read_many` with the intention of zipping the resulting streams. For instance, if some data is spread across many files, we may want to fetch it in parallel and combine the results asynchronously.

One caveat of the current state of the code is that the stream creates only one `Source` at a time. Therefore, this refactoring temporarily removes all IO concurrency within a `read_many` invocation. The next commits will reintroduce concurrency.
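As a rough illustration of the call-site difference (the `schedule_from_*` helpers below are hypothetical and deliberately simplified; the real glommio `read_many` signature differs), an iterator forces every iovec to exist up front, while a stream lets them be produced lazily and interleaved with other work:

```rust
// Sketch only: hypothetical helpers, not glommio's API.
use futures::executor::block_on;
use futures::stream::{self, Stream, StreamExt};

// Old shape: an iterator means every (offset, size) pair must exist (or be
// computed synchronously) at call time.
fn schedule_from_iter<I: IntoIterator<Item = (u64, usize)>>(iovecs: I) {
    for (offset, size) in iovecs {
        println!("submit read: offset={offset}, size={size}");
    }
}

// New shape: a stream lets iovecs be produced lazily and asynchronously,
// so scheduling can interleave with other tasks on the same task queue.
async fn schedule_from_stream<S: Stream<Item = (u64, usize)> + Unpin>(mut iovecs: S) {
    while let Some((offset, size)) = iovecs.next().await {
        println!("submit read: offset={offset}, size={size}");
    }
}

fn main() {
    schedule_from_iter(vec![(0, 512), (4096, 512)]);
    block_on(schedule_from_stream(stream::iter(vec![(0, 512), (4096, 512)])));
}
```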
The `read_many` stream now eagerly consumes the input stream until that stream yields or until we reach 128 in-flight sources, allowing a maximum of 128 sources to run concurrently. After that, the stream waits until the first source is fulfilled. 128 is a constant for now, chosen to match the depth of our io_uring submission queues; a further improvement would be to make it dynamic, to maximize throughput and lower IO latency on the fly. Baby steps though, one thing at a time.
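The closest off-the-shelf analogue of this behaviour is `buffer_unordered` from the `futures` crate. The sketch below is not glommio's implementation, and `fake_read` is a stand-in for a real DMA read, but it shows the same bounded fan-out of up to 128 concurrent requests:

```rust
use futures::executor::block_on;
use futures::stream::{self, StreamExt};

const MAX_IN_FLIGHT: usize = 128; // chosen to match the io_uring SQ depth

// Stand-in for an actual DMA read.
async fn fake_read(_offset: u64, size: usize) -> Vec<u8> {
    vec![0u8; size]
}

fn main() {
    let iovecs = (0..1024u64).map(|i| (i * 4096, 4096usize));
    // Up to 128 reads are in flight at once; a new one starts as soon as
    // any in-flight read completes, and results arrive in completion order.
    let results: Vec<Vec<u8>> = block_on(
        stream::iter(iovecs)
            .map(|(offset, size)| fake_read(offset, size))
            .buffer_unordered(MAX_IN_FLIGHT)
            .collect(),
    );
    assert_eq!(results.len(), 1024);
}
```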
It is never okay to delay IO requests, so if the IO request stream yields Poll::Pending, try to flush the merger so that we avoid starving the IO reactor.
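A minimal sketch of the flush-on-`Pending` idea, with `ReadManyState`, `merger`, and `flush` as illustrative stand-ins rather than glommio's internals:

```rust
use std::pin::Pin;
use std::task::{Context, Poll};

use futures::Stream;

struct ReadManyState<S> {
    input: S,
    merger: Vec<(u64, usize)>, // requests waiting to be merged/submitted
}

impl<S: Stream<Item = (u64, usize)> + Unpin> ReadManyState<S> {
    // Pull from the input as long as it is Ready, but the moment it returns
    // Pending (or ends), flush whatever has accumulated instead of holding
    // on to it and starving the reactor.
    fn poll_fill(&mut self, cx: &mut Context<'_>) {
        loop {
            match Pin::new(&mut self.input).poll_next(cx) {
                Poll::Ready(Some(iov)) => self.merger.push(iov),
                Poll::Ready(None) | Poll::Pending => {
                    self.flush();
                    return;
                }
            }
        }
    }

    fn flush(&mut self) {
        for (offset, size) in self.merger.drain(..) {
            // submit the (possibly merged) request to the reactor here
            let _ = (offset, size);
        }
    }
}

fn main() {
    let mut state = ReadManyState {
        input: futures::stream::iter(vec![(0u64, 512usize), (4096, 512)]),
        merger: Vec::new(),
    };
    let waker = futures::task::noop_waker();
    let mut cx = Context::from_waker(&waker);
    state.poll_fill(&mut cx); // pulls both iovecs, then flushes on Ready(None)
}
```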
Don't acquire a strong reference to the reactor each time we want to do IO; do it once and keep the reference around until we are done with it.
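Roughly, the change amounts to upgrading the weak reference once per batch instead of once per request. The `Reactor` type and helper functions below are hypothetical, just to show the before/after shape:

```rust
use std::rc::{Rc, Weak};

struct Reactor;

impl Reactor {
    fn submit(&self, offset: u64, size: usize) {
        let _ = (offset, size); // stand-in for queueing an IO request
    }
}

// Before: upgrade the weak reference on every request.
fn submit_upgrading_each_time(reactor: &Weak<Reactor>, iovecs: &[(u64, usize)]) {
    for &(offset, size) in iovecs {
        let strong = reactor.upgrade().expect("reactor gone");
        strong.submit(offset, size);
    }
}

// After: upgrade once and reuse the strong handle for the whole batch.
fn submit_upgrading_once(reactor: &Weak<Reactor>, iovecs: &[(u64, usize)]) {
    let strong = reactor.upgrade().expect("reactor gone");
    for &(offset, size) in iovecs {
        strong.submit(offset, size);
    }
}

fn main() {
    let reactor = Rc::new(Reactor);
    let weak = Rc::downgrade(&reactor);
    submit_upgrading_each_time(&weak, &[(0, 512)]);
    submit_upgrading_once(&weak, &[(0, 512), (4096, 512)]);
}
```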
HippoBaro force-pushed the fair_many_read branch from e01758e to 9f3badd on September 27, 2021 19:18