refactor read_many to take a stream of iovecs as input #416

Merged

merged 6 commits into DataDog:master from fair_many_read on Sep 27, 2021

Conversation

HippoBaro
Member

Instead of taking an iterator of (offset, position) pairs, make it a stream.

The reason behind doing this is that using an iterator forces us to
schedule all the IO synchronously at call time (or keep it all in memory,
which isn't great either). While this is fine (and marginally more
efficient) when dealing with a small number of requests, it becomes
problematic as the number of IO requests increases.

Specifically, doing this may starve other tasks performing IO in the same
task queue, since the reactor submits IO requests to the device in FIFO
order. Therefore, if two concurrent tasks call read_many, they will
race, and all the IO submitted by the first task will be processed before
a single IO from the second reaches the IO ring. This is obviously wrong,
as we want some amount of fairness even within a common task queue.

Another use case is performing parallel calls to read_many with
the intention of zipping the resulting streams. For instance, if some
data is spread across many files, we may want to fetch it in parallel and
combine the results asynchronously.
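
As a minimal, self-contained sketch of that zipping use case (the glommio calls themselves are elided here; `futures::stream::iter` stands in for the two streams a pair of `read_many` calls would return):

```rust
use futures::stream::{self, StreamExt};

async fn combine_two_sources() {
    // Imagine these are the buffers yielded by `file_a.read_many(..)` and
    // `file_b.read_many(..)`, with IO for both files in flight concurrently.
    let from_file_a = stream::iter(vec![b"a0".to_vec(), b"a1".to_vec()]);
    let from_file_b = stream::iter(vec![b"b0".to_vec(), b"b1".to_vec()]);

    // Zip them so each pair of reads is combined as soon as both complete,
    // without waiting for either stream to finish entirely.
    let mut zipped = from_file_a.zip(from_file_b);
    while let Some((a, b)) = zipped.next().await {
        let _record = [a, b].concat(); // combine the two chunks asynchronously
    }
}
```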

The read_many stream now eagerly consumes the input stream until
that stream yields Poll::Pending or until we reach 128 in-flight sources.
This allows a maximum of 128 sources to run concurrently. After that, the
stream waits until the first source is fulfilled.
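
The same bounded-eagerness pattern can be sketched with stock stream combinators. This is only an analogy for how the stream behaves, not glommio's actual implementation; `issue_read` and the parameter types are made up for the example:

```rust
use futures::{Stream, StreamExt};

/// Chosen to match the depth of the io_uring submission queues.
const MAX_INFLIGHT: usize = 128;

// Stand-in for issuing a single read; in glommio this would create a
// `Source` for the (offset, size) pair and await its completion.
async fn issue_read(_offset: u64, size: usize) -> Vec<u8> {
    vec![0u8; size] // placeholder buffer
}

async fn read_many_sketch(iovs: impl Stream<Item = (u64, usize)> + Unpin) {
    // `buffered` eagerly pulls from `iovs`, keeping at most MAX_INFLIGHT
    // reads in flight and yielding results in submission order. Once the
    // limit is reached, it waits for the oldest read to finish before
    // admitting another one.
    let mut results = iovs
        .map(|(offset, size)| issue_read(offset, size))
        .buffered(MAX_INFLIGHT);

    while let Some(buf) = results.next().await {
        let _ = buf; // hand the buffer back to the caller
    }
}
```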

The limit of 128 was chosen to match the depth of our io_uring
submission queues. Although it is a constant right now, a further
improvement would be to make it dynamic so as to maximize throughput and
lower IO latency on the fly.

Baby steps!

Member

@duarten duarten left a comment


Looks very nice, some small comments.

Review threads (resolved): glommio/src/io/immutable_file.rs, glommio/src/io/bulk_io.rs
Collaborator

@glommer glommer left a comment


Looks great to me.

Once you and Duarte are on the same page on the comments, we can merge.

@HippoBaro HippoBaro force-pushed the fair_many_read branch 2 times, most recently from e4798ec to e01758e on September 27, 2021 17:19
The code performs less work and doesn't require the iovec to be `Copy`.
It also removes our dependency on the `itertools` crate, which is very
nice.

One caveat of the current state of the code is that the stream will
create only one `Source` at a time. Therefore, this refactoring
temporarily removes all IO concurrency within a `read_many` invocation.
The next commits will reintroduce concurrency.

It is never okay to delay IO requests, so if the IO request stream
yields Poll::Pending, try to flush the merger so that we avoid
starving the IO reactor.
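
A rough sketch of that rule, with an invented `Merger` and `MergingStream` standing in for glommio's actual request-merging machinery (the names and fields here are illustrative, not the real types):

```rust
use std::pin::Pin;
use std::task::{Context, Poll};

use futures::Stream;

// Placeholder for the request merger: it accumulates iovecs so that
// adjacent requests can be coalesced before submission.
struct Merger {
    pending: Vec<(u64, usize)>,
}

impl Merger {
    fn flush(&mut self) {
        // Coalesce adjacent requests and submit them to the reactor (elided).
        self.pending.clear();
    }
}

struct MergingStream<S> {
    input: S,
    merger: Merger,
}

impl<S: Stream<Item = (u64, usize)> + Unpin> MergingStream<S> {
    fn poll_fill(&mut self, cx: &mut Context<'_>) {
        loop {
            match Pin::new(&mut self.input).poll_next(cx) {
                // More iovecs are available right now: keep accumulating so
                // adjacent requests can be merged before submission.
                Poll::Ready(Some(iov)) => self.merger.pending.push(iov),
                // The input is exhausted or returned Pending: never sit on
                // requests we already have; flush them to the reactor right
                // away instead of waiting for more input to merge with.
                Poll::Ready(None) | Poll::Pending => {
                    self.merger.flush();
                    return;
                }
            }
        }
    }
}
```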

Don't acquire a strong reference to the reactor each time we want to do
IO; do it once and keep the reference around until we are done with it.
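
A minimal sketch of that change, using std Rc/Weak with an invented `Reactor` placeholder (not glommio's real reactor type):

```rust
use std::rc::{Rc, Weak};

struct Reactor; // placeholder for the real reactor

struct ReadManyState {
    // Upgrade the weak handle once, when the stream is created...
    reactor: Rc<Reactor>,
}

impl ReadManyState {
    fn new(weak: &Weak<Reactor>) -> Self {
        Self {
            reactor: weak.upgrade().expect("reactor already dropped"),
        }
    }

    fn submit(&self, _offset: u64, _size: usize) {
        // ...and reuse that strong reference for every IO submission,
        // instead of calling `upgrade()` on each request.
        let _reactor = &self.reactor;
    }
}
```
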
@HippoBaro HippoBaro merged commit bac65bb into DataDog:master Sep 27, 2021
@HippoBaro HippoBaro deleted the fair_many_read branch September 27, 2021 19:31