
buffer: wake tasks waiting for channel capacity when terminating #480

Merged
4 commits merged into master on Oct 28, 2020

Conversation

@hawkw (Member) commented Oct 27, 2020

Depends on #479

#476 introduced a regression in buffer: tasks waiting for a
buffer::Service to become ready (in poll_ready) are no longer
notified when the buffer worker task terminates or the inner service
fails. This means that if a buffer is over its channel capacity, and
tasks are waiting for it to become ready again, and the buffered service
fails before sufficient channel capacity is released to wake those
tasks, those tasks are effectively leaked: they will never become ready.

This is because we are no longer using Tokio's bounded MPSC channel, and
are instead implementing our own bounded channel on top of the unbounded
MPSC using a semaphore. While the bounded MPSC sender's poll_ready
method would notify waiting tasks when the receiver was dropped, that
method no longer exists in Tokio 0.3, and we have to use an external
semaphore to provide a polling-based API for waiting for channel
capacity. The semaphore does not wake pending tasks because it must live
in an Arc in order for the Semaphore::acquire_owned method to be
usable, and thus it will never be dropped while there are waiting
tasks. Not dropping the semaphore is necessary to uphold the safety
invariants of the intrusive wait list, but it means that the tasks will
not be woken by the semaphore being dropped.
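
To make the shape of the problem concrete, here is a minimal sketch (not tower's actual code, and using Tokio 1.x names for the public `Semaphore` API) of a bounded sender built from an unbounded MPSC channel plus a semaphore. The hypothetical `BoundedSender` shows why the semaphore has to be shared in an `Arc`, and where a waiter can get stuck:

```rust
use std::sync::Arc;
use tokio::sync::{mpsc, OwnedSemaphorePermit, Semaphore};

/// Each item carries the permit that reserved its channel slot; dropping
/// the permit on the receiving side releases capacity and wakes a waiter.
type Message<T> = (T, OwnedSemaphorePermit);

struct BoundedSender<T> {
    tx: mpsc::UnboundedSender<Message<T>>,
    // `Semaphore::acquire_owned` takes `Arc<Semaphore>`, so the semaphore
    // is shared and is never dropped while tasks sit in its intrusive
    // wait list -- which also means a drop will never wake those tasks.
    semaphore: Arc<Semaphore>,
}

impl<T> BoundedSender<T> {
    async fn send(&self, value: T) -> Result<(), T> {
        // Wait for channel capacity. If the receiving side dies without
        // releasing permits (e.g. the worker terminating), this await
        // parks forever: the leak described above.
        match self.semaphore.clone().acquire_owned().await {
            Ok(permit) => self.tx.send((value, permit)).map_err(|e| (e.0).0),
            Err(_) => Err(value), // semaphore was closed
        }
    }
}
```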

This branch fixes the issue by adding the ability to wake all tasks
waiting on a semaphore: when the buffer terminates, it adds close to the
maximum allowed number of permits to the semaphore. This should be
sufficient to wake all waiters, but it's a little janky. The ideal
solution would be for Tokio to expose the close method on the internal
semaphore implementation in a public API; until then, this works without
requiring an upstream change.
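
For illustration, a hedged sketch of the workaround (again in terms of Tokio 1.x's public `Semaphore` API, not the PR's exact code): flooding the semaphore with permits so that every queued waiter can acquire one and wake up.

```rust
use tokio::sync::Semaphore;

/// Wake every task parked on the semaphore's wait list by handing out
/// as many permits as the semaphore can hold.
fn wake_all_waiters(semaphore: &Semaphore) {
    // `add_permits` panics if the total would exceed `MAX_PERMITS`, so
    // top up only the difference.
    let n = Semaphore::MAX_PERMITS - semaphore.available_permits();
    semaphore.add_permits(n);
}
```

(Tokio has since made `Semaphore::close` public, which notifies all pending waiters directly, so newer code shouldn't need this trick.)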

I've also added tests for this behavior.
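
For reference, a hedged sketch of the kind of regression test this adds (not the PR's exact test; it assumes tower 0.4's `Buffer::pair` and `tower-test`'s mock service):

```rust
use tower::{buffer::Buffer, ServiceExt};
use tower_test::mock;

#[tokio::test]
async fn pending_waiters_wake_when_worker_drops() {
    let (service, _handle) = mock::pair::<(), ()>();
    // Bound of 1: a single in-flight request exhausts the channel.
    let (mut buffer, worker) = Buffer::pair(service, 1);

    // Take the only slot so the next caller has to wait for capacity.
    buffer.ready().await.expect("buffer should be ready");
    let _in_flight = buffer.call(());

    // A clone now parks in `poll_ready`, waiting for a permit.
    let mut waiting = buffer.clone();
    let waiter = tokio::spawn(async move { waiting.ready().await.map(|_| ()) });

    // Terminating the worker must wake the waiter with an error rather
    // than leaking it.
    drop(worker);
    assert!(waiter.await.unwrap().is_err());
}
```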

Base automatically changed from eliza/trace-tests to master October 28, 2020 15:40
@LucioFranco LucioFranco merged commit 43c4492 into master Oct 28, 2020
@LucioFranco LucioFranco deleted the eliza/fix-buffer branch October 28, 2020 15:41
teor2345 added a commit to teor2345/zebra that referenced this pull request Feb 17, 2021
When other tower-batch tasks drop, wake any tasks that are waiting for
a semaphore permit. Otherwise, tower-batch can hang.

We currently pin tower in our workspace to:
d4d1c67 hedge: use auto-resizing histograms (tower-rs/tower#484)

Copy tower/src/semaphore.rs from that commit, to pick up
tower-rs/tower#480.
dconnolly pushed a commit to ZcashFoundation/zebra that referenced this pull request Feb 17, 2021
hawkw added a commit to linkerd/linkerd2-proxy that referenced this pull request Feb 17, 2021
…922)

The proxy currently has its own implementation of a `tower` `Service`
that makes an inner service `Clone`able by driving it in a spawned task
and buffering requests on a channel. This also exists upstream, as
`tower::buffer`.

We implemented our own version for a couple of reasons: to avoid an
upstream issue where memory was leaked when a buffered request was
cancelled, and to implement an idle timeout when the buffered service
has been unready for too long. However, it's no longer necessary to
reimplement our own buffer service for these reasons: the upstream bug
was fixed in `tower` 0.4 (see tower-rs/tower#476, tower-rs/tower#480,
and tower-rs/tower#556); and we no longer actually use the buffer idle
timeout (instead, we idle out unresponsive services with the separate
`Failfast` middleware; note that `push_spawn_buffer_with_idle_timeout`
is never actually used).

Therefore, we can remove our _sui generis_ implementation in favour of
`tower::buffer` from upstream. This eliminates dead code for the idle
timeout, which we never actually use, and reduces duplication (since
`tonic` uses `tower::buffer` internally, its code is already compiled
into the proxy). It also reduces the amount of code I'm personally
responsible for maintaining in two separate places ;)

Since the `linkerd-buffer` crate erases the type of the buffered
service, while `tower::buffer` does not, I've changed the
`push_spawn_buffer`/`spawn_buffer` helpers to also include a
`BoxService` layer. This required adding a `BoxServiceLayer` type, since
`BoxService::layer` returns a `LayerFn` with an unnameable type.
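
As a rough illustration of why a nameable type helps (a sketch, not linkerd2-proxy's actual definition): `BoxService::layer()` wraps a closure whose type can't be written in a struct field or type alias, whereas a hand-rolled layer like the hypothetical one below can be named anywhere a layer type parameter is needed.

```rust
use std::marker::PhantomData;
use tower::{util::BoxService, Layer, Service};

/// A nameable layer that boxes the service it wraps (a stand-in for the
/// proxy's `BoxServiceLayer`; the real definition may differ).
struct BoxServiceLayer<R>(PhantomData<fn(R)>);

impl<R> BoxServiceLayer<R> {
    fn new() -> Self {
        Self(PhantomData)
    }
}

impl<S, R> Layer<S> for BoxServiceLayer<R>
where
    S: Service<R> + Send + 'static,
    S::Future: Send + 'static,
{
    type Service = BoxService<R, S::Response, S::Error>;

    fn layer(&self, inner: S) -> Self::Service {
        BoxService::new(inner)
    }
}
```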

Also, this change ran into issues due to a compiler bug where generators
(async blocks) sometimes forget concrete lifetimes,
rust-lang/rust#64552. In order to resolve this, I had to remove the
outermost `async` blocks from the OpenCensus and identity daemon tasks.
These async blocks were used only for emitting a tracing event when the
task is started, so it wasn't a big deal to remove them; I moved the
trace events into the actual daemon task functions, and used a `tracing`
span to propagate the remote addresses which aren't known inside the
daemon task functions.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>