
Add backpressure capabilities to Service #6

Merged 5 commits into master on Sep 27, 2017
Conversation

@carllerche (Member) commented Sep 15, 2017

Currently, Service does not provide a mechanism by which it can signal to the
caller that it is at capacity. This commit adds a poll_ready function to the
Service trait. Callers are able to first check poll_ready before calling
Service::call.

poll_ready is expected to be a hint and will be implemented in a best-effort
fashion: a Service may return Ready from poll_ready and still fail on the
next invocation of Service::call.

I will be adding more docs and examples before merging the PR.

Unresolved questions

  • poll_ready currently has an Err variant. What should this signify and is it needed at all?
  • Should the poll_ready error be the same as call?
  • Should there be a default poll_ready impl that always returns ready?
  • Should Service take &mut self or &self?

Fixes #3
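To make the proposed calling convention concrete, here is a minimal caller-side sketch in futures 0.1 terms, assuming futures::Async and the Service trait are in scope; the Dispatch enum and try_dispatch helper are illustrative names, not part of this PR:

enum Dispatch<S: Service> {
    Called(S::Future),
    NotReady(S::Request),
}

fn try_dispatch<S: Service>(
    service: &mut S,
    req: S::Request,
) -> Result<Dispatch<S>, S::Error> {
    match service.poll_ready()? {
        // Capacity advertised. Per the PR this is only a best-effort hint,
        // so the call itself may still fail.
        Async::Ready(()) => Ok(Dispatch::Called(service.call(req))),
        // At capacity: hand the request back; poll_ready has registered the
        // current task to be woken when capacity frees up.
        Async::NotReady => Ok(Dispatch::NotReady(req)),
    }
}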

@danburkert

The combination of poll_ready not being a guarantee, and not having a Poll-style NotReady return variant from call pretty much guarantees that Service implementations' error type will have to include a NotReady(Request), no? I think this is going to be very boilerplate-y to handle both in the service impl and in the caller.

@carllerche (Member Author)

@danburkert well, I would say that services that can't guarantee poll_ready can return the request in the error slot if possible, or in some cases the request will be lost. This will be up to the service implementation to do what fits best.
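Sketched out, that suggestion might look like the following; the Error type and its variants are hypothetical, not from the PR:

// A service that can't guarantee poll_ready can hand the rejected request
// back in its error type, so the caller may retry it later.
pub enum Error<Request> {
    // The service was at capacity; the request is returned intact.
    AtCapacity(Request),
    // The transport failed; in this case the request may be lost.
    Io(std::io::Error),
}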


fn poll_ready(&mut self) -> Poll<(), Error> {
    self.tx.lock().unwrap().poll_ready()
        .map_err(|_| Error::AtCapacity)
}

Collaborator:

The semantics of poll_ready are that Async::NotReady means at-capacity, but capacity can free up in the future. So, should this use poll_ready().or(Ok(Async::NotReady))?
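For reference, the suggested variant would look roughly like this, treating the channel error as transient rather than fatal:

fn poll_ready(&mut self) -> Poll<(), Error> {
    // Report a send-side error as NotReady instead of Err, on the reading
    // that at-capacity is transient and may clear on a later poll.
    self.tx.lock().unwrap().poll_ready()
        .or(Ok(Async::NotReady))
}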

/// This is a **best effort** implementation. False positives are permitted.
/// It is permitted for the service to return `Ready` from a `poll_ready`
/// call even if the next invocation of `call` results in an error.
fn poll_ready(&mut self) -> Poll<(), Self::Error>;
Collaborator:

The Err variant is to mean that the Service is in an unrecoverable state, and will never be ready again?

@carllerche (Member Author):

I don't know if it means that it is in a permanently unrecoverable state... maybe it does...

It could be used to signal shutdown maybe. I'd say it is a TBD.

@carllerche (Member Author):

I added a list of unresolved questions to the PR.

#[derive(Debug)]
pub struct ChannelService {
    // Send the request and a oneshot Sender to push the response into.
    tx: Mutex<Sender>,
    // ...
}

This seems like a big issue. I just looked into why call takes &self and not &mut self and found tokio-rs/tokio-service#9 (comment). That issue points out that taking &mut self means chained calls require wrapping the second service in Arc<Mutex<S>>, but this example has to use a Mutex, and it's not even chaining. Perhaps we should re-open the &/&mut discussion, given this example? In general it seems strange to me that Service takes &self when everything else in the futures stack takes &mut self (Future::poll, Sink::start_send).

@carllerche (Member Author):

Can you elaborate on what you see as a big issue?

Mutex here isn't strictly necessary, but it does make it Sync. You could use RefCell and it would be !Sync; honestly, try_send is safe to call w/o any coordination on a single thread, but the signature takes &mut self right now.

Seems to me that it's going to be overwhelmingly common to use Mutex or RefCell internally in Service impls because call takes &self (will Service impls really be stateless? Not the interesting ones). I think it may be better to have call take &mut self, and force the caller to do external synchronization, if necessary. In this case, the caller could wrap the ChannelService in a Mutex/RefCell, or just clone it as necessary.

The main argument against &mut self, as I understand it in the linked comment, is that it makes it more difficult to create services which internally chain calls to other services. But as this example shows, &self is making it difficult in an even simpler case.
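A minimal sketch of that position, assuming call takes &mut self; the dispatch helper is hypothetical:

use std::cell::RefCell;

// With call(&mut self, ...), a single-threaded caller that shares a service
// pushes the synchronization outward, e.g. with RefCell; a multi-threaded
// caller would reach for Mutex instead.
fn dispatch<S: Service>(shared: &RefCell<S>, req: S::Request) -> S::Future {
    // borrow_mut() panics on re-entrant borrows: the single-threaded
    // analogue of taking a lock.
    shared.borrow_mut().call(req)
}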

@carllerche (Member Author):

I think you have a valid point. The use case of chaining multiple calls together is going to be more complicated w/ back pressure handling anyway...

I think it is worth reconsidering &self / &mut self. The answer isn't obviously clear to me right now.

Member:


> The main argument against &mut self, as I understand it in the linked comment, is that it makes it more difficult to create services which internally chain calls to other services.

If the middleware trait in #2 is merged, I feel like that would make it less important to be able to reasonably write a service that calls other services?

@alexcrichton left a comment:

Unfortunately I'm not too familiar with the problems and design space once we hit the service layer, so I may not be too helpful in review here :(

@olix0r (Collaborator) left a comment:

So far, lgtm. I've updated a project to use the Ready api and everything feels fine so far.

My 2c re: open questions:

> poll_ready currently has an Err variant. What should this signify and is it needed at all?
>
> Should the poll_ready error be the same as call?

I think so. It's basically the error that would be returned if call were to be invoked, if it is known in advance.

Errors should, imo, signify that the Service is no longer usable.

> Should there be a default poll_ready impl that always returns ready?

IMO, no. I think it's important for implementations to think about readiness explicitly.

> Should Service take &mut self or &self?

idk

@olix0r (Collaborator) commented Sep 21, 2017

> poll_ready currently has an Err variant. What should this signify and is it needed at all?
>
> Should the poll_ready error be the same as call?
>
> I think so. It's basically the error that would be returned if call were to be invoked, if it is known in advance.
>
> Errors should, imo, signify that the Service is no longer usable.

Eh, as soon as I wrote that I second-guessed myself. call may return an error to indicate, for instance, that the request was rejected or canceled, though other requests may succeed.

By this measure, I think poll_ready's errors should be considered fatal to the service. But I'm less convinced that it should be the same error type as that of call.

@carllerche (Member Author)

@olix0r I'm actually kind of leaning towards keeping them the same; that way, any future that takes a request, waits until the service is ready, and then sends it can have a single error type, rather than an enum of the poll_ready error vs. Self::Error.

@carllerche (Member Author)

I also just switched call to take &mut self instead of &self. Based on the middleware in tower-middleware, adding backpressure support w/ poll_ready significantly changes how things fit together.

First, you can't just put Service in an Arc because poll_ready takes &mut self. This means that sequentially calling two services is more complicated. I added Filter specifically to experiment with this kind of interaction.

I landed on cloning the second service into the response future. This means that you can get &mut self when calling the second service.
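Roughly, the pattern looks like this; a simplified sketch, not the actual Filter source, with Chain as an illustrative name (assumes futures 0.1 and #[macro_use] for try_ready!):

struct Chain<A, B> {
    first: A,
    second: B,
}

impl<A, B> Service for Chain<A, B>
where
    A: Service,
    A::Future: 'static,
    B: Service<Request = A::Response, Error = A::Error> + Clone + 'static,
{
    type Request = A::Request;
    type Response = B::Response;
    type Error = A::Error;
    type Future = Box<Future<Item = B::Response, Error = A::Error>>;

    fn poll_ready(&mut self) -> Poll<(), Self::Error> {
        // Ready only when both services report capacity; how partial
        // readiness should compose is still an open question.
        try_ready!(self.first.poll_ready());
        self.second.poll_ready()
    }

    fn call(&mut self, req: Self::Request) -> Self::Future {
        // Clone the second service *into* the response future: the future
        // owns its copy, so it has the &mut access that call needs without
        // borrowing from self.
        let mut second = self.second.clone();
        Box::new(self.first.call(req).and_then(move |rsp| second.call(rsp)))
    }
}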

src/lib.rs (outdated)

@@ -236,7 +236,7 @@ where T: Service,
     }
 }
 
-impl<F, R, S> NewService for F
+impl<F, R, E, S> NewService for F
@hawkw (Member) commented Sep 22, 2017:

impl<F, R, E, D> NewService for F

@carllerche (Member Author)

All: I'm actually pretty happy with the way this is currently. I would like to try to get this PR merged sooner rather than later, with the expectation that the current state of Tower is still quite experimental and subject to change.

@withoutboats (Contributor)

I've been feeling pretty wary about this PR, but I haven't found time to think it through enough to make a substantive comment. The essence of my concern is that this PR makes tower unsuitable for building business-logic applications with mid-level performance needs by making their lives significantly more complicated. In other words, this doesn't seem to let you just not care about backpressure.

It's possible that this is inevitable - that these users' needs cannot be squared with the needs of absolute high-performance services. But the status quo is that Service is the lowest common denominator between protocol libraries like hyper and arbitrary user applications, so I'm unclear what the story will be if tower becomes more narrowly targeted.

Will try to sit down and work through the options in my head to be more constructive, but likely not until next week.

@carllerche (Member Author)

@withoutboats could you elaborate? You aren't providing much in the way of specifics to actually work through.

@danburkert

> ... It's possible that this is inevitable - that these users' needs cannot be squared with the needs of absolute high-performance services ...

I think this is a good point; I have a couple of thoughts:

The simple (perhaps cop-out) response is that these application developers should be using a sync wrapper over the async implementation. Backpressure can then be handled naturally (by blocking), but only if the underlying implementation has the capability to signal it.

I do think there's an opportunity to make this easier to use with async/await by providing a way to call a Service, and have it automatically try poll_ready() in a loop, yielding until there is capacity available, and only then performing the actual call. I briefly tried implementing this, but haven't been able to figure out an API to do it with the current borrowing restrictions in the async/await impl. I'm fairly confident that this could work in the long-term, though (perhaps someone smarter could figure out how to do it now).
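In futures 0.1 terms, that helper could be a future that yields the service back once it reports capacity; Ready and ready here are hypothetical names, not an existing API:

struct Ready<S> {
    service: Option<S>,
}

fn ready<S: Service>(service: S) -> Ready<S> {
    Ready { service: Some(service) }
}

impl<S: Service> Future for Ready<S> {
    type Item = S;
    type Error = S::Error;

    fn poll(&mut self) -> Poll<S, S::Error> {
        let ready = self
            .service
            .as_mut()
            .expect("polled after completion")
            .poll_ready()?;
        match ready {
            // Capacity available: hand the service back to the caller.
            Async::Ready(()) => Ok(Async::Ready(self.service.take().unwrap())),
            // Still at capacity; poll_ready registered the task for wakeup.
            Async::NotReady => Ok(Async::NotReady),
        }
    }
}

Usage would then be ready(service).and_then(|mut service| service.call(request)).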

> Will try to sit down and work through the options in my head to be more constructive, but likely not until next week.

I'd be especially interested to hear if you continue to have the same concerns you brought up in tokio-rs/tokio-service#9 (comment), given that some of the particulars have changed.

@carllerche (Member Author)

@withoutboats

A few things.

  1. The majority of the Filter impl has nothing to do with backpressure; most of it is the pain of dealing w/ futures while avoiding allocation.

  2. Frameworks can easily layer on an unbounded API if they so desire. It's simply a matter of always returning Ready from poll_ready.

  3. You can always come up w/ an UnboundedService trait and impl Service for Arc<T: UnboundedService>; a sketch follows below.
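A sketch of option 3, with hypothetical names and futures 0.1 types assumed in scope:

use std::sync::Arc;

trait UnboundedService {
    type Request;
    type Response;
    type Error;
    type Future: Future<Item = Self::Response, Error = Self::Error>;

    // No backpressure, so call only needs a shared reference.
    fn call(&self, req: Self::Request) -> Self::Future;
}

impl<T: UnboundedService> Service for Arc<T> {
    type Request = T::Request;
    type Response = T::Response;
    type Error = T::Error;
    type Future = T::Future;

    fn poll_ready(&mut self) -> Poll<(), Self::Error> {
        // Unbounded by definition: always ready.
        Ok(Async::Ready(()))
    }

    fn call(&mut self, req: Self::Request) -> Self::Future {
        // &mut Arc<T> derefs to &T, which is all an unbounded service
        // needs, so the handle stays freely cloneable and shareable.
        (**self).call(req)
    }
}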

@carllerche (Member Author)

@withoutboats Any follow up thoughts on top of implementing Service for Arc<NoBackpressure>?

Again, I'd like to merge this to master as it seems to be panning out so far. It isn't a commitment for a release, though.

@aturon left a comment:


I've had a chance to review the ideas here, and talk with @withoutboats about it some as well. The insights on the filter PR helped a lot in grokking your vision of "backpressure as buffer reservation". This model is feeling really clean to me, and makes me think we should perhaps revisit Sink along similar lines.

I was initially worried about the move to &mut, but I believe the approach of requiring Clone for services that are to be composed will work just fine; in practice, you just move the Arc needed for composition today "inward", into the service itself.

One passing question: are there ever cases where you want to tell the caller "please slow down this type of request"? Do you intend to do that through an error variant for call?

@aturon commented Sep 27, 2017

Oh, also: should we consider making Service: Clone, so that composability is baked in more fundamentally?

@carllerche (Member Author)

@aturon I considered this, but I am doubtful all services could impl Clone.

@aturon commented Sep 27, 2017

@carllerche isn't that a problem for the composition story, though? Non-Clone services won't be usable with things like Filter, right?

@carllerche (Member Author)

IMO, Clone can be added as a decorator around a service that provides some level of configurable buffering.

@aturon commented Sep 27, 2017

@carllerche would love to see a sketch of that!

@carllerche (Member Author)

@aturon sure, it will come...

@carllerche merged commit bacb7db into master on Sep 27, 2017
@carllerche (Member Author)

@aturon

That said, re: Buffer, the middleware will need to spawn a task to flush the buffer into the inner service, but besides that, it should be a straightforward impl.
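A rough sketch of that shape, with hypothetical names: the handle is Clone and pushes into a bounded channel, while the spawned task owns the (possibly non-Clone) inner service and drains the channel into it.

use futures::sync::{mpsc, oneshot};

// Clone-able front half of the Buffer middleware. Each request travels with
// a oneshot sender that the drain task uses to push the response back.
#[derive(Clone)]
pub struct Buffer<Req, Rsp, Err> {
    tx: mpsc::Sender<(Req, oneshot::Sender<Result<Rsp, Err>>)>,
}

impl<Req, Rsp, Err> Buffer<Req, Rsp, Err> {
    // Readiness is exactly the channel's remaining capacity, which is what
    // makes the buffering configurable: pick the channel bound.
    fn poll_ready(&mut self) -> Poll<(), ()> {
        self.tx.poll_ready().map_err(|_| ())
    }
}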

@withoutboats (Contributor) commented Sep 27, 2017

> @aturon I considered this, but I am doubtful all services could impl Clone.

Services that don't impl Clone can't be moved into a future during a nested call. I think services that don't impl Clone won't be very useful, and I'm more worried about library authors forgetting that they need to make their Service be Clone.

@carllerche (Member Author)

@withoutboats I mean, you could impl an h1 client that directly wraps a socket & provides Service, but can't impl Clone (without adding overhead).

IMO that case should have Clone punted to a wrapper.

@withoutboats (Contributor)

@carllerche But that's exactly my concern: such a Service would be unusable in any use case that sequences asynchronous events (which I believe will ultimately predominate).

@carllerche (Member Author)

@withoutboats I don't see how it would be unusable? It's very usable in the case of sequential use; you can only have one in-flight request at a time.

Either way, I would appreciate it if you could sum up the concerns and open an issue to continue discussion there vs. here? I don't consider the matter settled.

@carllerche deleted the backpressure branch on October 3, 2017
Merging this pull request may close: Figure out backpressure (#3)
8 participants