
Add backpressure capabilities to Service #6

Merged 5 commits into master on Sep 27, 2017
Conversation

@carllerche (Member) commented Sep 15, 2017

Currently, Service does not provide a mechanism by which it can signal to the
caller that it is at capacity. This commit adds a poll_ready function to the
Service trait. Callers are able to first check poll_ready before calling
Service::call.

poll_ready is expected to be a hint and will be implemented in a best-effort
fashion: a Service may return Ready from poll_ready and still fail on the
next invocation of Service::call.

I will be adding more docs and examples before merging the PR.

Unresolved questions

  • poll_ready currently has an Err variant. What should this signify and is it needed at all?
  • Should the poll_ready error be the same as call?
  • Should there be a default poll_ready impl that always returns ready?
  • Should Service take &mut self or &self?

Fixes #3
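To make the proposed calling convention concrete, here is a minimal caller-side sketch in futures 0.1 terms, assuming futures::Async and the Service trait are in scope; the Dispatch enum and try_dispatch helper are illustrative names, not part of this PR:

enum Dispatch<S: Service> {
    Called(S::Future),
    NotReady(S::Request),
}

fn try_dispatch<S: Service>(
    service: &mut S,
    req: S::Request,
) -> Result<Dispatch<S>, S::Error> {
    match service.poll_ready()? {
        // Capacity advertised. Per the PR this is only a best-effort hint,
        // so the call itself may still fail.
        Async::Ready(()) => Ok(Dispatch::Called(service.call(req))),
        // At capacity: hand the request back; poll_ready has registered the
        // current task to be woken when capacity frees up.
        Async::NotReady => Ok(Dispatch::NotReady(req)),
    }
}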

@danburkert

The combination of poll_ready not being a guarantee, and not having a Poll-style NotReady return variant from call pretty much guarantees that Service implementations' error type will have to include a NotReady(Request), no? I think this is going to be very boilerplate-y to handle both in the service impl and in the caller.

@carllerche (Member Author)

@danburkert well, I would say that services that can't guarantee poll_ready can return the request in the error slot if possible, or in some cases the request will be lost. This will be up to the service implementation to do what fits best.
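Sketched out, that suggestion might look like the following; the Error type and its variants are hypothetical, not from the PR:

// A service that can't guarantee poll_ready can hand the rejected request
// back in its error type, so the caller may retry it later.
pub enum Error<Request> {
    // The service was at capacity; the request is returned intact.
    AtCapacity(Request),
    // The transport failed; in this case the request may be lost.
    Io(std::io::Error),
}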


fn poll_ready(&mut self) -> Poll<(), Error> {
    self.tx.lock().unwrap().poll_ready()
        .map_err(|_| Error::AtCapacity)
}

Collaborator:

The semantics of poll_ready are that Async::NotReady means at-capacity, but capacity can free up in the future. So, should this use poll_ready().or(Ok(Async::NotReady))?
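For reference, the suggested variant would look roughly like this, treating the channel error as transient rather than fatal:

fn poll_ready(&mut self) -> Poll<(), Error> {
    // Report a send-side error as NotReady instead of Err, on the reading
    // that at-capacity is transient and may clear on a later poll.
    self.tx.lock().unwrap().poll_ready()
        .or(Ok(Async::NotReady))
}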

/// This is a **best effort** implementation. False positives are permitted.
/// It is permitted for the service to return `Ready` from a `poll_ready`
/// call even if the next invocation of `call` results in an error.
fn poll_ready(&mut self) -> Poll<(), Self::Error>;
Collaborator:

The Err variant is to mean that the Service is in an unrecoverable state, and will never be ready again?

@carllerche (Member Author):

I don't know if it means that it is in a permanently unrecoverable state... maybe it does...

It could be used to signal shutdown maybe. I'd say it is a TBD.

@carllerche (Member Author):

I added a list of unresolved questions to the PR.

#[derive(Debug)]
pub struct ChannelService {
    // Send the request and a oneshot Sender to push the response into.
    tx: Mutex<Sender>,
    // ...
}

This seems like a big issue. I just looked into why call takes &self and not &mut self and found tokio-rs/tokio-service#9 (comment). That issue points out that taking &mut self means chained calls require wrapping the second service in Arc<Mutex<S>>, but this example has to use a Mutex, and it's not even chaining. Perhaps we should re-open the &/&mut discussion, given this example? In general it seems strange to me that Service takes &self when everything else in the futures stack takes &mut self (Future::poll, Sink::start_send).

@carllerche (Member Author):

Can you elaborate on what you see as a big issue?

Mutex here isn't strictly necessary, but it does make it Sync. You could use RefCell and it would be !Sync; honestly, try_send is safe to call w/o any coordination on a single thread, but the signature takes &mut self right now.

Seems to me that it's going to be overwhelmingly common to use Mutex or RefCell internally in Service impls because call takes &self (will Service impls really be stateless? Not the interesting ones). I think it may be better to have call take &mut self, and force the caller to do external synchronization, if necessary. In this case, the caller could wrap the ChannelService in a Mutex/RefCell, or just clone it as necessary.

The main argument against &mut self, as I understand it in the linked comment, is that it makes it more difficult to create services which internally chain calls to other services. But as this example shows, &self is making it difficult in an even simpler case.
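A minimal sketch of that position, assuming call takes &mut self; the dispatch helper is hypothetical:

use std::cell::RefCell;

// With call(&mut self, ...), a single-threaded caller that shares a service
// pushes the synchronization outward, e.g. with RefCell; a multi-threaded
// caller would reach for Mutex instead.
fn dispatch<S: Service>(shared: &RefCell<S>, req: S::Request) -> S::Future {
    // borrow_mut() panics on re-entrant borrows: the single-threaded
    // analogue of taking a lock.
    shared.borrow_mut().call(req)
}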

@carllerche (Member Author):

I think you have a valid point. The use case of chaining multiple calls together is going to be more complicated w/ back pressure handling anyway...

I think it is worth reconsidering &self / &mut self. The answer isn't obviously clear to me right now.

Member:


> The main argument against &mut self, as I understand it in the linked comment, is that it makes it more difficult to create services which internally chain calls to other services.

If the middleware trait in #2 is merged, I feel like that would make it less important to be able to reasonably write a service that calls other services?

@alexcrichton left a comment:

Unfortunately I'm not too familiar with the problems and design space once we hit the service layer, so I may not be too helpful in review here :(

@olix0r (Collaborator) left a comment:

So far, lgtm. I've updated a project to use the Ready api and everything feels fine so far.

My 2c re: open questions:

> poll_ready currently has an Err variant. What should this signify and is it needed at all?
>
> Should the poll_ready error be the same as call?

I think so. It's basically the error that would be returned if call were to be invoked, if it is known in advance.

Errors should, imo, signify that the Service is no longer usable.

> Should there be a default poll_ready impl that always returns ready?

IMO, no. I think it's important for implementations to think about readiness explicitly.

> Should Service take &mut self or &self?

idk

@olix0r (Collaborator) commented Sep 21, 2017

> poll_ready currently has an Err variant. What should this signify and is it needed at all?
>
> Should the poll_ready error be the same as call?
>
> I think so. It's basically the error that would be returned if call were to be invoked, if it is known in advance.
>
> Errors should, imo, signify that the Service is no longer usable.

Eh, as soon as I wrote that I second-guessed myself. call may return an error to indicate, for instance, that the request was rejected or canceled, though other requests may succeed.

By this measure, I think poll_ready's errors should be considered fatal to the service. But I'm less convinced that it should be the same error type as that of call.

@carllerche (Member Author)

@olix0r I'm actually kind of leaning towards keeping them the same; that way, any future that takes a request, waits until the service is ready, and then sends it can have a single error type, rather than an enum of the poll_ready error vs. Self::Error.

@carllerche (Member Author)

I also just switched call to take &mut self instead of &self. Based on the middleware in tower-middleware, adding backpressure support w/ poll_ready significantly changes how things fit together.

First, you can't just put Service in an Arc because poll_ready takes &mut self. This means that sequentially calling two services is more complicated. I added Filter specifically to experiment with this kind of interaction.

I landed on cloning the second service into the response future. This means that you can get &mut self when calling the second service.
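Roughly, the pattern looks like this; a simplified sketch, not the actual Filter source, with Chain as an illustrative name (assumes futures 0.1 and #[macro_use] for try_ready!):

struct Chain<A, B> {
    first: A,
    second: B,
}

impl<A, B> Service for Chain<A, B>
where
    A: Service,
    A::Future: 'static,
    B: Service<Request = A::Response, Error = A::Error> + Clone + 'static,
{
    type Request = A::Request;
    type Response = B::Response;
    type Error = A::Error;
    type Future = Box<Future<Item = B::Response, Error = A::Error>>;

    fn poll_ready(&mut self) -> Poll<(), Self::Error> {
        // Ready only when both services report capacity; how partial
        // readiness should compose is still an open question.
        try_ready!(self.first.poll_ready());
        self.second.poll_ready()
    }

    fn call(&mut self, req: Self::Request) -> Self::Future {
        // Clone the second service *into* the response future: the future
        // owns its copy, so it has the &mut access that call needs without
        // borrowing from self.
        let mut second = self.second.clone();
        Box::new(self.first.call(req).and_then(move |rsp| second.call(rsp)))
    }
}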

src/lib.rs (outdated)

@@ -236,7 +236,7 @@ where T: Service,
     }
 }
 
-impl<F, R, S> NewService for F
+impl<F, R, E, S> NewService for F
@hawkw (Member) commented Sep 22, 2017:

impl<F, R, E, D> NewService for F

@carllerche (Member Author)

All: I'm actually pretty happy with the way this is currently. I would like to try to get this PR merged sooner rather than later, with the expectation that the current state of Tower is still quite experimental and subject to change.

@withoutboats (Contributor)

I've been feeling pretty wary about this PR, but I haven't found time to think it through enough to make a substantive comment. The essence of my concern is that this PR makes tower unsuitable for building business-logic applications with mid-level performance needs by making their lives significantly more complicated. In other words, this doesn't seem to let you just not care about backpressure.

It's possible that this is inevitable - that these users' needs cannot be squared with the needs of absolute high-performance services. But the status quo is that Service is the lowest common denominator between protocol libraries like hyper and arbitrary user applications, so I'm unclear what the story will be if tower becomes more narrowly targeted.

Will try to sit down and work through the options in my head to be more constructive, but likely not until next week.

@carllerche (Member Author)

@withoutboats could you elaborate? You aren't providing much in the way of specifics to actually work through.

@danburkert

> ... It's possible that this is inevitable - that these users' needs cannot be squared with the needs of absolute high-performance services ...

I think this is a good point; I have a couple of thoughts:

The simple (perhaps cop-out) response is that these application developers should be using a sync wrapper over the async implementation. Backpressure can then be handled naturally (by blocking), but only if the underlying implementation has the capability to signal it.

I do think there's an opportunity to make this easier to use with async/await by providing a way to call a Service, and have it automatically try poll_ready() in a loop, yielding until there is capacity available, and only then performing the actual call. I briefly tried implementing this, but haven't been able to figure out an API to do it with the current borrowing restrictions in the async/await impl. I'm fairly confident that this could work in the long-term, though (perhaps someone smarter could figure out how to do it now).
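In futures 0.1 terms, that helper could be a future that yields the service back once it reports capacity; Ready and ready here are hypothetical names, not an existing API:

struct Ready<S> {
    service: Option<S>,
}

fn ready<S: Service>(service: S) -> Ready<S> {
    Ready { service: Some(service) }
}

impl<S: Service> Future for Ready<S> {
    type Item = S;
    type Error = S::Error;

    fn poll(&mut self) -> Poll<S, S::Error> {
        let ready = self
            .service
            .as_mut()
            .expect("polled after completion")
            .poll_ready()?;
        match ready {
            // Capacity available: hand the service back to the caller.
            Async::Ready(()) => Ok(Async::Ready(self.service.take().unwrap())),
            // Still at capacity; poll_ready registered the task for wakeup.
            Async::NotReady => Ok(Async::NotReady),
        }
    }
}

Usage would then be ready(service).and_then(|mut service| service.call(request)).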

> Will try to sit down and work through the options in my head to be more constructive, but likely not until next week.

I'd be especially interested to hear if you continue to have the same concerns you brought up in tokio-rs/tokio-service#9 (comment), given that some of the particulars have changed.

@carllerche (Member Author)

@withoutboats

A few things.

  1. The majority of the Filter impl has nothing to do with backpressure; most of it is the pain of dealing w/ futures while avoiding allocation.

  2. Frameworks can easily layer on an unbounded API if they so desire. It's simply a matter of always returning Ready from poll_ready.

  3. You can always come up w/ an UnboundedService trait and impl Service for Arc<T: UnboundedService>; a sketch follows below.
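A sketch of option 3, with hypothetical names and futures 0.1 types assumed in scope:

use std::sync::Arc;

trait UnboundedService {
    type Request;
    type Response;
    type Error;
    type Future: Future<Item = Self::Response, Error = Self::Error>;

    // No backpressure, so call only needs a shared reference.
    fn call(&self, req: Self::Request) -> Self::Future;
}

impl<T: UnboundedService> Service for Arc<T> {
    type Request = T::Request;
    type Response = T::Response;
    type Error = T::Error;
    type Future = T::Future;

    fn poll_ready(&mut self) -> Poll<(), Self::Error> {
        // Unbounded by definition: always ready.
        Ok(Async::Ready(()))
    }

    fn call(&mut self, req: Self::Request) -> Self::Future {
        // &mut Arc<T> derefs to &T, which is all an unbounded service
        // needs, so the handle stays freely cloneable and shareable.
        (**self).call(req)
    }
}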

@carllerche (Member Author)

@withoutboats Any follow up thoughts on top of implementing Service for Arc<NoBackpressure>?

Again, I'd like to merge this to master as it seems to be panning out so far. It isn't a commitment for a release, though.

@aturon left a comment:


I've had a chance to review the ideas here, and talk with @withoutboats about it some as well. The insights on the filter PR helped a lot in grokking your vision of "backpressure as buffer reservation". This model is feeling really clean to me, and makes me think we should perhaps revisit Sink along similar lines.

I was initially worried about the move to &mut, but I believe the approach of requiring Clone for services that are to be composed will work just fine; in practice, you just move the Arc needed for composition today "inward", into the service itself.

One passing question: are there ever cases where you want to tell the caller "please slow down this type of request"? Do you intend to do that through an error variant for call?

@aturon commented Sep 27, 2017

Oh, also: should we consider making Service: Clone, so that composability is baked in more fundamentally?

@carllerche (Member Author)

@aturon I considered this, but I am doubtful all services could impl Clone.

@aturon commented Sep 27, 2017

@carllerche isn't that a problem for the composition story, though? Non-Clone services won't be usable with things like Filter, right?

@carllerche (Member Author)

IMO, Clone can be added as a decorator around a service that provides some level of configurable buffering.

@aturon commented Sep 27, 2017

@carllerche would love to see a sketch of that!

@carllerche (Member Author)

@aturon sure, it will come...

@carllerche merged commit bacb7db into master on Sep 27, 2017
@carllerche (Member Author)

@aturon

That said, re: Buffer, the middleware will need to spawn a task to flush the buffer into the inner service, but besides that, it should be a straightforward impl.
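A rough sketch of that shape, with hypothetical names: the handle is Clone and pushes into a bounded channel, while the spawned task owns the (possibly non-Clone) inner service and drains the channel into it.

use futures::sync::{mpsc, oneshot};

// Clone-able front half of the Buffer middleware. Each request travels with
// a oneshot sender that the drain task uses to push the response back.
#[derive(Clone)]
pub struct Buffer<Req, Rsp, Err> {
    tx: mpsc::Sender<(Req, oneshot::Sender<Result<Rsp, Err>>)>,
}

impl<Req, Rsp, Err> Buffer<Req, Rsp, Err> {
    // Readiness is exactly the channel's remaining capacity, which is what
    // makes the buffering configurable: pick the channel bound.
    fn poll_ready(&mut self) -> Poll<(), ()> {
        self.tx.poll_ready().map_err(|_| ())
    }
}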

@withoutboats (Contributor) commented Sep 27, 2017

> @aturon I considered this, but I am doubtful all services could impl Clone.

Services that don't impl Clone can't be moved into a future during a nested call. I think services that don't impl Clone won't be very useful, and I'm more worried about library authors forgetting that they need to make their Service be Clone.

@carllerche (Member Author)

@withoutboats I mean, you could impl an h1 client that directly wraps a socket & provides Service, but can't impl Clone (without adding overhead).

IMO that case should have Clone punted to a wrapper.

@withoutboats (Contributor)

@carllerche But that's exactly my concern: such a Service would be unusable in any use case that sequences asynchronous events (which I believe will ultimately predominate).

@carllerche (Member Author)

@withoutboats I don't see how it would be unusable? It's very usable in the case of sequential use; you can only have one in-flight request at a time.

Either way, I would appreciate it if you could sum up the concerns and open an issue to continue discussion there vs. here? I don't consider the matter settled.

@carllerche deleted the backpressure branch on October 3, 2017
Merging this pull request may close: Figure out backpressure (#3)
8 participants