
Add support for timeouts in ServiceProxy #360

Open · wants to merge 2 commits into master

Conversation

jessepollak (Contributor)

Following up on the discussion here, I wanted to do a quick spike on adding timeouts to ServiceProxy (and thereby RpcProxy). This PR currently has the best implementation I could think of, but given my relative lack of experience with nameko core development, I'd love any and all feedback. Some things I'm currently hesitant about or have questions on:

  • Reading through the code, it seems like all the standalone proxies (ClusterProxy, StandaloneProxy) already have timeouts because they use the SingleThreadedReplyListener. Is this correct? If so, I won't worry about adding timeout support for them.
  • When the ReplyListener receives messages, it needs to distinguish between messages that are "unknown" and messages whose caller has timed out. To do this, I check whether the reply_event has already raised an RpcTimeoutError and, if it has, I throw the event away (a rough sketch of the idea follows below). The extra try / catch seems a little heavy-handed, but I couldn't think of another way to convey that information to the ReplyListener. What are your thoughts on this? Should we explore refactoring ReplyListener to make this easier?
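
Roughly, the bookkeeping I have in mind is sketched below (a standalone illustration in plain Python, not the actual PR code; names like ReplyTracker, register and mark_timed_out are made up for this example):

import logging

_log = logging.getLogger(__name__)


class ReplyTracker(object):
    """Illustrative stand-in for the reply bookkeeping inside ReplyListener."""

    def __init__(self):
        self._pending = {}       # correlation_id -> reply body (None until it arrives)
        self._timed_out = set()  # correlation ids whose caller has already given up

    def register(self, correlation_id):
        # called when a request is sent and a reply is expected
        self._pending[correlation_id] = None

    def mark_timed_out(self, correlation_id):
        # called by the waiting proxy when its timeout fires (RpcTimeoutError raised)
        self._pending.pop(correlation_id, None)
        self._timed_out.add(correlation_id)

    def handle_reply(self, correlation_id, body):
        if correlation_id in self._timed_out:
            # the caller timed out already: quietly throw the late reply away
            self._timed_out.discard(correlation_id)
            return
        if correlation_id not in self._pending:
            # genuinely unknown message
            _log.debug("Unknown correlation id: %s", correlation_id)
            return
        self._pending[correlation_id] = body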

Thanks for the feedback etc!

@jessepollak jessepollak changed the title Add support for timeouts in StandaloneProxy Add support for timeouts in ServiceProxy Sep 15, 2016
@davidszotten (Member)

in the past when we've talked about this, it was often required on a per-call basis. it's tricky because all parameters to rpc calls are considered rpc parameters

@jessepollak (Contributor, Author)

In this PR, I've added a default timeout set at proxy initialization, plus the ability to override that default (or specify one, if none was supplied as a default) in the method call through the _timeout parameter.

Looking through the discussion here, it seems there's a strong preference for keeping framework options out of the "user space" of a plain method call. Two possible solutions come to mind:

  1. Using a context processor similar to the one described here that would allow one to modify the timeout on a per call basis.
  2. Add an additional way to make an RPC call that exposes more functionality, while leaving the plain calling style with its defaults in place. This would look something like:
n.rpc.greeting_service.hello(name='Matt')
n.rpc.greeting_service.hello.send(args=(), kwargs={'name': 'Matt'}, timeout=2)

Celery does the same thing with delay (which uses defaults and mimics a standard function call by passing through all of the args and kwargs) and apply_async (for which you have to specify args and kwargs explicitly, but which allows much more customization).
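
For comparison, the Celery pair looks roughly like this (sketch only; the add task and broker URL are made up):

from celery import Celery

app = Celery('tasks', broker='amqp://guest:guest@localhost//')


@app.task
def add(x, y):
    return x + y


add.delay(2, 2)                            # defaults; reads like a normal function call
add.apply_async(args=(2, 2), expires=10)   # explicit args/kwargs, plus extra options such as expires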

Personally, I lean towards implementing an interface like (2), where the "default" way of calling a method is the simplest and easiest to read but uses the system defaults, while a slightly more involved way of calling the method allows full customization of how the call is made.

Three questions for you @davidszotten:

  1. Is there any opposition to adding a timeout that doesn't stem from the stylistic concerns you've described here? If so, what is that opposition?
  2. How do you handle timeouts / service disappearances in situations where you currently don't have this timeout-like functionality?
  3. Would you be open to me modifying this PR to implement the pattern loosely described in (2) above? If so, do you have any guiding thoughts / stylistic preferences about the implementation?

Thanks for your thoughts / time! Also, @mattbennett would love your thoughts on this.

@davidszotten (Member)

  1. no, i think the idea of timeouts is sound. though i'd say it's about ergonomics more than style
  2. we managed ok without timeouts at ofs (though the idea did come up from time to time, and we never got around to it). it's worth noting that the absence of timeouts does provide a degree of backpressure

a few other thoughts. there are two sides to what you are calling "timeout" here: client side (aborting the wait) and server side (some kind of ttl). do we definitely want to tie these together? (probably yes?) we might need to think about how we handle the latter, e.g. (off the top of my head)
a) does it propagate to rpc calls made by the responding worker?
b) does nameko set a timer and kill a worker handling a request with a ttl?

your example uses the standalone proxy, but i think the rpc extension is as important, if not more so, to consider

i think i'm on the fence about configuring via a context manager vs adding another call syntax with args and kwargs params

@mattbennett (Member)

My thoughts:

I’m fine with approach (2). I agree that a nice “default” and a slightly more involved alternative makes for a good API.

I also favour supporting defaults on the proxy, in a way that they can be changed with a subclass or at instantiation time. Which is more-or-less what this PR adds.

Adding this to the standalone RPC proxy, you get an API similar to the final one in #292:

with ServiceRpcProxy('service', rabbit_config, timeout=30) as rpc_proxy:
    ...

These two approaches can obviously be combined to create a proxy with a default but also allow options to be set on a per-call basis:

# service case

class Service(object):
    name = "service"

    other_rpc = RpcProxy("other", timeout=30)

    @rpc
    def method(self):
        self.other_rpc.remote_method.call(args=args, timeout=15)

# standalone case

with ServiceRpcProxy('service', rabbit_config, timeout=30) as proxy:
    proxy.remote_method(...)
    proxy.remote_method.call(..., timeout=15)

@jessepollak I would certainly be open to you expanding this PR to cover these cases. Stylistically, I think it would be nice if the timeout was added as a property that could be overridden by a subclass if the user wanted to. I have used this pattern here.
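
One way to read that is a class attribute that can be overridden by a subclass or at declaration time (a simplified stand-in, not nameko's actual classes; the names and signature are only illustrative):

class RpcProxy(object):
    """Simplified stand-in for the real dependency provider."""

    timeout = None  # class-level default; a subclass can override it

    def __init__(self, target_service, timeout=None):
        self.target_service = target_service
        if timeout is not None:
            self.timeout = timeout  # per-declaration override, e.g. RpcProxy("other", timeout=30)


class SlowRpcProxy(RpcProxy):
    timeout = 60  # subclass that bakes in a different default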

On the "two sides" of a timeout:

I think the client abort and the TTL should be tied together, but only in a limited way.

A client using a timeout should set that same timeout as a TTL on its request. If the message expires before being consumed, we'll have avoided some unnecessary work. But I think that trying to kill workers that are already running would cause more problems than it solves. The library has no knowledge of what the worker is doing, so it's not possible to stop them safely.

Similarly for propagating the TTL to RPC calls made by that worker -- it's not necessarily true that child RPC calls (or events, or any other kind of side-effect) should be aborted if the client stops caring about the result.

If aborting workers was really required, it could be implemented with a custom entrypoint that extracted the incoming TTL and allowed the worker to alter its behaviour accordingly. In other words I think this is application-level logic rather than something to be implemented by the framework.
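
For instance, at the application level the worker-side check might look something like this (a hedged sketch; passing the deadline alongside the payload is an assumption here, not framework behaviour):

import time


def do_expensive_work(payload):
    # placeholder for the real work
    return payload


def remote_method(payload, deadline=None):
    """Hypothetical worker method that honours a caller-supplied deadline."""
    if deadline is not None and time.time() > deadline:
        # the caller has already given up waiting; skip the expensive work
        return None
    return do_expensive_work(payload)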

@@ -275,6 +278,9 @@ def get_reply_event(self, correlation_id):
        self._reply_events[correlation_id] = reply_event
        return reply_event

    def pop_reply_event(self, correlation_id):
        self._replay_events.pop(correlation_id, None)

Missing return
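
Presumably the intended fix is along these lines (also assuming the attribute should be the _reply_events dict used in get_reply_event above):

    def pop_reply_event(self, correlation_id):
        return self._reply_events.pop(correlation_id, None)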

    def _call(self, *args, **kwargs):
        _log.debug('invoking %s', self)

        worker_ctx = self.worker_ctx
        container = worker_ctx.container
        timeout = kwargs.pop('_timeout', None) or self.timeout

timeout = kwargs.pop('_timeout', self.timeout) would make more sense here.

@manfred-chebli

I'm digging this subject back up.

Is implementing a timeout still a viable option? It's been a while since the last commit. In the meantime, is there any way to detect that a service is down before calling it via RPC?

@mattbennett (Member)

Timeouts are still viable and desired. I think both "client-side" and TTL-based timeouts would be valuable, and I like the interface that was proposed:

n.rpc.service.method.call(args=(), kwargs={'foo': 'bar'}, timeout=10, expire=10)

There's no way to tell whether a service is available, I'm afraid, other than trying to call it.

    def get_message_headers(self, worker_context, timeout=None):
        headers = HeaderEncoder.get_message_headers(self, worker_context)
        if timeout:
            headers['x-message-ttl'] = timeout * 1000


I've tried this key, and the one that is mentioned in the RabbitMQ documentation and used by kombu (expiration), and it doesn't work.
I expected that when an expired message reaches the head of the queue it wouldn't be delivered, but it is.


OK, I've got it working. Instead of putting it in the headers (which are application specific), we need to set the expiration property on the message when we publish it.

class _TimeoutAwareMethodProxy(nameko.rpc.MethodProxy):
    """Timeout aware RPC method proxy."""

    def __init__(
            self, *, timeout_secs: float, **kwargs,
    ) -> None:
        super().__init__(**kwargs)
        self.publisher.expiration = timeout_secs
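
For reference, the same thing at the kombu level looks roughly like this (a minimal sketch assuming a local broker; the exchange and routing key are placeholders matching the earlier example, and kombu's expiration argument is given in seconds and becomes the per-message AMQP expiration property):

from kombu import Connection, Exchange

with Connection('amqp://guest:guest@localhost//') as conn:
    producer = conn.Producer()
    producer.publish(
        {'args': (), 'kwargs': {'name': 'Matt'}},
        exchange=Exchange('nameko-rpc', type='topic'),
        routing_key='greeting_service.hello',
        expiration=10,  # seconds; the broker discards the message if not consumed in time
    )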

@daviskirk (Contributor)

I revived this in #652, hopefully incorporating the feedback from this pull request.
