Simple RPC based service for PyZMQ #174
Conversation
Nice! Pluggable serialization will be a lot simpler if you just put serialize/deserialize in isolated methods, rather than bundling them in with dispatch.
Yes, good idea. I will try to get that done tonight.
OK, I have made the serialization pluggable and added the async client example. I have one design question about the async proxy object's API. The current API looks like this: `echo.echo(callback, *args, **kwargs)`.
There are two choices for serialization: a black-box serializer that packs the arguments into a single blob, or one that returns a list of message frames.
The first is easiest to write, and easiest to drop in new serialization, but only the second can give you zero-copy, because a zero-copy message invariably turns an object into two or more parts (metadata + buffer). As for callback/errback, I don't think passing an Exception to the callback makes sense, so you should probably specify an errback. Unfortunately, the signature you provided makes optionally specifying the errback impossible. You could define a single errback at the class level; I'm not sure how best to deal with that. Something to consider: what happens when a worker never responds to a request?
I actually think we could get both 1 and 2 in the following manner: if the serializer returns a list/tuple, send each element as a message frame; otherwise just send the result. On the deserialization side, if there is one message frame, pass it to the deserializer as-is; otherwise pass a list of frames. Do you think that will work? On the errback: what should the errback get passed? Also, how do you think we should attach the original exception to the RemoteRPCException so the user can figure out what went wrong? Not sure how to handle failures. Would you integrate heartbeating? What would that API look like on the client side of things?
Hm, not really. It would only work if you require that a list-returning serializer never returns a list of length 1. Also, the more black-box approach should probably be passed both args and kwargs together, rather than getting them separately: how else are you going to determine how many frames should be passed to the args deserializer, and how many to the kwargs deserializer?
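To make the trade-off above concrete, here is a minimal sketch of the two serialization approaches being discussed. All class and method names are hypothetical (not the PR's actual API): args and kwargs are deliberately packed together, so the frame count never has to be split between two deserializers, and the zero-copy variant always produces a metadata frame plus buffer frames, sidestepping the length-1 ambiguity.

```python
import json


class Serializer:
    """Black-box approach (choice 1): everything goes into one frame."""

    def serialize_args_kwargs(self, args, kwargs):
        # One frame holding both args and kwargs together.
        return [json.dumps({"args": list(args), "kwargs": kwargs}).encode()]

    def deserialize_args_kwargs(self, frames):
        data = json.loads(frames[0].decode())
        return tuple(data["args"]), data["kwargs"]


class ZeroCopySerializer(Serializer):
    """Zero-copy approach (choice 2): bytes-like args travel as their own
    frames, described by a leading metadata frame.  Limitation of this
    sketch: None is used as the buffer placeholder, so None cannot be a
    real argument here."""

    def serialize_args_kwargs(self, args, kwargs):
        buffers, meta_args = [], []
        for a in args:
            if isinstance(a, (bytes, bytearray, memoryview)):
                meta_args.append(None)      # placeholder: filled from a buffer frame
                buffers.append(bytes(a))
            else:
                meta_args.append(a)
        header = json.dumps({"args": meta_args, "kwargs": kwargs}).encode()
        # Always metadata frame first, then the raw buffers.
        return [header] + buffers

    def deserialize_args_kwargs(self, frames):
        data = json.loads(frames[0].decode())
        bufs = iter(frames[1:])
        args = tuple(next(bufs) if a is None else a for a in data["args"])
        return args, data["kwargs"]
```

Because the metadata frame is always present, a reply of exactly two frames is unambiguous, which is the failure mode of the list-returning proposal above.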
Probably the same thing we do in IPython: ename, evalue, and traceback as strings.
I would skip heartbeats, and just use timeouts. This would mean requests would need to discard replies with the wrong msg_id, as they would be stale replies from timed-out requests.
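The timeout-without-heartbeats scheme can be sketched as follows. This is illustrative only (the socket plumbing is elided and all names are hypothetical, not the PR's API): each request records a deadline keyed by msg_id, expired entries fire the errback, and a reply whose msg_id is no longer pending is silently discarded as stale.

```python
import itertools
import time

_msg_ids = itertools.count()


class AsyncClientSketch:
    """Timeouts instead of heartbeats: stale replies are dropped by msg_id."""

    def __init__(self, timeout=5.0):
        self.timeout = timeout
        self._pending = {}  # msg_id -> (callback, errback, deadline)

    def call(self, method, callback, errback, *args, **kwargs):
        msg_id = next(_msg_ids)
        deadline = time.monotonic() + self.timeout
        self._pending[msg_id] = (callback, errback, deadline)
        # ... actual send over the DEALER socket elided ...
        return msg_id

    def _check_timeouts(self):
        now = time.monotonic()
        for msg_id, (cb, eb, deadline) in list(self._pending.items()):
            if now > deadline:
                del self._pending[msg_id]
                # ename, evalue, tb as strings, per the convention above.
                eb("TimeoutError", "request %d timed out" % msg_id, "")

    def _on_reply(self, msg_id, result):
        entry = self._pending.pop(msg_id, None)
        if entry is None:
            return  # stale reply from a timed-out request: discard
        entry[0](result)
```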
* Central Serializer class is used, which allows more customization of the serialization of args/kwargs/result.
* Errback added to the AsyncRPCServiceProxy; it is passed ename, evalue, and tb as strings.
* Timeout added to the AsyncRPCServiceProxy.
* self.urls removed.
* Examples use errback and timeout.
* Refactored the Serializer class and created a JSONSerializer whose usage is demoed in the examples.
* Fixed minor bugs in simplerpc.
OK, I think I have fixed everything. For the serialization I created a Serializer object that supports both full-blown customization and the simpler approach; I like how that part turned out. I have implemented errback and timeout and added relevant examples. Only one more question: is there anything I need to do to target Python 3 that you can see?
Not sure if there's room for more serializers, especially ones that add more dependencies, but I threw together a symmetric-cipher serializer as an example here: https://gist.github.com/1911540.
* Finished multi-ident handling.
* ename, evalue, tb are serialized in a better manner.
* Changes from PR #183 incorporated.
* Updated example.
@minrk: what do you think about including this serializer? If we include it, it might be nice to improve the API of the base class to handle loads/dumps separately from encrypt/decrypt. The reason is that it would be useful to change the serialization without changing the encryption, and vice versa.
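The split being proposed here could look something like the following sketch. The class names and the base64 stand-in are hypothetical, purely to show the override points: dumps/loads handle the object-to-bytes mapping, encrypt/decrypt wrap the wire bytes, and either half can be subclassed without touching the other.

```python
import base64
import json


class LayeredSerializer:
    """Serialization and encryption as two independent layers."""

    def dumps(self, obj):
        return json.dumps(obj).encode()

    def loads(self, data):
        return json.loads(data.decode())

    def encrypt(self, data):
        return data  # identity by default; a subclass supplies a real cipher

    def decrypt(self, data):
        return data

    def serialize(self, obj):
        return self.encrypt(self.dumps(obj))

    def deserialize(self, data):
        return self.loads(self.decrypt(data))


class Base64Serializer(LayeredSerializer):
    """Stand-in wrapper (NOT encryption!) just to show the override point."""

    def encrypt(self, data):
        return base64.b64encode(data)

    def decrypt(self, data):
        return base64.b64decode(data)
```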
I think it's a useful example, showing how to subclass for your own serialization/encryption, so it should go in examples. I don't think encrypt/decrypt should be separate from serialize/deserialize. I wouldn't want to give anyone the illusion that this code is secure, since it offers no protection against fuzz attacks or anything of the sort.
I think the encryption itself is sound, save for the trivial key. You're right about the security issues, since there's no effort to sanitize user inputs. It would not be a lot more work to use a library like Flatland to ensure user input is sane; I could improve the example to do that if you think it's worth it.
I think the example is good as it is, and helpful for users who want to write derivatives with their own serialization and/or encryption. Simple error handling when unpacking messages should be done in the RPC code itself.
OK, can you create a new pull request that puts the serializer example in the examples directory?
@minrk: this one is ready to go as well. I can do the merge if you want.
I still don't like the fact that you have to specify callback, errback, and timeout every time. That seems pretty annoying and un-Pythonic. Can you think of a way to make errback and timeout optional args?
If we want to pass *args and **kwargs, we can't really use keywords. The alternative is call(func, callback=None, errback=None, timeout=-1, args=None, kwargs=None), which forces people to pass tuples and dicts for args and kwargs. Which do you prefer?
Hmm, this is the decision we always have to make: is passing *args/**kwargs nice enough that it is worth requiring multiple extra, rarely used arguments to be specified on every call? We sidestep this in IPython.parallel by setting defaults via attributes. That sort of approach might make sense here, at least where errback/timeout are concerned, since their values will likely be the same for every single call on a given RPC instance.
As one who really wants to use this RPC library, I would not want the interface to expose any inability to just call functions via the proxy. Forcing users to use args=, kwargs= is pretty rough. I like the idea of setting these values on a class, if you allow users to pass their own class to use for method wrapping. It's unlikely a user would want these values to change for specific functions.
But I don't think the class default solves the issue in this case.
Compromise: just pull timeout off into an attribute? It seems extremely unlikely that it would ever change within a given application. I still think errback is likely to be None the majority of the time, and a Python interface that forces you to specify it on every call seems poor. We have already discovered that there's no such thing as a really good interface that accepts both *args/**kwargs and optional keyword arguments.
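The compromise being argued for could be sketched like this. The class name and the fact that `call` returns a plain dict are hypothetical (the real client would send frames over a socket); the point is that timeout and errback live on the instance, so the call site keeps the clean `proxy.call(method, callback, *args, **kwargs)` shape without reserved keywords.

```python
class RPCProxySketch:
    """Per-instance defaults for timeout/errback, IPython.parallel style."""

    timeout = 5.0  # class-level default, rarely changed per application

    def __init__(self, timeout=None, errback=None):
        if timeout is not None:
            self.timeout = timeout
        # errback defaults to a no-op, so it never has to appear at call sites
        self.errback = errback or (lambda ename, evalue, tb: None)

    def call(self, method, callback, *args, **kwargs):
        # *args/**kwargs stay free for the remote call; timeout and errback
        # come from the instance instead of from every call.
        return {"method": method, "args": args, "kwargs": kwargs,
                "timeout": self.timeout}
```

Usage stays unencumbered: `RPCProxySketch(timeout=2.0).call("echo", print, "hi")` never mentions errback or timeout at the call site.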
OK, license is updated.
Why not merge this RPC part? I find it very useful.
The RPC code is incomplete and needs work, particularly on handling dying workers. I don't think it should go in before the interface is at least relatively settled.
My memory is a bit fuzzy about this now. IIRC though, this stalled out
Neither option is thrilling. @minrk, @takluyver, @fperez, which do you prefer? I do think that the core of pyzmq is quite stable and people are using it.
Is the plan to later expand this to cover more complex RPC uses, or is it just intended to be a simple base that third parties can expand? In the latter case, I'd side with including it in pyzmq. In the former case, I think a separate package would make more sense, as a matter of perception: there are lots of use cases for ZMQ besides RPC, and you probably don't want people to think pyzmq is turning into an RPC package.
We had decided that development should happen in code where it is used (IPython multiuser) rather than in isolation. Then, when it is demonstrated to be satisfactory and stable for a real application, it can be pulled upstream. Cutting a release of pyzmq with RPC adds enormous inertia to the APIs, which I am not at all convinced are final (especially given that the API discussion here so far remains totally inconclusive).
I have decided to maintain this as a separate project:
This adds an rpc subpackage to zmq and implements a "simple" RPC service. This version is focused only on the request/reply pattern, but handles load balancing using ROUTER/DEALER sockets. While we definitely want to explore richer RPC APIs that allow for arbitrary socket types and message patterns, this version is very useful because of its extreme simplicity and good performance.
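A plausible wire layout for such a request/reply service is sketched below. This is an assumption for illustration, not the PR's actual framing: each request carries a msg_id (for matching replies and discarding stale ones, per the timeout discussion above), the method name, and the serialized args/kwargs. A ROUTER socket on the service side would prepend and strip the client identity frames automatically.

```python
import json
import uuid


def build_request(method, args, kwargs):
    """Frames: [msg_id, method, serialized args/kwargs]."""
    msg_id = uuid.uuid4().hex.encode()
    payload = json.dumps({"args": args, "kwargs": kwargs}).encode()
    return [msg_id, method.encode(), payload]


def build_reply(msg_id, status, result):
    """Frames: [msg_id, status, body].  status could be b"OK" or b"FAIL",
    with ename/evalue/tb strings in the body on failure."""
    return [msg_id, status, json.dumps(result).encode()]
```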
Still working on:
Needs review: