Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

ZeroMQ based handlers for tornado.web #177

Merged
merged 19 commits into from Mar 1, 2012

Conversation

Projects
None yet
3 participants
Contributor

ellisonbg commented Feb 18, 2012

This PR adds a new zmq.web.zmqweb module. This module adds the ability to run tornado.web.RequestHandlers in a scalable and distributed manner using zmq ROUTER/DEALER sockets. You can instantiate multiple instances of handlers in backend processes and requests to those handlers will be load balanced. See the example in examples/zmqweb for a simple demonstration. The core work is done but there are some additional things that I need to finish before this is merged:

  • Better example and documentation.
  • Testing.

ellisonbg added some commits Feb 17, 2012

@ellisonbg ellisonbg Adding new zmq web.
This makes it possible to run Tornado web request handlers in
a distributed and load balanced manner using ZeroMQ.
f261a5f
@ellisonbg ellisonbg Adding simple example of zmqweb. efa45d1
Owner

minrk commented Feb 18, 2012

Nice! Perhaps a less redundant name would be preferable. zmq.web.zmqweb.ZMQWebApplication is awfully repetitive. Also, the imports should probably de defined in zmq.web.__init__, so you can import directly from zmq.web. The same should be true of zmq.rpc.

Owner

minrk commented Feb 18, 2012

How does ZMQRequestHandler differ from web.RequestHandler?

Owner

minrk commented Feb 18, 2012

And in general, where you are subclassing/replacing tornado classes, you should make it abundantly clear what code is copied and what is changed, because right now it's extremely hard to tell.

Also, for imports, since you discuss at the end 'monkey patching' (what you do is not monkey patching), so that users can import our code instead of tornado.web, this import should really be zmq.web, not zmq.web.zmqweb.

Contributor

ellisonbg commented Feb 19, 2012

Public API wise, ZMQRequestHandler is the same as web.RequestHandler. Only the internal implementation details are different.

I will try to add more comments about what I have copied and what is changed - it is a bit of a nightmare to keep in synch with tornado.

The Monkey patching was not originally for the user's benefit but to enable the existing parts of tornado to pick up the right handlers. These handlers are used in tornado.web in methods we don't override - we need those methods to pick up our versions.

But you bring up a good point. Do we want to present zmq.web as a complete and API compatible version of tornado.web? There are some other classes that are in tornado.web that I haven't imported and stored references to. I sort of like that idea. Also, should we encourage people to use the original names or our "ZMQ" prefixed ones?

Owner

minrk commented Feb 19, 2012

Public API wise, ZMQRequestHandler is the same as web.RequestHandler. Only the internal implementation details are different.

Yes, this is what I expected, but reading the code, I didn't see anything that would suggest that there is anything in this handler that differs from tornado, and I didn't want to go make a diff myself. Keeping in sync with tornado is hard, so we need to document very clearly what our changes are when we are doing clobber&copy. I think you should writeup separately, or at least document inline quite thoroughly where the ZMQ subclasses are copied and changed from tornado, so we can see what it's doing, and why what tornado itself does cannot be made to work with existing APIs.

The Monkey patching was not originally for the user's benefit but to enable the existing parts of tornado to pick up the right handlers. These handlers are used in tornado.web in methods we don't override - we need those methods to pick up our versions.

Okay, that makes sense, but is not what the comment suggests to me. In that case, I would just leave it as-is with a more detailed comment explaining that these names are purely for use in the classes in this module, and not to be imported by anyone (e.g. the should be excluded from __all__).

Do we want to present zmq.web as a complete and API compatible version of tornado.web?

No, I think this is a bad idea. Monkeypatching is never good, and should be avoided whenever possible. A simple subclass approach, as we are doing here makes perfect sense.

@minrk minrk commented on an outdated diff Feb 19, 2012

zmq/web/zmqweb.py
+ req['uri'] = request.uri
+ req['version'] = request.version
+ req['headers'] = dict(request.headers)
+ body = request.body
+ req['remote_ip'] = request.remote_ip
+ req['protocol'] = request.protocol
+ req['host'] = request.host
+ req['files'] = request.files
+ req['arguments'] = request.arguments
+ req['args'] = args
+ req['kwargs'] = kwargs
+
+ msg_id = bytes(uuid.uuid4())
+ self.stream.send(msg_id, zmq.SNDMORE)
+ self.stream.send(jsonapi.dumps(req), zmq.SNDMORE)
+ self.stream.send(body)
@minrk

minrk Feb 19, 2012

Owner

Why do you never use send_multipart?

@minrk minrk commented on an outdated diff Feb 19, 2012

zmq/web/zmqweb.py
+ req['protocol'] = request.protocol
+ req['host'] = request.host
+ req['files'] = request.files
+ req['arguments'] = request.arguments
+ req['args'] = args
+ req['kwargs'] = kwargs
+
+ msg_id = bytes(uuid.uuid4())
+ self.stream.send(msg_id, zmq.SNDMORE)
+ self.stream.send(jsonapi.dumps(req), zmq.SNDMORE)
+ self.stream.send(body)
+ self._callbacks[msg_id] = callback
+ return msg_id
+
+ def _handle_reply(self, msg_list):
+ msg_id = msg_list[0]
@minrk

minrk Feb 19, 2012

Owner

Do you want to protect against invalid replies? If a peer sends a reply of the wrong length here, your stream will be disabled because the exception is uncaught.

@minrk minrk commented on an outdated diff Feb 19, 2012

zmq/web/zmqweb.py
+from zmq.eventloop.ioloop import IOLoop
+from zmq.utils import jsonapi
+
+#-----------------------------------------------------------------------------
+# Service client
+#-----------------------------------------------------------------------------
+
+class ZMQWebApplicationProxy(object):
+ """A client to a ZeroMQ based backend service."""
+
+ def __init__(self, loop=None, context=None):
+ self.loop = loop if loop is not None else IOLoop.instance()
+ self.context = context if context is not None else zmq.Context.instance()
+ self._callbacks = {}
+ self.socket = self.context.socket(zmq.DEALER)
+ self.socket.setsockopt(zmq.IDENTITY, bytes(uuid.uuid4()))
@minrk

minrk Feb 19, 2012

Owner

Are you sure you want to set the IDENTITY here? Setting identities causes the socket to become durable, which is rarely desirable, and is the only difference between setting a random identity and leaving the default.

@minrk minrk commented on an outdated diff Feb 19, 2012

zmq/web/zmqweb.py
+ transforms=transforms, wsgi=wsgi, settings=settings
+ )
+ self.socket = self.context.socket(zmq.ROUTER)
+ self.stream = ZMQStream(self.socket, self.loop)
+ self.stream.on_recv(self._handle_request)
+ self.urls = []
+
+ def connect(self, url):
+ """Connect the service to the proto://ip:port given in the url."""
+ self.urls.append(url)
+ self.socket.connect(url)
+
+ def bind(self, url):
+ """Bind the service to the proto://ip:port given in the url."""
+ self.urls.append(url)
+ self.socket.bind(url)
@minrk

minrk Feb 19, 2012

Owner

you track a list of urls, but don't distingush between bind and connect. Is there a reason to track them, and if so, should they be tracked separately?

ellisonbg added some commits Feb 22, 2012

@ellisonbg ellisonbg Updating zmqweb.
* flush and partial writes are now supported in the backend.
* Massive cleanup. We no longer require a custom RequestHandler
  subclass on the backend side of things. This means that
  standard request handlers can be used with 0 changes.
* Many fixes to address code review comments.
* Not using durable sockets.
f655491
@ellisonbg ellisonbg Removed XSRF handling in the zmqweb proxy. Handled in backend now. 24ac691
@ellisonbg ellisonbg Refactoring zmqweb.
* Much less is done in subclasses.
* We use leverage the existing write/flush/finish logic on both
  sides.
* New ZMQStreamingApplicationProxy/ZMQStreamingHTTPRequest that
  enables full streaming mode.
* Default ZMQApplicationProxy/ZMQHTTPRequest sends all reply parts
  in a single zmq message.
* Timeouts are working.
* Improved example.
239e120
@ellisonbg ellisonbg Updating examples for zmqweb. 32c9dcd
@ellisonbg ellisonbg Finishing streaming examples. 44e5977
@ellisonbg ellisonbg Implementing multi-identity handling. 54f7ef3
@ellisonbg ellisonbg Finishing multi-identity handling. 28cd124
Contributor

ellisonbg commented Feb 27, 2012

@minrk: OK I think this branch is ready to do, should I merge?

Owner

minrk commented Feb 27, 2012

Looking good. I don't suppose you want to add some tests?

Contributor

ellisonbg commented Feb 28, 2012

Is there any way to add tests for things that rely on an eventloop and
client? Seems like that is much more difficult to test in unit tests.
I have written the examples so they test the main points.

On Mon, Feb 27, 2012 at 11:58 AM, Min RK
reply@reply.github.com
wrote:

Looking good.  I don't suppose you want to add some tests?


Reply to this email directly or view it on GitHub:
#177 (comment)

Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

Owner

minrk commented Feb 28, 2012

Sure you can, though the easiest is a sort of half-test, where you start the apps (in this case, web server and workers) in subprocesses (workers can be threads in a single process with the webapp), and only test the fully assembled code from a client (using urllib here).

If you want, we can do that later, and just add simple import tests (easiest first check for py3) before merge.

-MinRK

On Feb 27, 2012, at 20:58, "Brian E. Granger"reply@reply.github.com wrote:

Is there any way to add tests for things that rely on an eventloop and
client? Seems like that is much more difficult to test in unit tests.
I have written the examples so they test the main points.

On Mon, Feb 27, 2012 at 11:58 AM, Min RK
reply@reply.github.com
wrote:

Looking good. I don't suppose you want to add some tests?


Reply to this email directly or view it on GitHub:
#177 (comment)

Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com


Reply to this email directly or view it on GitHub:
#177 (comment)

minrk and others added some commits Feb 29, 2012

@minrk minrk add simple import test for zmq.web ab6f843
@minrk minrk fix urlparse import for py3k 65a651a
@minrk minrk split Proxy classes from zmq.web.zmqweb into zmq.web.proxy bcd6a4b
@minrk minrk lazy-format logging messages
This avoids computation of the formatted message unless the message will actually be displayed.
a33e414
@minrk minrk minor typos in zmq.web de9d4e6
@minrk minrk catch any exception when unpacking requests
e.g. invalid JSON, rather than just message length and DELIM presence
b6c33b4
@ellisonbg ellisonbg Merge pull request #185 from minrk/splitweb
Split Proxy classes out of zmq.web.zmqweb into zmq.web.proxy.

Also not doing string formatting in calls to logging.
03499ff
@ellisonbg ellisonbg Updating license of zmqweb to BSD. 2f0bf6a
@ellisonbg ellisonbg Fixing small comment. 9077bc6
Contributor

ellisonbg commented Mar 1, 2012

OK I have address everything:

  • License updated to BSD.
  • Minor comment fixes.

I still don't have tests, but everything else is ready to go. I will also update the license in the rpcweb PR now as well.

Owner

minrk commented on zmq/web/zmqweb.py in 9077bc6 Mar 1, 2012

There's another similar comment a few lines above, which probably wants the same fix.

Owner

minrk commented Mar 1, 2012

I made a simple import test, which I put in before making the splitweb PR (and fixed a py3 import that it revealed), which is enough of a token effort to start out, I think.

Owner

minrk commented Mar 1, 2012

Okay, great. I think this is ready for master, then.

@minrk minrk added a commit that referenced this pull request Mar 1, 2012

@minrk minrk Merge pull request #177 from zeromq/zmqweb
add zmq.web

ZeroMQ-based handlers for scaling tornado web applications
d87f34e

@minrk minrk merged commit d87f34e into master Mar 1, 2012

pelson commented Apr 7, 2013

For those following this through, zmq.web was removed by #285.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment