New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reverse protocol handling for WebSocketResource #641
Comments
Hi @warner, the test-suite can be run with |
Ah, thanks. |
http://twistedmatrix.com/trac/ticket/3204 8 years old bug. The background is: to add a Twisted Web resource that runs WebSocket, we (Autobahn) have to get at the full initial HTTP request bytes - which have been eaten away by Twisted Web before. I consider the code in Autobahn a "gentle hack" .. due to lack of appropriate interfaces in Twisted (and me, and that time, not having the patience to go through months of discussions to land something on Twisted;) |
Twisted has made an attribute private that was formerly public?
I'm not sure what this is supposed to mean;) Anyway: Twisted should IMO have an API so that I can take over the full HTTP request on a Web resource .. all the bytes from byte 0 on .. |
Btw: this is a bad one .. it'll basically break almost all Crossbar.io deployments .. |
@warner could you please post the traceback you get from your tests here? |
It feels like the websocket code is attempting to do a I don't think that's quite it, though, because I'd expect the websocket code to be doing There might be something else in my code that's triggering an error at an unusual time, causing the websocket protocol to shut down at an unusual time (maybe before connection negotiation has completed?), and then maybe some infrequently-used error path is writing to the wrong place? |
Leaping in quickly: that makes sense. I wouldn't be surprised if the websocket code grabs |
Oh, wow, I missed that: they not only deprecated access, they completely redefined what
I wonder if that was intentional, or if transports and channels were so similar (or were just always used in the same way) that it made sense to make that change. |
Heh, that wasn't a "they", that was me. =) The concern here was that quite a lot of twisted.web code would grab the backing transport directly, which breaks with HTTP/2 (because you can't just write bytes onto the transport and expect that to work). The goal, then, was to pass all writes through the channel, which "owns" the transport. Clearly, Autobahn was doing what the rest of twisted.web did (which was just grab the transport out of the request, without considering what the channel might want to do). Again, Autobahn can just grab the transport out of the channel: that should work on all Twisted versions. |
Hah :). I realized I probably truncated the traceback here.. it actually starts in I attached the full traceback to the twisted bug (8191). |
Hm, maybe there's something AutoBahn could do to put the channel into the SENT_HEADERS "let me write bytes now" state and then just leave it there forever. |
@warner AutoBahn could, but I don't really know why it'd bother when it can just grab the actual transport instead. |
Ah, cool. |
Weell. I am a bit concerned about all this stuff going on before Autobahn can grab the transport. Because of performance: WebSocket connection establishments / sec. Note: I don't care at all about HTTP/2 .. I am thinking about if we can do it reverse: Autobahn has the listening socket, and only if it doesn't match a configured WebSocket path, it'll hand over to Twisted Web .. |
@oberstet So that's an understandable concern, but it's already true. We didn't add any extra work here. =) That said, you'd almost certainly get a huge performance boost by grabbing the transport earlier. =) If you're interested in a way you could do it, you might want to check out the |
Another reason we probably will go "reverse": the main use case for this feature in Autobahn (having WebSocket on a Twisted Web resource) is in Crossbar.io. And we want to have the ability to have both WebSocket and RawSocket (a WAMP transport) on the same listening port. And the latter is distinguished from WebSocket by a first magic octet (x7f) that isn't valid for any HTTP. This is by design (of RawSocket) .. |
I'd expect that |
@oberstet All of this seems like a really good reason to prioritise having a "pluggable" protocol in Twisted, letting you swap them in and out in a defined way (as you suggested initially).
Right now it would because it defaults to HTTP, but you could swap out the approach. It basically just overloads dataReceived and swaps some backing stuff around, so if you wanted to you could do something very similar and allow first octet |
A "pluggable" protocol sounds desirable .. I agree. At least for Crossbar.io, ideally it would work like this:
|
FWIW, I now realize that the HTTP/2 thing in Twisted Web now, even if we change how to get at the transport, will shuffle each and every write through its code and state machine. This is an absolute no-go for Atuobahn because of performance. So, the real solution to this issue for AB is to bypass Twisted Web in the first place altogether (the "reverse" thing above) .. |
@Lukasa when is this goin to land on twisted? I mean, released .. |
@oberstet No it won't. If you grab the actual transport ( |
No, because of the reasons above, we'll do it "reverse" .. that fixes this, and allows us to co-listen for WebSocket and RawSocket. As said, I have no intention / use case for HTTP/2 - so this is basically a drag .. but the good thing: we'll get the co-listening thing in CB;) |
@Lukasa when does it land? |
Mmh, ok, then we'll probably be better for now to pin Twisted at <= 16.1.1 for Autobahn and Crossbar.io .. |
Great! Thanks for diving into this so quickly.. I'm really digging Autobahn so far, it was a big improvement over my Server-Sent-Events approach. |
Thanks for the nice words=) Cool to have you as a user .. I mean, Buildbot, Tahoe .. gems! And BB9 is gonna use Autobahn/Crossbar.io at the WAMP level .. pretty exciting;) |
@warner this is how the "reversed" thing looks like https://github.com/oberstet/autobahn-python/blob/multisocket/examples/twisted/multisocket/echo/server.py#L75 it'll work with all Twisted versions, shield from the HTTP2 fallout, does support Web+WebSocket+RawSocket all on 1 listening port and is faster. no more WebSocketResource at all. does that work for you? |
@oberstet - it seems like this would break idioms like authenticating access to websocket resources, or using a different websocket factory for different users, or based on a cookie; things like that. It also breaks the ability to compose multiple websocket applications in a single process via delegation, as opposed to hard-coding the integration up front in a configuration file. There appears to be a lot of misguided premature optimization going on in this thread; for example, how "will shuffle each and every write through its code and state machine" and this is a performance problem. You can't possibly have performance tested this, because your assumption is incorrect, so how do you know it's a performance problem? :) PyPy tends not to penalize low-overhead abstractions like a simple state machine. It would also seem to break the ability to speak HTTP/2 and websockets on the same port, because it would interpose between the necessary NPN handshaking interface and the Site object. Websocket clients themselves will of course be http/1 only, but HTTP/2 is a huge optimization in terms of latency for browsers that support it; and latency usually dominates in browser/server conversations. So if we're concerned about performance, this is the place to do it. All in all it seems like the right solution is to address http://twistedmatrix.com/trac/ticket/3204 properly, or to continually be hacking around Twisted's object model and breaking edge-cases. I realize that it's a bit of work to contribute to Twisted but avoiding it in the first place has not had a great result in this case :). |
@glyph You left out the elephant;) We have a need to run RawSocket, WebSocket and Twisted Web all on one port. Only the latter 2 are HTTP like at all. The first one starts with a So I am looking for a clean way of serving my own protocol, and only if I wish so, switch over to Twisted Web. Is that possible?
Contributing to Twisted: I was hoping that (partially) paying @hawkowl to work on Twisted stuff would count;) Now, she probably has more the Twisted perspective and needs on radar rather than Autobahn/Crossbar.io =( Anyway, I have way too much to do .. |
A bit of brainstorming here, hacking on top of @oberstet's example above; is this more-like something that could go in Twisted proper? Essentially it just re-organizes @oberstet's code a bit, around an (order-matters) list of @glyph I read through 3204, and presume that my |
Hmm, okay more reading of 3204 seems you're more looking for the actual protocol-switch to be more like |
I'm not keeping score here :). Of course your contributions to @hawkowl's time on Twisted are deeply appreciated. It's just that, technically, on this specific issue, a fix in Twisted would be the right way forward.
Actually, she did specifically raise this issue as a potential problem for Autobahn as we were discussing it; we tried to come up with a solution that was as low-impact to Autobahn specifically as possible. I think the sticking point here is just that this code path happened to be un-covered by tests? |
Mutation is always more efficient than delegation. PyPy saves you a bit on some kinds of delegation, making them almost as cheap, but at the very least you have the memory for an extra pointer floating around. Share all the mutable state! :-( |
@meejah @oberstet - btw, Tubes took this problem into account early in its design. See http://twisted.github.io/tubes/docs/tubes.tube.Diverter.html for example. |
To answer the main concern here, for @oberstet:
That is definitely possible. But it also, by definition, leaves out the possibility of properly treating your protocol as an HTTP resource. Instead it's a hack to trade off the resource abstraction against wire speed. However, I should note that this kind of abstraction-breaking hackery is basically a best practice in the HTTP world. Putting nginx in front of things and manually maintaining a URL mapping in its configuration file which is specific to and tightly coupled with your application, yet managing it separately, is totally the normal way most web services are deployed. Everybody denormalizes and shards all of their data in their SQL databases, and manually (and usually incorrectly) manages caching into NoSQL or memory caches beyond that. In the face of these widespread obviously broken patterns, maintaining a tiny separate URL mapping for a few websockets on an arguably dedicated service is an infinitesimally small offense :). I wouldn't want something like this in Twisted itself, because it breaks its abstraction model, but allowing you to break the abstraction when necessary is part of the layered nature of Twisted's API, so there's no hard reason why Autobahn shouldn't do this. |
@glyph The Here is what I came up now (which works ... but it is a new hack;) I think something in the direction what @meejah suggests is even better! Less "ad hoc", and a user can modify the mapping from Request-URI to protocol more flexibly and also after construction. Mmh. Maybe we need both? I mean, having something like Autobahn's WebSocketResource, but using officical Twisted machinery and where Twisted Web is owning the listening socket and something like the Diverter or MultiSocket, where the latter is sitting on the socket, and Twisted Web gets switched over to. |
@glyph This is totally OT rgd this thread, but going to Zero Inbox, Email Bankruptcy and "We’re In This Together" is awesome=) Absolutely. I give up on catching up;) https://glyph.twistedmatrix.com/2016/04/email-isnt-the-problem.html |
I feel like we haven't made something clear here: the only time bytes pass through the HTTP/2 state machine is if you're actually using HTTP/2. If you don't negotiate HTTP/2 in the ALPN connection you pay no cost for the HTTP/2 support. Given that WebSockets-over-HTTP/2 does not exist, the only possible place that HTTP/2 could be a performance problem for Autobahn is in the actual serving of resources over HTTP/2, in which case some early benchmarking from me suggests that the various HTTP/2 efficiency gains lead to improved performance over HTTP/1.1 in Twisted. |
@oberstet FWIW, I quickly rebased the patch from Twisted's 3204 to Twisted trunk, and further hacked my example above so that it does So, if that gets patched up and landed in twisted then potentially something like the hacked up |
@oberstet I understand that crossbar.io wants to control the entire interaction, and only delegate to twisted.web for HTTP interaction. But personally, I have no interest in giving autobahn control at that level. And it isn't necessarily even possible to do that ... if I am writing a plugin for some project that just expects me to provide a |
@tomprince Yeah, I do understand that. Which is the reason I think that ideally we (Twisted + Autobahn) find a way to support both (having Autobahn take control and handover to Twisted Web and having Twisted Web take control and handover to Autobahn). @Lukasa That is certainly good (not shuffling bytes through a HTTP2 state-machine), but Crossbar.io needs are: we have to take control, since we want RawSocket on the same port. Plus: we don't care about HTTP2 at all. The performance of Twisted Web on HTTP/1.1 serving static assets is limited by Twisted Web not employing in-memory caching of static assets from files. https://github.com/crossbario/crossbarexamples/tree/master/benchmark/web. Then, HTTP2 handshake is more complex than HTTP1, so WebSocket opening handshakes / sec will likely be slower .. but I haven't measured that. What I did measure is: WebSocketResource on Twisted Web HTTP1 is slower on WS handshakes/sec than pure WS (Autobahn). |
So I'm totally understanding of crossbar's needs: I'm not telling you what you should and shouldn't have. =) I just want to make sure you're aware of what the facts are with Twisted's HTTP/2 support, so you can make the best decision possible. =) When you say the HTTP/2 handshake is more complex than HTTP/1, what do you mean exactly? |
@Lukasa The (sad) truth is: neither WebSockets-over-HTTP/2 nor WebSocket-MUX exists. This sucks, but the relevant IETF WGs failed to reach consensus on both. The reasons for this from my perspective are at least partially politics of big G. They've been on both parties. Anway. From my understanding, the HTTP2 handshake at least needs to establish one HTTP2 stream to actually shuffle data, after the actual HTTP2 Upgrade (both peers agreed to talk HTTP2) already happened. |
Twisted doesn't support HTTP/2 plaintext upgrade (that is, upgrade using the Upgrade header). This is mostly for technical reasons: as you know, right now there are no good APIs for doing plaintext upgrade. This oversight is acceptable because no browsers support it either. Twisted and the browsers support negotiating HTTP/2 using ALPN. This is a TLS extension which makes the negotiation happen during the handshake. That means negotiating HTTP/2 is no slower than negotiating HTTPS. That means the Time To First Byte is the same, but HTTP/2 imposes a minor (usually 24-byte) overhead for the initial SETTINGS frame exchange. This doesn't add much latency (it's asynchronous), and is the only difference between HTTP/2 and HTTP/1.1 over TLS. Given HTTP/2 header compression, the first request ends up taking roughly the same time for both, at which point HTTP/2's various efficiency improvements cause latency to be much lower for H2 that HTTP/1.1. |
I only care about latency until WebSocket takes over. I don't care about header compression or general latency of HTTP request (because we don't use that other than for serving a handful SPA assets). HTTP2 will establish at least one HTTP2 stream before WebSocket kicks in? Anyway: this is all moot from Crossbar.io's perspective, because we want RawSocket, and the latency when Autobahn is in charge will be lower - I don't really have to measure that, since I know the code of Autobahn, and cycles spent will be less - the "Upgrade" to RawSocket using the MultiSocket POC I wrote above is absolutely minimal. We do care about WebSocket/RawSocket handshakes/sec as one performance metric. For Crossbar.io at least, we won't use WebSocketResource anymore beginning with the next release. Of course I can't speak for other users of Autobahn using it at the plain WebSocket level .. they might want to continue using WebSocketResource. I would be interested in a ALPN based negotiation mechanism for WebSocket though -- but that isn't spec'ed as far as I know=( |
Not as far as I know. There is no precedent for how to establish a WebSocket connection from a HTTP/2 connection. That means, as far as I know, that no browser will connect to a wss:// URL and negotiate HTTP/2 within it: they will always negotiate HTTP/1.1. That should mean, again, that in practice having HTTP/2 enabled for Autobahn/Crossbar is just a no-op: it doesn't affect websockets at all. |
My point about latency is that you may be serving both traditional-web-application and websocket/autobahn interfaces for the same application. I understand that you could do this a different way, but keeping all your services the same can really simplify deployment pipelines, and it's definitely something that Twisted works hard to make possible, and Autobahn specifically allows for. But, as @Lukasa explained - there's no overhead beyond the initial TLS handshake for websocket connections anyway, since |
Alright, here is an update:
I think supporting both directions for integration between Twisted and Autobahn like in above is a good approach and should cover most users needs in this area. Of course, both features internally use "hacks" (unsupported, non-official "API" in Twisted) - and we should use official Twisted machinery once it is there. For now, "it works" .. I close this now - if there are still open aspects, please come up, we can reopen should there be any. |
I have a set of unit tests (in https://github.com/warner/magic-wormhole) that recently started failing when I combine autobahn-0.13.1 and Twisted trunk. It looks like twisted/twisted@4f9e36d (landed 29-Mar-2016) is where the problem started (that's the first revision after Twisted-16.1.1 was branched off that causes my project's tests to fail).
The commit says:
And my tests are using
WebSocketResource
to put a websocket on a specific URL. I haven't tracked down the specific failure yet, but it seems like the server is having an internal error when the client attempts to connect to it, or the server isn't accepting the connection at all.The relevant Twisted bug is at https://twistedmatrix.com/trac/ticket/8191 .
When I get some time (maybe next week), I'll figure out how to run autobahn's test suite against various versions of Twisted and see what happens.
The text was updated successfully, but these errors were encountered: