-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hide the transport backing the HTTPChannel object from twisted.web Resource objects. #8191
Comments
The first draft patch is available and R4R. |
(In [46784]) Branching to hide-request-transport-8191. |
Thanks for your work on HTTP2 for Twisted, lukasa. And again thank you for taking the trouble to start off by breaking up this work into sanely digestible chunks. I really just have one big chunk of feedback here about the proposed architecture. I think changing the type of It's worth noting that neither Is there some reason that you want to leave transport stuff publicly exposed for HTTP2? Is there a feature you want to present there? Other than that, you can see some twistedchecker errors on the buildbot; so please fix those before resubmitting too :). Thanks again! |
We can do all that, sure. The goal here was to avoid substantially rearchitecting all the things. A lot of things in Do you have suggestions for how best to approach that? |
Regarding
I think that this should be decided before merging this branch. Either remove the comment or create a follow up ticket and convert it into a FIXME comment. Maybe things using directly the transport could be left with HTTP1/1 support only and slowly migrate the other code of twisted.web which uses the transport/channel to HTTP2 friendly API. You can leave the transport as it is now, and update the code to use the channel instead of the transport
With the current path we have both self.transport and self.channel which look like the same thing. This is confusing. So the roadmap could be:
Thanks for your work! |
So following Glyph's suggestion initially, here is a patch that deprecates |
|
(In [46811]) Branching to hide-request-transport-8191-2. |
Many thanks for the patch. It looks good. Are you happy with the result :) Here is a quick and dirty review :( feel free to ignore it if you find it to hard to follow. The deprecation need a dedicated news fragment. With the current patch, it looks like there were not many things in twisted.web using directly the channel or the transport... not sure how external projects were handling this. I don't understand this instruction
I assume it want to say something like:
I don't know what to say about send100Continue. I would say that we should first look at how we want to have the API support for #6928 I think that the method is ok, but do we also have this method on the Channel ? Maybe rename it to just I don't know what to say about the new methods on the channel. Why is Request using Request._transport ? For example for this code:
why not have
BTW. I think there is a bug here as it should not call I was expecting that we would just have
and that for StringTransport we would have some support in the HTTPChannel Basically, Request will not have direct access to the transport, but will use the channel for all its I/O needs. Maybe for the first patch, it ok to have both Request._transport and Request._channel but as a follow up maybe we can look how we can get rid of Request._transport Patch applied and sent to buildbot. I guess that some test will fail due to internal usage of self.channel which is not deprecated. Again, sorry for the messy review, but I hope it help. Leaving this for review as I only did a quick review and I have little information about how twisted.web should be designed. Thanks! |
Maybe this review should also be done in parallel with [#8194](#8194) to check that this changes make sense and will help the implementation from #8194 |
@adiroiban Thanks for your review, it's very helpful! As noted, we'll leave this open to more review. =) So before I dive in I think you're misunderstanding some of the code here and I want to make sure we're on the same page here:
The reason This is why the |
This stack overflow question points to an interesting use-case: (It also indicates that we don't have a good story around |
Ok, I've uploaded the third draft patch that uses adirioban's deprecatedProperty work instead of my hand-rolled deprecations. |
hawkowl provided some informal feedback in IRC that centered around concerns about send100Continue potentially allowing code to send misframed HTTP messages. We agreed on two changes:
The first part was easy, but the second part is a lot trickier, because the HTTPChannel class did not previously keep track of where it was in the response cycle (this, by the way, is a pervasive problem in twisted.web: it did not prevent the sending of malformed HTTP, it just avoided it by use of an implicit state machine distributed across about three different classes). To that end, I've added the barest minimum of state tracking to the HTTPChannel class by introducing a set of flags that track where in the response cycle we are. This of course adds restrictions that were never previously present in the HTTPChannel class, so I also had to add some tests to ensure that those restrictions were enforced. This sadly inflates the patch a bit, but you gotta do what you gotta do. |
(In [47052]) Branching to hide-request-transport-8191-3. |
Hi Lukasa, thanks for your continued effort on this ticket. Thanks for adding some new tests, but with some things moved about, there's some changed code with red lines as per https://codecov.io/github/twisted/twisted/commit/117ab382091660c27b84608932204b315b425344 -- would you be able to get these green, where possible? There are also no tests for the deprecations -- please see test_http.RequestTests.test_addCookieNonStringArgument as an example of this. Some docstrings refer to Python identifiers without being wrapped in As mentioned on IRC, the tests also fail on Python 3 in the XMLRPC sections -- writeHeaders in two locations have bare strings, which are unicode on 3.4. Thanks again for this. I am in favour of these changes, and some of the cleanups that have been done to make it all work are very nice. Not writing raw |
I've applied the feedback from hawkie. This should improve the test coverage everywhere except for the following places:
Otherwise, this should have sorted out most of our problems. |
(In [47059]) Branching to hide-request-transport-8191-4. |
The sixth draft patch is now uploaded: this should fix the twistedchecker and hopefully the Python 3 failures as well. |
(In [47101]) Branching to hide-request-transport-8191-5. |
This change will break almost all Crossbar.io deployment out there, and many AutobahnPython based WebSocket server deployments: crossbario/autobahn-python#641 IMO, this http://twistedmatrix.com/trac/ticket/3204 should have been fixed before landing this .. because there should be a clean, supported way of taking over a transport .. |
There are some hacks on Autobahn's side though: https://github.com/crossbario/autobahn-python/blob/master/autobahn/twisted/resource.py#L113 I'd love to get rid of those of course. What we would need is a clean, officially sanctioned way of taking over the transport on a Web resource and get all the bytes that originally arrived on that. Eg, consider a Web resource on path How would I do that? |
What's the actual failure you're seeing on autobahn? The goal here was to deprecate but not to remove, so the code should still function. |
It seems to be breaking code (as reported here crossbario/autobahn-python#641) - I've asked to get the traceback. I am wondering if this is about new style, old style or hybrid classes. What is t.w.http.Request of kind? Ideally, it should be pure new style .. |
t.w.http.Request is an old-style class, sadly, which is why this patch had to use a |
The traceback (which occurs during in https://github.com/warner/magic-wormhole , in
|
As discussed on the other issue, the core problem is that we changed request.transport to be the HTTPChannel instead. This was deliberate: HTTP/2 needs it. However, it means there's now a state machine enforcing the writes through to the underlying transport, which Autobahn obviously breaks (why wouldn't it?). If we want, we can remove that state machine: hawkowl wanted it to enforce some things around 100 Continue responses, which is where this state machine came from. Alternatively, we can go back to allowing access to the transport as _transport: that's ok for now as we're deprecating that access, but it needs to be gone before HTTP/2 lands properly, or code like Autobahn's might start doing some really stupid stuff. Leaving that up to the Twisted maintainers. |
Assertions are a violation of the Twisted coding standard, so we need to eliminate that regardless. I'm sorry I didn't take more time to provide feedback on this issue, but to quote my one early piece of relevant review feedback:
So um. Yeah. It is. I think we might need to put out a fix (perhaps even a point-version fix?) to roll this back. In the future, the right way to make a change like this is to preserve the old behavior, but to have a new API entry-point (for example: a new keyword argument to At the very least though: no asserts. Asserts are an anti-pattern; it's a way of saying "we cared enough to give you an error, but not enough to tell you what the hell is going on or allow you to handle it". Which is to say: we cared enough to break your program, but not enough to give you the way to fix it. Especially asserts like this which don't even have an error message. |
@glyph: regarding asserts being an anti-pattern - would you say this also applies to this use? assert False, "bad things happened - you need to do X, not Y" And this is bad too? def say_hello(msg):
assert type(msg) == six.text_type, "msg must be unicode, but was {}".format(type(msg)) |
Replying to oberstet:
Yes. In both of these circumstances, In the first case, what bad thing happened? In the second, depending on the nature of the nature of the type mismatch, In no case is |
Replying to glyph:
No, they're not (http://twistedmatrix.com/documents/current/core/development/policy/coding-standard.html). :P
This is not in a released version of Twisted, soooo we needn't worry too much yet. I don't think this is an incompatible change, it doesn't change the "type", it only changes the attribute fetching of the attribute. |
I have no aversion to rolling this back and going in a different direction. However, I don't think chasing this direction is the right way to go. The Autobahn devs have made it pretty clear that they are uninterested in adjusting to a version of Twisted where they can't get at the transport from the Request, because otherwise they have no logical hook to implement their upgrade. That's reasonable enough from my position. However, that requirement makes it very hard to justify landing the HTTP/2 support in its current form, because we'll end up in one of the following situations:
Neither of these is good, so we really need a third way. What we need is a formalised system whereby a twisted.web consumer (either twisted.web itself or something like autobahn) can tell twisted.web that it needs to "upgrade" the connection by replacing the channel with something else (websockets or HTTP/2 or w/e). I have no idea how that should look at this time. I'll have a think. |
Hm. As for problem 2.2 (incautious consumers), what if any access to the (or maybe catch Then in the longer run, we use a specially-marked Resource or Site to |
(In [47306]) Branching to hide-request-transport-8191-6. |
Whoops, I reviewed this when I merged it. |
Ok, so here's the state of play. Part of this patch is still necessary for HTTP/2: specifically, the part where we stop Request from just writing into .transport. Unfortunately, to avoid breaking downstream like we did here, .transport must be either a StringTransport (because of f!***ing pipelining support) or it must be the underlying transport object. However, I want all writes from the Request to pass through either the StringTransport (because pipelining) or the channel. That doesn't work with the current transport field. That means there are two options. The first is that H2Stream.transport can be set to The second option is to add a Any thoughts here? |
Ok, I've implemented the |
So we've since gone a whole other way on this. We merged #8230, which removes the pipelining support from twisted.web. This means we can now simply use a patch that changes the t.w.h.Request object to only ever write to the channel (which is basically a subset of the previous patches), which is now always going to be valid. For the moment I plan to leave the transport on: HTTP/2 will simply refuse to expose the transport, throwing exceptions if you try to get at it. Instead, we only need a patch to change where the Request objects write to. That's my next step here. |
Ok, so I've just done another pass over this work, which can be found on a branch of my GitHub fork. Please now ignore the patches linked on this issue: they don't represent the most up-to-date representation of the work. This change now restricts itself to internal changes in twisted.web.http.Request. Specifically, it changes all the Twisted code to stop writing to the transport, and instead to write directly to the channel. There are some things that should be noted here:
Regardless, this is now open for review. |
Hi, thanks for this. Two things:
Please fix these and resubmit. |
I also checked the coverage, and:
That's all for now; I think this is pretty close. RE 2 of comment 46, I believe this belongs in another branch, maybe one where H2 starts landing. I would say failing hard would be the way -- maybe no |
Ok, I believe I've addressed those comments with further commits on my branch. Resubmitting. |
Pulled it in, spun the builders... |
my expert review is as follows
|
Pushed a new commit. =) |
In changeset b313e7a
|
[mass edit] Removing review from closed tickets. |
Part of a series of patches needed for HTTP/2 support: see #7460 for more.
This patch rewrites the HTTPChannel class somewhat to avoid exposing the transport directly to the Resource object in twisted.web. Essentially, the HTTPChannel picks up the ITransport interface, which it proxies directly to its underlying transport.
The reason for this apparently unnecessary indirection is because, while HTTP/1.1's framing layer is simple enough that the TCP transport can be treated like a dumb data pipe, the same is not true of HTTP/2. This means that in the HTTP/2 layer it is extremely useful to have the HTTPChannel equivalent expose the ITransport protocol directly, where it can do many cleverer things than the HTTPChannel does here. For the sake of parity, then, it's useful to expose this interface for HTTPChannel.
This patch currently adds no new tests, but I'm happy to add some if reviewers have suggestions of what needs testing. This patch also contains one TODO note that I'd like reviewers to weigh in on.
Attachments:
Searchable metadata
The text was updated successfully, but these errors were encountered: