Hide the transport backing the HTTPChannel object from twisted.web Resource objects. #8191

Closed
twisted-trac opened this issue Feb 1, 2016 · 55 comments

Comments

@twisted-trac

Lukasa's avatar @Lukasa reported
Trac ID trac#8191
Type enhancement
Created 2016-02-01 17:46:13Z
Branch https://github.com/twisted/twisted/tree/hide-request-transport-8191-7

Part of a series of patches needed for HTTP/2 support: see #7460 for more.

This patch rewrites the HTTPChannel class somewhat to avoid exposing the transport directly to the Resource object in twisted.web. Essentially, the HTTPChannel picks up the ITransport interface, which it proxies directly to its underlying transport.

The reason for this apparently unnecessary indirection is that, while HTTP/1.1's framing layer is simple enough that the TCP transport can be treated like a dumb data pipe, the same is not true of HTTP/2. In the HTTP/2 layer it is therefore extremely useful to have the HTTPChannel equivalent expose the ITransport interface directly, where it can do many cleverer things than the HTTPChannel does here. For the sake of parity, then, it's useful to expose this interface on HTTPChannel too.
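A minimal sketch of the proxying idea described above, not the code from the patch. A stand-in channel forwards a few transport-style methods to whatever transport backs it. `FakeTransport` and `ProxyingChannel` are hypothetical names; only the method names (`write`, `writeSequence`, `loseConnection`) are taken from `ITransport`.

```python
class FakeTransport:
    """Hypothetical in-memory transport that records what is written."""

    def __init__(self):
        self.written = []
        self.disconnecting = False

    def write(self, data):
        self.written.append(data)

    def writeSequence(self, chunks):
        self.written.extend(chunks)

    def loseConnection(self):
        self.disconnecting = True


class ProxyingChannel:
    """Hypothetical stand-in for a channel that exposes transport methods."""

    def __init__(self, transport):
        self._transport = transport

    def write(self, data):
        # For HTTP/1.1 the bytes can pass straight through.
        self._transport.write(data)

    def writeSequence(self, chunks):
        self._transport.writeSequence(chunks)

    def loseConnection(self):
        self._transport.loseConnection()


transport = FakeTransport()
channel = ProxyingChannel(transport)
channel.write(b"HTTP/1.1 200 OK\r\n\r\n")
channel.writeSequence([b"hello", b" world"])
channel.loseConnection()
```

An HTTP/2 equivalent could keep the same method names but frame the data before it reaches the connection, which is why callers would hold the channel rather than the raw transport.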

This patch currently adds no new tests, but I'm happy to add some if reviewers have suggestions of what needs testing. This patch also contains one TODO note that I'd like reviewers to weigh in on.

Attachments:

  • 8191_1.patch (12804 bytes) - added by Lukasa on 2016-02-01 17:47:30Z - First draft patch
  • 8191_2.patch (27371 bytes) - added by Lukasa on 2016-02-15 22:26:50Z - Second draft patch.
  • 8191_3.patch (26183 bytes) - added by Lukasa on 2016-03-16 16:27:36Z - Third draft patch.
  • 8191_4.patch (34312 bytes) - added by Lukasa on 2016-03-17 11:00:19Z - Fourth draft patch
  • 8191_5.patch (41555 bytes) - added by Lukasa on 2016-03-23 13:10:00Z - Fifth draft patch
  • 8191_6.patch (41628 bytes) - added by Lukasa on 2016-03-23 19:27:27Z -
  • 8191_7.patch (29341 bytes) - added by Lukasa on 2016-04-28 08:48:28Z - Seventh draft patch
Searchable metadata
trac-id__8191 8191
type__enhancement enhancement
reporter__Lukasa Lukasa
priority__normal normal
milestone__None None
branch__hide_request_transport_8191_7 hide-request-transport-8191-7
branch_author__hawkowl__adiroiban__glyph hawkowl, adiroiban, glyph
status__closed closed
resolution__fixed fixed
component__web web
keywords__None None
time__1454348773210408 1454348773210408
changetime__1474198049187875 1474198049187875
version__None None
owner__Amber_Brown__HawkOwl___hawkowl_____ Amber Brown (HawkOwl) <hawkowl@...>

@twisted-trac

Lukasa's avatar @Lukasa commented

The first draft patch is available and R4R.

@twisted-trac

Lukasa's avatar @Lukasa commented

For those keeping track of ordering, this patch blocks #8194.

@twisted-trac

glyph's avatar @glyph commented

(In [46784]) Branching to hide-request-transport-8191.

@twisted-trac

glyph's avatar @glyph set owner to @Lukasa

Thanks for your work on HTTP2 for Twisted, lukasa. And again thank you for taking the trouble to start off by breaking up this work into sanely digestible chunks.

I really just have one big chunk of feedback here about the proposed architecture.

I think changing the type of request.transport might be an incompatible change. It's certainly an incompatible change without documenting it :-). More importantly though: why do you want to do it? Wouldn't it just be better to deprecate transport with the goal of removing it entirely?

It's worth noting that neither Request.channel nor Request.transport are part of IRequest, and ideally I don't think we should be exposing either publicly. Is there any need for applications to access them? Perhaps we need an IConsumer so that we can get backpressure when writing to a channel, but in that case that should be added to IRequest and done in as isolated a fashion as possible, rather than mixed in with all the rest of the methods on ITransport.

Is there some reason that you want to leave transport stuff publicly exposed for HTTP2? Is there a feature you want to present there?

Other than that, you can see some twistedchecker errors on the buildbot; so please fix those before resubmitting too :).

Thanks again!

@twisted-trac

Lukasa's avatar @Lukasa commented

More importantly though: why do you want to do it? Wouldn't it just be better to deprecate transport with the goal of removing it entirely?

Perhaps we need an IConsumer so that we can get backpressure when writing to a channel, but in that case that should be added to IRequest and done in as isolated a fashion as possible, rather than mixed in with all the rest of the methods on ITransport.

We can do all that, sure. The goal here was to avoid substantially rearchitecting all the things. A lot of things in twisted.web love to reach into the Request and grab the transport and channel attributes. We absolutely could deprecate those things and replace them with a proper interface, but I don't really know what the procedure is or how that would affect the timeline.

Do you have suggestions for how best to approach that?

@twisted-trac

adiroiban's avatar @adiroiban commented

Regarding

# TODO: Does this need to be smarter, particularly about queued

I think that this should be decided before merging this branch.

Either remove the comment or create a follow up ticket and convert it into a FIXME comment.


Maybe things that use the transport directly could be left with HTTP/1.1 support only, and the rest of the twisted.web code that uses the transport/channel could slowly migrate to an HTTP/2-friendly API.


You can leave the transport as it is now, and update the code to use the channel instead of the transport

self.channel.writeHeaders(*self._queuedHeaders)

With the current patch we have both self.transport and self.channel, which look like the same thing. This is confusing.


So the roadmap could be:

  • Update Request to write things to the channel, and to not bypass it. This is this patch.
  • Update twisted.web to no longer grab Request.transport but rather use the channel. This might need multiple tickets.
  • Once no code in twisted.web is using Request.transport, deprecate Request.transport in a new ticket, stating that Request.channel should be used as a replacement.

Thanks for your work!

@twisted-trac

Lukasa's avatar @Lukasa commented

So, following Glyph's suggestion, here is a patch that deprecates Request.transport and Request.channel. Please take a look at it, paying careful attention to what I had to do for twisted.web.server.Request (an http.Request subclass).

@twisted-trac

Lukasa's avatar @Lukasa removed owner

@twisted-trac

adiroiban's avatar @adiroiban commented

(In [46811]) Branching to hide-request-transport-8191-2.

@twisted-trac

adiroiban's avatar @adiroiban commented

Many thanks for the patch. It looks good.

Are you happy with the result? :)

Here is a quick and dirty review :( feel free to ignore it if you find it too hard to follow.


The deprecation needs a dedicated news fragment.

With the current patch, it looks like not many things in twisted.web were using the channel or the transport directly... not sure how external projects were handling this.


I don't understand this instruction

Call directly into the Request object instead.

I assume it wants to say something like:


Call the channel method with the same name directly into the Request object.


I don't know what to say about send100Continue.

I would say that we should first look at how we want to have the API support for #6928

I think that the method is ok, but do we also have this method on the Channel ?
Maybe we can go without it, at least for the start so that we have less to deprecate.

Maybe rename it to just continue(), but this is a minor comment.


I don't know what to say about the new methods on the channel.
I assume they are required because for HTTP2 we will have a separate channel implementation with separate handling for those new methods.


Why is Request using Request._transport ?

For example for this code:


        if self._send100:
            self.channel.send100Continue()
        if self._queuedHeaders:
            self._transport.writeHeaders(*self._queuedHeaders)
            self._queuedHeaders = None

OR

        # if we have producer, register it with transport
        if (self.producer is not None) and not self.finished:
            self._transport.registerProducer(self.producer, self.streamingProducer)

why not have

        if self._send100:
            self._channel.send100Continue()
        if self._queuedHeaders:
            self._channel.writeHeaders(*self._queuedHeaders)
            self._queuedHeaders = None

AND

        # if we have producer, register it with transport
        if (self.producer is not None) and not self.finished:
            self._channel.registerProducer(self.producer, self.streamingProducer)


BTW, I think there is a bug here: it should call self._channel, not self.channel.

I was expecting that we would just have

    @property
    def transport(self):
        # Warning here.
        return self._channel.transport

and that for StringTransport we would have some support in the HTTPChannel

Basically, Request will not have direct access to the transport, but will use the channel for all its I/O needs.

Maybe for the first patch it's ok to have both Request._transport and Request._channel, but as a follow-up maybe we can look at how to get rid of Request._transport.


Patch applied and sent to buildbot.

I guess that some tests will fail due to internal usage of self.channel which is not deprecated.

Again, sorry for the messy review, but I hope it helps.

Leaving this for review as I only did a quick review and I have little information about how twisted.web should be designed.

Thanks!

@twisted-trac

adiroiban's avatar @adiroiban commented

Maybe this review should also be done in parallel with #8194 to check that these changes make sense and will help the implementation in #8194.

@twisted-trac

Lukasa's avatar @Lukasa commented

@adiroiban Thanks for your review, it's very helpful! As noted, we'll leave this open to more review. =)

So before I dive in, I think you're misunderstanding some of the code here, and I want to make sure we're on the same page:

Why is Request using Request._transport ?

The reason Request uses _transport is that _transport is conditionally a StringTransport, not an HTTPChannel. This is because the HTTPChannel allows HTTP pipelining: essentially, there can be more than one live Request per HTTPChannel. However, only one Request can actually write to the channel at any time (because that's how HTTP/1.1 works), so each Request object queued behind the current one sets up a StringTransport into which all of its write calls go. Then, when it is unqueued, the contents of the StringTransport are poured onto the TCP connection.

This is why the _transport field exists on the Request. I've been extremely reluctant to change this dynamic because it would require a pretty sizeable refactor of the HTTPChannel interface: specifically, right now HTTPChannel does not allow multiple callers at the same time, and I'd have to refactor it to allow that. I'm disinclined to do that, given that the current model works right now. (FYI, the HTTP/2 stuff gets around this by never marking a Request as queued, and then using the H2Stream class to disambiguate between Request objects: essentially, each Request believes it's the sole request for a single H2Stream.)
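The queuing behaviour described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the twisted.web implementation: `QueuedRequest` and `unqueue` are hypothetical names, and an `io.BytesIO` stands in for both the StringTransport and the TCP connection.

```python
import io


class QueuedRequest:
    """Hypothetical request that buffers writes while it waits its turn."""

    def __init__(self, real_transport, queued):
        self._real = real_transport
        # A queued request writes into an in-memory buffer instead.
        self._buffer = io.BytesIO() if queued else None

    def write(self, data):
        if self._buffer is not None:
            self._buffer.write(data)
        else:
            self._real.write(data)

    def unqueue(self):
        # Pour everything buffered so far onto the real connection.
        if self._buffer is None:
            return
        self._real.write(self._buffer.getvalue())
        self._buffer = None


wire = io.BytesIO()  # stands in for the TCP connection
first = QueuedRequest(wire, queued=False)
second = QueuedRequest(wire, queued=True)

second.write(b"response-2")  # buffered, not yet on the wire
first.write(b"response-1")   # goes straight out
second.unqueue()             # buffered bytes follow the first response
second.write(b"-tail")       # from now on, writes go out directly
```

The second response is produced first but still reaches the wire after the first one, which is the ordering HTTP/1.1 pipelining requires.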

@twisted-trac

glyph's avatar @glyph commented

This stack overflow question points to an interesting use-case:

http://stackoverflow.com/questions/35583718/how-can-i-access-the-socket-object-from-within-a-twisted-klein-route-method-in-p/35586154#35586154

(It also indicates that we don't have a good story around SCM_RIGHTS and we should probably have a UNIX transport method for that.)

@twisted-trac

Lukasa's avatar @Lukasa commented

Ok, I've uploaded the third draft patch that uses adiroiban's deprecatedProperty work instead of my hand-rolled deprecations.

@twisted-trac

Lukasa's avatar @Lukasa commented

hawkowl provided some informal feedback in IRC that centered around concerns about send100Continue potentially allowing code to send misframed HTTP messages. We agreed on two changes:

  1. Make send100Continue private.
  2. Enforce that it can only be called at the appropriate time in the response flow.

The first part was easy, but the second part is a lot trickier, because the HTTPChannel class did not previously keep track of where it was in the response cycle (this, by the way, is a pervasive problem in twisted.web: it did not prevent the sending of malformed HTTP, it just avoided it by use of an implicit state machine distributed across about three different classes).

To that end, I've added the barest minimum of state tracking to the HTTPChannel class by introducing a set of flags that track where in the response cycle we are. This of course adds restrictions that were never previously present in the HTTPChannel class, so I also had to add some tests to ensure that those restrictions were enforced.

This sadly inflates the patch a bit, but you gotta do what you gotta do.
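As a rough illustration of what "the barest minimum of state tracking" might look like, here is a sketch that assumes a single state flag and a dedicated exception. `SendState`, `ResponseOrderError` and `TrackingChannel` are hypothetical names, and the simplified `writeHeaders` signature is an assumption for the example; the flags and checks in the actual patch may differ.

```python
from enum import Enum


class SendState(Enum):
    # Hypothetical states; the patch's own flags may differ.
    NOTHING_SENT = 0
    SENT_HEADERS = 1


class ResponseOrderError(Exception):
    """Raised when a write arrives at the wrong point in the response."""


class TrackingChannel:
    def __init__(self):
        self.out = []
        self._state = SendState.NOTHING_SENT

    def send100Continue(self):
        # Only legal before the real response headers go out.
        if self._state is not SendState.NOTHING_SENT:
            raise ResponseOrderError("100 Continue after headers were sent")
        self.out.append(b"HTTP/1.1 100 Continue\r\n\r\n")

    def writeHeaders(self, status_line):
        if self._state is not SendState.NOTHING_SENT:
            raise ResponseOrderError("headers already sent")
        self.out.append(status_line + b"\r\n\r\n")
        self._state = SendState.SENT_HEADERS

    def write(self, body):
        if self._state is not SendState.SENT_HEADERS:
            raise ResponseOrderError("body written before headers")
        self.out.append(body)


channel = TrackingChannel()
try:
    channel.write(b"too early")
    early_write_rejected = False
except ResponseOrderError:
    early_write_rejected = True

channel.send100Continue()
channel.writeHeaders(b"HTTP/1.1 200 OK")
channel.write(b"body")
```

Any caller that treats the channel as a plain byte pipe will trip these checks, which is the kind of restriction that was not previously present.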

@twisted-trac

hawkowl's avatar @hawkowl commented

(In [47052]) Branching to hide-request-transport-8191-3.

@twisted-trac

hawkowl's avatar @hawkowl commented

(In [47053]) apply patch from lukasa, refs #8191

@twisted-trac

hawkowl's avatar @hawkowl set owner to @Lukasa

Hi Lukasa, thanks for your continued effort on this ticket.

Thanks for adding some new tests, but with some things moved about, there's some changed code with red lines as per https://codecov.io/github/twisted/twisted/commit/117ab382091660c27b84608932204b315b425344 -- would you be able to get these green, where possible?

There are also no tests for the deprecations -- please see test_http.RequestTests.test_addCookieNonStringArgument as an example of this.

Some docstrings refer to Python identifiers without wrapping them in C{} (for code constants or args) or L{} (locally-resolvable Python identifiers). Examples of places that miss this are in the new tests, where HTTPChannel is mentioned. Please note that Python types, like list and bytes, should be wrapped in L{}.

As mentioned on IRC, the tests also fail on Python 3 in the XMLRPC sections: the writeHeaders calls in two locations pass bare strings, which are unicode on 3.4.

Thanks again for this. I am in favour of these changes, and some of the cleanups that have been done to make it all work are very nice. Not writing raw \r\ns in Request code is something I like. Please fix the above issues and submit for re-review! :)

@twisted-trac

Lukasa's avatar @Lukasa removed owner

I've applied the feedback from hawkie.

This should improve the test coverage everywhere except for the following places:

  • registering transports when unqueuing Requests. This is already untested, so I'm not regressing coverage, but if it's compelling I can probably write a test that does that quite easily.
  • A couple of test methods that I had to add aren't actually called. They should stay there (to ensure the implementation of ITransport is complete), but they are uncovered. I don't think that matters.

Otherwise, this should have sorted out most of our problems.

@twisted-trac

hawkowl's avatar @hawkowl commented

(In [47059]) Branching to hide-request-transport-8191-4.

@twisted-trac

hawkowl's avatar @hawkowl commented

(In [47060]) apply patch from lukasa, refs #8191

@twisted-trac

Lukasa's avatar @Lukasa commented

The sixth draft patch is now uploaded: this should fix the twistedchecker and hopefully the Python 3 failures as well.

@twisted-trac

hawkowl's avatar @hawkowl commented

(In [47101]) Branching to hide-request-transport-8191-5.

@twisted-trac

hawkowl's avatar @hawkowl commented

(In [47102]) apply patch from lukasa, refs #8191

@twisted-trac

hawkowl's avatar @hawkowl set status to closed

(In [47104]) Merge hide-request-transport-8191-5: Hide the transport backing the HTTPChannel object from twisted.web Resource objects.

Author: lukasa
Reviewers: glyph, adiroiban, hawkowl
Fixes: #8191

@twisted-trac

oberstet's avatar @oberstet commented

This change will break almost all Crossbar.io deployments out there, and many AutobahnPython based WebSocket server deployments: crossbario/autobahn-python#641

IMO, this http://twistedmatrix.com/trac/ticket/3204 should have been fixed before landing this .. because there should be a clean, supported way of taking over a transport ..

@twisted-trac

oberstet's avatar @oberstet commented

There are some hacks on Autobahn's side though: https://github.com/crossbario/autobahn-python/blob/master/autobahn/twisted/resource.py#L113

I'd love to get rid of those of course. What we would need is a clean, officially sanctioned way of taking over the transport on a Web resource and getting all the bytes that originally arrived on it.

Eg, consider a Web resource on path /websocket. To talk WebSocket on that path, we need the access to the full HTTP request that came in, to parse the initial WebSocket opening handshake, and then take over the whole transport.

How would I do that?

@twisted-trac

Lukasa's avatar @Lukasa commented

What's the actual failure you're seeing on autobahn? The goal here was to deprecate but not to remove, so the code should still function.

@twisted-trac

Lukasa's avatar @Lukasa commented

Just to be clear: if this broke running code, it's a bug that I should fix. And I agree, we should come up with a "protocol upgrade" solution for HTTP, especially because I'll want it for plaintext HTTP. Something like #3204 sounds about right.

@twisted-trac

oberstet's avatar @oberstet commented

It seems to be breaking code (as reported here crossbario/autobahn-python#641) - I've asked to get the traceback.

I am wondering if this is about new style, old style or hybrid classes.

What kind of class is t.w.http.Request?

Ideally, it should be pure new style ..

@twisted-trac

Lukasa's avatar @Lukasa commented

t.w.http.Request is an old-style class, sadly, which is why this patch had to use a __getattr__ hack rather than an @property decorator.
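For readers unfamiliar with the technique, here is a minimal sketch of a `__getattr__`-based deprecation, written for Python 3 where every class is new-style (so it only demonstrates the mechanism, not the Python 2 constraint). The `Request` class below is a hypothetical stand-in, and the warning text is made up for the example.

```python
import warnings


class Request:
    """Hypothetical request; only the deprecation mechanism is shown."""

    def __init__(self, transport):
        # Stored under a private name, so a normal lookup of "transport"
        # fails and falls through to __getattr__.
        self._transport = transport

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails.
        if name == "transport":
            warnings.warn(
                "Request.transport is deprecated.",
                DeprecationWarning,
                stacklevel=2,
            )
            return self._transport
        raise AttributeError(name)


sentinel = object()
request = Request(sentinel)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = request.transport
```

The old attribute keeps working and each access emits a `DeprecationWarning`, while unknown attribute names still raise `AttributeError` as usual.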

@twisted-trac

warner's avatar @warner commented

The traceback (which occurs in https://github.com/warner/magic-wormhole , in wormhole.test.test_server.WebSocketAPI.test_allocate_1) is:

2016-04-21 00:30:20-0700 [-] Unhandled Error
        Traceback (most recent call last):
          File "/Users/warner/stuff/tahoe/magic-wormhole/ve/lib/python2.7/site-packages/autobahn/websocket/protocol.py", line 2585, in processHandshake
            self._onConnect(request)
          File "/Users/warner/stuff/tahoe/magic-wormhole/ve/lib/python2.7/site-packages/autobahn/twisted/websocket.py", line 205, in _onConnect
            res.addErrback(forwardError)
          File "/Users/warner/stuff/python/twisted/twisted/internet/defer.py", line 328, in addErrback
            errbackKeywords=kw)
          File "/Users/warner/stuff/python/twisted/twisted/internet/defer.py", line 306, in addCallbacks
            self._runCallbacks()
        --- <exception caught here> ---
          File "/Users/warner/stuff/python/twisted/twisted/internet/defer.py", line 588, in _runCallbacks
            current.result = callback(current.result, *args, **kw)
          File "/Users/warner/stuff/tahoe/magic-wormhole/ve/lib/python2.7/site-packages/autobahn/twisted/websocket.py", line 203, in forwardError
            return self.failHandshake("Internal server error: {}".format(failure.value), ConnectionDeny.INTERNAL_SERVER_ERROR)
          File "/Users/warner/stuff/tahoe/magic-wormhole/ve/lib/python2.7/site-packages/autobahn/websocket/protocol.py", line 2767, in failHandshake
            self.sendHttpErrorResponse(code, reason, responseHeaders)
          File "/Users/warner/stuff/tahoe/magic-wormhole/ve/lib/python2.7/site-packages/autobahn/websocket/protocol.py", line 2779, in sendHttpErrorResponse
            self.sendData(response.encode('utf8'))
          File "/Users/warner/stuff/tahoe/magic-wormhole/ve/lib/python2.7/site-packages/autobahn/websocket/protocol.py", line 1192, in sendData
            self.transport.write(data)
          File "/Users/warner/stuff/python/twisted/twisted/web/http.py", line 2055, in write
            assert self._sendState == _ChannelSendState.SENT_HEADERS
        exceptions.AssertionError:

@twisted-trac

Lukasa's avatar @Lukasa commented

As discussed on the other issue, the core problem is that we changed request.transport to be the HTTPChannel instead. This was deliberate: HTTP/2 needs it. However, it means there's now a state machine enforcing the writes through to the underlying transport, which Autobahn obviously breaks (why wouldn't it?).

If we want, we can remove that state machine: hawkowl wanted it to enforce some things around 100 Continue responses, which is where this state machine came from. Alternatively, we can go back to allowing access to the transport as _transport: that's ok for now as we're deprecating that access, but it needs to be gone before HTTP/2 lands properly, or code like Autobahn's might start doing some really stupid stuff.

Leaving that up to the Twisted maintainers.

@twisted-trac

glyph's avatar @glyph commented

Assertions are a violation of the Twisted coding standard, so we need to eliminate that regardless.

I'm sorry I didn't take more time to provide feedback on this issue, but to quote my one early piece of relevant review feedback:

“changing the type of request.transport might be an incompatible change”

So um. Yeah. It is.

I think we might need to put out a fix (perhaps even a point-version fix?) to roll this back.

In the future, the right way to make a change like this is to preserve the old behavior, but to have a new API entry-point (for example: a new keyword argument to Site, or an alternative like SiteWithoutRequestsThatHaveTransport but with a good name) that changes the behavior so that code which has explicitly invoked the new thing will get totally new behavior. Then, deprecate the top-level entry point (Site itself, let's say) rather than trying to deprecate individual aspects of data structures.

At the very least though: no asserts. Asserts are an anti-pattern; it's a way of saying "we cared enough to give you an error, but not enough to tell you what the hell is going on or allow you to handle it". Which is to say: we cared enough to break your program, but not enough to give you the way to fix it. Especially asserts like this which don't even have an error message.

@twisted-trac

oberstet's avatar @oberstet commented

@glyph: regarding asserts being an anti-pattern - would you say this also applies to this use?

assert False, "bad things happened - you need to do X, not Y"

And this is bad too?

def say_hello(msg):
    assert type(msg) == six.text_type, "msg must be unicode, but was {}".format(type(msg))

@twisted-trac

glyph's avatar @glyph commented

Replying to oberstet:

@glyph: regarding asserts being an anti-pattern - would you say this also applies to this use?

assert False, "bad things happened - you need to do X, not Y"

And this is bad too?

def say_hello(msg):
    assert type(msg) == six.text_type, "msg must be unicode, but was {}".format(type(msg))

Yes. In both of these circumstances, raise is the appropriate keyword to use, not assert. Raise an instance of a specific exception type, document the fact that that exception was raised, test the fact that it is raised.

In the first case, what bad thing happened? BadThingHappened might be appropriate, or RuntimeError if the thing was really bad.

In the second, depending on the nature of the type mismatch, TypeError or UnicodeError might be appropriate.

In no case is AssertionError appropriate, unless perhaps you found an unexpected NULL pointer in some C extension and cannot proceed.
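A sketch of the second example rewritten along those lines, assuming Python 3 (so plain `str` replaces `six.text_type`). The return value and the error wording are made up for the illustration.

```python
def say_hello(msg):
    """Greet with msg, which must be text (str on Python 3).

    Raises TypeError if msg is not a str.
    """
    if not isinstance(msg, str):
        # A specific, documented exception instead of a bare assert.
        raise TypeError(
            "msg must be str, but was {}".format(type(msg).__name__)
        )
    return "hello, " + msg


try:
    say_hello(b"bytes")
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Unlike an `assert`, this check still runs under `python -O`, and callers can catch `TypeError` specifically.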

@twisted-trac

hawkowl's avatar @hawkowl commented

Replying to glyph:

Assertions are a violation of the Twisted coding standard, so we need to eliminate that regardless.

No, they're not (http://twistedmatrix.com/documents/current/core/development/policy/coding-standard.html). :P

I think we might need to put out a fix (perhaps even a point-version fix?) to roll this back.

This is not in a released version of Twisted, soooo we needn't worry too much yet. I don't think this is an incompatible change: it doesn't change the "type", it only changes how the attribute is fetched.

@twisted-trac

Lukasa's avatar @Lukasa commented

In the future, the right way to make a change like this is to preserve the old behavior, but to have a new API entry-point (for example: a new keyword argument to Site, or an alternative like SiteWithoutRequestsThatHaveTransport but with a good name) that changes the behavior so that code which has explicitly invoked the new thing will get totally new behavior. Then, deprecate the top-level entry point (Site itself, let's say) rather than trying to deprecate individual aspects of data structures.

I have no aversion to rolling this back and going in a different direction.

However, I don't think chasing this direction is the right way to go. The Autobahn devs have made it pretty clear that they are uninterested in adjusting to a version of Twisted where they can't get at the transport from the Request, because otherwise they have no logical hook to implement their upgrade. That's reasonable enough from my position.

However, that requirement makes it very hard to justify landing the HTTP/2 support in its current form, because we'll end up in one of the following situations:

  1. The HTTP/2 support hides the transport from the request, instead using the channel (as proposed here). That causes problems for anyone who expects the transport to be a dumb data pipe because it isn't anymore (it has a state machine in the way that limits what you can do).
  2. The HTTP/2 support is forced to plumb the actual transport through to the request. That causes two problems. The first is simple: it means that a whole lot of objects that don't need the actual transport have to have a reference to it or a way to get it in order to pass it on, which is dumb. The second is worse: incautious consumers of that transport expecting it to obey the rules of HTTP/1.1 will treat it as a data pipe and bust the HTTP/2 framing, ruining their connections. This would be extremely difficult to debug.

Neither of these is good, so we really need a third way. What we need is a formalised system whereby a twisted.web consumer (either twisted.web itself or something like autobahn) can tell twisted.web that it needs to "upgrade" the connection by replacing the channel with something else (websockets or HTTP/2 or w/e).

I have no idea how that should look at this time. I'll have a think.

@twisted-trac

warner's avatar @warner commented

Hm. As for problem 2.2 (incautious consumers), what if any access to the
deprecated request.transport declares "HTTP/2 bankruptcy", and moves the
HTTP/2 state machine into a mode where all subsequent normal HTTP/2 things
throw errors? Maybe also throw an "HeyIAmInTheMiddleOfSomethingError" if
someone reads the .transport property while the state machine is not in the
"about to send headers" state (i.e. when there's already some framing to
break). So consumers could read request.transport once (at the start of the
render method, like Autobahn does), but if they allow anything else in the
HTTP stack to proceed, they get an immediate easier-to-debug error?

(or maybe catch request.transport.write calls instead of the property get)

Then in the longer run, we use a specially-marked Resource or Site to
indicate that we want some paths to get access to the raw transport, and add
a new API to access it. Maybe create RawTransportResource, and if
getChildForRequest sees one of them, it passes in the transport instead of
the request.

@twisted-trac

hawkowl's avatar @hawkowl set status to reopened

(In [47304]) Revert r47104: "Merge hide-request-transport-8191-5: Hide the transport backing the HTTPChannel object from twisted.web Resource objects."

Reopens: #8191

@twisted-trac

hawkowl's avatar @hawkowl commented

(In [47306]) Branching to hide-request-transport-8191-6.

@twisted-trac

hawkowl's avatar @hawkowl commented

Whoops, I reviewed this when I merged it.

@twisted-trac

Lukasa's avatar @Lukasa commented

Ok, so here's the state of play.

Part of this patch is still necessary for HTTP/2: specifically, the part where we stop Request from just writing into .transport. Unfortunately, to avoid breaking downstream like we did here, .transport must be either a StringTransport (because of f!***ing pipelining support) or it must be the underlying transport object.

However, I want all writes from the Request to pass through either the StringTransport (because pipelining) or the channel. That doesn't work with the current transport field.

That means there are two options. The first is that H2Stream.transport can be set to self: that is, to the same instance of H2Stream. That will entirely obviate the need for this patch. We can then pursue deprecating pipelining, which will get rid of the wacky transport shenanigans the Request pulls, at which time it can be cleaned up to only talk to .channel like it should have always done. This has the advantage of being really simple and requiring relatively few code changes, at the cost of being super dumb: H2Stream is technically an ITransport, but anyone who expects to get the "real" transport will be mightily confused.

The second option is to add a Request._writer object that swaps between the StringTransport and the channel. This object will be the place that the Request directs all its writes, ensuring that everything behaves sensibly. At that point Request.transport will essentially be vestigial: nothing in the Request will use it.

Any thoughts here?

@twisted-trac

Lukasa's avatar @Lukasa set status to new

Ok, I've implemented the _writer approach: let me know what people think.

@twisted-trac

Lukasa's avatar @Lukasa commented

So we've since gone a whole other way on this. We merged #8230, which removes the pipelining support from twisted.web. This means we can now simply use a patch that changes the t.w.h.Request object to only ever write to the channel (which is basically a subset of the previous patches), which is now always going to be valid.

For the moment I plan to leave the transport on: HTTP/2 will simply refuse to expose the transport, throwing exceptions if you try to get at it. Instead, we only need a patch to change where the Request objects write to. That's my next step here.
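One way "refuse to expose the transport" could look, as a sketch only: `Http2Stream` is a hypothetical stand-in, and the choice of `AttributeError` is an assumption, since the comment above does not say which exception would be raised.

```python
class Http2Stream:
    """Hypothetical HTTP/2 stream object standing in for a channel."""

    def __init__(self):
        self.frames = []

    @property
    def transport(self):
        # Refuse to hand out the raw connection: raw writes would bypass
        # HTTP/2 framing.
        raise AttributeError(
            "the raw transport is not available on HTTP/2 streams"
        )

    def write(self, data):
        # Writes go through the stream, which is responsible for framing.
        self.frames.append(("DATA", data))


stream = Http2Stream()
stream.write(b"x")

try:
    stream.transport
    exposed = True
except AttributeError:
    exposed = False
```

With `AttributeError`, `hasattr(stream, "transport")` returns False, so code that probes for the attribute sees it as absent rather than crashing later on a raw write.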

@twisted-trac

Lukasa's avatar @Lukasa commented

Ok, so I've just done another pass over this work, which can be found on a branch of my GitHub fork. Please now ignore the patches attached to this issue: they no longer reflect the current state of the work.

This change now restricts itself to internal changes in twisted.web.http.Request. Specifically, it changes all the Twisted code to stop writing to the transport, and instead to write directly to the channel.
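In outline, the shape of that change might look like the sketch below. It is an illustration, not the code on the branch: `RecordingChannel` and `respond` are hypothetical, and `writeHeaders(version, code, reason, headers)` is assumed here to take bytes and a list of header pairs.

```python
class RecordingChannel:
    """Hypothetical channel; it alone knows the wire format."""

    def __init__(self):
        self.wire = b""

    def writeHeaders(self, version, code, reason, headers):
        # The channel, not the request, emits the CRLF-delimited framing.
        lines = [b" ".join([version, code, reason])]
        lines.extend(name + b": " + value for name, value in headers)
        self.wire += b"\r\n".join(lines) + b"\r\n\r\n"

    def write(self, data):
        self.wire += data


class Request:
    """Hypothetical request that never touches the transport itself."""

    def __init__(self, channel):
        self.channel = channel

    def respond(self, body):
        headers = [(b"Content-Length", str(len(body)).encode("ascii"))]
        self.channel.writeHeaders(b"HTTP/1.1", b"200", b"OK", headers)
        self.channel.write(body)


channel = RecordingChannel()
Request(channel).respond(b"hi")
```

Because the request only names a status and headers, a different channel could serialise the same calls differently without the request changing.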

There are some things that should be noted here:

  1. This change does not reintroduce the state machine that Autobahn was concerned about, though hawkowl may want to shout about us putting it back. Nor does it hide transport in any way, which is the other thing that broke Autobahn. It's my belief that this change should be totally safe for Autobahn, though it'd be good if someone familiar with Autobahn would check that before we ship this, rather than afterwards!
  2. There's nothing here that notifies users that transport may become a minefield in HTTP/2-land. Should we consider providing such a thing? If so, as part of this patch or as a separate patch, and what should we do? If not, how do we intend to make this clear to users?

Regardless, this is now open for review.

@twisted-trac

hawkowl's avatar @hawkowl set owner to @hawkowl

@twisted-trac

hawkowl's avatar @hawkowl set owner to @Lukasa

Hi, thanks for this.

Two things:

  1. A topfile will need to be added.
  2. I am unsure about trunk...hide-request-transport-8191-7#diff-db5a08b9de46095e805af98c529a23a3R1250 -- if there's no channel, this will just explode? I'm not sure if this is how it used to work, it appears you'd always get a response, even if there was no transport.

Please fix these and resubmit.

@twisted-trac

hawkowl's avatar @hawkowl commented

I also checked the coverage, and:

  1. HTTPChannel.isSecure, writeSequence, and loseConnection do not appear to have coverage.

That's all for now; I think this is pretty close.

RE 2 of comment 46, I believe this belongs in another branch, maybe one where H2 starts landing. I would say failing hard would be the way -- maybe no .transport if it's HTTP/2. But that's a question for another ticket, and there's more to consider (like how you access things like TLS info, that currently one can only do through the transport) :)

@twisted-trac

Lukasa's avatar @Lukasa removed owner

Ok, I believe I've addressed those comments with further commits on my branch. Resubmitting.

@twisted-trac

hawkowl's avatar @hawkowl commented

Pulled it in, spun the builders...

@twisted-trac

hawkowl's avatar @hawkowl commented

my expert review is as follows

[04:11:46]  <hawkowl>	lukasa also L{}s on https://github.com/twisted/twisted/compare/trunk...hide-request-transport-8191-7#diff-db5a08b9de46095e805af98c529a23a3R1946 plz?
[04:12:08]  <hawkowl>	lukasa also it should be L{None} and L{True}/L{False}

@twisted-trac

Lukasa's avatar @Lukasa commented

Pushed a new commit. =)

@twisted-trac

hawkowl's avatar @hawkowl set owner to @hawkowl
@hawkowl set status to closed

In changeset b313e7a

#!CommitTicketReference repository="" revision="b313e7aeaada834add57acfe845419fb792b2b59"
Merge hide-request-transport-8191-7: twisted.web.http.Request should route writes through the HTTPChannel

Author: lukasa
Reviewer: hawkowl
Fixes: #8191

@twisted-trac

hawkowl's avatar @hawkowl commented

[mass edit] Removing review from closed tickets.
