Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HTTP/2 implementation of HTTPChannel. #8194

Closed
twisted-trac opened this issue Feb 2, 2016 · 40 comments
Closed

Add HTTP/2 implementation of HTTPChannel. #8194

twisted-trac opened this issue Feb 2, 2016 · 40 comments

Comments

@twisted-trac
Copy link

Lukasa's avatar @Lukasa reported
Trac ID trac#8194
Type enhancement
Created 2016-02-02 14:24:40Z
Branch https://github.com/twisted/twisted/tree/h2channel-8194-8

This is part of a series of patches adding support for HTTP/2 to twisted.web: see #7460 for more.

This patch contains the meat of the HTTP/2 support, introducing two new classes: H2Connection and H2Stream. These classes together implement the same support as HTTPChannel. The reason for the two-class approach is explained somewhat in code comments, but I'll explain it again here.

With HTTPChannel it is capable of implementing both IConsumer and IProducer directly because there is a strictly one-to-one correlation between a request/response pair and a transport. While a single transport may carry many requests/responses over time, at any one time there is only one outstanding.

This is not true with HTTP/2: at any one time there may be as many as 2 billion request/response pairs outstanding (though generally this is limited to more like 100, for obvious reasons!). For this reason, it's not possible to naively implement both IProducer and IConsumer on one class.

Instead, we have one object implementing IPushProducer directly and pushing data to the TCP connection (H2Connection), while implementing IProtocol to receive data from the TCP connection. Another object (H2Stream) implements IPushProducer the other way, pushing data to the IRequest. H2Stream also implements IConsumer and ITransport, consuming the response body from the IRequest, allowing the HTTP/2 flow control signalling to propagate out to the IRequest objects. Between the H2Stream and H2Connection objects there is an interface that looks very similar to IProducer/IConsumer, but with the addition of stream IDs that allow for there to be many IRequest objects for one underlying TCP stream.

Essentially, for each m HTTP/2 connections, each with n requests/responses active at any one time, we have: m TCP streams, m H2Connections, m x n H2Streams, and m x n IRequest objects.


The patch I'm about to upload implements this functionality and has been live tested. By itself it doesn't do anything: it needs the other patches to be properly integrated into twisted.web. However, it should be possible for us to unit test this patch, and I'll aim to do that. Before I get to that though, I did want some Twisted contributors to cast an eye over it and see if they're happy with the general shape and direction of the patch.

Attachments:

  • 8194_1.patch (733 bytes) - added by Lukasa on 2016-02-02 14:25:43Z - First draft patch.
  • 8194_2.patch (33439 bytes) - added by Lukasa on 2016-02-03 07:56:21Z - Patch containing HTTP/2 work!
  • 8194_3.patch (101683 bytes) - added by Lukasa on 2016-04-06 10:53:24Z - First patch with HTTP/2 tests.
  • 8194_4.patch (101843 bytes) - added by Lukasa on 2016-04-06 12:16:29Z - Patch with fixed logging for twistd web
  • 8194_5.patch (102937 bytes) - added by Lukasa on 2016-04-06 12:32:32Z - Testing improvements.
  • 8194_6.patch (107149 bytes) - added by Lukasa on 2016-04-11 17:16:58Z - Sixth draft patch
  • 8194_7.patch (107494 bytes) - added by Lukasa on 2016-04-12 07:50:04Z - Patch that rejects Python 2.7.3 and older.
  • 8194_8.patch (107493 bytes) - added by Lukasa on 2016-04-12 08:28:05Z - Eighth draft patch
  • 8194_9.patch (108031 bytes) - added by Lukasa on 2016-04-12 08:44:52Z - Patch with intersphinx for hyper-h2.
  • 8194_10.patch (108022 bytes) - added by Lukasa on 2016-04-12 09:12:02Z - Correctly reference UserWarning.
  • 8194_11.patch (107782 bytes) - added by Lukasa on 2016-04-12 09:22:26Z - Further Python 3 test fixes.
  • 8194_12.patch (107780 bytes) - added by Lukasa on 2016-04-12 09:43:44Z - Playing whack-a-mole with some more Python 3 failures.
Searchable metadata
trac-id__8194 8194
type__enhancement enhancement
reporter__Lukasa Lukasa
priority__normal normal
milestone__None None
branch__h2channel_8194_8 h2channel-8194-8
branch_author__hawkowl hawkowl
status__closed closed
resolution__fixed fixed
component__web web
keywords__None None
time__1454423080530040 1454423080530040
changetime__1465786982036110 1465786982036110
version__None None
owner__hawkowl hawkowl

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa commented

Having uploaded the first draft patch, I'll put this out for early review.

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa commented

Turns out I totally forgot to add HTTP/2 into this patch! The second draft patch contains the actual HTTP/2 work.

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa commented

So that people know the ordering of patches, this patch cannot have tests that actually run until [#8191](#8191) is merged, because it assumes the layout changes in #8191.

@twisted-trac
Copy link
Author

adiroiban's avatar @adiroiban set owner to @Lukasa

I guess that with latest changes in [#8191](#8191) this needs to be updated.

Also, maybe is a good idea to rewrite this based on latest changes from #8191 to validate if they make sense.

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa commented

With #8191 landed I can rework this patch to behave correctly.

Watch this space.

@twisted-trac
Copy link
Author

glyph's avatar @glyph commented

Space is watched - how's the re-work going?

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa commented

Currently going for 100% test coverage. At 92% at the moment, so expect something this week, basically.

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa removed owner

Ok, for the sake of early review, I've uploaded the draft patch containing some HTTP/2 tests.

Here are some notes about this patch. Firstly, I believe it's basically entirely feature complete for the basic use-case: that is, when combined with a slightly updated patch for #8188, we can happily serve HTTP/2, at least when running twistd.

Secondly, it does not currently have complete coverage. Essentially all of the missing coverage comes down to the functionality around IPushProducer in the H2Connection object. The reason I haven't tested this is that I'm not sure whether the H2Connection should test, at the time connectionMade is called, whether its transport is an IConsumer and if it is, register itself as a producer, or whether registering H2Connection as a producer should be the job of some higher level component. One thing to note is that currently HTTPChannel does not implement IPushProducer in this way, and so this hasn't really been addressed in Twisted before. Once I can get some guidance on what I should do here I can implement the relevant tests.

Thirdly, some portions of the code are quite unpleasant. In particular, sendPrioritisedData is a pretty nasty function: advice on how to refactor it while ensuring that it behaves correctly would be appreciated.

Fourthly, the tests are crazy complex. This is admittedly because most of the tests have to exercise weird code paths, but it's also partly because of the way _sendPrioritisedData works. I'm open to feedback about simplifying here as well.

Of course, any other feedback would be welcomed as well!

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa commented

As requested by hawkowl, I've updated the patch in #8188 so that it should apply cleanly alongside this patch. With both patches applied, twistd web should automatically serve HTTP/2 to HTTP/2-capable hosts. I also updated this patch to ensure that twistd web logs correctly.

To test this live, take your copy of Twisted and apply the latest patches on this issue and #8188. You can then, for example, serve your Twisted source tree by running twistd -n web --path . --https=8081 --privkey server.key --certificate server.crt. You'll need a TLS cert and key of course. You can then test the functionality by browsing to https://localhost:8081/, and click around. If you want to do CLI support, recent versions of curl support HTTP/2 (for the OS X users amongst you, the curl available from Homebrew has HTTP/2 support).

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

(In [47207]) Branching to h2channel-8194.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

(In [47209]) Branching to h2channel-8194-2.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

(In [47210]) Apply patch from lukasa, refs #8194

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl set owner to @Lukasa

Hi Lukasa, thanks for the continued work on this wonderful feature!

Just a few notes for this early-review:

  1. Python 3 builders and Python 2 builders on tox are failing. https://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/h2channel-8194-2 . Also, the api docs and twistedchecker builds are failing. Ignore the API doc ones that say 'h2', we will need to figure out how to add H2's intersphinx at some point.

  2. There's some bare strings here, this is why I think the Python 3 builders are failing:

    getRequestHeaders = [

  3. The Python 2 on tox (12.04) builder is failing; I've not a clue why. Since it's running under Tox, maybe try running the local tests with PYTHONHASHSEED='3182490488' like it is?

  4. New optional dependencies need tests in twisted.python.test.test_dist.

  5. The spacing is a little odd at the top of some of the new files. trunk...h2channel-8194-2#diff-23344a2a2fe4717f941cd563348d1cc9R4 and trunk...h2channel-8194-2#diff-23344a2a2fe4717f941cd563348d1cc9R17 need new lines.

  6. All __future__ imports need to be absolute_import, division, even if there's no division in the file.

Overall, I like this. I think it's a sufficiently large batch of functionality that we will really only know how it works once it's in the field, so I am considering that a slightly-more-polished version of this would be able to land, gated behind a TWISTED_HTTP2=1 flag, with a set "this becomes stable API". This means we might be able to get this tested in useful ways (that is, in real life), but still give us the wiggle room to fix things that might not be perfect, which we can only honestly know after it's deployed. What do you think?

But, for the meantime: please fix the tests, and the things I mentioned above, and then please resubmit.

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa removed owner

Ok, so I've addressed 1, 2, 4, 5, and 6 (at least, I believe I have) and uploaded a new draft patch that addresses those problems. That's ready for review.

On the topic of the 12.04 builder, the problem is this:

root@ubuntu-512mb-lon1-01:~# python
Python 2.7.3 (default, Jun 22 2015, 19:33:41) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> data = b'hello world'
>>> import struct
>>> data = b'\x00\x01\x02\x03\x04\x05'
>>> struct.unpack("!HL", memoryview(data))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
struct.error: unpack requires a string argument of length 6

That is, the struct module on Python 2.7.3 does not accept memoryview objects. That's a problem for the tests, but a much bigger problem for the implementation itself, which uses memoryview objects extensively to avoid doing too many data copies. This problem is fixed in later patch releases of 2.7, which means it's mostly Ubuntu's fault for using such a wildly outdated Python version.

I'm not sure exactly what you'd like to do about that. Suggestions?

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

(In [47211]) Branching to h2channel-8194-3.

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa commented

Per an informal chat with hawkowl, I've uploaded a new patch that raises a UserWarning on Python 2.7.3 and older and then refuses to import HTTP/2 support. That should resolve the problem by degrading gracefully on crappy old versions of Python.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

(In [47212]) applying patch from lukasa, refs #8194

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa commented

The last two patches fix up a couple of minor issues, including a stupid tuple comparison and a unicode/bytes problem.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

(In [47213]) Branching to h2channel-8194-4.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

(In [47214]) applying patch from lukasa, refs #8194

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa commented

Patch 11 should resolve some more of our Python 3 test problems (hopefully all of them).

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

(In [47215]) Branching to h2channel-8194-5.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

(In [47216]) applying patch from lukasa, refs #8194

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa commented

Uploaded a new file that fixes another failing test.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

(In [47217]) Branching to h2channel-8194-6.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

(In [47218]) applying patch from lukasa, refs #8194

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl set owner to @Lukasa

Hi Lukasa, thanks again for your continued work on this branch! The tests all pass now which makes me happy :)

  1. Since we now have the H2 docs linked by intersphinx, could you make all references to H2 objects use L{}?

  2. Same for Python stdlib types.

  3. References to other Python objects need L{} -- eg. H2Connection's docstrings mentioning IProtocol, H2Stream, IConsumer, GenericHTTPChannel, H2Connection.

  4. I am uncomfortable with h2 giving us Unicode headers. Everywhere else in Twisted uses bytes where possible, and encodes using latin1 for values and utf8 for values elsewhere. Is it possible to get H2 to give us un-decoded values that we can just pass through, rather than decoding and then reencoding?

  5. Where you use @see, it needs to be @see: L{whatever}.

  6. H2Stream.__init__ does not have a docstring, and the "headers" argument is therefore undocumented.

  7. After discussions with Glyph about how to do this, I would like you to add a note to the docstring saying that this is private API, and have a __all__}} of {{{[] to signify this as well. Once this is all landed, I want to make it public, but for right now I think it's better if we leave it private.

  8. Please look at the api docs builder failures; please fix where possible (ginore the stdlib reference ones). https://buildbot.twistedmatrix.com/builders/documentation/builds/6580/steps/api-documentation/logs/new%20pydoctor%20errors It looks like you might need to add an additional intersphinx for priority.

Thanks again. Please fix these issues and resubmit!

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa removed owner

Ok, so!

We've merged some of the important issues, so I'm providing a new version of this with the feedback from above applied. Per the other tickets I'm no longer using patches but instead am maintaining a branch on GitHub, which can be found here. I've updated the branch to handle the feedback from hawkie.

This is now good for review. I think we're going to be short on coverage, but I'd like to confirm that everything is still working and get a coverage report from the builders before I chase coverage again.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

Pulled in Lukasa's branch, sending to the builders...

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl set owner to @Lukasa

As mentioned at pycon:

  1. Fix up the little bits of coverage issues.
  2. Throw _ in front of the http2.py name.
  3. Remove the assert in trunk...h2channel-8194-7#diff-db5a08b9de46095e805af98c529a23a3R2302 .
  4. Remove the todo in trunk...h2channel-8194-7#diff-23344a2a2fe4717f941cd563348d1cc9R154 .
  5. trunk...h2channel-8194-7#diff-23344a2a2fe4717f941cd563348d1cc9R897 should be bytes not str?
  6. Should all of this be bytes? trunk...h2channel-8194-7#diff-20196d59da6b16960e251a22fef5eaacR371

Please fix these up and throw more code at me :)

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa removed owner

Ok, hawkie, I believe I've done this.

One note on the coverage: I haven't covered the 2.7.3 and earlier import rejection because I can't think of a good way to do it. Let me know if we need to chase it.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl set owner to @Lukasa

Hi Cory, thanks for this.

  1. I don't know a good way, but I did spot a bug -- there's a %s but nothing is formatted. Please manually test this code, and that'll do for now. We don't really support any Pythons that old, really...
  2. I just noticed -- this uses a global reactor import :(. Please parameterise reactor onto something that H2Connection is given, make it a private instance variable on the connection, and use self._reactor where the plain reactor is used.
  3. trunk...h2channel-8194-7#diff-20196d59da6b16960e251a22fef5eaacR1034 needs a L{}

Otherwise, I think we're getting close here. Please fix these up and resubmit.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

Oh, also Python 3 is broken. :P

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa removed owner

Ok, I've fixed all that up and pushed to my branch.

Unfortunately, hawkie, I've had to force push to my branch, in part because I wanted to rebase on top of trunk to take account of your intersphinx changes and to make the final merge easier. This, in turn, means you'll need to pull to a new branch.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl set owner to @Lukasa

Hi Cory, thanks again.

Unfortunately a few tests fail:

  1. dist3 doesn't have the name updated to the new _http2 module.
  2. See https://buildbot.twistedmatrix.com/builders/ubuntu12.04-py2.7/builds/187/steps/select/logs/problems for the other failures.
  3. There needs to be another newline on trunk...h2channel-8194-8#diff-20196d59da6b16960e251a22fef5eaacR2146 .

Please fix these up, then I think we could just land this...

@twisted-trac
Copy link
Author

@twisted-trac
Copy link
Author

Lukasa's avatar @Lukasa removed owner

Ok, I've updated my branch to address those problems. Let's go again!

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl commented

Fixed up the test failure on my branch... waiting for the builders.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl set owner to @hawkowl

I think, for all intents and purposes, this is good to land. It's private API, we can change it later, etc etc -- and it gives us something that works, right now.

Will merge tomorrow.

@twisted-trac
Copy link
Author

hawkowl's avatar @hawkowl set status to closed

In changeset d48e903

#!CommitTicketReference repository="" revision="d48e903ddadac5b81b4079d40efaa5e2eb8cb05b"
Merge h2channel-8194-8: Add HTTP/2 implementation of HTTPChannel.

Author: lukasa
Reviewers: adiroiban, hawkowl
Fixes: #8194

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants