implement HTTP/2 server push #133

Merged
merged 25 commits into from Feb 5, 2015

Conversation

kazuho
Member

@kazuho kazuho commented Feb 3, 2015

still wip
relates to: #50

@kazuho
Member Author

kazuho commented Feb 3, 2015

At the moment, the status is:

  • the HTTP/2 level implementation seems to be working
  • the priority of the pushed stream is hard-coded to 256 (highest weight) without any dependency
    • [Q] how should I calculate the priority of a pushed stream?
  • still no handler-level code that actually registers the URLs to be pushed

To test the feature, I have applied this patch so that it would send main.css before sending / (files of @ipeychev's http2rulez.com were used for the tests). When accessing the server using Firefox Nightly 38.0a1 (2015-02-02), the server emitted a log like the following, which indicates that the CSS file was actually sent before the HTML file (i.e. /).

127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET / HTTP/2" 200 24583 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/bootstrap.css HTTP/2" 200 132472 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/magnific-popup.css HTTP/2" 200 7782 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/font-awesome.css HTTP/2" 200 25174 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/header.css HTTP/2" 200 162 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /js/jquery-1.11.0.js HTTP/2" 200 96383 "-" "-"
(snip)

However, the log makes it obvious that Firefox sent an ordinary (pull) request for the same CSS file, even though it had already been pushed. So we need to find out why (does Firefox Nightly already support server push?) (note: no RST_STREAM frame was received from Firefox).

EDIT: the situation was the same for Chrome 42.0.2293.0 canary.

@kazuho
Member Author

kazuho commented Feb 3, 2015

@bagder @igrigorik Do you have any information regarding the status of server push support in Firefox / Google Chrome? The browsers do not seem to use the CSS file being pushed. They simply ignore it and send an ordinary pull request for the file.

Thank you in advance for your help.

@bagder

bagder commented Feb 3, 2015

Firefox supports push for sure in Nightly, but I can't remember exactly when the support (will) exist in stable versions. I'll ping hurley and mcmanus to see if they can bring some insights here.

@kazuho
Member Author

kazuho commented Feb 3, 2015

@bagder Thank you for the quick response and for pinging your colleagues.

Firefox supports push for sure in Nightly

Hmm. That makes me wonder why it is sending a request for a file that has already been pushed.

Maybe it is due to the response headers sent along with the push? I believe H2O is sending something like https://gist.github.com/kazuho/1c891149199f5ac2e971

@nwgh

nwgh commented Feb 3, 2015

I'm willing to bet the issue with Nightly is https://bugzilla.mozilla.org/show_bug.cgi?id=1127618 given that e10s is enabled by default on Nightly.

@kazuho - if you try disabling e10s via Preferences -> General -> Uncheck "Enable E10S (multi-process)" and then restart Nightly (required to disable e10s), does push work for you with Nightly? If not, then we'll have to dig deeper, but that's a good first place to start.

@tatsuhiro-t
Contributor

[Q] how should I calculate the priority of a pushed stream?

https://tools.ietf.org/html/draft-ietf-httpbis-http2-16#section-5.3.5

Pushed streams (Section 8.2) initially depend on their associated stream. In both cases, streams are assigned a default weight of 16.

@kazuho
Member Author

kazuho commented Feb 4, 2015

@todesschaf Thank you very much for the suggestions.

I have found and fixed a number of bugs in H2O. With the changes up to 94c42b5, and e10s disabled on Firefox Nightly (38.0a1 2015-02-03), server push is working like a charm.

With this patch applied to the file handler to send the CSS files, the access log of H2O is printed as follows.

127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/magnific-popup.css HTTP/2" 200 7782 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/font-awesome.css HTTP/2" 200 25174 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/header.css HTTP/2" 200 162 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET / HTTP/2" 200 24583 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/bootstrap.css HTTP/2" 200 132472 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:27 +0900] "GET /js/jquery-1.11.0.js HTTP/2" 200 96383 "-" "-"

It is also evident from the Network panel of Nightly that server push is working, as the waiting times have gone away for the CSS files.

Network Panel

note: the transfer of bootstrap.css completes after /, as the file is large and requires multiple WINDOW_UPDATE frames to be sent from the client.

I will continue working on the server push support in H2O to make it easier for programmers / system administrators to use.

@kazuho
Member Author

kazuho commented Feb 4, 2015

@tatsuhiro-t

[Q] how should I calculate the priority of a pushed stream?

https://tools.ietf.org/html/draft-ietf-httpbis-http2-16#section-5.3.5

Pushed streams (Section 8.2) initially depend on their associated stream. In both cases, streams are assigned a default weight of 16.

Thank you for the suggestion.

The problem is that when sending CSS files using server push, they need to be given higher priority than the HTML file that uses them (since it is impossible to render the HTML without a complete set of CSS files, while it is possible to render the HTML progressively once all the CSS files are ready). In other words, IMO the default behavior of making the CSS streams dependent on the HTML stream is inappropriate in this case.

EDIT: Ideally speaking, pushed streams should be given the same priority as if they were being pulled. The question is what the best approximation is.

@kazuho
Member Author

kazuho commented Feb 4, 2015

Confirmed that server push also works with Chrome Canary (42.0.2294.0) using the H2O configuration described in #133 (comment).

However, as Canary sets the weight of HTML to 256, there was a need to set even higher priority for CSS files to be sent before the HTML (8907f2e). Using a weight value of 257 is not a problem even though it exceeds the bounds defined by the HTTP2 spec., since the value is never exposed over the network.

note: the internal weights are never exposed to the client, as sending PRIORITY frames might confuse the clients

@kazuho
Member Author

kazuho commented Feb 5, 2015

@pmeenan

I'm pretty sure Chrome's prioritization for HTTP2 is pretty far behind Firefox's implementation. I know SPDY just uses the coarse 5 priority levels and I'm not sure what we pass through in the case of HTTP2 but I am pretty sure we just map the priorities to stream weights (which causes interleaving).

Yeah! Using Chrome Canary I see weights of 256 (HTML), 220 (CSS), 183 (JavaScript), 110 (images), which corresponds to your description.
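
To make the interleaving concrete, here is a minimal sketch of how those four weights would split bandwidth. It assumes all four streams are siblings under the same parent and always have data ready, in which case HTTP/2 allocates resources in proportion to weight. The function name and the dictionary layout are illustrative only and are not taken from H2O or Chrome.

```python
def bandwidth_shares(weights):
    """Return each stream's proportional share, given sibling stream weights."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Weights observed with Chrome Canary, as reported above.
chrome_weights = {"HTML": 256, "CSS": 220, "JavaScript": 183, "images": 110}
shares = bandwidth_shares(chrome_weights)

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Because no stream depends on another, every stream gets a nonzero share at the same time, so HTML receives only about a third of the bandwidth rather than being sent strictly first.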

Actual dependency-based prioritization like Firefox launched recently is absolutely the way to go. I'm just not sure what the timeframe is for that in Chrome.

Thank you for the clarification. Looking forward to seeing improvements to / experiments on the browser side.

@igrigorik

Fair enough, but I'm still wary of this strategy as a default. It's very easy to construct cases where shipping HTML last would actually hurt overall performance by delaying discovery of other critical resources - e.g. a Google Fonts CSS file which resides on a different origin and might block text rendering.

Thank you for pointing that out. Stepping back from comparing weight numbers or dependencies, what would be the ideal approach? Would it be something like: send the HEAD element of the HTML first, then push the contents of the CSS / JavaScript files (that block the renderer), and then send the BODY element? If that approach sounds good for most cases, it might be worth considering adding a one-shot feature (i.e. send DATA only once at a very high weight and then return to the original weight) to the HTTP/2 scheduler of H2O.

That's not surprising. HTML spec dictates certain processing logic that determines high-level relative priority of various resources, but how those resources are actually scheduled and/or fetched is a whole different story.. This is a space where we all (user agents) have and should continue to experiment to adapt to the continuously evolving architecture of pages + networks our users are using.

I agree that the view better describes the situation (than what I had expected).

kazuho added a commit that referenced this pull request Feb 5, 2015
implement HTTP/2 server push
@kazuho kazuho merged commit 7e8836d into master Feb 5, 2015
@kazuho
Member Author

kazuho commented Feb 5, 2015

Thank you all for your advice. The feature has been successfully merged to master.

The reverse proxy module of H2O now recognizes a response header called x-server-push, which can be emitted by the application servers (running upstream) to request H2O (running as the reverse proxy) to push the resources.

The syntax of the header is: x-server-push: URL; attr1=foo; attr=bar where URL indicates the URL of the content to be pushed to the client. Attributes can be omitted; no attributes are recognized yet.
(FYI, to ease debugging, H2O inserts an x-http-pushed response header into the streams that are pushed)
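
For application authors who want to see how such a value breaks down, the following is a minimal sketch of a parser for the syntax described above. It is not H2O's implementation (H2O is written in C); the function name is hypothetical, and the sketch assumes the URL itself contains no `;` character.

```python
def parse_server_push(value):
    """Split 'URL; attr1=foo; attr=bar' into (url, {attribute: value})."""
    parts = [part.strip() for part in value.split(";")]
    url = parts[0]
    attrs = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing ';'
        name, _, attr_value = part.partition("=")
        attrs[name.strip()] = attr_value.strip()
    return url, attrs


url, attrs = parse_server_push("/assets/css/main.css; attr1=foo; attr=bar")
print(url, attrs)
```

Since no attributes are recognized yet, a consumer of this result would use only the URL and ignore the attribute dictionary.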

It is unfortunate that I have to close this PR even though interesting discussions are ongoing; it seems like there is no way to keep a PR open after merging the code.

I would appreciate it if you could post suggestions from now on to #137.

Please let me express my gratitude for your help in implementing / improving support for server push in H2O.

@igrigorik

Thank you for pointing that out. Stepping back from comparing between the weight numbers or dependencies, what would be the ideal approach? Would it be something like: send the HEAD element of the HTML first, then push the contents of the CSS / JavaScript files (that block the renderer), and then send the BODY element?

~ish, yeah. Typically, we don't need all of the CSS or JavaScript to get visible content on the screen, and this is where the app developer needs to step in and provide the right context to the server - e.g. push these CSS and JS bytes alongside the HTML response to deliver a fast first render, then stream remaining markup to fill in the remaining bits.

If that approach sounds good for most cases, it might be worth considering adding a one-shot feature (i.e. send DATA only once at a very high weight and then return to the original weight) to the HTTP/2 scheduler of H2O.

I do like the idea of allowing "send X bytes of resource Y then yield it and use a lower priority for the rest". This would be useful for streaming initial HTML payload for large pages, and/or even images: I want to stream the header of the image to allow the UA to decode its geometry and perform layout (if progressive, then a rough preview as well), but I'll stream the image bytes themselves later after other more critical resources.
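
The "send X bytes then yield" idea can be sketched as a simple send plan. This is only an illustration under a strict-ordering simplification (real HTTP/2 schedulers interleave by weight and dependency); the function name, the 4096-byte boost size, and the tuple layout are all assumptions made for the example, not part of H2O.

```python
def one_shot_plan(boosted, boost_bytes, others):
    """Plan sends so that the first `boost_bytes` of `boosted` go out first,
    then the other resources, then whatever remains of `boosted`.

    boosted: (name, size_in_bytes)
    others:  list of (name, size_in_bytes), already in the desired send order
    Returns a list of (name, bytes_to_send) steps.
    """
    name, size = boosted
    head = min(boost_bytes, size)
    plan = [(name, head)]
    plan.extend(others)
    if size > head:
        plan.append((name, size - head))  # remainder at the original, lower priority
    return plan


# Sizes taken from the access log earlier in this thread.
plan = one_shot_plan(("/", 24583), 4096, [("/assets/css/main.css", 1528)])
print(plan)
```

The same shape would apply to the image case: boost just enough bytes for the UA to decode the geometry, then let more critical resources go ahead of the rest.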

Also, as an aside... Given that HTML is often dynamic and takes some time to generate, whereas CSS/JS is typically static, I'm guessing that even if we set them at same priority.. a good fraction of CSS/JS bytes will still come in front of HTML due to the associated app server response time delays.

@kazuho
Member Author

kazuho commented Feb 6, 2015

@igrigorik
Thank you for the response. I will see if I can implement the one-shot (first-shot, to be more precise) priority escalation. Regarding the images, I do remember you mentioning the feature at the HTTP2 Conference in Tokyo at the end of last year. The discussion here is indeed a variation of that approach.

Also, as an aside... Given that HTML is often dynamic and takes some time to generate, whereas CSS/JS is typically static, I'm guessing that even if we set them at same priority.. a good fraction of CSS/JS bytes will still come in front of HTML due to the associated app server response time delays.

Sounds interesting. Such a feature can definitely be implemented within the reverse proxy.

As a note, redirections can be made faster by using server push. Proxies can monitor whether a Location: header is emitted and, if the URL in the header is of the same authority, push the redirected resource along with the 30x response.
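
The same-authority check for redirects could look roughly like the sketch below. It is an assumption-laden illustration rather than anything H2O does: the function name is hypothetical, the set of redirect status codes is my own choice, and the authority comparison is a plain string match on scheme and host.

```python
from urllib.parse import urljoin, urlsplit

REDIRECT_STATUSES = (301, 302, 303, 307, 308)  # assumed meaning of "30x" here


def redirect_push_target(request_url, status, location):
    """Return the path (plus query) to push for a same-authority redirect, else None."""
    if status not in REDIRECT_STATUSES or not location:
        return None
    origin = urlsplit(request_url)
    target = urlsplit(urljoin(request_url, location))  # resolves relative Location values
    if (target.scheme, target.netloc) != (origin.scheme, origin.netloc):
        return None  # different authority: not a candidate under the condition above
    path = target.path or "/"
    return path + ("?" + target.query if target.query else "")


print(redirect_push_target("https://example.com/old", 302, "/new"))
```

A proxy would call this while processing the upstream response and, when it returns a path, start a push for that path alongside the redirect.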

@tatsuhiro-t
Contributor

Awesome work, @kazuho.
I'm interested in reverse proxy usecase, since we are also planning to add server push to nghttpx.
Reading the h2o source code, currently a couple of headers from the associated request, such as accept* and user-agent, are copied to the push request. I think there are cases where other headers, such as cookies and authorization, affect resource retrieval. Moreover, proxied servers are free to process requests based on arbitrary headers, so would it be safer to copy all headers?
referer can be updated to the associated URL.
Another concern is the accept header field. Browsers change the accept header field based on what they are requesting. For example, Firefox sends completely different accept header fields when getting HTML and CSS. I'm not sure how this variation affects the contents we get.

@tatsuhiro-t
Contributor

As for header field to instrument resources to push, Link header field might be a good candidate: http://www.chromium.org/spdy/link-headers-and-server-hint/link-rel-subresource
If rel=subresource, then it is a good target for push. We can invent new rel value as well...

@igrigorik

As for header field to instrument resources to push, Link header field might be a good candidate: http://www.chromium.org/spdy/link-headers-and-server-hint/link-rel-subresource
If rel=subresource, then it is a good target for push. We can invent new rel value as well...

FWIW, our plan is to retire subresource in favor of "preload", see: http://w3c.github.io/preload/

@kazuho
Member Author

kazuho commented Feb 6, 2015

@tatsuhiro-t Thank you for the comment. That is a good question.

The fact is, I have cowardly limited the headers to be copied for building push responses (see 812dec8 for an example). The reason consists of two points described below.

  1. As briefly suggested in improve HTTP/2 server push #137, I am going to improve the server-push logic so that it pushes resources only when it is unlikely that they already exist in the client cache. To achieve that, the server needs to issue a conditional request internally, check that the response is not 304 Not Modified, and only then send the PUSH_PROMISE frame. At the same time, it is essential to send the PUSH_PROMISE frame before sending the contents of the resource that refers to the pushed resource (e.g. if we are to push a CSS file, the client should receive the PUSH_PROMISE frame prior to the <LINK REL="stylesheet"> tag), or we might waste bandwidth, as clients may issue another request for a resource that is being pushed. These two requirements limit the resources that can be pushed to those that are instantly available within the reverse proxy (e.g. static files or cached content within the proxy cache). As such resources are unlikely to be access-controlled, it seemed unnecessary to copy all the request headers when building a request for server push (in the case of H2O at the moment, statically served contents cannot be access-controlled).
  2. Should the pushed response vary depending on the value of a certain header (e.g. cookie), we need to include vary: cookie in the pushed response, which in turn means that the cookie header must be included in the PUSH_PROMISE frame being sent. However, client implementations might reject such server-pushed streams, as other requests in flight may change the value of the cookie header; clients are required to throw away the pushed response if the value of the cookie has changed by the time they actually need to use the resource in question. In other words, I thought that contents that are only conditionally retrievable had better not be pushed.

After reading your comments (esp. the lines regarding the accept header), I think it might be better to rewrite the vary headers of the server-pushed responses to cache-control: private (if vary exists) and to not send any headers in the PUSH_PROMISE frame. If we adjust the implementation that way, then we can certainly copy all the request headers when building an internal request to initiate server push (with the exception that the accept header may not be usable, as you pointed out).
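
The header rewrite floated here can be sketched as follows. This only illustrates the idea as stated in the comment (it was a proposal at this point, not merged behavior); the function name and the list-of-tuples header representation are assumptions, and the sketch does not try to merge with an existing cache-control header.

```python
def rewrite_vary_for_push(headers):
    """If a pushed response carries `vary`, drop it and mark the response private.

    headers: list of (name, value) tuples; a new list is returned.
    """
    has_vary = any(name.lower() == "vary" for name, _ in headers)
    if not has_vary:
        return list(headers)
    rewritten = [(name, value) for name, value in headers if name.lower() != "vary"]
    rewritten.append(("cache-control", "private"))
    return rewritten


pushed = rewrite_vary_for_push([
    ("content-type", "text/css"),
    ("vary", "cookie"),
])
print(pushed)
```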

@kazuho
Member Author

kazuho commented Feb 6, 2015

@tatsuhiro-t

As for header field to instrument resources to push, Link header field might be a good candidate: http://www.chromium.org/spdy/link-headers-and-server-hint/link-rel-subresource
If rel=subresource, then it is a good target for push. We can invent new rel value as well...

Sounds interesting. I am not excluding that possibility, but for discovering the resources to be pushed, I lean toward using response headers (as implemented by this PR) or a mapping file for statically served contents.

In case of automatic discovery, it is important to not have false positives. We would never want to push a resource that would not be used. That means that we would need a sophisticated parser for discovering the resources (e.g. the parser that extracts the LINK tags should assert that it is not surrounded by <!-- -->), which in turn may mean that such parsers are slow, not to mention that we would need to implement such parsers for each type of resource (e.g. for HTML we need to extract valid LINK tags, for CSS we need to extract @import).
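
As a small illustration of the false-positive concern, the sketch below uses Python's standard `html.parser` to collect stylesheet links while skipping a commented-out one. The parser reports comments through a separate callback, so a LINK tag inside `<!-- -->` never reaches `handle_starttag`. The class name and sample document are made up for the example, and this says nothing about how fast or robust such a parser would be inside a server.

```python
from html.parser import HTMLParser


class StylesheetLinks(HTMLParser):
    """Collect href values of <link rel="stylesheet"> tags outside comments."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; attribute values do not.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "stylesheet" in rel_tokens and attributes.get("href"):
            self.hrefs.append(attributes["href"])


document = """<html><head>
<!-- <link rel="stylesheet" href="/old.css"> -->
<LINK REL="stylesheet" HREF="/assets/css/main.css">
</head><body></body></html>"""

parser = StylesheetLinks()
parser.feed(document)
parser.close()
print(parser.hrefs)
```

A separate extractor would still be needed for other resource types, such as `@import` rules in CSS, which is part of the cost argued above.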

So I believe that for the short term it would be better to use headers or mapping files for specifying the resources that should be pushed. Users can write the mapping files by hand, or use a tool (likely to be written in scripting languages) to extract the URLs of the resources that need to be pushed (and then possibly adjust the list by hand).

EDIT: As an afterthought, if I were to implement such automatic discovery, I would spawn an external filter that extracts the necessary resources for frequently served contents, and associate the results with the cache entry so that the associated contents can be pushed for future requests for the resource.

@igrigorik

@kazuho I'd start with basic Link support. I interpreted @tatsuhiro-t's comment as: instead of using a custom header name, use link with rel=subresource. Except.. Instead of subresource, I think you should use rel=preload. E.g..

200 OK ...
Link: </font.woff>; rel=preload; as=font

^ This tells you the resource that could/should be pushed and its type, which can help determine priority. For more, see: http://w3c.github.io/preload/#interoperability-with-http-link-header
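
A minimal sketch of pulling push candidates out of such a header value is shown below. It is not taken from H2O or nghttp2; the function name is hypothetical, and the parsing is deliberately simplified (it assumes parameter values contain no `;` or `,` characters, so quoted values with those characters are not handled).

```python
import re

# One link-value: <URL> followed by zero or more ';'-separated parameters.
_LINK_VALUE = re.compile(r"<([^>]*)>((?:\s*;\s*[^;,]+)*)")


def preload_targets(link_header):
    """Return [(url, as_type_or_None)] for every rel=preload entry in a Link value."""
    targets = []
    for match in _LINK_VALUE.finditer(link_header):
        url, raw_params = match.group(1), match.group(2)
        params = {}
        for item in raw_params.split(";"):
            item = item.strip()
            if not item:
                continue
            name, _, value = item.partition("=")
            params[name.strip().lower()] = value.strip().strip('"')
        if "preload" in params.get("rel", "").lower().split():
            targets.append((url, params.get("as")))
    return targets


targets = preload_targets("</font.woff>; rel=preload; as=font")
print(targets)
```

The `as` value is returned alongside the URL because, as noted above, the resource type can help determine the priority of the pushed stream.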

@kazuho
Member Author

kazuho commented Feb 6, 2015

@igrigorik Thank you for pointing that out. It is clear that I did not read @tatsuhiro-t's comment carefully enough. My apologies.

@tatsuhiro-t
Contributor

Yeah, I was a bit short of words; I meant the Link header field, not the link element in HTML.
As for cookies and authorization, I have to read the preload spec, but it could be safer to just omit them for now, since the chromium document mentions this as well. Hopefully we'll gain more experience with this new technology this year and find out what the best we can do is.

@kazuho
Member Author

kazuho commented Feb 9, 2015

FYI, as of 85f4471 H2O recognizes Link: <URL>; rel=preload headers and pushes the contents referred to by the URLs.

@igrigorik

@kazuho \o/ ... woot! Time to run some experiments...

@tatsuhiro-t
Contributor

Great! https://nghttp2.org also enabled server push using Link header field, so we have suddenly 2 implementations using preload relation, which sounds very exciting.

@kazuho
Member Author

kazuho commented Feb 9, 2015

@tatsuhiro-t 👍
