implement HTTP/2 server push #133

kazuho · 2015-02-03T05:37:47Z

still wip
relates to: #50

kazuho · 2015-02-03T05:50:42Z

At the moment, the status is:

the HTTP/2 level implementation seems to be working
the priority of the pushed stream is hard-coded to 256 (highest weight) without any dependency
- [Q] how should I calculate the priority of a pushed stream?
still no handler-level code that actually registers the URLs to be pushed

To test the feature, I have applied this patch so that it would send main.css before sending / (files of @ipeychev's http2rulez.com were used for the tests). When accessing the server using Firefox Nightly 38.0a1 (2015-02-02), a log like the following was emitted by the server, which indicates that the CSS file was actually sent before the HTML file (i.e. /).

127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET / HTTP/2" 200 24583 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/bootstrap.css HTTP/2" 200 132472 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/magnific-popup.css HTTP/2" 200 7782 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/font-awesome.css HTTP/2" 200 25174 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/header.css HTTP/2" 200 162 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /js/jquery-1.11.0.js HTTP/2" 200 96383 "-" "-"
(snip)

However, looking at the log, it is obvious that Firefox sent a pull request for the same CSS file even though it had already been pushed. So we need to find out why (does Firefox Nightly already support server push?) (note: no RST_STREAM frame was received from Firefox).
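The duplicate request can also be spotted mechanically. The following is a minimal sketch, assuming the access log format shown above; the helper name `repeated_paths` and the regular expression are illustrative and not part of H2O.

```python
import re
from collections import Counter

# Matches the quoted request line, e.g. "GET /assets/css/main.css HTTP/2"
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+) HTTP/[\d.]+"')

def repeated_paths(log_lines):
    """Return the sorted list of paths that appear more than once in the log."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return sorted(path for path, n in counts.items() if n > 1)

sample = [
    '127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"',
    '127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET / HTTP/2" 200 24583 "-" "-"',
    '127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"',
]
print(repeated_paths(sample))
```

A path that shows up twice here was served once as a push and once as an ordinary pull, which is the symptom described above.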

EDIT: the situation was the same for Chrome 42.0.2293.0 canary.

kazuho · 2015-02-03T06:17:01Z

@bagder @igrigorik Do you have any information regarding the status of server push support in Firefox / Google Chrome? The browsers do not seem to use the CSS file being pushed. They simply ignore it and send an ordinary pull request for the file.

Thank you in advance for your help.

bagder · 2015-02-03T06:45:20Z

Firefox supports push for sure in Nightly, but I can't remember exactly when the support (will) exist in stable versions. I'll ping hurley and mcmanus to see if they can bring some insights here.

kazuho · 2015-02-03T06:59:05Z

@bagder Thank you for the quick response and for pinging your colleagues.

Firefox supports push for sure in Nightly

Hmm. That makes me wonder why it is sending a request to a file that has already been pushed.

Could it be due to the response headers sent along with the push? I believe H2O is sending something like https://gist.github.com/kazuho/1c891149199f5ac2e971

nwgh · 2015-02-03T14:57:01Z

I'm willing to bet the issue with Nightly is https://bugzilla.mozilla.org/show_bug.cgi?id=1127618 given that e10s is enabled by default on Nightly.

@kazuho - if you try disabling e10s via Preferences -> General -> Uncheck "Enable E10S (multi-process)" and then restart Nightly (required to disable e10s), does push work for you with Nightly? If not, then we'll have to dig deeper, but that's a good first place to start.

tatsuhiro-t · 2015-02-03T16:58:57Z

[Q] how should I calculate the priority of a pushed stream?

https://tools.ietf.org/html/draft-ietf-httpbis-http2-16#section-5.3.5

Pushed streams (Section 8.2) initially depend on their associated stream. In both cases, streams are assigned a default weight of 16.

kazuho · 2015-02-04T02:37:28Z

@todesschaf Thank you very much for the suggestions.

I have found and fixed a number of bugs in H2O. With the changes up to 94c42b5, and e10s disabled on Firefox Nightly (38.0a1, 2015-02-03), server push is working like a charm.

With this patch applied to the file handler to send the CSS files, the access log of H2O is printed as follows.

127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/magnific-popup.css HTTP/2" 200 7782 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/font-awesome.css HTTP/2" 200 25174 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/header.css HTTP/2" 200 162 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET / HTTP/2" 200 24583 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/bootstrap.css HTTP/2" 200 132472 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:27 +0900] "GET /js/jquery-1.11.0.js HTTP/2" 200 96383 "-" "-"

It is also evident from the Network panel of Nightly that server push is working, as the waiting times have gone away for the CSS files.

note: transfer of bootstrap.css completes after /, as it is large and requires multiple WINDOW_UPDATE frames to be sent from the client.

I will continue working on the server push support in H2O to make it easier to be used by programmers / system administrators.

kazuho · 2015-02-04T02:44:56Z

@tatsuhiro-t

[Q] how should I calculate the priority of a pushed stream?

https://tools.ietf.org/html/draft-ietf-httpbis-http2-16#section-5.3.5
Pushed streams (Section 8.2) initially depend on their associated stream. In both cases, streams are assigned a default weight of 16.

Thank you for the suggestion.

The problem is that when sending CSS files using server push, they need to be given higher priority than the HTML file that uses them (since it is impossible to render the HTML without a complete set of CSS files, while it is possible to render the HTML progressively once all the CSS files are ready). In other words, IMO the default behavior of making the CSS streams dependent on the HTML stream is inappropriate in this case.

EDIT: Ideally speaking, pushed streams should be given the same priority as if they were being pulled. The question is what the best approximation is.

kazuho · 2015-02-04T02:59:09Z

Confirmed that server push also works with Chrome Canary (42.0.2294.0) using the H2O configuration described in #133 (comment).

However, as Canary sets the weight of HTML to 256, there was a need to set even higher priority for CSS files to be sent before the HTML (8907f2e). Using a weight value of 257 is not a problem even though it exceeds the bounds defined by the HTTP2 spec., since the value is never exposed over the network.

note: the internal weights are never exposed to the client, as sending PRIORITY frames might confuse the clients

kazuho · 2015-02-05T01:35:17Z

@pmeenan

I'm pretty sure Chrome's prioritization for HTTP2 is pretty far behind Firefox's implementation. I know SPDY just uses the coarse 5 priority levels and I'm not sure what we pass through in the case of HTTP2 but I am pretty sure we just map the priorities to stream weights (which causes interleaving).

Yeah! Using Chrome Canary I see weights of 256 (HTML), 220 (CSS), 183 (JavaScript), 110 (images), which corresponds to your description.
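To make the interleaving concrete, here is a rough sketch of how those weights translate into bandwidth. It assumes the four streams are siblings with no dependencies between them and that all of them have data ready, in which case HTTP/2 allocates resources in proportion to weight. The function name is illustrative.

```python
def bandwidth_shares(weights):
    """Return each sibling stream's proportional share of the available bandwidth."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

# Weights observed with Chrome Canary, as quoted above
chrome_weights = {"html": 256, "css": 220, "javascript": 183, "image": 110}
shares = bandwidth_shares(chrome_weights)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under these assumptions the HTML stream gets only about a third of the bandwidth while the other streams are active, so none of the resources is delivered strictly before the others.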

Actual dependency-based prioritization like Firefox launched recently is absolutely the way to go. I'm just not sure what the timeframe is for that in Chrome.

Thank you for the clarification. Looking forward to seeing improvements to / experiments on the browser side.

@igrigorik

Fair enough, but I'm still wary of this strategy as a default. It's very easy to construct cases where shipping HTML last would actually hurt overall performance by delaying discovery of other critical resources - e.g. a Google Fonts CSS file which resides on a different origin and might block text rendering.

Thank you for pointing that out. Stepping back from comparing between the weight numbers or dependencies, what would be the ideal approach? Would it be something like: send the HEAD element of the HTML first, then push the contents of the CSS / JavaScript files (that block the renderer), and then send the BODY element? If the approach sounds good for most cases, it might be worth considering adding a one-shot feature (i.e. send DATA only once at a very high weight and then return to the original weight) to the HTTP/2 scheduler of H2O.

That's not surprising. HTML spec dictates certain processing logic that determines high-level relative priority of various resources, but how those resources are actually scheduled and/or fetched is a whole different story.. This is a space where we all (user agents) have and should continue to experiment to adapt to the continuously evolving architecture of pages + networks our users are using.

I agree that the view better describes the situation (than what I had expected).

kazuho · 2015-02-05T07:15:10Z

Thank you all for your advice; the feature has been successfully merged into master.

The reverse proxy module of H2O now recognizes a response header called x-server-push, which can be emitted by the application servers (running upstream) to request H2O (running as the reverse proxy) to push the resources.

The syntax of the header is: x-server-push: URL; attr1=foo; attr2=bar where URL indicates the URL of the content to be pushed to the client. Attributes can be omitted; no attributes are recognized yet.
(FYI: to ease debugging, H2O inserts an x-http2-pushed response header into the streams that are pushed.)
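As an illustration of the header syntax, the sketch below splits a header value into the URL and its attributes. It is not H2O's parser (H2O is written in C and has its own name-value parser); the function name is hypothetical, and it assumes the URL itself contains no `;` character.

```python
def parse_server_push(value):
    """Split 'URL; attr1=foo; attr2=bar' into (url, {attr: value})."""
    parts = [part.strip() for part in value.split(";")]
    url = parts[0]
    attrs = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing ';'
        name, _, attr_value = part.partition("=")
        attrs[name.strip()] = attr_value.strip()
    return url, attrs

print(parse_server_push("/assets/css/main.css; attr1=foo; attr2=bar"))
```

Since no attributes are recognized yet, a consumer of this result would use only the URL and ignore the attribute dictionary.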

It is unfortunate that I have to close this PR even though interesting discussions are ongoing; it seems like there is no way to keep a PR open after merging the code.

I would appreciate it if you could post suggestions from now on to #137.

Please let me express my gratitude for your help in implementing / improving support for server push in H2O.

igrigorik · 2015-02-05T15:59:11Z

Thank you for pointing that out. Stepping back from comparing between the weight numbers or dependencies, what would be the ideal approach? Would it be something like: send the HEAD element of the HTML first, then push the contents of the CSS / JavaScript files (that block the renderer), and then send the BODY element?

~ish, yeah. Typically, we don't need all of the CSS or JavaScript to get visible content on the screen, and this is where the app developer needs to step in and provide the right context to the server - e.g. push these CSS and JS bytes alongside the HTML response to deliver a fast first render, then stream remaining markup to fill in the remaining bits.

If the approach sounds good for most cases, it might be worth considering adding a one-shot feature (i.e. send DATA only once at a very high weight and then return to the original weight) to the HTTP/2 scheduler of H2O.

I do like the idea of allowing "send X bytes of resource Y then yield it and use a lower priority for the rest". This would be useful for streaming initial HTML payload for large pages, and/or even images: I want to stream the header of the image to allow the UA to decode its geometry and perform layout (if progressive, then a rough preview as well), but I'll stream the image bytes themselves later after other more critical resources.
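One possible shape for "send X bytes at high priority, then yield" is sketched below. This is purely illustrative: the `Stream` class, its field names, and the idea of exposing an effective weight to the scheduler are assumptions, not a description of H2O's scheduler.

```python
from dataclasses import dataclass

@dataclass
class Stream:
    """A stream whose first `boost_bytes` bytes are sent at `boost_weight`."""
    name: str
    base_weight: int
    boost_weight: int = 0  # hypothetical: weight used for the initial bytes
    boost_bytes: int = 0   # hypothetical: how many bytes get the boost
    bytes_sent: int = 0

    def effective_weight(self):
        # Use the boosted weight until the initial chunk has been sent,
        # then fall back to the stream's ordinary weight.
        if self.bytes_sent < self.boost_bytes:
            return max(self.boost_weight, self.base_weight)
        return self.base_weight

    def record_sent(self, nbytes):
        self.bytes_sent += nbytes

html = Stream("html", base_weight=16, boost_weight=256, boost_bytes=1400)
print(html.effective_weight())  # boosted while the first bytes are pending
html.record_sent(1400)
print(html.effective_weight())  # back to the base weight
```

A scheduler would consult `effective_weight()` each time it picks the next stream, so the head of the HTML (or an image header) goes out early and the remainder competes at its normal weight.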

Also, as an aside... Given that HTML is often dynamic and takes some time to generate, whereas CSS/JS is typically static, I'm guessing that even if we set them at same priority.. a good fraction of CSS/JS bytes will still come in front of HTML due to the associated app server response time delays.

kazuho · 2015-02-06T02:10:20Z

@igrigorik
Thank you for the response. I will see if I can implement the one-shot (first-shot to be more precise) priority escalation. And regarding the images, I do remember you mentioning the feature in HTTP2 Conference in Tokyo at the end of last year. The discussion here is indeed a variation of the approach.

Also, as an aside... Given that HTML is often dynamic and takes some time to generate, whereas CSS/JS is typically static, I'm guessing that even if we set them at same priority.. a good fraction of CSS/JS bytes will still come in front of HTML due to the associated app server response time delays.

Sounds interesting. Such a feature can definitely be implemented within the reverse proxy.

As a note, redirections can be made faster by using server push. Proxies can monitor if Location: header is emitted, and in case the URL in the header is of the same authority, push the redirected resource along with the 30x response.
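A minimal sketch of the same-authority check, using only the Python standard library. The function name is hypothetical, and comparing the scheme in addition to the authority is an extra assumption beyond what is stated above.

```python
from urllib.parse import urljoin, urlsplit

def redirect_push_target(request_url, status, location):
    """Return the absolute URL to push for a redirect, or None if it should not be pushed."""
    if status not in (301, 302, 303, 307, 308) or not location:
        return None
    # Location may be relative, so resolve it against the request URL.
    target = urlsplit(urljoin(request_url, location))
    origin = urlsplit(request_url)
    same_origin = (target.scheme, target.netloc.lower()) == (origin.scheme, origin.netloc.lower())
    return target.geturl() if same_origin else None

print(redirect_push_target("https://example.com/old", 302, "/new"))
print(redirect_push_target("https://example.com/old", 302, "https://other.example/new"))
```

Redirects that point at a different authority are left alone, since a server can only push resources for which it is authoritative.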

tatsuhiro-t · 2015-02-06T14:49:53Z

Awesome work, @kazuho.
I'm interested in reverse proxy usecase, since we are also planning to add server push to nghttpx.
Reading the h2o source code, a couple of headers from the associated request, such as accept* and user-agent, are currently copied to the push request. I think there are cases where other headers, like cookies and authorization, affect resource retrieval. Moreover, the proxied server is free to process requests based on arbitrary headers, so would it be safer to copy all headers?
referer can be updated to the associated URL.
Another concern is the accept header field. Browsers change the accept header field based on what they are requesting. For example, Firefox sends completely different accept header fields when getting HTML and CSS. I'm not sure how this variation affects the content we get.

tatsuhiro-t · 2015-02-06T15:47:33Z

As for header field to instrument resources to push, Link header field might be a good candidate: http://www.chromium.org/spdy/link-headers-and-server-hint/link-rel-subresource
If rel=subresource, then it is a good target for push. We can invent new rel value as well...

igrigorik · 2015-02-06T16:55:54Z

As for header field to instrument resources to push, Link header field might be a good candidate: http://www.chromium.org/spdy/link-headers-and-server-hint/link-rel-subresource
If rel=subresource, then it is a good target for push. We can invent new rel value as well...

FWIW, our plan is to retire subresource in favor of "preload", see: http://w3c.github.io/preload/

kazuho · 2015-02-06T19:28:56Z

@tatsuhiro-t Thank you for the comment. That is a good question.

The fact is, I have cowardly limited the headers to be copied for building push responses (see 812dec8 for an example). The reason consists of two points described below.

- As briefly suggested in "improve HTTP/2 server push" (#137), I am going to improve the server-push logic so that it pushes resources only when it is unlikely that they already exist in the client cache. To achieve that, the server needs to issue a conditional request internally, check that the response is not 304 Not Modified, and only then send the PUSH_PROMISE header. At the same time, it is essential to send the PUSH_PROMISE header before sending the contents of the resource that refers to the pushed resource (e.g. if we are to push a CSS file, the client should receive the PUSH_PROMISE header prior to the <LINK REL="stylesheet"> tag), or we might waste bandwidth, as clients may issue another request for a resource that is being pushed. These two requirements limit the resources that can be pushed to those that are available instantly within the reverse proxy (e.g. static files or content cached within the proxy cache). As it is unlikely that such resources are access-controlled, it seemed unnecessary to copy all the request headers when building a request for server push (in the case of H2O at the moment, statically served contents cannot be access-controlled).
- Should the pushed response vary depending on the value of a certain header (e.g. cookie), we need to include vary: cookie in the pushed response, which in turn means that the cookie header must be included in the PUSH_PROMISE frame being sent. However, client implementations might reject such server-pushed streams, as other requests in flight may change the value of the cookie header; clients are required to throw away the pushed response if the value of the cookie has changed by the time it actually needs to use the resource in question. In other words, I thought that contents that become conditionally retrievable had better not be pushed.
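The first point boils down to a small decision rule, sketched below. The function and parameter names are hypothetical, and treating only 2xx results of the internal conditional request as pushable is an assumption made for the sketch.

```python
def should_push(internal_status, referrer_body_started):
    """Decide whether to send PUSH_PROMISE for a candidate resource.

    internal_status: status of the internal conditional request for the candidate.
    referrer_body_started: True once the body of the referring resource has begun
    to be sent, at which point the client may already have requested the candidate.
    """
    if referrer_body_started:
        return False  # too late: PUSH_PROMISE must precede the referring content
    if internal_status == 304:
        return False  # the client's cached copy is probably still fresh
    return 200 <= internal_status < 300

print(should_push(200, referrer_body_started=False))
print(should_push(304, referrer_body_started=False))
```

Both conditions have to be evaluated before the first byte of the referring response is sent, which is why only instantly available resources qualify.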

After reading your comments (esp. the lines regarding the accept header), I think it might be better to rewrite the vary headers of server-pushed responses to cache-control: private (if vary exists) and also not send any headers in the PUSH_PROMISE frame. If we adjust the implementation that way, then we can certainly copy all the request headers when building an internal request to initiate server push (with the exception that the accept header may not be usable, as you pointed out).

kazuho · 2015-02-06T20:07:42Z

@tatsuhiro-t

As for header field to instrument resources to push, Link header field might be a good candidate: http://www.chromium.org/spdy/link-headers-and-server-hint/link-rel-subresource
If rel=subresource, then it is a good target for push. We can invent new rel value as well...

Sounds interesting. Although I am not excluding that possibility, for discovering the resources to be pushed I lean toward using response headers (as has been implemented by this PR) or using a mapping file for statically served contents.

In case of automatic discovery, it is important to not have false positives. We would never want to push a resource that would not be used. That means that we would need a sophisticated parser for discovering the resources (e.g. the parser that extracts the LINK tags should assert that it is not surrounded by ), which in turn may mean that such parsers are slow, not to mention that we would need to implement such parsers for each type of resource (e.g. for HTML we need to extract valid LINK tags, for CSS we need to extract @import).

So I believe that for the short term it would be better to use headers or mapping files for specifying the resources that should be pushed. Users can write the mapping files by hand, or use a tool (likely to be written in scripting languages) to extract the URLs of the resources that need to be pushed (and then possibly adjust the list by hand).

EDIT: As an afterthought, if I were to implement such automatic discovery I would spawn an external filter that extracts the necessary resources for frequently served contents, and associate the results to the cache entry so that the associated contents can be pushed for future requests arriving to the resource.

igrigorik · 2015-02-06T20:45:03Z

@kazuho I'd start with basic Link support. I interpreted @tatsuhiro-t's comment as: instead of using a custom header name, use link with rel=subresource. Except.. Instead of subresource, I think you should use rel=preload. E.g..

200 OK ...
Link: </font.woff>; rel=preload; as=font

^ This tells you the resource that could/should be pushed and its type, which can help determine priority. For more, see: http://w3c.github.io/preload/#interoperability-with-http-link-header
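For illustration, a simplified sketch of extracting push candidates from such a header follows. It is not a complete Link header parser: it assumes parameter values contain no commas or semicolons, and the function name is hypothetical.

```python
import re

# One link-value: <URL> followed by zero or more ';'-separated parameters
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]*)*)')

def preload_links(header_value):
    """Return (url, as_type) pairs for every link-value with rel=preload."""
    results = []
    for match in LINK_RE.finditer(header_value):
        url, raw_params = match.group(1), match.group(2)
        params = {}
        for piece in raw_params.split(";"):
            piece = piece.strip()
            if not piece:
                continue
            name, _, value = piece.partition("=")
            params[name.strip().lower()] = value.strip().strip('"')
        # rel may list several space-separated relation types
        if "preload" in params.get("rel", "").lower().split():
            results.append((url, params.get("as")))
    return results

print(preload_links("</font.woff>; rel=preload; as=font"))
```

The `as` value returned alongside the URL is what a server could use to pick a priority for the pushed stream.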

kazuho · 2015-02-06T20:49:38Z

@igrigorik Thank you for pointing that out. It is clear that I did not read @tatsuhiro-t's comment carefully enough. My apologies.

tatsuhiro-t · 2015-02-07T01:44:17Z

Yeah, I was a bit short of words; I meant the Link header field, not the link element in HTML.
As for cookies and authorization stuff, I have to read preload spec, but it could be safer to just omit it for now since chromium document also refers this as well. Hopefully we'll gain more experience this year about this new technology and find out what is the best we can do.

kazuho · 2015-02-09T05:10:01Z

FYI as of 85f4471 H2O recognizes Link: <URL>; rel=preload headers and pushes the contents referred to by the URLs.

igrigorik · 2015-02-09T05:15:10Z

@kazuho \o/ ... woot! Time to run some experiments...

tatsuhiro-t · 2015-02-09T15:24:20Z

Great! https://nghttp2.org also enabled server push using Link header field, so we have suddenly 2 implementations using preload relation, which sounds very exciting.

kazuho · 2015-02-09T22:08:35Z

@tatsuhiro-t 👍

kazuho added 10 commits February 3, 2015 10:03

adjust the way the number of open streams are counted (as well as ren…

6c7ec73

…aming it to `open_pull`, as pushed streams should be counted separately)

simplify h2o_hpack_flatten_headers (as the spec. permits to divide si…

c024897

…ngle hpack header element into multiple HTTP2 frames)

no need to check the maximum length of the header value (amends prev …

9aef844

…commit)

Merge branch 'kazuho/refactor/hpack-encoder' into kazuho/push

3663ea0

Merge branch 'master' into kazuho/push

02ad6f9

[refactor] h2o_parse_url returns the result using struct

bb463b4

h2o_parse_url extracts the authority as well

bb0f192

update the examples following the change

371c48d

Merge branch 'kazuho/parse_authority_from_url' into kazuho/push

26627ca

[http2] implement push in the protocol layer

61e2638

kazuho added enhancement http2 labels Feb 3, 2015

kazuho added 5 commits February 4, 2015 11:06

fix bugs that sent corrupt PUSH_PROMISE frame

fc62e2b

fix stream-level errors being misreported

9e357cf

for debugging, add HPACK encoder that does not compress data

b4ff856

:authority of server push should not be hostport, not host

f791707

do not request to proceed the to-be-pushed streams being cancelled as…

94c42b5

… they can be cancelled silently, without sending RST_STREAM or DATA(END_STREAM)

set weight of pushed streams to 257 so that they will be sent before …

8907f2e

…any streams being pulled

kazuho mentioned this pull request Feb 4, 2015

compare if-modified-since and last-modified algebraically #134

Merged

kazuho added 2 commits February 5, 2015 04:03

Merge branch 'master' into kazuho/push

fddfbce

Conflicts: include/h2o/string_.h

for readability, move the code that sends push_promise next to h2o_ht…

468fabb

…tp2_conn_push_url

kazuho added 2 commits February 5, 2015 12:57

refactor the tokenizer for simplicity and performance, add tests

4566ad8

implement parser for name-value pairs

cdc2996

kazuho mentioned this pull request Feb 5, 2015

implement parser for name-value pairs #136

Merged

kazuho added 3 commits February 5, 2015 13:37

Merge branch 'kazuho/nvlist-parser' into kazuho/push

ea6b92c

to ease debugging, send x-http2-pushed header if the content has been…

5983381

… pushed

in reverse proxy, recognize `x-server-push: URL; attr1=foo; attr2=bar…

bce3d5d

…` header and push the contents if possible

kazuho added a commit that referenced this pull request Feb 5, 2015

Merge pull request #133 from h2o/kazuho/push

7e8836d

implement HTTP/2 server push

kazuho merged commit 7e8836d into master Feb 5, 2015

kazuho mentioned this pull request Feb 5, 2015

improve HTTP/2 server push #137

Open

kazuho mentioned this pull request Feb 6, 2015

Use Link header for server-push #141

Closed

tatsuhiro-t mentioned this pull request Apr 11, 2015

Push on nghttpx nghttp2/nghttp2#149

Closed

igrigorik mentioned this pull request Jul 27, 2015

Feature: Link rel=preload support for H2 push icing/mod_h2#38

Closed

kazuho mentioned this pull request Aug 9, 2015

cache-aware server-push #421

Open

s0j0urn mentioned this pull request Sep 10, 2015

h2o configuration for server push #490

Open
