implement HTTP/2 server push #133

Merged
merged 25 commits into from Feb 5, 2015

6 participants
@kazuho
Member

kazuho commented Feb 3, 2015

still wip
relates to: #50

@kazuho


kazuho Feb 3, 2015

Member

At the moment, the status is:

  • the HTTP/2 level implementation seems to be working
  • the priority of the pushed stream is hard-coded to 256 (highest weight) without any dependency
    • [Q] how should I calculate the priority of a pushed stream?
  • still no handler-level code that actually registers the URLs to be pushed

To test the feature, I have applied this patch so that it would send main.css before sending / (files of @ipeychev's http2rulez.com were used for the tests). When accessing the server using Firefox Nightly 38.0a1 (2015-02-02), the server emitted a log like the following, which indicates that the CSS file was actually sent before the HTML file (i.e. /).

127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET / HTTP/2" 200 24583 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/bootstrap.css HTTP/2" 200 132472 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/magnific-popup.css HTTP/2" 200 7782 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/font-awesome.css HTTP/2" 200 25174 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/header.css HTTP/2" 200 162 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /js/jquery-1.11.0.js HTTP/2" 200 96383 "-" "-"
(snip)

However, the log shows that Firefox still sent an ordinary (pull) request for the same CSS file, even though it had already been pushed. So we need to find out why (does Firefox Nightly already support server push?). Note: no RST_STREAM frame was received from Firefox.

EDIT: the situation was the same for Chrome 42.0.2293.0 canary.
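The duplicate fetch can also be spotted mechanically. The following is a minimal sketch, not part of H2O: it scans the access-log lines quoted above and reports any path that was served more than once. The function name and the regular expression are illustrative assumptions.

```python
import re
from collections import Counter

# Access-log lines copied from the comment above.
LOG = """\
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET / HTTP/2" 200 24583 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/bootstrap.css HTTP/2" 200 132472 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/magnific-popup.css HTTP/2" 200 7782 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/font-awesome.css HTTP/2" 200 25174 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/header.css HTTP/2" 200 162 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"
127.0.0.1 - - [03/Feb/2015:14:38:16 +0900] "GET /js/jquery-1.11.0.js HTTP/2" 200 96383 "-" "-"
"""

# Matches the quoted request line, e.g. "GET /assets/css/main.css HTTP/2".
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def paths_served_more_than_once(log_text):
    """Return the paths that appear in more than one log line, in first-seen order."""
    counts = Counter(m.group("path") for m in REQUEST_RE.finditer(log_text))
    return [path for path, n in counts.items() if n > 1]


print(paths_served_more_than_once(LOG))  # ['/assets/css/main.css']
```

The log alone does not say which of the two entries was the push and which was the pull; that conclusion comes from the test setup described above.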

@kazuho


kazuho Feb 3, 2015

Member

@bagder @igrigorik Do you have any information regarding the status of server push support in Firefox / Google Chrome? The browsers do not seem to use the CSS file being pushed. They simply ignore it and send an ordinary pull request for the file.

Thank you in advance for your help.

@bagder


bagder Feb 3, 2015

Firefox supports push for sure in Nightly, but I can't remember exactly when the support (will) exist in stable versions. I'll ping hurley and mcmanus to see if they can bring some insights here.

@kazuho


kazuho Feb 3, 2015

Member

@bagder Thank you for the quick response and for pinging your colleagues.

Firefox supports push for sure in Nightly

Hmm. That makes me wonder why it is sending a request for a file that has already been pushed.

Maybe it is due to the response headers sent along with the push? I believe H2O is sending something like https://gist.github.com/kazuho/1c891149199f5ac2e971

@nwgh


nwgh Feb 3, 2015

I'm willing to bet the issue with Nightly is https://bugzilla.mozilla.org/show_bug.cgi?id=1127618 given that e10s is enabled by default on Nightly.

@kazuho - if you try disabling e10s via Preferences -> General -> Uncheck "Enable E10S (multi-process)" and then restart Nightly (required to disable e10s), does push work for you with Nightly? If not, then we'll have to dig deeper, but that's a good first place to start.

@tatsuhiro-t


tatsuhiro-t Feb 3, 2015

Contributor

[Q] how should I calculate the priority of a pushed stream?

https://tools.ietf.org/html/draft-ietf-httpbis-http2-16#section-5.3.5

Pushed streams (Section 8.2) initially depend on their associated stream. In both cases, streams are assigned a default weight of 16.
@kazuho


kazuho Feb 4, 2015

Member

@todesschaf Thank you very much for the suggestions.

I have found and fixed a number of bugs in H2O. With the changes up to 94c42b5, and e10s disabled on Firefox Nightly (38.0a1 2015-02-03), server push is working like a charm.

With this patch applied to the file handler to send the CSS files, the access log of H2O is printed as follows.

127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/magnific-popup.css HTTP/2" 200 7782 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/font-awesome.css HTTP/2" 200 25174 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/header.css HTTP/2" 200 162 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/main.css HTTP/2" 200 1528 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET / HTTP/2" 200 24583 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:25 +0900] "GET /assets/css/bootstrap.css HTTP/2" 200 132472 "-" "-"
127.0.0.1 - - [04/Feb/2015:11:27:27 +0900] "GET /js/jquery-1.11.0.js HTTP/2" 200 96383 "-" "-"

It is also evident from the Network panel of Nightly that server push is working, as the waiting times have gone away for the CSS files.

Network Panel

note: transfer of bootstrap.css completes after /, as it is large and requires multiple WINDOW_UPDATE frames to be sent from the client.

I will continue working on the server push support in H2O to make it easier for programmers / system administrators to use.
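The note about bootstrap.css can be made concrete with a little arithmetic. This is a rough sketch under one stated assumption: the client uses HTTP/2's default initial stream window of 65,535 bytes (a client may advertise a different value via SETTINGS_INITIAL_WINDOW_SIZE, and the window Nightly actually used is not given in this thread). The sizes come from the access log above; the function name is illustrative. The connection-level window, which all streams share, is ignored here.

```python
# Default initial flow-control window in HTTP/2, in bytes.
# ASSUMPTION: the client did not change it via SETTINGS_INITIAL_WINDOW_SIZE.
DEFAULT_INITIAL_WINDOW = 65_535

# Response body sizes taken from the access log above.
SIZES = {
    "/assets/css/magnific-popup.css": 7782,
    "/assets/css/font-awesome.css": 25174,
    "/assets/css/header.css": 162,
    "/assets/css/main.css": 1528,
    "/": 24583,
    "/assets/css/bootstrap.css": 132472,
}


def extra_credit_needed(body_size, window=DEFAULT_INITIAL_WINDOW):
    """Bytes of additional stream-level credit the client must grant
    (through WINDOW_UPDATE) before the whole body can be sent."""
    return max(0, body_size - window)


for path, size in SIZES.items():
    print(f"{path}: {extra_credit_needed(size)} extra bytes of credit needed")
```

Under that assumption only bootstrap.css exceeds the initial window, so it is the only response that has to wait for the client to replenish credit.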

@kazuho


kazuho Feb 4, 2015

Member

@tatsuhiro-t

[Q] how should I calculate the priority of a pushed stream?

https://tools.ietf.org/html/draft-ietf-httpbis-http2-16#section-5.3.5

Pushed streams (Section 8.2) initially depend on their associated stream. In both cases, streams are assigned a default weight of 16.

Thank you for the suggestion.

The problem is that when sending CSS files using server push, they need to be given higher priority than the HTML file that uses them (since it is impossible to render the HTML without a complete set of CSS files, while the HTML can be rendered progressively once all the CSS files are ready). In other words, IMO the default behavior of making CSS streams dependent on the HTML stream is inappropriate in this case.

EDIT: Ideally speaking, pushed streams should be given the same priority as if they were being pulled. The question is what the best approximation is.
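To make the two options under discussion easier to compare, here is a small illustrative model (not H2O code; every class and function name is hypothetical). It contrasts the draft's default, where a pushed stream depends on its associated stream with weight 16, with the hard-coded setting described earlier in this thread, weight 256 with no dependency.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_WEIGHT = 16  # default weight quoted from the draft above


@dataclass
class Stream:
    stream_id: int
    weight: int = DEFAULT_WEIGHT
    # None stands for "depends on stream 0", i.e. no dependency on another stream.
    parent: Optional["Stream"] = None


def spec_default_push(push_id: int, associated: Stream) -> Stream:
    """Draft default: the pushed stream depends on its associated stream."""
    return Stream(push_id, DEFAULT_WEIGHT, parent=associated)


def hard_coded_push(push_id: int) -> Stream:
    """The setting described in this PR: weight 256, no dependency."""
    return Stream(push_id, 256, parent=None)


html = Stream(1)                      # the request for "/"
css_spec = spec_default_push(2, html)
css_pr = hard_coded_push(2)

print(css_spec.parent is html, css_spec.weight)  # True 16
print(css_pr.parent is None, css_pr.weight)      # True 256
```

With the draft default, the CSS stream sits below the HTML stream in the dependency tree, which is the ordering questioned above.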

@kazuho


kazuho Feb 4, 2015

Member

Confirmed that server push also works with Chrome Canary (42.0.2294.0) using the H2O configuration described in #133 (comment).

However, as Canary sets the weight of HTML to 256, the CSS files needed an even higher weight to be sent before the HTML (8907f2e). Using a weight value of 257 is not a problem even though it exceeds the bounds defined by the HTTP/2 spec, since the value is never exposed over the network.

note: the internal weights are never exposed to the client, as sending PRIORITY frames might confuse the clients
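For context on why 257 can only ever be an internal value: on the wire, HTTP/2 carries the weight in an 8-bit field holding the weight minus one, so only 1 to 256 are representable. A minimal sketch of that encoding follows; the function name is illustrative and this is not H2O's code.

```python
def encode_weight(weight: int) -> int:
    """Map an HTTP/2 weight (1..256) to the 8-bit value carried on the wire (0..255)."""
    if not 1 <= weight <= 256:
        raise ValueError(f"weight {weight} cannot be encoded in a PRIORITY field")
    return weight - 1


print(encode_weight(256))  # 255, the largest value the field can hold

try:
    encode_weight(257)  # the internal-only weight used for pushed CSS
except ValueError as exc:
    print(exc)
```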

@igrigorik


igrigorik Feb 4, 2015

@kazuho great work on this, really happy to see focus on optimizing push!

The problem is that when sending CSS files using server push, they need to be given higher priority than the HTML file that uses the CSS files (since it is totally impossible to render the HTML without having a complete set of CSS files, while it is possible to progressively render HTML when once all the CSS files become ready). In other words, IMO the default behavior to make CSS streams dependent to the HTML stream is inappropriate in this case.

You're right in that CSS blocks rendering, but I don't think this means that CSS bytes are strictly higher priority than HTML. For example, while rendering may be blocked on CSSOM, it still makes sense to stream the HTML bytes early to allow the browser to initiate fetches for other (non-PUSH, e.g. third party origins, JavaScript initiated, etc.) resources as early as possible. Given that many CSS files are actually quite large (see bottom of https://www.w3.org/Bugs/Public/show_bug.cgi?id=27303#c17), you don't want to stuff the first CWND with just CSS.

As a rule of thumb, I'd suggest treating CSS as equal in priority to HTML and allowing the server to interleave those bytes... That said, the specifics might vary for a particular site, so it'd be nice to have a clean way to override and/or hint these things to the server - e.g. some stylesheets don't block rendering and should have a much lower priority.

/cc @pmeenan... who might have some insights on used priorities on Blink side.

@pmeenan


pmeenan Feb 4, 2015

The blink priorities are here (though those may not pass through to http2 yet). Generally the "main resource", which would be the base HTML for any HTML page, is one step higher than CSS.

That said, there may be cases where applications may want iFrame HTML to NOT be higher priority than main page CSS, which is where the more complicated dependency trees could be helpful.

@kazuho


kazuho Feb 4, 2015

Member

@igrigorik Thank you for the insights.

You're right in that CSS blocks rendering, but I don't think this means that CSS bytes are strictly higher priority than HTML. For example, while rendering may be blocked on CSSOM, it still makes sense to stream the HTML bytes early to allow the browser to initiate fetches for other (non-PUSH, e.g. third party origins, JavaScript initiated, etc.) resources as early as possible. Given that many CSS files are actually quite large (see bottom of https://www.w3.org/Bugs/Public/show_bug.cgi?id=27303#c17), you don't want to stuff the first CWND with just CSS.

Understood. I had overlooked that HTML is an anchor to the other resources that need to be fetched.

OTOH I would argue that for people who want to fine-tune their web-site performance by instructing the httpd to use server push, prioritizing CSS above HTML would be a good default for two reasons: i) they can set up the server to push only the contents that block the rendering process, ii) loading a third-party resource that blocks rendering is something they avoid when optimizing load speed.

As a rule of thumb, I'd suggest treating CSS as equal priority as HTML and allow the server to interleave those bytes...

Wow!

I had expected that Chrome would start using a dependency-based prioritization that prefers CSS over HTML, the way Mozilla is introducing in Firefox 37 (described here).

Your suggestion is surprising to me as it means that there is still no consensus between the browser vendors in how the prioritization logic of HTTP/2 should be used.

That said, the specifics might vary for a particular site, so it'd be nice to have a clean way to override and/or hint these things to the server - e.g. some stylesheets don't block rendering and should have a much lower priority.

Agreed that the parameter should be adjustable.

@pmeenan

The blink priorities are here (though those may not pass through to http2 yet). Generally the "main resource" is one step higher than CSS which would be the base HTML for any HTML pages.

Thank you for the link and the insights. I am now starting to look into how Chrome prioritizes the resources.

That said, there may be cases where applications may want iFrame HTML to NOT be higher priority than main page CSS, which is where the more complicated dependency trees could be helpful.

Are you suggesting using dependency-based prioritization to download the contents of an IFRAME after the parent frame? Sounds like a neat idea.

@pmeenan


pmeenan Feb 4, 2015

I'm pretty sure Chrome's prioritization for HTTP2 is pretty far behind Firefox's implementation. I know SPDY just uses the coarse 5 priority levels and I'm not sure what we pass through in the case of HTTP2 but I am pretty sure we just map the priorities to stream weights (which causes interleaving).

Actual dependency-based prioritization like Firefox launched recently is absolutely the way to go. I'm just not sure what the timeframe is for that in Chrome.

@igrigorik


igrigorik Feb 5, 2015

OTOH I would argue that for people who want to fine-tune their web-site performance by instructing the httpd to use server push, prioritizing CSS above HTML would be good as the default behavior for two reasons: i) they can setup the server to push only the contents that block the rendering process

Fair enough, but I'm still wary of this strategy as a default. It's very easy to construct cases where shipping HTML last would actually hurt overall performance by delaying discovery of other critical resources - e.g. a Google Fonts CSS file which resides on a different origin and might block text rendering.

ii) loading a third-party resource that blocks rendering is what they avoid in case of optimizing load speed.

True, but the reality is that sites make heavy use of third party origins: http://bigqueri.es/t/what-is-the-distribution-of-1st-party-vs-3rd-party-resources/100/5.

Your suggestion is surprising to me as it means that there is still no consensus between the browser vendors in how the prioritization logic of HTTP/2 should be used.

That's not surprising. The HTML spec dictates certain processing logic that determines the high-level relative priority of various resources, but how those resources are actually scheduled and/or fetched is a whole different story. This is a space where we all (user agents) have experimented and should continue to experiment, to adapt to the continuously evolving architecture of pages + networks our users are using.

@kazuho


kazuho Feb 5, 2015

Member

@pmeenan

I'm pretty sure Chrome's prioritization for HTTP2 is pretty far behind Firefox's implementation. I know SPDY just uses the coarse 5 priority levels and I'm not sure what we pass through in the case of HTTP2 but I am pretty sure we just map the priorities to stream weights (which causes interleaving).

Yeah! Using Chrome Canary I see weights of 256 (HTML), 220 (CSS), 183 (JavaScript), 110 (images), which corresponds to your description.
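As a rough illustration of what those weights mean for interleaving: when streams share the same parent and all have data ready, each is meant to receive bandwidth in proportion to its weight. The sketch below computes those proportions for the four observed values. It assumes the four streams are siblings competing at the same time, which is a simplification, and the names are illustrative.

```python
# Weights observed with Chrome Canary, as reported above.
CHROME_CANARY_WEIGHTS = {"HTML": 256, "CSS": 220, "JavaScript": 183, "images": 110}


def bandwidth_shares(weights):
    """Proportional share per stream, assuming all are siblings with data ready."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


for name, share in bandwidth_shares(CHROME_CANARY_WEIGHTS).items():
    print(f"{name}: {share:.1%}")
```

Under this model HTML gets about a third of the bandwidth rather than all of it, so CSS and HTML bytes are interleaved instead of being sent strictly one after the other.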

Actual dependency-based prioritization like Firefox launched recently is absolutely the way to go. I'm just not sure what the timeframe is for that in Chrome.

Thank you for the clarification. Looking forward to seeing improvements to / experiments on the browser side.

@igrigorik

Fair enough, but I'm still wary of this strategy as a default. It's very easy to construct cases where shipping HTML last would actually hurt overall performance by delaying discovery of other critical resources - e.g. a Google Fonts CSS file which resides on a different origin and might block text rendering.

Thank you for pointing that out. Stepping back from comparing weight numbers or dependencies, what would be the ideal approach? Would it be something like: send the HEAD element of the HTML first, then push the contents of the CSS / JavaScript files (that block the renderer), and then send the BODY element? If that approach sounds good for most cases, it might be worth considering adding a one-shot feature (i.e. send DATA only once at a very high weight and then return to the original weight) to the HTTP/2 scheduler of H2O.

That's not surprising. HTML spec dictates certain processing logic that determines high-level relative priority of various resources, but how those resources are actually scheduled and/or fetched is a whole different story.. This is a space where we all (user agents) have and should continue to experiment to adapt to the continuously evolving architecture of pages + networks our users are using.

I agree that the view better describes the situation (than what I had expected).

kazuho added a commit that referenced this pull request Feb 5, 2015

Merge pull request #133 from h2o/kazuho/push
implement HTTP/2 server push

@kazuho kazuho merged commit 7e8836d into master Feb 5, 2015

1 check passed

continuous-integration/travis-ci The Travis CI build passed

kazuho Feb 5, 2015

Member

Thank you all for your advice; the feature has been successfully merged to master.

The reverse proxy module of H2O now recognizes a response header called x-server-push, which application servers (running upstream) can emit to request that H2O (running as the reverse proxy) push the resources.

The syntax of the header is: x-server-push: URL; attr1=foo; attr=bar where URL indicates the URL of the content to be pushed to the client. Attributes can be omitted; no attributes are recognized yet.
(FYI: to ease debugging, H2O inserts an x-http-pushed response header into the streams that are pushed.)
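
For application authors who want to check what they emit, a minimal sketch of splitting a header value of that shape into the URL and its attributes might look like the following. This is not H2O's parser; the function name is hypothetical, and the sketch assumes the URL itself contains no semicolon.

```python
def parse_server_push(value):
    """Split an x-server-push header value into (url, attributes)."""
    parts = [part.strip() for part in value.split(";")]
    url = parts[0]
    attrs = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing semicolon
        name, _, attr_value = part.partition("=")
        attrs[name.strip()] = attr_value.strip()
    return url, attrs


url, attrs = parse_server_push("/assets/css/main.css; attr1=foo; attr=bar")
print(url, attrs)
```

Because no attributes are recognized yet, a consumer of this sketch would simply ignore the returned dictionary for now.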

It is unfortunate that I have to close this PR even though interesting discussions are ongoing; it seems like there is no way to keep a PR open after merging the code.

I would appreciate it if you could post suggestions from now on to #137.

Please let me express my gratitude for your help in implementing and improving support for server push in H2O.

igrigorik Feb 5, 2015

Thank you for pointing that out. Stepping back from comparing between the weight numbers or dependencies, what would be the ideal approach? Would it be something like: send the HEAD element of the HTML first, then push the contents of the CSS / JavaScript files (that block the renderer), and then send the BODY element?

~ish, yeah. Typically, we don't need all of the CSS or JavaScript to get visible content on the screen, and this is where the app developer needs to step in and provide the right context to the server - e.g. push these CSS and JS bytes alongside the HTML response to deliver a fast first render, then stream remaining markup to fill in the remaining bits.

If the approach sounds good for most cases, it might be worth considering adding a one-shot feature (i.e. send DATA only once at a very high weight and then return to the original weight) to the HTTP/2 scheduler of H2O.

I do like the idea of allowing "send X bytes of resource Y then yield it and use a lower priority for the rest". This would be useful for streaming the initial HTML payload for large pages, and/or even images: I want to stream the header of the image to allow the UA to decode its geometry and perform layout (if progressive, then a rough preview as well), but I'll stream the image bytes themselves later, after other more critical resources.

Also, as an aside... Given that HTML is often dynamic and takes some time to generate, whereas CSS/JS is typically static, I'm guessing that even if we set them at same priority.. a good fraction of CSS/JS bytes will still come in front of HTML due to the associated app server response time delays.
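
A toy simulation of the "send X bytes first, then yield" idea, purely as a sketch: it is not how H2O's scheduler is written. The `Stream` class, `boost_bytes` field and `schedule` function are hypothetical, and serving boosted streams strictly first is a simplification of "a very high weight". After the boost is spent, streams share capacity by weight using a simple virtual-time scheme.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    name: str
    remaining: int        # bytes left to send
    weight: int           # HTTP/2 weight used after the boost is spent
    boost_bytes: int = 0  # bytes to send ahead of everything else
    vtime: float = 0.0    # virtual time for weighted fair queuing


def schedule(streams, chunk=1000):
    """Return the send order as a list of (stream name, chunk size)."""
    order = []
    active = [s for s in streams if s.remaining > 0]
    while active:
        # Boosted streams go first; the rest are ordered by virtual time.
        stream = min(active, key=lambda s: (s.boost_bytes == 0, s.vtime))
        size = min(chunk, stream.remaining)
        if stream.boost_bytes > 0:
            size = min(size, stream.boost_bytes)
            stream.boost_bytes -= size
        else:
            # A heavier stream advances its virtual time more slowly,
            # so it is picked more often.
            stream.vtime += size / stream.weight
        stream.remaining -= size
        order.append((stream.name, size))
        active = [s for s in active if s.remaining > 0]
    return order


html = Stream("html", remaining=5000, weight=16, boost_bytes=2000)
css = Stream("css", remaining=3000, weight=256)
order = schedule([html, css])
print(order)
```

In this run the first 2000 bytes of the HTML go out ahead of the CSS, after which the low-weight HTML mostly waits until the CSS has finished.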

kazuho Feb 6, 2015

Member

@igrigorik
Thank you for the response. I will see if I can implement the one-shot (first-shot, to be more precise) priority escalation. Regarding the images, I do remember you mentioning the feature at the HTTP2 Conference in Tokyo at the end of last year. The discussion here is indeed a variation of that approach.

Also, as an aside... Given that HTML is often dynamic and takes some time to generate, whereas CSS/JS is typically static, I'm guessing that even if we set them at same priority.. a good fraction of CSS/JS bytes will still come in front of HTML due to the associated app server response time delays.

Sounds interesting. Such a feature can definitely be implemented within the reverse proxy.

As a note, redirects can be made faster by using server push. Proxies can check whether a Location: header is emitted and, if the URL in the header has the same authority, push the redirect target along with the 30x response.
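
The same-authority check could be sketched roughly as below. The function name, the dictionary of lowercase header names, and the exact set of status codes treated as redirects are assumptions made for illustration, not something H2O implements as of this PR.

```python
from urllib.parse import urljoin, urlsplit

# Assumption: which 30x codes count as redirects worth pushing for.
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def redirect_push_target(request_url, status, headers):
    """Return the URL to push alongside a redirect, or None."""
    if status not in REDIRECT_STATUSES:
        return None
    location = headers.get("location")
    if not location:
        return None
    target = urljoin(request_url, location)  # resolve relative redirects
    req, tgt = urlsplit(request_url), urlsplit(target)
    same_authority = (req.scheme, req.netloc.lower()) == (tgt.scheme, tgt.netloc.lower())
    return target if same_authority else None


print(redirect_push_target("https://example.com/old", 301, {"location": "/new"}))
```

A redirect to another host returns None, since a server may only push resources for which it is authoritative.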

tatsuhiro-t Feb 6, 2015

Contributor

Awesome work, @kazuho.
I'm interested in the reverse proxy use case, since we are also planning to add server push to nghttpx.
Reading the h2o source code, currently a couple of headers from the associated request, such as accept* and user-agent, are copied to the push request. I think there are cases where other headers like cookies and authorization affect resource retrieval. More than that, proxied servers are free to process requests based on arbitrary headers, so would it be safer to copy all headers?
referer can be updated to the associated URL.
Another concern is the accept header field. Browsers change the accept header field based on what they are requesting. For example, Firefox sends completely different accept header fields when getting HTML and CSS. I'm not sure how this variation affects the contents we get.
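
To illustrate what I mean, here is a rough sketch of building the push request headers from the associated request. It only models the behaviour as I read it (accept* and user-agent copied) plus the referer suggestion; the function name and the lowercase-keyed dictionary are assumptions, and the exact list h2o copies may differ.

```python
def build_push_request_headers(associated_headers, associated_url):
    """Copy a limited set of headers from the associated request."""
    copied = {
        name: value
        for name, value in associated_headers.items()
        if name.startswith("accept") or name == "user-agent"
    }
    # The pushed resource is requested on behalf of the associated page.
    copied["referer"] = associated_url
    return copied


request_headers = {
    "accept": "text/html",
    "accept-encoding": "gzip",
    "user-agent": "ExampleBrowser/1.0",
    "cookie": "session=abc",
}
push_headers = build_push_request_headers(request_headers, "https://example.com/")
print(push_headers)
```

Note that the copied accept value is the one sent for the HTML, not what the browser would have sent for a CSS file, which is the concern raised above.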

tatsuhiro-t Feb 6, 2015

Contributor

As for the header field used to indicate the resources to push, the Link header field might be a good candidate: http://www.chromium.org/spdy/link-headers-and-server-hint/link-rel-subresource
If rel=subresource, then it is a good target for push. We can invent a new rel value as well...

igrigorik Feb 6, 2015

As for the header field used to indicate the resources to push, the Link header field might be a good candidate: http://www.chromium.org/spdy/link-headers-and-server-hint/link-rel-subresource
If rel=subresource, then it is a good target for push. We can invent a new rel value as well...

FWIW, our plan is to retire subresource in favor of "preload", see: http://w3c.github.io/preload/

kazuho Feb 6, 2015

Member

@tatsuhiro-t Thank you for the comment. That is a good question.

The fact is, I have cowardly limited the headers to be copied for building push responses (see 812dec8 for an example). There are two reasons, described below.

  1. As briefly suggested in #137, I am going to improve the server-push logic so that it pushes resources only when they are unlikely to already exist in the client cache. To achieve that, the server needs to issue a conditional request internally, check that the response is not 304 Not Modified, and only then send the PUSH_PROMISE frame. At the same time, it is essential to send the PUSH_PROMISE frame before sending the contents of the resource that refers to the pushed resource (e.g. if we are to push a CSS file, the client should receive the PUSH_PROMISE frame prior to the <LINK REL="stylesheet"> tag), or we might waste bandwidth, as clients may issue another request for a resource that is being pushed. These two requirements limit the resources that can be pushed to those that are available instantly within the reverse proxy (e.g. static files or cached content within the proxy cache). As such resources are unlikely to be access-controlled, it seemed unnecessary to copy all the request headers when building a request for server push (in the case of H2O at the moment, statically served contents cannot be access-controlled).
  2. Should the pushed response vary depending on the value of a certain header (e.g. cookie), we need to include vary: cookie in the pushed response, which in turn means that the cookie header must be included in the PUSH_PROMISE frame being sent. However, client implementations might reject such server-pushed streams, as other requests in flight may change the value of the cookie header; clients are required to throw away the pushed response if the value of the cookie has changed by the time they actually need to use the resource in question. In other words, I thought that contents that are only conditionally retrievable had better not be pushed.

After reading your comments (esp. the lines regarding the accept header), I think it might be better to rewrite the vary headers of the server-pushed responses to cache-control: private (if vary exists) and also not send any headers in the PUSH_PROMISE frame. If we adjust the implementation as such, then we can certainly copy all the request headers when building an internal request to initiate server push (with the exception that the accept header may not be usable, as you pointed out).
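
A rough sketch of that header rewrite, to show the intent only: nothing like this exists in H2O yet, the function name is hypothetical, and headers are assumed to be a list of (name, value) pairs with lowercase names.

```python
def rewrite_pushed_response_headers(headers):
    """Replace vary with cache-control: private on a pushed response."""
    has_vary = any(name == "vary" for name, _ in headers)
    if not has_vary:
        return list(headers)
    rewritten = [(name, value) for name, value in headers if name != "vary"]
    # Added as a separate field; an existing cache-control field is left as is.
    rewritten.append(("cache-control", "private"))
    return rewritten


pushed = rewrite_pushed_response_headers(
    [("content-type", "text/css"), ("vary", "cookie")]
)
print(pushed)
```

Responses without a vary header pass through unchanged, so static files would keep their original caching behaviour.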

kazuho Feb 6, 2015

Member

@tatsuhiro-t

As for the header field used to indicate the resources to push, the Link header field might be a good candidate: http://www.chromium.org/spdy/link-headers-and-server-hint/link-rel-subresource
If rel=subresource, then it is a good target for push. We can invent a new rel value as well...

Sounds interesting. I am not excluding that possibility; however, for discovering the resources to be pushed, I lean toward using response headers (as implemented by this PR) or a mapping file for statically served contents.

In case of automatic discovery, it is important not to have false positives. We would never want to push a resource that would not be used. That means that we would need a sophisticated parser for discovering the resources (e.g. the parser that extracts the LINK tags should assert that they are not surrounded by <!-- -->), which in turn may mean that such parsers are slow, not to mention that we would need to implement such parsers for each type of resource (e.g. for HTML we need to extract valid LINK tags, for CSS we need to extract @import).

So I believe that for the short term it would be better to use headers or mapping files for specifying the resources that should be pushed. Users can write the mapping files by hand, or use a tool (likely written in a scripting language) to extract the URLs of the resources that need to be pushed (and then possibly adjust the list by hand).

EDIT: As an afterthought, if I were to implement such automatic discovery I would spawn an external filter that extracts the necessary resources for frequently served contents, and associate the results with the cache entry so that the associated contents can be pushed for future requests for the resource.
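
As an illustration of the kind of offline tool I have in mind, a minimal sketch using Python's standard html.parser might look like this. The class and function names are made up, only stylesheet LINK tags and SCRIPT tags with a src are collected, and the output would still need adjusting by hand.

```python
from html.parser import HTMLParser


class PushCandidateParser(HTMLParser):
    """Collect stylesheet and script URLs, ignoring commented-out markup."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link":
            rels = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rels and attrs.get("href"):
                self.urls.append(attrs["href"])
        elif tag == "script" and attrs.get("src"):
            self.urls.append(attrs["src"])


def extract_push_candidates(html):
    parser = PushCandidateParser()
    parser.feed(html)
    parser.close()
    return parser.urls


page = """<html><head>
<!-- <link rel="stylesheet" href="/old.css"> -->
<link rel="stylesheet" href="/assets/css/main.css">
<script src="/app.js"></script>
</head><body></body></html>"""
candidates = extract_push_candidates(page)
print(candidates)
```

The parser hands comments to a separate callback, so the commented-out LINK tag is never reported as a candidate.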

igrigorik Feb 6, 2015

@kazuho I'd start with basic Link support. I interpreted @tatsuhiro-t's comment as: instead of using a custom header name, use Link with rel=subresource. Except, instead of subresource, I think you should use rel=preload. E.g.:

200 OK ...
Link: </font.woff>; rel=preload; as=font

^ This tells you the resource that could/should be pushed and its type, which can help determine priority. For more, see: http://w3c.github.io/preload/#interoperability-with-http-link-header
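
A naive sketch of pulling preload targets out of such a header value, for experimentation only: the function name is made up, and splitting on commas and semicolons will break on URLs or quoted parameter values that contain those characters.

```python
def parse_preload_links(value):
    """Return [(url, params)] for each rel=preload entry in a Link header."""
    results = []
    for entry in value.split(","):
        parts = [part.strip() for part in entry.split(";")]
        target = parts[0]
        if not (target.startswith("<") and target.endswith(">")):
            continue  # not a well-formed <URL> target
        params = {}
        for part in parts[1:]:
            name, _, param_value = part.partition("=")
            params[name.strip().lower()] = param_value.strip().strip('"')
        if "preload" in params.get("rel", "").lower().split():
            results.append((target[1:-1], params))
    return results


links = parse_preload_links("</font.woff>; rel=preload; as=font")
print(links)
```

The returned parameters include the as value, which a server could use as a hint when choosing the priority of the pushed stream.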

kazuho Feb 6, 2015

Member

@igrigorik Thank you for pointing that out. It is clear that I did not read @tatsuhiro-t's comment carefully enough. My apologies.

tatsuhiro-t Feb 7, 2015

Contributor

Yeah, I was a bit short of words; I meant the Link header field, not the link element in HTML.
As for cookies and authorization, I have to read the preload spec, but it could be safer to just omit them for now, since the Chromium document also mentions this. Hopefully we'll gain more experience with this new technology this year and find out what is the best we can do.

kazuho Feb 9, 2015

Member

FYI, as of 85f4471 H2O recognizes Link: <URL>; rel=preload headers and pushes the contents referred to by the URLs.

igrigorik Feb 9, 2015

@kazuho \o/ ... woot! Time to run some experiments...

tatsuhiro-t Feb 9, 2015

Contributor

Great! https://nghttp2.org has also enabled server push using the Link header field, so we suddenly have two implementations using the preload relation, which is very exciting.
