Preload and Server Push #38

Closed
mnot opened this Issue Dec 3, 2015 · 17 comments

@mnot
Member

mnot commented Dec 3, 2015

More of a heads-up than an issue --

It looks like Link: <...>; rel=preload is getting traction as the way to indicate to a server (whether local or remote to the content, such as with a CDN) that it should push the indicated assets.

See:

I think this is the right thing to do based upon the semantics (although my first instinct was prefetch, and it appears to have been that of Jake as well: https://youtu.be/CSjL1lrNAx4?t=4m29s)

So, if you don't want preload to be used in this fashion, I'd suggest jumping up and down about it pretty soon.
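For concreteness, the header pattern under discussion can be composed like this (a sketch only; the asset URLs are invented for the example):

```python
# Illustrative only: Link response headers an origin might emit so that a
# push-capable front-end or CDN can push the named assets.
preload_links = [
    "</css/app.css>; rel=preload; as=style",
    "</js/app.js>; rel=preload; as=script",
]

# Multiple Link values may also be folded into one comma-separated header.
link_header = ", ".join(preload_links)
print(link_header)
```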


@igrigorik
Member

igrigorik commented Dec 4, 2015

Yep. I've been working with @kazuho and @icing on their implementations. Preload is the right directive, not prefetch: prefetch is an optional resource for a future navigation; preload is a required resource for the current navigation.

We should probably add a section to the spec on Link: rel=preload + server push: best practices, interop, etc.

@mnot
Member

mnot commented Dec 4, 2015

OK, good. Will bring it to Jake's attention.

@yoavweiss
Collaborator

yoavweiss commented Jan 28, 2016

Also, since the question keeps coming up, we should probably add a section that explains the differences between preload and push.

@yoavweiss
Collaborator

yoavweiss commented Mar 4, 2016

Early experimentation with the preload implementation made me realize that push and preload in fact have rather different semantics, and I'm starting to doubt that using rel=preload as a push directive makes sense:

  1. One of the main advantages of push is that it enables the server to start sending content down while it is still preparing its response. The application server cannot send down the headers, as it does not know the status of the response until the response is at least partially prepared to be shipped. Therefore the Web server or proxy cannot use the Link headers to start pushing the content early.
  2. Even if we ignore that, the resources you want to push are not necessarily the resources you want to preload. Markup-based critical resources (external CSS, blocking JS) are definitely something you want to push, if possible, before the HTML starts to be sent down. But they are not resources you want to send rel=preload directives for, because it's not necessary: the time difference between the browser acting on these headers and the natural discovery of these resources by the preloader is most probably tiny.

I'm afraid that tying these two semantics together would be a mistake and would result in pages that add Link: rel=preload cruft that helps neither the server nor the browser make things faster.

A Link: rel=push would mitigate this to some extent (the cruft would be different, and the browser would be free to ignore it), but I'm not sure it would help either, due to (1).

@yoavweiss yoavweiss reopened this Mar 4, 2016

@mnot
Member

mnot commented Mar 4, 2016

The first problem seems like it would be shared by any header-based approach. I agree that it's a problem, but I think there's still value in a header-based approach.

I'm struggling to understand why the second is a problem. It's true, but it doesn't actually cause any issues AFAICT.

@kazuho

kazuho commented Mar 6, 2016

  1. One of the main advantages of push is that it enables the server to start sending content down while it is still preparing its response. The application server cannot send down the headers, as it does not know the status of the response until the response is at least partially prepared to be shipped. Therefore the Web server or proxy cannot use the Link headers to start pushing the content early.

FWIW, application servers can theoretically send a 100 (Continue) response with Link: rel=preload headers, then start processing the request and send the final response. And I believe that is the way we should go.
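On the wire, that idea would look roughly like this for HTTP/1.1 (a hedged sketch; the header values and body are invented for illustration, and interop caveats apply):

```python
# Sketch of an interim 100 response carrying a preload Link header,
# sent before the final response is ready. Purely illustrative.
interim = (
    b"HTTP/1.1 100 Continue\r\n"
    b"Link: </css/app.css>; rel=preload; as=style\r\n"
    b"\r\n"
)
final = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
# A front-end that understood this convention could start pushing
# /css/app.css as soon as it saw the interim response, without waiting
# for the application to finish producing the final response.
wire = interim + final
```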

@yoavweiss
Collaborator

yoavweiss commented Mar 7, 2016

  I'm struggling to understand why the second is a problem. It's true, but it doesn't actually cause any issues AFAICT.

I guess that beyond just confusion between the two, the scenario that scares me the most is the "preload critical BG image" scenario, where you want that critical image to be discovered early, but you don't want it prioritized over more critical resources such as CSS and JS. The browser can be smart about it and send out that request with the right priority at the right time.

Can the push server do the same? Maybe, by parsing as, translating it to a priority in some way, and delaying push for low-priority resources.

Can we rely on all of them doing so? I'm not sure we can. And if we can't, we may have compat issues that result in slowdowns when certain servers are used, which may deter people from using preload altogether.

  FWIW, application servers can theoretically send a 100 (Continue) response with Link: rel=preload headers, then start processing the request and send the final response. And I believe that is the way we should go.

That's a great idea! :) I always thought of 100 Continue in terms of POSTs, but it's true that it can apply here as well.

Why did you say "theoretically", though? Are you aware of issues this would raise in practice?
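The "translate as to a priority" idea could be sketched as follows (the attribute-to-priority mapping here is invented for illustration, not taken from any implementation):

```python
# Hypothetical mapping a push-capable server might use to decide how
# urgently to push a preloaded resource, based on its `as` attribute.
AS_TO_PRIORITY = {
    "style": 0,   # highest: render-blocking
    "script": 1,
    "font": 2,
    "image": 3,   # e.g. the critical BG image: discover early, push late
}

def push_priority(as_value: str) -> int:
    """Lower number = push sooner; unknown types are pushed last."""
    return AS_TO_PRIORITY.get(as_value, 4)
```

A server applying something like this could still push the CSS before the background image even though both carry rel=preload.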

@icing

icing commented Mar 7, 2016

Not convinced. 100-continue handling is broken in many older implementations, probably more so if a "100 Continue" appears without an Expect: header, and HTTP/2 does not have it. So it cannot be transported through h1/h2 gateways either. Often, the first h1/h2 proxy will do the push, but it does not have to be like that.

I agree that preload and push have overlapping, but not identical, semantics. The extra parameter nopush may help application developers fine-tune these cases, but it seems hackish. I feel that instead of enhancing the protocol, we are building a mechanism to allow web developers to peek and poke the system.

Maybe that is a good thing to have. But it should not be the only thing. Because ultimately, if one really, really wants to optimize, the client implementation will enter the equation. An abstraction of these are the device hints that have been proposed. (Which was once thought to be addressed by CSS media types, if I understand that correctly.) So, more peek and poke via HTTP headers.

A different approach would be to PUSH to the client a resource that contains a meta description of the page/site: which resources depend on which others, and of what type, so the client can immediately start requesting the resources in the order/priority that best fits its needs/network situation.

Such a description does not need to be written manually; I would guess that a browser could be made to spit it out very easily. Copy it, place it on the server, configure it to be pushed.

@mnot
Member

mnot commented Mar 8, 2016

+1 to @icing - we shouldn't be doing anything to promote 100-continue use on the open Web; the interop problems are nasty.

To be clear -- if server / intermediary implementers can agree to move from overloading preload to a separate push link relation, I have no problem with that; my concern here is mostly that without that buy-in, interop will suffer.

Description resources are interesting (and I'd love to spend some time on them; heck, what's another tilt at the windmill? :) but I think that a header-based approach is also good to have, considering its simplicity and familiarity, even despite its limitations.

@kazuho

kazuho commented Mar 11, 2016

@yoavweiss

  Why did you say "theoretically" though? Are you aware of issues this would raise in practice?

It is because of interoperability issues when used on the open Web, as pointed out by @icing and @mnot.

But IMO that does not preclude a reverse proxy (or an edge server) from recognizing 100-continue + Link: as a signal to trigger H2 push, since in such deployments web-application developers know for certain that they will be terminated by the reverse proxy.

@icing

  A different approach would be to PUSH to the client a resource that contains a meta description of the page/site: which resources depend on which others, and of what type, so the client can immediately start requesting the resources in the order/priority that best fits its needs/network situation.

With ServiceWorkers you can do that today. Generating a SW script (or a configuration file for a SW script) based on real access patterns sounds very interesting.

@igrigorik
Member

igrigorik commented Mar 22, 2016

  One of the main advantages of push is that it enables the server to start sending content down while it is still preparing its response. The application server cannot send down the headers, as it does not know the status of the response until the response is at least partially prepared to be shipped. Therefore the Web server or proxy cannot use the Link headers to start pushing the content early.

I agree with @mnot, this is a general limitation of any header-based approach. Whether we use rel=preload or rel=push makes no difference. That said, and FWIW, some optimization products "work around this" by immediately sending a 200 with a synthesized head; PageSpeed Service and others do this, and while this approach is not without its issues, it does work.

  Even if we ignore that, the resources you want to push are not necessarily the resources you want to preload. Markup-based critical resources (external CSS, blocking JS) are definitely something you want to push, if possible, before the HTML starts to be sent down. But they are not resources you want to send rel=preload directives for, because it's not necessary...

Why not? rel=preload also enables intermediaries (e.g. caches) to start an early fetch to the edge -- that alone can be a big win. If the intermediary is push-capable, it can also initiate push. If you want to opt out of the latter, you can add nopush.

Also, as I noted in #54 (comment), rel=push comes with its own set of problems: duplication, h2-only, etc. That's not to say the rel=preload approach is perfect, and we may indeed want to investigate another, more flexible mechanism, but as far as header-based mechanisms go, I do think it (rel=preload with optional nopush) offers a better story than any of the other options.

I'm inclined to close this. WDYT?
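The nopush opt-out could work roughly like this on the server side (a sketch with a deliberately naive Link-header parser, assuming well-formed, unquoted parameters):

```python
def push_candidates(link_header: str) -> list:
    """Return target URLs of rel=preload links that do not carry nopush.

    Naive split-based parsing for illustration; a real server would use
    a proper Link-header parser (RFC 8288 handles quoting, multiple
    rel values, etc.).
    """
    targets = []
    for link in link_header.split(","):
        parts = [p.strip() for p in link.split(";")]
        if not parts or not parts[0].startswith("<"):
            continue  # skip malformed entries
        target = parts[0].strip("<>")
        params = [p.lower() for p in parts[1:]]
        if "rel=preload" in params and "nopush" not in params:
            targets.append(target)
    return targets
```

Here the first resource would be both preloaded and pushed, while the second would only be preloaded by the browser:

```python
push_candidates("</a.css>; rel=preload; as=style, </b.js>; rel=preload; nopush")
```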

@martinthomson
Member

martinthomson commented Mar 22, 2016

I think that fixating on the inadequacies of a header-based approach neglects to recognize that, when headers are available, they can be of use. And rel=preload is a good fit. It's almost as though some of the comments have assumed that there isn't even the possibility of another method for signaling the desire for push.

I agree with @igrigorik; it looks like there is a valuable signal in rel=preload.

@reschke

reschke commented Mar 23, 2016

Just to clarify: what has been problematic in HTTP/1.1 was the expect-continue mechanism, not non-final status codes in general. Those still exist in HTTP/2, so if we wanted to use this, we could define a new status code, such as "104 Early Metadata".

@icing

icing commented Mar 23, 2016

I agree with @igrigorik that preload + nopush is good enough for a header-based approach.

To reiterate: the problem raised by @yoavweiss is the missed timing opportunity inside the server/intermediary. The server/intermediary has identified the requested resource (file, cgi, php, backend proxy) and triggered processing. It could send PUSHes right away, but needs to wait for the resource processing to produce headers, i.e. the response. For each hop in the request chain, the latency is added on top of that "missed" time.

An intermediate 104 Early Metadata is another way of PUSHing a metadata resource. But we have PUSH, and it is connected to a client-initiated stream, so for h2 it is way simpler to use that. To make it clear in the push promise what is being pushed, the promise should be linked:

  :path: <url of meta data resources>
  :method: GET
  ...
  link: <initiated resource url>; rel="dependency-for"
  content-type: application/json-xml

  [resource dependency graph]

How such a push would be triggered is really up to the implementation, but it is no longer intertwined with other parts of the response.
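The [resource dependency graph] body could, for example, be a small JSON document along these lines (the format and the rel="dependency-for" convention are hypothetical; nothing like this is specified anywhere):

```python
import json

# Hypothetical payload for the pushed meta-description resource:
# each entry names a resource, its type, and what it depends on.
graph = {
    "/index.html": {"as": "document", "depends_on": ["/css/app.css", "/js/app.js"]},
    "/css/app.css": {"as": "style", "depends_on": ["/fonts/body.woff2"]},
    "/js/app.js": {"as": "script", "depends_on": []},
}
payload = json.dumps(graph, indent=2)
```

A client receiving this could walk the graph and request resources in whatever order best fits its own network situation, which is the point of the proposal.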

@igrigorik
Member

igrigorik commented Mar 23, 2016

Discussed this on the webperf WG call with @yoavweiss and we agreed to close this. The main concern is over the limitations of a header-based approach, but as we noted here, that does not preclude the need for a header-based mechanism; we may want to explore other approaches elsewhere.

@kazuho

kazuho commented Apr 6, 2016

Regarding the use of a 1xx response for sending Link: rel=preload, let me note that credit for the idea goes to @tatsuhiro-t, before we discuss the possibility further in various places.
