
Pushing resources (HTTP/2) #1

Open · unbit opened this issue Sep 21, 2014 · 10 comments

Comments

@unbit commented Sep 21, 2014

HTTP/2 (and SPDY) is a fully bi-directional protocol in which both the client and the server can voluntarily initiate a new stream. This allows the server to proactively send resources to the client without waiting for them to be requested.

Example:

the resource /foo.html requires /one.css and /two.js to render a page.

In the standard HTTP/1.x flow, the client first asks for /foo.html, parses it, and then asks for each resource required for the full rendering.

In HTTP/2 the server is allowed to directly send /foo.html, /one.css and /two.js in response to the single /foo.html request.

My proposal for implementing it at the WSGI level is very similar to what we already do with static resources, where there are two approaches:

The WSGI-centric wsgi.file_wrapper: a callable exposed via the environ that lets the application use a low-level accelerator provided by the server.

The web-server-centric X-Sendfile header: the app simply sets the X-Sendfile response header, which is trapped by the web server/router/gateway/proxy, and that component then transfers the file.

For pushing I propose:

wsgi.push (or wsgi.push_wrapper): a callable (like wsgi.file_wrapper) taking a dictionary of this form:

push_dict = {
    '/one.css': [ '200 OK', [('Content-Type', 'text/css')], _iterable_],
    '/two.js': [ '200 OK', [('Content-Type', 'application/javascript')], _iterable_],
}

that you will send with

environ['wsgi.push'](push_dict)
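
A minimal end-to-end sketch of how an application might use this, assuming a server that exposes the proposed key (the key name and calling convention are this proposal's, not an existing standard):

    # Sketch only: 'wsgi.push' is the hook proposed above, not a standard key.
    def app(environ, start_response):
        push = environ.get('wsgi.push')
        if push is not None:  # server supports push and the client allows it
            push({
                '/one.css': ['200 OK', [('Content-Type', 'text/css')],
                             [b'body { color: #333 }']],
                '/two.js': ['200 OK', [('Content-Type', 'application/javascript')],
                            [b'console.log("pushed");']],
            })
        start_response('200 OK', [('Content-Type', 'text/html')])
        return [b'<html>...</html>']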

Alternatively, the 'X-Associated-Content' header (already working for Apache mod_spdy and proposed for nginx) would work like X-Sendfile, with the frontend/proxy/router server catching it and sending the resource content to the client.
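
A sketch of the header-based variant; the value syntax for X-Associated-Content is gateway-specific, so the quoted URL list below is only illustrative:

    # Sketch only: the front-end (e.g. mod_spdy) is expected to strip this
    # header and perform the push itself; the exact value syntax is
    # gateway-specific and shown here purely for illustration.
    def app(environ, start_response):
        headers = [
            ('Content-Type', 'text/html'),
            ('X-Associated-Content', '"/one.css", "/two.js"'),
        ]
        start_response('200 OK', headers)
        return [b'<html>...</html>']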

@rbtcollins (Contributor)

Let's see: the requirement for HTTP/2 push is that the headers that would be used to request the object are supplied by the server - check, both approaches handle that. And they'd handle any simple bytestream.

There are a few curly bits though.

Firstly, the use of PUSH_PROMISE can be disabled by the client, so if we are creating iterables in our main context, folk will have to be diligent about making them lazy, or servers can be fairly trivially DoS'd - connect, disable PUSH, request the main page, cancel the stream.
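
One way to stay lazy is to hand the server a generator, so that nothing expensive runs unless the stream is actually written; a sketch, assuming the wsgi.push form above (render_css is a hypothetical expensive call):

    # Sketch only: the generator body runs when the server iterates it, so a
    # client that has disabled PUSH (or cancels the stream) costs us nothing.
    def lazy_css():
        yield render_css()  # hypothetical expensive rendering call

    push = environ.get('wsgi.push')
    if push is not None:
        push({'/one.css': ['200 OK', [('Content-Type', 'text/css')], lazy_css()]})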

Secondly, multiprocess servers may benefit substantially from executing the computation and/or data handling for additionally pushed requests in separate Python processes, so I'm leaning fairly strongly towards the X-Associated-Content style approach. A possible question there: should we use the header, or provide an API one can call? An API could act immediately, whereas a header has to wait for the response to be handed back. Also, an API could be used by a server to generate the X-Associated-Content header automatically (when the server is itself behind nginx/mod_spdy/whatever).

So how about:

environ['wsgi.associated_content'].extend(urls)

[and that can be either a plain list or something smarter if the server wants to do stuff immediately].
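
A rough server-side sketch of the "something smarter" option, where extend() notifies the server immediately (the callback name is invented here):

    # Sketch only: looks like a plain list to the application, but lets the
    # server emit PUSH_PROMISE frames as soon as URLs are added.
    class AssociatedContent(list):
        def __init__(self, on_add):
            super().__init__()
            self._on_add = on_add  # hypothetical server callback

        def extend(self, urls):
            urls = list(urls)
            super().extend(urls)
            self._on_add(urls)

    # environ['wsgi.associated_content'] = AssociatedContent(server.promise)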

@Lukasa (Contributor) commented Sep 21, 2014

Separate processes for pushed responses could be really valuable. Being able to push dynamic content isn't necessarily going to be a common use-case, but it's a possible one.

Also remember that pushed responses can be cancelled by the client midway through even if they allow PUSH_PROMISE. Another good reason to lean on associated_content.

rbtcollins added a commit that referenced this issue Sep 26, 2014
@rbtcollins (Contributor)

I've pushed prose matching this discussion into the draft; closing as such.

@pjeby commented Sep 26, 2014

How does this affect middleware?

The reason the WSGI file wrapper is part of the response is so that it will cleanly fail when passed through oblivious middleware. IIUC, this proposal is broken in that regard: to prevent the associated content from being pushed or to change it, every piece of middleware between the true server and the true app must understand and intercept this value -- a non-starter from a compatibility POV.

This should be implemented by attaching data to the response body, so that it doesn't reach the server in the event a piece of middleware replaces the response body.

@rbtcollins (Contributor)

I don't follow that logic. Oblivious middleware shouldn't stop features working - I don't see how attaching the associated_content object to the response object makes this any better or worse, other than making it totally stop working if someone replaces the response object and doesn't think about this.

CONTENT_LENGTH isn't attached to the response body, nor is CONTENT_TYPE, but both of those need to be altered when transforming the response body.

Middleware might want to do several things here.

A) response body changing.
A1) it might want to transform the response body and add references to new objects which should be pushed to the client. E.g. it might be injecting a JS library into the HTML header section, which needs to be pushed at high priority to prevent blocking browser rendering. In this case, the middleware wants to add to the associated content, and it can do so by calling this API, and has no need to layer on top of the API.

A2) it might want to transform the response body and remove references, e.g. it might be stripping out legacy content when it detects an HTML5 capable browser. [we'll ignore that those old browsers don't support HTTP/2 :)]. In this case it needs to intercept the API:
    remove = set()                 # URLs whose references the body transform strips
    associated = {}
    old_associated = environ['wsgi.associated_content']
    def new_associated(url, priority=None):
        associated[url] = priority
    environ['wsgi.associated_content'] = new_associated
    response = child_app(environ)
    ...
    to_associate = set(associated) - remove
    for url in to_associate:
        old_associated(url, associated[url])

A more opinionated middleware could wait for one block before passing on.

B) it might want to disable all push - which is trivial, just put a no-op in the environ.
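
For instance, with the callable form of the API sketched in A2, the no-op is just:

    # Sketch only: swallow every push request so nothing is promised downstream.
    environ['wsgi.associated_content'] = lambda url, priority=None: None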

The consequence of getting a push promise wrong is that content which the user already had access to, but didn't need, will get put on the wire at a lower priority than the main object, and will be cancelled as soon as the browser realises it doesn't need the content. So it's preferable to have the push happen than not.

@pjeby commented Sep 27, 2014

You missed scenario C: the middleware creates an entirely new response, incorporating information extracted from the wrapped app's response -- or better yet, information extracted from multiple subrequests. (Or perhaps, as the result of trapping an error from the wrapped app, after it's already invoked this API!)

In the event that the middleware doesn't know about this extension API, the result will be data corruption, because each of the subrequests will send annotation information to the server, which will not be able to tell that those were separate (and discarded) subrequests.

This kind of thing is covered in PEP 333 and 3333 under http://legacy.python.org/dev/peps/pep-0333/#server-extension-apis -- any extension APIs that replace or augment the request or response have to either verify the context, or directly augment the feature in question. That is, an API that augments the request must take the request environ as a parameter so it can be verified to be unchanged. And an API that augments the response must either augment the response body directly, or take start_response as a parameter so the server API implementation can verify that it's the same start_response the server sent through.

IOW, this isn't something specific to the use cases for associated content, it's a general problem with defining middleware-safe extensions to the core protocol. You could say, "well, in the new spec, this is a core protocol, so all middleware has to support it intelligently", but then you have the problem of not being interoperable with WSGI 1 middleware, without a more sophisticated converter... and there, you need the sophisticated converter just to successfully stop it from working, because otherwise scenario C will still break.

However, if you follow the core extension practices built into the original WSGI design, then it's okay: you can have stuff that passes through oblivious WSGI 1 middleware safely -- either leaving it alone in the case where the middleware doesn't alter the response, or correctly leaving it out, in the case where the middleware constructs a response of its own.

And of course, middleware authors' lives are a lot easier if they don't have to think about extensions they aren't directly touching.

@rbtcollins (Contributor)

So, in case C one should pass a null lambda to the contained applications. Server push is a core component of HTTP/2, not an optional add-on: it is something that folk wishing to write HTTP/2 middleware need to be aware of. It's as fundamental to the protocol's end-user performance as correct handling of Accept-Encoding is in HTTP/1.1.

The plan as I see it for WSGI 1 is that we're going to write two converters as part of the redesign process: one that is WSGI2 on top and WSGI1 on the bottom, and one that is WSGI1 on top and WSGI2 on the bottom. The former gets you complete compatibility with PEP-3333 apps, and the latter the same with PEP-3333 servers. The former will handle associated_content by looking for the X-Associated-Content header and translating and stripping it; the latter will handle it by translating calls to it into X-Associated-Content headers.

That said, I will admit to not entirely understanding the advice in http://legacy.python.org/dev/peps/pep-0333/#server-extension-apis - functions that operate on environ don't need to be middleware, so why would something that needs to be middleware be implemented as functions that operate on environ? I think the point it's trying to get across is that the environ dict isn't 'the environ dict' - it is 'the environ dict at this layer in the middleware stack': it may differ from layer to layer, and you cannot assume it is unchanged. On the other hand, middleware is by definition in the middle, and all it needs to be aware of is the environ that it received (and which its server provided and expects a result based on), and the environ that it hands off to its contained app(s), which it may have modified as it sees fit. IIUC the key issue is that middleware may have bugs where it e.g. modifies start_response but does so inconsistently, not unwrapping it in all cases (e.g. the error case), and so a defensive approach is recommended? Perhaps we should put some code examples together to demonstrate the advice.

@Lukasa (Contributor) commented Sep 27, 2014

The relevant section seems to be this:

So, to provide maximum compatibility, servers and gateways that provide extension APIs that replace some WSGI functionality, must design those APIs so that they are invoked using the portion of the API that they replace.

My concern is that server push doesn't replace any portion of the API at all. It extends the API in a brand new direction. I simply don't believe this section of the old PEP provides us any meaningful guidance at all.

I do accept the problem that oblivious middleware can remove a response and in so doing remove the semantic association between the response and the pushed resources. That's certainly unpleasant. However, almost any other approach has the risk of doing exactly the same thing. Fundamentally, no WSGI 1.X middlewares will have been designed with an eye towards the fact that HTTP/2 applications may want to push resources.

Any feature we add to the API is either a) an extension of which these middlewares are unaware, causing the problem discussed; or b) an overloading of a feature of the API of which they are aware, which allows for middlewares to accidentally prevent or cause resource pushing.

I'm interested in a concrete proposal of how we'd overload the response body to do resource pushing, but without seeing it I don't believe our approach is worse than the alternative.

@pjeby commented Sep 27, 2014

That said, I will admit to not entirely understanding the advice in http://legacy.python.org/dev/peps/pep-0333/#server-extension-apis

Perhaps it will help if I translate it in terms of HTTP, rather than Python. The goal of this part of the WSGI spec is to ensure transport encapsulation. That is, to prevent there being any side-channel that can bypass middleware.

If you think of middleware as a proxy or gateway server, with the app as an origin server, then putting server extensions into the environ is roughly equivalent to a web browser adding an X-Contact-Me-Here: myip:port header to its request, which the app then uses to bypass the gateway or proxy and communicate directly with the browser!

This is a bad design, because presumably the proxy/gateway is there for a reason. Sure, the proxy could remove the header, if it knows it exists. But in the general case, this is a bad design: all communication with the browser should be through the proxy or gateway, as you otherwise run the risk of errors, security holes, and other "unpredictable results".

So, the intent of the spec is that all communication with the browser should pass through the middleware in a visible way. And anything the middleware doesn't pass on, should not be passed on. Middleware-bypassing APIs break the invariant that:

If you don't pass through start_response and the body iterator, then no trace of that subresponse should be received by the client.

almost any other approach has the risk of doing exactly the same thing.

Not necessarily. Putting information in a header doesn't have this problem, nor does attaching it to the response body.

Even the problem of needing to pass arbitrary objects through WSGI 1 middleware can be solved by having a WSGI 2 (required) extension API that simply registers objects and returns unique string keys, that are valid for the duration of the originating request. These strings can then be used in X-WSGI2-* headers that are interpreted and stripped by the WSGI 2 server implementation. And WSGI 2 middleware could retrieve or replace the objects via the same API as it manipulates headers... but if the middleware creates a new response with its own headers (rather than passing through the same or altered headers) then the extra information will be silently (and correctly) discarded.
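
A rough sketch of such a registry; every name here (PushRegistry, wsgi2.register_object, X-WSGI2-Push) is invented for illustration and is not part of any spec:

    import uuid

    # Sketch only: keys are valid for the lifetime of the originating request
    # and are resolved, then stripped, by the server before anything hits the wire.
    class PushRegistry:
        def __init__(self):
            self._objects = {}

        def register(self, obj):
            key = uuid.uuid4().hex
            self._objects[key] = obj
            return key

        def resolve(self, key):
            return self._objects.pop(key, None)

    # Application side (hypothetical names):
    #   key = environ['wsgi2.register_object'](pushable)
    #   response_headers.append(('X-WSGI2-Push', key))
    # Middleware that builds a brand-new response simply never copies the
    # header, so the registered object is correctly discarded.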

(This is only one possible approach, of course.)

IIUC the key issue is that middleware may have bugs where it e.g. modifies start_response but does so inconsistently, not unwrapping it in all cases (e.g. the error case), and so a defensive approach is recommended?

No, the issue is that if middleware replaces start_response in the environ to make a sub-request, and then (for whatever reason) decides not to return the sub-request's response as its response, then having a server extension for this feature means data leaks from subrequest responses to the parent response.

One example of a trivial fix for this problem (in the WSGI 1 context) would be to add such APIs as attributes on start_response (e.g. start_response.set_associated_content()), so that replacing start_response automatically drops any extensions that would bypass the middleware. (An idea I confess I didn't think of at the time -- function attributes were still a relatively new concept back then!)
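
A sketch of that idea; set_associated_content and the server_conn object are placeholders invented here:

    # Sketch only: hanging the extension off start_response means any middleware
    # that substitutes its own start_response drops the extension automatically,
    # instead of leaking sub-request pushes into the parent response.
    def make_start_response(server_conn):
        def start_response(status, headers, exc_info=None):
            return server_conn.begin_response(status, headers, exc_info)

        def set_associated_content(urls):  # hypothetical extension name
            server_conn.promise(urls)

        start_response.set_associated_content = set_associated_content
        return start_response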

Of course, if we're getting rid of start_response, this approach wouldn't work. But it illustrates the point: by design, WSGI intends that middleware should be able to clone environ to make subrequests that inherit data by default from a parent request -- without this resulting in the child responses becoming part of the parent response, except by the middleware's explicit choice to pass that data on.

The former will handle associated_content by looking for the X-Associated-Content header and translating and stripping it; the latter will handle it by translating calls to it into X-Associated-Content headers.

Then why not just use X-Associated-Content in the first place? What's the advantage in having more than One Obvious Way To Do It? If you can already encode this feature within WSGI 1, what advantage does middleware gain from the data being passed in a way that it has to explicitly intercept by reimplementing an API, instead of just passively altering data in transit?

Essentially, that's the design principle applied in the original spec: wherever possible, middleware shouldn't need to reimplement server APIs just to read or alter the request or response. It especially shouldn't need to reimplement (or remove) a server API in order to not pass on a response.

@rbtcollins (Contributor)

Interesting. So today, if you copy() environ and replace start_response, there's nothing in the PEP that will lead to a leak, but there is in common server implementations (such as those exposing the raw socket). So implementors haven't been following this principle :).

I like the function attribute idea.

rbtcollins reopened this Sep 27, 2014