Trailers are not currently supported #9

Open
rbtcollins opened this issue Sep 26, 2014 · 15 comments

Comments

@rbtcollins
Contributor

HTTP/1.1 and 2 define 'trailers' - regular headers but allowed after the body content of a response. WSGI does not provide a way to provide those today.

@Lukasa
Contributor

Lukasa commented Sep 27, 2014

Trailers are poorly supported by many client implementations. httplib, for example, has no support for them. I wouldn't consider this a high priority work item.

@rbtcollins
Contributor Author

That's true, but I posit that the majority of UAs out there by frequency are browsers, not httplib. Trailer handling is a fundamental aspect of HTTP/1.1 and /2, and without it many use cases are impossible without buffering, which the WSGI spec prohibits. Fixing this in the protocol does not imply fixing client implementations: if they are ignoring trailers today, they are already broken should a server send trailers.

@Lukasa
Contributor

Lukasa commented Sep 27, 2014

In principle that's true. In practice, chunked encoding (the primary case for trailers) is sufficiently rare that, again, I don't believe this to be particularly common.

I have no objection to supporting trailers, but as I said before, it's just not something I'd be too worried about supporting.

@legitparty

Every HTTP implementation I have ever written has supported trailers. It isn't hard. It's the same code as supporting the headers!

@Lukasa
Contributor

Lukasa commented Aug 27, 2016

@legitparty The parsing code is the same, sure, but the problem with trailers is the way they affect the state machine.

Specifically, implementations (particularly older ones) had no expectation that header fields would arrive after the completion of a body. That requires an entire extra new branch of code to understand that the trailers are part of the previous request and to handle them appropriately. It turns out that that's not commonly done.
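To make the extra branch concrete, here is a minimal sketch of a decoder for a complete chunked body. It is an illustration only, not code from any client mentioned in this thread, and the function name `decode_chunked` is made up for the example. The second loop is the part that older state machines tend to lack: header-style fields arriving after the body has already ended.

```python
def decode_chunked(data):
    """Parse a complete chunked message body; return (body, trailers)."""
    body = bytearray()
    pos = 0
    # State 1: read chunks until the zero-length last chunk.
    while True:
        line_end = data.index(b"\r\n", pos)
        # Chunk size is hexadecimal; anything after ';' is a chunk extension.
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF

    # State 2 (the "extra branch"): field lines may follow the last chunk.
    trailers = []
    while True:
        line_end = data.index(b"\r\n", pos)
        line = data[pos:line_end]
        pos = line_end + 2
        if not line:  # an empty line ends the trailer section
            break
        name, _, value = line.partition(b":")
        trailers.append((name.decode("ascii"), value.strip().decode("latin-1")))
    return bytes(body), trailers
```

A parser that stops at the `0` chunk and treats whatever follows as the start of the next message is the failure mode described above.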

@legitparty

See 3.6.1 of RFC 2616.

You don't send trailers unless you get "TE: trailers", and you only send headers that are non-essential. A server that is sending trailers to clients that don't support them is broken.
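As a rough sketch of that rule, assuming a server that already has the request's TE header value and the body chunks in hand: the helper names `client_accepts_trailers` and `encode_chunked` are invented for this example, and the trailer field name used in the usage note is a placeholder.

```python
def client_accepts_trailers(te_header):
    """Return True if the TE request header value lists 'trailers'."""
    if not te_header:
        return False
    # TE is a comma-separated list; entries may carry parameters after ';'.
    codings = [item.split(";")[0].strip().lower() for item in te_header.split(",")]
    return "trailers" in codings


def encode_chunked(chunks, trailers=()):
    """Frame body chunks with chunked transfer coding, then optional trailer fields."""
    out = []
    for chunk in chunks:
        if chunk:  # a zero-length chunk would terminate the body early
            out.append(b"%x\r\n" % len(chunk) + chunk + b"\r\n")
    out.append(b"0\r\n")  # last-chunk marker
    for name, value in trailers:
        out.append(name.encode("ascii") + b": " + value.encode("latin-1") + b"\r\n")
    out.append(b"\r\n")  # blank line ends the trailer section
    return b"".join(out)
```

A server following this approach would call `encode_chunked(chunks, trailers)` only when `client_accepts_trailers(te_value)` is true, and `encode_chunked(chunks)` otherwise. It would normally also announce the field names up front in a `Trailer` response header.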

Frameworks that don't support them prevent applications from using them for any valid reason, and they exist for a variety of reasons. I am not going to advocate for reasons. If the spec is to be deviated from, I think a better argument needs to be made against them.

I have implemented trailers 4 times in 3 languages, with both threads and events. I implemented instance digests with them twice. I don't see what the issue is. The issue that I see is that I am not able to port my old code to WSGI without ripping out features that existed long before WSGI existed.

@sigmavirus24

@legitparty let me assure you that those of us involved in this discussion know very well how trailers work. You don't need to prove your own knowledge of the subject matter or continue to refer to your implementations (although I'd be unsurprised if you brought it up again). The comments above are merely pointing out that the clients common to the Python ecosystem (although I believe there are other languages with similarly poor support) do not handle trailers.

Also @legitparty, RFC 2616 is dead. You should be referencing the appropriate sections of RFC 7230, 7231, 7232, 7233, 7234, or 7235.

@legitparty

Thank you for clarifying that it isn't about the implementation of a protocol, but integration with non-compliant tools in a limited "ecosystem".

And 7230 says the same thing in 4.1.2. And actually, it specifically addresses the post-processing status and integrity concerns brought up with WSGI: http://rhodesmill.org/brandon/2013/chunked-wsgi/

These are other people, not me. It is apparently a common problem that trailers are suited to alleviate.

And just yesterday, I ran into this issue again with a flask integration. That is how I found that post, and this issue via Google. Surely you want the input of people actually running into this issue to bring up their personal experience, right? What am I allowed to bring up as evidence? Can you enumerate it for me, please?

@Lukasa
Contributor

Lukasa commented Aug 28, 2016

The issue here is very much the ecosystem. I highly recommend trying to send trailers to a number of user agents and seeing what effect that has on their handling of the content.

One of the biggest problems with HTTP/1.1 was that an enormous amount of the specification was optional, or incompletely implemented. Chunked transfer encoding remains relatively uncommon, despite being mandatory, and trailers are even less common.

That doesn't mean we shouldn't support them, but it does mean that they aren't a high priority work item.

@rbtcollins
Contributor Author

@legitparty there is no need for evidence: the ticket says that this is a known defect, your arguments that it should be fixed are not controversial.

What is controversial are two things:

  1. That it is easy to do so. WSGI is a very widely deployed specification, and that means there are lots and lots of not-quite-the-same implementations out there. So any proposed change to WSGI to supply this in a backwards compatible way has a wide set of not-quite-compatible things to be verified against, otherwise we drive a wedge into the ecosystem. If we choose to do a clean break and hope we've got sufficient mindshare to have folk update their implementations to a v2, then we still need to ensure that we've got some ability to deliver a compatibility thunk, otherwise whoever moves first will cut themselves off from the entire ecosystem, and we'll be facing another Python 2.x -> 3.x style transition, except with fewer developers and a fraction of the community.
  2. Exactly what it should look like: the WSGI protocol (code level - you'd probably call it an API but there's a meme in Python that APIs have implementations) is very narrowly defined and optimised for CGI style traffic - streaming body uploads were a late arrival, let alone trailers - and in fact the last major update to WSGI punted on trailers, hence us being here today examining them.

What this ticket needs is not evidence for doing it. It needs someone to get themselves up to speed with the overall initiative, including whether @Lukasa (who has taken over from me in driving this) is planning a clean break / lookalike-update-with-adapters / compatible-even-though-it-hurts strategy, and then, within the shape of the current strategy, to design an update to the protocol that supports trailers in both directions and vet that with the server authors like mod_wsgi (apache), uwsgi (nginx) etc. The server authors are particularly important here since they often have constraints that are nowhere near as simple as fixing-some-python-code, and that can drive us to need to make things - like trailers - either optional, or 'might not be present and if you try the server will error at you'.

@legitparty

There is both a technical issue involving trailers in your argument against it, AND a design issue of when to completely implement a protocol in my argument for it.

The trailer issue:

You keep making the ecosystem argument, but you are not addressing the logical flaw in it that I have already pointed out twice: the ecosystem can only be a problem if sending trailers to clients that first send "TE: trailers" is a problem. So someone needs to have their client send "TE: trailers" AND break when receiving trailers. The chances of that are near and likely zero, so I am not going to waste time testing software for that case.

And now I am frustrated further by the continuation of this bad logic to chunked encoding itself. Trailers are only necessary when encoding is chunked, and chunked encoding back to the client already happens depending on how the API is used, unless you are buffering an entire stream no matter how large it gets. That is also irrelevant. Please stop bringing up irrelevant points.

The spec completeness issue:

The optional parts of the spec have protections in place such that servers need to implement them, but clients do not. Let me explain why, and how this works.

A client can leave out features while still logically implementing them, because a client is a final instance of the expression of the protocol, and can guarantee that it will never be involved with those features because it initiates the request and declares the level of support. That is, if it never uses those features, it is logically equal to having those features supported. So when you cite that a lot of clients don't support a feature, it is logically irrelevant. For example, a client need not even bother with chunked encoding when it is requesting objects that it knows are not streams, and can assume that the server will send them back whole. The client may be designed to work with just that one server that it can make assumptions about.

A server does not have the same benefit because it has to respond to the level of support requested by the client. That is, the final expression of the protocol also involves the support-level of the client. This design of letting the client dictate the level of support keeps clients more thin, but requires at least full wire support from the server. The one exception is when the server is designed to only work with one type of client by design, but application servers, especially community driven ones, cannot make this assumption.

An app often cannot just decide to not use a particular protocol feature, because an app often integrates with several interfaces, and it is forced to use the least common denominator of available features. This is especially troublesome when an app implements a particular feature set, and then is later integrated with a framework that does not fully support the protocol. The app must then have its design changed to support the lower-class interface.

It would be like operating on TCP without the ability to resend packets, especially after the app is designed to rely on that ability. Errors still exist, but they are of a different class, with different properties that require a different design to handle.

For example, in the case of WSGI, all of the apps that support integrity checking at the HTTP level are forced to change an integration to a two-phase transaction. That means that the whole request needs to be queued or blocked in some way during a second request-response trip, and you have increased latency and memory pressure, meaning that more hardware is required. On the server side, it cannot merely block because the requests are stateless, and the second request comes in a separate execution context. This is especially a problem if the end-point is just load balancing, because now it has to be a stateful filter -- the solution for horizontal scale now runs into vertical scale problems. That is a design change that requires the app being a whole separate class of app.

The alternatives are 1. to put an additional service that synchronizes the requests before sending them off to the load balancer (this effectively separates the vertical and horizontal scale problems), 2. an encapsulation layer that does integrity checking in the media itself (and have both sides support it, which limits what you can connect to), or 3. to just bypass the entire WSGI ecosystem and use something else. In the end, some design change needs to happen.

The thing that is frustrating me is not that the issue is considered low priority, but that it is being considered a design issue to support it. That is the kind of response I typically get from Mozilla.org when I can tell that they just don't want to deal with it right now. It is frustrating to run into an issue designing something to spec, then spend time clarifying the issue, only to get a BS response back. Your WSGI is claiming to be an interface to HTTP, but it is incomplete. If you don't want to deal with community issues involving that, then don't offer it to the community, or at least don't provide a feedback channel. There are millions of programmers. Let someone else take care of the problem.

@legitparty

@rbtcollins I saw your response after my last one. I only kept arguing for it because I was still getting arguments against it. But I will stop arguing for it now.

@Lukasa
Contributor

Lukasa commented Aug 28, 2016

> The thing that is frustrating me is not that the issue is considered low priority, but that it is being considered a design issue to support it. That is the kind of response I typically get from Mozilla.org when I can tell that they just don't want to deal with it right now. It is frustrating to run into an issue designing something to spec, then spend time clarifying the issue, only to get a BS response back. Your WSGI is claiming to be an interface to HTTP, but it is incomplete. If you don't want to deal with community issues involving that, then don't offer it to the community, or at least don't provide a feedback channel. There are millions of programmers. Let someone else take care of the problem.

I had absolutely no problem with your response up until this paragraph, and I'd like to address it separately.

Firstly, WSGI is not mine. WSGI is a specification that dates back in its original form to 2003, which was quite literally a totally different time in the web, and is also a time when I was really concerned more with other things (what with my being 13 at the time). Since then WSGI has been incrementally updated in small, non-breaking ways that allow the enormous ecosystem of WSGI applications and servers to continue to interoperate. The fact that a WSGI implementation from 2003 more-or-less functions with a WSGI implementation from 2015 is pretty astonishing.

However, the bigger problem is in the middle of your paragraph:

> Your WSGI is claiming to be an interface to HTTP, but it is incomplete.

WSGI does not claim that, and has never claimed that. The relevant passage in the original PEP is this one (emphasis in the original):

> However, because WSGI servers and applications do not communicate via HTTP, what RFC 2616 calls "hop-by-hop" headers do not apply to WSGI internal communications. WSGI applications must not generate any "hop-by-hop" headers, attempt to use HTTP features that would require them to generate such headers, or rely on the content of any incoming "hop-by-hop" headers in the environ dictionary. WSGI servers must handle any supported inbound "hop-by-hop" headers on their own, such as by decoding any inbound Transfer-Encoding, including chunked encoding if applicable.

This paragraph is dancing around the issue that you're encountering, which is that WSGI does not allow the application to utilise features of HTTP/1.1 that are hop-by-hop, except when it has the explicit cooperation of the server in question. In this instance, given that TE is a hop-by-hop header field, we in essence need to define a broader API that allows the sending of chunked trailers.
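For illustration, the standard library's `wsgiref.util.is_hop_by_hop` encodes this rule, and TE is on its list. The filtering helper `strip_hop_by_hop` below is a hypothetical name for this sketch, not part of wsgiref or of any particular server.

```python
from wsgiref.util import is_hop_by_hop


def strip_hop_by_hop(headers):
    """Drop header pairs that a WSGI application is not allowed to generate."""
    return [(name, value) for name, value in headers if not is_hop_by_hop(name)]


app_headers = [
    ("Content-Type", "text/plain"),
    ("Transfer-Encoding", "chunked"),  # framing belongs to the server, not the app
]
print(strip_hop_by_hop(app_headers))
```

Because the application cannot set Transfer-Encoding itself, it has no standard way to ask for the chunked framing that trailers depend on.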

This is reinforced by the fact that, despite your assertion, a WSGI application cannot guarantee chunked responses. You assert that:

> chunked encoding back to the client already happens depending on how the API is used

and I counter with the fact that that is simply untrue. Once again, I quote from the PEP (emphasis once again in the original):

> if the server and client both support HTTP/1.1 "chunked encoding", then the server may use chunked encoding to send a chunk for each write() call or bytestring yielded by the iterable

Note this "may": it's important. There is no requirement to emit chunk-encoded bodies in the current WSGI draft: an entirely compliant WSGI server may elect to refuse to do it. In fact, one such compliant WSGI server is wsgiref, which bills itself as the WSGI reference implementation.
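A small sketch of what the application side actually controls: it yields bytestrings and nothing more. The driver `run_once` is a made-up stand-in for a server, written only to show that how those bytestrings are framed on the wire (chunked, close-delimited, or buffered with a Content-Length) never appears in the application's code.

```python
from wsgiref.util import setup_testing_defaults


def streaming_app(environ, start_response):
    """Stream a body without Content-Length; framing is left to the server."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"part one\n"
    yield b"part two\n"


def run_once(app):
    """Hypothetical stand-in for a server: collect what the app hands over."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a minimal valid WSGI environ
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return lambda data: None  # legacy write() callable, unused here

    body = list(app(environ, start_response))
    return captured, body


captured, body = run_once(streaming_app)
```

Nothing in `streaming_app` can require that each yielded bytestring becomes its own chunk, which is why a trailer mechanism cannot simply assume chunked output.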

What this means is that it is simply not enough to define a protocol that asserts that for any streamed body without content-length we can emit trailers.

What this issue is asking for is to more formally break from some of the guarantees of WSGI 1.0, in order to allow the application to cooperate with the server to ensure certain framing properties of the response. That was essentially formally forbidden by WSGI 1.0, and deviating from that position is tricky. That trickiness, combined with the relative lack of use of trailers, makes this a low priority issue for most of us: there are simply bigger problems that need our attention first.

But let's address your specific criticism of our process:

> If you don't want to deal with community issues involving that, then don't offer it to the community, or at least don't provide a feedback channel. There are millions of programmers. Let someone else take care of the problem.

I don't know what you think this body is, but we are not a dictatorship that rules with an iron fist. We have no power to enforce anything, we cannot bend programmers to our will, and we cannot prevent them from doing anything.

WSGI has a well-established convention for defining extension APIs. These are used heavily by WSGI servers to provide optional functionality (uWSGI does this a lot). Such an extension could absolutely provide the kind of logic we would need here, namely:

  1. If some entry is present in the environ dict (let's say a key called "trailers"), you may call that "trailers" callable after yielding the last chunk of the body, but before the iterable raises StopIteration, passing any trailers you'd like emitted by the server.
  2. A server must only add that callable if TE: trailers was present on the original request.
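The two rules above can be sketched roughly as follows. Everything here is hypothetical: no such extension is defined anywhere, the `"trailers"` key is just the placeholder name suggested above, the `X-Body-SHA256` field name is invented, and `toy_server` stands in for a real server only far enough to show who calls what.

```python
import hashlib

TRAILERS_KEY = "trailers"  # hypothetical environ key from the proposal above


def digest_app(environ, start_response):
    """Stream a body and, if the server offers the extension, send its digest."""
    send_trailers = environ.get(TRAILERS_KEY)
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    digest = hashlib.sha256()
    for chunk in (b"first block", b"second block"):
        digest.update(chunk)
        yield chunk
    # Rule 1: called after the last chunk, before the generator finishes.
    if send_trailers is not None:
        send_trailers([("X-Body-SHA256", digest.hexdigest())])


def toy_server(app, request_headers):
    """Hypothetical server stub: returns (body, trailers received from the app)."""
    environ = {"REQUEST_METHOD": "GET"}
    received = []
    te = request_headers.get("TE", "")
    codings = [item.split(";")[0].strip().lower() for item in te.split(",")]
    # Rule 2: only offer the callable when the client sent "TE: trailers".
    if "trailers" in codings:
        environ[TRAILERS_KEY] = received.extend

    def start_response(status, headers, exc_info=None):
        return lambda data: None  # legacy write() callable, unused here

    body = b"".join(app(environ, start_response))
    return body, received
```

With `{"TE": "trailers"}` as the request headers the stub collects one trailer pair; with an empty dict the application never sees the callable and sends none.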

You could then use this extension from your own application code with cooperating servers. Defining this extension would allow you to prove, outside of the specification process, that this functionality is more valuable and widely desired than we (the members of the web-SIG paying attention to this issue) believe that it is, which provides excellent evidence that we should reconsider whether this is a low-priority issue.

Otherwise, this issue amounts to you stomping your feet and saying that because you need it this should be a high priority issue for me. I'm sorry, but that's not how it works. Right now, you are the first person I have ever met who has wanted access to trailers on responses, and I've been dealing with HTTP for quite a while.

I have no problem with adding support for emitting trailers on responses: they're a reasonable thing to want to support. But demanding that they should be a higher priority on WSGI without understanding how WSGI works or what exactly it is is deeply unhelpful.

@legitparty

Thanks for the clarification. So basically WSGI is considered on top of a hop, past its edge, and those aspects of HTTP are delegated entirely to the web server. This means that WSGI cannot be used as middleware to implement hop-by-hop aspects of HTTP. That makes perfect sense, actually. In that case, I should fix the web server component itself, and recommend that this issue actually be closed as a won't fix for design reasons. This issue isn't about trailers at all, but about where WSGI is in the application stack.

@Lukasa
Contributor

Lukasa commented Aug 29, 2016

👍, that understanding is on the nose.

The reason I'm disinclined to say that this is a WONTFIX is because I don't think it's unreasonable for WSGI applications to be notified of what kinds of lower-level functionality are available, and to be able to make use of it when they're present. But those things are less important in the short term than ensuring that WSGI is fully-functional in its mainline use-case, especially as they can be specced externally to the main WSGI spec.


4 participants