Content-Length: before or after gzip #46

Closed
martinthomson opened this issue Mar 1, 2013 · 16 comments

@martinthomson
Collaborator

In HTTP/2.0, Content-Length is needed mostly as entity metadata. It does serve one limited function: letting a recipient learn the complete size of a resource before the entire message arrives. (This is the behavior explicitly relied upon for POST, which is based on browser information only. For example, node.js always sends chunked encoding unless explicitly overridden.)

Since compression is applied by the framing layer, the spec is ambiguous about what value Content-Length is given. If the data frames are compressed at the framing layer, Content-Length possibly, but not certainly, reports the pre-compression size.
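
To make the two candidate values concrete, here is a minimal Python sketch. It is an illustration only: the payload is made up, and nothing here comes from the spec or from any implementation.

```python
import gzip

# Hypothetical response body; any compressible payload would do.
body = b"<html><body>" + b"hello world " * 200 + b"</body></html>"

# Candidate 1: the size of the representation before compression.
pre_compression_length = len(body)

# Candidate 2: the size of the bytes that would actually travel in data frames.
compressed = gzip.compress(body)
post_compression_length = len(compressed)

print(pre_compression_length, post_compression_length)
```

The two numbers differ for any compressible payload, so a recipient needs to know which one Content-Length refers to.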

@jpinner

jpinner commented Mar 1, 2013

IIRC we removed compression of the data frames at the framing layer in SPDY.

If we haven't done that in HTTP/2 we should.

@martinthomson
Collaborator Author

Data frame compression is an analogue of Transfer-Encoding: gzip. This is a pretty important feature to retain.

It's optional, of course. There's no point in re-compressing compressed data, but html/css/js all need compression. It would complicate the media type declaration for these types if the actual content were in a compressed format.

@jpinner

jpinner commented Mar 3, 2013

Maybe I'm misunderstanding. I know the SPDY spec requires receivers to accept gzip encoded data. At one point there was a flag in the data frames themselves to indicate that the content was gzip compressed by the framing layer. I thought we removed this in favor of having the application compress the data and requiring accepting deflate or gzip data.

IMHO the proper thing to do here is to not have any compression at the framing layer, assert that all requests are made as if they sent Accept-Encoding: deflate, gzip, and that the Content-Length header be the aggregated length of the data frames.

@grmocg
Contributor

grmocg commented Mar 4, 2013

Jeff-- you are correct.
We specified that a client MUST be able to handle an entity body which was compressed with gzip (because there were many clients where this was not the case).
In the very first version of SPDY we allowed for compression of the entity body at the protocol level, but after looking at the amount of CPU spent compressing things which were already compressed and the lack of gains, decided that it was easy to simply ensure that gzip could be used when the sender desired, and otherwise leave it to the application.

One of the implications of this is that content-length now describes exactly what it did before-- the length of the entity body (the sum of the payload of the data-frames in the stream should equal the content-length).
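
As a rough sketch of that invariant in Python (the function name and the list-of-payloads input are assumptions for illustration; real frame parsing is not modeled):

```python
from typing import Iterable, Optional


def body_matches_content_length(
    data_payloads: Iterable[bytes], content_length: Optional[int]
) -> bool:
    """Return True when the data frame payloads add up to Content-Length.

    Content-Length is optional, so a missing value is treated as consistent.
    """
    if content_length is None:
        return True
    total = sum(len(payload) for payload in data_payloads)
    return total == content_length


# Hypothetical stream carrying two data frames of 100 and 50 bytes.
frames = [b"a" * 100, b"b" * 50]
print(body_matches_content_length(frames, 150))
```

If the application gzips the body before handing it to the stack, the payloads counted here are the compressed bytes, and Content-Length is the compressed size.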

@martinthomson
Collaborator Author

This doesn't change the fact that Transfer-Encoding: gzip has been replaced by a single bit. Thus, SPDY does compress at the protocol level. The fact that it is possible to compress outside of this doesn't change that. What you say is just an argument for having Content-Length refer to the post-compression representation (i.e., the pre-compression representation isn't available). When I don't have a 1 year-old fighting me for the keyboard, I'll provide the other argument.

@grmocg
Contributor

grmocg commented Mar 4, 2013

No-- the application which provided it to the protocol stack already did the compression.
The protocol simply carries the information that it was compressed by the application.

Protocol compression (by which I mean that the protocol part of the stack does the compression, and not because of any explicit signal from the application-layer) has been disconnected/effectively deprecated since SPDY/2.

@martinthomson
Collaborator Author

What is the Content-Type of a pre-compressed HTML document? Most of the metadata applies to the pre-compression artifact. From this, I infer that Content-Length should also.

This doesn't say anything about when or where compression actually takes place, just where it does logically.

@grmocg
Contributor

grmocg commented Mar 4, 2013

The SPDY/HTTP/2 layer changes neither the Content-Type nor the Content-Length, and it doesn't transform the entity body; the application supplies all three. So I don't think we need to say anything about either?

@martinthomson
Collaborator Author

I believe that saying something about what these mean is of the utmost importance. These have a direct bearing on how data is interpreted. Sure, the application could set a Content-Length of 3 for a 5 MB representation. We currently require that clients drop such requests. The same could be said of Content-Type if a "text/html" representation turned out to not be HTML because it was compressed at some point.

@grmocg
Contributor

grmocg commented Mar 4, 2013

I'm seeing the HTTP/2 framing+semantic layer as providing mainly session management, which is perhaps why I'm confused by what you're asking for? Or maybe we're violently agreeing?

I know there will need to be other changes to hook up these fields to the previous definitions (http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#section-3.3.2 and http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-22#section-3.1.1.5 respectively), but, given that we're focused mostly on just changing how the bytes show up on the wire, and given that we don't do any compression, except of headers, I'm just confused?

@jpinner

jpinner commented Mar 7, 2013

So looking at the d4397f7 version of the index.txt file -- I suggest:

  1. remove the COMPRESSED data flag (lines 869-870)
  2. replace lines 901-906 with text elsewhere (section 4.2.2?) stating that clients must accept gzip/deflate encoding
  3. remove lines 908-911
  4. remove section 5.5

thus removing all ambiguity about content-length (it equals the sum of the lengths of the data frames).

Happy to issue a pull request :)

@martinthomson
Collaborator Author

How then does a client learn whether an entity is compressed (or not)? Transfer-Encoding is prohibited: http://http2.github.com/http2-spec/#rfc.section.4.2.1

@jpinner

jpinner commented Mar 7, 2013

via the Content-Encoding header

So basically what SPDY has done is forced "chunked" Transfer-Encoding always and disallowed the rest.
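
A minimal Python sketch of what that looks like on the receiving side, assuming the body has already been reassembled from the data frames. The `decode_body` helper is hypothetical and handles only the two codings mentioned in this thread.

```python
import gzip
import zlib
from typing import Optional


def decode_body(raw: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo the Content-Encoding applied by the sending application."""
    if content_encoding is None or content_encoding.lower() == "identity":
        return raw  # body was sent uncompressed
    coding = content_encoding.strip().lower()
    if coding == "gzip":
        return gzip.decompress(raw)
    if coding == "deflate":
        # HTTP "deflate" is the zlib format, which zlib.decompress reads.
        return zlib.decompress(raw)
    raise ValueError(f"unsupported Content-Encoding: {content_encoding}")


original = b"body { color: red; }" * 10
print(decode_body(gzip.compress(original), "gzip") == original)
```

Under this model Content-Length describes `raw`, the bytes carried by the data frames, not the decoded result.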

@mnot
Member

mnot commented Mar 18, 2013

The major use case for Content-Length in HTTP/2 is allowing an intermediary that's changing it to HTTP/1 to generate the correct header without buffering the message. I think that @jpinner 's proposal does that; agreed?
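
A sketch of that intermediary decision in Python, assuming the HTTP/2 header fields arrive as a dict with lowercased names. The `http1_framing_headers` helper is hypothetical and not taken from any real proxy.

```python
from typing import Dict


def http1_framing_headers(h2_headers: Dict[str, str]) -> Dict[str, str]:
    """Choose HTTP/1.1 body framing from the HTTP/2 header fields."""
    content_length = h2_headers.get("content-length")
    if (
        content_length is not None
        and content_length.isascii()
        and content_length.isdigit()
    ):
        # Length known up front: forward it and stream data frames as they arrive.
        return {"Content-Length": content_length}
    # Length unknown: use chunked encoding instead of buffering the message.
    return {"Transfer-Encoding": "chunked"}


print(http1_framing_headers({"content-length": "150", "content-type": "text/html"}))
print(http1_framing_headers({"content-type": "text/html"}))
```

This only works without buffering if Content-Length equals the number of bytes the intermediary forwards, which is the sum of the data frame lengths.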

@martinthomson I think that you're asking how hop-by-hop compression (a la Transfer-Encoding in HTTP/1) happens in HTTP/2. In Orlando, we said we were getting rid of the compression flag, so it appears that this isn't possible in HTTP/2. If you (or anyone) is concerned about this, we should open a separate ticket.

@mcmanus
Contributor

mcmanus commented Mar 18, 2013

@mnot re content-length - the other use case is pure http2.. CL enables transfer progress meters (especially on downloads) which are useful ui elements.. so keeping the status quo of it reflecting transfer size is right imo.
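
A minimal Python sketch of such a progress computation (the `transfer_progress` helper is hypothetical, for illustration only):

```python
from typing import Optional


def transfer_progress(
    bytes_received: int, content_length: Optional[int]
) -> Optional[float]:
    """Fraction of the transfer completed, or None when it cannot be computed."""
    if content_length is None or content_length <= 0:
        return None  # no usable Content-Length, so no meter
    return min(bytes_received / content_length, 1.0)


print(transfer_progress(50, 200))
print(transfer_progress(50, None))
```

The meter reaches 1.0 exactly when the transfer ends only if Content-Length counts the same bytes that are being received, i.e. the transfer size.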

@grmocg
Contributor

grmocg commented Mar 18, 2013

yes-- content-length's meaning should be unchanged, and still reflect the entity-body size when optionally present.

