Serialization for structured headers. #627

mikewest · 2018-05-23T11:01:20Z

I've started sketching out a feature that intends to deliver a structured header as part of an HTTP request, and I find myself doing a little more hand-waving than I'd like in step 6 of https://mikewest.github.io/sec-metadata/#abstract-opdef-set-the-sec-metadata-header-for-a-request.

The end of https://tools.ietf.org/html/draft-ietf-httpbis-header-structure-04#section-1 suggests that:

> Those abstract types can be serialised into textual headers - such as those used in HTTP/1 and HTTP/2 - using the algorithms described in Section 3.

Section 3, however, seems to be the opposite: parsing a string into a Structured Header. I don't actually see a serialization algorithm in the document. Each type hints at how it might be serialized, but it would be nice to have an algorithm to point to.

mikewest · 2018-05-23T11:13:19Z

(For example, can we assume any ordering in the serialization of a dictionary? :) )

mnot · 2018-05-23T23:52:39Z

Makes sense, will do in the next round.

One hitch is that eventually, we want to be able to define alternative serialisations of the header field in new versions of HTTP, so we'll have to be careful in how we do this. Or just admit that there will be some residual hand-waving.

mikewest · 2018-05-24T06:53:35Z

> Makes sense, will do in the next round.

Thanks, no rush. :)

> One hitch is that eventually, we want to be able to define alternative serialisations of the header field in new versions of HTTP, so we'll have to be careful in how we do this. Or just admit that there will be some residual hand-waving.

Some level of hand-waving seems fine, though I'd prefer that it be constrained to this document, and not all the documents that wish to define structured headers. If we can end up with a single algorithm, no matter how complicated, that takes a structured header object and outputs a string, I'll be happy to use it!
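A minimal sketch of what such a single entry point could look like, assuming a reduced data model of integers, strings, binary content, lists, and dictionaries. The function names, the delimiters (`=`, `, `, quotes, `*`), and the use of insertion order for dictionaries are illustrative assumptions, not the algorithm from the draft:

```python
import base64
from typing import Union

# Hypothetical item types for this sketch only.
Item = Union[int, str, bytes]


def serialize_item(item: Item) -> str:
    """Turn one item into its textual form (assumed syntax)."""
    if isinstance(item, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(item, int):
        return str(item)
    if isinstance(item, str):
        if not all(0x20 <= ord(c) <= 0x7E for c in item):
            raise ValueError("strings must be printable ASCII")
        escaped = item.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    if isinstance(item, bytes):
        # Binary content is base64-encoded between assumed '*' delimiters.
        return "*" + base64.b64encode(item).decode("ascii") + "*"
    raise TypeError(f"unsupported item type: {type(item).__name__}")


def serialize_header(value) -> bytes:
    """Single entry point: structured value in, byte sequence out."""
    if isinstance(value, dict):
        # Assumption: members are emitted in insertion order.
        text = ", ".join(f"{k}={serialize_item(v)}" for k, v in value.items())
    elif isinstance(value, list):
        text = ", ".join(serialize_item(v) for v in value)
    else:
        text = serialize_item(value)
    return text.encode("ascii")
```

Returning bytes rather than a string keeps the result usable wherever a header value is treated as a byte sequence; keys are not validated here beyond the final ASCII encoding step.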

mnot · 2018-05-24T23:07:11Z

Nod - but the hitch is "outputs a string" -- in that future world, it might be binary...

mikewest · 2018-05-25T11:41:30Z

How about "outputs a thing that might be a string, and might be binary, and might be trinary, or might be anything else I can hand to Fetch's header list set algorithm"? Fetch talks about it in terms of a https://infra.spec.whatwg.org/#byte-sequence. Would that work for you?

reschke · 2018-05-25T13:10:13Z

I love it: "A byte sequence is a sequence of bytes, represented as a space-separated sequence of bytes."

Maybe: "...represented as a space-separated sequence of byte representations"?

reschke · 2018-05-25T13:13:27Z

@mnot - actually, what's in an HTTP/1.1 message is also better considered a byte sequence, not a string

@mikewest - I would prefer to have SH not to rely on FETCH in any way

mikewest · 2018-05-25T13:22:30Z

> I would prefer to have SH not to rely on FETCH in any way

Pedantic nit: Byte sequence isn't defined in Fetch, but in Infra. :)

I would like SH to define a serialization algorithm in such a way that I can explain to web browsers what they ought to do with the result. It would be unfortunate if it was difficult to integrate SH and Fetch, as that makes my goals more difficult.

I think all I'm asking for is a clearly defined serialization algorithm that returns a result that Fetch can accept as a header value. I'm happy to leave details up to y'all and @annevk to work out who depends on whom and why amongst yourselves. :)

reschke · 2018-05-25T13:32:55Z

Well, that gets us back to the data model. I assume FETCH considers header field values as JavaScript strings (?), while in an HTTP/1.1 message it's really a sequence of bytes, usually restricted to values <= 127.

It's the edge case (non-ASCII) that makes this all interesting, but I believe it's a non-issue for SH.

annevk · 2018-05-25T13:40:54Z

@reschke no, byte sequences with restrictions: https://fetch.spec.whatwg.org/#concept-header. (The API does convert these back and forth from JavaScript strings, using IDL's ByteString primitive.)

reschke · 2018-05-25T13:59:57Z

OK.

Right now we have only one serialization, so it's hard to discuss future ones.

The one that we have uses US-ASCII, which can be trivially encoded in octet sequences, and shouldn't have any issues with FETCH. So maybe we just need to write down this more clearly?
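To illustrate the point about US-ASCII mapping trivially onto octets, here is a small sketch. The helper name `to_field_value` is hypothetical; it only shows that each ASCII character becomes exactly one byte with a value of at most 127, and that anything outside that range is rejected rather than silently encoded:

```python
def to_field_value(text: str) -> bytes:
    """Encode an already-serialized textual header value as octets."""
    try:
        # One ASCII character maps to exactly one byte (value <= 127).
        return text.encode("ascii")
    except UnicodeEncodeError as exc:
        raise ValueError("header value contains non-ASCII characters") from exc
```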

annevk · 2018-05-25T14:11:32Z

For my own understanding, the problem is that you want to create a structured header using types, but then pass that into Fetch, with Fetch only taking byte sequences, and exact byte sequences being exposed through H/1 and H/2 and probably QUIC.

Ideally you keep the types around until you hit a point where you need to serialize. That would require HTTP offering some abstraction in front of H/1, H/2 and probably QUIC that takes headers where the values can be either byte sequences or types and then serializes them as appropriate for the eventual chosen transport.

Fetch could then change its "header" primitive so values would be either byte sequences or types and pass that on to the new HTTP abstraction. And also use the H/1 / H/2 serialization for its API, which isn't typed.

As an alternative, proposed by @mikewest I think, structured headers could define how to obtain a byte sequence from a type. We'd continue passing byte sequences around. Then a future H/N could eagerly parse those bytes to see if it can represent them as a type instead. You'd end up with a redundant serialize/parse, but all the interfaces don't have to be changed. Implementations could optimize the serialize/parse away. And this would also allow representing byte sequences as types that weren't types to begin with, which might be beneficial.
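The first option above (keep typed values until the transport boundary) can be sketched roughly as follows. Everything here is an assumption made for illustration: the `HeaderList` shape, the `to_wire` function, the `example-header` name, and the stand-in `serialize_typed`, which handles only dictionaries of integers:

```python
from typing import Dict, List, Tuple, Union

# A header value is either a legacy byte sequence or a typed value.
# For brevity the only typed value in this sketch is a dict of integers.
TypedValue = Dict[str, int]
HeaderValue = Union[bytes, TypedValue]
HeaderList = List[Tuple[str, HeaderValue]]


def serialize_typed(value: TypedValue) -> bytes:
    """Stand-in textual serialization (assumed syntax: key=value, ...)."""
    return ", ".join(f"{k}={int(v)}" for k, v in value.items()).encode("ascii")


def to_wire(headers: HeaderList) -> List[Tuple[bytes, bytes]]:
    """Serialize typed values only at the point a textual transport needs bytes."""
    wire = []
    for name, value in headers:
        if not isinstance(value, bytes):
            # Typed values are serialized late; byte sequences pass through.
            value = serialize_typed(value)
        wire.append((name.encode("ascii"), value))
    return wire
```

A transport with its own typed encoding could consume the same list without calling `serialize_typed` at all, which is where the redundant serialize/parse of the second option would be avoided.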

For #627

mnot · 2018-06-01T03:34:22Z

@mikewest see PR above; will that work for you (ignoring the alternative serialisations issue for now)?

mnot · 2018-06-01T03:52:50Z

My assumption has been that if a future H/n defines an alternative serialisation (or if it's done in an extension like a H2 SETTING), a separate API would have to be exposed for applications to call to set headers (even if that's a bump on the current API that adds information about the encoding being sent, plus a way for the application to detect that it's available).

Otherwise, there'd have to be either a needless encode/decode (if the application emitted H1 headers), or some nasty heuristics on the payload (if the application emitted the new format).

Same for parse; otherwise, the implementation will have to translate the new encoding to H1 for applications, which doesn't make sense if they just want the data structure and associated SH handling.

annevk · 2018-06-01T06:01:34Z

I wasn't talking about a JavaScript API, to be clear. I was talking about the low-level interface for the HTTP standard that Fetch in some (hand-wavy) way wraps. If you were too, I suppose a distinct interface would work, but it seems nicer if we could exchange a single header list that contains both byte sequence values and typed values.

annevk · 2018-06-01T06:03:04Z

(I don't see a link to a PR btw, just a commit on a branch that contains many other commits. Comments on those would probably easily get lost.)

mnot · 2018-06-01T06:12:16Z

Sorry, see #636

mnot · 2018-06-01T06:16:02Z

I suspect that implementations aren't going to want to hand around byte sequences, because that makes potential optimisations that they'll find attractive more expensive. If it's a byte sequence, that means they have to parse it for structure and figure out what to do with it. It'd be better if they hand around representations of the actual structures.

annevk · 2018-06-01T06:19:22Z

@mnot I think you still misunderstand me. The "byte sequence values" are only for the legacy headers that do not have a typed representation. The "typed values" are for the new headers.

mnot · 2018-06-01T06:20:57Z

Ah, indeed I do then; I shouldn't answer bug mail when I'm sick :-/

That seems reasonable to me.

mikewest added the header-structure label May 23, 2018

mnot added a commit that referenced this issue Jun 1, 2018

Rough in serialisation (266a476). For #627

mnot closed this as completed Jun 5, 2018
