Need help with HeaderMap #300

annevk · 2014-05-30T14:35:18Z

HTTP headers are a mess. That's because semantics are confused with syntax due to legacy servers and user agents.

http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpHeaderArray.h#146 (and elsewhere in that file and the .cpp file, this needs to be specified somewhere)

If we are going to make http://fetch.spec.whatwg.org/#headermap work we need to somehow deal with that or decide to ignore it at the API level (the fetch algorithm itself should probably cater for it somehow in its header representation). @mcmanus @mnot

mnot · 2014-05-31T08:05:05Z

Yes, they're definitely a mess.

To be clear, if I have

Foo: a, b
Bar: 134
Foo: c

does .get('foo') return "a, b" or just "a"?

I read the notes on HeaderMap as saying it's the former, correct?

If so, I think you can keep things relatively simple; .add() just adds a new header field (no matter how many there are currently).

annevk · 2014-06-01T13:22:00Z

get() would return "a", getAll() would return ["a", "b", "c"]. My idea was that there would be automatic flattening and combining. add() would append a new value to an existing entry or add a new header/value entry.

tabatkins · 2014-06-01T17:11:12Z

Are you planning to build in the exceptions, per the Moz bug, so that commas mean multiple entries in all headers except for those handful of weird ones? (And multiple entries always serialize with commas except for those exceptions?)

Otherwise, yes, MultiMap behavior is the best, go for it.

annevk · 2014-06-02T08:36:38Z

The idea was to not have exceptions in this API and just model after HTTP. It does mean this API cannot be used by the underlying implementation. I'm wondering if that's the correct tradeoff.

tabatkins · 2014-06-02T20:03:18Z

I'm not sure how worthwhile that is; the serialization isn't necessarily exposed to JS directly; and if it is, it's not really significant what it is. Being compatible with the web seems more important than being maximally simple here, since the choice doesn't affect the author-facing API.

tobie · 2014-06-02T21:53:23Z

Should the methods which return void return this instead for consistency with Map?

annevk · 2014-06-03T07:33:06Z

@tabatkins this is the author-facing API.

jakearchibald · 2014-06-03T16:36:53Z

@annevk

get() would return "a"

Does this mean request.headers.get('accept') would be image/webp rather than image/webp,*/*;q=0.8?

I assume comma is always used to seperate values in headers?

tabatkins · 2014-06-03T16:52:29Z

@annevk Right, the author-facing part of the API definitely shouldn't expose any irregularities. I was just discussing whether the exceptional cases should be handled appropriately when parsing headers into a HeaderMap and serializing a HeaderMap into headers.

@jakearchibald Yes, headers are defined to accept comma-separated lists of values.

mnot · 2014-06-03T18:19:36Z

To be clear, all that HTTP requires is that multiple instances of the same header are able to be folded into a single comma-separated, NOT that any header can be split on commas. See http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-26#section-3.2.2 for the precise language.

I.e., it's quite possible that get() as you define it will return a truncated string for some headers.

sicking · 2014-06-04T07:48:22Z

Could we do the simple thing and just use a normal ES6 Map rather than a new HeaderMap class?

The headers property of RequestInit could either accept a Map object or a JS object which we enumerate. If the value in either is an array, then we'd add multiple request headers.

The headers property of Response would always be a Map. If there were multiple response headers then we we return an Array of strings.

So if @mnot's initial comment is a set of response headers, response.headers.get("Foo") would return ["a, b", "c"].

Alternatively we could concatenate multiple response headers. That has the advantage that we'd consistently return strings. So then response.headers.get("Foo") would return "a, b, c".

annevk · 2014-06-04T08:08:52Z

No we cannot use a Map as we need to know about changes as they can affect the mode of the request.

E.g. you want event.request.headers.set("seen-by-service-worker", "yup") or some such. And we want to not allow dangerous headers there.

tabatkins · 2014-06-04T09:14:16Z

@sicking ES Maps are almost completely unusable for anything beyond the most trivial uses in DOM, for reasons that I've complained about in the past. :/

Regarding just using a Map-like interface, as noted by others, always returning an array complicates the 99% case for the purpose of the 1% case, while concatenating the headers will result in constantly broken code, as people will rarely ever test for multiple headers. Other languages like Python use approaches similar to what anne is doing here, where you can ask for the first header or for all headers. Also, the MultiMap interface this is based on is already in use in the URL object, for its query params, so it's not even anything new.

annevk · 2014-06-04T09:21:05Z

@tabatkins it is actually slightly different. In this API the names are unique. However, given @mnot's and @jakearchibald's comments maybe what we want is indeed an API similar to URLSearchParams and FormData. That is

Foo: a, b
Bar: 134
Foo: c

gets turned into three distinct entries. If you get("Foo") you get "a, b". If you getAll("Foo") you get ["a, b", "c"].

tabatkins · 2014-06-04T09:24:25Z

And would that serialize into "Foo: a, b, c\nBar: 134", or would it stay separate and do "Foo: a, b\nFoo: c\nBar: 134"?

annevk · 2014-06-04T09:26:42Z

It would stay separate and even preserve order. That's what URLSearchParams does today. The moment you start making modifications things will change around a bit of course.

jakearchibald · 2014-06-04T09:32:42Z

That sounds good to me.

If I .get the accept/vary header I'd expect the full header value of the first match.

annevk · 2014-06-04T10:22:07Z

What this basically means is that we take in the stream from HTTP, split on newlines to get header lines, and then we turn each header line into a header by splitting on the first ":" to get a name and a value. A setup like that seems fine for both Request and Response as far as I can tell.

(Note that getResponseHeader() from XMLHttpRequest does combine values and would return "a, b, c" for the example earlier. However, getAllResponseHeaders() does not.)

reschke · 2014-06-05T16:37:19Z

Sounds right to me.

Re: http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpHeaderArray.h#146... it would be cool if we could do some research whether the three exceptions there really need to be exceptions (I do agree for Set-Cookie).

tabatkins · 2014-06-05T18:05:24Z

Yeah, given the stated semantics, this sounds fine to me.

(It's really weird that the spec requires headers to be splittable on commas, but you can't reverse that and treat all commas as splitters.)

sicking · 2014-06-06T09:52:29Z

Why do you want to get the first value rather than a comma-separated join of all headers with the specified name? The HTTP spec says that a comma separated join is equivalent.

Here's what I'm proposing:

For APIs that take a list of request headers, I.e. the headers property of RequestInit, allow the page to either pass a Map object or any other object.

If a Map object is passed, create a new Map, enumerate the passed in map and add each property to the newly created Map.

If any other object is passed, create a new Map, enumerate that object and add each property to the newly created Map.

In both cases any Arrays would still be kept as Arrays. Anything else is converted to a string.

Let the headers property on Request be a Map.

It would be a plain normal Map, which means that any values could be added to it.

At the time when a Request is submitted, if any bad values is in the headers Map, we either ignore them, or treat it as a network error. Either is fine with me. Any arrays result in multiple headers with the same name being added.

Let the headers property in Response be a Map. When we build a Response object and populate the headers Map we add headers in the order that they were received from the network (minus any that we couldn't expose for security reasons). Any duplicate headers are joined using ,.

This seems to me to give the properties that we want.

Duplicate headers in a Response always maintain their relative order.
All headers in a Response are plain string values.
You can send multiple headers with the same name, in an author defined order.

While also allowing us to reuse plain Map objects.

Anne points out that this means that we validate header values at time of submission, while we validate things like methods and CORS-mode early. I agree that's not perfect, but it also doesn't seem like a huge deal.

tabatkins · 2014-06-06T09:56:54Z

Why do you want to get the first value rather than a comma-separated join of all headers with the specified name? The HTTP spec says that a comma separated join is equivalent.

Because code written to assume only a single header is sent (which I expect to be basically 100% of code, because who tests sending multiple headers, come on) will break the moment multiple headers get sent. Don't make authors do parsing work that you can do instead; particularly, don't add extra parsing work for authors to do just because you want a simple key/value interface.

(Extra particularly, don't force authors to do parsing work that seems like it can be done with a simple .split(",") call, but can't really, because of complicated crappy semantics.)

annevk · 2014-06-06T10:07:02Z

My problems with the proposal from @sicking are:

All popular implementations of headers use multimap rather than map
It would not allow us to expose cookies if we decide to do so at some point (it subsets the overall design of HTTP)
Imported libraries will have to use isArray() rather than get a consistent API
Fail extra late for headers and early for everything else is inconsistent

reschke · 2014-06-06T13:04:30Z

Because code written to assume only a single header is sent (which I
expect to be basically 100% of code, because who tests sending multiple
headers, come on) will break the moment multiple headers get sent.

That actually sounds like an advantage. Fail early!

tabatkins · 2014-06-09T16:55:16Z

It won't fail early. It'll fail very late, in production, in probably random and difficult-to-reproduce ways, due to header editting/additions from random proxies and rewriters between your server and the user's computer.

Plus, if we forced authors to split, they'll do it in the most obvious, simple way - str.split(","). This'll work 90% of the time, and fail badly in a small number of not-too-rare cases, like Accept-Language. Remember, the HTTP spec doesn't require it to be easy to parse comma-separated header values apart, just that it be possible; it might require arbitrarily complex parsing to do, and knowledge of precisely which header you're dealing with.

But authors can't avoid it! They must defensively code in something to split the values apart, because we're jerks and we always merge them into a comma-separated string, even when the headers are always sent separately because the values are complex to tease apart.

Comma-merging has only one tiny benefit - it lets us, the implementors, use a Map rather than a MultiMap. Besides that, it's extremely author-hostile with literally zero author benefit.

annevk · 2014-06-13T11:12:35Z

Given support from @jakearchibald et al for my proposal and no real support for Map, I've further clarified that. See http://fetch.spec.whatwg.org/ for details.

annevk closed this as completed Jun 13, 2014

annevk mentioned this issue Jun 13, 2014

Response::headers getter/setter incorrect #268

Closed

KenjiBaheux added the no impact (blink) label Sep 5, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Need help with HeaderMap #300

Need help with HeaderMap #300

annevk commented May 30, 2014

mnot commented May 31, 2014

annevk commented Jun 1, 2014

tabatkins commented Jun 1, 2014

annevk commented Jun 2, 2014

tabatkins commented Jun 2, 2014

tobie commented Jun 2, 2014

annevk commented Jun 3, 2014

jakearchibald commented Jun 3, 2014

tabatkins commented Jun 3, 2014

mnot commented Jun 3, 2014

sicking commented Jun 4, 2014

annevk commented Jun 4, 2014

tabatkins commented Jun 4, 2014

annevk commented Jun 4, 2014

tabatkins commented Jun 4, 2014

annevk commented Jun 4, 2014

jakearchibald commented Jun 4, 2014

annevk commented Jun 4, 2014

reschke commented Jun 5, 2014

tabatkins commented Jun 5, 2014

sicking commented Jun 6, 2014

tabatkins commented Jun 6, 2014

annevk commented Jun 6, 2014

reschke commented Jun 6, 2014

tabatkins commented Jun 9, 2014

annevk commented Jun 13, 2014

Need help with HeaderMap #300

Need help with HeaderMap #300

Comments

annevk commented May 30, 2014

mnot commented May 31, 2014

annevk commented Jun 1, 2014

tabatkins commented Jun 1, 2014

annevk commented Jun 2, 2014

tabatkins commented Jun 2, 2014

tobie commented Jun 2, 2014

annevk commented Jun 3, 2014

jakearchibald commented Jun 3, 2014

tabatkins commented Jun 3, 2014

mnot commented Jun 3, 2014

sicking commented Jun 4, 2014

annevk commented Jun 4, 2014

tabatkins commented Jun 4, 2014

annevk commented Jun 4, 2014

tabatkins commented Jun 4, 2014

annevk commented Jun 4, 2014

jakearchibald commented Jun 4, 2014

annevk commented Jun 4, 2014

reschke commented Jun 5, 2014

tabatkins commented Jun 5, 2014

sicking commented Jun 6, 2014

tabatkins commented Jun 6, 2014

annevk commented Jun 6, 2014

reschke commented Jun 6, 2014

tabatkins commented Jun 9, 2014

annevk commented Jun 13, 2014