Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiple headers of the "same name" allowed in http #3591

Closed
danielb2 opened this issue Oct 29, 2015 · 66 comments
Closed

multiple headers of the "same name" allowed in http #3591

danielb2 opened this issue Oct 29, 2015 · 66 comments
Labels
http Issues or PRs related to the http subsystem. stalled Issues and PRs that are stalled.

Comments

@danielb2
Copy link

If { 'cookie: 'foo=bar;', 'Cookie: 'foo=meh;' } is passed, you have no idea which one will be used.

If a request relies on, for example, cookie information, a request may work seemingly at random and can be rather difficult to debug.

@mscdex mscdex added the http Issues or PRs related to the http subsystem. label Oct 29, 2015
@bnoordhuis
Copy link
Member

It should be the last one. Headers are added in object insertion order, the last one wins out.

The behavior itself is most likely undocumented but I think we have tests for this (not easy to grep for though.)

@danielb2
Copy link
Author

Should it allow multiple by the same at all? Is it too costly to check for this?

@bricss
Copy link

bricss commented Oct 31, 2015

FYI -> http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2

Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)]. It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma. The order in which header fields with the same field-name are received is therefore significant to the interpretation of the combined field value, and thus a proxy MUST NOT change the order of these field values when a message is forwarded.

@jasnell
Copy link
Member

jasnell commented Oct 31, 2015

/cc @nodejs/http

@dougwilson
Copy link
Member

Potentially there are a few solutions here:

  1. Only use the first key when there are duplicate keys once the keys have been converted to lower-case.

Basically, this solution would be to say that since HTTP headers are not case-sensitive, if { cookie: 'foo=bar;', Cookie: 'foo=meh;' } is given, act on it as if the user had really written { cookie: 'foo=bar;', cookie: 'foo=meh;' }, which in JavaScript would have been converted to { cookie: 'foo=meh;' }:

$ node -pe "({ 'cookie': 'foo=bar;', 'cookie': 'foo=meh;' })"
{ cookie: 'foo=meh;' }
  1. When given an object to set headers from (i.e. res.writeHead(status, obj)), if the obj contains duplicate keys once the keys have been converted to lower case, throw a TypeError.

This would signal to users that there is an issue in the given object, which was expected to only contains header-value pairs.

@danielb2
Copy link
Author

danielb2 commented Nov 2, 2015

Option two sounds more appealing since if you can comma separated list syntax to define multiple headers otherwise.

Another option to consider is allowing an optional array syntax for that:

{ cookie: ['foo=bar', 'foo=meh'] }

The problem I saw in production was where one library was using cookie and one external source used Cookie and the headers were merged and one was lost.

@dougwilson
Copy link
Member

Another option to consider is allowing an optional array syntax for that:

{ cookie: ['foo=bar', 'foo=meh'] }

I didn't mention it, because that is already supported; each element in the array will be a different header, making that response look like the following:

Cookie: foo=bar
Cookie: foo=meh

Assuming we are talking about res.writeHead. If you are referring to a different Node.js API, let me know, as I don't see what API is being used actually mentioned here.

@danielb2
Copy link
Author

danielb2 commented Nov 2, 2015

Perfect :)

@bricss
Copy link

bricss commented Nov 2, 2015

In that case I presume we can close an issue.

@danielb2
Copy link
Author

danielb2 commented Nov 2, 2015

Why would you make that presumption?

@bricss
Copy link

bricss commented Nov 2, 2015

FYI -> http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2

Each header field consists of a name followed by a colon (":") and the field value. Field names are case-insensitive.

@cressie176
Copy link

Having a similar problem with X-Robots-Tag. I need to support the following

X-Robots-Tag: googlebot: nofollow
X-Robots-Tag: otherbot: noindex, nofollow

As specified here.

@Trott
Copy link
Member

Trott commented Jan 10, 2016

@cressie176 According to the documentation:

Use an array of strings here if you need to send multiple headers with the same name.

So I think this would work for your case:

response.setHeader('X-Robots-Tag', ['googlebot: nofollow', 'otherbot: noindex, nofollow']);

@cressie176
Copy link

@Trott, thanks for the quick response. Unfortunately I've tried that - it generates a single header which I don't think conforms to googles spec.

x-robots-tag: 'googlebot: nofollow, otherbot: noindex, nofollow',

Furthermore if I add an additional directive to apply to all user agents, it gets concatenated with the last one in the list, i.e.

response.setHeader('X-Robots-Tag', ['googlebot: nofollow', 'otherbot: noindex, nofollow', 'noimageindex']);

Results in

x-robots-tag: 'googlebot: nofollow, otherbot: noindex, nofollow, noimageindex'

@Trott
Copy link
Member

Trott commented Jan 10, 2016

Ah, I see. Thanks for the clarification. (Wish I had more to add than just that.)

@cressie176
Copy link

lol!

@bricss
Copy link

bricss commented Jan 10, 2016

Try to use power of RegExp to split string to an array.

@cressie176
Copy link

@bricss Sorry - not sure I follow. Are you suggesting I use regex to split the header sent in the response?

In this situation my code is running on the server and I'm using node (via express) to send a response header. Node is concatenating header values with the same name into a single string, then transmitting the response.

If I was writing the client I could split the string, but in this case the client is googlebot.

@bricss
Copy link

bricss commented Jan 10, 2016

@cressie176 Ouh, sorry, now I see where is a problem.

@cressie176
Copy link

@bricss no problem. Intentions were good.

@stevenvachon
Copy link

stevenvachon commented Jan 23, 2016

X-Robots-Tag: nofollow
X-Robots-Tag: nocache
X-Robots-Tag: googlebot: noindex
X-Robots-Tag: unavailable_after: 1-Jan-3000 00:00:00 EST

should maybe result in something like:

{
  "x-robots-tag": "nofollow,nocache",
  "x-robots-tag: googlebot": "noindex",
  "x-robots-tag: unavailable_after": "1-Jan-3000 00:00:00 EST"
}

Also, how would we set such headers with setHeader() and writeHead() for Node to understand?

Edit: HTTP header field names do not allow the ":" character, so Google was never following the specification when they designed these headers. It should've been something like x-robots-tag|googlebot: noindex. Node.js aims to be spec-compliant, so this is a no-go.

@PeterDaveHello
Copy link
Member

Server Push feature of HTTP/2.0 also needs this feature, it will use multiple headers named Link
https://w3c.github.io/preload/#server-push-http-2

PeterDaveHello added a commit to cdnjs/new-website that referenced this issue May 18, 2016
Though we are using http/1.1 in express/nodejs, we are using CDN that
supports http/2.0, this header will be sent to CDN server.

This commit can not really benifit from that feature since nodejs
doesn't support multiple headers with the same header name yet, see:

 - nodejs/node#3591
 - expressjs/express#2845
PeterDaveHello added a commit to cdnjs/new-website that referenced this issue May 18, 2016
Though we are using http/1.1 in express/nodejs, we are using CDN that
supports http/2.0, this header will be sent to CDN server.

This commit can not really benifit from that feature since nodejs
doesn't support multiple headers with the same header name yet, see:

 - nodejs/node#3591
 - expressjs/express#2845
PeterDaveHello added a commit to cdnjs/new-website that referenced this issue May 18, 2016
Though we are using http/1.1 in express/nodejs, we are using CDN that
supports http/2.0, this header will be sent to CDN server.

This commit can not really benifit from that feature since nodejs
doesn't support multiple headers with the same header name yet, see:

 - nodejs/node#3591
 - expressjs/express#2845
PeterDaveHello added a commit to PeterDaveHello/node that referenced this issue May 18, 2016
Let http server support response multiple same-named header as RFC2612
defined, this feature is also used in http/2.0 "server push" feature.

Fix nodejs#3591
@stevenvachon
Copy link

@refack please reopen.

@ljharb
Copy link
Member

ljharb commented Mar 8, 2021

It seems like the solution here is to either allow multiple "same name" headers, or, fail loudly when the user tries to send them.

If anyone is interested in solving this, the two things that would likely be helpful are:

  1. detailing the exact desired solution in a comment here (if different from my summary, or if adding important missing details)
  2. sending a PR implementing one or all of those solutions.

@cressie176
Copy link

cressie176 commented Mar 8, 2021

It seems like the solution here is to either allow multiple "same name" headers, or, fail loudly when the user tries to send them.

Failing loudly would be a breaking change, and also isn't really a solution since sending multiple headers with the same name is supported by the HTTP specification and required by certain clients. Failing loudly in response to a valid action could also result in someone posting another issue, and we'd be back to square one. Therefore I suggest the only actual solution is to allow multiple headers with the same name.

@danielb2
Copy link
Author

danielb2 commented Mar 8, 2021

Therefore I suggest the only actual solution is to allow multiple headers with the same name.

I don't agree with continuing doing something bad simply because fixing it would break things.

Can we start with a warning "depreciated" message ? Deprecated: Multiple header names are being passed <stack trace>. This will be deprecated as of version X

@cressie176
Copy link

cressie176 commented Mar 8, 2021

@danielb2

Maybe I've misunderstood something, or not explained myself clearly enough.

As I understand things, what node does now is bad, because instead of honouring multiple headers with the same name (which is allowed in the HTTP specification and necessary in some circumstance) it squashes them.

Changing this to instead of honouring multiple headers with the same name, node reports an error, is still bad and arguably even worse.

@danielb2
Copy link
Author

danielb2 commented Mar 9, 2021

You mean it's allowed because it's not expressly prohibited? Under what circumstances would it be necessary? (Aside from a mistake)

@bricss
Copy link

bricss commented Mar 9, 2021

In current situation 💡, it would make sense to add Headers into Request and Response and gently deprecate ⛔ old methods, or even leave them as is for backward compatibility with few notes 📝

@cressie176
Copy link

cressie176 commented Mar 9, 2021

You mean it's allowed because it's not expressly prohibited?

@danielb2 The title of this issue is somewhat confusing. It talks about multiple headers of the "same name", but does not differentiate between requests and responses. From my understanding there are different problems with Node's implementation of both, however I am only talking about responses. Section 4.2 of the HTTP specification explicitly allows multiple headers with the same name to be sent, providing their values can be combined into a comma separated list.

Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)]. It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma. The order in which header fields with the same field-name are received is therefore significant to the interpretation of the combined field value, and thus a proxy MUST NOT change the order of these field values when a message is forwarded.

Node does not currently support adding multiple headers to a response. The closest you can get is to call response.setHeader with an array, which will still result a single header. Therefore Node is not fully compatible with the HTTP specification.

Under what circumstances would it be necessary? (Aside from a mistake)

Google's X-Robots-Tag specification depends on being able to send multiple X-Robots-Tag headers on the response (in a violation of the HTTP specification since they cannot be combined). For example...

X-Robots-Tag: googlebot: noindex
X-Robots-Tag: bingbot: noindex, nofollow
X-Robots-Tag: noimageindex

The above headers:

  1. Instruct googlebot not to index the page, but allow it to follow links to other pages
  2. Instruct bingbot not to index the page or follow links to other pages
  3. Instruct all bots not to index images

Combining this into a single header would result in:

X-Robots-Tag: googlebot: noindex, bingbot: noindex, nofollow, noimageindex

The X-Robots-Tag specification is silent of whether this syntax would even work, but at best the noimageindex directive would only apply to bingbot rather than all robots, thus changing the semantic meaning of the X-Robots-Tag header.

Others have cited the following examples where the need to be able to send multiple headers with the same name as part of the HTTP response.

It is not unreasonable to suspect there may be more examples which have not yet been discovered or reported.

@cressie176
Copy link

cressie176 commented Mar 9, 2021

In current situation 💡, it would make sense to add Headers into Request and Response and gently deprecate ⛔ old methods, or even leave them as is for backward compatibility with few notes 📝

Unfortunately this still wouldn't work where multiple response headers with the same name are necessary since according to the append will combine values, rather than writing multiple response headers, which is necessary in some situations.

The difference between set() and append() is that if the specified header already exists and accepts multiple values, set() will overwrite the existing value with the new one, whereas append() will append the new value onto the end of the set of values.

@bricss
Copy link

bricss commented Mar 9, 2021

@cressie176 what about semicolon or newline delimiter '\n' as an option to set multiple values for the same header? 🤔

@cressie176
Copy link

@cressie176 what about semicolon or newline delimiter '\n' as an option to set multiple values for the same header? 🤔

Hi @bricss, I don't see anything in the HTTP specification about semicolons or new line delimiters having special meaning. I assume that they would be treated as part of the field value, rather than indicating multiple headers.

@PeterDaveHello
Copy link
Member

It seems like the solution here is to either allow multiple "same name" headers, or, fail loudly when the user tries to send them.

If anyone is interested in solving this, the two things that would likely be helpful are:

1. detailing the exact desired solution in a comment here (if different from my summary, or if adding important missing details)

2. sending a PR implementing one or all of those solutions.

@ljharb what do you think about this PR: #6865 😅 I know it's old

@rajeevnaikte
Copy link

rajeevnaikte commented Jan 21, 2022

Construct header using req.rawHeaders

// Server code
http.createServer(function (req, res) {
      const rqHeaders = req.rawHeaders.reduce((map, header, i, rawHeaders) =>
      {
        if (i % 2 === 0) return map;

        const headerName = rawHeaders[i-1];
        const headerValue = rawHeaders[i];
        const prevHeaderValue = map[headerName];
        if (prevHeaderValue)
        {
          if (Array.isArray(prevHeaderValue)) prevHeaderValue.push(headerValue);
          else map[headerName] = [prevHeaderValue, headerValue];
        }
        else map[headerName] = headerValue;
        return map;
      }, {} as any);

      console.log(rqHeaders); // This prints - { 'X-Robots-Tag': ['googlebot: noindex', 'bingbot: noindex, nofollow', 'noimageindex'] }
})
  .listen(0);


// Client code
const rq = http.request({ // options }, res =>
{
      rq.setHeader('X-Robots-Tag', ['googlebot: noindex', 'bingbot: noindex, nofollow', 'noimageindex']);

      rq.write('Hello World!');
      rq.end();
});

@ShogunPanda
Copy link
Contributor

ShogunPanda commented May 23, 2022

Relative PR has just landed in master. Closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http Issues or PRs related to the http subsystem. stalled Issues and PRs that are stalled.
Projects
None yet
Development

Successfully merging a pull request may close this issue.