Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Audit: single value header field error handling #193

Closed
10 of 25 tasks
mnot opened this issue Jan 20, 2019 · 23 comments
Closed
10 of 25 tasks

Audit: single value header field error handling #193

mnot opened this issue Jan 20, 2019 · 23 comments

Comments

@mnot
Copy link
Member

mnot commented Jan 20, 2019

As discussed in #74, we need to assure that all single-value headers we define have appropriate error handling for when more then one value is received. This could be a variety of things, including "it's not important for this header."

  • Authorization
  • Content-Language
  • Content-Length
  • Content-Location
  • Content-Range
  • Content-Type
  • Date
  • ETag
  • Expect
  • From
  • Host
  • If-Modified-Since
  • If-Range
  • If-Unmodified-Since
  • Last-Modified
  • Location
  • Max-Forwards
  • Range
  • Referer
  • Retry-After
  • Server
  • User-Agent
  • Proxy-Authorization
  • Age
  • Expires
@mnot
Copy link
Member Author

mnot commented Jan 22, 2019

A few initial thoughts:

  • We've already taken care of this for Content-Length; if there are issues therein it should be a separate bug.
  • It very obviously matters for Host, Content-Range and Content-Type. Location probably too.
  • It very obviously doesn't matter for User-Agent and Server. Referer and From are close followers.
  • Everything else is in between.

@annevk
Copy link
Contributor

annevk commented Jan 22, 2019

There's #59 on Content-Length. 😊 (I can confirm some clients at least have quite special handling for Location, but I haven't written exhaustive tests yet.)

What I'd like to see is that

Single-Value-Header: x
Single-Value-Header: y

and

Single-Value-Header: x, y

end up being treated in equivalent manner. Otherwise a naïvely-combining intermediary in front of some clients can cause drastic different results from clients that don't have such an intermediary in front of them.

@mnot
Copy link
Member Author

mnot commented Mar 14, 2019

Expect is an interesting case. In the past, it allowed multiple values, but now it only allows a single value, 100-continue. Error handing for it seems to be well-defined if wishy-washy (with a MAY reject).

@mnot
Copy link
Member Author

mnot commented Sep 4, 2019

Ticking off Expect due to #203

@mnot
Copy link
Member Author

mnot commented Sep 4, 2019

Starting to gather some data about Age here. It seems like most implementations honour the last value if there are more than one. Safari and Apache look like they consider the comma as making the Age invalid (and therefore not present).

@mnot
Copy link
Member Author

mnot commented Sep 4, 2019

Niiiiice... It appears that most implementations will pay attention to the last Age value if they appear on the same header line, but the first header line if there are two.

@royfielding
Copy link
Member

royfielding commented Sep 4, 2019

I am kind of disappointed that they don't treat the comma as a thousands indicator.

@annevk
Copy link
Contributor

annevk commented Sep 11, 2019

@mnot that's fairly consistent with other headers. The way parsers in browsers work is that they are per header-line. And it's considered risky to change those such that they effectively work on a combined value. With more tests and encouragement and maybe effectively making it impossible to have this distinction in future wire formats this can hopefully be overcome.

@mnot
Copy link
Member Author

mnot commented Oct 1, 2019

@annevk what if we specified that multiple Age headers (on different or the same line) caused the response to be uncacheable?

@mnot
Copy link
Member Author

mnot commented Oct 1, 2019

Ticking off Content-Location, Retry-After since AFAIK these aren't widely used.

@annevk
Copy link
Contributor

annevk commented Oct 1, 2019

@mnot sounds reasonable to me. Caching or not caching doesn't seem like a potential compatibility issue.

cc @MattMenke2 @ddragana @mayhemer

@MattMenke2
Copy link

MattMenke2 commented Oct 1, 2019

Chrome fails requests when we have multiple Content-Length, Content-Disposition, or Location headers, if they have different values (Content-Length, at least, resulted in too many problems when we disallowed multiple identical values). We only check for multiple distinct Content-Length values when the response is not using chunked encoding. Of those, looks like Content-Disposition hasn't been listed here yet.

I believe the main concern that led us to start blocking these was (somewhat) mitigating header injection attacks. Content-Length injection leads to response splitting, Location leads to redirect hijacking. The original motivating bug was actually on the FireFox tracker: https://bugzilla.mozilla.org/show_bug.cgi?id=655389

@MattMenke2
Copy link

MattMenke2 commented Oct 1, 2019

Also, those checks are only at the HTTP/1.x layer. I'm not sure if we currently have in place similar mitigations for H2 or H3 (The Content-Length one doesn't matter, but the other two do). I would guess we do not.

@mnot mnot self-assigned this Feb 2, 2020
@mnot
Copy link
Member Author

mnot commented Feb 3, 2020

in Host -

A server &MUST; respond with a <x:ref>400 (Bad Request)</x:ref> status code
to any HTTP/1.1 request message that lacks a Host header field and
to any request message that contains more than one Host header field
or a Host header field with an invalid field value.

Probably should be "... more than one Host header field value..."

@reschke
Copy link
Contributor

reschke commented Feb 3, 2020

field line?

@mnot
Copy link
Member Author

mnot commented Apr 28, 2020

We need a shorthand for a singleton header that has multiple values present, whether comma-separated or on separate field lines.

@mnot
Copy link
Member Author

mnot commented Apr 28, 2020

Ticked off Content-Type due to #39.

@mnot
Copy link
Member Author

mnot commented May 20, 2020

I'm starting to think that we should explicitly state that all field values are inherently list-based, because of the possibility of headers being combined. Field definitions should take this into account and define how to extract a single value from the list if they only want one.

@mnot
Copy link
Member Author

mnot commented May 20, 2020

See #369. If we go down this route, the next step would be to convert all existing singleton fields to the #rule and specify how to handle duplicates.

@royfielding @reschke thoughts?

@reschke
Copy link
Contributor

reschke commented May 20, 2020

Hm, that looks disruptive.

I'm +1 on stating how to handle multiple values, but I'm not convinced that we should change the ABNFs for that.

Are we discussing guidelines or normative text?

@mnot
Copy link
Member Author

mnot commented May 20, 2020

Guidelines, I think.

Remember that #203 already is taking us in this direction...

@royfielding
Copy link
Member

royfielding commented May 20, 2020

To be clear, the ABNF is signaling an error condition when a singleton field is received more than once. We still want that.

Guidelines to work around that error, after it is seen as an error, are fine.

@mnot
Copy link
Member Author

mnot commented Jun 9, 2020

OK, so I think we should adjust 4.4 Field Values to look like this:

A field's value can contain on or more members, separated by commas (","). Fields that allow multiple members are referred to as "list-based fields." Fields that only anticipate a single member are referred to as "singleton fields."

Because commas are used as a generic delimiter between members, they need to be treated with care if they are allowed as data within a member. This is true for both list-based and singleton fields, as a singleton field might be sent with multiple members erroneously, and a recipient needs to be able to detect this condition. Therefore, members that might contain a comma are typically enclosed in a quoted-string.

Does that work (noting that this doesn't touch the ABNF)? I'm not terribly happy with "singleton field", so if there are better suggestions for terminology, I'd be happy to replace it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

5 participants