Are headers always defined with ABNF? #74

mnot · 2018-06-04T12:02:23Z

https://tools.ietf.org/html/rfc7230#section-3.2.2 says:

A sender MUST NOT generate multiple header fields with the same field name in a message unless either the entire field value for that header field is defined as a comma-separated list [i.e., #(values)] or the header field is a well-known exception (as noted below).

This ties sender behaviour to how the header field is specified; do we require headers to be defined using ABNF?

If so, we should say so explicitly. If not, we should clarify this requirement.

Furthermore, Section 7 ties handling of specific syntax (e.g., empty lines) to the #-rule. If headers can be defined without ABNF, we should clarify whether that behaviour is required in general, or specific to #-specified headers.

See also httpwg/http-extensions#596

reschke · 2018-06-04T12:07:12Z

my 2 cents: I can see header fields not being defined by ABNF (but why?). But even in that case, the core protocol semantics should remain wrt list handling (such as the fact that empty elements get ignored) - anything else would be an incompatible change, no?
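A minimal Python sketch of the list handling referred to here, assuming a field whose elements are plain tokens, so that a bare comma can only be a separator. The name `parse_token_list` is made up for illustration and is not taken from any specification or library.

```python
from typing import List


def parse_token_list(field_value: str) -> List[str]:
    """Split a comma-separated field value and drop empty elements.

    Assumption: the elements are simple tokens that cannot themselves
    contain commas. Fields with richer element syntax need their own parser.
    """
    # Strip optional whitespace (space and horizontal tab) around each element.
    elements = (part.strip(" \t") for part in field_value.split(","))
    # Empty elements, e.g. from "a,,b" or a trailing comma, are ignored.
    return [element for element in elements if element]


print(parse_token_list("a, , b,,c ,"))  # ['a', 'b', 'c']
```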

royfielding · 2018-06-05T21:42:57Z

I don't read that as requiring it be defined using ABNF. It is requiring that its field-value definition be equivalent to a comma-separated list that matches the grammar rule #(values) in ABNF. This is required because recipients will extract and combine fields without/before knowing their definition.

Perhaps we could just say that. shrug
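To illustrate why the definition has to tolerate this, here is a rough Python sketch of a recipient that merges repeated fields without consulting their definitions. The function name `combine_fields` and the `(name, value)` pair representation are assumptions made for this example only.

```python
from typing import Dict, List, Tuple


def combine_fields(fields: List[Tuple[str, str]]) -> Dict[str, str]:
    """Merge same-named fields into one comma-separated value, in order.

    Assumption: field names are compared case-insensitively, and the
    recipient knows nothing about any individual field's syntax.
    """
    combined: Dict[str, str] = {}
    for name, value in fields:
        key = name.lower()
        if key in combined:
            # Append later instances after a comma, preserving their order.
            combined[key] = combined[key] + ", " + value
        else:
            combined[key] = value
    return combined


message = [("Accept", "text/html"), ("Host", "example.com"), ("accept", "text/plain")]
print(combine_fields(message))
# {'accept': 'text/html, text/plain', 'host': 'example.com'}
```

A field whose single value may legitimately contain a comma would be corrupted by this step, which is the case the sender requirement is trying to prevent.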

mnot · 2018-06-29T04:23:54Z

AIUI Julian is saying that there's something more here; if your header can be combined, it has to be combined using exactly the rules defined in section 7 (e.g., regarding how whitespace is handled, multiple headers, etc.).

If I define a header field that allows comma separation, I think it's OK to do so without ABNF, and I think it can define its own handling for exactly how that combination should happen -- keeping in mind that upstream handlers might combine it in the specified fashion.

Proposal:

Replace the text above with:

A sender MUST NOT generate multiple header fields with the same field name in a message unless that field's definition allows this; e.g., that header field is defined as a comma-separated list [in ABNF, #(values)], or the header field is a well-known exception (as noted below).

It would also be great if we could move some of the logic for combining headers out of the #rule (section 11 currently) and into somewhere around 4.2. Field Values.

reschke · 2018-07-18T18:13:37Z

See slightly related issue #7

mnot · 2018-07-18T18:15:35Z

Discussed in Montreal; Roy suggested that we specify senders can only generate headers with multiple instances where the field's syntax allows recombining with commas.

Also make explicit that recipients can blindly combine multiple instances.

wtarreau · 2018-07-18T19:48:42Z

"Also make explicit that recipients can blindly combine multiple instances"
... provided that the field is not a well-known exception (i.e. set-cookie).

And conversely, should we make it clearer that unless a field is defined as possibly containing commas in its value, a recipient may rightfully split it around commas (and strip LWS) and consider each part as an individual value for that field? This basically means that a recipient could iterate over "values" between commas in fields it doesn't know (thus with no harm) and properly deal with those it knows (date, cookie, expires, etc., which support commas in the value).

reschke · 2018-07-19T04:19:10Z

You can't split on comma without knowing the syntax of the list elements.
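A short Python sketch of what goes wrong, using a hypothetical helper `naive_split` that splits on every comma. The sample values are illustrative: an HTTP-date as used by Expires, and a parameter carrying a quoted string.

```python
from typing import List


def naive_split(field_value: str) -> List[str]:
    """Split on every comma, with no knowledge of the element syntax."""
    return [part.strip(" \t") for part in field_value.split(",")]


# An HTTP-date contains a comma after the day name, so a single
# value is wrongly broken into two "elements".
print(naive_split("Wed, 21 Oct 2015 07:28:00 GMT"))
# ['Wed', '21 Oct 2015 07:28:00 GMT']

# A comma inside a quoted string is data, not a separator.
print(naive_split('a="x,y", b'))
# ['a="x', 'y"', 'b']
```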

wtarreau · 2018-07-20T03:40:46Z

@reschke definitely, but what I mean is that if we stipulate that multiple header fields may be concatenated by default, then conversely it could be assumed that an unknown header field containing commas might be the result of a concatenation. At the very least, suggesting that passing commas in a new header field should be thought about twice because some recipients along the chain might not know how this field is formed seems like a good idea to me.

mnot · 2018-10-16T01:54:56Z

So I think the current proposal (post-Montreal) is:

A sender MUST NOT generate multiple header fields with the same field name in a message unless that field's definition allows recombining them with commas (e.g., that header field is defined as a comma-separated list [in ABNF, #(values)]), or the header field is a well-known exception (as noted below).

mnot · 2018-10-16T01:57:11Z

Slight revision:

A sender MUST NOT generate multiple header fields with the same field name in a message unless that field's definition allows recombining them with commas (e.g., that header field is defined as a comma-separated list [in ABNF, #(values)]), or the header field is a well-known exception (as noted below).

This includes cases where a sender is adding a header field to an existing message (e.g., an intermediary appending a field to a forwarded message).

Fixes #74

royfielding · 2018-11-17T01:08:11Z

I had in mind something a little more general that would include fields which are defined like Vary:

   Aside from the well-known exception noted below,
   a sender MUST NOT generate multiple header fields with the same field
   name in a message, or append a header field when a field of the same name
   already exists in the message, unless that field's definition allows multiple
   field values to be recombined as a comma-separated list [i.e., at least one
   alternative of the field's definition allows a comma-separated list, such as
   an ABNF rule of #(values)].

OTOH, I personally feel that the existing requirement is a bit misguided. It doesn't seem to help in practice.
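One way to picture the sender-side rule in the proposed text is the Python sketch below. Everything in it is an assumption for illustration: the helper `append_field`, the `(name, value)` representation, and in particular the small `LIST_VALUED` table, which is deliberately incomplete and stands in for "the field's definition allows a comma-separated list".

```python
from typing import FrozenSet, List, Tuple

# Hypothetical, incomplete table of fields defined as comma-separated lists.
LIST_VALUED: FrozenSet[str] = frozenset({"accept", "cache-control", "vary", "via"})
# The well-known exception discussed in this thread.
WELL_KNOWN_EXCEPTIONS: FrozenSet[str] = frozenset({"set-cookie"})


def append_field(
    fields: List[Tuple[str, str]],
    name: str,
    value: str,
    list_valued: FrozenSet[str] = LIST_VALUED,
) -> List[Tuple[str, str]]:
    """Return a new field list with (name, value) appended, if permitted.

    Raises ValueError when a field of the same name already exists and the
    field is neither list-valued nor a well-known exception.
    """
    key = name.lower()
    already_present = any(existing.lower() == key for existing, _ in fields)
    if already_present and key not in list_valued and key not in WELL_KNOWN_EXCEPTIONS:
        raise ValueError("refusing to add a second instance of " + name)
    return fields + [(name, value)]


forwarded = append_field([("Via", "1.1 a")], "Via", "1.1 b")
print(forwarded)  # [('Via', '1.1 a'), ('Via', '1.1 b')]
```

The sketch raises an error rather than silently dropping the field; what an implementation should actually do in that situation is not settled in this thread.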

wtarreau · 2018-11-17T06:18:18Z

Reading the example mentioning the intermediary adding the field makes me think that the main problem we're having is caused by this combination of facts:

1. field syntax is ambiguous and most people don't know it (nothing new here)
2. users configure their intermediaries to add the fields they think they need, or at least those that appear to fix an issue they're facing
3. our rules indicate that a sender MUST NOT this or that
4. the configured component has no other option but blindly executing the configured rule, most often without even checking the existing message's contents nor whether or not it violates a rule
5. the offending message gets emitted with all requests or responses and recipients have to deal with the rule's violation otherwise they're designated as the incompatible one in the room

We can't use point 3 above on users to avoid point 2, it's for developers. It's not realistic either for developers to implement an exhaustive list of permitted/not permitted header fields in their product, nor to always look up every single added field and emit "500 server error" every time the rule matches.

So we're left with a MUST NOT that affects hard-coded implementations (which is already important), but we don't help recipients recover when they face the situation, despite it being declared unlikely by the specification. For this reason I think the effort should mostly go into guidance on recovering from such violations, which in my opinion are extremely common and are the reason we're discussing this.

We already did this to recover from multiple "content-length" values, there definitely is some value in going further to generalize this to any invalid case.
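For reference, the Content-Length recovery in RFC 7230 Section 3.3.2 lets a recipient replace duplicated identical values with a single one (or reject the message). A minimal Python sketch of that idea follows; the function name `recover_content_length` and the choice to return `None` for "reject" are assumptions of this example.

```python
from typing import List, Optional


def recover_content_length(values: List[str]) -> Optional[int]:
    """Collapse duplicated Content-Length values into one length.

    `values` holds the field value of each Content-Length instance; each
    may itself be a comma-separated list such as "42, 42". Returns the
    length if every member is the same decimal string, otherwise None
    (meaning the message should be treated as invalid).
    """
    members: List[str] = []
    for value in values:
        members.extend(part.strip(" \t") for part in value.split(","))
    # Every member must be a non-empty run of ASCII digits.
    if not members or any(not (m.isascii() and m.isdigit()) for m in members):
        return None
    # Members are compared as strings, so "042" and "42" count as different.
    if len(set(members)) != 1:
        return None
    return int(members[0])


print(recover_content_length(["42", "42"]))  # 42
print(recover_content_length(["42, 42"]))    # 42
print(recover_content_length(["42", "43"]))  # None
```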

annevk · 2018-11-19T08:35:50Z

I agree with @wtarreau I think. I've put quite a bit of effort into defining parsers on the combined value for Content-Length, Content-Type, et al: whatwg/fetch#814. As indicated in other issues I'd be happy to upstream what makes sense to HTTP.

mnot · 2018-11-29T00:18:46Z

Same. I've been thinking we need to have general advice that single-value header fields should always define such error handling, and we should do an audit of the ones defined by HTTP.

That said, I still think this text is helpful. I could see modifying it to make the target of the requirement clearer, or removing the 2119 language.

mnot added the semantics label Jun 4, 2018

mnot added the discuss label Jul 4, 2018

mnot removed the discuss label Jul 25, 2018

mnot self-assigned this Oct 10, 2018

mnot mentioned this issue Oct 11, 2018

Clarify semantics of OWS in header field values #53

Closed

mnot added a commit that referenced this issue Nov 13, 2018

clarify when header field combination is allowed

4434cda

Fixes #74

mnot mentioned this issue Nov 13, 2018

clarify when header field combination is allowed #172

Merged

mnot added the has-proposal label Nov 13, 2018

mnot closed this as completed in #172 Jan 8, 2019

mnot mentioned this issue Jan 20, 2019

Audit: single value header field error handling #193

Closed

25 tasks
