Header normalization rules #45

Closed
martinthomson opened this issue Jan 12, 2018 · 17 comments

Comments

@martinthomson
Contributor

From httpwg/http-extensions#282 (see that for more context).

Are the following header fields considered equal or equivalent or are they different?

Header: value; p=pval
Header: value; p="pval"
Header: value; p="Pval"
Header: value; P="Pval"

Some specializations of the header syntax permit both quoted-string and token for parameter values. If the string value after removal of quotes is the same as the token, is it the same value?

What about case folding for parameter names? Do we do that?
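To make the question concrete, here is a minimal Python sketch that counts how many distinct values the four example fields collapse to under different comparison policies. The helper names (`unquote`, `comparison_key`, `distinct`) and the simplified single-parameter parsing are assumptions made for this illustration; none of them comes from the spec.

```python
def unquote(value):
    """Remove surrounding double quotes and undo backslash escapes."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        out = []
        chars = iter(value[1:-1])
        for ch in chars:
            if ch == "\\":
                ch = next(chars, "")  # keep the escaped character itself
            out.append(ch)
        return "".join(out)
    return value


def comparison_key(field_value, fold_name=False, strip_quotes=False, fold_value=False):
    """Build a comparison key for 'value; name=pval' under a chosen policy."""
    main, _, param = field_value.partition(";")
    name, _, raw = param.strip().partition("=")
    name, raw = name.strip(), raw.strip()
    if fold_name:
        name = name.lower()
    if strip_quotes:
        raw = unquote(raw)
    if fold_value:
        raw = raw.lower()
    return (main.strip(), name, raw)


# The four field values from the question (field name omitted).
HEADERS = ['value; p=pval', 'value; p="pval"', 'value; p="Pval"', 'value; P="Pval"']


def distinct(**policy):
    """Number of distinct comparison keys among HEADERS under a policy."""
    return len({comparison_key(h, **policy) for h in HEADERS})


print(distinct())                                                     # 4
print(distinct(strip_quotes=True))                                    # 3
print(distinct(strip_quotes=True, fold_name=True))                    # 2
print(distinct(strip_quotes=True, fold_name=True, fold_value=True))   # 1
```

Each policy switch merges more of the four forms, which is why the answer depends on which normalization steps a given header field definition actually adopts.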

@mnot mnot added the semantics label Jun 29, 2018
@mnot
Member

mnot commented Jun 29, 2018

That syntax is defined here.

That says:

The type, subtype, and parameter name tokens are case-insensitive. Parameter values might or might not be case-sensitive, depending on the semantics of the parameter name. The presence or absence of a parameter might be significant to the processing of a media-type, depending on its definition within the media type registry.

A parameter value that matches the token production can be transmitted either as a token or within a quoted-string. The quoted and unquoted values are equivalent. For example, the following examples are all equivalent, but the first is preferred for consistency:

  text/html;charset=utf-8
  text/html;charset=UTF-8
  Text/HTML;Charset="utf-8"
  text/html; charset="utf-8"

So, this pretty directly covers the question, I think. The only thing I see is that this is all defined as part of media types; it would be nice if it were pulled out into a separate section about generic syntax (perhaps alongside the #rule).
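As a rough illustration of the quoted rules, the following Python sketch normalizes a media type so that the four examples compare equal. It is a simplification under stated assumptions: the function name `normalize_media_type` and the `CASE_INSENSITIVE_VALUES` set are hypothetical, only `charset` is treated as having a case-insensitive value, and splitting on `;` would mishandle a quoted-string that itself contains a semicolon.

```python
import re

# Assumption for this sketch: only these parameters have case-insensitive values.
CASE_INSENSITIVE_VALUES = {"charset"}


def normalize_media_type(value):
    """Return (type/subtype, parameters) in a canonical, comparable form."""
    parts = [p.strip() for p in value.split(";")]
    media_type = parts[0].lower()            # type and subtype are case-insensitive
    params = []
    for part in parts[1:]:
        if not part:
            continue
        name, _, raw = part.partition("=")
        name = name.strip().lower()          # parameter names are case-insensitive
        raw = raw.strip()
        if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
            # quoted and unquoted forms are equivalent: drop quotes, undo escapes
            raw = re.sub(r"\\(.)", r"\1", raw[1:-1])
        if name in CASE_INSENSITIVE_VALUES:
            raw = raw.lower()                # value case depends on the parameter
        params.append((name, raw))
    return media_type, tuple(params)


examples = [
    "text/html;charset=utf-8",
    "text/html;charset=UTF-8",
    'Text/HTML;Charset="utf-8"',
    'text/html; charset="utf-8"',
]
print(len({normalize_media_type(e) for e in examples}))  # 1
```

Keeping the per-parameter case rule in a lookup table reflects the quoted text: whether a value is case-sensitive depends on the semantics of the parameter name, so it cannot be decided by the generic syntax alone.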

@reschke
Contributor

reschke commented Jun 29, 2018

That assumes that the rules are the same for Content-Type and "similar" header fields. Can we claim that? I don't think so.

@mnot
Member

mnot commented Jun 29, 2018

If another header references the ABNF for parameter, I think they're buying into that handling.

@reschke
Contributor

reschke commented Jun 29, 2018

Ack.

So I think the only open non-editorial question is whether parameter names are case-insensitive. The ABNF doesn't say so, and I don't believe we can just start claiming they are, even if that's the case for Content-Type.

Isn't this stuff for SH?

@mnot
Member

mnot commented Jun 29, 2018

It is already defined:

The type, subtype, and parameter name tokens are case-insensitive.

SH wants to define things very tightly yes, but there are already existing headers that point at this ABNF. It would be good to clarify its role for them.

@reschke
Contributor

reschke commented Jun 29, 2018

Well. For Content-Type.

If you just reference the ABNF for parameter, you do not automatically inherit these additional constraints.

I'm all for making this re-usable, but then we need to instruct people how to use it.

There are other questions that would come up, like repeating parameter names...
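To show why repeated names matter, here is a small Python sketch that reports parameter names occurring more than once. The function name `repeated_param_names` is hypothetical, the parsing ignores quoted-strings containing `;`, and folding names to lowercase before comparing is itself an assumption, since that is the point under discussion. The sketch only detects repeats; it does not say what a recipient should do with them.

```python
from collections import Counter


def repeated_param_names(field_value):
    """Return parameter names that occur more than once, after case folding."""
    params = field_value.split(";")[1:]      # skip the leading value
    names = [p.partition("=")[0].strip().lower() for p in params if p.strip()]
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


print(repeated_param_names("value; p=a; P=b; q=c"))  # ['p']
```

A parser that simply stores parameters in a dictionary would silently keep only one of the repeated values, so the case-folding decision also determines which parameters count as repeats.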

@mnot
Member

mnot commented Jun 29, 2018

I'm going to disagree there; if someone references parameter and the text right above it says "parameter names are case-insensitive", that's part of the package.

Repeating parameter names is indeed a separate issue. Raise it?

@reschke
Contributor

reschke commented Jun 29, 2018

Aren't we duplicating work that is supposed to happen in SH here?

@mnot
Member

mnot commented Jun 29, 2018

Again, SH is for new headers, not for existing ones.

If you're arguing that it's not important to clarify this for existing headers, we should talk about that.

@reschke
Contributor

reschke commented Jun 29, 2018

I'm just not convinced that there's something we can clarify, for some value of "clarify".

Can we discuss a concrete example?

@mnot
Member

mnot commented Jul 1, 2018

I think the proposal on the table is moving the text referenced above into its own section, to recognise that it's syntax (and processing) that's used by more than just media types.

@mnot mnot added the discuss label Oct 10, 2018
@annevk
Contributor

annevk commented Oct 15, 2018

Note that in practice browsers do not use this production to parse Content-Type. See also #39. So we probably need to exhaustively test headers using these productions to figure out what's what.

@royfielding
Member

I agree that it should probably be its own section on parsing header fields based on the rather painful RFC822-inspired syntax.

@mcmanus

mcmanus commented Nov 6, 2018

In BKK (the IETF 103 meeting in Bangkok): support for a clarifying document.

@mnot mnot removed the discuss label Nov 12, 2018
mnot added a commit that referenced this issue Jan 8, 2019
Fixes #45.

This places it as a subsection of 1.2 Syntax Notation; the intent is to also move 11. ABNF List Extension: #rule and 4.3. Whitespace there, along with any other truly generic syntax.
@mnot
Member

mnot commented Jan 20, 2019

Hmm, I somehow missed 4.2.3. Header Field Value Components, which seems to be a good place for this sort of thing. Is it more appropriate there, or should they move down to the new section?

It feels like there should be subsections, wherever these things end up.

@mnot mnot self-assigned this Feb 26, 2019
@mnot
Member

mnot commented Feb 26, 2019

OK, moved into a subsection of 4.2.3.

@royfielding
Member

I updated the PR to move the real examples back to media type and forward reference them from parameters. I also reduced some of the redundant text encircling them.
