Field-name syntax #30

Closed
mnot opened this issue Jun 28, 2017 · 15 comments · Fixed by #295

Comments

@mnot
Member

mnot commented Jun 28, 2017

Header field-names are defined as tokens. This is an extremely permissive syntax, including characters that will cause confusion and likely break some senders/recipients.

Most of the special characters allowed are not in the registry or seen "in the wild." Some research would be good to substantiate their use, but a starting point might be:

"-" / "_" / "." / "+" / DIGIT / ALPHA

There are a number of strategies we could take for the transition:

  1. Like OWS / BWS, mark some characters as "do not generate" but "should consume"

  2. Disallow registration of header fields with those characters, and discourage their use in unregistered headers

  3. If we have more confidence that they're not in use, just ignore headers containing those characters.
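
To make the difference between the two character sets concrete, here is a minimal Python sketch that checks a field name against the full token syntax and against the narrower starting point listed above. The names `TOKEN_CHARS`, `PROPOSED_CHARS`, `is_token`, and `is_proposed_name` are hypothetical and chosen for illustration; they are not taken from any specification text or implementation.

```python
import string

# tchar, the character set allowed in an HTTP token (RFC 7230, Section 3.2.6).
TOKEN_CHARS = frozenset("!#$%&'*+-.^_`|~" + string.digits + string.ascii_letters)

# The narrower starting point from this comment:
# "-" / "_" / "." / "+" / DIGIT / ALPHA
PROPOSED_CHARS = frozenset("-_.+" + string.digits + string.ascii_letters)


def is_token(name: str) -> bool:
    """Return True if name is a non-empty string made only of tchar."""
    return bool(name) and all(c in TOKEN_CHARS for c in name)


def is_proposed_name(name: str) -> bool:
    """Return True if name uses only the narrower, proposed character set."""
    return bool(name) and all(c in PROPOSED_CHARS for c in name)
```

A name such as `a|b` passes `is_token` but fails `is_proposed_name`, which is exactly the class of names the three strategies above would treat differently.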

@royfielding
Member

My opinion is that the Apache httpd team would prefer to restrict the syntax in order to reduce security issues.

@mnot
Member Author

mnot commented Jul 18, 2018

Discussed in Montreal; interest in doing something, take to list.

mnot self-assigned this Oct 10, 2018
mnot added semantics and removed http-arch labels Oct 10, 2018
@annevk
Contributor

annevk commented Oct 15, 2018

I'm not sure why this is being considered. All browsers support headers such as !#$%&'*+-.^_`|~0123456789abcdefghijklmnopqrstuvwxyz. Changing that seems likely to break applications.

annevk added a commit to web-platform-tests/wpt that referenced this issue Oct 15, 2018
mnot removed their assignment Oct 16, 2018
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Oct 17, 2018
…testonly

Automatic update from web-platform-tests
XHR: test unusual header name syntax

For httpwg/http-core#30.

--

wpt-commits: b3e1de8dc92db9c4f400e622c9ff1582566d61fa
wpt-pr: 13517
@royfielding
Member

The restrictions exist to protect legacy server gateways (like CGI), which have to be far more restrictive than browsers because of the terrible idea of passing names through environment variables and command lines.

@mcmanus

mcmanus commented Nov 6, 2018

In Bangkok: no real consensus; concern about compatibility. Reconsider case by case, based on impact.

@PiotrSikora

For the record, NGINX drops request header fields whose names contain characters other than letters, digits, hyphens and, optionally, underscores (if enabled with the underscores_in_headers directive). Unfortunately, virtually anything seems to be accepted and forwarded in the response headers.
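
The request-side behaviour described here can be sketched roughly as follows. This is an illustration only, not NGINX's code: `filter_request_headers` is a hypothetical function, and its `allow_underscores` parameter merely stands in for the effect of the underscores_in_headers directive.

```python
import string

# Letters, digits, and hyphen: the characters accepted unconditionally
# in the behaviour described above.
_BASE_CHARS = frozenset(string.ascii_letters + string.digits + "-")


def filter_request_headers(headers, allow_underscores=False):
    """Drop (name, value) pairs whose name contains a disallowed character.

    allow_underscores is a hypothetical switch mirroring the effect of
    the underscores_in_headers directive.
    """
    allowed = (_BASE_CHARS | {"_"}) if allow_underscores else _BASE_CHARS
    return [
        (name, value)
        for name, value in headers
        if name and all(c in allowed for c in name)
    ]
```

Note that offending fields are silently dropped while the rest of the request is kept, which is a softer policy than rejecting the whole message.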

@wtarreau

wtarreau commented Nov 6, 2018

Since the conversation has moved from the WG list to GitHub, I'm pasting what I sent there for completeness. Right now haproxy only accepts:
"-" / "_" / "." / "+" / DIGIT / ALPHA / "!" / "#" / "$" / "%" / "&" / "'" / "*" / "^" / "`" / "|" / "~"
i.e. everything matching a token. Quite honestly, seeing any of the extra characters in this list (those after ALPHA) in a field name would look extremely suspicious to me, and I'd rather get rid of them.

As Roy mentioned, the restriction is there to avoid trouble with CGIs doing:
eval "hdr_name=$value"

Characters like backquote (`), dollar ($), or pipe (|) have long been abused to attack servers...

@wtarreau

wtarreau commented Nov 6, 2018

I also proposed something less extreme than blocking messages that contain such header field names: we could recommend simply dropping these fields by default. This will have no impact if they're there by accident. Then we can let agents decide whether to let them pass or not. That way we could distinguish between "forbidden characters" (the historical ones) and "unusual characters" (those excluded by the new, more restrictive list).
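
A minimal Python sketch of that three-way distinction might look like the following. Everything here is an assumption made for illustration: the `Verdict` enum, the `classify_field_name` function, the `pass_unusual` opt-in flag, and the choice of the starting-point list from the opening comment as the "usual" set.

```python
import string
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    DROP_FIELD = "drop field"
    REJECT_MESSAGE = "reject message"


# Everything allowed by the token syntax (tchar).
TOKEN_CHARS = frozenset("!#$%&'*+-.^_`|~" + string.digits + string.ascii_letters)

# Assumed "usual" set: the starting point from the opening comment.
USUAL_CHARS = frozenset("-_.+" + string.digits + string.ascii_letters)


def classify_field_name(name: str, pass_unusual: bool = False) -> Verdict:
    """Classify a field name as usual, unusual, or forbidden.

    Forbidden (non-token) characters reject the message; unusual
    characters only drop the field, unless the agent opts in.
    """
    if not name or any(c not in TOKEN_CHARS for c in name):
        return Verdict.REJECT_MESSAGE
    if pass_unusual or all(c in USUAL_CHARS for c in name):
        return Verdict.ACCEPT
    return Verdict.DROP_FIELD
```

With the default setting, `classify_field_name("x|y")` yields `Verdict.DROP_FIELD`, while a name containing a space or colon still rejects the whole message.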

@mnot
Member Author

mnot commented Nov 12, 2018

Discussed in Bangkok, but no consensus on an approach.

@mcmanus

mcmanus commented Mar 25, 2019

IETF 104: still seeking data on characters used in the wild. HTTP Archive was mentioned as a possible source.

mnot added needs-data and removed discuss labels Apr 12, 2019
@reschke
Contributor

reschke commented Jan 7, 2020

I'm sceptical about the "protect CGI servers" (and servers similar to those). We have 2020; shouldn't these all have their own protections by now?

@mnot
Member Author

mnot commented Feb 2, 2020

Discussed in Basel; suggestion is to document "safe" characters in prose (for now), both below the ABNF and in the new header field recommendations.

mnot self-assigned this Feb 2, 2020
mnot removed the needs-data label Feb 3, 2020
mnot added a commit that referenced this issue Feb 3, 2020
mnot closed this as completed in #295 Feb 4, 2020
reschke added a commit that referenced this issue Feb 4, 2020
@PiotrSikora

> I'm sceptical about the "protect CGI servers" (and servers similar to those). We have 2020; shouldn't these all have their own protections by now?

Those CGI servers cannot protect themselves. The issue is that hyphens are converted to underscores when header names are passed to CGI, so both Content-Length and Content_Length become CONTENT_LENGTH at the protocol level. The CGI server cannot tell which HTTP header a value originated from, since that information is lost.
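
The collision can be reproduced with a small Python sketch of the CGI name mapping (per RFC 3875, header names are upper-cased, "-" becomes "_", and most fields get an HTTP_ prefix). The function names `cgi_meta_variable` and `find_collisions` are hypothetical and used only for illustration.

```python
def cgi_meta_variable(field_name: str) -> str:
    """Map an HTTP field name to its CGI meta-variable name."""
    name = field_name.upper().replace("-", "_")
    # Content-Length and Content-Type have dedicated meta-variables;
    # other fields are exposed with an HTTP_ prefix.
    if name in ("CONTENT_LENGTH", "CONTENT_TYPE"):
        return name
    return "HTTP_" + name


def find_collisions(field_names):
    """Group field names that end up in the same meta-variable."""
    by_variable = {}
    for field_name in field_names:
        by_variable.setdefault(cgi_meta_variable(field_name), []).append(field_name)
    return {var: names for var, names in by_variable.items() if len(names) > 1}
```

Calling `find_collisions(["Content-Length", "Content_Length", "Host"])` groups the first two names under `CONTENT_LENGTH`, showing that the script behind the gateway has no way to recover which spelling was sent.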

@mnot
Member Author

mnot commented Feb 4, 2020

Hey Piotr,

We chatted about that here and tend to agree -- will remove underscore and see how that goes down.

mnot added a commit that referenced this issue Feb 4, 2020
@royfielding
Member

Well, to be clear, it is the HTTP server that invokes CGI that is creating the environment variables from the received header fields. That server is responsible for avoiding bad transformations, such as the underscore problem above. However, since we are just talking about the characters that are good practice, we can exclude underscores since they are not commonly used in field names.
