Non capitalized HTTP headers #524

Closed
ninoseki opened this issue Feb 2, 2019 · 9 comments · Fixed by #576

Comments

@ninoseki

ninoseki commented Feb 2, 2019

Sorry for a dumb question.

I have to work with an API server which treats HTTP headers in a case-sensitive manner.
(Yep, I know that is an RFC violation)

Is there a good way to send non-capitalized HTTP headers?

@tarcieri
Member

tarcieri commented Feb 2, 2019

I believe this is a dupe of #337, but I'll let @ixti confirm that.

This gem canonicalizes both request and response headers.

We've discussed disabling the canonicalization for requests, which I think is the most straightforward solution. We could also make it configurable on a request and response basis, e.g.:

HTTP.canonicalize_headers(false)
HTTP.canonicalize_headers(:request)
HTTP.canonicalize_headers(:response)
HTTP.canonicalize_headers(:all)

...or thereabouts.

@ixti
Member

ixti commented Feb 3, 2019

Yeah, it's a dupe, and I'm going to work on this soon. I finally have an idea of how to provide a predictable API without introducing public API changes. I'll lay out the idea as code soon so that discussion can start :D But in a few words, my idea is to change the Headers class to keep header names as-is while using normalized names for header lookup:

# imagine the response was sent with headers:
# foo_bar: 123
response.headers["Foo-Bar"] # => ["123"]
response.headers["foo_bar"] # => ["123"]
response.headers.keys # => ["foo_bar"]

When passing headers, that's pretty much the only public API change that will be needed:

  1. Fail if given header name is neither String nor Symbol
  2. Pre-normalize header name if it's given as Symbol
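
A minimal sketch of that idea, assuming a hypothetical `SketchHeaders` class (not the gem's `HTTP::Headers`) and the split-and-capitalize normalization the gem used at the time. Names given as Strings are kept as-is, Symbols are normalized up front, and lookups compare normalized forms:

```ruby
# Hypothetical illustration only; not the gem's HTTP::Headers.
class SketchHeaders
  def initialize
    @pile = [] # [name_as_stored, value] pairs, in insertion order
  end

  def add(name, value)
    stored_name =
      case name
      when String then name                 # keep exactly as given
      when Symbol then normalize(name.to_s) # pre-normalize symbols
      else raise TypeError, "expected String or Symbol, got #{name.class}"
      end
    @pile << [stored_name, value.to_s]
    self
  end

  # Lookup compares normalized forms, so any spelling finds the header.
  def [](name)
    wanted = normalize(name.to_s)
    @pile.select { |stored, _| normalize(stored) == wanted }.map(&:last)
  end

  # Keys are reported exactly as they were stored.
  def keys
    @pile.map(&:first).uniq
  end

  private

  def normalize(name)
    name.split(/[\-_]/).each(&:capitalize!).join("-")
  end
end

headers = SketchHeaders.new
headers.add("foo_bar", 123)
headers["Foo-Bar"] # => ["123"]
headers["foo_bar"] # => ["123"]
headers.keys       # => ["foo_bar"]

headers.add(:x_request_id, "abc")
headers.keys       # => ["foo_bar", "X-Request-Id"]
```

This version normalizes stored names on every lookup, which keeps the sketch short but repeats work.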

@tarcieri
Member

tarcieri commented Feb 3, 2019

@ixti neat! SGTM 👍

@brasic

brasic commented Mar 11, 2019

Hi! Just came across this issue while checking to see if anyone had run into the problem we are currently dealing with. An API we need to communicate with requires that a particular header be specified with an underscore in the header name. It's otherwise case-insensitive. HTTP.rb makes it impossible to pass a header with an underscore because of this line, which transforms it to -:

normalized = name.split(/[\-_]/).each(&:capitalize!).join("-")

I would argue that the behavior of HTTP::Headers#normalize_header violates RFC 7230, which allows the _ character to be part of a header name. We can monkeypatch this for now on our side but are there any thoughts on making the canonicalization algorithm spec-compliant? I don't know of any case-insensitivity scheme that considers _ and - to be equivalent. I'd be happy to open a PR to remove _ from the split regex but that would likely require a major version bump (looking at the specs, lots of code depends on _ canonicalizing as -).
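
To make the effect of that line concrete, here is a small standalone sketch. `normalize_header_name` is a hypothetical helper name wrapping the exact expression quoted above; it is not the gem's method signature:

```ruby
# Hypothetical standalone helper wrapping the expression quoted above.
def normalize_header_name(name)
  # Split on "-" or "_", capitalize each word in place, rejoin with "-".
  name.split(/[\-_]/).each(&:capitalize!).join("-")
end

normalize_header_name("content-type") # => "Content-Type"
normalize_header_name("foo_bar")      # => "Foo-Bar"
normalize_header_name("X_API_KEY")    # => "X-Api-Key"
```

Both the underscore and the original casing are lost, which is why neither an underscored nor an all-caps header name could be sent.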

@ixti
Member

ixti commented Mar 11, 2019

@brasic yes - I'm working on refactoring HTTP::Headers completely so that it will allow passing any RFC-compliant header (with or without normalization).

@Hendrione-Moka

Any update on this? I still have the issue.

@ixti
Member

ixti commented Aug 27, 2019

I had no time to work on this yet.

@joshuaflanagan
Contributor

Just ran into this again, dealing with an API that has case-sensitive headers. That may be "wrong" on their side, but it is what it is, and switching http libraries is much easier than getting an external API to change their behavior.
Do you have an in-process branch? I'd be happy to take a stab at fleshing it out. Would rather do that than switch libraries.

@tarcieri
Member

@joshuaflanagan I think if you implemented the change @ixti suggested here it'd be accepted, and shouldn't be too difficult: #524 (comment)

joshuaflanagan added a commit to ShippingEasy/http that referenced this issue Dec 15, 2019
The original behavior was to normalize all header names so that they
were broken up into words, delimited by `-` or `_`, capitalize each word,
and then join the words together with a `-`.

This made it impossible to make a request with an underscore in the header
name, or with a different casing (ex: all caps).

However, the normalized name made it possible to access (or delete) headers,
without having to know the exact casing.

The new behavior is based on the following rules (as specified in httprb#524 (comment))

1) Fail if a header name is not specified as a String or Symbol
2) If the header name is specified as a Symbol, normalize it when writing it in a request.
If the header name is specified as a String, preserve it as-is when writing it in a request.
3) Allow lookup of any header using the normalized form of the name

I implemented this behavior by storing three elements for each header value:
1) normalized header name
2) header name as it will be written in a request
3) header value

Element 2 is the new addition. I considered just storing the header name
as it would be written, and only doing normalization during lookup, but
it seemed wasteful to potentially normalize the same value over and over
when searching through the list for various lookups. This way we only
normalize each name once, and can continue to use that value for lookups.
However, whenever asked for the contents (ex: via `each` or `keys`) we
return the new, non-normalized name.

Fixes: httprb#524
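
The three-element storage described in this commit message can be sketched roughly as follows. `TripleStore` and its method names are hypothetical, chosen for illustration, and the normalization is assumed to be the split-and-capitalize expression quoted earlier in the thread; the merged change itself is in #576:

```ruby
# Hypothetical illustration of the three-element storage; not the gem's code.
class TripleStore
  def initialize
    @pile = [] # each entry: [normalized_name, wire_name, value]
  end

  def add(name, value)
    unless name.is_a?(String) || name.is_a?(Symbol)
      raise TypeError, "expected String or Symbol, got #{name.class}"
    end

    normalized = normalize(name.to_s) # computed once, at write time
    wire_name  = name.is_a?(Symbol) ? normalized : name
    @pile << [normalized, wire_name, value]
    self
  end

  # Lookup and delete only compare the cached normalized names.
  def get(name)
    wanted = normalize(name.to_s)
    @pile.select { |normalized, _, _| normalized == wanted }.map(&:last)
  end

  def delete(name)
    wanted = normalize(name.to_s)
    @pile.reject! { |normalized, _, _| normalized == wanted }
    self
  end

  # Iteration exposes the name as it would be written in a request.
  def each
    return enum_for(:each) unless block_given?

    @pile.each { |_, wire_name, value| yield wire_name, value }
  end

  private

  def normalize(name)
    name.split(/[\-_]/).each(&:capitalize!).join("-")
  end
end

store = TripleStore.new
store.add("X_API_KEY", "secret").add(:accept, "application/json")
store.get("x-api-key") # => ["secret"]
store.each.to_a        # => [["X_API_KEY", "secret"], ["Accept", "application/json"]]
store.delete("X-Api-Key")
store.each.to_a        # => [["Accept", "application/json"]]
```

Caching the normalized name trades a little memory per header for not re-normalizing every stored name on each lookup, which is the trade-off the commit message describes.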
@ixti ixti closed this as completed in #576 Dec 19, 2019
ixti pushed a commit that referenced this issue Dec 19, 2019
@tarcieri tarcieri mentioned this issue May 13, 2021