Preserve header casing as specified #576

joshuaflanagan · 2019-12-15T01:12:39Z

The original behavior was to normalize all header names so that they
were broken up into words, delimited by - or _, capitalize each word,
and then join the words together with a -.
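
For illustration, the original normalization described above can be sketched roughly as follows. This is a minimal sketch, not the gem's actual code; the module and method names are hypothetical.

```ruby
# Hypothetical helper (not the gem's code) mirroring the description above.
module OriginalNormalizationSketch
  # Split on "-" or "_", capitalize each word, rejoin with "-".
  def self.canonicalize(name)
    name.to_s.split(/[-_]/).map(&:capitalize).join("-")
  end
end

puts OriginalNormalizationSketch.canonicalize("content_type") # Content-Type
puts OriginalNormalizationSketch.canonicalize("X_API_KEY")    # X-Api-Key
```

The second call shows the problem: an all-caps name with underscores cannot survive the conversion.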

This made it impossible to make a request with an underscore (_) in the header
name, or with a different casing (ex: all caps).

However, the normalized name made it possible to access (or delete) headers,
without having to know the exact casing.

The new behavior is based on the following rules (as specified in #524 (comment)):

1) Fail if a header name is not specified as a String or Symbol.
2) If the header name is specified as a Symbol, normalize it when writing it in a request. If the header name is specified as a String, preserve it as-is when writing it in a request.
3) Allow lookup of any header using the normalized form of the name.
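
The type-dependent rule for the name written in a request can be sketched like this. It is a simplified illustration under assumptions: `WireNameSketch`, `canonicalize`, and `wire_name` are hypothetical names, and the nested `HeaderError` stands in for `HTTP::HeaderError`.

```ruby
# Sketch of the String/Symbol rules; names here are illustrative only.
module WireNameSketch
  class HeaderError < StandardError; end # stand-in for HTTP::HeaderError

  def self.canonicalize(name)
    name.to_s.split(/[-_]/).map(&:capitalize).join("-")
  end

  # Name as it would be written in a request.
  def self.wire_name(name)
    case name
    when String then name               # preserved as-is
    when Symbol then canonicalize(name) # normalized
    else
      raise HeaderError, "HTTP header must be a String or Symbol: #{name.inspect}"
    end
  end
end

puts WireNameSketch.wire_name("x_custom_HEADER") # x_custom_HEADER
puts WireNameSketch.wire_name(:content_type)     # Content-Type
```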

I implemented this behavior by storing three elements for each header value:

1) normalized header name
2) header name as it will be written in a request
3) header value

Element 2 is the new addition. I considered just storing the header name
as it would be written, and only doing normalization during lookup, but
it seemed wasteful to potentially normalize the same value over and over
when searching through the list for various lookups. This way we only
normalize each name once, and can continue to use that value for lookups.
However, whenever asked for the contents (ex: via each or keys) we
return the new, non-normalized name.
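
A minimal sketch of the three-element storage, assuming a simple array of triples. `HeaderStoreSketch` and its method bodies are hypothetical and much smaller than the real `HTTP::Headers` class; they only illustrate normalizing once on write and reusing that value for lookups.

```ruby
# Illustrative store: each entry is [normalized name, wire name, value].
class HeaderStoreSketch
  def initialize
    @pile = []
  end

  def add(name, value)
    lookup = canonicalize(name) # computed once, reused for every lookup
    wire = case name
           when String then name
           when Symbol then lookup
           else raise ArgumentError, "header name must be a String or Symbol"
           end
    Array(value).each { |v| @pile << [lookup, wire, v.to_s] }
  end

  # Lookup works with any spelling that normalizes to the same name.
  def get(name)
    lookup = canonicalize(name)
    @pile.select { |key, _, _| key == lookup }.map { |_, _, v| v }
  end

  def delete(name)
    lookup = canonicalize(name)
    @pile.delete_if { |key, _, _| key == lookup }
  end

  # Contents are reported under the non-normalized (wire) name.
  def keys
    @pile.map { |_, wire, _| wire }.uniq
  end

  private

  def canonicalize(name)
    name.to_s.split(/[-_]/).map(&:capitalize).join("-")
  end
end

headers = HeaderStoreSketch.new
headers.add("X_API_KEY", "secret")
headers.add(:accept, "application/json")
p headers.keys             # ["X_API_KEY", "Accept"]
p headers.get("x-api-key") # ["secret"]
```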

Fixes: #524

tarcieri

This approach looks good to me. Curious what @ixti thinks.

ixti

Please change #add api to minimize breaking changes. We can also make it fully backward compatible by adding normalize keyword with true as default.

ixti · 2019-12-16T04:20:24Z

lib/http/headers.rb

@@ -49,16 +53,30 @@ def delete(name)
    # @param [Array<#to_s>, #to_s] value header value(s) to be appended
    # @return [void]
    def add(name, value)


Would be nice to introduce optional keyword:

def add(name, value, normalize: nil)

Naturally, when normalize is false, no normalization should be done for the wire. When true - force normalization, and when nil - type dependent normalization.
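
The three-way behavior of the proposed keyword could look roughly like this. This is only a sketch of the suggestion in this comment; `NormalizeKeywordSketch` and its methods are hypothetical names, and nothing here implies the keyword exists in the gem.

```ruby
# Sketch of the proposed tri-state `normalize:` keyword (illustrative only).
module NormalizeKeywordSketch
  def self.canonicalize(name)
    name.to_s.split(/[-_]/).map(&:capitalize).join("-")
  end

  def self.wire_name(name, normalize: nil)
    case normalize
    when true  then canonicalize(name) # always normalize
    when false then name.to_s          # never normalize
    when nil                           # decide by type
      name.is_a?(Symbol) ? canonicalize(name) : name.to_s
    else
      raise ArgumentError, "normalize must be true, false, or nil"
    end
  end
end

puts NormalizeKeywordSketch.wire_name("content_type")                  # content_type
puts NormalizeKeywordSketch.wire_name("content_type", normalize: true) # Content-Type
puts NormalizeKeywordSketch.wire_name(:content_type, normalize: false) # content_type
```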

ixti · 2019-12-16T04:23:04Z

lib/http/headers.rb

+                  when Symbol
+                    lookup_name
+                  else
+                    raise HTTP::HeaderError, "HTTP header must be a String or Symbol: #{name.inspect}"


I think it should raise only if the key does not respond to to_s and is not a Symbol. Something like:

if name.is_a? Symbol
  # ...
elsif name.respond_to? :to_s
  # ...
else
  raise
end

I don't see how that is helpful. Almost everything in ruby responds to to_s (unless it derives from BasicObject to explicitly opt out of default behavior).

If you want to allow more than just String (not sure what the use case is), we could support anything that responds to to_str, which is used to support implicit conversion to strings.
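
To make the `to_s` versus `to_str` distinction concrete, here is a small plain-Ruby sketch. The two classes are hypothetical and exist only for the demonstration.

```ruby
# Nearly every object has to_s; only string-like objects define to_str.
class BareSketch < BasicObject; end # opts out of Object's default methods

class ImplicitNameSketch
  def initialize(text)
    @text = text
  end

  # Defining to_str opts in to implicit conversion to String.
  def to_str
    @text
  end
end

puts 42.respond_to?(:to_s)                       # true
puts 42.respond_to?(:to_str)                     # false
puts :accept.respond_to?(:to_str)                # false
puts BareSketch.instance_methods.include?(:to_s) # false
puts "X-" + ImplicitNameSketch.new("Trace")      # X-Trace
```

Because a Symbol does not respond to `to_str`, a `to_str`-based check would still need a separate Symbol branch.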

Ok. I agree. Then we need to update api doc to reflect this change.

Do you want to keep my current strict implementation (only allow String or Symbol), or do you want to use the more relaxed "anything that supports an implicit conversion to string (i.e. responds to #to_str)"?

I think being strict makes sense here, especially since we have different behavior based on the data type.

I'm fine with your way. Strict Symbol or String.

ixti · 2019-12-16T04:28:23Z

Huge thanks for the PR. I'm not against breaking the API, but would like the add signature to allow predictable behaviour when the normalize kwarg is passed.

joshuaflanagan · 2019-12-16T14:18:55Z

Thanks for the feedback. I don’t completely understand what you want to enable. Can you write me a test that would fail with the existing code, that you want to pass?

I think the most “predictable” behavior is to not do any manipulation of the strings provided by the user. I do understand that is a breaking change if someone is currently specifying a “content_type” and expecting it to go out on the wire as “Content-Type”. But for the vast number of new users and new code being written, I think changing the name out from under them is more surprising.

joshuaflanagan · 2019-12-16T14:21:52Z

I understand the normalize keyword, but not exactly why it is helpful. In one place you suggest the default is nil (current type based switching), but then another place you suggest it should be true. That suggests the default behavior would be to continue changing a specified name into something else on the wire. That seems like more surprises for future users.

joshuaflanagan

I want to help clarify my reluctance toward the normalize keyword:

If the default is going to be nil, then the only place a user would set it explicitly to true is when they want to maintain the backwards compatible behavior. But in those cases, since they are touching the callsites anyway, wouldn't it make sense for them to change the name string they are specifying so it is in the normalized form they want?

If the default is going to be true, you are continuing the existing behavior of ignoring the casing (and more importantly, _ over -) of the name as specified by the user. You would be forcing everyone writing new code to add an additional keyword just to say "don't mess with my strings! You see that header name I gave you? I really meant it - keep it that way". That feels odd to ask the user to do extra work, to stop the library from getting in their way.

I understand the pain of introducing a breaking change. One option, that I don't love, would be to allow a global configuration option that the user could opt-in to preserve the existing auto-convert behavior.

HTTP.always_normalize_header_names = true # default false

I don't like the global state aspect of that solution. Or the fact that it perpetuates the behavior forever. It might be better if it reported a noisy deprecation warning (to stderr) whenever it does a conversion of a String. That would call attention to the callsites, so that they can be fixed. With the idea that the next major version would remove the configuration option and the old behavior.
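
A rough sketch of how such an opt-in with a deprecation warning might behave. Everything here is hypothetical: the module, the `err:` parameter, and the warning text are assumptions made for illustration, and the setting is only an idea floated in this comment.

```ruby
require "stringio"

# Illustrative only: opt-in legacy conversion with a noisy warning.
module LegacyNormalizationSketch
  class << self
    attr_accessor :always_normalize_header_names
  end
  self.always_normalize_header_names = false # default: preserve Strings

  def self.canonicalize(name)
    name.to_s.split(/[-_]/).map(&:capitalize).join("-")
  end

  def self.wire_name(name, err: $stderr)
    return canonicalize(name) if name.is_a?(Symbol)
    return name unless always_normalize_header_names

    converted = canonicalize(name)
    if converted != name
      # Call attention to the call site so it can be fixed.
      err.puts "DEPRECATION: header #{name.inspect} rewritten to #{converted.inspect}"
    end
    converted
  end
end

log = StringIO.new
LegacyNormalizationSketch.always_normalize_header_names = true
puts LegacyNormalizationSketch.wire_name("content_type", err: log) # Content-Type
puts log.string
```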

ixti · 2019-12-16T15:00:02Z

@joshuaflanagan honestly I'm not sure what I want. :D I want the API to be as predictable as possible. Please give me a bit of time to think. Would like to play around with your PR first.

joshuaflanagan · 2019-12-19T14:55:14Z

Any further thoughts on this? Is the primary concern that it includes a breaking change?

I think the only breaking change is to existing code that adds a request header with a name, specified as a String, that includes an underscore. With the new code, that will go out on the wire as an underscore (_), whereas previous code would have (incorrectly) converted it to a dash (-).

Technically, it could also be considered a breaking change that the casing is no longer changed from what the user specified. Header names are not supposed to be case-sensitive, so servers shouldn't care that header names are no longer being converted to "capitalized" case. However, we know that not all servers are RFC compliant and some do care about case (which is, in fact, the motivation behind this PR). So it is possible that this change in casing will break some existing code, if the server someone is communicating with expects a specific casing and the user is not currently passing the correct casing.
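
As a way to gauge exposure to these two cases, one could list the String header names whose on-the-wire form differs between the old and new behavior. This is a hypothetical audit helper, not part of the gem, and it assumes the old conversion is the split/capitalize/join rule from the description.

```ruby
# Hypothetical audit helper: which String names will now be sent differently?
module UpgradeAuditSketch
  def self.canonicalize(name)
    name.to_s.split(/[-_]/).map(&:capitalize).join("-")
  end

  # Symbols are still normalized, so only Strings can change on the wire.
  def self.changed_on_wire(names)
    names.select { |n| n.is_a?(String) && canonicalize(n) != n }
  end
end

names = ["Content-Type", "content_type", "X-API-KEY", :accept]
p UpgradeAuditSketch.changed_on_wire(names) # ["content_type", "X-API-KEY"]
```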

I do appreciate trying to avoid breaking changes, but in this case, I think any changes forced on the user will be toward making their code more correct. If this is released as part of a major version upgrade, these types of breaking changes can be expected.

ixti · 2019-12-19T16:01:20Z

After giving it more thought, I guess I'm fine to merge it. But please fix the API doc of Headers#add.

joshuaflanagan · 2019-12-19T18:50:14Z

I've updated the API doc to reflect the current strict implementation that only accepts a string or symbol as the header name.

ixti

I think it makes things better than they were, thus OK to merge it.

uberllama · 2021-04-13T16:39:45Z

Hey folks, we ran into this today with a partner using non-standard api headers. I noticed it still hasn't been released. Is there an ETA?

tarcieri · 2021-04-14T14:46:22Z

See the release tracking milestone here: https://github.com/httprb/http/milestone/11

There are some showstopper bugs which would be good to resolve before cutting another release. We would appreciate help from anyone interested in getting it over the line.

Linuus · 2021-11-10T14:20:15Z

For anyone reading this later on. This is a breaking change. See: #700

joshuaflanagan force-pushed the unconverted_headers branch 2 times, most recently from 697e29b to 73bf831 Compare December 15, 2019 01:18

tarcieri requested review from ixti and tarcieri December 15, 2019 16:44

tarcieri approved these changes Dec 15, 2019

ixti requested changes Dec 16, 2019

joshuaflanagan force-pushed the unconverted_headers branch from 73bf831 to 7b7c921 Compare December 16, 2019 14:59

joshuaflanagan force-pushed the unconverted_headers branch from 7b7c921 to e92cd44 Compare December 19, 2019 18:49

ixti approved these changes Dec 19, 2019

ixti merged commit 8a236fe into httprb:master Dec 19, 2019

joshuaflanagan mentioned this pull request Dec 20, 2019

header parameter name changed from underscore ('_') to hyphen ('-') postmanlabs/httpbin#435

Open

joshuaflanagan mentioned this pull request Feb 12, 2020

Headers with leading underscore being rewritten to dashes #337

Closed

unikitty37 mentioned this pull request Mar 12, 2020

Logger outputs canonicalised request headers, rather than logging what is sent #600

Open

tarcieri mentioned this pull request Jan 18, 2021

Header key with underscore being converted to a dash #641

Closed

tarcieri mentioned this pull request May 13, 2021

v5.0.0 #660

Merged

flosacca mentioned this pull request Sep 29, 2021

Cookies extraction is not working properly in 5.x #693

Closed

Linuus mentioned this pull request Nov 10, 2021

Breaking change not marked as breaking in changelog #700

Closed

zarqman mentioned this pull request Feb 14, 2022

Normalize response headers for http.rb and wget backends janko/down#68

Merged

cben mentioned this pull request Oct 31, 2022

Update HTTP gem requirement to allow version 5 ManageIQ/kubeclient#571

Merged

jeffgran-dox mentioned this pull request Dec 14, 2022

Allow http dependency v5+ doximity/oauth2c#15

Merged
