-
Notifications
You must be signed in to change notification settings - Fork 321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Preserve header casing as specified #576
Conversation
697e29b
to
73bf831
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach looks good to me. Curious what @ixti thinks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please change #add api to minimize breaking changes. We can also make it fully backward compatible by adding normalize keyword with true as default.
@@ -49,16 +53,30 @@ def delete(name) | |||
# @param [Array<#to_s>, #to_s] value header value(s) to be appended | |||
# @return [void] | |||
def add(name, value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to introduce optional keyword:
def add(name, value, normalize: nil)
Naturally, when normalize is false, no normalization should be done for the wire. When true - force normalization, and when nil - type dependent normalization.
when Symbol | ||
lookup_name | ||
else | ||
raise HTTP::HeaderError, "HTTP header must be a String or Symbol: #{name.inspect}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should raise only if key is no ot responding to to_s and not a Symbol. Something like:
if name.is_a? Symbol
# ...
elsif name.respond_to? :to_s
# ...
else
raise
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how that is helpful. Almost everything in ruby responds to to_s
(unless it derives from BasicObject
to explicitly opt out of default behavior).
If you want to allow more than just String
(not sure what the use case is), we could support anything that responds to to_str
, which is used to support implicit conversion to strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I agree. Then we need to update api doc to reflect this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to want to keep my current strict implementation (only allow String or Symbol), or do you want to use the more relaxed "anything that supports an implicit conversion to string (ie. responds to #to_str
)"?
I think being strict makes sense here, especially since we have different behavior based on the data type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with your way. Strict Symbol or String.
Huge thanks for the PR. I'm not against breaking api, but would like add signature to allow predictable behaviour when normalize dwarf passed |
Thanks for the feedback. I don’t completely understand what you want to enable. Can you write me a test that would fail with the existing code, that you want to pass? I think the most “predictable” behavior is to not do any manipulation of the strings provided by the user. I do understand that is a breaking change if someone is currently specifying a “content_type” and expecting it to go out on the wire as “Content-Type”. But for the vast number of new users and new code being written, I think changing the name out from under them is more surprising. |
I understand the normalize keyword, but not exactly why it is helpful. In one place you suggest the default is nil (current type based switching), but then another place you suggest it should be true. That suggests the default behavior would be to continue changing a specified name into something else on the wire. That seems like more surprises for future users. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to help clarify my reluctance toward the normalize
keyword:
If the default is going to be nil
, then the only place a user would set it explicitly to true
is when they want to maintain the backwards compatible behavior. But in those cases, since they are touching the callsites anyway, wouldn't it make sense for them to change the name string they are specifying so it is in the normalized form they want?
If the default is going to be true
, you are continuing the existing behavior of ignoring the casing (and more importantly, _
over -
) of the name as specified by the user. You would be forcing everyone writing new code to add an additional keyword just to say "don't mess with my strings! You see that header name I gave you? I really meant it - keep it that way". That feels odd to ask the user to do extra work, to stop the library from getting in their way.
I understand the pain of introducing a breaking change. One option, that I don't love, would be to allow a global configuration option that the user could opt-in to preserve the existing auto-convert behavior.
HTTP.always_normalize_header_names = true` # default false
I don't like the global state aspect of that solution. Or the fact that it perpetuates the behavior forever. It might be better if it reported a noisy deprecation warning (to stderr) whenever it does a conversion of a String. That would call attention to the callsites, so that they can be fixed. With the idea that the next major version would remove the configuration option and the old behavior.
73bf831
to
7b7c921
Compare
@joshuaflanagan honestly I'm not sure what I want. :D I want the API to be as predictable as possible. Please give me a bit of time to think. Would like to play around with your PR first. |
Any further thoughts on this? Is the primary concern that it includes a breaking change? I think the only breaking change is to existing code that adds a request header with a name, specified as a String, that includes an underscore. With the new code, that will go out on the wire as an underscore ( Technically, it could also be considered a breaking change that the casing is no longer changed from what the user specified. Header names are not supposed to be case-sensitive, so servers shouldn't care that header names are no longer being converted to "capitalized" case. However, we know that not all servers are RFC compliant and some do care about case (which is, in fact, the motivation behind this PR). So it is possible that this change in casing will break some existing code, if the server someone is communicating with expects a specific casing and the user is not currently passing the correct casing. I do appreciate trying to avoid breaking changes, but in this case, I think any changes forced on the user will be toward making their code more correct. If this is released as part of a major version upgrade, these types of breaking changes can be expected. |
After giving it more thoughts I guess I'm fine to merge it. But please, fix the API doc of |
The original behavior was to normalize all header names so that they were broken up into words, delimited by `-` or `_`, capitalize each word, and then join the words together with a `-`. This made it impossible to make a request with an underscore in the header name, or with a different casing (ex: all caps). However, the normalized name made it possible to access (or delete) headers, without having to know the exact casing. The new behavior is based on the following rules (as specified in httprb#524 (comment)) 1) Fail if a header name is not specified as a String or Symbol 2) If the header name is specified as a Symbol, normalize it when writing it in a request. If the header name is specified as a String, preserve it as-is when writing it in a request. 3) Allow lookup of any header using the normalized form of the name I implemented this behavior by storing three elements for each header value: 1) normalized header name 2) header name as it will be written in a request 3) header value Element 2 is the new addition. I considered just storing the header value as it would be written, and only doing normalization during lookup, but it seemed wasteful to potentially normalize the same value over and over when searching through the list for various lookups. This way we only normalize each name once, and can continue to use that value for lookups. However, whenever asked for the contents (ex: via `each` or `keys`) we return the new, non-normalized name. Fixes: httprb#524
7b7c921
to
e92cd44
Compare
I've updated the API doc to reflect the current strict implementation that only accepts a string or symbol as the header name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes things better than they were, thus OK to merge it.
Hey folks, we ran into this today with a partner using non-standard api headers. I noticed it still hasn't been released. Is there an ETA? |
See the release tracking milestone here: https://github.com/httprb/http/milestone/11 There are some showstopper bugs which would be good to resolve before cutting another release. We would appreciate help from anyone interested in getting it over the line. |
For anyone reading this later on. This is a breaking change. See: #700 |
The original behavior was to normalize all header names so that they
were broken up into words, delimited by
-
or_
, capitalize each word,and then join the words together with a
-
.This made it impossible to make a request with an underscore (
_
) in the headername, or with a different casing (ex: all caps).
However, the normalized name made it possible to access (or delete) headers,
without having to know the exact casing.
The new behavior is based on the following rules (as specified in #524 (comment))
If the header name is specified as a String, preserve it as-is when writing it in a request.
I implemented this behavior by storing three elements for each header value:
Element 2 is the new addition. I considered just storing the header value
as it would be written, and only doing normalization during lookup, but
it seemed wasteful to potentially normalize the same value over and over
when searching through the list for various lookups. This way we only
normalize each name once, and can continue to use that value for lookups.
However, whenever asked for the contents (ex: via
each
orkeys
) wereturn the new, non-normalized name.
Fixes: #524