Header normalization (yeah, again) #145

Closed
meh opened this Issue Sep 28, 2011 · 7 comments

meh commented Sep 28, 2011

I know this has already been discussed, but I want to bring it up again because it's causing me needless trouble.

Why is the normalization done like this? The standard says that header names are case-insensitive, but changing - to _ tends to break pretty much everything. I sincerely think headers should be left untouched, with #[] simply calling #downcase on the key, and that NAME_OF_HEADER should be avoided because it's non-standard. In my proxy it breaks everything, and it leaves me renormalizing every header on every request for no good reason.

If I'm seeing this in the wrong way, please explain.

Quoting the HTTP RFC

Each header field consists of a name followed by a colon (":") and the field value. Applications ought to follow "common form", where one is known or indicated, when generating HTTP constructs, since there might exist some implementations that fail to accept anything beyond the common forms.

Making it accessible with NAME_OF_HEADER is ok, but changing the names to that form is really non-standard in my opinion.
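The proposal above (keep the names as received, make only the lookup forgiving) can be sketched roughly as follows. This is a minimal illustration, not em-http-request code: `RawHeaders` and `lookup_key` are hypothetical names, and folding `_` into `-` for lookups is an assumption made so that the NAME_OF_HEADER spelling still resolves.

```ruby
# Hypothetical illustration; RawHeaders is not part of em-http-request.
class RawHeaders
  def initialize
    @entries = {} # lookup key => [original name, value]
  end

  # Store the value under the name exactly as it was received.
  def []=(name, value)
    @entries[lookup_key(name)] = [name, value]
  end

  # Case-insensitive read; also accepts the NAME_OF_HEADER spelling.
  def [](name)
    entry = @entries[lookup_key(name)]
    entry && entry[1]
  end

  # Yield each header with its original, untouched name.
  def each
    @entries.each_value { |name, value| yield name, value }
  end

  private

  # Assumption: "_" and "-" are treated as equivalent for lookups only,
  # so two names that differ only in that character would collide.
  def lookup_key(name)
    name.to_s.downcase.tr('_', '-')
  end
end

headers = RawHeaders.new
headers['X-Powered-By'] = 'PHP/5.2.6-1+lenny13'
headers['Content-Type'] = 'text/html; charset=UTF-8'
```

With something like this, a proxy could iterate with `each` and forward the names unchanged, while `headers['content-type']` and `headers['CONTENT_TYPE']` would both still find the value.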

Thanks for the time.

Owner

The conversion is to follow the Rack specification: http://rack.rubyforge.org/doc/SPEC.html

As of very recently, this should also be transparent: 72eeebc

You can use both variants, and they'll "just work". To get that behavior, build from master.
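To illustrate what "both variants just work" could mean, here is a small sketch in which keys are stored in the normalized form and every lookup is normalized the same way. It is an assumption about the general technique, not the code from the commit mentioned above; `NormalizedHeaders` and `normalize` are hypothetical names.

```ruby
# Hypothetical sketch; NormalizedHeaders is an assumed name, not the library's class.
class NormalizedHeaders < Hash
  # "Content-Type" and "content_type" both become "CONTENT_TYPE".
  def self.normalize(name)
    name.to_s.upcase.tr('-', '_')
  end

  # Writes always store the normalized key.
  def []=(name, value)
    super(self.class.normalize(name), value)
  end

  # Reads normalize the requested key first, so either spelling matches.
  def [](name)
    super(self.class.normalize(name))
  end
end

response = NormalizedHeaders.new
response['Content-Type'] = 'text/html; charset=UTF-8'
```

Here `response['Content-Type']` and `response['CONTENT_TYPE']` return the same value, but iterating over the hash still yields only the normalized key, so the original spelling is not available to code that forwards headers.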

igrigorik closed this Sep 28, 2011

meh commented Sep 28, 2011

The thing is that I'm not accessing them. I'm sending the headers back, and I get borked names. It would be awesome if there were a "raw headers" Hash; it would really improve my proxy.

Owner

Ah, you can grab the raw headers by providing an on_headers callback to the connection. That'll give you the raw hash of headers as soon as they arrive on the wire.

meh commented Sep 28, 2011

Sorry for the noobness, but how do I provide an on_headers callback?

An example of this in the docs would be useful (and if there's already one, I didn't find it, mea culpa).

Owner

    req = HttpRequest.new("http://google.com").get
    req.headers { |hash| ... }

meh commented Sep 29, 2011

That hash has normalized headers here, using the last version from git.

{"DATE"=>"Thu, 29 Sep 2011 04:17:50 GMT", "SERVER"=>"Apache", "X_POWERED_BY"=>"PHP/5.2.6-1+lenny13", "CONTENT_LANGUAGE"=>"en", "VARY"=>"Accept-Encoding,Cookie", "EXPIRES"=>"Thu, 01 Jan 1970 00:00:00 GMT", "CACHE_CONTROL"=>"private, must-revalidate, max-age=0", "LAST_MODIFIED"=>"Thu, 29 Sep 2011 04:11:15 GMT", "CONNECTION"=>"close", "TRANSFER_ENCODING"=>"chunked", "CONTENT_TYPE"=>"text/html; charset=UTF-8"}

Owner

Hmm, that's clever. I guess we do that in Goliath, but not here. Hmm. We could expose response_header.raw, although that's not the cleanest approach either.
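Until a raw accessor exists, a proxy that receives the normalized hash shown earlier could rebuild a "common form" name before sending headers back. The sketch below is only a best-effort workaround under that assumption; `common_form` and `denormalize` are hypothetical helpers, not part of em-http-request.

```ruby
# Hypothetical helpers; not part of em-http-request.

# "CONTENT_TYPE" => "Content-Type", "X_POWERED_BY" => "X-Powered-By"
def common_form(name)
  name.split('_').map(&:capitalize).join('-')
end

# Build a new hash whose keys are in hyphenated, capitalized form.
def denormalize(headers)
  headers.each_with_object({}) do |(name, value), out|
    out[common_form(name)] = value
  end
end

normalized = {
  'CONTENT_TYPE' => 'text/html; charset=UTF-8',
  'X_POWERED_BY' => 'PHP/5.2.6-1+lenny13'
}
restored = denormalize(normalized)
```

The conversion is lossy: a name such as ETAG comes back as "Etag" rather than "ETag", and a header that originally contained an underscore cannot be told apart from one that contained a hyphen. That is why exposing the untouched names would still be the cleaner fix for a proxy.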
