Skip to content
This repository
Browse code

Make multipart part headers binary lowercase

Here we do not remove decode_packet yet, we just lowercase the
header name and transform it into a binary if needed, to fix
the consistency issue.
  • Loading branch information...
commit e27fd5fcb9786bed45976a3e4322bc89d3ffc025 1 parent 3402166
Loïc Hoguin authored September 21, 2012
14  src/cowboy_multipart.erl
@@ -23,7 +23,7 @@
23 23
 -type more(T) :: T | {more, parser(T)}.
24 24
 -type part_result() :: headers() | eof.
25 25
 -type headers() :: {headers, http_headers(), body_cont()}.
--type http_headers() :: [{atom() | binary(), binary()}].
+-type http_headers() :: [{binary(), binary()}].
27 27
 -type body_cont() :: cont(more(body_result())).
28 28
 -type cont(T) :: fun(() -> T).
29 29
 -type body_result() :: {body, binary(), body_cont()} | end_of_part().
@@ -135,7 +135,11 @@ parse_headers(Bin, Pattern) ->
135 135
 parse_headers(Bin, Pattern, Acc) ->
136 136
 	case erlang:decode_packet(httph_bin, Bin, []) of
137 137
 		{ok, {http_header, _, Name, _, Value}, Rest} ->
-			parse_headers(Rest, Pattern, [{Name, Value} | Acc]);
+			Name2 = case is_atom(Name) of
+				true -> cowboy_bstr:to_lower(atom_to_binary(Name, latin1));
+				false -> cowboy_bstr:to_lower(Name)
+			end,
+			parse_headers(Rest, Pattern, [{Name2, Value} | Acc]);
139 143
 		{ok, http_eoh, Rest} ->
140 144
 			Headers = lists:reverse(Acc),
141 145
 			{headers, Headers, fun () -> parse_body(Rest, Pattern) end};
@@ -205,7 +209,7 @@ multipart_test_() ->
205 209
 		{<<"preamble\r\n--boundary--">>, []},
206 210
 		{<<"--boundary--\r\nepilogue">>, []},
207 211
-			[{[{<<"A">>, <<"b">>}, {<<"C">>, <<"d">>}], <<>>}]},
+			[{[{<<"a">>, <<"b">>}, {<<"c">>, <<"d">>}], <<>>}]},
209 213
210 214
211 215
@@ -213,8 +217,8 @@ multipart_test_() ->
213 217
214 218
215 219
-				{[{<<"X-Name">>, <<"answer">>}], <<"42">>},
-				{[{'Server', <<"Cowboy">>}], <<"It rocks!\r\n">>}
+				{[{<<"x-name">>, <<"answer">>}], <<"42">>},
+				{[{<<"server">>, <<"Cowboy">>}], <<"It rocks!\r\n">>}
218 222
219 223
220 224
4  test/http_SUITE.erl
@@ -552,8 +552,8 @@ multipart(Config) ->
552 552
 	{ok, RespBody, _} = cowboy_client:response_body(Client3),
553 553
 	Parts = binary_to_term(RespBody),
554 554
 	Parts = [
-		{[{<<"X-Name">>, <<"answer">>}], <<"42">>},
-		{[{'Server', <<"Cowboy">>}], <<"It rocks!\r\n">>}
+		{[{<<"x-name">>, <<"answer">>}], <<"42">>},
+		{[{<<"server">>, <<"Cowboy">>}], <<"It rocks!\r\n">>}
557 557
558 558
559 559
 nc_reqs(Config, Input) ->

19 notes on commit e27fd5f

Ivan Blinkov

Loïc, could you please maintain somewhere a list of public API changes? It will help greatly to adapt applications using cowboy to it's newer versions.

For example after this commit usage of atom HTTP header names seem to be deprecated and all requests using them were failing with weird message, which originated somewhere in gen_tcp:

Bad value on output port 'tcp_inet'

Took a while to figure out what exactly was wrong and which cowboy commit caused that.

Loïc Hoguin

We do have a changelog file.

Ivan Blinkov

Of course, but it's more about releases and features, not API changes.

I've meant some place to look for stuff like this commit change, change of onresponse hook arity from 3 to 4, rename of cowboy_http_req to cowboy_req and stuff like that.

Currently there are only two ways to upgrade cowboy:

  • Update it, look how it fails and try to find the reason
  • Or to browse through each commit and check whether it changed something important or not
Loïc Hoguin

I didn't say the git log, I said the changelog file, found in the base directory of the project:

Ivan Blinkov

I was talking about it either. There's practically nothing about API changes.
Well, ok, not nothing, but not everything.

Loïc Hoguin

It has everything up to the last changelog file update (this included). You're not supposed to upgrade Cowboy to master though, the last release is 0.6.1 before the API changes. Upgrading to an in-development version features a number of risks and pitfalls.

Ivan Blinkov

Yep, actually the fact that that the last stable version was three month ago makes me stick to master. There are some issues with that, but nothing serious yet.

"Next" section of changelog indeed mentions most API changes, but besides being less cluttered it could have been a bit more user-friendly. For example for this commit: "Header names are now always lowercase binary string." probably could have been like "Atom HTTP header names are now deprecated, use lowercase binary strings instead".

By the way when I changed my application code to work after this commit, I actually had to use (non-standard) HTTP headers in their original form, not lowercase, because cowboy just outputs them exactly as they are specified.

Loïc Hoguin

Lowercase as output is by convention, Cowboy won't try to lowercase them. If the developer agrees with Cowboy that all header names should be lowercase, then that allows for more efficient processing of responses (instead of Cowboy having to try to lowercase everything).

Note that header names are case insensitive, so if you have an application that relies on them being of a specific case, that application is doing it wrong.

Ivan Blinkov

I always thought that convention for HTTP headers always was Camel-Case with hyphens, not lower-case... At least that's how all major mainstream HTTP-servers do it.

What was your reasoning to make them lowercase in the first place?

%% Probably this commit comments is not the right place to discuss this...

Loïc Hoguin

By convention in Cowboy. Lowercase is just easier to get right (worst case: to_lower, instead of some weird function we don't need anymore). Why worry on the case if you can not worry about it?

Ivan Blinkov

As for me mimicking the mainstream HTTP-servers alone is a good enough reason to use camel case. Maybe I don't get something, but they are usually just hard-coded and it's as easy to make them camel case as to make them lower case. It's all about the convention.

Tom Burdick

actually the lower case connection: keep-alive may very well be what broke apache bench?

Loïc Hoguin

But the mainstream HTTP servers can't even agree. Let's look at a popular website:

% curl -i
HTTP/1.1 301 Moved Permanently
Content-Type: text/html; charset=UTF-8
Date: Fri, 26 Oct 2012 22:00:26 GMT
Expires: Sun, 25 Nov 2012 22:00:26 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN

X-XSS-Protection would have been X-Xss-Protection in previous Cowboy (based on the OTP parsing of headers).

% curl -i
HTTP/1.1 200 OK
Server: nginx/1.3.6
Date: Fri, 26 Oct 2012 22:01:46 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 8561
Last-Modified: Thu, 04 Oct 2012 18:25:08 GMT
Connection: keep-alive
Keep-Alive: timeout=15
ETag: "506dd484-2171"
Accept-Ranges: bytes

ETag would have been Etag in previous Cowboy.

This page is full of example of nobody following any convention:

Servers never let you send headers directly, you do it through configuration, or a programming language, and then the server parses it, maps values to the hardcoded names it has and sends that. But the case doesn't matter.

Cowboy wants to avoid doing the parsing to be more efficient, so it simply asks you developer to use a simple rule where you can't do any mistake for your header names.

Loïc Hoguin

@bfrog: Wouldn't surprise me, ab is a horrible tool full of issues.

Ivan Blinkov

Loïc, examples you used are still camel case, these are just abbreviations.

I totally agree that it was a good idea to get rid of that header-parsing function you had. But lowercase for common headers looks like going against trend/mainstream/whatever.

I have no idea how different browsers and HTTP client libraries handle lowercase headers, probably most do it just fine ignoring the case. But some could have something hard-coded (also to be more efficient) and chances are that it's camel case because major HTTP servers use it.

Loïc Hoguin

Yeah, it's not camel-case, so I can't parse it anyway. Common headers sure, but what about unknown headers? I'm removing inconsistencies here, not adding new ones in their place.

It's better to make developers remember it's in lowercase than make them know how all header names should be written by heart.

If some client libraries are broken, you can still use camel case if you really need it, simply write an onresponse hook converting the header to a case that works. But mainstream clients don't need it, so there's really no need to worry about it.

Ivan Blinkov

Why can't you just don't parse unknown headers and leave them as they are?

I believe majority of developers who ever worked with HTTP already know how they are usually written by heart (at least I do).

Overriding all headers in onresponse hook is an option, but not a great one since it's a lot of excessive work that can be avoided.

Loïc Hoguin

My link to the PHP documentation should have been enough to show that no, developers do not know how they should be formatted (nor should they care).

If you think adding the function to do that directly into Cowboy (but not enabled, just to have it if needed) is a good idea, please open a ticket and link to this discussion. I can add it to cowboy_http, and then it's just an option away pretty much.

Ivan Blinkov

Link to new issue: #300

Please sign in to comment.
Something went wrong with that request. Please try again.