This repository has been archived by the owner. It is now read-only.

node does not sanitize http header field values #2602

Closed
alpercelik opened this Issue Jan 24, 2012 · 13 comments

Comments

Projects
None yet
7 participants
@alpercelik

Noticed first with express and submitted a pull request to express framework but after our discussions about the issue with @visionmedia, it seemed more logical to have it in node.

You may find the pull request for express which explains details here
visionmedia/express#974

I did sanitizing in http.js where header field values are stored.
https://github.com/alpercelik/node/commit/667aae596cded9336f50574386683ec39ada43f2

@alpercelik

This comment has been minimized.

Show comment Hide comment
@alpercelik

alpercelik Jan 24, 2012

@koichik

This comment has been minimized.

Show comment Hide comment
@koichik

koichik Jan 24, 2012

Actually, CRLF is valid characters of header field value.
RFC-2616 - section 2.2 says:

HTTP/1.1 header field values can be folded onto multiple lines if the
continuation line begins with a space or horizontal tab.

So, I think that it should throw an error if CRLF in the header value is not a part of LWS.

   LWS            = [CRLF] 1*( SP | HT )

koichik commented Jan 24, 2012

Actually, CRLF is valid characters of header field value.
RFC-2616 - section 2.2 says:

HTTP/1.1 header field values can be folded onto multiple lines if the
continuation line begins with a space or horizontal tab.

So, I think that it should throw an error if CRLF in the header value is not a part of LWS.

   LWS            = [CRLF] 1*( SP | HT )
@bnoordhuis

This comment has been minimized.

Show comment Hide comment
@bnoordhuis

bnoordhuis Jan 24, 2012

Member

Thanks, but I don't think this is a good change.

@koichik already pointed out that the patch rejects valid header values. Even if the patch worked as advertised, it could still have unforeseen consequences. It's quite likely that some applications rely on the current behaviour of .setHeader().

Moreover, it's a change where everybody pays the price (expensive per-header regex check) for a feature that only some need. It belongs at the library or application level, not in the core http library.

Member

bnoordhuis commented Jan 24, 2012

Thanks, but I don't think this is a good change.

@koichik already pointed out that the patch rejects valid header values. Even if the patch worked as advertised, it could still have unforeseen consequences. It's quite likely that some applications rely on the current behaviour of .setHeader().

Moreover, it's a change where everybody pays the price (expensive per-header regex check) for a feature that only some need. It belongs at the library or application level, not in the core http library.

@bnoordhuis bnoordhuis closed this Jan 24, 2012

@alpercelik

This comment has been minimized.

Show comment Hide comment
@alpercelik

alpercelik Jan 24, 2012

You are right, my change breaks valid header values. I overlooked validity of CRLF as @koichik mentioned.
I used regex, that was a quick change and your comment is valid about performance.

The issue I was trying to fix was related to http response splitting attack (http://en.wikipedia.org/wiki/HTTP_response_splitting). I was checking if this is possible in express and then noticed that it is, via url parameter to redirect method, which sets Location header with value url parameter passed in (if it is not mapped, a nice feature in feature, but this is also possible even url is generated indirectly).

example : https://gist.github.com/1654555#file_app.js
response : https://gist.github.com/1654555#file_server+response

When a header is set through setHeader function it is possible to add extra header fields/values where value has CRLF (not part of LWS).

I agree that applications that are using current behaviour may break if changed, but I think current applications should be using this function as it is documented. As per documentation :

Sets a single header value for implicit headers. If this header already exists in the to-be-sent headers, its value will be replaced. Use an array of strings here if you need to send multiple headers with the same name.

So if an application is calling this setHeader and intentionally adding more than one header field by CRLFs that is not part of LWS, then application is doing something wrong and better to be fixed. If they already use properly, they should be fine.

Also as per documentation setHeader function should not allow adding new headers or modifying other header fields unintentionally. At this point the solution that @koichik has proposed fits RFC and behaviour as per documentation.

this example will set Location header as well:

res.setHeader('Set-Cookie', 'foo=bar;\r\nLocation:http://www.mysite.com');

About http response splitting attack there are some good resources, this page has nice content with examples and references:
http://cwe.mitre.org/data/definitions/113.html

And in this blog different proof of concepts are presented for reddit.com (also mentions null byte in concept 3):
https://nealpoole.com/blog/2011/01/http-response-splitting-on-reddit-com/

If this check will not be done in core http library then I think it is better to mention in documentation, because it is easy to overlook in library or application level.

At least for "Location" header, I think it is better to fix, I don't see what would be legitimate reason to have CRLF in header value if I am not overlooking.

for CRLF's in redirect url

I feel that such controls fits better in core http library and it also applies to everyone's need. At the end security is also everyones need.

You are right, my change breaks valid header values. I overlooked validity of CRLF as @koichik mentioned.
I used regex, that was a quick change and your comment is valid about performance.

The issue I was trying to fix was related to http response splitting attack (http://en.wikipedia.org/wiki/HTTP_response_splitting). I was checking if this is possible in express and then noticed that it is, via url parameter to redirect method, which sets Location header with value url parameter passed in (if it is not mapped, a nice feature in feature, but this is also possible even url is generated indirectly).

example : https://gist.github.com/1654555#file_app.js
response : https://gist.github.com/1654555#file_server+response

When a header is set through setHeader function it is possible to add extra header fields/values where value has CRLF (not part of LWS).

I agree that applications that are using current behaviour may break if changed, but I think current applications should be using this function as it is documented. As per documentation :

Sets a single header value for implicit headers. If this header already exists in the to-be-sent headers, its value will be replaced. Use an array of strings here if you need to send multiple headers with the same name.

So if an application is calling this setHeader and intentionally adding more than one header field by CRLFs that is not part of LWS, then application is doing something wrong and better to be fixed. If they already use properly, they should be fine.

Also as per documentation setHeader function should not allow adding new headers or modifying other header fields unintentionally. At this point the solution that @koichik has proposed fits RFC and behaviour as per documentation.

this example will set Location header as well:

res.setHeader('Set-Cookie', 'foo=bar;\r\nLocation:http://www.mysite.com');

About http response splitting attack there are some good resources, this page has nice content with examples and references:
http://cwe.mitre.org/data/definitions/113.html

And in this blog different proof of concepts are presented for reddit.com (also mentions null byte in concept 3):
https://nealpoole.com/blog/2011/01/http-response-splitting-on-reddit-com/

If this check will not be done in core http library then I think it is better to mention in documentation, because it is easy to overlook in library or application level.

At least for "Location" header, I think it is better to fix, I don't see what would be legitimate reason to have CRLF in header value if I am not overlooking.

for CRLF's in redirect url

I feel that such controls fits better in core http library and it also applies to everyone's need. At the end security is also everyones need.

@tj

This comment has been minimized.

Show comment Hide comment
@tj

tj Jan 25, 2012

it's still an attack vector even if it's not for a redirect so it didn't make sense to fix this in express, but if core doesn't want something to protect against it I probably wouldn't go for it either since so many people still use setHeader() or writeHead(). Best to just filter the input

tj commented Jan 25, 2012

it's still an attack vector even if it's not for a redirect so it didn't make sense to fix this in express, but if core doesn't want something to protect against it I probably wouldn't go for it either since so many people still use setHeader() or writeHead(). Best to just filter the input

@isaacs

This comment has been minimized.

Show comment Hide comment
@isaacs

isaacs Jan 25, 2012

The setHeader(key, value) API should not let you accidentally set more than one header. \r\n is a valid character for the value, but only if it is followed by whitespace.

// valid
response.setHeader('set-cookie', 'foo=bar;\r\n     Expires=Wed, 09 Jun 2021 10:18:14 GMT')

// invalid
response.setHeader('set-cookie', 'foo=bar\r\nLocation: http://evil.com')

So, value.replace(/\r\n.*$/, '') is not appropriate. But value.replace(/\r\n\S.*$/, '') probably is appropriate, though.

isaacs commented Jan 25, 2012

The setHeader(key, value) API should not let you accidentally set more than one header. \r\n is a valid character for the value, but only if it is followed by whitespace.

// valid
response.setHeader('set-cookie', 'foo=bar;\r\n     Expires=Wed, 09 Jun 2021 10:18:14 GMT')

// invalid
response.setHeader('set-cookie', 'foo=bar\r\nLocation: http://evil.com')

So, value.replace(/\r\n.*$/, '') is not appropriate. But value.replace(/\r\n\S.*$/, '') probably is appropriate, though.

@isaacs isaacs reopened this Jan 25, 2012

koichik added a commit to koichik/node that referenced this issue Jan 26, 2012

@koichik

This comment has been minimized.

Show comment Hide comment
@koichik

koichik Jan 26, 2012

How about koichik/node@2bd9f87? As wrote to my first comment, I would like to throw an error rather than replacing silently.

The following example calls setHeader() one million with and without assertion.

var http = require('http');

var N = 1E6;
var name = 'Content-Type';
var value = 'text/plain; charset=UTF-8';

var server = http.createServer(function(req, res) {
  console.time('assert');
  for (var i = 0; i < N; ++i) {
    res.setHeader(name, value + i);
  };
  console.timeEnd('assert');

  console.time('noAssert');
  for (var i = 0; i < N; ++i) {
    res.setHeader(name, value + i, true);
  };
  console.timeEnd('noAssert');

  res.writeHead(200);
  res.end();
}).listen(3000, function() {
  http.get({port: 3000}, function(res) {
    res.on('end', function() {
      server.close();
    });
  });
});

result:

assert: 870ms
noAssert: 468ms

The use of the RegExp is slow for about 400 milliseconds per one million times. I think that it is reasonable enough.

koichik commented Jan 26, 2012

How about koichik/node@2bd9f87? As wrote to my first comment, I would like to throw an error rather than replacing silently.

The following example calls setHeader() one million with and without assertion.

var http = require('http');

var N = 1E6;
var name = 'Content-Type';
var value = 'text/plain; charset=UTF-8';

var server = http.createServer(function(req, res) {
  console.time('assert');
  for (var i = 0; i < N; ++i) {
    res.setHeader(name, value + i);
  };
  console.timeEnd('assert');

  console.time('noAssert');
  for (var i = 0; i < N; ++i) {
    res.setHeader(name, value + i, true);
  };
  console.timeEnd('noAssert');

  res.writeHead(200);
  res.end();
}).listen(3000, function() {
  http.get({port: 3000}, function(res) {
    res.on('end', function() {
      server.close();
    });
  });
});

result:

assert: 870ms
noAssert: 468ms

The use of the RegExp is slow for about 400 milliseconds per one million times. I think that it is reasonable enough.

@tj

This comment has been minimized.

Show comment Hide comment
@tj

tj Jan 26, 2012

yeah on jsperf that regexp is doing 5,265,491 ops/s :p. -1 for throwing though, adding a try/catch to every header manip would get pretty annoying

tj commented Jan 26, 2012

yeah on jsperf that regexp is doing 5,265,491 ops/s :p. -1 for throwing though, adding a try/catch to every header manip would get pretty annoying

@koichik

This comment has been minimized.

Show comment Hide comment
@koichik

koichik Jan 27, 2012

@visionmedia - Because header manipulations throw an error from before, you should have already added a try/catch :)

I think that it is a bug of an application that an error is thrown, it must check its input. Throwing an error makes a user notice it. Probably, handling an error will become easy when Domains is merged.

koichik commented Jan 27, 2012

@visionmedia - Because header manipulations throw an error from before, you should have already added a try/catch :)

I think that it is a bug of an application that an error is thrown, it must check its input. Throwing an error makes a user notice it. Probably, handling an error will become easy when Domains is merged.

@alpercelik

This comment has been minimized.

Show comment Hide comment
@alpercelik

alpercelik Jan 27, 2012

I agree about throwing error is better because; setHeader cannot successfully do what it is designed to do. And in this case it is caused by invalid input parameters.

I agree about throwing error is better because; setHeader cannot successfully do what it is designed to do. And in this case it is caused by invalid input parameters.

@tj

This comment has been minimized.

Show comment Hide comment
@tj

tj Jan 27, 2012

@koichik ah well then yeah we shouldn't throw, I see that it throws as far as the header already being sent, that's fine because that's a user error, you dont need to try/catch all over for those, but IMO this should not throw. Sure I might want to know with a console.warn when someone is attempting to split the header but I also dont want to litter my apps with error checking for every single setHeader()

tj commented Jan 27, 2012

@koichik ah well then yeah we shouldn't throw, I see that it throws as far as the header already being sent, that's fine because that's a user error, you dont need to try/catch all over for those, but IMO this should not throw. Sure I might want to know with a console.warn when someone is attempting to split the header but I also dont want to litter my apps with error checking for every single setHeader()

@indutny

This comment has been minimized.

Show comment Hide comment
@indutny

indutny Mar 18, 2014

Member

Is it still relevant?

Member

indutny commented Mar 18, 2014

Is it still relevant?

@jasnell jasnell self-assigned this May 15, 2015

@jasnell jasnell referenced this issue in nodejs/node-convergence-archive May 19, 2015

Closed

http: sanitize http headers and method names #13

@jasnell

This comment has been minimized.

Show comment Hide comment
@jasnell

jasnell Aug 26, 2015

Owner

New PR opened on nodejs/node. Will continue to track the conversation there. nodejs/node#2526

Owner

jasnell commented Aug 26, 2015

New PR opened on nodejs/node. Will continue to track the conversation there. nodejs/node#2526

@jasnell jasnell closed this Aug 26, 2015

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.