Do not accept PUN/GEM methods as PUT/GET. #156

chrisdickinson · 2013-08-20T05:20:25Z

Encountering them returns an error, HPE_INVALID_METHOD
Tests have been added.

There's not a clear distinction between what erroneous methods should trigger HPE_INVALID_METHOD vs. HPE_UNKNOWN (there are extant tests for HPE_UNKNOWN against methods like "C*****"), advice would be appreciated. Tested with make test on OSX 10.7.5.

This fixes nodejs/node-v0.x-archive#6078.

* Encountering them returns an error, `HPE_INVALID_METHOD` * Tests have been added.

flexd · 2013-08-20T05:51:25Z

@chrisdickinson This does not at all solve the problem described at nodejs/node-v0.x-archive#6078

Node is doing HTTP parsing completely the wrong the way, which I guess is for performance but it's silly stupid and should be fixed. And not by adding more if-logic and goto statements.

chrisdickinson · 2013-08-20T06:10:45Z

@flexd Ah, good point -- this does not add LINK and UNLINK; rather it addresses the immediate problem of accepting unacceptable HTTP methods.

To your second statement, given certain constraints, yes, it could be argued that http_parser is wrong. It would be worthwhile to drop in another existing HTTP parser that "does it right" and do an apples-to-apples exhaustive comparison.

However, one bug -- or even one gnarly part -- does not a bad library make. From 0, it was easy to jump into http-parser and make the required fix. It presents a zero-allocation, zero-syscall, conditionally compliant handwritten streaming HTTP parser (NB: extension verbs are optional, as is the 501 response -- which is the purview of node itself, in this case).

I'll see about adding LINK as a separate PR.

bnoordhuis · 2013-08-20T06:59:36Z

@chrisdickinson LGTM, thanks. Can you sign the CLA? It's a different one from the node.js CLA.

@flexd Constructive criticism is welcome but if 'it's silly stupid and should be fixed' is all you have to offer, then please stop now.

I'll see about adding LINK as a separate PR.

LINK and UNLINk are still drafts. I won't be accepting pull requests for that.

flexd · 2013-08-20T07:27:00Z

@bnoordhuis My point was more that I do not think adding more complexity to this is the best way of solving the problem.
This seems to me at least to be a less than optimal approach, especially since it in the first place allowed this kind of thing to be possible. Adding if-else to be able to catch someone doing a PUN or whatever other unacceptable/non-existing HTTP method will just lead to more bits that could go wrong and more places where there could be logic error.

As someone mentioned in the HN discussion (https://news.ycombinator.com/item?id=6240437), the whole thing could be replaced with a Ragel parser.

What exactly is the overhead of checking just one character VS the whole verb? Has it really come to strncmp being the bottleneck here?

Edit: English.

bnoordhuis · 2013-08-20T07:39:06Z

My plan is to eventually phase out http_parser in favor of a pure JS parser (if only because then I can hand off maintenance to others) but that depends on either me finding the time for it (unlikely in the short/medium term) or someone stepping up.

As someone mentioned in the HN discussion (https://news.ycombinator.com/item?id=6240437), the whole thing could be replaced with a Ragel parser.

I wrote a JS backend for Ragel a while ago and did some experiments but the results were inconclusive. (That is, the JS parser did pretty bad but then I didn't spend days on it.)

What exactly is the overhead of checking just one character VS the whole verb? Has it really come to strncmp being the bottleneck here?

http_parser is a state machine. It has to process characters one at a time because the input may be split over multiple messages.

jjclark1982 · 2013-08-20T07:39:36Z

It would be nice to treat all unrecognized method names the same, instead of simply checking the first three characters to rule out PUN and GEM but still misassign PURSE and UNSURPRISED.

Edit: never mind, I see that those cases should be caught by

        } else if (ch == matcher[parser->index]) {
        ...
        } else {
          SET_ERRNO(HPE_INVALID_METHOD);
          goto error;
        }

bnoordhuis · 2013-08-20T09:40:00Z

As someone mentioned in the HN discussion (https://news.ycombinator.com/item?id=6240437), the whole thing could be replaced with a Ragel parser.

Ah, I've read the HN thread now (and feeling dumber for it). The Internet peanut gallery is out in full force and as clueless as ever, it seems...

To make a long story short: very old versions of http_parser were based on a Ragel grammar but the handwritten version outperformed it by a significant margin, as in over 10% fewer branches taken.

Ragel is great but it can't take shortcuts like a human programmer can and once you start jumping around in your grammar, things become complicated fast (and impossible to debug.)

jamesamcl · 2013-08-20T14:31:46Z

My plan is to eventually phase out http_parser in favor of a pure JS parser (if only because then I can hand off maintenance to others)

Does Joyent intend on maintaining http-parser independently of Node? Or are you basically saying this is going to be abandoned?

isaacs · 2013-08-20T14:56:50Z

Does Joyent intend on maintaining http-parser independently of Node? Or are you basically saying this is going to be abandoned?

What do you mean when you say "abandoned"? As it is, no one works on this full-time, and it gets only a few commits a year.

If Node isn't using it, then we'll probably just call this project "done" (as it effectively already is), and stop worrying about it. It's a good piece of software, and some other projects find it useful. We won't stop taking pull requests that improve it, if that's what you mean.

chrisdickinson · 2013-08-20T15:21:11Z

@bnoordhuis Signed! Thanks!

laino · 2013-08-20T20:55:32Z

Can someone explain why node is trying to internally compare the HTTP Methods against known values at all?
I mean, why not just extract the value and hand it to the user?

99% of all HTTP servers in node do compare the HTTP method to some other value in pure JS anyways. Something along the lines of:

if(req.method === 'GET'){
    // Handle GET
}

So trying to squeeze a few nanoseconds by checking it in the http parser is not going to make much of a difference. It will be even less noticeable as node is already extracting dozens of strings from the request anyways (the url, 2 strings for each header...).

isaacs · 2013-08-21T03:41:44Z

Can someone explain why node is trying to internally compare the HTTP Methods against known values at all?

Because it's faster, and trivial to make secure.

I mean, why not just extract the value and hand it to the user?

Because it'd have to add logic to limit the length of the method, as well as take many more code paths in the parser itself.

So trying to squeeze a few nanoseconds by checking it in the http parser is not going to make much of a difference.

Have you benchmarked the difference?

bnoordhuis · 2013-08-21T13:10:27Z

Thanks Chris, landed in c6ee6ad. I added some more request method checks in 547553b. Thanks again!

Do not accept PUN/GEM methods as PUT/GET.

88d6c01

* Encountering them returns an error, `HPE_INVALID_METHOD` * Tests have been added.

bnoordhuis mentioned this pull request Aug 20, 2013

LINK and UNLINK HTTP Methods are not-supported nodejs/node-v0.x-archive#6078

Closed

bnoordhuis closed this Aug 21, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Do not accept PUN/GEM methods as PUT/GET. #156

Do not accept PUN/GEM methods as PUT/GET. #156

chrisdickinson commented Aug 20, 2013

flexd commented Aug 20, 2013

chrisdickinson commented Aug 20, 2013

bnoordhuis commented Aug 20, 2013

flexd commented Aug 20, 2013

bnoordhuis commented Aug 20, 2013

jjclark1982 commented Aug 20, 2013

bnoordhuis commented Aug 20, 2013

jamesamcl commented Aug 20, 2013

isaacs commented Aug 20, 2013

chrisdickinson commented Aug 20, 2013

laino commented Aug 20, 2013

isaacs commented Aug 21, 2013

bnoordhuis commented Aug 21, 2013

Do not accept PUN/GEM methods as PUT/GET. #156

Do not accept PUN/GEM methods as PUT/GET. #156

Conversation

chrisdickinson commented Aug 20, 2013

flexd commented Aug 20, 2013

chrisdickinson commented Aug 20, 2013

bnoordhuis commented Aug 20, 2013

flexd commented Aug 20, 2013

bnoordhuis commented Aug 20, 2013

jjclark1982 commented Aug 20, 2013

bnoordhuis commented Aug 20, 2013

jamesamcl commented Aug 20, 2013

isaacs commented Aug 20, 2013

chrisdickinson commented Aug 20, 2013

laino commented Aug 20, 2013

isaacs commented Aug 21, 2013

bnoordhuis commented Aug 21, 2013