Parse error on proxy request with auth URL #116

Closed
bnoordhuis opened this Issue Jun 24, 2012 · 18 comments

Comments

Projects
None yet
3 participants
Owner

bnoordhuis commented Jun 24, 2012

Works -> GET http://example.com/ HTTP/1.1
Fails -> GET http://user:pass@example.com/ HTTP/1.1

See joyent/node#3527.

Owner

bnoordhuis commented Jun 24, 2012

Added two tests in 4ecc216.

Owner

bnoordhuis commented Jun 24, 2012

This is kind of a pain to implement.

Axiom: the parser cannot backtrack or look ahead.

When the parser has seen http://, it can't know if what follows is the userinfo or the host:port section. That needs to be addressed somehow because the userinfo section allows characters that are illegal in host names.

From RFC 2396:

3.2.2. Server-based Naming Authority

   URL schemes that involve the direct use of an IP-based protocol to a
   specified server on the Internet use a common syntax for the server
   component of the URI's scheme-specific data:

      <userinfo>@<host>:<port>

   where <userinfo> may consist of a user name and, optionally, scheme-
   specific information about how to gain authorization to access the
   server.  The parts "<userinfo>@" and ":<port>" may be omitted.

      server        = [ [ userinfo "@" ] hostport ]

   The user information, if present, is followed by a commercial at-sign
   "@".

      userinfo      = *( unreserved | escaped |
                         ";" | ":" | "&" | "=" | "+" | "$" | "," )

   Some URL schemes use the format "user:password" in the userinfo
   field. This practice is NOT RECOMMENDED, because the passing of
   authentication information in clear text (such as URI) has proven to
   be a security risk in almost every case where it has been used.

   The host is a domain name of a network host, or its IPv4 address as a
   set of four decimal digit groups separated by ".".  Literal IPv6
   addresses are not supported.

      hostport      = host [ ":" port ]
      host          = hostname | IPv4address
      hostname      = *( domainlabel "." ) toplabel [ "." ]
      domainlabel   = alphanum | alphanum *( alphanum | "-" ) alphanum
      toplabel      = alpha | alpha *( alphanum | "-" ) alphanum
      IPv4address   = 1*digit "." 1*digit "." 1*digit "." 1*digit
      port          = *digit
Contributor

bpaquet commented Jun 25, 2012

Hi,

I can submit a patch which

  • relax constraint on host and hostport parsing
  • check allowed chars when finding delimiters for user:password or host:port
  • add two values in http_parser_url_fields : BASIC_AUTH_USERNAME and BASIC_AUTH_PASSWORD.

I can try this if above design is ok for you.

Owner

bnoordhuis commented Jun 25, 2012

check allowed chars when finding delimiters for user:password or host:port

The problem is that that is not an option. You're not guaranteed to get a full request line, the input to http_parser_execute() may be spread over several packets. That's what I meant when I said that the parser can't backtrack or look ahead.

As an example, say the first packet contains GET http://foo:. What is foo? A username or a domain?

Contributor

bpaquet commented Jun 26, 2012

Ok ...

But I do not see a solution : you can not guess if foo is user info or host
name without seeking an @. Right ?

I'm ready to help you to implement the patch if needed.

Regards

Le mardi 26 juin 2012, Ben Noordhuis a écrit :

check allowed chars when finding delimiters for user:password or
host:port

The problem is that that is not an option. You're not guaranteed to get a
full request line, the input to http_parser_execute() may be spread over
several packets. That's what I meant when I said that the parser can't
backtrack or look ahead.

As an example, say the first packet contains GET http://foo:. What is
foo? A username or a domain?


Reply to this email directly or view it on GitHub:
joyent#116 (comment)

Owner

bnoordhuis commented Jun 26, 2012

But I do not see a solution : you can not guess if foo is user info or host name without seeking an @. Right ?

Correct, that's the dilemma. :-)

I have to think hard about how to best approach this. Maybe there's no good alternative but to break the interface - but that's something I'd rather not do, of course.

Contributor

bpaquet commented Jul 3, 2012

Ping :)

Sorry, it's really annoying for me.

Owner

bnoordhuis commented Jul 4, 2012

I haven't really been able to come up with a good solution yet. Tell you what, start with something that works for you and if it's good, I'll pull it in.

Contributor

bpaquet commented Jul 6, 2012

http-parser is a fork of the Nginx one (I do not see that before).

The nginx http parser does not support a GET http://toto:titi@localhost/ HTTP/1.1 request. Nginx issue a 400 response. It's not a problem for nginx because nginx is not used as an outgoing proxy.
But NodeJS can be used as an outgoing proxy (it's what I want to do :)).

I will try a fix, but I'm afraid it can only be crappy ...

Contributor

bpaquet commented Jul 8, 2012

Pull request done. I will be happy if someone can do a code review on my code.

@pgriess pgriess added a commit that referenced this issue Jul 25, 2012

@pgriess pgriess Merge pull request #118 from bpaquet/master
#116 : refactor to allow url with basic auth a:b@toto.com
fb3eeb7
Contributor

pgriess commented Jul 25, 2012

Pull request merged; closing.

pgriess closed this Jul 25, 2012

Contributor

bpaquet commented Jul 25, 2012

Thanks you !

I have two question : @bnoordhuis or @pgriess

  • Have I something to do to have this code merged into nodejs ? I can
    create a pull request if needed.
  • Can I have my name in contributors, or my contribution is too small ?

Regards,

On Wed, Jul 25, 2012 at 3:04 AM, Peter Griess <
reply@reply.github.com

wrote:

Pull request merged; closing.


Reply to this email directly or view it on GitHub:
joyent#116 (comment)

Contributor

pgriess commented Jul 25, 2012

Oh, crap you need to sign the CLA at

http://spreadsheets2.google.com/viewform?hl=en&formkey=dDJXOGUwbzlYaWM4cHN1MERwQS1CSnc6MQ

Once you do that, we'll include you in the AUTHORS file.

As for getting this merged into NodeJS, I'm not sure what the procedure is for that these days. @ry or @isaacs what's the deal with that?

Owner

bnoordhuis commented Jul 25, 2012

I'll land it in node master sometime soon.

Contributor

bpaquet commented Jul 26, 2012

Done for the CLA.

On Thu, Jul 26, 2012 at 12:39 AM, Peter Griess <
reply@reply.github.com

wrote:

Oh, crap you need to sign the CLA at

http://spreadsheets2.google.com/viewform?hl=en&formkey=dDJXOGUwbzlYaWM4cHN1MERwQS1CSnc6MQ

Once you do that, we'll include you in the AUTHORS file.

As for getting this merged into NodeJS, I'm not sure what the procedure is
for that these days. @ry or @isaacs what's the deal with that?


Reply to this email directly or view it on GitHub:
joyent#116 (comment)

Contributor

bpaquet commented Aug 20, 2012

Hi,

Any news for the merge into nodejs ?

And for my name in AUTHORS ?
Bertrand Paquet bertrand.paquet@gmail.com

Regards,

Bertrand

Owner

bnoordhuis commented Aug 29, 2012

@bpaquet Added you to AUTHORS in 4e1a6ab and upgraded the parser in nodejs/node-v0.x-archive@4784ea1.

Contributor

bpaquet commented Aug 30, 2012

Cool :)

Thx.

On Thu, Aug 30, 2012 at 12:18 AM, Ben Noordhuis notifications@github.comwrote:

@bpaquet https://github.com/bpaquet Added you to the AUTHORS in ad3b631ad3b631 upgraded the parser in
nodejs/node-v0.x-archive@4784ea1 nodejs/node-v0.x-archive@4784ea1.


Reply to this email directly or view it on GitHubhttps://github.com/joyent/http-parser/issues/116#issuecomment-8143206.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment