Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

url.parse() fails if auth contain a colon #25353

Closed
ben-page opened this issue May 19, 2015 · 11 comments
Closed

url.parse() fails if auth contain a colon #25353

ben-page opened this issue May 19, 2015 · 11 comments

Comments

@ben-page
Copy link

If the user or the password component of auth contains colons, url.parse() fails.

var url = require('url');

var user = encodeURIComponent('us:er');
var password = encodeURIComponent('pass:word');
var uri = 'http://' + user + ':' + password + '@localhost/';
console.log(uri); // http://us%3Aer:pass%3Aword@localhost/

var parsed = url.parse(uri);
console.log(parsed.auth); // us:er:pass:word

The output of parse() is not understood by format().

console.log(url.format(parsed)); // http://us:er%3Apass%3Aword@localhost/

I think auth needs to be split into two fields (username and password).

@ben-page
Copy link
Author

Browsers handled it this way:

var parser = document.createElement('a');
parser.href = "http://us%3Aer:pass%3Aword@example.com:3000/path%3Ename/?search=some%20phrase#hash";

console.log(parser.hash); // #hash
console.log(parser.host); // example.com:3000
console.log(parser.hostname); // example.com
console.log(parser.href); // http://us%3Aer:pass%3Aword@example.com:3000/path%3Ename/?search=some%20phrase#hash
console.log(parser.origin); // http://example.com:3000
console.log(parser.password); // pass%3Aword
console.log(parser.pathname); // /path%3Ename/
console.log(parser.port); // 3000
console.log(parser.protocol); // http:
console.log(parser.search); // ?search=some%20phrase
console.log(parser.username); // us%3Aer

I think is much better than the way url.parse() currently work. Though, I don't see any harm in decoding the components after the URL has been parsed. It might be nice to have a argument to disable automatic URL decoding.

@dnakamura
Copy link

Just to be clear, this is not so much url.parse() failing (its still technically doing what its supposed to), but rather a limitation of the api (it's impossible to separate the username/password from the auth field).
PS. that's just a colon, not a semicolon

@jasnell
Copy link
Member

jasnell commented May 20, 2015

Hmm.. there appears to be a couple of failures here. Given the test case, if you look at the parsed object, you'll see:

> parsed
{ protocol: 'http:',
  slashes: true,
  auth: 'us:er:pass:word',
  host: 'localhost',
  port: null,
  hostname: 'localhost',
  hash: null,
  search: null,
  query: null,
  pathname: '/',
  path: '/',
  href: 'http://us:er%3Apass%3Aword@localhost/' }
> 

Notice that auth loses the pct-encoding making it impossible to properly parse out the actual username and password given in the input. But also notice that the href changes the input entirely. The first colon is unescaped while the original unescaped colon is modified. That's definitely a bug.

That said, https://www.ietf.org/rfc/rfc3986.txt is very clear about user:password being deprecated.

Section 3.2.1:

   Use of the format "user:password" in the userinfo field is
   deprecated.  Applications should not render as clear text any data
   after the first colon (":") character found within a userinfo
   subcomponent unless the data after the colon is the empty string
   (indicating no password).  Applications may choose to ignore or
   reject such data when it is received as part of a reference and
   should reject the storage of such data in unencrypted form.  The
   passing of authentication information in clear text has proven to be
   a security risk in almost every case where it has been used.

So while it definitely looks like node is doing the wrong thing, I'm inclined to give this a lower priority (which doesn't mean it won't be fixed, it would just need to be after we get a few other higher priority items done first).

@joyent/node-coreteam

@ben-page ben-page changed the title url.parse() fails if auth contains semicolons url.parse() fails if auth contains colons May 20, 2015
@ben-page
Copy link
Author

Some protocols ignore this advice. For example, the userinfo field is the only was to provide credentials in AMQP.

RFC3986 states that "A password appearing within the userinfo component is deprecated and should be considered an error" (section 7.5). While this is sound advice in the context of user-facing applications (e.g. web browsers) and for URIs that might be stored and displayed insecurely, it is not necessarily valid for AMQP applications. Many AMQP applications are server-based, and make AMQP connections on behalf of the application as a whole rather than for specific users. So the username and password identify the application rather than a human user, and are likely to be included with connection parameters appearing in a secure store of configuration settings. User-facing AMQP applications, which make AMQP connections on behalf of specific users, are also possible. In such cases the username and password may be provided by the user to identify themselves. But such AMQP applications are the exception rather than the rule. Thus authors of AMQP applications implementing this specification should not consider themselves bound by section 7.5 of RFC3986. Please also see section 5 ("Security Considerations") below.
https://www.rabbitmq.com/uri-spec.html

@jasnell
Copy link
Member

jasnell commented May 20, 2015

Understood. It definitely needs to be fixed.
On May 20, 2015 11:08 AM, "Ben Page" notifications@github.com wrote:

Some protocols ignore this advice. For example, the userinfo field is the
only was to provide credentials in AMQP.

RFC3986 states that "A password appearing within the userinfo component is
deprecated and should be considered an error" (section 7.5). While this is
sound advice in the context of user-facing applications (e.g. web browsers)
and for URIs that might be stored and displayed insecurely, it is not
necessarily valid for AMQP applications. Many AMQP applications are
server-based, and make AMQP connections on behalf of the application as a
whole rather than for specific users. So the username and password identify
the application rather than a human user, and are likely to be included
with connection parameters appearing in a secure store of configuration
settings. User-facing AMQP applications, which make AMQP connections on
behalf of specific users, are also possible. In such cases the username and
password may be provided by the user to identify themselves. But such AMQP
applications are the exception rather than the rule. Thus authors of AMQP
applications implementing this specifica tion should not consider
themselves bound by section 7.5 of RFC3986. Please also see section 5
("Security Considerations") below.
https://www.rabbitmq.com/uri-spec.html


Reply to this email directly or view it on GitHub
#25353 (comment).

@ben-page ben-page changed the title url.parse() fails if auth contains colons url.parse() fails if auth contain a colons May 20, 2015
@ben-page ben-page changed the title url.parse() fails if auth contain a colons url.parse() fails if auth contain a colon May 20, 2015
@ben-page
Copy link
Author

I created a pull request to address this issue. How long do they normally take to be reviewed?

@jasnell
Copy link
Member

jasnell commented May 26, 2015

I'll review either today or tomorrow.
On May 26, 2015 1:20 PM, "Ben Page" notifications@github.com wrote:

I created a pull request to address this issue. How long does it normally
for them to reviewed?


Reply to this email directly or view it on GitHub
#25353 (comment).

@jasnell
Copy link
Member

jasnell commented May 26, 2015

@ben-page if you haven't done so already, can you also open an issue either at http://github.com/nodejs/node or http://github.com/nodejs/io.js that points back to this issue and the PR. That'll be the best way to ensure that everyone who needs to look at this will spot it. Given that this is an API change in a bit of code that's already being looked at, we'll want to make sure it gets broader review before landing.

@jasnell
Copy link
Member

jasnell commented Jun 5, 2015

nodejs/node#49

@jasnell jasnell self-assigned this Jun 5, 2015
@bobzoller
Copy link

I published urlurl based on @ben-page's PR #25359 so I can use this today.

@ChALkeR
Copy link
Member

ChALkeR commented Apr 8, 2016

This is reported to nodejs/node issues as nodejs/node#1802.
No need to keep a duplicate open here, the discussion should move there.

@ChALkeR ChALkeR closed this as completed Apr 8, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants