Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect RFC 3986 validation? #1640

Closed
saucecode opened this issue Jun 28, 2019 · 4 comments
Closed

Incorrect RFC 3986 validation? #1640

saucecode opened this issue Jun 28, 2019 · 4 comments

Comments

@saucecode
Copy link

As of release 1.25, I believe certain RFC 3986 compliant URIs have been wrongly marked as invalid, leading to urllib3.exceptions.LocationParseError: Failed to parse exceptions.

I'm thinking that this is caused by the appearance of an @ (at) symbol in the userinfo portion of the URI, which previously went ignored.

To reproduce:

>>> import urllib3 # 1.25.3
>>> urllib3.util.parse_url('socks5://saucecode@example.com:myPassword@proxy.example.com:1080')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/julian/.local/lib/python3.6/site-packages/urllib3/util/url.py", line 234, in parse_url
    raise LocationParseError(url)
urllib3.exceptions.LocationParseError: Failed to parse: socks5://saucecode@example.com:myPassword@proxy.example.com:1080

Unfortunately I'm not all that good at reading RFC documents, so I'm not sure if the @ (at) symbol appearing is in violation of the standard or not. To gather more evidence, I tried using uritools - another Python package advertising RFC 3986 compliance - to see what it thinks of that URI. It agrees that it is compliant.

>>> import uritools
>>> uritools.isuri('socks5://saucecode@example.com:myPassword@proxy.example.com:1080')
True

I'll leave it to better minds to judge if this is the intended behavior.


Some loosely related background:

My vpn provider provides SOCKS5 proxies to their customers which I use with Python requests in my own scripts, as an easy way to proxy HTTP traffic. Python requests requires that socks5 proxies are specified as socks5://username:password@vpn.example.com:1080, and in this case, the username was an email address.

Since upgrading urllib3, my scripts have been throwing LocationParseError when one of these proxy URIs is specified.

@sethmlarson
Copy link
Member

So according to the RFC the @ character isn't allowed within the userinfo section.

From RFC 3986 Appendix A:

userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
pct-encoded   = "%" HEXDIG HEXDIG
unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="

So that URI is technically not RFC 3986 compliant. However I wonder if we can loosen that restriction as we simply strip off userinfo and @ is similarly not allowed within the host section.

@ZedNaught
Copy link

I also ran into this issue trying to use a proxy that requires email:password authentication with the requests package. I've had luck URL escaping the @ in my email address to %40, though to be honest I haven't pinpointed exactly where it gets unescaped haha.

As an aside, I'll also note that once I got past this issue, I ran into PySocks incompatibility with IPv6, which I imagine will be an issue for you as well – I ended up just using http for my proxy. Interestingly, I wasn't having this issue with an older environment (currently on Python 3.7.3, urllib3 1.25.3) which I assume is related to a recent change in how Python or urllib3 handle IPv6.

@sethmlarson
Copy link
Member

@ZedNaught It get unescaped on the proxy side, they probably handle changing percent-encoded bytes back into ASCII before validating your username+password.

I'd love to hear more about the IPv6+PySocks incompatibility, could you open a separate issue and provide more detail about the environment it is working with and the one it isn't working with?

@sethmlarson
Copy link
Member

Closed via #1647

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants