New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
urllib IPv6 parsing fails with special characters in passwords #77523
Comments
The documentation specifies to follow RFC 2396 (https://tools.ietf.org/html/rfc2396.html) but fails to parse a user:password@host url in urllib.parse.urlsplit (https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlsplit) when the password contains an '[' character. |
I presume this is about parsing a URL like >>> urlsplit("//user:[@host")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/proj/python/cpython/Lib/urllib/parse.py", line 431, in urlsplit
raise ValueError("Invalid IPv6 URL")
ValueError: Invalid IPv6 URL Ideally the square bracket should be escaped as %5B. Related reports about parsing unescaped delimiters in a URL password are bpo-18140 (fragment #, query ?) and bpo-23328 (slash /). |
RFC 2396 explicitly excludes the use of [ and ] in URLs. RFC 2732 <https://www.ietf.org/rfc/rfc2732.txt\> defines the syntax for IPv6 URLs, and allows [ and ] ONLY in the host part. So I'd say that the behaviour is arguably correct (if somewhat unfortunate) |
I would like to add to this bug - the password field on the URL cannot contain a pound sign or question mark or the parser incorrectly parses the URL, as this gist demonstrates - https://gist.github.com/metaperl/fc6f43bf6b9a9f874b8f27e29695e68c |
Also note, if SQLAlchemy gives any guidance, then note that SA unquotes both the username and password of the URL: https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/engine/url.py#L274 |
Regarding "RFC 2396 explicitly excludes the use of [ and ] in URLs. RFC 2732 <https://www.ietf.org/rfc/rfc2732.txt\> defines the syntax for IPv6 URLs, and allows [ and ] ONLY in the host part. So I'd say that the behaviour is arguably correct (if somewhat unfortunate)" I would say that a square bracket CAN be used in the password, but that it should be urlencoded and that this library should perform a urldecode for both username and password, just as SQLAlchemy does. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: