Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trim excess leading path separators #6644

Merged
merged 1 commit into from
Feb 23, 2024
Merged

Conversation

sigmavirus24
Copy link
Contributor

A URL with excess leading / (path-separator)s would cause urllib3 to attempt to reparse the request-uri as a full URI with a host and port. This bypasses that logic in ConnectionPool.urlopen by replacing these leading /s with just a single /.

Closes #6643

tests/test_adapters.py Outdated Show resolved Hide resolved
tests/test_adapters.py Outdated Show resolved Hide resolved
tests/test_adapters.py Outdated Show resolved Hide resolved
tests/test_adapters.py Outdated Show resolved Hide resolved
@@ -389,7 +390,7 @@ def request_url(self, request, proxies):
proxy_scheme = urlparse(proxy).scheme.lower()
using_socks_proxy = proxy_scheme.startswith("socks")

url = request.path_url
url = re.sub("^/+", "/", request.path_url)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned on urllib3/urllib3#3352 this could also be

url = f"/{request.path_url.lstrip('/')}"

I could benchmark these but I don't particularly care what the implementation is. I just threw this together to show that it can be fixed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the f-string (Python 3.9-3.12 tested) is ~4x faster but we're talking on the scale of nanoseconds so it's basically moot. I'd vote the f-string for readability, but don't have a strong opinion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm also happy to shove this into a branch too like

if path.startswith('//'):

To make it clearer that we only care about the separator being repeated. What I want is clarity in the reader as to why we're doing this. My old school brain things the regexp is clearer and the f-string looks sus but that's just my opinion and I'm not holding it closely

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, branching seems fine to me too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did both (f string and branch)

@nateprewitt nateprewitt added this to the 2.32.0 milestone Feb 22, 2024
A URL with excess leading / (path-separator)s would cause urllib3 to
attempt to reparse the request-uri as a full URI with a host and port.
This bypasses that logic in ConnectionPool.urlopen by replacing these
leading /s with just a single /.

Closes psf#6643
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Leading slash in uri followed by column fails
3 participants