Prioritize lowercase proxy variables in urllib.request #70991
While writing a function to replace a wget call, I noticed that something is wrong with urllib's proxy handling. I usually use the scheme "http_proxy= wget -N -nd URL" when I need to bypass the proxy, so I was pretty confused that this doesn't work with python(3). Creating an empty ProxyHandler isn't the real McCoy either. Digging into the issue, I found getproxies_environment, but couldn't make much sense of its behavior until I noticed the consequence: python3 needs the scheme "http_proxy= HTTP_PROXY= python3 ..." Since I, like everyone else, prefer gentle tones over loud ones and want to spare others this surprise in the future, I propose the attached patch: process environment variables in two passes, first uppercase, then lowercase, allowing an empty lowercase value to overrule any uppercase value. Please consider applying this. |
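For readers without the attachment, the two-pass idea can be sketched roughly as follows. This is not the attached patch: proxies_from_environ is a hypothetical helper that works on any mapping instead of os.environ, and the first pass here accepts any casing rather than only uppercase names.

```python
def proxies_from_environ(environ):
    """Hypothetical helper: map scheme -> proxy URL from an environment-like mapping."""
    suffix = "_proxy"
    proxies = {}
    # First pass: accept <scheme>_proxy in any casing, skipping empty values.
    for name, value in environ.items():
        lowered = name.lower()
        if value and lowered.endswith(suffix):
            proxies[lowered[:-len(suffix)]] = value
    # Second pass: all-lowercase names win; an empty lowercase value
    # removes whatever the first pass picked up for that scheme.
    for name, value in environ.items():
        if name == name.lower() and name.endswith(suffix):
            scheme = name[:-len(suffix)]
            if value:
                proxies[scheme] = value
            else:
                proxies.pop(scheme, None)
    return proxies


# Made-up values for illustration only.
cancelled = proxies_from_environ({"HTTP_PROXY": "http://bad-proxy", "http_proxy": ""})
preferred = proxies_from_environ({"HTTP_PROXY": "http://upper", "http_proxy": "http://lower"})
```

With this ordering the result no longer depends on how the mapping happens to be iterated, and the same rule covers no_proxy, since it also ends in _proxy.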
I think this should be applied as a bug fix to 2.7 and 3.5 as well. What do you think? Lowercase is the normal way to use these variables. I left some comments on the code review. A similar bug seems to exist for the “no_proxy” variable. Also, it would be nice to add a test case for this. |
Hi Hans-Peter, I agree with Martin's comments and suggestion. If I understand the suggestion correctly, the only change will be a documentation change, won't it? getproxies_environment() in its current form already fetches the lowercase proxy variable, which you want to highlight as the winner when two environment variables differ only in case. Thanks! |
It’s not just documentation, it is a real bug. If you run http_proxy="" HTTP_PROXY=http://bad-proxy python . . . the empty value of lowercase “http_proxy” should have priority over the other one. At the moment it depends on the order of os.environ.items(), which I understand is random. |
Hi Martin, hi Senthil, thanks for the valuable comments. I will incorporate your suggestions later today. Yes, Martin, it's a bug and should be fixed for 2.7 and 3.5 as well, but I was unsure whether I would get any feedback at all... Hence, this is a very nice experience for me. I'm out for jogging now, |
Hi Martin, hi Senthil, please find attached a new patch that incorporates your suggestions.
Yes, mixed-case situations are not handled in proxy_bypass_environment. BTW, while looking at the code, I noticed that most docstrings of the callers of proxy_bypass_environment are wrong: they say that the proxies dict is returned, but they return the value of proxy_bypass_environment(), not getproxies(). A follow-up patch could do this in order to clean up this mess: What do you think about the attached patch and the last paragraph? |
The second patch looks reasonable (I left one minor grammar comment). About getproxies_environment(), I meant that if you set no_proxy="", it may be ignored if NO_PROXY=. . . is also set. E.g. if you already have these set:
http_proxy=http://proxy
no_proxy=example.net
NO_PROXY=example.net
you should be able to override no_proxy="" to cancel bypassing for example.net. I agree the proxy_bypass() docstring looks wrong. If you want to write a patch to fix that, or to make the code more efficient or easier to understand, that would be good. |
Here we go: v3 fixes the following issues:
|
Here's the finalized version of this patch, including unit tests. |
v5: don't require the proxies argument in proxy_bypass_environment() |
I found two bugs; see the comments. In Python 2, it looks like the proxy_bypass_etc() functions are defined in urllib and imported into urllib2, so it makes sense to include the tests in test_urllib rather than test_urllib2. Technically I think proxy_bypass_environment() is meant to be an internal function, but it is safer to keep the changes minimal for bug fixes. So I think the optional proxies parameter should be okay. |
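The optional proxies parameter makes the bypass check easy to try without touching the process environment. A small sketch, assuming a Python 3 release that includes this fix; the proxy URL and host names are made-up examples, and the "no" key is how the proxies mapping carries the no_proxy value:

```python
import urllib.request

# Explicit mapping instead of reading os.environ; values are illustrative.
proxies = {"http": "http://proxy", "no": "example.net"}

# Host listed in no_proxy: the proxy should be bypassed.
bypass_listed = bool(urllib.request.proxy_bypass_environment("example.net", proxies))

# Host not listed: the proxy should be used.
bypass_other = bool(urllib.request.proxy_bypass_environment("example.org", proxies))

# No "no" entry at all, as after overriding no_proxy="": nothing is bypassed.
bypass_cleared = bool(
    urllib.request.proxy_bypass_environment("example.net", {"http": "http://proxy"})
)
```

The results are wrapped in bool() because the return value is only relied on for its truthiness here.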
|
The tests are in test_urllib. test_urllib2 tests proxy behaviour at a higher level, so I think they're in the correct module, aren't they? |
Yes that was my rambling way of saying that I had checked to see if they were in the right place, and reporting that it was all okay :) New patch seems okay to me. |
v7:
|
V7 looks good to me |
New changeset 49b975122022 by Senthil Kumaran in branch '3.5':
New changeset 316593f5bf73 by Senthil Kumaran in branch 'default': |
New changeset c502deb19cb0 by Senthil Kumaran in branch '2.7': |
This is fixed in all versions of Python. Thank you for your contribution, Hans-Peter Jansen. |
In a couple of systems, I have to stick with 3.4. Is there a chance to have this patch in 3.4 as well, if a new release 3.4 is made? |
Unfortunately no. 3.4 is in security-fix-only mode, and this doesn't qualify as a security fix. Usually bug fixes and feature additions are incentives for developers to upgrade their Python codebase. |