Summary
When accessing a URL using requests, a redirect causes a proxy error because the initial web page DOES require the proxy, but the page it redirects to DOES NOT. The program succeeds in using the proxy for the first web page, but when redirecting to the next web page it does not honor the 'no_proxy' setting.
Expected Result
After my program executes 'requests.get()', it should reach the initial website through the proxy, follow the redirect to the new website without going through the proxy, and then print the HTTP status code. In short, I expect the program to run and print the status code (ideally 200).
Actual Result
Instead, I get a proxy error when the program tries to access the website it redirects to, and the line that prints the HTTP status code is never executed.
I traced this down using the Python debugger, and my findings were as follows.
The proxy and no-proxy settings come from my environment. They are first read into the program as a dictionary in 'merge_environment_settings', which calls 'get_environ_proxies' to fetch the proxy settings from the environment (the proxy is needed at this point, so this is correct).
NOTE: if the proxy weren't needed at this point, 'get_environ_proxies' would have produced an empty dictionary for 'proxies', as long as the URL's host matched 'no_proxy' (thanks to the 'should_bypass_proxies' function).
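To make that behavior concrete, here is a simplified, pure-Python model of the bypass logic described above. The real 'should_bypass_proxies' and 'get_environ_proxies' live in 'requests/utils.py' and handle more cases (IP ranges, ports, 'NO_PROXY=*'); the host-suffix matching and the hostnames below are illustrative assumptions, not the actual library source:

```python
from urllib.parse import urlparse

def should_bypass(url, no_proxy):
    """Rough model of requests.utils.should_bypass_proxies():
    True when the URL's host matches an entry in no_proxy."""
    host = urlparse(url).hostname or ""
    for entry in no_proxy.replace(" ", "").split(","):
        if entry and (host == entry or host.endswith("." + entry)):
            return True
    return False

def environ_proxies(url, proxies, no_proxy):
    """Rough model of get_environ_proxies(): empty dict on bypass."""
    return {} if should_bypass(url, no_proxy) else dict(proxies)

proxies = {"http": "http://proxy.example.com:8080"}  # hypothetical proxy
print(environ_proxies("http://internal.example.com/a", proxies,
                      "internal.example.com"))  # -> {}
print(environ_proxies("http://external.example.com/a", proxies,
                      "internal.example.com"))  # proxy settings kept
```

This is the behavior the first request correctly relies on: a no-proxy match yields an empty 'proxies' dictionary.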
All is fine at this point as the first URL requires the proxy, BUT...
Further down the line, when the redirect occurs, 'rebuild_proxies' is called from within the 'resolve_redirects' function. This function creates a 'new_proxies' variable that copies the original environment proxy settings.
Further along in this function, the 'bypass_proxy' boolean is set from the 'should_bypass_proxies' function; it is correctly set to 'True', as 'no_proxy' matches the new URL (the redirect website).
At this point I expect 'new_proxies' to be altered, since the proxy bypass has been triggered, but it remains unchanged and 'rebuild_proxies' simply returns the original proxy settings.
Another note: within 'rebuild_proxies', the proxy-assignment block below is skipped because of the 'not bypass_proxy' check, yet the 'get_environ_proxies' function mentioned earlier is the only place I discovered that actually returns an empty dictionary for 'proxies' when it detects a no-proxy match. So by skipping that code, the only path I found that actually removes the proxy settings is skipped too.
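The shape of the problem described above can be modeled like this. This is a deliberately simplified sketch of the described control flow, not the actual 'Session.rebuild_proxies' source; the hostnames are hypothetical:

```python
from urllib.parse import urlparse

def should_bypass(url, no_proxy):
    """Rough stand-in for requests.utils.should_bypass_proxies()."""
    host = urlparse(url).hostname or ""
    return any(host == e or host.endswith("." + e)
               for e in no_proxy.split(",") if e)

def rebuild_proxies_model(redirect_url, env_proxies, no_proxy):
    """Sketch of the reported behavior: the function copies the proxy
    settings, and only the 'not bypass_proxy' branch touches them --
    nothing ever removes an entry when the bypass is triggered."""
    new_proxies = dict(env_proxies)          # copy of original settings
    bypass_proxy = should_bypass(redirect_url, no_proxy)
    if not bypass_proxy:
        pass  # (the real code merges environment proxies in here)
    return new_proxies                       # unchanged even on bypass

stale = rebuild_proxies_model("http://internal.example.com/",
                              {"http": "http://proxy.example.com:8080"},
                              "internal.example.com")
print(stale)  # the proxy entry is still present, hence the redirect error
```

Even though 'bypass_proxy' is 'True' for the redirect URL, the returned dictionary still carries the proxy, which is exactly what the debugger showed.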
I actually added a small bit of code to the end of 'rebuild_proxies' and it seemed to fix my program, but I think there is an underlying issue worth fixing.
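The snippet itself did not survive in this report. For illustration only, a fix along the lines described might look like the following at the end of 'rebuild_proxies'; this is a hedged sketch, not the author's original code or the upstream patch:

```python
from urllib.parse import urlparse

def rebuild_proxies_fixed(redirect_url, env_proxies, no_proxy):
    """Sketch: same copy as before, but drop the relevant proxy
    entries when the redirect target matches no_proxy."""
    parsed = urlparse(redirect_url)
    host = parsed.hostname or ""
    new_proxies = dict(env_proxies)
    bypass_proxy = any(host == e or host.endswith("." + e)
                       for e in no_proxy.split(",") if e)
    if bypass_proxy:
        # remove the proxies for this scheme so the redirect goes direct
        new_proxies.pop(parsed.scheme, None)
        new_proxies.pop("all", None)
    return new_proxies

print(rebuild_proxies_fixed("http://internal.example.com/",
                            {"http": "http://proxy.example.com:8080"},
                            "internal.example.com"))  # -> {}
```

With the bypassed entries popped, the redirected request would be sent directly rather than through the now-unreachable proxy.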
Reproduction Steps
System Information
Thanks for reading, and I hope this can be fixed!