Redirect is not working correctly in urllib2 #58340
When the server sends only a query string as the redirect URL, urllib2 redirects to an incorrect address. The error occurs on the page http://kniznica.uniza.sk/opac. The server sends only the query-string part of the URI in the Location header (i.e. ?fs=04D07295D4434730A51C95A9F1727373&fn=main). The path is then incorrectly stripped from the original URL, and urllib2 redirects to http://kniznica.uniza.sk/?fs=04D07295D4434730A51C95A9F1727373&fn=main. The error was introduced by the fix for bpo-2464. I think the attached patch fixes the error (it works for me).
I forgot to mention that the correct URL in the example would be http://kniznica.uniza.sk/opac?fs=04D07295D4434730A51C95A9F1727373&fn=main.
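For reference, the expected resolution can be reproduced with `urllib.parse.urljoin` alone, which keeps the base path when the reference consists only of a query. This is a minimal sketch using the URLs from the report; the helper name `resolve_location` is made up for illustration and is not part of urllib.

```python
from urllib.parse import urljoin

# URLs taken from the report above.
BASE_URL = "http://kniznica.uniza.sk/opac"
LOCATION = "?fs=04D07295D4434730A51C95A9F1727373&fn=main"


def resolve_location(base_url, location):
    """Resolve a Location header value against the URL that was requested.

    Hypothetical helper: it only wraps urljoin, which keeps the base path
    when the reference has no path of its own.
    """
    return urljoin(base_url, location)


resolved = resolve_location(BASE_URL, LOCATION)
print(resolved)
```

The printed URL keeps the `/opac` path, which is the behaviour the report expects from the redirect handler.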
→ curl -sI http://kniznica.uniza.sk/opac
HTTP/1.1 302 Moved Temporarily
→ python3.3
Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 01:25:11)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.parse
>>> urllib.parse.urlparse("http://kniznica.uniza.sk/opac")
ParseResult(scheme='http', netloc='kniznica.uniza.sk', path='/opac', params='', query='', fragment='')
>>> urllib.parse.urlparse("?fs=C79F09C9F1304E7AA4FF7C211BEA2B9B&fn=main")
ParseResult(scheme='', netloc='', path='', params='', query='fs=C79F09C9F1304E7AA4FF7C211BEA2B9B&fn=main', fragment='')
Redirection is defined at
The proposed patch looks good to me. A test case would be nice, though. Also, I wonder why the “malformed URL” logic needs to be in urllib.request. Surely it belongs either in urljoin() or in the underlying http.client. That needs more thought, but either way the current patch is a definite improvement.
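One way such a test could look is an end-to-end check against a throwaway local server that answers with a query-only Location header. This is a hypothetical sketch, not the test from any attached patch: the handler class, the function name, and the token value `abc` are made up, and it assumes a Python 3 release that already contains the fix.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class QueryOnlyRedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: redirects /opac to a query-only Location."""

    def do_GET(self):
        if "?" not in self.path:
            # Mimic the reported server: Location carries only a query string.
            self.send_response(302)
            self.send_header("Location", "?fs=abc&fn=main")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            # Echo the request target so the caller can see what was fetched.
            body = self.path.encode("ascii")
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_final_url():
    """Follow the redirect and return (port, final URL, response body)."""
    server = HTTPServer(("127.0.0.1", 0), QueryOnlyRedirectHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        # An empty ProxyHandler ignores any proxy set in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(f"http://127.0.0.1:{port}/opac", timeout=5) as resp:
            return port, resp.geturl(), resp.read().decode("ascii")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_final_url())
```

On an interpreter with the fix, the final URL should keep the `/opac` path; on an affected version the request would go to `/?fs=abc&fn=main` instead.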
urllib2_redirect_fix.2.patch adds a test. I was tempted to remove the whole block of code setting the path to “/”, but there is one minor disadvantage: if a redirect points to a so-called “malformed” URL without any path component, like “http://example.net” or “http://example.net?query”, geturl() would return this URL verbatim.
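The trade-off can be illustrated with a simplified reconstruction of the path-normalizing step. This is not the code from the patch: the function `resolve_redirect` and its `require_netloc` flag are hypothetical, and the sketch assumes the fix amounts to forcing the path to “/” only when the redirect URL carries a host.

```python
from urllib.parse import urljoin, urlparse, urlunparse


def resolve_redirect(base_url, location, require_netloc=True):
    """Hypothetical sketch of resolving a redirect target.

    require_netloc=False models the reported behaviour: any Location
    without a path gets "/" before it is joined with the base URL.
    require_netloc=True models the assumed fix: "/" is added only when
    the Location names a host, so query-only references are left alone.
    """
    parts = urlparse(location)
    if not parts.path and (parts.netloc or not require_netloc):
        parts = parts._replace(path="/")
    return urljoin(base_url, urlunparse(parts))


BASE = "http://kniznica.uniza.sk/opac"

# Query-only Location: the unguarded variant loses the /opac path.
print(resolve_redirect(BASE, "?fs=X&fn=main", require_netloc=False))
print(resolve_redirect(BASE, "?fs=X&fn=main", require_netloc=True))

# Absolute URLs without a path still get "/" in the guarded variant.
print(resolve_redirect(BASE, "http://example.net?query"))
print(resolve_redirect(BASE, "http://example.net"))
```

Keeping the guarded block means the pathless absolute URLs are still normalized, while a query-only reference is resolved against the original path.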
I will try to commit this soon.
New changeset 52a7f580580c by Martin Panter in branch '3.5':
New changeset 789a3f87bde1 by Martin Panter in branch '2.7':
New changeset 841a9a3f3cf6 by Martin Panter in branch 'default':