urllib.quote horribly mishandles unicode as second parameter #68073
All hell breaks loose when unicode is passed as the second argument to urllib.quote in Python 2:
>>> import urllib
>>> urllib.quote('\xce\x91', u'')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib.py", line 1292, in quote
    if not s.rstrip(safe):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 0: ordinal not in range(128)
This on its own wouldn't be that bad - just another Python 2 unicode wonkiness. However, coupled with the caching done by the quote function (quoters are cached based on the second parameter, and u'' == ''), it means that a random preceding call to quote from an entirely different place in the application can break your code:
$ python2
Python 2.7.9 (default, Dec 11 2014, 04:42:00)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> urllib.quote('\xce\x91', '')
'%CE%91'
>>>
$ python2
Python 2.7.9 (default, Dec 11 2014, 04:42:00)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> urllib.quote('a', u'')
'a'
>>> urllib.quote('\xce\x91', '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib.py", line 1292, in quote
    if not s.rstrip(safe):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 0: ordinal not in range(128)
Good luck debugging that. So, one of two things needs to happen:
|
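The cross-call breakage described in the report comes from a cache whose key compares equal across two string types while the cached value keeps the first caller's type. Below is a toy model of that mechanism only, not the actual urllib source. It is written for Python 3, where `UnicodeLike` (a hypothetical `str` subclass) stands in for Python 2's `unicode`, and `cached_safe` is an illustrative name.

```python
# Toy model of the cache-pollution mechanism; NOT the real urllib code.
# UnicodeLike is a hypothetical stand-in for Python 2's `unicode`:
# it compares equal to, and hashes like, a plain str.
class UnicodeLike(str):
    pass


_cache = {}


def cached_safe(safe):
    """Return the cached `safe` object for any key equal to `safe`."""
    # The dict lookup matches on equality and hash, not on type, so the
    # first caller's object is what every later caller gets back.
    if safe not in _cache:
        _cache[safe] = safe
    return _cache[safe]


first = cached_safe(UnicodeLike(""))  # first caller passes the "unicode" value
second = cached_safe("")              # unrelated later caller passes a plain str
print(type(second).__name__)          # UnicodeLike
```

The second caller passed a plain string but received the object stored by the first caller, which is the same shape of problem as `u''` and `''` sharing one cache entry in the report.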
The TypeError isn't going to happen, for backward compatibility reasons. A fix isn't likely to happen either because, to my understanding, Python 2 doesn't really support unicode in urllib (if I'm wrong about that, the answer changes). I'm not sure whether casting to string would have backward compatibility issues or not (I suspect it would; someone would have to investigate that question as a first step). |
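For callers who cannot wait on a stdlib change, the "cast to string" idea can be tried on the caller's side. The following is a minimal sketch under assumptions: `quote_native_safe` is a hypothetical helper name, and it assumes `safe` only ever contains ASCII characters. It only protects calls that go through the wrapper; if some other code has already called `quote` with a unicode `safe`, the cache may already hold the unicode value, so this alone does not make the problem go away.

```python
import sys

try:
    from urllib import quote        # Python 2
except ImportError:
    from urllib.parse import quote  # Python 3


def quote_native_safe(s, safe="/"):
    """Hypothetical wrapper: pass `safe` to quote() as a native str."""
    # On Python 2, a unicode `safe` is encoded to a byte string before it
    # reaches quote(). Non-ASCII input raises UnicodeEncodeError here, at
    # the call site, rather than later inside quote().
    if sys.version_info[0] == 2 and not isinstance(safe, str):
        safe = safe.encode("ascii")
    return quote(s, safe)


print(quote_native_safe("a b/c", ""))  # a%20b%2Fc
```

Whether the same coercion would be acceptable inside urllib itself is exactly the backward compatibility question raised above; the sketch takes no position on that.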
Couldn't this be fixed in a backwards compatible way by clearing the cache when this type of error occurs? We can do this by wrapping the offending line with a try/except, then checking to see if the cache is corrupted. If it is, then we clear the cache and try again. try: |
Python 2 is EOL, so I think this issue should be closed. |