```
======================================================================
ERROR: test_unicode_input (__main__.TestPypandoc)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests.py", line 279, in test_unicode_input
    written = pypandoc.convert(u'<h1>\xfc\xe4\xf6\xee\xf4\xfb</h1>', 'md', format='html')
  File "[...]/pypandoc/pypandoc/__init__.py", line 58, in convert
    path = _identify_path(source)
  File "[...]/pypandoc/pypandoc/__init__.py", line 159, in _identify_path
    result = urlparse(source)
  File "/usr/lib/python3.5/urllib/parse.py", line 295, in urlparse
    url, scheme, _coerce_result = _coerce_args(url, scheme)
  File "/usr/lib/python3.5/urllib/parse.py", line 115, in _coerce_args
    return _decode_args(args) + (_encode_result,)
  File "/usr/lib/python3.5/urllib/parse.py", line 99, in _decode_args
    return tuple(x.decode(encoding, errors) if x else '' for x in args)
  File "/usr/lib/python3.5/urllib/parse.py", line 99, in <genexpr>
    return tuple(x.decode(encoding, errors) if x else '' for x in args)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
```
Apparently the failure is in `_identify_path(source)`, where `source` gets encoded to UTF-8 while looking for a local file, but `urlparse` then tries to decode it as ASCII.

The attached patch (0001-Fix-parsing-of-unicode-paths-on-non-unicode-locales.txt) fixes the issue by not saving the encoded data into `source` but only encoding it temporarily. It does not break any other test (I've checked with both the C locale and a UTF-8 one).
(please let me know if you prefer it as a PR.)
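
For illustration, here is a minimal sketch of the failure mode and of the approach described above. It is not the pypandoc code or the attached patch: the helper name `looks_like_path_or_url` and its http/https check are made up for this example, and it only assumes that the filesystem lookup is the one place that needs bytes.

```python
import os
from urllib.parse import urlparse

source = '<h1>\xfc\xe4\xf6\xee\xf4\xfb</h1>'

# Handing UTF-8 encoded bytes to urlparse makes it decode them as ASCII,
# which fails on the first non-ASCII byte.
try:
    urlparse(source.encode('utf-8'))
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True


def looks_like_path_or_url(source):
    """Hypothetical stand-in for _identify_path (illustration only)."""
    try:
        # Encode only for the filesystem check; `source` itself stays a str.
        if os.path.exists(source.encode('utf-8')):
            return True
    except (UnicodeError, ValueError):
        pass
    # urlparse receives the original str, so no ASCII decoding takes place.
    result = urlparse(source)
    return result.scheme in ('http', 'https')
```

With this shape the encoded bytes never leave the `os.path.exists` call, so the later `urlparse` call sees the same `str` the caller passed in.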