Possible UnicodeError caused by missing encoding="utf-8" #1908

Closed
methane opened this issue Feb 13, 2021 · 1 comment · Fixed by #1940
Labels
bug:normal affects many people or has quite an impact

Comments

methane commented Feb 13, 2021

def get_py_project_toml(path):
    with open(str(path)) as file_handler:
        config_data = toml.load(file_handler)

TOML files must be encoded in UTF-8. Please add encoding="utf-8" here.
Without it, open() falls back to the locale encoding, which can raise UnicodeDecodeError when pyproject.toml contains non-ASCII characters and the locale encoding is not UTF-8 (e.g. on Windows).
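
A minimal sketch of the failure and the proposed fix, using only the standard library. The helper names `read_text` and `demo` are hypothetical and are not part of tox; cp1252 is passed explicitly to simulate a legacy Windows locale, whereas the real code reaches it implicitly by omitting the `encoding` argument.

```python
import os
import tempfile


def read_text(path, encoding=None):
    """Read a text file; encoding=None falls back to the locale encoding."""
    with open(str(path), encoding=encoding) as file_handler:
        return file_handler.read()


def demo():
    """Return (legacy_decode_failed, utf8_round_trip_ok) for a sample file."""
    # "Đ" (U+0110) is encoded as the bytes C4 90 in UTF-8,
    # and byte 0x90 has no mapping in Python's cp1252 codec.
    content = 'description = "\u0110"\n'
    fd, path = tempfile.mkstemp(suffix=".toml")
    try:
        with os.fdopen(fd, "wb") as raw:
            raw.write(content.encode("utf-8"))
        try:
            # Simulates reading with a legacy Windows locale encoding.
            read_text(path, encoding="cp1252")
            legacy_failed = False
        except UnicodeDecodeError:
            legacy_failed = True
        # Passing encoding="utf-8" explicitly is the requested fix.
        utf8_ok = read_text(path, encoding="utf-8") == content
        return legacy_failed, utf8_ok
    finally:
        os.remove(path)
```

Applied to the function quoted above, the change would presumably amount to `open(str(path), encoding="utf-8")`.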

with TemporaryFile("wt") as file:
    with redirect_stdout(file):
        with redirect_stderr(file):
            yield

On Windows, stdout is UTF-8 encoded when it is attached to a console (e.g. _WindowsConsoleIO), but the default encoding of TemporaryFile() is the legacy locale encoding. UTF-8 is therefore the better encoding for this purpose.

This second case is not a real bug in practice, because the function is used only here:

with suppress_output():
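
A possible shape for the second suggestion, as a sketch only. It assumes the quoted block is the body of a `contextmanager`-decorated `suppress_output` function, which the snippet above does not show in full, and simply passes `encoding="utf-8"` to `TemporaryFile`.

```python
import sys
from contextlib import contextmanager, redirect_stderr, redirect_stdout
from tempfile import TemporaryFile


@contextmanager
def suppress_output():
    """Discard anything written to stdout/stderr inside the with block."""
    # encoding="utf-8" lets the temporary file accept any Unicode text,
    # independent of the locale encoding.
    with TemporaryFile("wt", encoding="utf-8") as file:
        with redirect_stdout(file):
            with redirect_stderr(file):
                yield


if __name__ == "__main__":
    with suppress_output():
        print("\u0110")  # swallowed; does not depend on the locale encoding
    print("visible again")
```

With the locale default, a write of characters outside the legacy code page could raise UnicodeEncodeError when the buffer is flushed; with UTF-8 that cannot happen for valid text.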

methane added the bug:normal (affects many people or has quite an impact) label Feb 13, 2021
domdfcoding commented

I have encountered this issue "in the wild" with a pyproject.toml file containing non-ASCII characters encoded as UTF-8.

The traceback is:

Traceback (most recent call last):
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\tox\__main__.py", line 4, in <module>
    tox.cmdline()
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\tox\session\__init__.py", line 44, in cmdline
    main(args)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\tox\session\__init__.py", line 65, in main
    config = load_config(args)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\tox\session\__init__.py", line 81, in load_config
    config = parseconfig(args)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\tox\config\__init__.py", line 274, in parseconfig
    toml_content = get_py_project_toml(config_file)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\tox\config\__init__.py", line 306, in get_py_project_toml
    config_data = toml.load(file_handler)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\toml\decoder.py", line 156, in load
    return loads(f.read(), _dict, decoder)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 219: character maps to <undefined>
