Suggestion: Specify the character set as utf-8 when opening the file in the file config.py, or use character set detection #3022

Closed
yinhedot opened this issue Mar 26, 2021 · 4 comments · Fixed by #3592
Assignees
Labels
type:bug Something isn't working

Comments

@yinhedot

Summary

Suggestion: specify UTF-8 as the character set when opening the config file in config.py, or use character set detection.
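For the "character set detection" alternative, one possible shape is to try UTF-8 first and only fall back to the locale's preferred encoding if that fails. This is a minimal sketch, not Streamlit's code: `decode_config_bytes` is a hypothetical helper, and the choice of `utf-8-sig` (to tolerate a byte order mark) and of `errors="replace"` in the fallback are assumptions.

```python
import locale


def decode_config_bytes(raw):
    """Decode raw config bytes, preferring UTF-8 over the locale default.

    Hypothetical helper for illustration only.
    """
    try:
        # "utf-8-sig" decodes plain UTF-8 and also strips a leading BOM.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: fall back to the platform's preferred encoding and
        # replace undecodable bytes instead of raising.
        fallback = locale.getpreferredencoding(False)
        return raw.decode(fallback, errors="replace")
```

The caller would open the file in binary mode (`"rb"`) and pass the bytes to this helper, so the decision about encoding is made explicitly instead of by the platform.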

Steps to reproduce

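  1. Put a non-ASCII character, encoded as UTF-8, in config.toml (the comment below means "disable magic commands"):
[runner]
# 禁止掉 magic commands
magicEnabled = false

  2. Run streamlit run xxx.py

  3. The following error is raised: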

  File "c:\python38_64\lib\site-packages\streamlit\hashing.py", line 40, in <module>
    from streamlit.folder_black_list import FolderBlackList
  File "c:\python38_64\lib\site-packages\streamlit\folder_black_list.py", line 39, in <module>
    if config.get_option("global.developmentMode"):
  File "c:\python38_64\lib\site-packages\streamlit\config.py", line 104, in get_option
    config_options = get_config_options()
  File "c:\python38_64\lib\site-packages\streamlit\config.py", line 1055, in get_config_options
    file_contents = input.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xad in position 6: illegal multibyte sequence

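Cause: this snippet in config.py opens the file without an explicit encoding, so Python falls back to the platform default (GBK on this machine):

    with open(filename, "r") as input:
        file_contents = input.read()

Suggested fix:

    with open(filename, "r", encoding="utf-8") as input:
        file_contents = input.read()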
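The difference can be reproduced on any platform by passing the encoding explicitly. The sketch below is an illustration, not Streamlit code: `read_config` and `demo` are hypothetical names, and `encoding="gbk"` stands in for the default that Windows picked in the traceback above.

```python
import os
import tempfile

# Same content as the config.toml in the steps to reproduce.
CONFIG_TEXT = "[runner]\n# 禁止掉 magic commands\nmagicEnabled = false\n"


def read_config(filename, encoding=None):
    """Read a text file; encoding=None mimics the original call (locale default)."""
    with open(filename, "r", encoding=encoding) as f:
        return f.read()


def demo():
    """Write the config as UTF-8, then read it back as GBK and as UTF-8."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "config.toml")
        with open(path, "w", encoding="utf-8") as f:
            f.write(CONFIG_TEXT)
        try:
            read_config(path, encoding="gbk")  # simulates the failing default
            gbk_failed = False
        except UnicodeDecodeError:
            gbk_failed = True
        utf8_text = read_config(path, encoding="utf-8")  # the suggested fix
    return gbk_failed, utf8_text
```

Reading with `encoding="gbk"` fails on the UTF-8 bytes of the comment, while `encoding="utf-8"` returns the text unchanged regardless of the machine's locale.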
yinhedot added the type:bug (Something isn't working) and status:needs-triage (Has not been triaged by the Streamlit team) labels Mar 26, 2021
@jroes
Contributor

jroes commented Mar 26, 2021

This reminds me of #2615. Might be worth us auditing git grep open\( across the board for this kind of issue.
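A plain `git grep` also matches binary-mode opens and calls that already pass an encoding. One way to narrow the audit is to walk the syntax tree instead. This is a rough sketch under stated assumptions: `find_open_without_encoding` is a hypothetical helper, it only looks at calls to the bare name `open`, and it does not cover `io.open`, `Path.open`, or `Path.read_text`.

```python
import ast


def find_open_without_encoding(source, filename="<string>"):
    """Return sorted line numbers of text-mode open() calls with no encoding."""
    tree = ast.parse(source, filename=filename)
    hits = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        if not (isinstance(node.func, ast.Name) and node.func.id == "open"):
            continue
        # encoding given by keyword, or positionally as the fourth argument
        if any(kw.arg == "encoding" for kw in node.keywords) or len(node.args) >= 4:
            continue
        # mode is the second positional argument or the mode= keyword
        mode = node.args[1] if len(node.args) >= 2 else None
        for kw in node.keywords:
            if kw.arg == "mode":
                mode = kw.value
        # binary mode takes no encoding, so it is not a finding
        if isinstance(mode, ast.Constant) and isinstance(mode.value, str) and "b" in mode.value:
            continue
        hits.append(node.lineno)
    return sorted(hits)
```

Running this over each `.py` file in the repository would list the call sites that still depend on the platform's default encoding.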

@yinhedot
Author

It's good coding practice, just as Go source files must be encoded in UTF-8.

@jroes
Contributor

jroes commented Mar 26, 2021

I did a little more research and found some interesting things:

@kmcgrady
Collaborator

Thanks for the report, @yinhedot! I think @jroes is right. We should just default to UTF-8.

kmcgrady removed the status:needs-triage (Has not been triaged by the Streamlit team) label Mar 26, 2021
kajarenc self-assigned this May 17, 2021
kmcgrady assigned kmcgrady and vdonato and unassigned kajarenc and kmcgrady Jul 12, 2021

5 participants