Set the locale to C.UTF-8 for Python 3 #13

Closed
rspeer opened this issue Sep 25, 2014 · 3 comments · Fixed by #14

rspeer commented Sep 25, 2014

Python 3 uses Unicode everywhere, which makes it inconvenient to use in an environment that claims not to support Unicode, such as these Docker images. The system I/O encoding ends up being "ascii", which means (for example) that non-ASCII strings simply can't be used with the print() function.

Usually, on Debian, the system's default locale would have set the encoding to UTF-8, but locales aren't installed.
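To see which encoding a given container actually ends up with, something like the following can be run inside it. This is a minimal sketch using only the standard library; the helper name `describe_io_encoding` is made up for illustration and is not part of the image or of Python itself.

```python
import locale
import os
import sys


def describe_io_encoding():
    """Collect the settings that decide how print() encodes text."""
    return {
        # Encoding of the stream print() writes to by default.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        # Encoding Python derives from the current locale settings.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Environment variables that select the locale.
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
    }


if __name__ == "__main__":
    for name, value in describe_io_encoding().items():
        print("{}: {!r}".format(name, value))
```

If the reported encodings are ASCII variants and `LANG` is unset, the container is in the situation described above.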

The issue is easily fixed by adding one line to a Dockerfile, but I think it should be there in the Python 3 image itself so it doesn't catch people by surprise.

The line to add is:

ENV LANG C.UTF-8

yosifkit (Member) commented

See: #5 (comment)

rspeer commented Sep 25, 2014

That doesn't convince me that it's reasonable to run Python 3 in an ASCII locale.

Sometimes scripts print things. The print function uses actual Unicode in py3. It is one of the layers of the Unicode sandwich. But it relies on the locale to be reasonable.

I see no downside to setting a locale as simple as C.UTF-8.
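The dependence of print() on the stream's encoding can be reproduced without Docker. The sketch below is only an illustration: it swaps the real standard output for an in-memory `io.TextIOWrapper` whose encoding is chosen explicitly, and `print_to_stream` is a hypothetical helper name.

```python
import io


def print_to_stream(text, encoding):
    """Print text to an in-memory stream that uses the given encoding.

    Returns the bytes that reached the underlying buffer.
    """
    raw = io.BytesIO()
    # newline="\n" disables newline translation so the result is the same
    # on every platform.
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    print(text, file=stream)
    stream.flush()
    return raw.getvalue()


if __name__ == "__main__":
    char = chr(0x5555)
    # A UTF-8 stream can represent the character.
    print(print_to_stream(char, "utf-8"))
    # An ASCII stream cannot, so print() raises.
    try:
        print_to_stream(char, "ascii")
    except UnicodeEncodeError as exc:
        print("ascii stream failed:", exc)
```

The same print() call succeeds or fails depending only on the encoding attached to the stream, which for standard output comes from the locale.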

rspeer commented Sep 26, 2014

Here's a minimal example to show you that the Unicode problem is not in my code.

FROM python:3.4
RUN python -c "print(chr(0x5555))"

Currently, this gives:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\u5555' in position 0: ordinal not in range(128)

But on a correctly configured system, it gives:

啕

Here is a thread from the Python bug tracker about this issue, where many of the Python developers describe this as a bug in the OS distribution: http://bugs.python.org/issue19846

This state of things is kind of unfortunate, because other systems run Python 3 in an insufficient locale as well, and it makes Python 3 look broken. But it's probably not going to change anytime soon, as the developers are unwilling to override what the system locale says when the system locale is wrong. That's why I'm reporting it as a bug here. Docker is the "OS distribution" here, and it's the only place the issue can be fixed.

Nick Coghlan: "At the moment, setting "LANG=C" on a Linux system fundamentally breaks Python 3, and that's not OK."

STINNER Victor: "The solution is to fix the locale, not to fix Python. For example, don't set LANG to C."

hswong3i added a commit to alvistack/docker-bamboo that referenced this issue Sep 24, 2019