Set the locale to C.UTF-8 for Python 3 #13
Comments
See: #5 (comment)
That doesn't convince me that it's reasonable to run Python 3 in an ASCII locale. Sometimes scripts print things. The print function uses actual Unicode in py3. It is one of the layers of the Unicode sandwich. But it relies on the locale to be reasonable. I see no downside to setting a locale as simple as C.UTF-8.
Here's a minimal example to show you that the Unicode problem is not in my code.
Currently, this gives:
But on a correctly configured system, it gives:
Here is a thread from the Python bug tracker about this issue, where many of the Python developers describe this as a bug in the OS distribution: http://bugs.python.org/issue19846
This state of things is kind of unfortunate, because other systems run Python 3 in an insufficient locale as well, and it makes Python 3 look broken. But it's probably not going to change anytime soon, as the developers are unwilling to override what the system locale says when the system locale is wrong.
That's why I'm reporting it as a bug here. Docker is the "OS distribution" here, and it's the only place the issue can be fixed.
Nick Coghlan: "At the moment, setting "LANG=C" on a Linux system fundamentally breaks Python 3, and that's not OK."
STINNER Victor: "The solution is to fix the locale, not to fix Python. For example, don't set LANG to C."
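For anyone who wants to check what their own container reports, here is a small sketch using only standard-library calls. The helper name `describe_io_encoding` is made up for illustration; the values it prints depend entirely on the environment and the Python version it runs in.

```python
import locale
import os
import sys


def describe_io_encoding():
    """Collect the locale-related settings that influence Python's text I/O."""
    return {
        # Environment variables the C library consults when choosing a locale.
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        # Encoding Python derives from the current locale settings.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Encoding actually attached to standard output (may be None
        # if stdout has been replaced by an object without one).
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
    }


for name, value in describe_io_encoding().items():
    print(f"{name}: {value}")
```

If the encodings come back as ASCII (sometimes spelled ANSI_X3.4-1968), the environment is in the state this issue is about.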
Python 3 uses Unicode everywhere, which makes it inconvenient to use in an environment that claims not to support Unicode, such as these Docker images. The system I/O encoding ends up being "ascii", which means (for example) that non-ASCII strings simply can't be used with the print() function.
Usually, on Debian, the system's default locale would have set the encoding to UTF-8, but locales aren't installed.
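The effect of an "ascii" I/O encoding can be simulated without touching the system locale. The sketch below is not the snippet from this thread; it assumes an in-memory stream stands in for standard output, and the helper name `try_print` is hypothetical.

```python
import io


def try_print(text, encoding):
    """Print text to an in-memory stream that uses the given encoding.

    Returns the encoded bytes, or the UnicodeEncodeError that was raised.
    """
    buffer = io.BytesIO()
    # newline="\n" keeps the output identical across platforms.
    stream = io.TextIOWrapper(buffer, encoding=encoding, newline="\n")
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError as exc:
        return exc
    return buffer.getvalue()


# "caf\u00e9" contains one non-ASCII character (e with an acute accent).
ascii_result = try_print("caf\u00e9", "ascii")
utf8_result = try_print("caf\u00e9", "utf-8")

print(type(ascii_result).__name__)  # the ASCII stream rejects the string
print(utf8_result)                  # the UTF-8 stream encodes it
```

The same print() call succeeds or fails depending only on the encoding attached to the stream, which is the situation a script ends up in when the locale determines the encoding of standard output.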
The issue is easily fixed by adding one line to a Dockerfile, but I think it should be there in the Python 3 image itself so it doesn't catch people by surprise.
The line to add is: