diff --git a/pep-0686.rst b/pep-0686.rst
index 0978770670f..b22c9fc28a5 100644
--- a/pep-0686.rst
+++ b/pep-0686.rst
@@ -30,7 +30,7 @@ UTF-8 becomes de-facto standard text encoding.
   default.
 * Most websites and text data on the internet uses UTF-8.
 * And many other popular programming languages including node.js, Go, Rust,
-  Ruby, and Java uses UTF-8 by default.
+  and Java uses UTF-8 by default.

 Changing the default encoding to UTF-8 makes Python easier to interoperate with
 them.
@@ -44,34 +44,25 @@ source files). Inconsistent default encoding caused many bugs.
 Specification
 =============

-Changes to UTF-8 mode
----------------------
-
-Currently, UTF-8 mode affects to ``locale.getpreferredencoding()``.
-
-This PEP proposes to remove this override. UTF-8 mode will not affect to
-``locale`` module.
-
-After this change, UTF-8 mode affects to:
-
-* stdin, stdout, stderr
-
-  * User can override it with ``PYTHONIOENCODING``.
+Enable UTF-8 mode by default
+----------------------------

-* filesystem encoding
+Python enables UTF-8 mode by default.

-* ``TextIOWrapper`` and APIs using it including ``open()``,
-  ``Path.read_text()``, ``subprocess.Popen(cmd, text=True)``, etc...
+User can still disable UTF-8 mode by setting ``PYTHONUTF8=0`` or ``-X utf8=0``.

-This change will be introduced in Python 3.11 if possible.
+``locale.get_encoding()``
+-------------------------

-Enable UTF-8 mode by default
-----------------------------
+Add ``locale.get_encoding()``. It is same to
+``locale.getpreferredencoding(False)`` except it don't follow UTF-8 mode.

-Python enables UTF-8 mode by default.
+This API will be used by ``io.TextIOWrapper`` to support ``encoding="locale"``
+option.

-User can still disable UTF-8 mode by setting ``PYTHONUTF8=0`` or ``-X utf8=0``.
+This change will be released in Python 3.11 so that users can prepare before
+UTF-8 mode is enabled by default.


 Backward Compatibility
@@ -86,10 +77,14 @@ should be announced very loudly.

 To resolve this backward incompatibility, users can do:

-* Disable UTF-8 mode
+* Disable UTF-8 mode.
 * Use ``EncodingWarning`` to find where the default encoding is used and use
-  ``encoding="locale"`` option to keep using locale encoding
+  ``encoding="locale"`` option if locale encoding should be used
   (as defined in :pep:`597`).
+* Find every occurrence of ``locale.getpreferredencoding(False)`` in the
+  application, and replace it with ``locale.get_locale_encoding()`` if
+  locale encoding should be used.
+* Test the application with UTF-8 mode.


 Preceding examples
@@ -125,11 +120,10 @@ How to teach this
 =================

 For new users, this change reduces things that need to teach.
+Users don't need to learn about text encoding in their first year.
+They need to learn it when they need to use non-UTF-8 text files.

-Users can delay learning about text encoding until they need to handle
-non-UTF-8 text files.
-
-For existing users, see `Backward compatibility`_ section.
+For existing users, see the `Backward compatibility`_ section.


 References
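
Note on the "Test the application with UTF-8 mode" step in the patch: a minimal sketch of what such a check might look like, using only calls that already exist in released Python (``sys.flags.utf8_mode`` and ``locale.getpreferredencoding(False)``). It deliberately avoids the new ``locale`` function proposed in this diff, since its final name is not settled here. The helper names ``utf8_mode_enabled`` and ``roundtrip`` are hypothetical and chosen for illustration only.

```python
import locale
import sys
import tempfile
from pathlib import Path


def utf8_mode_enabled() -> bool:
    """Report whether this interpreter is running in UTF-8 mode."""
    # sys.flags.utf8_mode is 1 when started with -X utf8 or PYTHONUTF8=1.
    return bool(sys.flags.utf8_mode)


def roundtrip(text: str, encoding: str) -> str:
    """Write text to a temporary file and read it back.

    The encoding is passed explicitly on both sides, so the result does
    not depend on the locale or on UTF-8 mode.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample.txt"
        path.write_text(text, encoding=encoding)
        return path.read_text(encoding=encoding)


if __name__ == "__main__":
    print("UTF-8 mode enabled:", utf8_mode_enabled())
    # Today this value is overridden when UTF-8 mode is on.
    print("getpreferredencoding(False):", locale.getpreferredencoding(False))
    print(roundtrip("naïve café", "utf-8"))
```

Running the script once normally and once with ``python -X utf8`` shows how the reported encoding changes between the two modes, while the explicit-encoding round trip behaves the same in both.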