Python 3.6 cannot reopen .pyc file with non-ASCII path #76562
I have a problem: Python 3.6 cannot reopen a .pyc file whose path contains Chinese characters, while Python 3.5 can reopen the same .pyc file, as shown in the picture. |
run_file encodes the file path via PyUnicode_EncodeFSDefault, which encodes as UTF-8 on Windows, starting with 3.6. PyRun_SimpleFileExFlags subsequently tries to open this encoded path via _Py_fopen, which calls fopen. The CRT expects an ANSI-encoded path, so only the common ASCII subset will work. Non-ASCII paths will fail. This could be addressed in _Py_fopen by decoding the path and calling _wfopen instead of fopen. Executing a .pyc also fails in 3.5 if the wide-character path can't be encoded as ANSI, but the 3.5 branch only accepts security fixes. |
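The encoding mismatch described in this comment can be illustrated outside of C. The following is a minimal Python sketch, not the CPython code path itself: it encodes a path as UTF-8 (what the comment says PyUnicode_EncodeFSDefault produces on Windows since 3.6) and then reinterprets those bytes in an ANSI code page, roughly as the CRT's fopen would. The choice of cp936 as the ANSI code page and the helper name as_seen_by_ansi_fopen are assumptions made for illustration.

```python
def as_seen_by_ansi_fopen(path, ansi_codepage="cp936"):
    """Return the name an ANSI API would see for a UTF-8 encoded path.

    Assumption: cp936 (Simplified Chinese) stands in for the system ANSI
    code page; the real value depends on the Windows locale.
    """
    utf8_bytes = path.encode("utf-8")  # bytes handed to fopen() on 3.6+
    # The CRT interprets the same bytes using the ANSI code page instead.
    return utf8_bytes.decode(ansi_codepage, errors="replace")


ascii_path = "hello.cpython-36.pyc"
chinese_path = "北京市.cpython-36.pyc"

# ASCII bytes mean the same thing in both encodings, so the name survives.
print(as_seen_by_ansi_fopen(ascii_path) == ascii_path)
# Non-ASCII bytes are reinterpreted, so fopen() would look for another name.
print(as_seen_by_ansi_fopen(chinese_path) == chinese_path)
```

Only the ASCII name compares equal after the round trip, which matches the observation that "only the common ASCII subset will work".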
Thanks a lot. What should I do to reopen a .pyc file with a non-ASCII path?
|
Workarounds: (1) force 3.6 to use the legacy ANSI filesystem encoding by setting the environment variable PYTHONLEGACYWINDOWSFSENCODING. (2) Use 8.3 DOS names, if creating them is enabled on the volume. You can check their value in CMD via |
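Workaround (1) can be tried from a small launcher script. This is a hedged sketch rather than an official recipe: it starts a child interpreter with PYTHONLEGACYWINDOWSFSENCODING set and reports the filesystem encoding the child ends up with. The helper name fs_encoding_with_legacy_flag is hypothetical. On Windows with Python 3.6 or later the child is expected to report the legacy mbcs encoding; on other platforms the variable has no effect and the child reports its usual encoding.

```python
import os
import subprocess
import sys


def fs_encoding_with_legacy_flag():
    """Report the filesystem encoding of a child interpreter that was
    started with PYTHONLEGACYWINDOWSFSENCODING set.

    The variable only has an effect on Windows; elsewhere the child
    simply reports its normal filesystem encoding.
    """
    env = dict(os.environ, PYTHONLEGACYWINDOWSFSENCODING="1")
    result = subprocess.run(
        [sys.executable, "-c",
         "import sys; print(sys.getfilesystemencoding())"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


print(fs_encoding_with_legacy_flag())
```

The same env dictionary could be passed when launching the .pyc file itself, so the setting applies only to that process and not to the whole session.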
run_file() gets a wchar_t* string which comes from wmain() argv. run_file() encodes the wchar_t* using PyUnicode_EncodeFSDefault(). Later, PyRun_SimpleFileExFlags() indirectly calls fopen() with the encoded filename.
I agree that it's the correct fix. I would make _Py_fopen() more compatible with PEP 529. |
Typo: It* would |
Hum. In fact, this problem can be fixed differently: modify the PyRun_xxx() functions to pass the filename as a Unicode string. Maybe pass it as a wchar_t* string or even a Python str object. |
Thanks, Eryk, for catching the dup, I missed it somehow. @ZackerySpytz: do you plan to proceed with your PR? If not, I can pick it up -- this issue broke the software I develop after upgrading to 3.8. I filed bpo-42569 to hopefully clarify the status of _Py_fopen(), which became murky to me. |
I can reproduce the issue on Python 3.10 with a script called 北京市.py which contains: print("hello").
c:\> python 北京市.py
c:\>python __pycache__\北京市.cpython-310.pyc
And with my PR 23642 fix, it works as expected:
C:\>python __pycache__\北京市.cpython-310.pyc |
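The reproduction steps can also be scripted. The sketch below is an assumption-laden convenience, not part of the original report: the helper name run_compiled_script is hypothetical, and it writes the .pyc next to the source in a temporary directory rather than in __pycache__. It compiles a one-line script and then runs the .pyc file directly, which is the step that failed on affected Windows builds when the name is non-ASCII.

```python
import pathlib
import py_compile
import subprocess
import sys
import tempfile


def run_compiled_script(name="北京市"):
    """Compile <name>.py in a temp dir and run the resulting .pyc directly.

    Returns (returncode, stdout) of the child interpreter.
    """
    with tempfile.TemporaryDirectory() as tmp:
        source = pathlib.Path(tmp) / f"{name}.py"
        source.write_text('print("hello")\n', encoding="utf-8")
        # Write the bytecode next to the source instead of in __pycache__.
        compiled = py_compile.compile(
            str(source), cfile=str(source.with_suffix(".pyc")), doraise=True
        )
        result = subprocess.run(
            [sys.executable, compiled], capture_output=True, text=True
        )
        return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    print(run_compiled_script())
```

A non-zero return code for the non-ASCII name, combined with success for an ASCII name, would point to the path handling discussed in this issue rather than to the bytecode itself.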
bpo-42568 is marked as a duplicate of this issue. |
Thanks for the patch, Victor, it looks good. Just so it doesn't get lost: the problem with the contract of PyErr_ProgramText() which I mentioned in my dup 42568 is still there. |
It seems like PyErr_ProgramText() is no longer used in Python. PyErr_ProgramTextObject() is used instead, and it passes the filename as a Python object to _Py_fopen_obj(). |
Isn't it a part of the public API? I can't find it in the docs, but it seems to be declared in the public header. |
The Python C API has a strange history... |
It's now fixed in 3.8, 3.9 and master branches. Thanks for the bug report tianjg. |
Thanks for the fix and backports! |
My PR 23778 fixes the encoding/error handler used when writing the filename to stderr when the file does not exist:
$ LANG= PYTHONCOERCECLOCALE=0 ./python -X utf8=0 héllo.py
./python: can't open file '/home/vstinner/python/master/h\udcc3\udca9llo.py': [Errno 2] No such file or directory |
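The \udcc3\udca9 sequence in that message is what the surrogateescape error handler produces for bytes that cannot be decoded. A minimal sketch, assuming the command above runs with an ASCII filesystem encoding (empty LANG, locale coercion and UTF-8 mode disabled); the helper names are hypothetical:

```python
def decode_filename(raw, encoding="ascii"):
    """Decode filename bytes the way Python does for undecodable bytes:
    each such byte 0xNN becomes the lone surrogate U+DCNN."""
    return raw.decode(encoding, errors="surrogateescape")


def encode_filename(name, encoding="ascii"):
    """Reverse the mapping, restoring the original bytes exactly."""
    return name.encode(encoding, errors="surrogateescape")


raw = "héllo.py".encode("utf-8")      # b'h\xc3\xa9llo.py'
name = decode_filename(raw)

print(ascii(name))                    # 'h\udcc3\udca9llo.py'
print(encode_filename(name) == raw)   # True: the round trip is lossless
```

The escaped form appears in the message because lone surrogates cannot be written to stderr directly, so they have to be rendered through an error handler such as backslashreplace.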
boost-python was using the removed private _Py_fopen() function; I proposed boostorg/python#344 to replace _Py_fopen() with fopen(). |