Multiple test failures with OSError: [Errno 84] Invalid or incomplete multibyte or wide character on ZFS with utf8only=on #81765
I'm running Ubuntu 19.04 on a ZFS mirrored pool, where my home partition is configured with the 'utf8only=on' attribute. I've cloned cpython and after running the tests, as described in devguide.python.org, I have 11 test failures:

== Tests result: FAILURE == 389 tests OK. 11 tests failed:

I've been looking for similar or matching reported issues, but could not find one. I'm at the EuroPython 2019 CPython sprint and we'll be looking into this with the help of some of the core devs.
Here's some additional information I found for that specific attribute: From the documentation at utf8only
I think Dimiter was able to fix most of the failures, except test_unicode_file_functions.

    import os
    import unicodedata

    upsilon_diaeresis_and_hook = "ϔ"
    for form in ["NFC", "NFD", "NFKC", "NFKD"]:
        unicode_filename = unicodedata.normalize(form, upsilon_diaeresis_and_hook)
        with open(unicode_filename, "w") as f:
            f.write(form)
        print("N:", ascii(unicode_filename))
    print([ascii(filename) for filename in os.listdir('.')])

On ext4 this creates 4 different files: ['\u03d4', '\u03d2\u0308', '\u03ab', '\u03a5\u0308']

The test is already skipped on darwin (Lib/test/test_unicode_file_functions.py:120), and should be skipped for ZFS too (this might depend on the exact flags used). However, we weren't able to find a portable way to determine the filesystem and its flags.

An alternative is to try creating the 4 files and skip the test if only 2 get created and all 4 names can be used to open these two files. However, this might mask other failures. Unless someone can come up with a better way to do this, I think this is the only option.

In addition, different filesystems that don't exhibit this behavior can be used on Mac, so the test shouldn't be skipped in those cases.
"The test is already skipped on darwin (Lib/test/test_unicode_file_functions.py:120), and should be skipped for ZFS too (might depend on the exact flags used), however we weren't able to find a portable way to determine the filesystem and flags."

I suggest creating a temporary directory, creating the 4 files in it, and counting how many files os.listdir() returns. If you get 4, the FS doesn't normalize anything. If you get fewer, it's likely that the FS normalizes names.
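The probe suggested here could look roughly like the sketch below. The helper name `fs_normalizes_filenames` and its `directory` parameter are hypothetical (nothing in CPython's test suite is being quoted), and the sketch assumes the filesystem encoding can represent the Greek characters involved, as UTF-8 can.

```python
import os
import tempfile
import unicodedata


def fs_normalizes_filenames(directory=None):
    """Return True if the filesystem seems to fold normalized names together.

    Hypothetical helper: creates one file per Unicode normalization form
    of U+03D4 in a fresh temporary directory and counts what is listed.
    """
    base = "\u03d4"  # GREEK UPSILON WITH DIAERESIS AND HOOK SYMBOL
    # The four normalization forms of this character are all distinct strings.
    names = {
        unicodedata.normalize(form, base)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        for name in names:
            with open(os.path.join(tmp, name), "w") as f:
                f.write("probe")
        # Fewer directory entries than names means some names collided.
        return len(os.listdir(tmp)) < len(names)
```

A test could then call this with the directory it writes to and skip itself when the result is true. As noted in the discussion, such a check may hide unrelated failures, so it is only a heuristic.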
Confirmed.

Repro: Do an Ubuntu 20.04 install and choose "experimental zfs" support during install (https://ubuntu.com/blog/zfs-focus-on-ubuntu-20-04-lts-whats-new). On such a ZFS filesystem, the following tests from a ./python -m test.regrtest run fail in 3.10:

11 tests failed:

Move over to a tmpfs and all but test_httpservers now pass. test_httpservers tries to create such a path on /tmp:

    ======================================================================
    Traceback (most recent call last):
      File "/home/greg/test/cpython/Lib/test/test_httpservers.py", line 400, in test_undecodable_filename
        with open(os.path.join(self.tempdir, filename), 'wb') as f:
    OSError: [Errno 84] Invalid or incomplete multibyte or wide character: '/tmp/tmpnt9ch98x/@test_124227_tmp\udce7w\udcf0.txt'

I expect any filesystem mounted so that it rejects non-UTF-8 pathnames to cause similar failures. Our test suite needs to detect this environment and skip these tests there.
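One way such an environment might be detected is to try creating a file whose name is not valid UTF-8 and see whether the OS refuses it. The following is a minimal sketch for POSIX systems only; the helper name `fs_rejects_undecodable_names` is hypothetical, and treating `errno.EILSEQ` (errno 84 on Linux, as in the traceback above) as the only rejection signal is an assumption, since other filesystems may report a different error.

```python
import errno
import os
import tempfile


def fs_rejects_undecodable_names(directory=None):
    """Return True if creating a non-UTF-8 filename fails with EILSEQ.

    Hypothetical helper, POSIX only: the probe name uses the same raw
    bytes (0xE7, 'w', 0xF0) that appear surrogate-escaped in the
    test_httpservers traceback.
    """
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        # Build the path as bytes so the invalid sequence reaches the OS as-is.
        path = os.path.join(os.fsencode(tmp), b"probe\xe7w\xf0.txt")
        try:
            with open(path, "wb"):
                pass
        except OSError as exc:
            if exc.errno == errno.EILSEQ:
                return True
            raise  # Any other error is not the condition being probed.
        return False
```

On a tmpfs or ext4 directory this is expected to report that the name was accepted, while a ZFS dataset with utf8only=on should report a rejection. The `directory` argument matters because /tmp and the source checkout may live on different filesystems.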