gracefully handle unicode surrogate bogosity in pmap.py #1322

Closed

@Kelledin Kelledin commented Aug 11, 2018

While running the 5.4.6 unit tests on Python 3.6 recently, I discovered the test_pmap case failing due to scripts/pmap.py barfing on some fugly Unicode sequences. Apparently, ext4 on Linux can have some oddball Unicode embedded in filenames that Python 3.6.3 rejects with complaints about surrogate characters. Specifically, the offending paths appear to be transient testfn tmpfiles that happen to be mapped within the interpreter when test_pmap.py runs, with pathnames like 'build/$testfnf\udcc0\udc80dxyit7fq.so'.
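For anyone trying to reproduce the symptom without the full test suite, here is a minimal sketch of where the `\udcc0\udc80` pair comes from. It assumes the on-disk name contains the raw bytes 0xC0 0x80, which are not valid UTF-8; Python 3 decodes such bytes with the `surrogateescape` error handler, turning each one into a lone surrogate. The helper name `strict_encode_fails` is made up for illustration and is not part of psutil or pmap.py.

```python
# Assumed raw filename: the bytes 0xC0 0x80 are not valid UTF-8.
raw_name = b"$testfnf\xc0\x80dxyit7fq.so"

# Python 3 decodes POSIX filenames with the "surrogateescape" handler,
# mapping each undecodable byte 0xNN to the lone surrogate U+DCNN.
decoded = raw_name.decode("utf-8", "surrogateescape")


def strict_encode_fails(text):
    """Return True if text cannot be encoded as strict UTF-8."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


# Encoding with the same handler restores the original bytes.
round_tripped = decoded.encode("utf-8", "surrogateescape")
```

Since the failure comes from the bytes themselves rather than from the locale, it can show up even when the environment is configured for UTF-8.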

I'm not sure which other testcase creates these files, but if I run the affected test_pmap case by itself, it will happily succeed. It only seems to fail when I run the test suite in its entirety. For background, the testbed is an arm64 system, running a custom Linux distro with a mostly-vanilla 4.16.15 kernel.

For now, the most robust way I can see to handle this is to catch the exception and fall back on running the offending pathname through repr(). It appears to produce sane pmap.py output and pass tests on Python 2.7.15 and 3.6.3.
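The fallback described above could look roughly like the sketch below. This is not the actual patch from this pull request; `printable_path` is a hypothetical helper, and the sketch targets Python 3 only. It probes whether the path can be encoded as strict UTF-8 and, if not, returns its `repr()`, which escapes the lone surrogates into plain ASCII.

```python
def printable_path(path):
    """Return path unchanged if it encodes as UTF-8, else its repr().

    Hypothetical helper for illustration; not part of psutil.
    """
    try:
        # Probe only: the encoded bytes are discarded.
        path.encode("utf-8")
    except UnicodeEncodeError:
        # Lone surrogates end up here; repr() escapes them as \udcXX.
        return repr(path)
    return path


ok = printable_path("build/libfoo.so")
escaped = printable_path("build/$testfnf\udcc0\udc80dxyit7fq.so")
```

The escaped form is wrapped in quotes, so it is visually distinct from ordinary paths in the output, which may or may not be desirable for a script like pmap.py.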

@giampaolo
Owner

That is probably due to default shell encoding which is != UTF8 (I guess you can figure it out with echo $LANG or something). Anyway, I don't think it's worth the trouble. pmap.py is just an example script after all...
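Besides `echo $LANG`, the encodings in effect can be checked from inside the interpreter. A minimal sketch, where `report_encodings` is a hypothetical helper name and the values depend entirely on the environment it runs in:

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python uses for filenames, the locale and stdout."""
    return {
        "filesystem": sys.getfilesystemencoding(),
        "locale": locale.getpreferredencoding(False),
        # stdout may be replaced by an object without an encoding attribute.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


for name, value in sorted(report_encodings().items()):
    print("%s: %s" % (name, value))
```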

@giampaolo giampaolo closed this Aug 11, 2018
@Kelledin
Author

FYI this was with LANG and LC_ALL set to "en_US.UTF-8".

(Yes, I know this is closed, I figured I'd just add that bit for the record.)
