Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

decode the execution path string based file system encoding #4343

Merged
merged 6 commits into from Apr 19, 2015
Merged

decode the execution path string based file system encoding #4343

merged 6 commits into from Apr 19, 2015

Conversation

sohero
Copy link
Contributor

@sohero sohero commented Apr 16, 2015

When my py file importing matplotlib run in a folder where the path contains gbk encoding character, matplotlib will raise a unicode decode error.The changes in line 666/673/679 will fix the problem.

When my py file importing matplotlib run in a folder where the path contains gbk encoding character, matplotlib will raise a unicode decode error.The changes in line 666/673/679 will fix the problem.
fix support for python version 3+ and pep8
pep8 need some spaces around operator
@tacaswell tacaswell added this to the next point release milestone Apr 16, 2015
@tacaswell
Copy link
Member

What OS are you on? Can you turn that pattern into a helper function?

attn @mdboom

Checking for the decode attribute and assuming that it is encoded and making assumptions about what that encoding is seems sketchy to me, is there a better way to do this?

@mdboom
Copy link
Member

mdboom commented Apr 16, 2015

The issue here is that we take the a path we get from Python and get a byte string back. Using that byte string to open the file again would be totally fine and correct. The problem is that we tack other things onto that path using a unicode string -- os.sep.join([__file__, u'mpl-data']) -- which forces the path to be decoded from a byte string into unicode using the default encoding, which, on @sohero's system, I'll assume is different from the file system encoding. Python can't really do better since it doesn't know the path is a filesystem path and not some random string. (FWIW, os.path.join has the same issue, though I'd assume it could do better). You could, alternatively, encode the bit we are adding into the file system encoding before joining, and keep everything as byte strings, but that's not really an improvement.

So, I think what this PR is doing is fine in general. However, I agree there should be a helper function for this, maybe decode_filesystem_path(). And I don't think we should do hasattr(obj, 'decode'), though. Maybe isinstance(obj, bytes) instead.

@sohero
Copy link
Contributor Author

sohero commented Apr 17, 2015

My working OS is windows 7 Chinese version, the default file system encoding is gbk. Python 2.X can not detect the encoding correctly, and Matplotlib works well when the path only contains english chars, while not working when the path has any non-ascii chars like 中文. Using a helper function isinstance(obj, bytes) make the codes more readable and reliable. Thanks for the reminder.

Change to using a helper function and isinstance(path, bytes) to decode __file__ based on file system encoding.
Oops, some white spaces.
oh, pep8
tacaswell added a commit that referenced this pull request Apr 19, 2015
FIX : decode the execution path string based file system encoding

Prevents issues on systems with mixed encodings.
@tacaswell tacaswell merged commit 92c50d1 into matplotlib:master Apr 19, 2015
@tacaswell
Copy link
Member

@sohero Thank you! I believe this is your first mpl contribution, congratulations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants