This repository has been archived by the owner on May 31, 2020. It is now read-only.

Work around Unicode encoding error on Windows #703

Closed
wants to merge 1 commit into from


whydoubt
Contributor

For Python <3.6 on Windows, encoding strings for output to the Windows
command prompt may result in a UnicodeEncodeError. VOC does not take
encoding into consideration, and so the output differs.

Until such time as proper encoding support is implemented in VOC, use an
environment variable that Python provides for overriding the IO
encoding. By setting this to UTF-8, the output may appear garbled, but
the error is avoided, and it matches the run-as-Java output. For
consistency, pass the environment variable to Java as well.

Addresses #610 and #237.
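
A minimal sketch of the workaround described above, assuming the test harness launches CPython through `subprocess`. The helper name `run_with_utf8_io` is hypothetical and is not taken from the VOC code base; `PYTHONIOENCODING` is the environment variable CPython reads to override the encoding of its standard streams.

```python
import os
import subprocess
import sys


def run_with_utf8_io(code):
    """Run `code` in a child CPython whose standard streams are forced to UTF-8.

    Hypothetical helper for illustration; returns the child's raw stdout bytes.
    """
    # Copy the parent environment and override only the IO encoding.
    env = dict(os.environ)
    env["PYTHONIOENCODING"] = "utf-8"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # The escape is resolved by the child, which then encodes the
    # resulting character (e with acute accent) as UTF-8 bytes.
    raw = run_with_utf8_io(r"print('caf\u00e9')")
    print(raw)
```

Because the override is applied to the child's environment only, the parent process keeps whatever console encoding it already had.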

@freakboy3742
Member

Thanks for looking into this.

I'm intrigued what you'd consider "proper" encoding support in this context. I can't deny that Windows is definitely having difficulties here; I imagine there would be many Linux configurations that have similar problems due to odd codepage configurations.

However, in the Linux space at least, my understanding is that this is something that is considered an error of usage - somewhere between "you're doing it wrong" and "you're doing it in a way that makes it impossible to know what is right". What is the right approach here?

@whydoubt
Contributor Author

For one instance, to really match what CPython does, sys.stdout.write() should be encoding from string to stream using sys.stdout.encoding. Typically for Linux and for Windows-with-Python>=3.6 that will be UTF-8, but it still should not be assumed. With VOC, sys.stdout.encoding does not even exist. I believe this is at least part of the cause of #395, and if that's the case, that one can only be solved with some semblance of encoding support.
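
To illustrate the point, here is a rough sketch of an encoding-aware write. It assumes a text stream that exposes `.encoding` and `.buffer`, as CPython's `sys.stdout` does; `write_encoded` and `make_stream` are hypothetical names used only for this example and say nothing about how VOC would implement it.

```python
import io


def write_encoded(stream, text, errors="strict"):
    """Encode `text` with the stream's declared encoding, then write the bytes.

    `stream` is assumed to expose `.encoding` and `.buffer`.
    Returns the number of bytes produced.
    """
    data = text.encode(stream.encoding, errors)
    stream.buffer.write(data)
    return len(data)


def make_stream(encoding):
    # In-memory stand-in for a console stream with the given encoding.
    return io.TextIOWrapper(io.BytesIO(), encoding=encoding)


if __name__ == "__main__":
    utf8 = make_stream("utf-8")
    write_encoded(utf8, "\u0101")  # LATIN SMALL LETTER A WITH MACRON
    print(utf8.buffer.getvalue())

    cp437 = make_stream("cp437")  # a legacy Windows console code page
    try:
        write_encoded(cp437, "\u0101")  # not representable in cp437
    except UnicodeEncodeError as exc:
        print("cannot encode:", exc.reason)
```

The second call shows the failure mode from the PR description: the same string that encodes cleanly as UTF-8 raises `UnicodeEncodeError` under a legacy code page.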

@eliasdorneles
Collaborator

If I understood correctly, the issue is that VOC output is always UTF-8, and the CPython output depends on the environment.
So, this PR sets the PYTHONIOENCODING environment variable to tell CPython to use UTF-8 as well, regardless of the environment.
It sounds good to me.
We could make it more explicit that this is a workaround for Windows by setting the variable only on Windows, checking sys.platform.
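
Something along these lines, as a sketch only: `child_environment` is a hypothetical helper, and the check assumes the native Windows value of `sys.platform`, which is `"win32"` (Cygwin builds report `"cygwin"` and would not match).

```python
import os
import sys


def child_environment(platform=None, base_env=None):
    """Return an environment for the CPython subprocess.

    Hypothetical helper: PYTHONIOENCODING is forced to UTF-8 only on Windows.
    `platform` and `base_env` default to the current process's values and are
    parameters only so the behaviour can be exercised on any OS.
    """
    platform = sys.platform if platform is None else platform
    env = dict(os.environ if base_env is None else base_env)
    if platform == "win32":
        env["PYTHONIOENCODING"] = "utf-8"
    return env
```

The returned dictionary can be passed as the `env` argument when the subprocess is started.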

@freakboy3742 any objections about merging this?

@freakboy3742
Member

I don't have any particular objection to merging, other than wanting to have a slightly better understanding of what the "real" fix is. If it's as simple as defining sys.stdout.encoding, and using that in system.out.write() calls... then why not just do that?

@whydoubt
Contributor Author

There is more to it than the portion I mentioned in my previous comment. I had thought it would require creating translation tables for each encoding we wish to support. However, I may take a shot at using Java's encoding support to make it work.

@prateek3255 left a comment

I am on a Windows machine, and test_title and test_case_changes in test_str.py were failing earlier; after making these changes, they pass.

@phildini
Member

Hi there! It looks like this PR might be dead, so we're closing it for now. Feel free to re-open it if you'd like to continue, or think about directing your efforts to https://github.com/beeware/briefcase or https://github.com/beeware/toga. Both of these have more active development right now. 😄

@phildini closed this Apr 25, 2020
5 participants