Issues when trying to run tests on Windows #237
Comments
I've been (slowly) chasing this up this evening, and making some progress through judicious commenting-out of exception handlers :) This is more of an infodump than anything, but might be helpful. Commenting out the first exception handler in assertCodeExecution (lines 345/346) reveals that it's
on line 505. Printing out the return string from
which I'm pretty sure is not right. I've tried a few simple things to try and work out what's going on. Encoding the java output to
and various other combinations of encoding/decoding that string in a test script (without explicitly setting I've done some light googling for "windows java stdout encoding" and "windows check powershell encoding", which turns up some potentially helpful info: Default character encoding for java console output Based on these, I've tried a few things:
Anyway, I'm not really sure what I'm doing here, but I'll keep trying things out until I figure out what's going on, or someone more knowledgeable can jump in :)
Update: it seems like I can recreate that specific error by writing a file out as UTF-16 and then reading it back as UTF-8 (see attached script, test_encoding.py.txt). Perhaps Java does this by default on Windows? In any case, it's progress, and it's late here, so I'll pick this up later if it's still open.
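For anyone who wants to try the UTF-16/UTF-8 round trip without the attachment, here is a minimal sketch of the idea. It is not the attached test_encoding.py.txt; the helper name roundtrip_utf16_as_utf8 and the sample text are made up for illustration.

```python
import os
import tempfile


def roundtrip_utf16_as_utf8(text):
    """Write text to a temp file as UTF-16, then read that file back as UTF-8.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # The "utf-16" codec writes a byte order mark (0xff 0xfe or 0xfe 0xff) first.
        with open(path, "w", encoding="utf-16") as f:
            f.write(text)
        # Neither BOM byte is a valid UTF-8 start byte, so this read should fail.
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    finally:
        os.remove(path)


try:
    roundtrip_utf16_as_utf8("hello")
    roundtrip_error = None
except UnicodeDecodeError as exc:
    roundtrip_error = exc

print(roundtrip_error)
```

The byte order mark at the start of the UTF-16 file is what trips the UTF-8 decoder here, which would be consistent with the "invalid start byte" message, although the position reported in the real test output may differ.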
You might be on to something with the UTF-16/8 thing. Internally, Java stores strings in class files using an odd format called MUTF-8 (Modified UTF-8). The key feature of MUTF-8 is an odd way of encoding nulls. I'm not sure why this would be manifesting on console output, and only on Windows, but it's worth some investigation.
For Python <3.6 on Windows, encoding strings for output to the Windows command prompt may result in a UnicodeEncodeError. VOC does not take encoding into consideration, and so the output differs. Until proper encoding support is implemented in VOC, use PYTHONIOENCODING, the environment variable that Python provides for overriding the IO encoding. By setting this to UTF-8, the output may appear garbled, but the error is avoided, and it matches the run-as-Java output. For consistency, pass the environment variable to Java as well. Addresses beeware#610 and beeware#237.
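A rough sketch of the workaround described in that commit message, assuming the variable in question is PYTHONIOENCODING. The helper run_with_utf8_io is hypothetical and is not VOC's actual test harness code; it only shows the environment override being handed to a child process.

```python
import os
import subprocess
import sys


def run_with_utf8_io(code):
    """Run a Python snippet in a child process with its IO encoding forced to UTF-8.

    Hypothetical helper for illustration only.
    """
    # Copy the current environment and add the override for the child process.
    env = dict(os.environ, PYTHONIOENCODING="utf-8")
    proc = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # The child wrote UTF-8 bytes, so decode them the same way.
    return proc.stdout.decode("utf-8")


# The child prints a snowman, which cp1252 cannot represent.
output = run_with_utf8_io("print('\\u2603')")
print(repr(output))
```

The same env dictionary could be passed to the subprocess that launches Java, which is presumably what "pass the environment variable to Java as well" amounts to.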
(This is on Windows 8.1; I imagine similar issues appear on other Windows systems)
I've had a try at running the tests under Windows. The main failure that I run into seems to be related to the default encoding under Windows (cp1252 rather than utf-8):
E AssertionError: 'utf-8' codec can't decode byte 0xff in position 10: invalid start byte
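To illustrate how a cp1252/utf-8 mismatch can produce a message of this shape, here is a small sketch. The byte string is invented for the example (ten ASCII bytes followed by 0xff) and is not taken from the failing test output.

```python
# Hypothetical bytes: valid in cp1252, but not valid UTF-8.
data = b"0123456789\xff"

# cp1252 maps every one of these bytes to a character, so this succeeds.
as_cp1252 = data.decode("cp1252")

try:
    data.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    # 0xff can never start a UTF-8 sequence.
    decode_error = exc

print(repr(as_cp1252))
print(decode_error)
```

So output written by a process using cp1252 (or another non-UTF-8 encoding) and then decoded as utf-8 by the test harness could plausibly fail this way.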
This seems to be an issue on both the Java and Python versions; see attached voc_output.txt, which is a dump of the main_code for both runAsPython() and runAsJava().
Another, possibly related issue is that Windows has a "charmap" encoding in its terminals (cmd and powershell), which doesn't handle some Unicode characters. I've also attached unicode_test.py.txt, which blows up with a similar error.
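The "charmap" failure can be reproduced on any platform by encoding to cp1252 explicitly. This is only a sketch of the failure mode, not the contents of unicode_test.py.txt, and the snowman character is an arbitrary choice.

```python
snowman = "\u2603"  # arbitrary character that cp1252 has no mapping for

try:
    snowman.encode("cp1252")
    encode_error = None
except UnicodeEncodeError as exc:
    # This is the kind of error a cp1252 console raises on print().
    encode_error = exc

print(encode_error)

# errors="replace" substitutes "?" instead of raising.
fallback = snowman.encode("cp1252", errors="replace")
print(fallback)
```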
In the process of debugging, I've also found a couple of places where utf-8 encoding seems to have been missed - pull request here: #236