Serial Console doesn't handle unicode characters properly #797
Comments
Thanks for the report @olivier-boesch. Quick notes:
As this was one of the older issues for this problem, and there were already a few internal and external GitHub issues linking here, I've decided to unify all the duplicates we have into this one. I've also updated the issue title to be more generic, as this affects all serial terminals (CircuitPython, micro:bit and MicroPython).
We currently have two PRs looking into this. Both need to be expanded to deal with incomplete multi-byte unicode characters at the beginning and end of the data array. I've looked into this but didn't quite finish it; I'll try to push my current status before the end of this week (I can't today as it's my birthday 🥳). I've also run a benchmark on a couple of different implementations to see which would be faster. Mu currently struggles to process serial data arriving without interruption at 115200 baud, so I didn't want to make this worse. The good news is that a version based on @k0d's implementation is actually faster than the original implementation before any fix, as it decodes the entire data array at the beginning (otherwise decoding is done on smaller chunks when matching VT100 commands with a regex). @dybber We will also need to agree on a merge order with #1026, as it touches the same area and some aspects of keeping bytes from previous iterations will have to be combined.
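To make the chunk-boundary problem concrete, here is a minimal sketch in plain Python. It is not Mu's code: `decode_chunk_naively` is a hypothetical name, and the split point is chosen by hand to land in the middle of the two-byte encoding of "°".

```python
# "°" is two bytes in UTF-8, so a serial read can deliver them in separate chunks.
encoded = "25°C".encode("utf-8")            # b'25\xc2\xb0C'
first, second = encoded[:3], encoded[3:]    # split in the middle of "°"


def decode_chunk_naively(chunk):
    """Hypothetical helper: decode one chunk in isolation, replacing bad bytes."""
    return chunk.decode("utf-8", errors="replace")


# Decoding each chunk on its own turns both halves of "°" into U+FFFD.
naive = decode_chunk_naively(first) + decode_chunk_naively(second)

# Decoding the same bytes in one piece gives the expected text.
whole = decode_chunk_naively(encoded)
```

The two results differ only because of where the read boundary fell, which is why bytes left over from one read have to be kept for the next.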
As mentioned elsewhere, I believe it's better to start from what I've done in #1026 and add unicode support from there than to try to merge the two branches later on. I think it will become a mess to figure out how to do such a merge. In that branch I already added support for receiving partial input, and leaving unprocessed input for the next call to
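One way to carry unprocessed bytes over to the next call, sketched here with the standard library's incremental decoder. Whether the branch in #1026 does it this way is not stated in this thread; `SerialTextDecoder` and `feed` are hypothetical names used only for illustration.

```python
import codecs


class SerialTextDecoder:
    """Hypothetical helper: decodes chunks and holds back incomplete sequences."""

    def __init__(self):
        # The incremental decoder buffers a trailing partial character internally.
        self._decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")

    def feed(self, chunk: bytes) -> str:
        # final defaults to False, so an unfinished sequence is kept, not replaced.
        return self._decoder.decode(chunk)


decoder = SerialTextDecoder()
# The two bytes of "°" arrive in different reads.
parts = [decoder.feed(b"25\xc2"), decoder.feed(b"\xb0C")]
```

Here the first call returns only "25" and the second returns "°C", so the character is emitted once it is complete.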
On the other hand, it wasn't terribly difficult, actually. I just made a commit to my replcursor_movement branch that seems to fix it; check dybber@810aa71.
Yeah, my main concern was performance, as this is already a very busy loop that hangs the entire UI if the incoming serial data is large enough. For example, on an i7 MacBook the following script will max out my CPU and hang the Mu UI until I unplug the micro:bit or kill the process, so I suspect lower-spec computers will struggle with less:

    from microbit import *

    while True:
        print(help())
        sleep(20)
Okay, I've added the benchmark source code in this gist (it might look like a lot of code, but each "option" file is the original There are 5 implementations, but really it's only comparing 3 different methods:
The results for those are (in seconds):
Option 1 was clearly going to be the worst implementation, which is why I started looking into Option 2 (looking at the UTF-8 bits to figure out if we have incomplete characters at the beginning or end of the byte array), so unsurprisingly Option 2 does much better. @dybber I'm very glad you found The other two measurements on the benchmark were based on the
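The bit-inspection idea described for Option 2 could look roughly like the sketch below. This is an illustration only, not the benchmarked code: `split_incomplete_tail` is a hypothetical name, and the approach assumes the leftover tail is prepended to the next chunk, which also covers an incomplete character at the beginning of an array.

```python
def split_incomplete_tail(data: bytes):
    """Return (complete, tail); tail is an unfinished UTF-8 sequence or b""."""
    # A sequence is at most 4 bytes, so look back over at most 4 bytes.
    for back in range(1, min(4, len(data)) + 1):
        byte = data[-back]
        if byte & 0b11000000 == 0b10000000:
            continue  # continuation byte (10xxxxxx): keep looking for the lead byte
        if byte & 0b11100000 == 0b11000000:
            needed = 2  # lead byte 110xxxxx
        elif byte & 0b11110000 == 0b11100000:
            needed = 3  # lead byte 1110xxxx
        elif byte & 0b11111000 == 0b11110000:
            needed = 4  # lead byte 11110xxx
        else:
            needed = 1  # ASCII or invalid lead byte: leave it to the decoder
        if needed > back:
            return data[:-back], data[-back:]
        return data, b""
    return data, b""


pending = b""
out = []
for chunk in (b"25\xc2", b"\xb0C"):
    # Prepend what was held back last time, then hold back any new partial tail.
    complete, pending = split_incomplete_tail(pending + chunk)
    out.append(complete.decode("utf-8", errors="replace"))
```

With this input the loop yields "25" and then "°C", and nothing is left pending at the end.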
@dybber I've created PR dybber#1 in your fork to include these changes in #1026.
Great, thanks. I hope we can get it merged into Mu soon.
My branch fixing this is now merged, so I will close this issue.
If you are reporting a bug, we would like to know:
Display text sent by a CircuitPython script on the serial console in Adafruit mode.
Open the serial console only.
When I write '°' or '\u00b0' in a print statement in my script, I expect '°'.
I see this if I connect with PuTTY.
I see '°' in the serial console
Unicode characters are not handled properly.
other aspects of the context in which Mu was running.
Installed via the installer (64-bit version), Windows 10, Mu version 1.0.2.
However, it's good software. Thanks.