Strange output with non-ascii string #201
Comments
I can't reproduce that on my Mac. 🤷‍♂️ What happens if you try setting an encoding with a line like this at the top of your file? (If you are not using utf8, then replace that with your actual encoding):
# -*- coding: utf8 -*-
Also, what is your codepage set to?
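For anyone following along, here is a minimal sketch of how to check which encodings Python reports for the current process. It uses only the standard library; `describe_encodings` is a hypothetical helper name and not part of Green.

```python
import locale
import sys


def describe_encodings():
    """Return the encodings Python reports for this process (hypothetical helper)."""
    return {
        # Encoding used when writing text to standard output
        # (may be None if stdout has been replaced by an object without one).
        "stdout": getattr(sys.stdout, "encoding", None),
        # Locale-derived default used for text files and pipes.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used to convert file names.
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(f"{name}: {value}")
```

On Windows, running `chcp` in the shell prints the active console code page, which can be compared against the `stdout` value above.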
@CleanCut Setting the encoding comment has no effect. @MinchinWeb The default is 932 (Shift-JIS), but I get the same result with 65001 (UTF-8). In my opinion, this is caused by Lines 145 to 148 in ce9383b
To reproduce, run this code:
from unidecode import unidecode
print(unidecode("日本語"))  # -> 'Ri Ben Yu'
I think you figured out what's going on, and I can fill in the why: So
I agree. This is expected, but not necessarily ideal. If we added an option to disable this behavior, would anyone use it? @tkamenoko @MinchinWeb
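One possible shape for such an option, sketched with the standard library only. `render_for_stream` and its `transliterate` flag are hypothetical names, not Green's API, and the backslash-escape fallback is only a stand-in for a transliteration library such as unidecode. The idea is to leave text untouched whenever the target encoding can represent it, and to fall back only when it cannot.

```python
def render_for_stream(text, encoding, transliterate=True):
    """Hypothetical helper: prepare text for a stream that uses `encoding`."""
    try:
        # If the stream's encoding can represent the text, return it unchanged.
        text.encode(encoding)
        return text
    except UnicodeEncodeError:
        if not transliterate:
            # The caller opted out of the fallback, so surface the error.
            raise
        # Stand-in for transliteration: replace unencodable characters
        # with backslash escapes so the result is safe to write.
        return text.encode(encoding, errors="backslashreplace").decode(encoding)


if __name__ == "__main__":
    print(render_for_stream("日本語", "utf-8"))
    print(render_for_stream("日本語", "cp932"))
    print(render_for_stream("日本語", "ascii"))
```

With this approach, a console set to code page 932 or 65001 would receive the original Japanese text, and only an encoding that cannot represent it would trigger the fallback.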
Yes, I think so. The current behavior is not documented and may lead to confusion. The same output as builtin
The fix is in 2.14.0, just released.
Platform: Windows 10 64bit
Python: 3.7
Shell: Powershell 6.1.1
Test code:
Output by Green:
Output by builtin unittest:
The test finished as expected, but the output is wrong (日本語 -> Ri Ben Yu).