UnicodeEncodeError: 'shift_jis' codec can't encode character '\u90d5' in position 3: illegal multibyte sequence #14
Comments
I faced the same issue and needed to change the code page from "shift_jis" to "cp65001". It seems the QRDecode interface needs an update so that an alternative code page can be provided for encoding.
Hi! I have been trying to debug with the QR code you sent, but it works in my environment :( I'm testing on Python 3.9, and decoding gives me 'qr код', the same result I get with the TeaCapps QR Reader app. If I try to decode it with "cp65001", as proposed by @nguyen-viet-hung, I get 'qr ミコミセミエ'. Which is your expected output? In any case, it doesn't produce the UnicodeEncodeError for me. Could you give me more details about your environment? If you have access to the source, does the solution proposed by @nguyen-viet-hung solve your problem (qreader.py, line 70)? Thanks! Anyway, I'll include the charset as an input parameter to allow more flexibility.
Hi, my environment is Python 3.10, qreader 2.12 on Windows. I face the same issue as @Arutemu64 when my QR code contains Vietnamese. I also tried @Arutemu64's QR code with the 'shift_jis' code page, and it gave an error too: "UnicodeEncodeError: 'shift_jis' codec can't encode character '\u1ec5' in position 27: illegal multibyte sequence". So I think it depends on the system it runs on.
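The error quoted above can be reproduced without any QR code at all, since it comes from Python's own `str.encode`. The sketch below is only an illustration: `can_encode` is a hypothetical helper, not part of qreader, and the sample strings are chosen for the example.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of `text` exists in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        # Raised when the codec has no byte sequence for some character,
        # e.g. the Vietnamese letter '\u1ec5' under Shift JIS.
        return False


print(can_encode("\u1ec5", "shift_jis"))   # Vietnamese letter from the error above
print(can_encode("\u1ec5", "utf-8"))       # UTF-8 covers every Unicode character
print(can_encode("日本語", "shift_jis"))    # Japanese text fits in Shift JIS
```

If this is what happens inside the library, any payload containing characters outside Shift JIS would fail at the re-encoding step, whatever the operating system.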
I have been testing @Arutemu64's QR code under Python 3.10, also on Windows, and it gives me the same result ('qr код') as before. Could it be related to some internal locale configuration? Could you share a Vietnamese QR code that you know produces the "UnicodeEncodeError: 'shift_jis' codec can't encode character..." on your side?
Hi, I tested with your QR code and it gave the result you mention. In my case, I can use only .decode('utf-8') and it gives the correct result. After checking for a while, I found the mention of the code page here. cp65001 is UTF-8; for some languages (Japanese, Chinese, Arabic ...), the text needs to be re-encoded and decoded as you do in the code. Until we find a universal solution, the encoding code page should be a parameter, as you suggest.
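To make the "re-encode and decode" step concrete, here is a minimal sketch of one situation where it helps. It assumes, purely for illustration, that the upstream decoder read UTF-8 bytes as Shift JIS; that assumption is not confirmed anywhere in this thread, and the variable names are made up for the example.

```python
original = "qr код"

# What the QR code actually stores: the UTF-8 bytes of the text.
raw = original.encode("utf-8")

# Assumed failure mode: those bytes get interpreted as Shift JIS,
# which turns the Cyrillic letters into half-width katakana.
garbled = raw.decode("shift_jis")

# Re-encoding to Shift JIS restores the original bytes,
# and decoding them as UTF-8 restores the text.
recovered = garbled.encode("shift_jis").decode("utf-8")

print(garbled)
print(recovered)
```

When the text was never garbled in the first place, as with the Vietnamese payload, the re-encoding step has nothing to undo and can raise UnicodeEncodeError instead, which would explain why a plain .decode('utf-8') is enough in that case.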
Sure, I have included that parameter for the moment. You can upgrade it with And instantiate the QReader object as By default it re-encodes to shift-jis, just to avoid breaking current implementations, but you can set reencode_to=None and it will do only the one-step 'utf-8' decoding. In any case, if it triggers a UnicodeEncodeError or a UnicodeDecodeError, it will fall back to Thanks a lot for your help, @nguyen-viet-hung. @Arutemu64, your problem should be solved with this update. Thanks!
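The behaviour described above could look roughly like the following sketch. `decode_payload` is a hypothetical function, not the library's actual code; only the `reencode_to` name comes from the comment. The fallback used here (returning the plain UTF-8 decoding) is an assumption made for the example, since the comment above does not preserve what the library falls back to.

```python
from typing import Optional


def decode_payload(raw: bytes, reencode_to: Optional[str] = "shift-jis") -> str:
    """Decode QR payload bytes, optionally round-tripping through another codec."""
    text = raw.decode("utf-8")
    if reencode_to is None:
        # One-step decoding only.
        return text
    try:
        # Round trip: undo a possible misreading of UTF-8 bytes as `reencode_to`.
        return text.encode(reencode_to).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # ASSUMPTION: fall back to the plain UTF-8 text when the round trip fails.
        return text


print(decode_payload("Tiếng Việt".encode("utf-8")))
print(decode_payload("Tiếng Việt".encode("utf-8"), reencode_to=None))
```

With this shape, a payload that cannot be represented in the re-encoding codec no longer raises; it simply comes back as decoded in the first step.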
Hi @Arutemu64, you can use the update that @Eric-Canas has published or try installing an extra package
While trying out QReader in my project, I discovered that it fails to decode some QR codes, and I'm not exactly sure why.
Here is an example of such a QR code
It scans just fine with the TeaCapps QR Reader app (the most popular one) on my Android phone.