Error while decoding QR codes with UTF-8 characters #95

Open
DrMint opened this issue Apr 7, 2021 · 2 comments
Comments

DrMint commented Apr 7, 2021

Hi!

I'm currently working on a project that uses QR codes. To make sure the generated QR codes are correct, I thought of using your module to decode them and compare the expected content with the decoded content.

Everything was working fine until I tried using Unicode characters: the decoded content no longer matches the expected content.
Here is a simple program that showcases this problem:

import pyqrcode
import pyzbar.pyzbar
from PIL import Image

def encodeDecode(content):
    # Generate a QR code image from the content
    url = pyqrcode.create(content, encoding='utf-8')
    url.png('qrcode.png', scale=8)

    # Decode the QR code and retrieve the content
    decodedContent = pyzbar.pyzbar.decode(Image.open('qrcode.png'))[0].data

    # Compare with the original content
    if (decodedContent.decode('utf-8') == content):
        print("TEST OK with", content)
    else:
        print("TEST FAILED with", content)

encodeDecode('\u0100')          # TEST OK
encodeDecode('\u0101')          # TEST OK
encodeDecode('\u2133')          # TEST FAILED
encodeDecode('\u0100\u2133')    # TEST OK
encodeDecode('\u0101\u2133')    # TEST FAILED

And here is the result when executed:

TEST OK with Ā
TEST OK with ā
TEST FAILED with ℳ
TEST OK with Āℳ
TEST FAILED with āℳ

We can see that the character ℳ is decoded incorrectly on its own, yet it decodes correctly when preceded by Ā. Even stranger, when preceded by ā (a character from the same Unicode block), the content is again decoded incorrectly.

The problem doesn't originate from pyqrcode, as all the generated QR codes are decoded correctly by other decoders such as ZXing for Java, phones, or websites.

guyskk commented Jun 8, 2022

It seems the QR code value is decoded as ISO-8859-1 somewhere along the way; the result can be converted back to UTF-8:

>>> d=b'\xc3\xa4\xc2\xbd\xc2\xa0\xc3\xa5\xc2\xa5\xc2\xbd'
>>> d.decode('utf-8').encode('ISO-8859-1').decode('utf-8')
'你好'
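The round-trip above can be wrapped in a small helper that only applies the repair when it succeeds. This is a minimal sketch, not part of pyzbar: the function name `repair_double_encoded` is hypothetical, and it assumes the bytes returned by pyzbar are either correct UTF-8 or UTF-8 that was misread as ISO-8859-1 and encoded to UTF-8 a second time.

```python
def repair_double_encoded(data: bytes) -> str:
    """Decode QR payload bytes, undoing a UTF-8 -> ISO-8859-1 -> UTF-8 mix-up if present.

    Hypothetical helper: `data` is assumed to be the `.data` attribute of a
    pyzbar result.
    """
    text = data.decode('utf-8')
    try:
        # If every character fits in ISO-8859-1 and the resulting bytes are
        # valid UTF-8, the payload was most likely double-encoded.
        return text.encode('ISO-8859-1').decode('utf-8')
    except UnicodeError:
        # Either the text contains characters outside ISO-8859-1 or the bytes
        # are not valid UTF-8: keep the straightforward decoding.
        return text


print(repair_double_encoded(b'\xc3\xa4\xc2\xbd\xc2\xa0\xc3\xa5\xc2\xa5\xc2\xbd'))
```

This is a heuristic: a payload that legitimately contains text such as 'Ã©' would be rewritten to 'é', so it is probably best used only where the expected content is known to be UTF-8 text.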

PythonJustForFun commented Dec 17, 2022

A very ugly but simple workaround in Python 3: call the command-line tool "zbarimg" (version 0.23.90):

import os
import cv2

# `image` is assumed to be an image already loaded with OpenCV
cv2.imwrite('/tmp/image.png', image)
# Decode with the zbarimg command-line tool and write the raw payload to a file
os.system('zbarimg -q --raw --nodbus -Sqr.binary /tmp/image.png >/tmp/result.txt')
with open('/tmp/result.txt', 'rb') as barcodefile:
    barcodeData = barcodefile.read().decode('UTF-8')

Greets
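The zbarimg workaround above could also be written with `subprocess`, which avoids the shell redirection and the intermediate result file. This is a rough sketch under assumptions: the function names `build_zbarimg_command` and `decode_with_zbarimg` are hypothetical, the flags are copied unchanged from the comment above, and `zbarimg` is assumed to be installed and on the `PATH`.

```python
import subprocess


def build_zbarimg_command(image_path: str) -> list:
    """Build the zbarimg invocation, using the same flags as the workaround above."""
    return ['zbarimg', '-q', '--raw', '--nodbus', '-Sqr.binary', image_path]


def decode_with_zbarimg(image_path: str) -> str:
    """Run zbarimg on an image file and decode its raw output as UTF-8.

    Raises subprocess.CalledProcessError if zbarimg exits with a non-zero
    status (for example, when it reports an error).
    """
    result = subprocess.run(
        build_zbarimg_command(image_path),
        capture_output=True,  # collect stdout as bytes instead of using a temp file
        check=True,
    )
    return result.stdout.decode('utf-8')
```

Depending on the zbarimg version and flags, the output may end with a newline; whether to strip it is left to the caller.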
