Python library for decoding multi-byte character strings.

thombashi/mbstrdecoder

mbstrdecoder is a Python library for decoding multi-byte character strings.

Installation with pip:
pip install mbstrdecoder

Installation with apt (Ubuntu PPA):
sudo add-apt-repository ppa:thombashi/ppa
sudo apt update
sudo apt install python3-mbstrdecoder

Sample Code:
from mbstrdecoder import MultiByteStrDecoder

# Encode a Japanese string to UTF-8 bytes to use as sample input.
encoded_multibyte_text = "マルチバイト文字".encode("utf-8")
decoder = MultiByteStrDecoder(encoded_multibyte_text)

# unicode_str holds the decoded text; codec names the detected encoding.
print(f"encoded bytes: {encoded_multibyte_text}")
print(f"unicode: {decoder.unicode_str}")
print(f"codec: {decoder.codec}")
Output:
encoded bytes: b'\xe3\x83\x9e\xe3\x83\xab\xe3\x83\x81\xe3\x83\x90\xe3\x82\xa4\xe3\x83\x88\xe6\x96\x87\xe5\xad\x97'
unicode: マルチバイト文字
codec: utf_8
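
The sample above shows the decoder reporting which codec matched the input bytes. As a rough illustration of that general idea, the following standard-library-only sketch tries a list of candidate codecs in order and returns the first one that decodes the bytes without error. This is not mbstrdecoder's implementation: the function name `decode_with_candidates`, the candidate list, and the fallback behaviour are all assumptions made for this example.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical candidate list for illustration; not the list mbstrdecoder uses.
CANDIDATE_CODECS = ("utf_8", "shift_jis", "euc_jp", "utf_16")


def decode_with_candidates(
    data: bytes, codecs: Sequence[str] = CANDIDATE_CODECS
) -> Tuple[str, Optional[str]]:
    """Return (text, codec) for the first candidate that decodes data strictly."""
    for codec in codecs:
        try:
            return data.decode(codec), codec
        except UnicodeDecodeError:
            # This codec cannot represent the bytes; try the next candidate.
            continue
    # No candidate matched: fall back to lossy UTF-8 decoding and report no codec.
    return data.decode("utf_8", errors="replace"), None


# Shift_JIS input is rejected by the UTF-8 attempt and matched by the second candidate.
text, codec = decode_with_candidates("マルチバイト文字".encode("shift_jis"))
print(f"unicode: {text}")
print(f"codec: {codec}")
```

Candidate order matters in a scheme like this: permissive codecs accept almost any byte sequence, so stricter ones such as UTF-8 are tried first to reduce false matches.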