mojimoji

A fast converter between Japanese hankaku and zenkaku characters.

Installation

$ pip install mojimoji

Examples

Zenkaku to Hankaku

>>> import mojimoji
>>> print(mojimoji.zen_to_han(u'アイウａｂｃ０１２'))
ｱｲｳabc012
>>> print(mojimoji.zen_to_han(u'アイウａｂｃ０１２', kana=False))
アイウabc012
>>> print(mojimoji.zen_to_han(u'アイウａｂｃ０１２', digit=False))
ｱｲｳabc０１２
>>> print(mojimoji.zen_to_han(u'アイウａｂｃ０１２', ascii=False))
ｱｲｳａｂｃ012

Hankaku to Zenkaku

>>> import mojimoji
>>> print(mojimoji.han_to_zen(u'ｱｲｳabc012'))
アイウａｂｃ０１２
>>> print(mojimoji.han_to_zen(u'ｱｲｳabc012', kana=False))
ｱｲｳａｂｃ０１２
>>> print(mojimoji.han_to_zen(u'ｱｲｳabc012', digit=False))
アイウａｂｃ012
>>> print(mojimoji.han_to_zen(u'ｱｲｳabc012', ascii=False))
アイウabc０１２
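
To make the effect of the ascii and digit flags concrete, here is a minimal pure-Python sketch of the same idea. It is not mojimoji's implementation (the library is written in Cython/C++). The names zen_to_han_sketch, han_to_zen_sketch and _build_table are hypothetical, and the sketch assumes every printable ASCII character U+0021..U+007E maps to its zenkaku form by a fixed offset of 0xFEE0. Kana conversion, which also has to handle voiced sound marks, is left out, and mojimoji's real table may treat spaces or punctuation differently.

```python
# Hypothetical sketch, NOT mojimoji's actual implementation.
# Zenkaku ASCII (U+FF01..U+FF5E) sits a fixed distance above
# hankaku ASCII (U+0021..U+007E).
OFFSET = 0xFEE0


def _build_table(to_han, ascii, digit):
    """Build a str.translate table for the requested direction and flags."""
    table = {}
    for han in range(0x21, 0x7F):
        is_digit = 0x30 <= han <= 0x39
        enabled = digit if is_digit else ascii
        if not enabled:
            continue
        zen = han + OFFSET
        if to_han:
            table[zen] = han
        else:
            table[han] = zen
    return table


def zen_to_han_sketch(text, ascii=True, digit=True):
    # The `ascii` keyword shadows the builtin on purpose, to mirror
    # the keyword names used in the examples above.
    return text.translate(_build_table(True, ascii, digit))


def han_to_zen_sketch(text, ascii=True, digit=True):
    return text.translate(_build_table(False, ascii, digit))


# Zenkaku "abc012", written with escapes so the code points are explicit.
zen = '\uff41\uff42\uff43\uff10\uff11\uff12'
print(zen_to_han_sketch(zen))
print(zen_to_han_sketch(zen, digit=False))
print(han_to_zen_sketch('abc012', ascii=False))
```

Characters outside the table, such as kana, pass through unchanged, which matches what kana=False does in the examples above.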

Benchmarks

Library versions

Results

In [19]: s = u'ＡＢＣＤＥＦＧ０１２３４５' * 10

In [20]: %time for n in range(1000000): mojimoji.zen_to_han(s)
CPU times: user 2.86 s, sys: 0.10 s, total: 2.97 s
Wall time: 2.88 s

In [21]: %time for n in range(1000000): unicodedata.normalize('NFKC', s)
CPU times: user 5.43 s, sys: 0.12 s, total: 5.55 s
Wall time: 5.44 s

In [22]: %time for n in range(1000000): zenhan.z2h(s)
CPU times: user 69.18 s, sys: 0.11 s, total: 69.29 s
Wall time: 69.48 s
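
The timings above come from an interactive IPython session. A rough way to repeat the comparison as a plain script is sketched below, using the standard timeit module. The helper name bench and the loop count NUMBER are assumptions made for illustration, the loop count is far smaller than the 1,000,000 iterations used above, and mojimoji is timed only if it is installed. Absolute numbers will differ by machine and Python version.

```python
import timeit
import unicodedata

# Zenkaku "ABCDEFG012345" repeated 10 times, built from code points
# so the full-width characters are explicit.
s = (''.join(chr(c) for c in range(0xFF21, 0xFF28))
     + ''.join(chr(c) for c in range(0xFF10, 0xFF16))) * 10

NUMBER = 10000  # assumption: small loop count so the script finishes quickly


def bench(func, number=NUMBER):
    """Return the total seconds taken by `number` calls of func(s)."""
    return timeit.timeit(lambda: func(s), number=number)


candidates = {
    'unicodedata NFKC': lambda text: unicodedata.normalize('NFKC', text),
}
try:
    import mojimoji
    candidates['mojimoji.zen_to_han'] = mojimoji.zen_to_han
except ImportError:
    pass  # mojimoji not installed; time only the stdlib baseline

results = {name: bench(func) for name, func in candidates.items()}
for name, seconds in results.items():
    print('{}: {:.3f} s for {} calls'.format(name, seconds, NUMBER))
```

Note that NFKC normalization is not a general substitute for zen_to_han: it maps zenkaku ASCII letters and digits to hankaku, but it maps hankaku katakana to zenkaku. The two produce the same output for this benchmark string only because it contains no kana.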