Permalink
Find file Copy path
e2bd604 Apr 20, 2018
1 contributor

Users who have contributed to this file

81 lines (57 sloc) 2.07 KB

mojimoji

https://badge.fury.io/py/mojimoji.png https://travis-ci.org/studio-ousia/mojimoji.png?branch=master

A fast converter between Japanese hankaku and zenkaku characters.

Installation

$ pip install mojimoji

Examples

Zenkaku to Hankaku

>>> import mojimoji
>>> print mojimoji.zen_to_han(u'アイウabc012')
アイウabc012
>>> print mojimoji.zen_to_han(u'アイウabc012', kana=False)
アイウabc012
>>> print mojimoji.zen_to_han(u'アイウabc012', digit=False)
アイウabc012
>>> print mojimoji.zen_to_han(u'アイウabc012', ascii=False)
アイウabc012

Hankaku to Zenkaku

>>> import mojimoji
>>> print mojimoji.han_to_zen(u'アイウabc012')
アイウabc012
>>> print mojimoji.han_to_zen(u'アイウabc012', kana=False)
アイウabc012
>>> print mojimoji.han_to_zen(u'アイウabc012', digit=False)
アイウabc012
>>> print mojimoji.han_to_zen(u'アイウabc012', ascii=False)
アイウabc012

Benchmarks

Library versions

Results

In [19]: s = u'ABCDEFG012345' * 10

In [20]: %time for n in range(1000000): mojimoji.zen_to_han(s)
CPU times: user 2.86 s, sys: 0.10 s, total: 2.97 s
Wall time: 2.88 s

In [21]: %time for n in range(1000000): unicodedata.normalize('NFKC', s)
CPU times: user 5.43 s, sys: 0.12 s, total: 5.55 s
Wall time: 5.44 s

In [22]: %time for n in range(1000000): zenhan.z2h(s)
CPU times: user 69.18 s, sys: 0.11 s, total: 69.29 s
Wall time: 69.48 s