provide separate data for Traditional Chinese #8

Closed
wengxt opened this Issue Jan 4, 2012 · 4 comments

2 participants

@wengxt
wengxt commented Jan 4, 2012

libpinyin should provide Traditional Chinese data, with its own language model and character set. Using opencc is not a perfect solution, because opencc cannot offer distinct characters that are both valid in Traditional Chinese, such as 台 and 臺.

@epico epico was assigned Jan 4, 2012
@epico
libpinyin member
epico commented Jan 4, 2012

Sorry, as the name states, libpinyin focuses on pinyin handling.
The chewing parser is required for the ibus-pinyin integration.
And I believe that the best software is produced by its users; maybe libchewing will deal with this better. :)

@wengxt
wengxt commented Jan 4, 2012

Actually, I don't think it's too hard. We can do it in a simple way at first and see whether other people improve it later.

The first step may be to simply create a separate char.table containing only traditional characters, but a complete set (for example, "台" and "臺" are both in the table; we can use the simplified frequency data at first), and to use opencc to convert all the words, thus getting usable traditional data. Currently, "台" cannot be typed with libpinyin when the output is converted by opencc, which is not acceptable for zh_TW.
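The first step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `(char, pinyin, freq)` tuples are an in-memory stand-in and not libpinyin's actual char.table format, the variant mapping and the frequency numbers are hypothetical placeholders, and the helper name `build_traditional_table` is made up for this example. No opencc call is used here; the point is only that every traditional variant gets its own entry and inherits the simplified frequency.

```python
# Hypothetical one-to-many mapping from a simplified character to all of its
# traditional variants. A real table would come from a curated data source.
SIMP_TO_TRAD_VARIANTS = {
    "台": ["台", "臺", "颱", "檯"],
    "湾": ["灣"],
}


def build_traditional_table(simp_entries, variants):
    """Expand simplified (char, pinyin, freq) entries into a traditional table.

    Each traditional variant inherits the frequency of the simplified entry it
    was derived from. Characters without a mapping are kept unchanged.
    """
    table = {}
    for char, pinyin, freq in simp_entries:
        for trad in variants.get(char, [char]):
            key = (trad, pinyin)
            # If several simplified entries map to the same traditional
            # character and reading, keep the highest frequency seen.
            table[key] = max(table.get(key, 0), freq)
    return table


# Placeholder frequencies, for illustration only.
simp_entries = [("台", "tai", 1200), ("湾", "wan", 300), ("人", "ren", 5000)]
trad_table = build_traditional_table(simp_entries, SIMP_TO_TRAD_VARIANTS)

for (char, pinyin), freq in sorted(trad_table.items()):
    print(char, pinyin, freq)
```

With a table like this, both "台" and "臺" are candidates for the reading "tai", which is the behaviour a plain one-to-one conversion cannot provide. Real frequencies for each variant would still need to be collected from traditional text later.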

And I would say libchewing is simply crap, just an input method library from 10 years ago.

@epico
libpinyin member
epico commented Feb 23, 2012

Just pushed ucs4 character support to github.com/epico/libpinyin.
Please try it. :)

@epico
libpinyin member
epico commented Sep 30, 2014

As libzhuyin 0.9.99.20140929 has been released, closing this issue.

@epico epico closed this Sep 30, 2014