NAME

genpyt - generate the PINYIN lexicon

SYNOPSIS

genpyt lexicon-file result-file log-file slm-file

DESCRIPTION

genpyt generates the PINYIN lexicon. It works only in the zh_CN.UTF-8 locale.
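As an illustration of the argument order in the SYNOPSIS, here is a minimal Python sketch that assembles a genpyt command line and runs it with the locale set. The helper names (build_genpyt_command, zh_cn_environment, run_genpyt) are hypothetical, and setting LC_ALL is only an assumption about how the locale requirement might be satisfied; the locale must also be installed on the system.

```python
import os
import subprocess
from typing import Dict, List


def build_genpyt_command(lexicon_file: str, result_file: str,
                         log_file: str, slm_file: str) -> List[str]:
    """Return the argument vector in the order given in the SYNOPSIS."""
    return ["genpyt", lexicon_file, result_file, log_file, slm_file]


def zh_cn_environment() -> Dict[str, str]:
    """Copy the current environment and request the zh_CN.UTF-8 locale.

    Assumption: exporting LC_ALL is enough for genpyt to see the locale.
    """
    env = dict(os.environ)
    env["LC_ALL"] = "zh_CN.UTF-8"
    return env


def run_genpyt(lexicon_file: str, result_file: str,
               log_file: str, slm_file: str) -> None:
    """Run genpyt and raise CalledProcessError if it exits with an error."""
    command = build_genpyt_command(lexicon_file, result_file, log_file, slm_file)
    subprocess.run(command, env=zh_cn_environment(), check=True)
```

A call might look like run_genpyt("/usr/share/sunpinyin/dict.utf8", "lexicon.bin", "lexicon.log", "model.slm"), where every file name except the default dictionary path is a placeholder.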

ARGUMENTS

lexicon-file

Specify a dictionary file. It should be a line-based text file in UTF-8 encoding. Each line looks like:

CCC  id  [pinyin'pinyin'pinyin]*

A default dictionary file can be found at /usr/share/sunpinyin/dict.utf8.
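The line format above can be read with a few lines of Python. This is a minimal sketch, assuming the fields are separated by whitespace, the id is a decimal integer, and each pronunciation joins its syllables with an apostrophe. The names LexiconEntry and parse_lexicon_line, and the sample word and id, are hypothetical and not part of genpyt.

```python
from typing import List, NamedTuple


class LexiconEntry(NamedTuple):
    word: str                        # the CCC field: the word itself
    word_id: int                     # the id field
    pronunciations: List[List[str]]  # one syllable list per pronunciation


def parse_lexicon_line(line: str) -> LexiconEntry:
    """Parse one line of the form: CCC  id  [pinyin'pinyin'pinyin]*"""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected at least a word and an id: %r" % line)
    word, raw_id = fields[0], fields[1]
    # Every remaining field is one pronunciation; split it into syllables.
    pronunciations = [field.split("'") for field in fields[2:]]
    return LexiconEntry(word, int(raw_id), pronunciations)


# Sample line with a made-up id, for illustration only.
entry = parse_lexicon_line("中国  1234  zhong'guo")
print(entry.word, entry.word_id, entry.pronunciations)
```

Entries with no pronunciation fields yield an empty list, which matches the zero-or-more repetition in the format line.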

result-file

The output binary PINYIN lexicon file. The lexicon contains a trie representing the key tree of PINYIN, and all candidate words are sorted using the unigram information in slm-file. This file can be used with the sunpinyin input method engines.

log-file

Specify the file to which the log is written. The log-file can be read as a human-readable representation of the binary output file.

slm-file

The language model from which the unigram information is retrieved. Typically, the slm-file is generated by slmthread.

AUTHOR

Originally written by Phill.Zhang <phill.zhang@sun.com>. Currently maintained by Kov.Chai <tchaikov@gmail.com>.
