Prototype speed-up of the splitting stage of the word segmenter Micter
Common Lisp C C++ Ruby
trie
COPYING
README
common-utils.lisp
mimic-split.cc
mimic.lisp
mkcorpus.rb
mkmimic.cc

README

【Overview】
・An experiment in speeding up the splitting stage of the word segmenter Micter
 ・For Micter itself, see http://d.hatena.ne.jp/tkng/20100625/1277428044


【Usage】
==============================
[common lisp]

・Building a model
--
$ cd mimic
$ sbcl
> (load "mimic")
> (mimic:train-file "/path/to/learn-data")  ; learn-data uses the same format as Micter
--

・Splitting (Common Lisp)
--
> (defvar *svm* (mimic:train-file "/path/to/learn-data"))
> (mimic:split *svm* "text")
--

・Saving the model for C++
--
> (mimic:dump-model *svm* "model-file")
--

==============================
[C++]
・Building the DoubleArray index
$ g++ -O2 -o mkmimic mkmimic.cc
$ ./mkmimic index model-file
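
For readers unfamiliar with the term, a double array is a compact trie layout: moving from state s on byte c goes to t = base[s] + c + 1, and the move is valid only if check[t] == s. The following is a minimal sketch of that idea, not the layout that mkmimic actually writes. The names SketchDoubleArray and TrieNode, the integer values, and the "+ 1" offset are all assumptions made for illustration; the README does not describe the index format.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical helper: an ordinary pointer-based trie used only while building.
struct TrieNode {
    std::map<unsigned char, std::unique_ptr<TrieNode>> children;
    int value = -1;  // -1 means "no key ends at this node"
};

// Hypothetical class: a static double-array trie built once from a key set.
class SketchDoubleArray {
public:
    explicit SketchDoubleArray(const std::map<std::string, int>& keys) {
        TrieNode root;
        for (const auto& kv : keys) {
            TrieNode* node = &root;
            for (unsigned char c : kv.first) {
                auto& child = node->children[c];
                if (!child) child.reset(new TrieNode);
                node = child.get();
            }
            node->value = kv.second;
        }
        grow(1);
        check_[0] = 0;  // slot 0 is the root; mark it as used
        place(root, 0);
    }

    // Returns the value stored for key, or -1 if key is absent.
    int lookup(const std::string& key) const {
        std::size_t s = 0;
        for (unsigned char c : key) {
            const std::size_t t = base_[s] + c + 1;
            if (t >= check_.size() || check_[t] != static_cast<int>(s)) return -1;
            s = t;
        }
        return value_[s];
    }

private:
    // Extends all three arrays to at least n slots; new slots are free.
    void grow(std::size_t n) {
        if (n > base_.size()) {
            base_.resize(n, 0);
            check_.resize(n, -1);
            value_.resize(n, -1);
        }
    }

    // Copies node into slot s, then finds a base for its children.
    void place(const TrieNode& node, std::size_t s) {
        assert(s < base_.size());
        value_[s] = node.value;
        if (node.children.empty()) return;
        // Linear search for the smallest base whose child slots are all free.
        std::size_t b = 0;
        for (;; ++b) {
            bool ok = true;
            for (const auto& kv : node.children) {
                const std::size_t t = b + kv.first + 1;
                if (t < check_.size() && check_[t] != -1) { ok = false; break; }
            }
            if (ok) break;
        }
        base_[s] = b;
        // Claim every child slot before recursing so descendants cannot take them.
        for (const auto& kv : node.children) {
            const std::size_t t = b + kv.first + 1;
            grow(t + 1);
            check_[t] = static_cast<int>(s);
        }
        for (const auto& kv : node.children) place(*kv.second, b + kv.first + 1);
    }

    std::vector<std::size_t> base_;  // base_[s]: offset added to a byte to reach a child
    std::vector<int> check_;         // check_[t]: parent slot of t, or -1 if t is free
    std::vector<int> value_;         // value_[s]: value of the key ending at s, or -1
};
```

Built from the keys "ab" = 1, "abc" = 2 and "b" = 3, lookup("ab") returns 1 and lookup("a") returns -1. The appeal of this layout for a fast splitter is that a lookup is one addition and one comparison per byte, with no pointer chasing.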

・Splitting (C++)
$ g++ -O2 -o mimic-split mimic-split.cc
$ ./mimic-split index text-file