A port of pytst to Ruby. It is a implementation of Ternary Search Trie (TST) and scanning by the Aho-Corasick algorithm.
Fetching latest commit…
Cannot retrieve the latest commit at this time
ruby-pytst ---------- ruby-pytst is a port of pytst to Ruby. It is a implementation of Ternary Search Trie (TST) and it also supports scanning by the Aho-Corasick algorithm. It is built with SWIG. This software is distributed under LGPL. I have successfully built on the following environments. (1) Mac OS X 10.4, Ruby 1.8.4 (Fink) (2) Fedora Core 5, Ruby 1.8.4 tst_wrap.cxx in this package was generated on (1) with SWIG 1.3.19. Author: Masaki Yatsu <firstname.lastname@example.org> Installation ------------ Generate the Makefile. % ruby extconf.rb Make it. % make Then, install it as root. % make install If you want to generate the Ruby wrapper codes from tst.i by yourself, execute the following at the beginning. % swig -c++ -ruby -Iinclude tst.i How to Use ---------- There is no documentation at this point, but there is README.html of pytst in the doc directory. There are examples in the example directory. You can run examples like this. % ruby example/tokenize.rb ruby-pytst supports following character sets. * ASCII ($KCODE = 'NONE') * EUC-JP ($KCODE = 'EUC') * Shift_JIS ($KCODE = 'SJIS') * UTF-8 ($KCODE = 'UTF8') You have to specify $KCODE *before* you create a Pytst::TST. There is a EUC-JP example in the example directory. % ruby -Ke example/japanese_euc.rb The option '-Ke' means $KCODE = 'EUC'. See also -------- pytst: http://nicolas.lehuen.com/download/pytst/ Ternary Search Trie: http://en.wikipedia.org/wiki/Ternary_search_tries Aho-Corasick algorithm http://en.wikipedia.org/wiki/Aho-Corasick_algorithm