wordcut.rb

A word tokenizer for ASEAN languages, written in Ruby.

Example

Thai

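 # coding: utf-8
 require 'wordcut/dict'
 require 'wordcut/tokenizer'
 require 'pp'

 # Load the bundled Thai dictionary (tdict-std.txt).
 tha_dict = Wordcut::BasicDict.from_bundle("tha", "tdict-std.txt")
 # Build a tokenizer backed by that dictionary.
 tokenizer = Wordcut::BasicTokenizer.new(tha_dict)
 # Pretty-print the tokens found in the input string.
 PP.pp tokenizer.tokenize('กากากา')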
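
How dictionary-based tokenization works in general

This README does not say which segmentation algorithm wordcut uses, so the sketch below is not its implementation. It only illustrates the general idea of splitting unspaced text with a word list, using greedy longest matching. The method name `longest_match_tokenize` and the one-word dictionary are hypothetical and are not part of the wordcut API.

```ruby
# coding: utf-8
require 'set'

# Hypothetical helper, NOT part of wordcut: greedy longest-match segmentation.
# At each position it takes the longest dictionary word that starts there,
# and falls back to a single character when no word matches.
def longest_match_tokenize(text, words)
  dict = words.to_set
  max_len = words.map(&:length).max || 1
  tokens = []
  i = 0
  while i < text.length
    # Start from the longest possible candidate and shrink it.
    len = [max_len, text.length - i].min
    len -= 1 while len > 1 && !dict.include?(text[i, len])
    tokens << text[i, len]
    i += len
  end
  tokens
end

# Toy dictionary for illustration only (assumption, not tdict-std.txt).
p longest_match_tokenize('กากากา', ['กา'])
```

With this toy dictionary the input is split into three copies of 'กา'. Greedy matching can pick a poor split when words overlap, so a real tokenizer may use a different strategy.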
