wordcut-clj

A word segmentation tool for ASEAN languages written in Clojure

Example

 ;; At the REPL; inside an ns form, use (:require [wordcut.tokenizer :refer :all])
 (require '[wordcut.tokenizer :refer :all])

Khmer (Cambodian)

 ((khmer-tokenizer) "ភាសាខ្មែរ")

Lao

 ((lao-tokenizer) "ພາສາລາວມີ")

Thai

 ((thai-tokenizer) "ภาษาไทย")
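
In each example above, the constructor call (for instance `(thai-tokenizer)`) returns a function that is then applied to a string. The sketch below shows one way to build such a function once and reuse it across several inputs. It assumes the tokenizer returns a sequence of word strings, which this README does not state. `segment-all` and `whitespace-tokenizer` are hypothetical names used only for illustration; the stand-in tokenizer exists so the sketch runs without wordcut on the classpath.

```clojure
(ns example.segment
  (:require [clojure.string :as str]))

;; Hypothetical helper (not part of wordcut): applies a tokenizer function
;; to each text and joins the resulting tokens with a visible delimiter.
;; Assumes the tokenizer returns a sequence of strings.
(defn segment-all
  [tokenize texts]
  (mapv #(str/join "|" (tokenize %)) texts))

;; Stand-in with the same shape as the wordcut constructors: calling it
;; returns a function from string to tokens. With wordcut loaded,
;; (thai-tokenizer), (lao-tokenizer) or (khmer-tokenizer) would take its place.
(defn whitespace-tokenizer []
  (fn [text] (str/split text #"\s+")))

;; Build the tokenizer once and reuse it for every input string.
(def tokenize (whitespace-tokenizer))

(segment-all tokenize ["hello world" "word cut demo"])
;; => ["hello|world" "word|cut|demo"]
```

Holding on to the function returned by the constructor, rather than calling the constructor for every string, is a design choice that may avoid repeated setup work if the constructor loads a dictionary; whether it does is an assumption here.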

Leiningen

 [wordcut "0.1"]
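
The dependency vector above goes in the `:dependencies` list of a `project.clj`. A minimal sketch follows; the project name `my-app`, its version, and the Clojure version are placeholders chosen for illustration, not values taken from this repository.

```clojure
;; project.clj (sketch; everything except the wordcut coordinate is a placeholder)
(defproject my-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.3"] ; assumed Clojure version
                 [wordcut "0.1"]])              ; coordinate from this README
```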