Word segmentation tools for ASEAN languages written in Common Lisp

cl-wordcut

cl-wordcut is a word segmentation tool, written in Common Lisp, for ASEAN languages such as Khmer, Lao, and Thai.

Examples

Khmer (Cambodian)

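;; Load the system (relies on REQUIRE being able to find the ASDF system).
(require 'cl-wordcut)
;; DEFPARAMETER rebinds on every evaluation, so the examples can share one session.
(defparameter *dict* (cl-wordcut:load-dict-from-bundle "khmerwords.txt"))
(defparameter *wordcut* (cl-wordcut:create-basic-wordcut *dict*))
;; *wordcut* holds a function; call it on the text to segment.
(funcall *wordcut* "ភាសាខ្មែរ")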

Lao

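;; Load the system (relies on REQUIRE being able to find the ASDF system).
(require 'cl-wordcut)
;; DEFPARAMETER rebinds on every evaluation, so the examples can share one session.
(defparameter *dict* (cl-wordcut:load-dict-from-bundle "laowords.txt"))
(defparameter *wordcut* (cl-wordcut:create-basic-wordcut *dict*))
;; *wordcut* holds a function; call it on the text to segment.
(funcall *wordcut* "ພາສາລາວມີ")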

Thai

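;; Load the system (relies on REQUIRE being able to find the ASDF system).
(require 'cl-wordcut)
;; DEFPARAMETER rebinds on every evaluation, so the examples can share one session.
(defparameter *dict* (cl-wordcut:load-dict-from-bundle "tdict-std.txt"))
(defparameter *wordcut* (cl-wordcut:create-basic-wordcut *dict*))
;; *wordcut* holds a function; call it on the text to segment.
(funcall *wordcut* "กากาม")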
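
How dictionary-based segmentation works

The examples above load a word list and build a segmenter from it. For readers new to the idea, here is a minimal sketch of dictionary-driven segmentation using greedy longest matching. It is an illustration only and is not cl-wordcut's algorithm: the function name segment-longest-match and the sample English word list are hypothetical, and nothing here calls the cl-wordcut API.

```lisp
;; Toy greedy longest-match segmenter. NOT cl-wordcut's implementation;
;; SEGMENT-LONGEST-MATCH and the sample dictionary are hypothetical.
(defun segment-longest-match (text dict)
  "Split TEXT into words, preferring the longest entry of DICT (a list of
strings) at each position. Unmatched characters become one-character tokens."
  (let ((max-len (reduce #'max dict :key #'length :initial-value 1))
        (tokens '())
        (pos 0)
        (end (length text)))
    (loop while (< pos end)
          do (let* ((match-len
                      ;; Try the longest candidate first, then shorter ones.
                      (loop for len from (min max-len (- end pos)) downto 1
                            when (member (subseq text pos (+ pos len)) dict
                                         :test #'string=)
                              return len))
                    ;; Fall back to a single character when nothing matches.
                    (step (or match-len 1)))
               (push (subseq text pos (+ pos step)) tokens)
               (incf pos step)))
    (nreverse tokens)))

;; Greedy matching picks "cats" over "cat", so "sat" is never found.
(segment-longest-match "thecatsat" '("the" "cat" "cats" "sat" "at"))
;; => ("the" "cats" "at")
```

The result shows the weakness of a purely greedy strategy: the locally longest word is not always the intended one, which is one reason practical segmenters usually weigh alternative splits instead of committing to the first match.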