# corpus

A tool for training detokenization libraries.

## Usage

    ;; Load the corpus namespace, read a text file into a corpus, and count its entries.
    (use 'corpus.core)
    (def w (corpus-file "data/alice-in-wonderland.txt"))
    (count w)
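
For readers who want a feel for what loading and counting a text corpus involves, here is a minimal sketch. It assumes simple whitespace tokenization. The names `tokenize` and `load-tokens` are hypothetical and are not part of `corpus.core`, and `corpus-file` may work quite differently.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper (not part of corpus.core):
;; split text on runs of whitespace and drop empty strings.
(defn tokenize [text]
  (remove str/blank? (str/split text #"\s+")))

;; Hypothetical helper (not part of corpus.core):
;; read a whole file into memory and tokenize it.
(defn load-tokens [path]
  (tokenize (slurp path)))

;; Counting the tokens of an in-memory string:
(count (tokenize "Alice was beginning to get very tired"))
;; => 7
```

With the sample data file in place, `(count (load-tokens "data/alice-in-wonderland.txt"))` would give a comparable count for the whole book under the same whitespace assumption.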

## License

Copyright (C) 2010 Lee Hinman

Distributed under the Eclipse Public License, the same as Clojure.