LANGGEN or BILBO BABBITT BABBLES
LANGGEN is my fun project for learning Rust. It processes text files in a given language (tested on English) and builds a text model from trigrams. From these trigrams it generates random sentences, which are nonsense but somehow resemble "real" language.
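The trigram idea can be sketched in a few lines of Rust. This is only an illustration, not LANGGEN's actual code: it assumes word-level trigrams (the real project may work differently), and the names `Model`, `build_model`, `Lcg` and `generate` are made up for this example. A tiny hand-rolled generator stands in for a proper random number crate so the sketch has no dependencies.

```rust
use std::collections::HashMap;

/// Hypothetical model type: maps a pair of consecutive words to every
/// word that was seen following that pair in the corpus.
type Model = HashMap<(String, String), Vec<String>>;

/// Build a word-level trigram model from whitespace-separated text.
fn build_model(text: &str) -> Model {
    let words: Vec<&str> = text.split_whitespace().collect();
    let mut model = Model::new();
    // Each window of three words contributes one (pair -> follower) entry.
    for w in words.windows(3) {
        model
            .entry((w[0].to_string(), w[1].to_string()))
            .or_default()
            .push(w[2].to_string());
    }
    model
}

/// Minimal linear congruential generator, used only to keep the sketch
/// free of external crates. It is not a good source of randomness.
struct Lcg(u64);

impl Lcg {
    /// Return a pseudo-random index in `0..len` (`len` must be non-zero).
    fn next_index(&mut self, len: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as usize) % len
    }
}

/// Generate text starting from the seed pair `(first, second)`.
/// Stops once `max_words` words are reached or when the current pair has
/// no known follower; the two seed words are always kept.
fn generate(model: &Model, first: &str, second: &str, max_words: usize, rng: &mut Lcg) -> String {
    let mut out = vec![first.to_string(), second.to_string()];
    while out.len() < max_words {
        let key = (out[out.len() - 2].clone(), out[out.len() - 1].clone());
        match model.get(&key) {
            Some(followers) => {
                let next = followers[rng.next_index(followers.len())].clone();
                out.push(next);
            }
            None => break,
        }
    }
    out.join(" ")
}
```

Calling `build_model` on a corpus string and then `generate(&model, "the", "cat", 20, &mut Lcg(42))` walks the model pair by pair. Storing repeated followers in a `Vec` means frequent continuations are picked more often, which is what makes the output resemble the source language.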
How to use
- Ensure you have the latest Rust toolchain and Cargo installed.
- Clone the project from GitHub.
- Get some text files to use as a language corpus, for instance this file (all of Doyle's Sherlock Holmes books in one txt); the repository also contains a script to download the top 100 English books from Project Gutenberg.
- Build and run the HTTP server:
cargo run --bin serve --release -- [your_text_files]
Dockerfiles are also available in deploy (generic image) and deploy-s2i (builder image for OpenShift).
License: MIT or Apache 2.0