A simple Python library for generating text with Markov chains

Marky - A Markov Chain Text Generation Library

Marky is a simple library implementing Markov chains for text generation, with any step size and in either direction.

You can create a chain like so:

text = ...
chain = marky.chain(1, marky.word_tokenize(text))

If nltk leaves punctuation attached to the ends of words, you can use marky.fix_passage_punc to strip it:

text = ...
chain = marky.chain(1, marky.fix_passage_punc(marky.word_tokenize(text)))

Here, 1 is the step: how far ahead the chain looks when pairing each word with a later one. For example, with a step of 1:

(a, b, c, d, e) would be paired as (a, b), (b, c), (c, d), (d, e)

whereas a step of 2 would produce:

(a, c), (b, d), (c, e)

If the step is negative, the pairing runs backwards. This is useful for generating text backwards from an end word (see the poetry example). A step of -1 would produce:

(e, d), (d, c), (c, b), (b, a)

And a step of -2 would produce:

(e, c), (d, b), (c, a)
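
The pairing rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Marky's actual code: `step_pairs` is a hypothetical helper name, and rejecting a step of 0 is an assumption made here.

```python
def step_pairs(tokens, step):
    """Pair each token with the one `step` positions away.

    Hypothetical helper for illustration only; not part of Marky's API.
    """
    if step == 0:
        # Assumption: a step of 0 is treated as invalid in this sketch.
        raise ValueError("step must be non-zero")
    seq = list(tokens)
    if step < 0:
        # A negative step walks the sequence from the end toward the start.
        seq.reverse()
    gap = abs(step)
    # zip stops at the shorter input, so trailing tokens without a partner are dropped.
    return list(zip(seq, seq[gap:]))


words = ["a", "b", "c", "d", "e"]
forward = step_pairs(words, 2)    # [('a', 'c'), ('b', 'd'), ('c', 'e')]
backward = step_pairs(words, -1)  # [('e', 'd'), ('d', 'c'), ('c', 'b'), ('b', 'a')]
```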

After you have the chain, you can use marky.take(chain, n) to take n generated words. This is based on the take recipe from the itertools documentation, which works because a MarkovChain instance is also an iterator. If you need to pass more options (check the code for them), you can call the full-featured get_word method instead of next.
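
For reference, a take helper in the style of the itertools recipe might look like the sketch below. It assumes the (iterable, n) argument order used in this README; the recipe in the itertools documentation takes (n, iterable), and Marky's own implementation may differ. `itertools.count` stands in for a chain here, since any iterator works.

```python
from itertools import count, islice


def take(iterable, n):
    """Return the first n items of an iterable as a list."""
    return list(islice(iterable, n))


# Stand-in for a chain: any iterator can be consumed this way.
numbers = count(1)
first = take(numbers, 3)   # [1, 2, 3]
second = take(numbers, 2)  # [4, 5], the iterator resumes where it left off
```

Because the iterator keeps its position between calls, repeated calls continue the same generated sequence rather than restarting it.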

The full simple example code is:

import marky

text = ...
chain = marky.chain(1, marky.fix_passage_punc(marky.word_tokenize(text)))
print(' '.join(marky.take(chain, 100)))

Note: word_tokenize requires nltk, a Python library for natural language processing.