A simple Python library for generating text with Markov chains
Marky - A Markov Chain Text Generation Library

Marky is a simple library that implements Markov chains with a configurable step size and direction for text generation.

You can create a chain like so:

text = ...
chain = marky.chain(1, marky.word_tokenize(text))

If nltk leaves punctuation attached to the ends of words, you can use marky.fix_passage_punc to strip it.

text = ...
chain = marky.chain(1, marky.fix_passage_punc(marky.word_tokenize(text)))
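To make the effect of that cleanup step concrete, here is a minimal sketch of stripping punctuation from the ends of tokens. The helper name strip_edge_punctuation is hypothetical and this is not marky's implementation; it only assumes that "punctuation" means the ASCII characters in string.punctuation and that tokens left empty should be dropped.

```python
import string


def strip_edge_punctuation(tokens):
    """Hypothetical helper, not marky.fix_passage_punc itself.

    Removes ASCII punctuation from both ends of each token and drops
    tokens that consisted only of punctuation.
    """
    cleaned = (token.strip(string.punctuation) for token in tokens)
    return [token for token in cleaned if token]


print(strip_edge_punctuation(["Hello,", "world!", "--", "it's"]))
```

Punctuation inside a word, such as the apostrophe in "it's", is left alone because only the ends are stripped.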

Here, 1 is the step: how far ahead the chain looks when pairing each word with a following word. For example, with a step of 1:

(a, b, c, d, e) would pair (a, b), (b, c), (c, d), (d, e)

A step of 2 would instead produce:

(a, c), (b, d), (c, e)

If the step is negative, the pairing runs backwards. This is useful for generating text backwards from a chosen end word (see the poetry example in poetry.py). A step of -1 would produce:

(e, d), (d, c), (c, b), (b, a)

And a step of -2 would produce:

(e, c), (d, b), (c, a)
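The pairings listed above can be reproduced with a few lines of Python. This is only a sketch of the pairing rule as described here; pair_with_step is a hypothetical name, and marky's own chain construction may differ in its details.

```python
def pair_with_step(tokens, step):
    """Pair each token with the one abs(step) positions further along.

    A negative step walks the sequence from its end, matching the
    examples in this README. Hypothetical helper, not part of marky.
    """
    if step == 0:
        raise ValueError("step must be non-zero")
    seq = list(tokens)
    if step < 0:
        seq.reverse()  # walk backwards for negative steps
    return list(zip(seq, seq[abs(step):]))


for step in (1, 2, -1, -2):
    print(step, pair_with_step("abcde", step))
```

Each pair is (current word, word to generate next), so a negative step yields a chain that produces text from the end towards the beginning.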

Once you have the chain, use marky.take(chain, n) to take n generated words. It is based on the take recipe from the itertools documentation and works because a MarkovChain instance is also an iterator. If you need to pass more options (check the code for them), call the full-featured get_word method instead of next.
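The sketch below shows why an iterator-based chain works with a take helper. ToyChain is a hypothetical stand-in written for this example (Python 3), not marky's MarkovChain, and the restart rule for words without a successor is an assumption. The take function follows the itertools recipe but uses the (iterable, n) argument order of marky.take.

```python
import random
from collections import defaultdict
from itertools import islice


def take(iterable, n):
    """Return the first n items of iterable as a list."""
    return list(islice(iterable, n))


class ToyChain:
    """Hypothetical step-1 chain that is its own iterator; not marky's MarkovChain."""

    def __init__(self, tokens, seed=None):
        self.rng = random.Random(seed)
        self.successors = defaultdict(list)
        for current, following in zip(tokens, tokens[1:]):
            self.successors[current].append(following)
        self.starts = list(self.successors)
        self.word = self.rng.choice(self.starts)

    def __iter__(self):
        return self

    def __next__(self):
        word = self.word
        options = self.successors.get(word)
        # Assumption: restart from a random known word at a dead end.
        self.word = self.rng.choice(options or self.starts)
        return word


tokens = "the cat sat on the mat and the dog sat on the rug".split()
words = take(ToyChain(tokens, seed=0), 8)
print(" ".join(words))
```

Because the chain never raises StopIteration, it is an infinite iterator, and islice is what bounds the output to n words.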

The full code for the simple example is:

import marky

text = ...
chain = marky.chain(1, marky.fix_passage_punc(marky.word_tokenize(text)))
# print() with a single argument works on both Python 2 and Python 3
print(' '.join(marky.take(chain, 100)))

Note: word_tokenize requires nltk, a Python library for natural language processing.
