A Python system for generating rhyming poetry with Markov chains.
The Rhyming Robot (seuss) uses Markov chains to generate random text. The
chains are generated from analysis of selected source texts placed in the
sources/ directory -- preferably source texts of 50,000 words or greater.
Each line in a poem can be generated using different source texts, allowing
surprising and amusing combinations (the Bible and legal disclaimers, for
example).
Markov chains are generated in advance by makeChains.py and cached.
There are three Markov chains used:
- A forward-looking 2nd-order chain; that is, the next word depends on the previous two.
- A reverse-looking 2nd-order chain
- A reverse-looking 1st-order chain
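As a rough illustration of the three chain types, here is a minimal sketch that builds them from a list of sentences. The function name `build_chains` and the dictionary-of-lists layout are assumptions made for this example; makeChains.py may store its chains differently.

```python
from collections import defaultdict

def build_chains(sentences):
    """Build the three chain types from an iterable of sentence strings.

    Hypothetical layout (not necessarily what makeChains.py writes):
      forward2[(w1, w2)] -> words seen right after "w1 w2"
      reverse2[(w3, w2)] -> words seen right before "w2 w3"
      reverse1[w2]       -> words seen right before "w2"
    """
    forward2 = defaultdict(list)
    reverse2 = defaultdict(list)
    reverse1 = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        # Every run of three words feeds both 2nd-order chains.
        for i in range(len(words) - 2):
            a, b, c = words[i:i + 3]
            forward2[(a, b)].append(c)
            reverse2[(c, b)].append(a)  # keyed in reading-backward order
        # Every pair of words feeds the 1st-order reverse chain.
        for i in range(len(words) - 1):
            reverse1[words[i + 1]].append(words[i])
    return forward2, reverse2, reverse1

forward2, reverse2, reverse1 = build_chains(["the cat sat on the mat"])
print(forward2[("the", "cat")])  # ['sat']
print(reverse2[("mat", "the")])  # ['on']
print(reverse1["mat"])           # ['the']
```

Storing repeated followers as duplicates in a list means a uniform random choice from the list already reflects how often each word followed in the source.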
The basic process looks like this:
- The first line of the poem is generated using the forward-looking chain.
- If the second line does not need to rhyme, it uses the forward chain. Otherwise, it uses the rhyming dictionary to find words that rhyme, then attempts to find those words in the reverse 1st-order chain. This chain is used to find the next-to-last word, and the rest of the line is then filled in backward using the reverse 2nd-order chain.
- Subsequent lines follow the same process, according to the supplied rhyme scheme.
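The rhyming step can be sketched as follows. This is only an illustration under assumptions: the function name `rhyming_line`, the `length` cap, and the chain layout (reverse chains keyed in reading-backward order, values as lists of preceding words) are invented for the example and are not taken from rhyme.py.

```python
import random

def rhyming_line(rhymes, reverse1, reverse2, length=6, rng=random):
    """Build a line right to left so that it ends in one of `rhymes`.

    reverse1[w]        -> words seen right before w
    reverse2[(w3, w2)] -> words seen right before "w2 w3"
    Returns None if no rhyming word occurs in the source.
    """
    # Keep only rhyming words that the source text actually contains.
    candidates = [w for w in rhymes if w in reverse1]
    if not candidates:
        return None  # a caller could retry with another word or source
    last = rng.choice(candidates)
    # The 1st-order reverse chain supplies the next-to-last word.
    line = [last, rng.choice(reverse1[last])]
    # The 2nd-order reverse chain fills in the rest, moving backward.
    while len(line) < length:
        options = reverse2.get((line[-2], line[-1]))
        if not options:
            break  # nothing precedes this pair in the source
        line.append(rng.choice(options))
    return " ".join(reversed(line))

# Tiny hand-made chains so the result is deterministic.
reverse1 = {"mat": ["the"]}
reverse2 = {("mat", "the"): ["on"], ("the", "on"): ["sat"]}
print(rhyming_line(["cat", "mat"], reverse1, reverse2))  # sat on the mat
```

Starting from the rhyming word and walking backward guarantees the line ends on the rhyme, which a forward chain cannot promise.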
The rhyming dictionary is contained in data/words.sql.gz. You'll need to
import this into an SQLite database named data/sql-words. To do this, open a
shell in the data/ directory and type:
$ gunzip words.sql.gz
$ sqlite3 sql-words
SQLite version 3.7.13 2012-06-11 02:05:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .read words.sql
sqlite> .quit
This creates the database and reads the data into it. You're now ready to use the Rhyming Robot.
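If the sqlite3 command-line shell is not available, the same import can probably be done from Python's standard library. The sketch below assumes the dump is UTF-8 and contains only plain SQL statements (no sqlite3 shell dot-commands); the helper name `import_dump` is made up for this example.

```python
import gzip
import sqlite3

def import_dump(dump_path, db_path):
    """Load a gzipped SQL dump into an SQLite database file."""
    # Read the compressed dump as text (assumes UTF-8 and plain SQL only).
    with gzip.open(dump_path, "rt", encoding="utf-8") as f:
        sql = f.read()
    conn = sqlite3.connect(db_path)
    try:
        conn.executescript(sql)  # runs every statement in the dump
        conn.commit()
    finally:
        conn.close()

# From the data/ directory, with words.sql.gz still compressed:
# import_dump("words.sql.gz", "sql-words")
```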
First, supply the robot with a source text. Ideal source texts are very long (50,000 words or more) and amusing. Choose a source with a normal English vocabulary -- the rhyming dictionary does not understand specialized terminology or foreign languages. You may supply as many source texts as you like and switch between them.
In this example, we will use the World English Bible from ebible.org; use the
plain-text version and put it all in one text file. Cut out any parts you do not
want and save it under the name-raw.txt naming that format.pl expects (here,
bible-raw.txt). Then run format.pl to split the file into sentences:
$ format.pl bible
format.pl will take name-raw.txt and transform it into a file that
contains the same source with one sentence per line. This can be loaded by the
Markov chain generator.
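To give a feel for the one-sentence-per-line format, here is a naive splitter in Python. It is not format.pl's actual logic (which is a Perl script and may handle abbreviations, quotes, and verse numbers differently); the function name `split_sentences` and the punctuation rule are assumptions for illustration.

```python
import re

def split_sentences(text):
    """Return the sentences in `text`, one string per sentence.

    Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
    """
    # Collapse line breaks and runs of spaces so wrapped lines are rejoined.
    text = " ".join(text.split())
    # Split at whitespace that follows sentence-ending punctuation.
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p for p in parts if p]

raw = "In the beginning.\nGod said, Let there\nbe light! And so?"
for sentence in split_sentences(raw):
    print(sentence)
```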
Head back up a directory and generate the Markov chains:
$ python makeChains.py bible
This may take a long time. The script writes the Markov chains to cache/, so
ensure it has access to that directory and can create files there. Once done,
there should be three new files in cache/ containing the Markov chains.
Now you'll need to edit the people dictionary in rhyme.py to know about your
new source. The dictionary maps single characters to source names; you'll use
these characters when choosing what sources will be written in your poem.
You can now write poetry. If you added the Bible with code b in the people
dictionary, you can specify a rhymescheme aabba and a personality of five b
characters to have the generator use the Bible for every line. The personality
specification must be the same length as the rhymescheme.
Run python rhyme.py to see the detailed usage instructions.
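The relationship between the rhymescheme, the personality, and the people dictionary can be sketched like this. The dictionary contents and the helper `plan_poem` are hypothetical and written only for this example; the real dictionary and argument handling live in rhyme.py.

```python
# Hypothetical mapping in the style of the people dictionary in rhyme.py:
# one character per source name.
people = {"b": "bible"}

def plan_poem(rhymescheme, personality, people=people):
    """Pair each line's rhyme letter with the source that will write it."""
    if len(rhymescheme) != len(personality):
        raise ValueError(
            "personality must be the same length as the rhymescheme")
    unknown = sorted(set(personality) - set(people))
    if unknown:
        raise ValueError("unknown source code(s): " + ", ".join(unknown))
    return [(rhyme, people[code])
            for rhyme, code in zip(rhymescheme, personality)]

# A five-line poem, rhymed aabba, with every line drawn from the Bible.
for rhyme, source in plan_poem("aabba", "bbbbb"):
    print(rhyme, source)
```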
seuss.py contains a very simple IRC bot based on twisted.words. It contains
a number of hard-coded configuration parameters; you will have to adjust it
to your needs.
By default, it loads the "brains" (source texts, which must have premade
Markov chains in cache/) specified in its hard-coded configuration. Another
setting gives the fraction of messages which are responded to by the robot
automatically; the bot will respond to random messages along with any message
that mentions its name.
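The response rule described above amounts to something like the following. This is a sketch, not code from seuss.py: the function name `should_respond` and its parameters are assumptions, and the real bot may match its name differently.

```python
import random

def should_respond(message, nick, fraction, rng=random):
    """Decide whether the bot answers a channel message.

    Always answer when the bot's nick is mentioned; otherwise answer
    a random `fraction` (0.0 to 1.0) of messages.
    """
    if nick.lower() in message.lower():
        return True
    # rng.random() is uniform on [0.0, 1.0), so this is true with
    # probability `fraction`.
    return rng.random() < fraction

print(should_respond("seuss, write me a poem", "seuss", 0.05))  # True
```

A small fraction keeps the bot from dominating a busy channel while still letting it chime in unprompted now and then.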
The server can be specified in the block at the bottom of the file, and the channel name must be given when the bot is started; e.g.,
$ python seuss.py channelName
The bot will join any channel it is /invited to, so be wary. It will also
respond to any /msg directed to it; be careful of infinite loops that may
occur when a bot msgs the rhyming bot. (For example, in testing the bot
received a msg from NickServ on connecting, responded, received an "unknown
command" error, responded, etc.) You can use self.nickExcludeList to avoid
such loops.
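A check against such a list could look like the sketch below. It relies on the fact that twisted.words passes the sender to `IRCClient.privmsg` as a `nick!ident@host` string; the helper name `is_excluded` and the case-insensitive comparison are assumptions, not the actual logic in seuss.py.

```python
def is_excluded(user, nick_exclude_list):
    """Return True if a message from `user` should be ignored.

    `user` is the sender as IRC delivers it, e.g. "nick!ident@host".
    """
    nick = user.split("!", 1)[0]
    # Compare case-insensitively, since IRC nicks ignore case.
    return nick.lower() in {n.lower() for n in nick_exclude_list}

exclude = ["NickServ", "ChanServ"]
print(is_excluded("NickServ!services@example.net", exclude))  # True
print(is_excluded("alice!alice@example.org", exclude))        # False
```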
The web interface is not recommended, as poetry generation can use a
substantial amount of processing power. Nevertheless, for your amusement,
poetry.php is included. Simply place the rhyming robot in a web-accessible
directory and point people to poetry.php. Make sure they can't invoke the
Python files from the web, of course. poetry.php must be adjusted with a list
of all the valid source texts you have provided.
It turns out that random generation of amusing content has been around for a while. Here are some suggestions: