For my undergrad thesis I am building an automatic poet. The idea is to implement an evolutionary algorithm that can construct a poem using fitness functions trained on a text-based corpora.
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Type Name Latest commit message Commit time
Failed to load latest commit information.

Le Thesis

This is my graduation thesis, feel free to poke around and tell me what an idiot I am.

It comes out of a project I slowly started working on a while ago:

Some recaps of general research in this field are on my blog:

Le Goals

To implement a system that, given a corpora of poetry, can learn how to write its own poetry on par so that a layman observer cannot distinguish a generated poem from one written by a poet.


Parameters to be learned:

  • rhyming structure (phonetic match of last syllables in verse)
  • assonance structure (vowel repetition for internal rhyming)
  • alliteration structure (consonant repetition)
  • rhythm
  • verse structure
  • stanza structure

Most of this basically requires a way to syllabify text, which I believe can be achieved by finding a good substring cover of the corpora with strings adhering to the C*V+C* pattern.

Semantically the system needs to

  • sentence structure - supposedly well achievable with lexicalized probabilistic context free grammars
  • semantic structure (at least for internal consistency)
  • sentiment (emotional infliction of the poems)


The above parameters of a poem should be constructed into some sort of DNA from which poems can then be generated. It is important to achieve

  • correct rhyming/assonance/alliteration or combination of these
  • consistent use of rhythm
  • appropriate division into verses and stanzas
  • correct sentence structure
  • internal semantic consistency

Measuring success

It shouldn't be too difficult to measure correctness in terms of proper rhyming etc. as long as we stick to strict poetic structures.

But this looks like an arduous manual process, so a better solution needs to be found.