Requirements

  • python2.7
  • tensorflow 0.12
    • CPU: pip install tensorflow==0.12.0
    • GPU: pip install tensorflow-gpu==0.12.0
  • nltk, cmudict and stopwords
    • import nltk; nltk.download("cmudict"); nltk.download("stopwords")
  • gensim
    • pip install gensim
  • sklearn
    • pip install sklearn
  • numpy
    • pip install numpy
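The cmudict corpus supplies the pronunciations the model's pentameter component relies on: each entry is a list of ARPAbet phones whose vowels carry a stress digit. A stdlib-only sketch of reading a stress pattern from one such pronunciation (the phone list is hard-coded here rather than looked up via nltk):

```python
def stress_pattern(phones):
    """Extract the stress digits (0=unstressed, 1=primary, 2=secondary)
    from an ARPAbet pronunciation; vowel phones end in a digit."""
    return [int(p[-1]) for p in phones if p[-1].isdigit()]

# cmudict pronunciation of "behold": B IH0 HH OW1 L D
print(stress_pattern(["B", "IH0", "HH", "OW1", "L", "D"]))  # -> [0, 1]
```

An iambic pentameter line would yield ten such digits alternating unstressed/stressed.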

Data / Models

  • datasets/gutenberg/data.tgz: sonnet data, with train/valid/test splits
  • pretrain_word2vec/dim100/*: pre-trained word2vec model
  • trained_model/model.tgz: trained sonnet model

Pre-training Word Embeddings

  • The pre-trained word2vec model has already been supplied: pretrain_word2vec/dim100/*
  • It was trained on 34M Gutenberg poetry data: download link
  • If you want to train your own word embeddings, you can use the python script (uses gensim's word2vec)
    • python
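A minimal sketch of parsing the pre-trained vectors, assuming they are stored in the standard word2vec text format (a header line with vocabulary size and dimension, then one word and its vector per line); the tiny in-memory sample below stands in for pretrain_word2vec/dim100/*:

```python
import io

def load_word2vec_text(fobj):
    """Parse vectors in the word2vec text format:
    first line "<vocab_size> <dim>", then "<word> <v1> ... <vdim>"."""
    header = fobj.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in fobj:
        parts = line.rstrip().split(" ")
        vectors[parts[0]] = [float(x) for x in parts[1:1 + dim]]
    assert len(vectors) == vocab_size
    return vectors, dim

# Hypothetical two-word vocabulary with 3-dimensional vectors
sample = io.StringIO("2 3\nlove 0.1 0.2 0.3\nthee 0.4 0.5 0.6\n")
vectors, dim = load_word2vec_text(sample)
print(dim, vectors["love"])
```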

Training the Sonnet Model

  1. Extract the data; it should produce the train/valid/test splits
    • cd datasets/gutenberg; tar -xvzf data.tgz
  2. Unzip the pre-trained word2vec model
    • gunzip pretrain_word2vec/dim100/*
  3. Set up model hyper-parameters and other settings, which are all defined in
    • the default configuration is the optimal configuration used in the paper (documented here)
  4. Run python
    • takes about 2-3 hours on a single K80 GPU to train 30 epochs

Generating Sonnet Quatrain

  1. Extract the trained model
    • cd trained_model; tar -xvzf model.tgz
  2. Run python -m trained_model
    • the default configuration is the generation configuration used in the paper
    • takes about a minute to generate one quatrain on CPU (GPU not necessary)
usage: [-h] -m MODEL_DIR [-n NUM_SAMPLES] [-r RM_THRESHOLD]
                     [-s SENT_SAMPLE] [-a TEMP_MIN] [-b TEMP_MAX] [-d SEED]
                     [-v] [-p SAVE_PICKLE]

Loads a trained model to do generation

optional arguments:
  -h, --help            show this help message and exit
  -m MODEL_DIR, --model-dir MODEL_DIR
                        directory of the saved model
  -n NUM_SAMPLES, --num-samples NUM_SAMPLES
                        number of quatrains to generate (default=1)
  -r RM_THRESHOLD, --rm-threshold RM_THRESHOLD
                        rhyme cosine similarity threshold (0=off; default=0.9)
  -s SENT_SAMPLE, --sent-sample SENT_SAMPLE
                        number of sentences to sample from using pentameter
                        loss as sample probability (1=turn off sampling)
  -a TEMP_MIN, --temp-min TEMP_MIN
                        minimum temperature for word sampling (default=0.6)
  -b TEMP_MAX, --temp-max TEMP_MAX
                        maximum temperature for word sampling (default=0.8)
  -d SEED, --seed SEED  seed for generation (default=1)
  -v, --verbose         increase output verbosity
  -p SAVE_PICKLE, --save-pickle SAVE_PICKLE
                        save samples in a pickle (list of quatrains)
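The -a/-b flags bound the softmax temperature used when sampling words: lower temperatures sharpen the distribution toward the top-scoring word, higher ones flatten it. A stdlib-only sketch of temperature-scaled sampling (the logits here are made up; the real model produces one logit per vocabulary word):

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Sample an index from softmax(logits / temperature)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling over the categorical distribution
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

rng = random.Random(1)
logits = [2.0, 1.0, 0.1]
# Draw a temperature uniformly from [temp_min, temp_max], matching the
# default -a 0.6 -b 0.8 range, then sample one word index.
t = rng.uniform(0.6, 0.8)
idx = sample_with_temperature(logits, t, rng)
print(round(t, 3), idx)
```

As the temperature approaches zero, sampling degenerates to argmax; large temperatures approach uniform sampling.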

Generated Quatrains:

python -m trained_model/ -d 1

Temperature = 0.6 - 0.8
  01  [0.43]  with joyous gambols gay and still array
  02  [0.44]  no longer when he twas, while in his day
  03  [0.00]  at first to pass in all delightful ways
  04  [0.40]  around him, charming and of all his days
python -m trained_model/ -d 2
Temperature = 0.6 - 0.8
  01  [0.44]  shall i behold him in his cloudy state
  02  [0.00]  for just but tempteth me to stop and pray
  03  [0.00]  a cry: if it will drag me, find no way
  04  [0.40]  from pardon to him, who will stand and wait
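The bracketed numbers next to each generated line appear to be rhyme scores; the -r flag describes a rhyme cosine similarity threshold. A stdlib-only sketch of cosine similarity, with made-up 3-dimensional vectors standing in for the learned embeddings of two rhyming end words:

```python
import math

def cosine(u, v):
    """Cosine similarity: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical embeddings for a rhyming pair of end words ("gay" / "day")
gay = [0.9, 0.1, 0.2]
day = [0.8, 0.2, 0.3]
print(round(cosine(gay, day), 2))
```

With -r 0.9 (the default), candidate end words whose similarity to the rhyme target falls below the threshold would be rejected.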

Media Coverage


Jey Han Lau, Trevor Cohn, Timothy Baldwin, Julian Brooke and Adam Hammond (2018). Deep-speare: A joint neural model of poetic language, meter and rhyme (Supplementary Material). In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), Melbourne, Australia, pp. 1948--1958.


Creativity, Machine and Poetry: a public forum on language [video]


Code for Deep-speare: a joint neural model of poetic language, meter and rhyme



