Skip to content


Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time
July 10, 2018 11:07
September 30, 2022 10:56
July 11, 2018 10:58
July 10, 2018 12:59
July 10, 2018 12:11


  • python2.7
  • tensorflow 0.12
    • CPU: pip install tensorflow==0.12.0
    • GPU: pip install tensorflow-gpu==0.12.0
  • nltk, cmudict and stopwords
    • import nltk;"cmudict");"stopwords")
  • gensim
    • pip install gensim
  • sklearn
    • pip install sklearn
  • numpy
    • pip install numpy

Data / Models

  • datasets/gutenberg/data.tgz: sonnet data, with train/valid/test splits
  • datasets/gutenberg/pretrain.tgz: general poetry data ("Background" corpus in section 3 of the paper) used for pretraining word2vec (34M word tokens)
  • pretrain_word2vec/dim100/*: pre-trained word2vec model
  • trained_model/model.tgz: trained sonnet model

Pre-training Word Embeddings

  • The pre-trained word2vec model has already been supplied: pretrain_word2vec/dim100/*
  • It was trained on 34M Gutenberg poetry data (in datasets/gutenberg/pretrain.tgz)
  • If you want to train your own word embeddings, you can use the python script (uses gensim's word2vec)
    • python

Training the Sonnet Model

  1. Extract the data; it should produce the train/valid/test splits
    • cd datasets/gutenberg; tar -xvzf data.tgz
  2. Unzip the pre-trained word2vec model
    • gunzip pretrain_word2vec/dim100/*
  3. Set up model hyper-parameters and other settings, which are all defined in
    • the default configuration is the optimal configuration used in the paper (documented here)
  4. Run python
    • takes about 2-3 hours on a single K80 GPU to train 30 epochs

Generating Sonnet Quatrain

  1. Extract the trained model
    • cd trained_model; tar -xvzf model.tgz
  2. Run python -m trained_model
    • the default configuration is the generation configuration used in the paper
    • takes about a minute to generate one quatrain on CPU (GPU not necessary)
usage: [-h] -m MODEL_DIR [-n NUM_SAMPLES] [-r RM_THRESHOLD]
                     [-s SENT_SAMPLE] [-a TEMP_MIN] [-b TEMP_MAX] [-d SEED]
                     [-v] [-p SAVE_PICKLE]

Loads a trained model to do generation

optional arguments:
  -h, --help            show this help message and exit
  -m MODEL_DIR, --model-dir MODEL_DIR
                        directory of the saved model
  -n NUM_SAMPLES, --num-samples NUM_SAMPLES
                        number of quatrains to generate (default=1)
  -r RM_THRESHOLD, --rm-threshold RM_THRESHOLD
                        rhyme cosine similarity threshold (0=off; default=0.9)
  -s SENT_SAMPLE, --sent-sample SENT_SAMPLE
                        number of sentences to sample from using pentameter
                        loss as sample probability (1=turn off sampling;
  -a TEMP_MIN, --temp-min TEMP_MIN
                        minimum temperature for word sampling (default=0.6)
  -b TEMP_MAX, --temp-max TEMP_MAX
                        maximum temperature for word sampling (default=0.8)
  -d SEED, --seed SEED  seed for generation (default=1)
  -v, --verbose         increase output verbosity
  -p SAVE_PICKLE, --save-pickle SAVE_PICKLE
                        save samples in a pickle (list of quatrains)

Generated Quatrains:

python -m trained_model/ -d 1

Temperature = 0.6 - 0.8
  01  [0.43]  with joyous gambols gay and still array
  02  [0.44]  no longer when he twas, while in his day
  03  [0.00]  at first to pass in all delightful ways
  04  [0.40]  around him, charming and of all his days
python -m trained_model/ -d 2
Temperature = 0.6 - 0.8
  01  [0.44]  shall i behold him in his cloudy state
  02  [0.00]  for just but tempteth me to stop and pray
  03  [0.00]  a cry: if it will drag me, find no way
  04  [0.40]  from pardon to him, who will stand and wait

Crowdflower and Expert Evaluation

  • Annotations can be found in the folder: evaluation_annotation/

Media Coverage


Jey Han Lau, Trevor Cohn, Timothy Baldwin, Julian Brooke and Adam Hammond (2018). Deep-speare: A joint neural model of poetic language, meter and rhyme (Supplementary Material). In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), Melbourne, Australia, pp. 1948--1958.


Creativity, Machine and Poetry for a public forum on language [video]


Code for Deep-speare: a joint neural model of poetic language, meter and rhyme







No releases published


No packages published