# Rhyming Sonnet Generation

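Following the assignment guidelines, we make our generated sonnets rhyme in the Shakespearean pattern: alternating lines rhyme within each quatrain (ABAB CDCD EFEF), and the final two lines form a rhyming couplet (GG).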

Introducing rhyme into the poems is fairly straightforward. Since the sonnets follow a strict rhyming
pattern, we can recover the rhymes Shakespeare uses by looking at the last words of rhyming line
pairs and storing each pair in a rhyming dictionary. We can then generate two lines that rhyme by
seeding the end of each line with one word of a rhyming pair and running HMM generation in the
reverse direction, from the last word back to the first.
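
To make the idea of seeded, reverse-direction generation concrete, here is a minimal sketch on a toy HMM. It is not the code in `Rhyme_HMM_helper`: the function `generate_line_backwards`, the parameter names `A_start`, `A`, `O`, and the hand-written probabilities are all assumptions made for illustration. The model is assumed to have been fit on lines whose words are stored last-to-first, so the first emission is the rhyming word.

```python
import random

def generate_line_backwards(A_start, A, O, vocab, seed_word, n_words, rng):
    """Sample n_words words ending in seed_word from an HMM fit on reversed lines.

    A_start[s] is the start probability of state s, A[s][t] the transition
    probability from s to t, and O[s][w] the probability that s emits word w.
    """
    n_states = len(A)
    seed = vocab.index(seed_word)
    # Choose the first hidden state in proportion to P(state) * P(seed | state).
    weights = [A_start[s] * O[s][seed] for s in range(n_states)]
    state = rng.choices(range(n_states), weights=weights)[0]
    emitted = [seed]
    for _ in range(n_words - 1):
        state = rng.choices(range(n_states), weights=A[state])[0]
        emitted.append(rng.choices(range(len(vocab)), weights=O[state])[0])
    # The model emits right-to-left, so flip the words back into reading order.
    return [vocab[i] for i in reversed(emitted)]

# Toy, hand-written parameters (not trained on the sonnets).
vocab = ["thy", "sweet", "love", "eyes", "lies"]
A_start = [0.5, 0.5]
A = [[0.3, 0.7], [0.6, 0.4]]
O = [[0.1, 0.1, 0.2, 0.3, 0.3],
     [0.4, 0.3, 0.3, 0.0, 0.0]]

rng = random.Random(0)
line_a = generate_line_backwards(A_start, A, O, vocab, "eyes", 5, rng)
line_b = generate_line_backwards(A_start, A, O, vocab, "lies", 5, rng)
print(" ".join(line_a))
print(" ".join(line_b))
```

Because the seed word is fixed before anything else is sampled, the two lines are guaranteed to end in the chosen rhyme pair regardless of what the model generates for the rest of the line.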

We start by loading the sonnets and collecting the last words of rhyming line pairs into a rhyming dictionary.

In [1]:
import os
import numpy as np
from IPython.display import HTML
from itertools import groupby
import re
from Rhyme_HMM import unsupervised_HMM as rhyme_unsupervised_HMM
import Rhyme_HMM_helper
import random

In [2]:
# Read the sonnets; poems are separated by blank lines in the file
with open("data/shakespeare.txt", 'r') as shakespeare:
    poems = shakespeare.readlines()

split_at = "\n"
# Group consecutive non-blank lines into poems and drop each poem's header line
final_poems = [list(g)[1:] for k, g in groupby(poems, lambda x: x != split_at) if k]
print("Initial number of poems: {}".format(len(final_poems)))
poem_lengths = [len(poem) for poem in final_poems]
# 0-based indices of the poems that do not have exactly 14 lines
bad_poems = np.where(np.array(poem_lengths) != 14)[0]
print("Sonnets {} and {} are not 14 lines long so we remove them from our list.".format(bad_poems[0], bad_poems[1]))

final_poems = [final_poems[i] for i in np.delete(np.arange(len(final_poems)), bad_poems)]
print("Final number of poems: {}".format(len(final_poems)))
# Join each poem back into a single newline-separated string, trimming spaces around each line
final_poems = [''.join([line.strip(' ') for line in poem]) for poem in final_poems]

Initial number of poems: 154
Sonnets 98 and 125 are not 14 lines long so we remove them from our list.
Final number of poems: 152


In [3]:
final_poems[0].split("\n")[0].split(" ")[-1][:-1]

'increase'

In [4]:
final_poems

["From fairest creatures we desire increase,\nThat thereby beauty's rose might never die,\nBut as the riper should by time decease,\nHis tender heir might bear his memory:\nBut thou contracted to thine own bright eyes,\nFeed'st thy light's flame with self-substantial fuel,\nMaking a famine where abundance lies,\nThy self thy foe, to thy sweet self too cruel:\nThou that art now the world's fresh ornament,\nAnd only herald to the gaudy spring,\nWithin thine own bud buriest thy content,\nAnd tender churl mak'st waste in niggarding:\nPity the world, or else this glutton be,\nTo eat the world's due, by the grave and thee.\n",
 "When forty winters shall besiege thy brow,\nAnd dig deep trenches in thy beauty's field,\nThy youth's proud livery so gazed on now,\nWill be a tattered weed of small worth held:\nThen being asked, where all thy beauty lies,\nWhere all the treasure of thy lusty days;\nTo say within thine own deep sunken eyes,\nWere an all-eating shame, and thriftless praise.\nHow much

In [5]:
def get_rhyme_pairs(poem):
    """Return the seven rhyming end-word pairs of a 14-line sonnet (ABAB CDCD EFEF GG)."""
    last_words = []
    for line in poem.split("\n"):
        # Take the last word of the line, strip punctuation, and lowercase it
        word = line.split(" ")[-1]
        word = re.sub(r'[^\w]', '', word).lower()
        last_words.append(word)

    # The trailing newline leaves one empty entry; drop it
    if '' in last_words:
        last_words.remove('')

    # Line-index pairs that rhyme under the Shakespearean scheme
    scheme = [(0, 2), (1, 3), (4, 6), (5, 7), (8, 10), (9, 11), (12, 13)]
    return [(last_words[i], last_words[j]) for i, j in scheme]
 

In [6]:
# Now compile all the rhyming words in each poem
rhyming_dict = []
for poem in final_poems:
    rhyming_dict += get_rhyme_pairs(poem)

print(rhyming_dict)

[('increase', 'decease'), ('die', 'memory'), ('eyes', 'lies'), ('fuel', 'cruel'), ('ornament', 'content'), ('spring', 'niggarding'), ('be', 'thee'), ('brow', 'now'), ('field', 'held'), ('lies', 'eyes'), ('days', 'praise'), ('use', 'excuse'), ('mine', 'thine'), ('old', 'cold'), ('viewest', 'renewest'), ('another', 'mother'), ('womb', 'tomb'), ('husbandry', 'posterity'), ('thee', 'see'), ('prime', 'time'), ('be', 'thee'), ('spend', 'lend'), ('legacy', 'free'), ('abuse', 'use'), ('give', 'live'), ('alone', 'gone'), ('deceive', 'leave'), ('thee', 'be'), ('frame', 'same'), ('dwell', 'excel'), ('on', 'gone'), ('there', 'where'), ('left', 'bereft'), ('glass', 'was'), ('meet', 'sweet'), ('deface', 'place'), ('distilled', 'selfkilled'), ('usury', 'thee'), ('loan', 'one'), ('art', 'depart'), ('thee', 'posterity'), ('fair', 'heir'), ('light', 'sight'), ('eye', 'majesty'), ('hill', 'still'), ('age', 'pilgrimage'), ('car', 'are'), ('day', 'way'), ('noon', 'son'), ('sadly', 'gladly'), ('joy', 'annoy

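Next, we tried clustering these words into lists of mutually rhyming words by treating each rhyme pair as an edge in a graph and taking the connected components. However, chaining pairs together groups words that do not actually rhyme in context, such as "I" and "free", which end up in the same cluster through intermediate words like "thee". We therefore elected to keep using the word pairs from our earlier method.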
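
As a small illustration of why sticking with pairs avoids this problem, the sketch below builds a lookup from each word to the words it was directly paired with. The helper name `build_rhyme_lookup` is hypothetical and is not part of `Rhyme_HMM_helper`; the sample pairs are copied from the printed `rhyming_dict` above.

```python
from collections import defaultdict

def build_rhyme_lookup(pairs):
    """Map each word to the set of words it was directly paired with."""
    lookup = defaultdict(set)
    for first, second in pairs:
        lookup[first].add(second)
        lookup[second].add(first)
    return dict(lookup)

# A few pairs copied from the rhyming_dict output.
sample_pairs = [("die", "memory"), ("be", "thee"), ("thee", "see"),
                ("legacy", "free"), ("usury", "thee")]
lookup = build_rhyme_lookup(sample_pairs)

print(sorted(lookup["thee"]))  # ['be', 'see', 'usury']
print(sorted(lookup["free"]))  # ['legacy']
```

Only rhymes that actually occur in a sonnet appear as partners, so a word is never matched with another one merely because a chain of intermediate rhymes connects them.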

In [7]:
import networkx as nx

G = nx.Graph()
G.add_edges_from(rhyming_dict)

paths_between_generator = nx.all_simple_paths(G,source="thee",target="i")
nodes_between_set = set()
for path in paths_between_generator:
    for node in path:
        nodes_between_set.add(node)
print(nodes_between_set)

{'i', 'be', 'free', 'thee', 'me', 'see'}


In [8]:
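import networkx as nx

G = nx.Graph()
G.add_edges_from(rhyming_dict)
# nx.connected_component_subgraphs was removed in networkx 2.4;
# nx.connected_components yields the node set of each component instead
rhyme_clusters = [list(component) for component in nx.connected_components(G)]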

In [9]:
rhyme_clusters

[['increase', 'decease', 'decrease', 'lease', 'cease', 'excess'],
 ['thereby',
  'remedy',
  'enmity',
  'posterity',
  'why',
  'sky',
  'husbandry',
  'idolatry',
  'canopy',
  'lie',
  'by',
  'fortify',
  'usury',
  'fly',
  'i',
  'flattery',
  'melancholy',
  'ye',
  'thee',
  'me',
  'eye',
  'constancy',
  'eternity',
  'masonry',
  'deny',
  'decree',
  'legacy',
  'free',
  'alchemy',
  'memory',
  'dignity',
  'majesty',
  'fee',
  'be',
  'die',
  'history',
  'see',
  'qualify',
  'gravity',
  'defy'],
 ['arise',
  'lies',
  'prophecies',
  'devise',
  'despise',
  'subtleties',
  'spies',
  'eyes',
  'cries'],
 ['jewel', 'cruel', 'fuel'],
 ['ornament',
  'spent',
  'monument',
  'excellent',
  'argument',
  'content',
  'invent',
  'rent'],
 ['prefiguring',
  'spring',
  'niggarding',
  'sing',
  'thing',
  'wing',
  'ordering',
  'king',
  'bring'],
 ['brow', 'bough', 'how', 'bow', 'allow', 'mow', 'now'],
 ['stelled', 'held', 'field'],
 ['praise', 'decays', 'days', 'lays

In [10]:
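def sublist_contains(lst, obj):
    """Print every item of lst (e.g. a rhyme pair) that contains obj."""
    for item in lst:
        if obj in item:
            print(item)
    # Matches are only printed; the function always returns an empty list
    return []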


In [11]:
sublist_contains(rhyming_dict,'alchemy')

('eye', 'alchemy')
('flattery', 'alchemy')


[]


In [14]:
# token_map maps words to numbers
# tokenized_poems replaces the words in poems with their corresponding number
tokenized_poems, token_map = Rhyme_HMM_helper.parse_observations_backwards(final_poems)
token_map_r = Rhyme_HMM_helper.obs_map_reverser(token_map)
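
The implementation of `parse_observations_backwards` lives in `Rhyme_HMM_helper` and is not shown here. As a rough sketch of what such a parser could look like, the hypothetical `parse_lines_backwards` below cleans each word the same way `get_rhyme_pairs` does, assigns ids in order of first appearance, and stores every line last-word-first. All of these details are assumptions, not the helper's actual behavior.

```python
import re

def parse_lines_backwards(poems):
    """Tokenize poems line by line, storing each line's word ids in reverse order."""
    token_map = {}
    tokenized = []
    for poem in poems:
        for line in poem.split("\n"):
            # Strip punctuation and lowercase, dropping anything left empty.
            words = [re.sub(r"[^\w]", "", w).lower() for w in line.split()]
            words = [w for w in words if w]
            if not words:
                continue
            ids = []
            for word in reversed(words):
                if word not in token_map:
                    token_map[word] = len(token_map)
                ids.append(token_map[word])
            tokenized.append(ids)
    return tokenized, token_map

def reverse_token_map(token_map):
    """Invert the word -> id map so ids can be turned back into words."""
    return {idx: word for word, idx in token_map.items()}

demo = ["From fairest creatures we desire increase,\n"
        "That thereby beauty's rose might never die,\n"]
demo_tokens, demo_map = parse_lines_backwards(demo)
demo_map_r = reverse_token_map(demo_map)

# Each tokenized line starts with the id of its last word.
print([demo_map_r[i] for i in demo_tokens[0]])
```

Storing lines this way means an HMM trained on the sequences learns transitions from each word to the word that precedes it, which is what lets generation start from the rhyming word.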

In [15]:
# Helpful lists
# Syllables
with open("data/Syllable_dictionary.txt", 'r') as syllable_file:
    syllables = [line.split() for line in syllable_file]

# We choose to map words to tuples of lists
# the first list corresponds to the number of syllables if the word were at the end (E)
# the second list corresponds to the number of syllables the word can take anywhere
# E.g. "test": ['E1', '2', '3'] <-> "test": ([1], [2, 3])
syllable_dict = {}
for syllable in syllables:
    # Strip punctuation from the word so it matches our cleaned tokens
    word = re.sub(r'[^\w]', '', syllable[0])
    end_syllable_list = []
    regular_syllable_list = []
    for item in syllable[1:]:
        if item[0] == "E":
            end_syllable_list.append(int(item[1:]))
        else:
            regular_syllable_list.append(int(item))
    syllable_dict[word] = (end_syllable_list, regular_syllable_list)

syllable_dict

{'gainst': ([], [1]),
 'greeing': ([1], [2]),
 'scaped': ([], [1]),
 'tis': ([], [1]),
 'twixt': ([], [1]),
 'a': ([], [1]),
 'adoting': ([2], [3]),
 'abhor': ([], [2]),
 'abide': ([], [2]),
 'able': ([], [2]),
 'about': ([], [2]),
 'above': ([], [2]),
 'absence': ([], [2]),
 'absent': ([], [2]),
 'abundance': ([], [3]),
 'abundant': ([], [3]),
 'abuse': ([], [2]),
 'abused': ([], [2]),
 'abuses': ([], [3]),
 'abysm': ([], [2]),
 'accents': ([], [2]),
 'acceptable': ([], [4]),
 'acceptance': ([], [3]),
 'accessary': ([], [4]),
 'accident': ([], [3]),
 'accidents': ([], [3]),
 'account': ([], [2]),
 'accumulate': ([], [4]),
 'accuse': ([], [2]),
 'accusing': ([], [3]),
 'achieve': ([], [2]),
 'acknowledge': ([], [3]),
 'acquaintance': ([], [3]),
 'acquainted': ([2], [3]),
 'act': ([], [1]),
 'action': ([], [2]),
 'active': ([], [2]),
 'actor': ([], [2]),
 'add': ([], [1]),
 'added': ([], [2]),
 'adders': ([], [2]),
 'addeth': ([], [2]),
 'adding': ([], [2]),
 'addition': ([], [3]),
 'ad
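
To show how these `(end, regular)` tuples could be used when filling a line to a syllable budget, here is a small sketch. The helpers `allowed_syllables` and `can_fill` are hypothetical, and the sketch assumes that an "E" count is an extra option available only to the last word of a line, with the regular counts still allowed there. The entries are copied from the `syllable_dict` output above.

```python
def allowed_syllables(entry, at_line_end):
    """Return the syllable counts a word may take at the given position."""
    end_counts, regular_counts = entry
    if at_line_end:
        # Assumption: end-of-line counts add to, rather than replace, the regular ones.
        return sorted(set(end_counts) | set(regular_counts))
    return list(regular_counts)

def can_fill(words, syllable_dict, target=10):
    """Check whether the words of a line can total exactly `target` syllables."""
    reachable = {0}  # syllable totals reachable after the words seen so far
    for position, word in enumerate(words):
        at_end = position == len(words) - 1
        counts = allowed_syllables(syllable_dict[word], at_end)
        reachable = {total + c for total in reachable for c in counts}
    return target in reachable

demo_syllables = {
    "a": ([], [1]), "abundance": ([], [3]), "about": ([], [2]),
    "accident": ([], [3]), "add": ([], [1]), "acquainted": ([2], [3]),
}

print(can_fill(["a", "abundance", "about", "accident", "add"], demo_syllables))
print(can_fill(["about", "accident", "acquainted"], demo_syllables, target=7))
print(can_fill(["acquainted", "about", "accident"], demo_syllables, target=7))
```

The second and third calls use the same three words, but only the ordering that puts "acquainted" last can use its two-syllable end-of-line reading to reach seven syllables.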

In [16]:
tokenized_syllable_dict = {}
for key in syllable_dict.keys():
    # If the word in syllable_dict is in our token map, add it to our tokenized_syllable_dict
    if key in token_map:
        tokenized_syllable_dict[token_map[key]] = syllable_dict[key]
tokenized_syllable_dict

{478: ([], [1]),
 2593: ([1], [2]),
 2251: ([], [1]),
 923: ([], [1]),
 2017: ([], [1]),
 44: ([], [1]),
 838: ([2], [3]),
 3088: ([], [2]),
 1013: ([], [2]),
 2188: ([], [2]),
 2561: ([], [2]),
 2200: ([], [2]),
 1344: ([], [2]),
 1374: ([], [2]),
 41: ([], [3]),
 2360: ([], [3]),
 214: ([], [2]),
 2151: ([], [2]),
 2720: ([], [3]),
 2552: ([], [2]),
 1923: ([], [2]),
 235: ([], [4]),
 2905: ([], [3]),
 1273: ([], [4]),
 2753: ([], [3]),
 2603: ([], [3]),
 1120: ([], [2]),
 2647: ([], [4]),
 2639: ([], [2]),
 1721: ([], [3]),
 1888: ([], [2]),
 1292: ([], [3]),
 2066: ([], [3]),
 814: ([2], [3]),
 3102: ([], [1]),
 1843: ([], [2]),
 1303: ([], [2]),
 889: ([], [2]),
 1932: ([], [1]),
 2082: ([], [2]),
 2554: ([], [2]),
 2906: ([], [2]),
 845: ([], [2]),
 843: ([], [3]),
 1701: ([], [2]),
 2268: ([], [2]),
 2742: ([], [2]),
 2175: ([], [3]),
 1747: ([], [3]),
 2621: ([], [2]),
 2908: ([], [3]),
 1621: ([], [3]),
 357: ([], [2]),
 2713: ([], [3]),
 2087: ([], [2]),
 1825: ([], [3]),
 12

In [17]:
flattened_tokenized_poems = [val for sublist in tokenized_poems for val in sublist]
hmm = rhyme_unsupervised_HMM(flattened_tokenized_poems, 2, 100)

# Sample seven rhyming line pairs, one for each rhyme in the sonnet
pairs = []
for i in range(7):
    pairs.append(Rhyme_HMM_helper.sample_pair(hmm, token_map, tokenized_syllable_dict, rhyming_dict, num_syllables=10))

print('Rhymed Sonnet:\n====================')
# Interleave the pairs in ABAB CDCD EFEF GG order: (pair index, line within pair)
line_order = [(0, 0), (1, 0), (0, 1), (1, 1),
              (2, 0), (3, 0), (2, 1), (3, 1),
              (4, 0), (5, 0), (4, 1), (5, 1),
              (6, 0), (6, 1)]
for pair_idx, line_idx in line_order:
    print(pairs[pair_idx][line_idx])


Iteration: 10
Iteration: 20
Iteration: 30
Iteration: 40
Iteration: 50
Iteration: 60
Iteration: 70
Iteration: 80
Iteration: 90
Iteration: 100
Rhymed Sonnet:
Why and for oaths not all though great joy so;
Break thee far of by accident are friend;
Love secret star to to mute are the grow;
None limbs your mend the hours untainted end;
Your world to when incertainties with art;
Make the niggard greet most crime newfired shine;
Whose argument me that their age depart;
Wide he to ride enjoys bait in stirred mine;
Much fortune find and be or of self dyed;
Ever sins but streams your let health untrue;
Admitted dost invent and dignified;
Be do work weary prescriptions their you;
Day neer what wait of of waiting am pen;
Ill but sad with brief and purge or not men;


In [18]:
# Repeat the experiment with the forward parser (parse_observations)
# instead of parse_observations_backwards
# token_map maps words to numbers
# tokenized_poems replaces the words in poems with their corresponding number
tokenized_poems, token_map = Rhyme_HMM_helper.parse_observations(final_poems)
token_map_r = Rhyme_HMM_helper.obs_map_reverser(token_map)
flattened_tokenized_poems = [val for sublist in tokenized_poems for val in sublist]
hmm = rhyme_unsupervised_HMM(flattened_tokenized_poems, 2, 100)

# Note: tokenized_syllable_dict was built from the previous token_map and is reused here unchanged
pairs = []
for i in range(7):
    pairs.append(Rhyme_HMM_helper.sample_pair(hmm, token_map, tokenized_syllable_dict, rhyming_dict, num_syllables=10))

print('Rhymed Sonnet:\n====================')
# Interleave the pairs in ABAB CDCD EFEF GG order: (pair index, line within pair)
line_order = [(0, 0), (1, 0), (0, 1), (1, 1),
              (2, 0), (3, 0), (2, 1), (3, 1),
              (4, 0), (5, 0), (4, 1), (5, 1),
              (6, 0), (6, 1)]
for pair_idx, line_idx in line_order:
    print(pairs[pair_idx][line_idx])



Iteration: 10
Iteration: 20
Iteration: 30
Iteration: 40
Iteration: 50
Iteration: 60
Iteration: 70
Iteration: 80
Iteration: 90
Iteration: 100
Rhymed Sonnet:
Change up praise i hours it i poor war;
Self and victors these mud;
Joy hath name must sea that more the bar;
Live minds to thou wastes i bud;
Often line dress bright from being;
Tis sure in breath the first me;
Pain any for back sweet were delights dear seeing;
Grant to such foes so free;
Was say and greet thou mine your perish;
Look too all the thine weary my me elsewhere;
On to prepare do fame and fairest cherish;
With we doing but near;
Whereto profitless be times ornaments;
Play sadly all on beseem happy you you rents;
