In [1]:
import pandas as pd
import numpy as np
import urllib.request
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import LSTM
from keras.callbacks import ModelCheckpoint
from keras.utils import np_utils
np.random.seed(62)
tf.random.set_seed(62)


When you run this notebook, set the hardware accelerator to GPU; the neural net trains much faster that way.

I'm roughly following this tutorial on building recurrent neural networks with Keras:

https://machinelearningmastery.com/text-generation-lstm-recurrent-neural-networks-python-keras/

First I will download Shakespeare's sonnets.

In [2]:
!mkdir dataset

In [3]:
urllib.request.urlretrieve(
    'https://raw.githubusercontent.com/lakigigar/Caltech-CS155-2021/main/projects/project3/data/shakespeare.txt', 
                           './dataset/shakespeare.txt')

('./dataset/shakespeare.txt', <http.client.HTTPMessage at 0x7fab43c54990>)

Next I load the sonnets into memory, convert them to lowercase, strip the number line that heads each sonnet, and append a terminal * character to mark the end of the sonnet.
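
To make the parsing steps concrete, here is a small sketch on a made-up two-poem string. The layout (a number line above each poem, two blank lines between poems) is my assumption, inferred from how the code below splits the file; `split_sonnets` and `toy` are hypothetical names used only for this illustration.

```python
def split_sonnets(text):
    """Lowercase the text, split it into poems, drop each poem's number
    line, and append '*' as an end-of-poem marker."""
    chunks = text.lower().split('\n\n')
    # Every chunk after the first starts with a newline; pad the first to match.
    chunks[0] = '\n' + chunks[0]
    poems = []
    for chunk in chunks:
        body = chunk[1:]                   # drop the leading newline
        first_newline = body.index('\n')   # end of the number line
        poems.append(body[first_newline + 1:] + '*')
    return poems

# Made-up text, not taken from the real file.
toy = ("   1\nFirst poem, line one,\nfirst poem, line two."
       "\n\n\n"
       "   2\nSecond poem, line one,\nsecond poem, line two.")

for poem in split_sonnets(toy):
    print(repr(poem))
```

Each returned string holds only the lowercase poem body and ends with the `*` marker.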

In [4]:
# load ascii text and convert to lowercase
filename = './dataset/shakespeare.txt'
with open(filename, 'r', encoding='utf-8') as f:
  raw_text = f.read()
raw_text = raw_text.lower()
sonnets = raw_text.split('\n\n')
sonnets[0] = '\n' + sonnets[0]
N = len(sonnets)
for i in range(N):
  sonnet = sonnets[i][1:] # extract the sonnet, minus the newline at the beginning
  index = sonnet.index('\n')
  sonnet = sonnet[index + 1:] + "*" # I am using an asterisk to mark the end of a sonnet.
  sonnets[i] = sonnet

sonnets[:10]

["from fairest creatures we desire increase,\nthat thereby beauty's rose might never die,\nbut as the riper should by time decease,\nhis tender heir might bear his memory:\nbut thou contracted to thine own bright eyes,\nfeed'st thy light's flame with self-substantial fuel,\nmaking a famine where abundance lies,\nthy self thy foe, to thy sweet self too cruel:\nthou that art now the world's fresh ornament,\nand only herald to the gaudy spring,\nwithin thine own bud buriest thy content,\nand tender churl mak'st waste in niggarding:\n  pity the world, or else this glutton be,\n  to eat the world's due, by the grave and thee.*",
 "when forty winters shall besiege thy brow,\nand dig deep trenches in thy beauty's field,\nthy youth's proud livery so gazed on now,\nwill be a tattered weed of small worth held:\nthen being asked, where all thy beauty lies,\nwhere all the treasure of thy lusty days;\nto say within thine own deep sunken eyes,\nwere an all-eating shame, and thriftless praise.\nhow m

We need to encode the sonnets numerically, so I map each distinct character to an integer index.

In [5]:
# create mapping of unique chars to integers
# first I need a new string with all the sonnets concatenated, so that the 
# numbers above the sonnets are not included in the vocabulary. 
raw_text = "".join(sonnets)
chars = sorted(set(raw_text))
char_to_int = {c: i for i, c in enumerate(chars)}
int_to_char = {i: c for i, c in enumerate(chars)}

# summarize the loaded data
n_chars = len(raw_text)
n_vocab = len(chars)
print("Total Characters: ", n_chars)
print("Total Vocab: ", n_vocab)

Total Characters:  94290
Total Vocab:  39
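
As a quick sanity check of the mapping, this sketch encodes a made-up string and decodes it again. The string `text` is hypothetical; the dictionaries are built the same way as above.

```python
# Made-up sample text; the real notebook uses the concatenated sonnets.
text = "abba cab*"

chars = sorted(set(text))
char_to_int = {c: i for i, c in enumerate(chars)}
int_to_char = {i: c for i, c in enumerate(chars)}

# Encode to integer indices, then decode back to characters.
encoded = [char_to_int[c] for c in text]
decoded = ''.join(int_to_char[i] for i in encoded)

print(chars)
print(encoded)
print(decoded == text)
```

Because the characters are sorted by code point, the space and `*` get the lowest indices, ahead of the letters.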


For each sonnet, I slide a 40-character window over the text. Each window is one training input and the character that follows it is the prediction target, so the last window of every sonnet has the terminal * character as its target.

In [6]:
# We will predict each character using the preceding 40 characters.
seq_length = 40

# How many entries are we going to have?
entries = 0
for i in range(N):
  entries += len(sonnets[i]) - seq_length

# prepare the dataset of input to output pairs encoded as integers
dataX = np.zeros([entries, seq_length, 1], dtype=np.float32)
dataY = np.zeros(entries)

entry = 0
for i in range(N):
  sonnet = sonnets[i]
  for j in range(seq_length, len(sonnet)):
    # record the character to be predicted
    dataY[entry] = char_to_int[sonnet[j]]

    # record the training vector
    for k in range(seq_length):
      dataX[entry, k, 0] = char_to_int[sonnet[j - seq_length + k]]

    entry += 1

# normalize the input vectors
X = dataX / (n_vocab - 1)

# one hot encode the output variable
Y = np_utils.to_categorical(dataY, num_classes=n_vocab)
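
The windowing is easier to see on a short string. This is a sketch with a window of 4 instead of 40; `make_windows` and `toy` are hypothetical names, and the scaling at the end mirrors the division by `n_vocab - 1` above.

```python
def make_windows(poem, seq_length):
    """Return (input window, next character) pairs for one poem."""
    pairs = []
    for j in range(seq_length, len(poem)):
        pairs.append((poem[j - seq_length:j], poem[j]))
    return pairs

toy = "rose red*"   # made-up poem, already ending in the terminal marker
pairs = make_windows(toy, 4)
for window, target in pairs:
    print(repr(window), '->', repr(target))

# Scale the character indices of the first window into [0, 1].
chars = sorted(set(toy))
scale = len(chars) - 1
scaled = [chars.index(c) / scale for c in pairs[0][0]]
print(scaled)
```

A poem of length L yields L - seq_length pairs, which is the count the `entries` loop accumulates, and the final pair always targets `*`.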

Next I will define the recurrent neural net architecture as a model in Keras. As advised, I will use a single layer of 200 LSTM units, followed by a light dropout layer and an output layer with 39 units (one for each character in the vocabulary) with a softmax activation.

In [7]:
!mkdir checkpoints

In [8]:
def get_model():
  "Returns a keras recurrent neural net model with the desired architecture."
  # define the LSTM model
  model = Sequential()
  model.add(LSTM(200, input_shape=(seq_length, 1)))
  # Apply regularization. This greatly improved model performance. 
  model.add(Dropout(0.1))
  # Add a softmax output layer.
  model.add(Dense(Y.shape[1], activation='softmax'))
  model.compile(loss='categorical_crossentropy', optimizer='adam')
  return model

model = get_model()
# define the checkpoint; monitor='loss' tracks the training loss, not the validation loss
filepath="./checkpoints/weights-improvement-{epoch:02d}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=True, 
                             save_best_only=True, mode='min')
callbacks_list = [checkpoint]
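
For a sense of the model's size, the parameter count can be worked out by hand. This is a sketch assuming the default Keras LSTM and Dense layers with biases and a 39-character vocabulary; I have not checked it against `model.summary()`, and the helper names are hypothetical.

```python
def lstm_params(units, input_dim):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * (units * (input_dim + units) + units)

def dense_params(inputs, outputs):
    # One weight per input-output pair plus one bias per output.
    return inputs * outputs + outputs

lstm = lstm_params(200, 1)     # each time step feeds in one scaled index
dense = dense_params(200, 39)  # 39 = n_vocab
print(lstm, dense, lstm + dense)  # 161600 7839 169439
```

The dropout layer adds no trainable parameters, so nearly all of the capacity sits in the LSTM layer.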

First I will train with a validation set in order to determine the optimal number of epochs. 

In [9]:
# hold out the first N_test examples as a validation set, then fit the model
N_test = 10000
X_train = X[N_test:]
Y_train = Y[N_test:]
X_test = X[:N_test]
Y_test = Y[:N_test]
model.fit(X_train, Y_train, epochs=25, batch_size=128, callbacks=callbacks_list,
          validation_data=(X_test, Y_test))

Epoch 1/25

Epoch 00001: loss improved from inf to 2.98197, saving model to ./checkpoints/weights-improvement-01.hdf5
Epoch 2/25

Epoch 00002: loss improved from 2.98197 to 2.80282, saving model to ./checkpoints/weights-improvement-02.hdf5
Epoch 3/25

Epoch 00003: loss improved from 2.80282 to 2.72487, saving model to ./checkpoints/weights-improvement-03.hdf5
Epoch 4/25

Epoch 00004: loss improved from 2.72487 to 2.67493, saving model to ./checkpoints/weights-improvement-04.hdf5
Epoch 5/25

Epoch 00005: loss improved from 2.67493 to 2.63503, saving model to ./checkpoints/weights-improvement-05.hdf5
Epoch 6/25

Epoch 00006: loss improved from 2.63503 to 2.60394, saving model to ./checkpoints/weights-improvement-06.hdf5
Epoch 7/25

Epoch 00007: loss improved from 2.60394 to 2.57843, saving model to ./checkpoints/weights-improvement-07.hdf5
Epoch 8/25

Epoch 00008: loss improved from 2.57843 to 2.55534, saving model to ./checkpoints/weights-improvement-08.hdf5
Epoch 9/25

Epoch 00009: los

<tensorflow.python.keras.callbacks.History at 0x7faaba297290>

The validation loss bottoms out around epoch 18, so I retrain on the full dataset for 18 epochs.
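
Rather than reading the epoch off the log, it can be picked programmatically. When `validation_data` is passed, the `History` object returned by `model.fit` stores the per-epoch values under `history.history['val_loss']`. The helper below is a hypothetical sketch, and the numbers in `made_up` are invented for illustration, not results from this notebook.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch number with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1

# Invented values, for illustration only.
made_up = [2.9, 2.6, 2.4, 2.35, 2.37, 2.41]
print(best_epoch(made_up))  # 4
```

With a real run, the call would look like `best_epoch(history.history['val_loss'])`, where `history` is the value returned by `model.fit`.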

In [10]:
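model = get_model()
# Note: callbacks_list reuses the ModelCheckpoint from the validation run, and
# the log below shows it still comparing against that run's best loss (2.15177),
# so no new checkpoint files are written during this fit.
model.fit(X, Y, epochs=18, batch_size=128, callbacks=callbacks_list)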

Epoch 1/18

Epoch 00001: loss did not improve from 2.15177
Epoch 2/18

Epoch 00002: loss did not improve from 2.15177
Epoch 3/18

Epoch 00003: loss did not improve from 2.15177
Epoch 4/18

Epoch 00004: loss did not improve from 2.15177
Epoch 5/18

Epoch 00005: loss did not improve from 2.15177
Epoch 6/18

Epoch 00006: loss did not improve from 2.15177
Epoch 7/18

Epoch 00007: loss did not improve from 2.15177
Epoch 8/18

Epoch 00008: loss did not improve from 2.15177
Epoch 9/18

Epoch 00009: loss did not improve from 2.15177
Epoch 10/18

Epoch 00010: loss did not improve from 2.15177
Epoch 11/18

Epoch 00011: loss did not improve from 2.15177
Epoch 12/18

Epoch 00012: loss did not improve from 2.15177
Epoch 13/18

Epoch 00013: loss did not improve from 2.15177
Epoch 14/18

Epoch 00014: loss did not improve from 2.15177
Epoch 15/18

Epoch 00015: loss did not improve from 2.15177
Epoch 16/18

Epoch 00016: loss did not improve from 2.15177
Epoch 17/18

Epoch 00017: loss did not improve fr

<tensorflow.python.keras.callbacks.History at 0x7faab2c449d0>

Keras doesn't have an obvious method to sample the softmax output with a temperature parameter, so I have written my own function for this. 

In [11]:
def sample_softmax(probs, temp):
  # first we invert the exponentiation done in the softmax output
  probs = np.log(probs)
  # Next we divide by the temperature
  probs /= temp
  # We exponentiate again and normalize
  probs = np.exp(probs)
  probs /= np.sum(probs)
  # Finally, we draw a sample from the adjusted distribution. 
  return np.random.choice(len(probs), p=probs)
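
To see what the temperature does to a distribution, here is a standalone sketch of the same rescaling written with only the standard library so it runs outside the notebook. `apply_temperature` and `sample_index` are hypothetical names, and subtracting the largest log-probability before exponentiating is an extra step I added for numerical stability.

```python
import math
import random

def apply_temperature(probs, temp):
    """Rescale a probability vector: p_i ** (1 / temp), renormalized."""
    logits = [math.log(p) / temp if p > 0 else float('-inf') for p in probs]
    peak = max(logits)
    # Subtracting the peak leaves the result unchanged after normalization.
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

def sample_index(probs, temp, rng=random):
    """Draw one index from the temperature-adjusted distribution."""
    adjusted = apply_temperature(probs, temp)
    return rng.choices(range(len(adjusted)), weights=adjusted, k=1)[0]

probs = [0.5, 0.3, 0.2]   # made-up softmax output
for temp in (0.25, 1.0, 4.0):
    print(temp, [round(p, 3) for p in apply_temperature(probs, temp)])
print(sample_index(probs, 0.75))
```

A temperature of 1 leaves the distribution unchanged, values below 1 concentrate mass on the most likely character, and values above 1 flatten it toward uniform.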

We can now load the epoch-18 checkpoint and generate poems one character at a time.

In [17]:
def generate_poem(temp, seed="shall i compare thee to a summer's day?\n"):
  """Returns a poem generated from a recurrent neural network
  trained on Shakespeare's sonnets, seeded with a 40-character
  string 'seed' and sampled at temperature 'temp' (higher values
  give more random output)."""
  # load the network weights
  filename = './checkpoints/weights-improvement-18.hdf5'
  model.load_weights(filename)
  model.compile(loss='categorical_crossentropy', optimizer='adam')

  x = np.zeros([1, seq_length, 1])
  for i in range(seq_length):
    x[0, i, 0] = char_to_int[seed[i]]

  # normalize the input
  x = x / (n_vocab - 1)

  # generate characters. The sonnet should terminate on its own when the model
  # emits '*', but I cap the length at 2500 characters, which is several times
  # the length of an actual sonnet.
  poem = seed
  while poem[-1] != '*' and len(poem) < 2500:
    # predict the probability distribution on the next character
    prediction = model.predict(x, verbose=0)[0]

    # sample the character from the distribution
    index = sample_softmax(prediction, temp)

    # append the new character to the poem
    poem = poem + int_to_char[index]

    # shift the input vector and append the predicted character to the end.
    x[0, 0:-1, 0] = x[0, 1:, 0]
    x[0, -1, 0] = index / (n_vocab - 1)
    
  return poem
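
The control flow of the generation loop can be checked without the trained network. In this sketch, `toy_next_char` is a hypothetical stand-in for the model prediction plus sampling, and `generate` keeps a string window rather than the normalized array used above.

```python
def generate(next_char, seed, seq_length, max_len=200, stop='*'):
    """Grow text one character at a time until `stop` appears or
    the text reaches `max_len` characters."""
    window = seed[-seq_length:]
    text = seed
    while text[-1] != stop and len(text) < max_len:
        ch = next_char(window)
        text += ch
        # Slide the window: drop the oldest character, append the new one.
        window = window[1:] + ch
    return text

def toy_next_char(window):
    # Stand-in "model": always emits the next letter in a fixed sequence.
    alphabet = "abcde*"
    return alphabet[alphabet.index(window[-1]) + 1]

print(generate(toy_next_char, "ab", 2))                 # abcde*
print(generate(lambda w: 'x', "ab", 2, max_len=10))     # abxxxxxxxx
```

The second call shows the length cap taking over when the stand-in never emits the terminal character.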

Now I will generate poems at temperatures 0.25, 0.75, and 1.5.

In [18]:
print(generate_poem(.25))

shall i compare thee to a summer's day?
the pane tfe loeer of toes whet fear she sene,
whet thou mas thet whi heve and thete

 aoe thet i ar for the wire   no shet ie more th thoe th thee whal she hr mene thee 
 nhn toet a aoadn the lore tf thee mo tore 
  mo so the love th thye the lore to thye

 no soee i soml   no whet i ar ane then whech her she ken,
  and thes i woil a aaakt to thee whech whet i srene,
that i soel toetl oo tore and thin what s thet whel whet the sores tf thi seee 
 no whet i soo she tored and thou miv so toewe,
  and thes i soo shee whoh the lase oo tees,
  and thet in thee  and thet thas lens of toaee,
  shet shiu d so teae in thes whech she maser so the sire,
  and thet iear ser the seas io thy soeme,
the lane the seat she more sf tiee whe hane th thee 
  aed thet i soor sh thee whet fer the sire 
  and goe thet sere that thnu whll pooesse that whet i soo mene the lores of thet shi helt,
th thet io thee a aoank so tiat is soeete thet sille thet sers 
aod thet io

In [19]:
print(generate_poem(.75))

shall i compare thee to a summer's day?
to caleyt soamd ane are sho buete nn mo.
thr shyele the iiasee shar tho ruueaissets ardm,
ohs shot noo shift enstgr thd pore si thy danrels,
ant yhis h sas mea aod the peekter or toewi,
 nt waal any thai mh mems to the beler 
 nocg fy ln eore hnr teyued cnd tianr foeae
aannere, thet hosd tem poiets, ard thy shr grise oh,toe a torehne sooue,
thas hhaly io then what weo hes seeee 
 oef avu eu liah me aod mo thmt rt wout,
 nfs haad sias whtt srg
theie shtuel soodd.
woee mi thet worrd shy soaki ro breas 
aot bead i cey ailwc dedat titem hore uoerd if iin 
sheel tiu toeotr oate do the wirldr trats,
when hen thy brnl'else hane ho iaatt no alids 
aeti tterne to mo ooves sy toie wes eeast,
who iav thn gens aet blte woet thou st 
ert mr io c rai pam strrte,faro.
tfo asr dnr oy fesi ot pafs sie ioenn,
tnnn me theu feccr ffn saet bytere tooke 
 nhe pis boset sho shie aatse bads ie shs bruhns,deg, tien iaot lo my srigte'tn it gemt sooe,
aod yhu ans mom tho s

In [20]:
print(generate_poem(1.5))

shall i compare thee to a summer's day?
fol dnr mft govp'add lo iyrbedc mao
*
