# Natural language processing of Shakespearean work

## In this short project I use a recurrent neural network to generate new text from a corpus of Shakespeare's work. The corpus has a very distinctive style: it uses old-style English and much of it is formatted as a stage play, so it will be obvious to us whether the model is able to reproduce similar results.

**The model is built sequentially, following an approach I found in the literature: an Embedding layer (which maps the positive integer assigned to each character to a dense vector of fixed size), a GRU layer (in this case the GRU cell tends to work better than the LSTM cell because of its simpler structure: it has fewer gates and no separate output gate), and finally a Dense layer with one neuron per character class.**

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
import tensorflow as tf

In [3]:
path_to_file = 'shakespeare.txt'

In [4]:
#Open the file in read mode and read it into a single string:
with open(path_to_file, 'r') as f:
    text = f.read()

In [5]:
print(text[:500])


                     1
  From fairest creatures we desire increase,
  That thereby beauty's rose might never die,
  But as the riper should by time decease,
  His tender heir might bear his memory:
  But thou contracted to thine own bright eyes,
  Feed'st thy light's flame with self-substantial fuel,
  Making a famine where abundance lies,
  Thy self thy foe, to thy sweet self too cruel:
  Thou that art now the world's fresh ornament,
  And only herald to the gaudy spring,
  Within thine own bu


In [6]:
#Understand the unique characters from the text
vocab = sorted(set(text))

In [7]:
vocab

['\n',
 ' ',
 '!',
 '"',
 '&',
 "'",
 '(',
 ')',
 ',',
 '-',
 '.',
 '0',
 '1',
 '2',
 '3',
 '4',
 '5',
 '6',
 '7',
 '8',
 '9',
 ':',
 ';',
 '<',
 '>',
 '?',
 'A',
 'B',
 'C',
 'D',
 'E',
 'F',
 'G',
 'H',
 'I',
 'J',
 'K',
 'L',
 'M',
 'N',
 'O',
 'P',
 'Q',
 'R',
 'S',
 'T',
 'U',
 'V',
 'W',
 'X',
 'Y',
 'Z',
 '[',
 ']',
 '_',
 '`',
 'a',
 'b',
 'c',
 'd',
 'e',
 'f',
 'g',
 'h',
 'i',
 'j',
 'k',
 'l',
 'm',
 'n',
 'o',
 'p',
 'q',
 'r',
 's',
 't',
 'u',
 'v',
 'w',
 'x',
 'y',
 'z',
 '|',
 '}']

In [8]:
len(vocab)

84

- So there are 84 unique characters, and these are the classes the model will predict.

- The model cannot work on string characters directly, so I have to assign each one a specific number.

In [9]:
#assign a number to each character
for pair in enumerate(vocab):
    print(pair)

(0, '\n')
(1, ' ')
(2, '!')
(3, '"')
(4, '&')
(5, "'")
(6, '(')
(7, ')')
(8, ',')
(9, '-')
(10, '.')
(11, '0')
(12, '1')
(13, '2')
(14, '3')
(15, '4')
(16, '5')
(17, '6')
(18, '7')
(19, '8')
(20, '9')
(21, ':')
(22, ';')
(23, '<')
(24, '>')
(25, '?')
(26, 'A')
(27, 'B')
(28, 'C')
(29, 'D')
(30, 'E')
(31, 'F')
(32, 'G')
(33, 'H')
(34, 'I')
(35, 'J')
(36, 'K')
(37, 'L')
(38, 'M')
(39, 'N')
(40, 'O')
(41, 'P')
(42, 'Q')
(43, 'R')
(44, 'S')
(45, 'T')
(46, 'U')
(47, 'V')
(48, 'W')
(49, 'X')
(50, 'Y')
(51, 'Z')
(52, '[')
(53, ']')
(54, '_')
(55, '`')
(56, 'a')
(57, 'b')
(58, 'c')
(59, 'd')
(60, 'e')
(61, 'f')
(62, 'g')
(63, 'h')
(64, 'i')
(65, 'j')
(66, 'k')
(67, 'l')
(68, 'm')
(69, 'n')
(70, 'o')
(71, 'p')
(72, 'q')
(73, 'r')
(74, 's')
(75, 't')
(76, 'u')
(77, 'v')
(78, 'w')
(79, 'x')
(80, 'y')
(81, 'z')
(82, '|')
(83, '}')


The logic works, but I'd rather have key-value pairs to feed to the model, so I apply the same logic to build a dictionary that maps each character to its index.

In [61]:
char_to_ind = {char:ind for ind, char in enumerate(vocab)}

In [62]:
char_to_ind

{'\n': 0,
 ' ': 1,
 '!': 2,
 '"': 3,
 '&': 4,
 "'": 5,
 '(': 6,
 ')': 7,
 ',': 8,
 '-': 9,
 '.': 10,
 '0': 11,
 '1': 12,
 '2': 13,
 '3': 14,
 '4': 15,
 '5': 16,
 '6': 17,
 '7': 18,
 '8': 19,
 '9': 20,
 ':': 21,
 ';': 22,
 '<': 23,
 '>': 24,
 '?': 25,
 'A': 26,
 'B': 27,
 'C': 28,
 'D': 29,
 'E': 30,
 'F': 31,
 'G': 32,
 'H': 33,
 'I': 34,
 'J': 35,
 'K': 36,
 'L': 37,
 'M': 38,
 'N': 39,
 'O': 40,
 'P': 41,
 'Q': 42,
 'R': 43,
 'S': 44,
 'T': 45,
 'U': 46,
 'V': 47,
 'W': 48,
 'X': 49,
 'Y': 50,
 'Z': 51,
 '[': 52,
 ']': 53,
 '_': 54,
 '`': 55,
 'a': 56,
 'b': 57,
 'c': 58,
 'd': 59,
 'e': 60,
 'f': 61,
 'g': 62,
 'h': 63,
 'i': 64,
 'j': 65,
 'k': 66,
 'l': 67,
 'm': 68,
 'n': 69,
 'o': 70,
 'p': 71,
 'q': 72,
 'r': 73,
 's': 74,
 't': 75,
 'u': 76,
 'v': 77,
 'w': 78,
 'x': 79,
 'y': 80,
 'z': 81,
 '|': 82,
 '}': 83}

In [63]:
char_to_ind['Z']

51

Later on I will also need to look up a character by its index, so I simply convert the 'vocab' list to a NumPy array.

In [59]:
ind_to_char = np.array(vocab)

In [60]:
ind_to_char[51]

'Z'

- So the lookup now works both ways. The next step is to encode the whole text as integers (indices).
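
A quick way to check that the two lookups agree is a round trip on a toy string. The sketch below is plain Python and uses a hypothetical `sample` string instead of the full corpus; `toy_vocab`, `toy_char_to_ind`, `toy_ind_to_char`, `encode` and `decode` are illustrative names that do not appear in the notebook.

```python
# Toy round trip: characters -> indices -> characters.
sample = "thy sweet self"          # hypothetical stand-in for the corpus

toy_vocab = sorted(set(sample))    # unique characters, in a stable order
toy_char_to_ind = {ch: i for i, ch in enumerate(toy_vocab)}
toy_ind_to_char = toy_vocab        # a list already maps index -> character

def encode(s):
    # Map every character to its integer index.
    return [toy_char_to_ind[ch] for ch in s]

def decode(indices):
    # Map every index back to its character and join the result.
    return "".join(toy_ind_to_char[i] for i in indices)

encoded = encode(sample)
print(encoded)
print(decode(encoded) == sample)
```

If the second line printed is `True`, encoding and decoding are consistent with each other.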

In [65]:
#Encode the whole text, mapping each character to its index:
encoded_text = np.array([char_to_ind[char] for char in text])

In [16]:
encoded_text.shape

(5445609,)

- The text dataset is large: more than 5.4 million characters.

In [66]:
text[:700]

"\n                     1\n  From fairest creatures we desire increase,\n  That thereby beauty's rose might never die,\n  But as the riper should by time decease,\n  His tender heir might bear his memory:\n  But thou contracted to thine own bright eyes,\n  Feed'st thy light's flame with self-substantial fuel,\n  Making a famine where abundance lies,\n  Thy self thy foe, to thy sweet self too cruel:\n  Thou that art now the world's fresh ornament,\n  And only herald to the gaudy spring,\n  Within thine own bud buriest thy content,\n  And tender churl mak'st waste in niggarding:\n    Pity the world, or else this glutton be,\n    To eat the world's due, by the grave and thee.\n\n\n                     2\n  When fo"

In [67]:
encoded_text[:700]

array([ 0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
        1,  1,  1,  1,  1, 12,  0,  1,  1, 31, 73, 70, 68,  1, 61, 56, 64,
       73, 60, 74, 75,  1, 58, 73, 60, 56, 75, 76, 73, 60, 74,  1, 78, 60,
        1, 59, 60, 74, 64, 73, 60,  1, 64, 69, 58, 73, 60, 56, 74, 60,  8,
        0,  1,  1, 45, 63, 56, 75,  1, 75, 63, 60, 73, 60, 57, 80,  1, 57,
       60, 56, 76, 75, 80,  5, 74,  1, 73, 70, 74, 60,  1, 68, 64, 62, 63,
       75,  1, 69, 60, 77, 60, 73,  1, 59, 64, 60,  8,  0,  1,  1, 27, 76,
       75,  1, 56, 74,  1, 75, 63, 60,  1, 73, 64, 71, 60, 73,  1, 74, 63,
       70, 76, 67, 59,  1, 57, 80,  1, 75, 64, 68, 60,  1, 59, 60, 58, 60,
       56, 74, 60,  8,  0,  1,  1, 33, 64, 74,  1, 75, 60, 69, 59, 60, 73,
        1, 63, 60, 64, 73,  1, 68, 64, 62, 63, 75,  1, 57, 60, 56, 73,  1,
       63, 64, 74,  1, 68, 60, 68, 70, 73, 80, 21,  0,  1,  1, 27, 76, 75,
        1, 75, 63, 70, 76,  1, 58, 70, 69, 75, 73, 56, 58, 75, 60, 59,  1,
       75, 70,  1, 75, 63

In [68]:
print(text[:700])


                     1
  From fairest creatures we desire increase,
  That thereby beauty's rose might never die,
  But as the riper should by time decease,
  His tender heir might bear his memory:
  But thou contracted to thine own bright eyes,
  Feed'st thy light's flame with self-substantial fuel,
  Making a famine where abundance lies,
  Thy self thy foe, to thy sweet self too cruel:
  Thou that art now the world's fresh ornament,
  And only herald to the gaudy spring,
  Within thine own bud buriest thy content,
  And tender churl mak'st waste in niggarding:
    Pity the world, or else this glutton be,
    To eat the world's due, by the grave and thee.


                     2
  When fo


In natural language processing it is important to feed the text to the model in sequences. Each sequence should be long enough to capture the structure of the text, but not too long, otherwise training becomes slow and difficult.

In this work of Shakespeare's, about 3 lines are enough to capture the basic structure, because every second line rhymes and the lines follow a regular layout. So 3 lines is a reasonable length for one training sequence.

In [20]:
lines = '''From fairest creatures we desire increase,
  That thereby beauty's rose might never die,
  But as the riper should by time decease,'''

In [21]:
len(lines)

131

In [22]:
seq_len = 120 #Close to the 131 measured above; given all the whitespace, 120 is a reasonable choice

In [23]:
#Number of full chunks of seq_len + 1 characters (the extra character is needed because the target is the input shifted by one)
total_num_seq = len(text) // (seq_len+1)

In [24]:
total_num_seq

45005
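
The arithmetic behind this count can be checked by hand. The sketch below only uses the text length reported by `encoded_text.shape` above; `chunk_len`, `full_chunks` and `leftover` are illustrative names.

```python
text_len = 5_445_609       # length reported by encoded_text.shape above
seq_len = 120
chunk_len = seq_len + 1    # 120 inputs plus 1 extra character for the shifted target

# divmod returns the number of full chunks and the characters left over.
full_chunks, leftover = divmod(text_len, chunk_len)
print(full_chunks)   # 45005
print(leftover)      # 4
```

So the text splits into 45005 full chunks of 121 characters, with 4 characters left over at the end.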

The next step is to create the actual training sequences as a dataset. Luckily TensorFlow has a Dataset object that can be used for exactly this kind of task.

In [25]:
char_dataset = tf.data.Dataset.from_tensor_slices(encoded_text)

In [26]:
type(char_dataset)

tensorflow.python.data.ops.from_tensor_slices_op.TensorSliceDataset

In [69]:
#Preview the first 700 elements of the dataset (one integer index per character):
for item in char_dataset.take(700):
    print(item.numpy())

0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
12
0
1
1
31
73
70
68
1
61
56
64
73
60
74
75
1
58
73
60
56
75
76
73
60
74
1
78
60
1
59
60
74
64
73
60
1
64
69
58
73
60
56
74
60
8
0
1
1
45
63
56
75
1
75
63
60
73
60
57
80
1
57
60
56
76
75
80
5
74
1
73
70
74
60
1
68
64
62
63
75
1
69
60
77
60
73
1
59
64
60
8
0
1
1
27
76
75
1
56
74
1
75
63
60
1
73
64
71
60
73
1
74
63
70
76
67
59
1
57
80
1
75
64
68
60
1
59
60
58
60
56
74
60
8
0
1
1
33
64
74
1
75
60
69
59
60
73
1
63
60
64
73
1
68
64
62
63
75
1
57
60
56
73
1
63
64
74
1
68
60
68
70
73
80
21
0
1
1
27
76
75
1
75
63
70
76
1
58
70
69
75
73
56
58
75
60
59
1
75
70
1
75
63
64
69
60
1
70
78
69
1
57
73
64
62
63
75
1
60
80
60
74
8
0
1
1
31
60
60
59
5
74
75
1
75
63
80
1
67
64
62
63
75
5
74
1
61
67
56
68
60
1
78
64
75
63
1
74
60
67
61
9
74
76
57
74
75
56
69
75
64
56
67
1
61
76
60
67
8
0
1
1
38
56
66
64
69
62
1
56
1
61
56
68
64
69
60
1
78
63
60
73
60
1
56
57
76
69
59
56
69
58
60
1
67
64
60
74
8
0
1
1
45
63
80
1
74
60
67
61
1
75
63
80
1
61
70
60
8
1
75
70
1
75
63


In [70]:
# as characters
for item in char_dataset.take(700):
    print(ind_to_char[item.numpy()])



 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1


 
 
F
r
o
m
 
f
a
i
r
e
s
t
 
c
r
e
a
t
u
r
e
s
 
w
e
 
d
e
s
i
r
e
 
i
n
c
r
e
a
s
e
,


 
 
T
h
a
t
 
t
h
e
r
e
b
y
 
b
e
a
u
t
y
'
s
 
r
o
s
e
 
m
i
g
h
t
 
n
e
v
e
r
 
d
i
e
,


 
 
B
u
t
 
a
s
 
t
h
e
 
r
i
p
e
r
 
s
h
o
u
l
d
 
b
y
 
t
i
m
e
 
d
e
c
e
a
s
e
,


 
 
H
i
s
 
t
e
n
d
e
r
 
h
e
i
r
 
m
i
g
h
t
 
b
e
a
r
 
h
i
s
 
m
e
m
o
r
y
:


 
 
B
u
t
 
t
h
o
u
 
c
o
n
t
r
a
c
t
e
d
 
t
o
 
t
h
i
n
e
 
o
w
n
 
b
r
i
g
h
t
 
e
y
e
s
,


 
 
F
e
e
d
'
s
t
 
t
h
y
 
l
i
g
h
t
'
s
 
f
l
a
m
e
 
w
i
t
h
 
s
e
l
f
-
s
u
b
s
t
a
n
t
i
a
l
 
f
u
e
l
,


 
 
M
a
k
i
n
g
 
a
 
f
a
m
i
n
e
 
w
h
e
r
e
 
a
b
u
n
d
a
n
c
e
 
l
i
e
s
,


 
 
T
h
y
 
s
e
l
f
 
t
h
y
 
f
o
e
,
 
t
o
 
t
h
y
 
s
w
e
e
t
 
s
e
l
f
 
t
o
o
 
c
r
u
e
l
:


 
 
T
h
o
u
 
t
h
a
t
 
a
r
t
 
n
o
w
 
t
h
e
 
w
o
r
l
d
'
s
 
f
r
e
s
h
 
o
r
n
a
m
e
n
t
,


 
 
A
n
d
 
o
n
l
y
 
h
e
r
a
l
d
 
t
o
 
t
h
e
 
g
a
u
d
y
 
s
p
r
i
n
g
,


 
 
W
i
t
h
i
n
 
t
h
i
n
e
 
o
w
n
 
b
u


In [29]:
sequences = char_dataset.batch(seq_len + 1, drop_remainder=True) 

#drop_remainder=True drops the final incomplete chunk (the text length, 5445609, is not divisible by 121)

In [30]:
#Create target text sequence
def create_seq_targets(seq):
    input_txt = seq[:-1] 
    target_txt = seq[1:] 
    return input_txt, target_txt

This function shifts the sequence by one character: the target is the input moved one step into the future, so at every position the model learns to predict the next character.

Example: Hello my name i -----> ello my name is
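
The same slicing can be tried on a plain Python string. This is only an illustration of the shift, not part of the pipeline; `split_input_target` and `chunk` are hypothetical names.

```python
def split_input_target(seq):
    # Same slicing as create_seq_targets, applied to a plain Python string.
    return seq[:-1], seq[1:]

chunk = "Hello my name is"
input_txt, target_txt = split_input_target(chunk)

# At every position the target is the character that follows the input.
for x, y in list(zip(input_txt, target_txt))[:5]:
    print(repr(x), "->", repr(y))
```

Both halves are one character shorter than the chunk, which is why each chunk is cut with `seq_len + 1` characters.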
         

The function above handles a single sequence, so I need to map it over all the sequences in my dataset.

In [31]:
dataset = sequences.map(create_seq_targets)

In [32]:
#Example of how this looks for 1 sequence:
for input_txt, target_txt in dataset.take(1):
    print(input_txt.numpy())
    print(''.join(ind_to_char[input_txt.numpy()]))
    print('\n')
    print(target_txt.numpy())
    # There is an extra whitespace!
    print(''.join(ind_to_char[target_txt.numpy()]))

[ 0  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 12  0
  1  1 31 73 70 68  1 61 56 64 73 60 74 75  1 58 73 60 56 75 76 73 60 74
  1 78 60  1 59 60 74 64 73 60  1 64 69 58 73 60 56 74 60  8  0  1  1 45
 63 56 75  1 75 63 60 73 60 57 80  1 57 60 56 76 75 80  5 74  1 73 70 74
 60  1 68 64 62 63 75  1 69 60 77 60 73  1 59 64 60  8  0  1  1 27 76 75]

                     1
  From fairest creatures we desire increase,
  That thereby beauty's rose might never die,
  But


[ 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 12  0  1
  1 31 73 70 68  1 61 56 64 73 60 74 75  1 58 73 60 56 75 76 73 60 74  1
 78 60  1 59 60 74 64 73 60  1 64 69 58 73 60 56 74 60  8  0  1  1 45 63
 56 75  1 75 63 60 73 60 57 80  1 57 60 56 76 75 80  5 74  1 73 70 74 60
  1 68 64 62 63 75  1 69 60 77 60 73  1 59 64 60  8  0  1  1 27 76 75  1]
                     1
  From fairest creatures we desire increase,
  That thereby beauty's rose might never die,
  But 


The output above shows what this function does to our text. In the numeric output you can see that the leading 0 (the newline) disappears from the target and a 1 (a space) is appended at the end, which is why the difference is hard to see in the character output.

In [33]:
#Generate the actual training batches that the model receives
batch_size = 128

- The dataset needs to be shuffled so that the model does not see the sequences in their original order, which helps it train well. This matters all the more because I am not going to use an extremely complicated model: honestly, my laptop can't handle one, given the very large number of entries.

In [34]:
buffer_size = 10000

dataset = dataset.shuffle(buffer_size).batch(batch_size, drop_remainder=True)
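
Since `buffer_size` (10000) is smaller than the number of sequences (45005), the shuffle is only approximate: elements are drawn at random from a buffer that slides over the data, not from the whole dataset at once. The sketch below imitates that idea in plain Python. It is not TensorFlow's actual implementation, and `buffered_shuffle` is a hypothetical helper.

```python
import random

def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in approximately shuffled order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer.
            buffer.append(item)
        else:
            # Emit a random buffered element and put the new item in its place.
            idx = rng.randrange(buffer_size)
            yield buffer[idx]
            buffer[idx] = item
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer

shuffled = list(buffered_shuffle(range(20), buffer_size=5))
print(shuffled)
```

With a small buffer, elements can only move a limited distance from their original position. A buffer at least as large as the dataset would give a full shuffle.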

In [35]:
dataset

<BatchDataset element_spec=(TensorSpec(shape=(128, 120), dtype=tf.int32, name=None), TensorSpec(shape=(128, 120), dtype=tf.int32, name=None))>

We can now clearly see that each batch the dataset feeds to the model contains 128 input sequences of 120 characters each, plus a target of the same shape.
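
From these shapes we can also work out how many batches one pass over the data contains. The sketch below is plain arithmetic on the sequence count computed earlier; `steps_per_epoch` and `dropped` are illustrative names.

```python
num_sequences = 45_005   # total_num_seq computed earlier
batch_size = 128

# With drop_remainder=True the final partial batch is discarded.
steps_per_epoch, dropped = divmod(num_sequences, batch_size)
print(steps_per_epoch)   # 351
print(dropped)           # 77
```

So one epoch consists of 351 batches, and 77 sequences are left out of each pass. Because of the shuffle, these need not be the same 77 every time.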

Next I'm going to declare a few variables that are needed to build the model; they are fairly self-explanatory:

In [36]:
vocab_size = len(vocab)

In [37]:
vocab_size

84

In [38]:
#Choose embedding dimensions in the range of the vocab_size, but not extremely large
embed_dim = 64

In [39]:
#There will be a single recurrent layer, so I give it many units so that the model has enough capacity to train well
rnn_neurons = 1026

The next step can be a bit tricky. I want to use sparse_categorical_crossentropy because my labels are integer class indices (not one-hot encoded vectors). By default this loss function has its 'from_logits' parameter set to False, meaning it expects the predictions to already be probabilities. The final Dense layer of my model has no softmax activation, so it outputs raw logits, and I need 'from_logits' set to True. Since the loss passed to compile is called with only y_true and y_pred, I wrap it in a small function that fixes this parameter.

In [40]:
from tensorflow.keras.losses import sparse_categorical_crossentropy

In [41]:
def sparse_cat_loss(y_true, y_pred):
    return sparse_categorical_crossentropy(y_true, y_pred, from_logits=True) 
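
To make the `from_logits=True` setting concrete, the sketch below computes the same kind of quantity by hand for a single prediction: a softmax over raw scores, followed by the negative log of the probability assigned to the true class index. It is plain Python for illustration only; `softmax` and `sparse_ce_from_logits` are hypothetical helpers, not Keras functions.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_ce_from_logits(true_index, logits):
    # "Sparse": the label is an integer class index, not a one-hot vector.
    probs = softmax(logits)
    return -math.log(probs[true_index])

logits = [2.0, 0.5, -1.0]   # raw, unnormalised scores for 3 classes
print(softmax(logits))
print(sparse_ce_from_logits(0, logits))   # lower loss: class 0 has the top score
print(sparse_ce_from_logits(2, logits))   # higher loss: class 2 has the lowest score
```

If the scores were treated as probabilities (the `from_logits=False` case), the softmax step would be skipped, which is not what we want for a Dense layer without an activation.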

In [42]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,Embedding,GRU

In [43]:
def create_model(vocab_size, embed_dim, rnn_neurons, batch_size):

    model = Sequential()

    #Embedding layer: maps each character index to a dense vector of size embed_dim.
    #The batch size is fixed here because the GRU below is stateful.
    model.add(Embedding(vocab_size, embed_dim, batch_input_shape=[batch_size, None]))

    #GRU layer. The recurrent initializer defaults to 'orthogonal',
    #but I read that Glorot (Xavier) uniform initialization works better.
    model.add(GRU(rnn_neurons, return_sequences=True, stateful=True,
                  recurrent_initializer='glorot_uniform'))

    #Output layer: one logit per character class (no softmax, hence from_logits=True in the loss)
    model.add(Dense(vocab_size))

    #Adam is a common default optimizer that usually performs well.
    #This is why the loss was wrapped earlier: from_logits cannot be set here.
    model.compile('adam', loss=sparse_cat_loss)

    return model

    

In [44]:
#Create the model

model = create_model(vocab_size=vocab_size, embed_dim=embed_dim,
                    rnn_neurons=rnn_neurons,
                    batch_size=batch_size)

In [45]:
#Run the untrained model on one batch, to show that its predictions are essentially random

for input_example_batch, target_example_batch in dataset.take(1):
    
    example_batch_predictions = model(input_example_batch)
    

In [46]:
example_batch_predictions.shape

TensorShape([128, 120, 84])

In [47]:
example_batch_predictions[0] #these are the logits (unnormalised scores) that the model assigns to each possible next character at each position

<tf.Tensor: shape=(120, 84), dtype=float32, numpy=
array([[ 0.00063192, -0.00364161, -0.00174475, ..., -0.00661242,
        -0.00419879, -0.00387674],
       [-0.00342538, -0.00333595, -0.00148729, ..., -0.00127263,
        -0.00338095, -0.00265343],
       [ 0.00429707, -0.00058925, -0.00121954, ..., -0.00205248,
        -0.00184891, -0.0065268 ],
       ...,
       [-0.00096255, -0.00537518, -0.00208286, ...,  0.00387456,
         0.00804729, -0.00029055],
       [-0.00174261, -0.00204345, -0.00303095, ...,  0.0023606 ,
         0.00609189,  0.00099497],
       [ 0.00366283,  0.00192956, -0.00114651, ...,  0.00050182,
         0.00175337, -0.00514684]], dtype=float32)>

In [48]:
#Sample one character index per position from the categorical distribution defined by the logits
sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples = 1)

In [49]:
sampled_indices

<tf.Tensor: shape=(120, 1), dtype=int64, numpy=
array([[58],
       [61],
       [38],
       [26],
       [35],
       [25],
       [ 1],
       [20],
       [61],
       [58],
       [32],
       [39],
       [62],
       [36],
       [40],
       [ 7],
       [72],
       [65],
       [21],
       [46],
       [78],
       [75],
       [64],
       [81],
       [74],
       [17],
       [52],
       [51],
       [70],
       [24],
       [77],
       [71],
       [70],
       [26],
       [56],
       [21],
       [20],
       [ 1],
       [76],
       [74],
       [38],
       [10],
       [34],
       [19],
       [32],
       [56],
       [79],
       [36],
       [59],
       [ 3],
       [39],
       [56],
       [68],
       [55],
       [70],
       [69],
       [49],
       [54],
       [21],
       [62],
       [82],
       [70],
       [65],
       [57],
       [56],
       [83],
       [67],
       [ 8],
       [17],
       [ 1],
       [21],
       [68],
       [19],
   

In [50]:
#Drop the last dimension so that the indices can be used to look up the characters
sampled_indices = tf.squeeze(sampled_indices, axis = -1).numpy()

In [51]:
sampled_indices

array([58, 61, 38, 26, 35, 25,  1, 20, 61, 58, 32, 39, 62, 36, 40,  7, 72,
       65, 21, 46, 78, 75, 64, 81, 74, 17, 52, 51, 70, 24, 77, 71, 70, 26,
       56, 21, 20,  1, 76, 74, 38, 10, 34, 19, 32, 56, 79, 36, 59,  3, 39,
       56, 68, 55, 70, 69, 49, 54, 21, 62, 82, 70, 65, 57, 56, 83, 67,  8,
       17,  1, 21, 68, 19, 54, 60, 40, 11, 20, 79,  4, 75, 63, 81, 67, 62,
       68, 28, 37,  7, 43, 47,  4, 45, 35, 36, 44, 46, 53, 45,  2, 16, 29,
       80, 20, 51, 36, 74, 12, 42, 21, 60,  5, 55, 58, 18, 14, 40,  3, 42,
       48], dtype=int64)

In [52]:
ind_to_char[sampled_indices] #a bunch of random predictions

array(['c', 'f', 'M', 'A', 'J', '?', ' ', '9', 'f', 'c', 'G', 'N', 'g',
       'K', 'O', ')', 'q', 'j', ':', 'U', 'w', 't', 'i', 'z', 's', '6',
       '[', 'Z', 'o', '>', 'v', 'p', 'o', 'A', 'a', ':', '9', ' ', 'u',
       's', 'M', '.', 'I', '8', 'G', 'a', 'x', 'K', 'd', '"', 'N', 'a',
       'm', '`', 'o', 'n', 'X', '_', ':', 'g', '|', 'o', 'j', 'b', 'a',
       '}', 'l', ',', '6', ' ', ':', 'm', '8', '_', 'e', 'O', '0', '9',
       'x', '&', 't', 'h', 'z', 'l', 'g', 'm', 'C', 'L', ')', 'R', 'V',
       '&', 'T', 'J', 'K', 'S', 'U', ']', 'T', '!', '5', 'D', 'y', '9',
       'Z', 'K', 's', '1', 'Q', ':', 'e', "'", '`', 'c', '7', '3', 'O',
       '"', 'Q', 'W'], dtype='<U1')

- The cells above show that, without training, the model gives entirely random predictions. They also serve as a baseline against which I can compare the text generated after training.
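
The untrained logits shown above are all close to zero, so the implied distribution over the 84 characters is close to uniform. A useful reference point is the cross-entropy of an exactly uniform guess. The value below is a back-of-the-envelope calculation, not a loss measured in this notebook, and `uniform_loss` is an illustrative name.

```python
import math

vocab_size = 84
# A model that spreads probability evenly gives each character 1/84.
uniform_loss = -math.log(1.0 / vocab_size)
print(round(uniform_loss, 2))   # 4.43
```

A training loss well below this value would indicate that the model has learned something beyond random guessing.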

In [54]:
#Train the model
# epochs = 30

# model.fit(dataset, epochs = epochs)

Training takes a really long time on my GPU, so I had someone run my model on Google Colab with a better GPU, and I will evaluate the already trained model.

In [55]:
from tensorflow.keras.models import load_model

- Before loading the trained model, I want to change the input shape: the model should process a single sequence at a time (batch size 1), not 128 as before

- That is why I rebuild the model with batch_size = 1 and load only its weights

- And I want to create a function that generates text based on the outputs of the model

In [56]:
model = create_model(vocab_size, embed_dim, rnn_neurons, batch_size = 1)

model.load_weights('shakespeare_model.h5')

model.build(tf.TensorShape([1,None]))

In [57]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_1 (Embedding)     (1, None, 64)             5376      
                                                                 
 gru_1 (GRU)                 (1, None, 1026)           3361176   
                                                                 
 dense_1 (Dense)             (1, None, 84)             86268     
                                                                 
Total params: 3,452,820
Trainable params: 3,452,820
Non-trainable params: 0
_________________________________________________________________
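
The parameter counts in this summary can be reproduced by hand. The sketch below assumes the Keras GRU default `reset_after=True`, under which each of the three weight blocks carries two bias vectors; with that assumption the numbers match the summary exactly. The variable names ending in `_params` are illustrative.

```python
vocab_size, embed_dim, rnn_neurons = 84, 64, 1026

# Embedding: one vector of size embed_dim per character.
embedding_params = vocab_size * embed_dim

# GRU: 3 weight blocks (update gate, reset gate, candidate state), each with
# input weights, recurrent weights and, assuming reset_after=True, two biases.
gru_params = 3 * (rnn_neurons * (embed_dim + rnn_neurons) + 2 * rnn_neurons)

# Dense: one weight per (GRU unit, character) pair plus one bias per character.
dense_params = rnn_neurons * vocab_size + vocab_size

print(embedding_params)   # 5376
print(gru_params)         # 3361176
print(dense_params)       # 86268
print(embedding_params + gru_params + dense_params)   # 3452820
```

Almost all of the parameters sit in the GRU layer, since its recurrent weights grow with the square of the number of units.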


In [77]:
#Creating the function that generates text:
def text_generator(model, start_seed, gen_size = 700, temp = 0.9):
    
    #Number of characters to generate (could be really anything)
    
    num_generate = gen_size
    
    #For every character I will transform it to index and create a list of those
    #Basically vectorizing the starting seed text
    input_txt = [char_to_ind[i] for i in start_seed]
    
    #Expand to match the batch format shape
    
    input_txt = tf.expand_dims(input_txt, 0)
    
    #Empty list to hold the generated text:
    
    gen_text = []
    
    #The temperature is a parameter that affects the randomness of the resulting text:
    
    #it rescales the logits, and so the probabilities of the next character
    
    temperature = temp
    
    #Reset the hidden state of the stateful GRU before generating
    
    model.reset_states()
    
    for nr in range(num_generate):
        
        #Generate predictions
        
        preds = model(input_txt)
        
        #I have to remove the batch shape dimension (reverse the expand command)
        preds = tf.squeeze(preds, 0)
        
        #I want to use a categorical distribution to select the next character
        predictions = preds / temperature
        predicted_ind = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()
        
        #Passing the predicted character as the next input
        
        input_txt = tf.expand_dims([predicted_ind], 0)
        
        #Transforming back to character (not index)
        
        gen_text.append(ind_to_char[predicted_ind])
        
    return (start_seed + ''.join(gen_text))
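
To see what dividing the logits by the temperature does, the sketch below applies a softmax to the same scores at three temperatures. It is plain Python with hypothetical values; `softmax_with_temperature` is an illustrative helper and is not used by `text_generator`, which samples directly from the scaled logits.

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide the logits by the temperature, then apply a stable softmax.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]   # hypothetical scores for three characters
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

A temperature below 1 sharpens the distribution, so the most likely character is chosen more often. A temperature above 1 flattens it, which gives more varied but less reliable text.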

For the last part let's actually generate text with this model:

- we can pass in any seed text, as long as it only uses characters from the vocabulary
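
Because `text_generator` looks up every seed character in `char_to_ind`, a character that never occurs in the corpus would raise a `KeyError`. A small check like the one below can catch that before generating. It is a sketch using a hypothetical `toy_vocab`; in the notebook one would pass `char_to_ind` as the second argument. `unknown_chars` is an illustrative helper.

```python
def unknown_chars(seed, known_chars):
    """Return the characters of `seed` that are missing from `known_chars`, in order."""
    missing = []
    for ch in seed:
        if ch not in known_chars and ch not in missing:
            missing.append(ch)
    return missing

toy_vocab = set("abcdefghijklmnopqrstuvwxyz ")   # hypothetical small vocabulary
print(unknown_chars("love", toy_vocab))      # []
print(unknown_chars("love #1", toy_vocab))   # ['#', '1']
```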

In [78]:
print(text_generator(model, 'love', gen_size = 700))

lovery for one foot
    faintes driven by a matter Prosecubber, he would not be.
  POLIXENES. I shall?  
  STEPHANO. But that the love is too young players of this form.
  JESSICE. I would I had by it be here went Troy, sweet bind win
    A most indulated flower mutiny.
  LADY MACBETH. Your answer, sir.
  AUTOLYCUS. Very true. And you will do it, and hear to the sheep -
    I'll to the Queen, and the ships for his lord,
    Which makes him gasp as tenderly away.
  ROSALINE. Who were best been sickness?
  PAULINA. Ay, and look so. His face is like that sleeve.
  SATURNINUS. A goodly hard hour mine commotion,
    In some e both present money. I,
    Make this forever, which not the ship lies let h


## From my point of view, it's really cool that such a simple model can learn character by character and generate Shakespearean text: it sets off character names in all caps, follows them with what they say, and the sentences largely make sense. Of course there are made-up words and typos, but all in all I am satisfied with the text generation of this model.

# Thank you for your attention, hope you enjoyed it!