# Neural Net Models
I ran this notebook in Google Colab, but some of the models can be loaded from saved weights in the ./models folder. I don't have all of them, though: I had implemented checkpoints in the model callbacks and didn't realize that the filename I passed was saving the weights to the local runtime instance rather than to my Drive folder.

In [None]:
import pandas as pd
import numpy as np
# import re
import pickle as pkl
from nltk.tokenize import word_tokenize
from gensim.models import KeyedVectors
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
# Neural Net Layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Bidirectional, Embedding
# Neural Net Training
from tensorflow.keras.models import load_model
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

In [None]:
import nltk
nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

In [None]:
# from google.colab import drive
# drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
with open('/content/drive/My Drive/datasets/text_with_placeholders.pkl', 'rb') as f:
    text = pkl.load(f)

In [None]:
with open('/content/drive/My Drive/datasets/keep_upper_case.pkl', 'rb') as f:
    keep_upper_case = pkl.load(f)

In [None]:
text_pars = []
for sublist in text:
    text_pars.append([*['-BGP-']*10, *word_tokenize(sublist), *['-ENP-']*10])

If I weren't running this in Colab, the cell below would have lived in a module. However, I didn't want to deal with the hassle of creating a module, uploading it to Drive, and mounting it correctly just for this one cell. (Note: it's not a function because it only needs to be run once per kernel.)  
This cell does two things. First, it lowercases anything that's not in the ```keep_upper_case``` set, which consists of terms like ```Mr.``` and ```Park``` that I want to keep capitalized. It also leaves a single letter followed by a period capitalized (unless that letter is I), as there are several places where Austen refers to characters by initials, and I didn't add these to my proper nouns because they were too difficult to sub out.  
Second, it fixes tokenizing errors from the ```word_tokenize``` function. A few examples: Austen uses ```&c.``` for etc., and ```word_tokenize``` splits this into 2-3 tokens depending on context. The term ```d'ye``` (as in "how d'ye do") is split into ```d``` and ```'ye```. Since ```d'``` is the part of the contraction that replaces the word "do", I moved the apostrophe back over.

In [None]:
#Don't lowercase initials (i.e. M.D.)
import re
r = re.compile('-[A-Z]{3}')
remove_indices = []
for i, doc in enumerate(text_pars):
    for j, token in enumerate(doc):
        if token.count('-') < 2 and token not in keep_upper_case and \
            token.find('.') <= 0 and len(token) > 2 and \
            token not in ['c', 'c.', 'd'] \
            and not(len(token)==1 and token.isupper()):
            text_pars[i][j] = token.lower()
        elif token.count('-') < 2 and (token.isupper() or token.islower()):
            if token in ['A', 'a', 'I', 'I.'] or \
            (token[-1] == '.' and len(token) > 2):
                if r.fullmatch(token):
                    text_pars[i][j] = token + '-'
                    text_pars[i][j+1] = '--'
                else:
                    text_pars[i][j] = token.lower()
            elif 'c' in token:
                remove_indices.append((i, j-1))
                text_pars[i][j] = '&c.'
                if text_pars[i][j+1] == '.':
                    remove_indices.append((i, j+1))
            elif token == 'd' or token == 'D':
                if text_pars[i][j+1] == '-':
                    text_pars[i][j] = 'd--m'
                    remove_indices.append((i, j+1))
                else:
                    # word_tokenize yields 'd' and "'ye"; move the apostrophe back
                    text_pars[i][j] = 'd\''
                    text_pars[i][j+1] = 'ye'
            elif token == 'G' and text_pars[i][j-1].lower() == 'by':
                text_pars[i][j] = 'God'
                remove_indices.append((i, j+1))
            elif '.' not in token and len(token) > 1:
                text_pars[i][j] = token.lower()
            elif token.isupper() and token[-1].isalpha() and token[0].isalpha():
                text_pars[i][j] = token + '.'
                remove_indices.append((i, j+1))
# Delete from the end of each paragraph so earlier deletions don't shift later indices
for i, j in sorted(set(remove_indices), reverse=True):
    del text_pars[i][j]
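
The lowercasing rule described above can be sketched in isolation. This is a simplified illustration rather than the cell's full logic: `normalize_case` and the sample `keep` set are hypothetical names, and the tokenizer-repair branches (`&c.`, `d'ye`, and so on) are left out.

```python
def normalize_case(tokens, keep):
    """Lowercase tokens unless they are protected.

    Protected tokens (assumptions for this sketch):
      * anything in `keep` (e.g. titles and proper nouns),
      * placeholders wrapped in hyphens such as -BGP-,
      * single-letter initials followed by a period, except "I.".
    """
    out = []
    for tok in tokens:
        is_placeholder = tok.count('-') >= 2
        is_initial = (len(tok) == 2 and tok[0].isupper()
                      and tok[1] == '.' and tok != 'I.')
        if tok in keep or is_placeholder or is_initial:
            out.append(tok)
        else:
            out.append(tok.lower())
    return out


keep = {'Mr.', 'Park'}  # hypothetical stand-in for keep_upper_case
sample = ['-BGP-', 'The', 'Park', 'belonged', 'to', 'Mr.', 'B.', 'I.']
normalized = normalize_case(sample, keep)
print(normalized)
```

Keeping the rule in one small function like this would also make it easy to move into a module later.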

In [None]:
max_words = 20000 # Max size of the dictionary
tokenizer = Tokenizer(num_words=max_words, filters='', lower=False)

In [None]:
tokenizer.fit_on_texts(text_pars)

In [None]:
vocab_size = len(tokenizer.word_index) + 1
vocab_size

13874

In [None]:
sequences = tokenizer.texts_to_sequences(text_pars)

In [None]:
# Reverse dictionary to decode tokenized sequences back to words
reverse_word_map = dict(map(reversed, tokenizer.word_index.items()))
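
To make the `+ 1` in `vocab_size` and the reverse lookup concrete, here is a toy sketch in plain Python. It only mimics the idea of the Keras `Tokenizer` (ids start at 1, most frequent token first, id 0 left free for padding); `build_word_index` and the `toy_*` names are hypothetical.

```python
from collections import Counter


def build_word_index(docs):
    """Map each token to an integer id starting at 1, most frequent first."""
    counts = Counter(tok for doc in docs for tok in doc)
    # most_common() orders by count, descending; ties keep first-seen order
    return {tok: i + 1 for i, (tok, _) in enumerate(counts.most_common())}


toy_docs = [['-BGP-', 'she', 'was', 'happy', '-ENP-'],
            ['-BGP-', 'she', 'was', 'not', '-ENP-']]
toy_word_index = build_word_index(toy_docs)
toy_vocab_size = len(toy_word_index) + 1   # +1 because id 0 is never assigned to a word
toy_reverse_map = {i: tok for tok, i in toy_word_index.items()}

toy_encoded = [toy_word_index[tok] for tok in toy_docs[1]]
toy_decoded = [toy_reverse_map[i] for i in toy_encoded]
print(toy_encoded, toy_decoded)
```

This is also why `len(reverse_word_map)` is one less than `vocab_size`: the reverse map has no entry for id 0.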

In [None]:
window_size = 20
window_dist = window_size//2
train_size = window_size - 1
X_windows = []
Y_labels = []

# I tried using a sliding window with the y value pulled from the middle,
# but couldn't work out how to implement text generation from the
# finished model. Disappointing, because models with data in this format
# saw accuracy jumps of 0.1.
# Sliding window to generate train data
# for i in range(window_dist, len(seq)-window_dist):
#     X_windows.append([*seq[i-window_dist:i], *seq[i+1:i+window_dist]])
#     Y_labels.append(seq[i])


for sequence in sequences:
    for i in range(len(sequence)-window_size):
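        X_windows.append(sequence[i:i+train_size])
        # The label is the token immediately after the window (index i+train_size);
        # i+window_size would skip one token
        Y_labels.append(sequence[i+train_size])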
X = np.asarray(X_windows)
Y = np.asarray(Y_labels).reshape(-1, 1)

In [None]:
len(X_windows)

853935
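
A toy run makes the windowing easier to see. This sketch assumes the label is meant to be the token immediately following each window, which is what the generation function later relies on; `make_windows` and the `toy_*` names are hypothetical.

```python
def make_windows(sequence, train_size):
    """Return (inputs, labels); each label is the token right after its window."""
    inputs, labels = [], []
    for i in range(len(sequence) - train_size):
        inputs.append(sequence[i:i + train_size])
        labels.append(sequence[i + train_size])
    return inputs, labels


toy_sequence = [1, 2, 3, 4, 5, 6]
toy_X, toy_Y = make_windows(toy_sequence, train_size=3)
for window, label in zip(toy_X, toy_Y):
    print(window, '->', label)
```

Each paragraph is windowed separately in the notebook, so no window straddles a paragraph boundary.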

Prior to attempting the neural net models, I had trained both a FastText and a Word2Vec word embedding model on my data. I used these pretrained embeddings here.  
*Note: I did not include these notebooks for submission because I ended up not using either of these embedding models. However, the notebooks in question are linked to in the readme of my submission folder.*

In [None]:
from gensim.models import KeyedVectors
word_vectors = KeyedVectors.load('/content/drive/My Drive/Colab Notebooks/metis_proj4/models/fasttext_vectors.kv', mmap='r')
# pretrained_embedding = word_vectors.get_keras_embedding()

  'See the migration notes for details: %s' % _MIGRATION_NOTES_URL


In [None]:
len(reverse_word_map)

13873

In [None]:
word_index = tokenizer.word_index

embedding_dim = word_vectors.vector_size

embedding_matrix = np.zeros((len(word_index) + 1, embedding_dim))
for word, i in word_index.items():
    embedding_vector = word_vectors[word]
    embedding_matrix[i] = embedding_vector
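
The loop above assumes every word in `word_index` has a pretrained vector. As a hedged sketch of a more defensive version, the function below leaves missing words as zero rows and reports them. A plain dict stands in for the `KeyedVectors` object, and `build_embedding_matrix` and the `toy_*` names are hypothetical.

```python
def build_embedding_matrix(word_index, vectors, dim):
    """Row i holds the vector for the word with id i; row 0 stays zero for padding."""
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    missing = []
    for word, i in word_index.items():
        vec = vectors.get(word)
        if vec is None:
            missing.append(word)   # no pretrained vector: row stays all zeros
        else:
            matrix[i] = list(vec)
    return matrix, missing


toy_vectors = {'she': [0.1, 0.2], 'was': [0.3, 0.4]}   # stand-in for pretrained vectors
toy_index = {'she': 1, 'was': 2, '-BGP-': 3}
toy_matrix, toy_missing = build_embedding_matrix(toy_index, toy_vectors, dim=2)
print(toy_matrix)
print('no vector for:', toy_missing)
```

Printing the missing words is a cheap way to check that placeholders such as `-BGP-` made it into the embedding training data.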


I started out with two stacked LSTM layers following the Embedding layer. I chose this because I read a paper about the effectiveness of stacked LSTMs in text generation. However, after I abandoned the skipgram sliding window input data and my accuracy dropped, I added the Bidirectional layer to the first LSTM layer, which helped a bit. 

In [None]:
model_ft = Sequential([
    Embedding(vocab_size, embedding_dim, input_length=train_size, 
              weights=[embedding_matrix], trainable=False),
    Bidirectional(LSTM(100, return_sequences=True)),
    LSTM(200),
    Dense(150, activation='relu'),
    Dense(vocab_size, activation='softmax')
])

In [None]:
model_ft.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, 19, 100)           1387400   
_________________________________________________________________
bidirectional (Bidirectional (None, 19, 200)           160800    
_________________________________________________________________
lstm_1 (LSTM)                (None, 200)               320800    
_________________________________________________________________
dense (Dense)                (None, 150)               30150     
_________________________________________________________________
dense_1 (Dense)              (None, 13874)             2094974   
Total params: 3,994,124
Trainable params: 2,606,724
Non-trainable params: 1,387,400
_________________________________________________________________
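
The parameter counts in the summary above can be checked by hand. The sketch below uses the standard formulas for Embedding, LSTM, and Dense layers with the sizes shown in the summary; the helper names are hypothetical.

```python
def embedding_params(vocab, dim):
    return vocab * dim


def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights, and a bias
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    return input_dim * units + units


n_vocab, n_dim = 13874, 100
counts = {
    'embedding': embedding_params(n_vocab, n_dim),
    'bidirectional': 2 * lstm_params(n_dim, 100),   # forward + backward LSTM(100)
    'lstm_1': lstm_params(200, 200),                # input is the 200-wide BiLSTM output
    'dense': dense_params(200, 150),
    'dense_1': dense_params(150, n_vocab),
}
total = sum(counts.values())
trainable = total - counts['embedding']             # the embedding is frozen
for name, n in counts.items():
    print(f'{name:>14}: {n:,}')
print(f'total: {total:,}  trainable: {trainable:,}')
```

Over half of the trainable parameters sit in the final softmax layer, which is a direct consequence of the vocabulary size.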


In [None]:
#These weights were included in my project submission, so this cell is runnable
model_ft.load_weights('./models/model_weights_ft.hdf5')

In [None]:
model_ft.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

In [None]:
# fit model
model_ft.fit(X, Y, batch_size=256, epochs=15, verbose = 2)

Epoch 1/15
3336/3336 - 43s - loss: 4.7955 - accuracy: 0.1830
Epoch 2/15
3336/3336 - 43s - loss: 4.7346 - accuracy: 0.1852
Epoch 3/15
3336/3336 - 43s - loss: 4.6815 - accuracy: 0.1874
Epoch 4/15
3336/3336 - 42s - loss: 4.6335 - accuracy: 0.1890
Epoch 5/15
3336/3336 - 43s - loss: 4.5903 - accuracy: 0.1903
Epoch 6/15
3336/3336 - 43s - loss: 4.5500 - accuracy: 0.1919
Epoch 7/15
3336/3336 - 43s - loss: 4.5132 - accuracy: 0.1934
Epoch 8/15
3336/3336 - 42s - loss: 4.4801 - accuracy: 0.1948
Epoch 9/15
3336/3336 - 42s - loss: 4.4485 - accuracy: 0.1965
Epoch 10/15
3336/3336 - 43s - loss: 4.4187 - accuracy: 0.1982
Epoch 11/15
3336/3336 - 42s - loss: 4.3919 - accuracy: 0.2003
Epoch 12/15
3336/3336 - 43s - loss: 4.3659 - accuracy: 0.2019
Epoch 13/15
3336/3336 - 42s - loss: 4.3415 - accuracy: 0.2039
Epoch 14/15
3336/3336 - 43s - loss: 4.3186 - accuracy: 0.2057
Epoch 15/15
3336/3336 - 42s - loss: 4.2964 - accuracy: 0.2074


<tensorflow.python.keras.callbacks.History at 0x7f46e41c1e10>

In [None]:
# model_ft.save('/content/drive/My Drive/models/model_weights_ft.hdf5')

In [None]:
word_vectors_w2v = KeyedVectors.load('/content/drive/My Drive/Colab Notebooks/metis_proj4/models/word2vec_vectors.kv', mmap='r')
pretrained_embedding = word_vectors_w2v.get_keras_embedding()

  'See the migration notes for details: %s' % _MIGRATION_NOTES_URL


In [None]:
model_w2v = Sequential([
    pretrained_embedding,
    Bidirectional(LSTM(100, return_sequences=True)),
    LSTM(200, go_backwards=True),
    Dense(150, activation='relu'),
    Dense(vocab_size, activation='softmax')
])

In [None]:
#These weights were also included in my project submission, so this cell is runnable
model_w2v.load_weights('./models/model_weights_w2v.hdf5')

In [None]:
model_w2v.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit model
model_w2v.fit(X, Y, batch_size=256, epochs=15, verbose = 2)
#Epoch 1: 0.1726, Epoch 15: 0.1867

Epoch 1/15
3336/3336 - 43s - loss: 4.6595 - accuracy: 0.1884
Epoch 2/15
3336/3336 - 43s - loss: 4.6249 - accuracy: 0.1893
Epoch 3/15
3336/3336 - 43s - loss: 4.6037 - accuracy: 0.1903
Epoch 4/15
3336/3336 - 43s - loss: 4.5828 - accuracy: 0.1910
Epoch 5/15
3336/3336 - 43s - loss: 4.5623 - accuracy: 0.1916
Epoch 6/15
3336/3336 - 43s - loss: 4.5420 - accuracy: 0.1921
Epoch 7/15
3336/3336 - 43s - loss: 4.5231 - accuracy: 0.1929
Epoch 8/15
3336/3336 - 43s - loss: 4.5034 - accuracy: 0.1937
Epoch 9/15
3336/3336 - 43s - loss: 4.4871 - accuracy: 0.1941
Epoch 10/15
3336/3336 - 43s - loss: 4.4659 - accuracy: 0.1945
Epoch 11/15
3336/3336 - 43s - loss: 4.4474 - accuracy: 0.1955
Epoch 12/15
3336/3336 - 43s - loss: 4.4303 - accuracy: 0.1959
Epoch 13/15
3336/3336 - 43s - loss: 4.4126 - accuracy: 0.1965
Epoch 14/15
3336/3336 - 43s - loss: 4.3955 - accuracy: 0.1973
Epoch 15/15
3336/3336 - 43s - loss: 4.3794 - accuracy: 0.1976


<tensorflow.python.keras.callbacks.History at 0x7f46dfbd9b70>

In [None]:
model_w2v.save('/content/drive/My Drive/models/model_weights_w2v.hdf5')

In [None]:
model = Sequential([
    Embedding(vocab_size, embedding_dim, input_length=train_size),
    Bidirectional(LSTM(100, return_sequences=True)),
    LSTM(200),
    Dense(150, activation='relu'),
    Dense(vocab_size, activation='softmax')
])

I do not have the weights for this model as they were subsequently overwritten by accident.

In [None]:
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit model
model.fit(X, Y, batch_size=512, epochs=15, verbose = 2)
#Epoch 1: 0.2581, Epoch 15: 0.2963

Epoch 1/15
1668/1668 - 47s - loss: 3.4903 - accuracy: 0.3040
Epoch 2/15
1668/1668 - 46s - loss: 3.4275 - accuracy: 0.3101
Epoch 3/15
1668/1668 - 46s - loss: 3.4021 - accuracy: 0.3138
Epoch 4/15
1668/1668 - 46s - loss: 3.3787 - accuracy: 0.3166
Epoch 5/15
1668/1668 - 46s - loss: 3.3571 - accuracy: 0.3192
Epoch 6/15
1668/1668 - 46s - loss: 3.3368 - accuracy: 0.3219
Epoch 7/15
1668/1668 - 46s - loss: 3.3174 - accuracy: 0.3246
Epoch 8/15
1668/1668 - 46s - loss: 3.2983 - accuracy: 0.3268
Epoch 9/15
1668/1668 - 46s - loss: 3.2828 - accuracy: 0.3291
Epoch 10/15
1668/1668 - 46s - loss: 3.2680 - accuracy: 0.3311
Epoch 11/15
1668/1668 - 46s - loss: 3.2484 - accuracy: 0.3330
Epoch 12/15
1668/1668 - 46s - loss: 3.2347 - accuracy: 0.3351
Epoch 13/15
1668/1668 - 46s - loss: 3.2198 - accuracy: 0.3374
Epoch 14/15
1668/1668 - 46s - loss: 3.2046 - accuracy: 0.3393
Epoch 15/15
1668/1668 - 46s - loss: 3.1919 - accuracy: 0.3413


<tensorflow.python.keras.callbacks.History at 0x7f46dd7205c0>

After 15 epochs, the model performances with the three different embedding types (FastText, Word2Vec, and native Embedding) were:  
FastText - 42s - loss: 4.2964 - accuracy: 0.2074  
Word2Vec - 43s - loss: 4.3794 - accuracy: 0.1976  
native Embedding layer - 46s - loss: 3.1919 - accuracy: 0.3413

I played around with parameters a bit, but saw similar results each time, so I abandoned the pre-trained models and stuck to the native Embedding model instead. 

*Note: The batch size of the native Embedding model is currently twice that of the previous two. However, I changed this after having compared the three models and finding the native Embedding one consistently superior with the same parameters as the other two. The main difference resulting from the change in batch size was that the runtime shortened.*  

*Also, I've mentioned it a couple times so far, but there were a number of times when rather than copy and paste the model code, I just changed the code and reran the cell. Thus, I don't have access to all the model outputs, as I don't have time to rerun them.*
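
On the batch size point: the step counts in the training logs (3336, 1668, and later 834 per epoch) follow directly from the number of training windows divided by the batch size, rounded up. A quick check:

```python
import math

n_samples = 853935  # number of training windows reported earlier
steps = {bs: math.ceil(n_samples / bs) for bs in (256, 512, 1024)}
for bs, n_steps in steps.items():
    print(f'batch_size={bs}: {n_steps} steps per epoch')
```

Doubling the batch size halves the number of weight updates per epoch, so epoch-for-epoch comparisons across batch sizes are not quite like for like.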

In [None]:
# model.save('/content/drive/My Drive/models/model_weights.hdf5')

I decided to try implementing a Dropout layer, and wasn't sure whether to use the regular one or the Gaussian one, so I tried out both.
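
For reference, here is a plain-Python sketch of what the two layers do to activations during training, assuming the usual definitions: regular dropout zeroes each value with probability `rate` and rescales the survivors, while Gaussian dropout multiplies each value by noise centred on 1 with standard deviation `sqrt(rate / (1 - rate))`. The function names are hypothetical, and both layers pass values through unchanged at inference time.

```python
import random


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


def gaussian_dropout(values, rate, rng):
    """Multiply each value by Gaussian noise with mean 1 and std sqrt(rate / (1 - rate))."""
    stddev = (rate / (1.0 - rate)) ** 0.5
    return [v * rng.gauss(1.0, stddev) for v in values]


rng = random.Random(0)
activations = [1.0] * 10000
dropped = dropout(activations, 0.1, rng)
noised = gaussian_dropout(activations, 0.1, rng)
print('dropout mean:         ', sum(dropped) / len(dropped))
print('gaussian dropout mean:', sum(noised) / len(noised))
```

Both keep the expected activation unchanged, which may be part of why the two variants train so similarly here.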

In [None]:
from tensorflow.keras.layers import Dropout, GaussianDropout

In [None]:
# define model
model_2 = Sequential([
    Embedding(vocab_size, embedding_dim, input_length=train_size),
    Bidirectional(LSTM(100, return_sequences=True)),
    LSTM(200),
    Dense(150, activation='relu'),
    Dropout(rate=0.1),
    Dense(vocab_size, activation='softmax')
])

In [None]:
#These weights were included in my project submission, so this cell is runnable
# (the model must be defined before its weights can be loaded)
model_2.load_weights('./models/model_2_weights.hdf5')

In [None]:
model_2.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit model
model_2.fit(X, Y, batch_size=512, epochs=15, verbose = 2)

Epoch 1/15
1668/1668 - 47s - loss: 5.5922 - accuracy: 0.1359
Epoch 2/15
1668/1668 - 47s - loss: 5.2700 - accuracy: 0.1600
Epoch 3/15
1668/1668 - 46s - loss: 5.1357 - accuracy: 0.1695
Epoch 4/15
1668/1668 - 46s - loss: 5.0487 - accuracy: 0.1753
Epoch 5/15
1668/1668 - 46s - loss: 4.9732 - accuracy: 0.1799
Epoch 6/15
1668/1668 - 46s - loss: 4.9043 - accuracy: 0.1842
Epoch 7/15
1668/1668 - 46s - loss: 4.8407 - accuracy: 0.1882
Epoch 8/15
1668/1668 - 46s - loss: 4.7797 - accuracy: 0.1920
Epoch 9/15
1668/1668 - 46s - loss: 4.7228 - accuracy: 0.1957
Epoch 10/15
1668/1668 - 46s - loss: 4.6686 - accuracy: 0.1992
Epoch 11/15
1668/1668 - 46s - loss: 4.6160 - accuracy: 0.2021
Epoch 12/15
1668/1668 - 46s - loss: 4.5662 - accuracy: 0.2052
Epoch 13/15
1668/1668 - 46s - loss: 4.5173 - accuracy: 0.2083
Epoch 14/15
1668/1668 - 46s - loss: 4.4704 - accuracy: 0.2113
Epoch 15/15
1668/1668 - 46s - loss: 4.4255 - accuracy: 0.2142


<tensorflow.python.keras.callbacks.History at 0x7f46e6caefd0>

I don't have the weights for this model. I'm not sure exactly why, but I think either I decided they weren't worth keeping or my runtime crashed and I had to restart the notebook.

In [None]:
# define model
model_2_gauss = Sequential([
    Embedding(vocab_size, embedding_dim, input_length=train_size),
    Bidirectional(LSTM(100, return_sequences=True)),
    LSTM(200),
    Dense(150, activation='relu'),
    GaussianDropout(rate=0.1),
    Dense(vocab_size, activation='softmax')
])

In [None]:
model_2_gauss.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit model
model_2_gauss.fit(X, Y, batch_size=512, epochs=15, verbose = 2)

Epoch 1/15
1668/1668 - 46s - loss: 5.5860 - accuracy: 0.1368
Epoch 2/15
1668/1668 - 46s - loss: 5.2806 - accuracy: 0.1595
Epoch 3/15
1668/1668 - 46s - loss: 5.1405 - accuracy: 0.1684
Epoch 4/15
1668/1668 - 46s - loss: 5.0542 - accuracy: 0.1750
Epoch 5/15
1668/1668 - 46s - loss: 4.9824 - accuracy: 0.1796
Epoch 6/15
1668/1668 - 46s - loss: 4.9152 - accuracy: 0.1843
Epoch 7/15
1668/1668 - 46s - loss: 4.8530 - accuracy: 0.1881
Epoch 8/15
1668/1668 - 46s - loss: 4.7929 - accuracy: 0.1916
Epoch 9/15
1668/1668 - 46s - loss: 4.7361 - accuracy: 0.1948
Epoch 10/15
1668/1668 - 46s - loss: 4.6811 - accuracy: 0.1978
Epoch 11/15
1668/1668 - 46s - loss: 4.6287 - accuracy: 0.2011
Epoch 12/15
1668/1668 - 46s - loss: 4.5777 - accuracy: 0.2042
Epoch 13/15
1668/1668 - 46s - loss: 4.5295 - accuracy: 0.2071
Epoch 14/15
1668/1668 - 46s - loss: 4.4826 - accuracy: 0.2099
Epoch 15/15
1668/1668 - 46s - loss: 4.4390 - accuracy: 0.2131


<tensorflow.python.keras.callbacks.History at 0x7f46e42aa668>

The Dropout layers didn't improve the performance, so I dropped them.  
The second LSTM layer has go_backwards=True because I was trying out a forwards LSTM followed by a backwards LSTM. That didn't help the performance much, but I forgot to change it back to forwards when I added the Bidirectional layer in. The performance jumped, and when I tried the Bidirectional layer followed by a forwards LSTM layer, the performance went down again. So I stuck with the backwards layer.

In [None]:
model_3 = Sequential([
    Embedding(vocab_size, embedding_dim, input_length=train_size),
    Bidirectional(LSTM(100, return_sequences=True)),
    LSTM(200, go_backwards=True),
    Dense(150, activation='relu'),
    Dense(vocab_size, activation='softmax')
])

In [None]:
#These weights were included in my project submission, so this cell is runnable
# This is the model that I accidentally overwrote model 1's weights with
model_3.load_weights('./models/model_weights.hdf5')

In [None]:
# model_3.load_weights('/content/drive/My Drive/models/model_weights.hdf5')

The performance of the model below may seem like a sudden jump, but this is actually after something like 400 epochs of training, where I kept running the model, saving the weights, and then loading them back up again and retraining.

In [None]:
model_3.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit model
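# np.random.shuffle works in place and returns None, so it can't be passed as x.
# fit() already shuffles X and Y together each epoch (shuffle=True is the default).
model_3.fit(X, Y, batch_size=1024, epochs=100, shuffle=True, verbose = 2)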

Epoch 1/200
834/834 - 51s - loss: 2.5574 - accuracy: 0.4364
Epoch 2/200
834/834 - 51s - loss: 2.4852 - accuracy: 0.4495
Epoch 3/200
834/834 - 51s - loss: 2.4706 - accuracy: 0.4526
Epoch 4/200
834/834 - 50s - loss: 2.4647 - accuracy: 0.4538
Epoch 5/200
834/834 - 50s - loss: 2.4665 - accuracy: 0.4536
Epoch 6/200
834/834 - 50s - loss: 2.4596 - accuracy: 0.4546
Epoch 7/200
834/834 - 50s - loss: 2.4817 - accuracy: 0.4503
Epoch 8/200
834/834 - 50s - loss: 2.4427 - accuracy: 0.4580
Epoch 9/200
834/834 - 50s - loss: 2.4390 - accuracy: 0.4582
Epoch 10/200
834/834 - 50s - loss: 2.4363 - accuracy: 0.4587
Epoch 11/200
834/834 - 50s - loss: 2.4335 - accuracy: 0.4596
Epoch 12/200
834/834 - 50s - loss: 2.4317 - accuracy: 0.4594
Epoch 13/200
834/834 - 50s - loss: 2.4291 - accuracy: 0.4600
Epoch 14/200
834/834 - 50s - loss: 2.4294 - accuracy: 0.4600
Epoch 15/200
834/834 - 51s - loss: 2.4262 - accuracy: 0.4608
Epoch 16/200
834/834 - 50s - loss: 2.4208 - accuracy: 0.4615
Epoch 17/200
834/834 - 50s - loss

<tensorflow.python.keras.callbacks.History at 0x7f4688f41e48>

In [None]:
model_3.save('/content/drive/My Drive/models/model_weights.hdf5')

In [None]:
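# Shuffle X and Y with the same permutation so each window keeps its label
perm = np.random.permutation(len(X))
X, Y = X[perm], Y[perm]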

In [None]:
model_4 = Sequential([
    Embedding(vocab_size, embedding_dim, input_length=train_size),
    Bidirectional(LSTM(100, return_sequences=True)),
    LSTM(200, go_backwards=True),
    Dense(150, activation='relu'),
    Dense(vocab_size, activation='softmax')
])

In [None]:
# model_4.load_weights('/content/drive/My Drive/models/model_weights.hdf5')

The filepath in the cell below is wrong :( so I don't have the weights for this model.

In [None]:
# Early stopping lets the model stop training once improvement stops.
# Note: es is defined here but never added to callbacks_list, and it monitors
# val_loss, which needs validation data that this fit() call does not supply.
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=50)
model_4.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
filepath = "./model_4_weights_sg.hdf5"
# Model checkpointing preserves progress during training if training is interrupted
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]
history_4 = model_4.fit(X, Y, epochs = 100, batch_size = 1024, callbacks = callbacks_list, verbose = 1)

Epoch 1/100
Epoch 00001: loss improved from inf to 2.55860, saving model to ./model_4_weights_sg.hdf5
Epoch 2/100
Epoch 00002: loss improved from 2.55860 to 2.48676, saving model to ./model_4_weights_sg.hdf5
Epoch 3/100
Epoch 00003: loss improved from 2.48676 to 2.47141, saving model to ./model_4_weights_sg.hdf5
Epoch 4/100
Epoch 00004: loss improved from 2.47141 to 2.46960, saving model to ./model_4_weights_sg.hdf5
Epoch 5/100
Epoch 00005: loss improved from 2.46960 to 2.46227, saving model to ./model_4_weights_sg.hdf5
Epoch 6/100
Epoch 00006: loss improved from 2.46227 to 2.45665, saving model to ./model_4_weights_sg.hdf5
Epoch 7/100
Epoch 00007: loss improved from 2.45665 to 2.45393, saving model to ./model_4_weights_sg.hdf5
Epoch 8/100
Epoch 00008: loss improved from 2.45393 to 2.44953, saving model to ./model_4_weights_sg.hdf5
Epoch 9/100
Epoch 00009: loss improved from 2.44953 to 2.44339, saving model to ./model_4_weights_sg.hdf5
Epoch 10/100
Epoch 00010: loss improved from 2.443
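
One way to avoid losing checkpoints to the local runtime is to build the checkpoint path from an explicit, persistent directory and create that directory up front. This is only a sketch: `checkpoint_path` is a hypothetical helper, and a temporary directory stands in for the mounted Drive folder so the example runs anywhere.

```python
import os
import tempfile


def checkpoint_path(save_dir, filename):
    """Create `save_dir` if needed and return the full path for a weights file."""
    os.makedirs(save_dir, exist_ok=True)
    return os.path.join(save_dir, filename)


# In Colab, save_dir would be a folder under the mounted Drive, for example
# '/content/drive/My Drive/models' (the folder other cells in this notebook save to).
demo_dir = os.path.join(tempfile.mkdtemp(), 'models')
demo_filepath = checkpoint_path(demo_dir, 'model_4_weights_sg.hdf5')
print(demo_filepath)
```

The returned path could then be passed to `ModelCheckpoint` in place of the relative `./` path.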

## Generating the Text!

In [None]:
def gen(model, input_str, max_len = 20):
    '''Generates a sequence from the string input_str using the specified model
    until the total sequence length reaches max_len.'''
    # Tokenize the input string
    tokenized_sent = tokenizer.texts_to_sequences([word_tokenize(input_str)])[0]
    while len(tokenized_sent) < max_len:
        # Feed the last train_size tokens, left-padded if the seed is shorter
        padded_sentence = pad_sequences([tokenized_sent[-train_size:]], maxlen=train_size)[0]
        op = model.predict(np.asarray(padded_sentence).reshape(1,-1))
        # The softmax index is already the token id (labels were raw ids), so no +1
        tokenized_sent.append(int(op.argmax()))

    return " ".join(map(lambda x : reverse_word_map[x], tokenized_sent))
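
The generation loop is easier to follow with the model swapped for a stub. In this sketch `toy_predict` is a hypothetical stand-in for `model.predict` that returns one probability per token id, and `left_pad` imitates the default behaviour of `pad_sequences` (keep the most recent ids, pad on the left with 0).

```python
def left_pad(tokens, length, pad_id=0):
    """Keep the last `length` ids and left-pad with `pad_id`."""
    tokens = tokens[-length:]
    return [pad_id] * (length - len(tokens)) + tokens


def greedy_generate(predict, seed_ids, window, max_len):
    """Repeatedly append the highest-probability id until max_len ids exist."""
    ids = list(seed_ids)
    while len(ids) < max_len:
        probs = predict(left_pad(ids, window))
        ids.append(max(range(len(probs)), key=probs.__getitem__))
    return ids


def toy_predict(window_ids):
    """Stub model over 5 token ids: favours (last id + 1) mod 5."""
    probs = [0.1] * 5
    probs[(window_ids[-1] + 1) % 5] = 0.6
    return probs


generated = greedy_generate(toy_predict, [1, 2], window=4, max_len=7)
print(generated)
```

Because the choice is always the single most likely id, a loop like this can get stuck repeating the same token once the window fills with it.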

In [None]:
model_list = [model, model_2, model_3, model_4]

In [None]:
def test_models(test_string, sequence_length=50, model_list=model_list):
    '''Generates output given input test_string up to sequence_length'''
    print('Input String: ', test_string)
    for counter, model in enumerate(model_list):
        print("Model ", counter+1, ":")
        print(gen(model, test_string, sequence_length))

I'm not sure what's up with the outputs of these cells. For some reason the outputs of models 2 and 3 are completely sparsified, which is very strange, since they were not before I downloaded this notebook from Google Colab. For reference, model 2's output was similar to model 1's, but with more repeated words, while model 3's output was similar to model 4's in terms of comprehensibility.

In [None]:
test_models('-BGP- -BGP- -BGP- -BGP- -BGP- -BGP- -BGP- -BGP- -BGP- -BGP-', 200, model_list)

Input String:  -BGP- -BGP- -BGP- -BGP- -BGP- -BGP- -BGP- -BGP- -BGP- -BGP-
Model  1 :
-BGP- -BGP- -BGP- -BGP- -BGP- -BGP- -BGP- -BGP- -BGP- -BGP- to to wherever not . in a were moral she . not rapid join recollected respectability with of the the the the the the -LNM- of not . in contrasted to wherever in candlelight on calm a . season do more when youthful she -PLC- . , estimation restless . fourth - , ground she . , sees and whenever features her . reasonably to benches cheese which as as as as on dined to poor not as her in woman as . in disputable rest in a . in gentleness with honour were be be was voices be and and intercourse might . in . in so consequence ; in a and leaving having her she and . to to not be . in . not to violated you her -LNM- before the . fortitude eighteen engaging prison is plays ecstasy the the the the the the the the the the the were he what sailors confused to . , her of distinct . as in gilding than to think . work -FNM- to to pianoforte of . not to bill

In [None]:
test_models('It is a truth universally acknowledged', 200, model_list)

Input String:  It is a truth universally acknowledged
Model  1 :
It is a truth universally acknowledged was and gracious the a a a a a a a a . to -MNM- . is so . '' was complaints suppose is so and slightest first it was was was it it it it it it it it it it it be own - . in almost house have her fashion fairly but and could message the a the a a loss , rather ! , considerably to to not to . was and . in us was and . . . . at . at little bought having her and not believe a a a a a looking a . in all hesitation then -LNM- qualified and and guarded none be brothers marriage who i . in . at . is leave . at little only her to her completion . , had would her and and and and what as in riotous and that -FNM- is -FNM- and travelling . doted to beg through companion lines - . at ceased the a and head everybody -FNM- and . , . in ; convinced have her and As not the . at . . maintain he . is . at ceased
Model  2 :
It is a truth universally acknowledged it it -FNM- -FNM- -FNM- -FNM- -FNM- -FNM- 

In [None]:
test_models('You must allow me to tell you', 200, model_list)

Input String:  You must allow me to tell you
Model  1 :
must allow me to tell you to designs saying a a being her and lively the the it it was it it it it it , was and connections mentioning morning her nurse of -LNM- who `` , always . '' was unexpected and and . duties and . '' was to and and be and . all to be and . in her homewards interrupted but . her hopeless they and thirteen . , paper her in bye be how was . wearing had attentive sister-in-law the . ; in a almost . while very for poetical recommendation must deserve to stood ill his a . to a a . always to to to to arms was and correct a because they and although hills be . as in not General all relating a in grove idea with not her -FNM- the the modern the what is the and complying , was and forbid . , . in ; in consequence to a in mourning must sisters copied of claimed not the ready warmly of of . not to to however and ill-timed be . prospects i `` civil he had and or could that . ; was her
Model  2 :
must allow me to tell yo