<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Deep-Learning-for-NLP" data-toc-modified-id="Deep-Learning-for-NLP-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Deep Learning for NLP</a></span></li><li><span><a href="#Embeddings-Example:-Skipgrams!" data-toc-modified-id="Embeddings-Example:-Skipgrams!-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Embeddings Example: Skipgrams!</a></span><ul class="toc-item"><li><span><a href="#Loading-in-Data" data-toc-modified-id="Loading-in-Data-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>Loading in Data</a></span></li><li><span><a href="#Keras-Tools-for-Preprocessing-Text-Data" data-toc-modified-id="Keras-Tools-for-Preprocessing-Text-Data-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>Keras Tools for Preprocessing Text Data</a></span></li><li><span><a href="#Generating-Input-and-Output-Labels" data-toc-modified-id="Generating-Input-and-Output-Labels-2.3"><span class="toc-item-num">2.3&nbsp;&nbsp;</span>Generating Input and Output Labels</a></span></li><li><span><a href="#Creating-the-Model-Architecture" data-toc-modified-id="Creating-the-Model-Architecture-2.4"><span class="toc-item-num">2.4&nbsp;&nbsp;</span>Creating the Model Architecture</a></span></li><li><span><a href="#Compiling-and-Training-the-Model" data-toc-modified-id="Compiling-and-Training-the-Model-2.5"><span class="toc-item-num">2.5&nbsp;&nbsp;</span>Compiling and Training the Model</a></span></li><li><span><a href="#Saving-the-Word-Vectors" data-toc-modified-id="Saving-the-Word-Vectors-2.6"><span class="toc-item-num">2.6&nbsp;&nbsp;</span>Saving the Word Vectors</a></span></li><li><span><a href="#Examining-the-Vectors" data-toc-modified-id="Examining-the-Vectors-2.7"><span class="toc-item-num">2.7&nbsp;&nbsp;</span>Examining the Vectors</a></span></li></ul></li><li><span><a href="#RNN-Example:-Text-Classification" data-toc-modified-id="RNN-Example:-Text-Classification-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>RNN Example: Text Classification</a></span></li><li><span><a href="#LSTM-Example:-Sentiment-Analysis" data-toc-modified-id="LSTM-Example:-Sentiment-Analysis-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>LSTM Example: Sentiment Analysis</a></span></li><li><span><a href="#CNN-Example:-Sentiment-Analysis" data-toc-modified-id="CNN-Example:-Sentiment-Analysis-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>CNN Example: Sentiment Analysis</a></span></li><li><span><a href="#BRNN-Example:-Sentiment-Analysis" data-toc-modified-id="BRNN-Example:-Sentiment-Analysis-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>BRNN Example: Sentiment Analysis</a></span></li><li><span><a href="#LSTM-Example:-Text-Generation" data-toc-modified-id="LSTM-Example:-Text-Generation-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>LSTM Example: Text Generation</a></span></li><li><span><a href="#Save-Trained-Models" data-toc-modified-id="Save-Trained-Models-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>Save Trained Models</a></span></li></ul></div>

In [1]:
# Imports
# Basics
import pandas as pd 
import numpy as np
import random
from matplotlib import pyplot as plt
%matplotlib inline

# gensim
import gensim

# keras
np.random.seed(13)
from keras.models import Sequential
from keras.layers import (Dense, Embedding, Reshape, Activation, 
                          SimpleRNN, LSTM, Convolution1D, 
                          MaxPooling1D, Dropout, Bidirectional)
from keras.utils import np_utils
from keras.utils.data_utils import get_file
from keras.preprocessing.text import Tokenizer
from keras.datasets import imdb, reuters
from keras.preprocessing import sequence
from keras.optimizers import SGD, RMSprop


# logging for gensim (set to INFO)
import logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)

Using TensorFlow backend.


# Deep Learning for NLP

We took a quick look at an NLP task in the previous notebook's Dropout example. In this notebook we'll look at some more complete NLP pipelines and learn how to work with several new types of keras Layers.

We'll start with a detailed look at building a network with an Embeddings model, and then we'll run through several examples using more advanced layer types: RNNs, LSTMs, CNNs, and BRNNs.

# Embeddings Example: Skipgrams!

Let's use keras to train a word embeddings model. Along the way, we'll use some special keras tools to create a complete NLP pipeline.

## Loading in Data

We'll work with text data from H.P. Lovecraft's story, [The Nameless City](https://raw.githubusercontent.com/urschrei/lovecraft/master/lovecraft.txt). Each item in our corpus will be a paragraph from the story. Note that there are line breaks in the text that will lead to empty elements, and poems in which each line gets its own item in the corpus.

In [2]:
# Load in Corpus using Keras utility
# We'll use some Lovecraft
!curl -o lovecraft.txt https://raw.githubusercontent.com/urschrei/lovecraft/master/lovecraft.txt

corpus = open("lovecraft.txt").readlines()[0:200]

corpus[6]

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
 51 2822k   51 1443k    0     0  1443k      0  0:00:01 --:--:--  0:00:01 1884k
100 2822k  100 2822k    0     0  2822k      0  0:00:01 --:--:--  0:00:01 3286k


'That is not dead which can eternal lie, And with strange aeons death may die.\n'

## Keras Tools for Preprocessing Text Data

`Keras` has some nice text preprocessing functions too!

In [3]:
# For simplicity, one "sentence" per line 
corpus = [sentence for sentence in corpus if sentence.count(" ") >= 2] # remove sentences that have less than 2 spaces
# eg:
## i die 
### Therefore this code will capture sentences that are longer

# Tokenize using Keras
tokenizer = Tokenizer(filters="""!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n',""")
tokenizer.fit_on_texts(corpus) # fit is usually about some object, looking at the data then figuring something out

# Convert tokenized sentences to sequence format
sequences = tokenizer.texts_to_sequences(corpus)
nb_samples = sum(len(s) for s in corpus)

print(corpus[3])
print(sequences[3])  # this is what it looks like after being transformed into the tokenizer

That is not dead which can eternal lie, And with strange aeons death may die.

[9, 84, 13, 247, 22, 209, 523, 781, 2, 16, 56, 524, 191, 281, 525]


## Generating Input and Output Labels
Now we need to generate our `X_train` and `y_train` so we can train the model.

In [4]:
# Setting parameters for our model:

# Vocab size
V = len(tokenizer.word_index) + 1

# Dimension to reduce to (length of word embedding vectors)
dim = 100
window_size = 2

print("vocabulary size: ", V)

vocabulary size:  3316


In [5]:
# Function to generate the inputs and outputs for all windows
def generate_data(sequences, window_size, V):
    maxlen = window_size*2
    # For each line (sentence)
    for words in sequences:
        L = len(words)
        # Choose the target word
        for index, word in enumerate(words):
            # Create the window
            s = index-window_size
            e = index+window_size+1
                    
            in_words = []
            labels = []
            # Create the input/outputs for skipgrams
            for i in range(s, e):
                if i != index and 0 <= i < L:
                    in_words.append([word] )
                    labels.append(words[i])

            x = np.array(in_words,dtype=np.int32)
            y = np_utils.to_categorical(labels, V)
            yield (x, y)
            
# We'll call this later on, from within our train_skipgram_model function.

## Creating the Model Architecture
- Lastly, we create the (shallow) network!

In [6]:
# Create the Keras model and view it 
skipgram = Sequential()
skipgram.add(Embedding(input_dim=V, input_length=1, embeddings_initializer="glorot_uniform", output_dim=dim))
skipgram.add(Reshape((dim, )))
skipgram.add(Dense(input_dim=dim, units=V, activation='softmax'))

skipgram.summary()

Instructions for updating:
Colocations handled automatically by placer.


Instructions for updating:
Colocations handled automatically by placer.


Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_1 (Embedding)      (None, 1, 100)            331600    
_________________________________________________________________
reshape_1 (Reshape)          (None, 100)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 3316)              334916    
Total params: 666,516
Trainable params: 666,516
Non-trainable params: 0
_________________________________________________________________


## Compiling and Training the Model
- Time to compile and train
- We use crossentropy, common loss for classification

In [7]:
def train_skipgram_model(skipgram, sequences, window_size, dimension_size=V):
    """
    skipgram: Keras model to train
    sequences: list of lists of integers. 
               sequences[i][j] is the encoding of word j in document i
    window_size: number of words in the window
    dimension_size: integer. Size of the vector space for the word vectors.
    
    Note: this is slow to train. Took an 1hr 40min (no GPU) on 2016 Macbook Pro.
    """
    # Note this cell took 1hr 40min on my machine (no GPU)
    # Compile the Keras Model
    skipgram.compile(loss='categorical_crossentropy', optimizer="adadelta")

    # Fit the Skipgrams
    for iteration in range(10):
        loss = 0.
        for sequence, label in generate_data(sequences, window_size, dimension_size):
            loss += skipgram.train_on_batch(sequence, label)

        print(iteration, loss)

## Saving the Word Vectors
- Let's save the vectors to a file
- So we can load them into word2vec and test them out!

In [8]:
def write_model_weights(skipgram, tokenizer, filename='vectors.txt'):
    # Write the resulting vectors to a text file
    with open(filename ,'w') as f:
        f.write(f"{V-1} {dim}\n")
        vectors = skipgram.get_weights()[0]
        for word, i in tokenizer.word_index.items():
            line = f"{word} " + " ".join([str(num) for num in vectors[i,:]]) + "\n"
            f.write(line)

In [9]:
import os

if not os.path.exists('vectors.txt'):
    train_skipgram_model(skipgram, sequences, window_size, V)
    write_model_weights(skipgram, token)

## Examining the Vectors
- Let's load the vectors in to query them with `gensim`!

In [10]:
# Load the vectors into word2vec and see how we did!
w2v = gensim.models.KeyedVectors.load_word2vec_format('./vectors.txt', binary=False)
w2v.most_similar(positive=['white', 'rabbit'])

2019-12-10 10:24:36,273 : INFO : loading projection weights from ./vectors.txt
2019-12-10 10:24:36,844 : INFO : loaded (3315, 100) matrix from ./vectors.txt
2019-12-10 10:24:36,846 : INFO : precomputing L2-norms of word weight vectors


[('drew', 0.6355757713317871),
 ('legends', 0.5915203094482422),
 ('mist', 0.5870316028594971),
 ('sound', 0.5777257680892944),
 ('voices', 0.5727794766426086),
 ('slimy', 0.5717293620109558),
 ('both', 0.5646233558654785),
 ('less', 0.5638689994812012),
 ('might', 0.5628299117088318),
 ('wife', 0.5529348850250244)]

# RNN Example: Text Classification
- Let's try a Recurrent Neural Net for the same Reuters classification task from the Intro to Keras notebook:

In [21]:
# Same data loading as before
max_features = 2000
(X_train, y_train), (X_test, y_test) = reuters.load_data(num_words=max_features)
maxlen = 10
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
nb_epoch = 20

In [22]:
# Create the model
rnn_model = Sequential()
# The Embedding layer allows us to map words into dense vectors as inputs, common first layer
rnn_model.add(Embedding(input_dim=max_features, output_dim=100, embeddings_initializer='glorot_uniform', input_length=maxlen))
# This is the most basic kind of RNN!  We're using 20 units, 
#which somewhat reflects our "memory" of past events in a sequence
# For the purposes of keras, it's just another type of "unit" you can try!
rnn_model.add(SimpleRNN(20, return_sequences=False))
rnn_model.add(Dense(46))  # 46 here because predicting 46 categories
rnn_model.add(Activation('softmax'))

rnn_model.compile(loss='sparse_categorical_crossentropy',
                  optimizer='adadelta',
                  metrics=['accuracy'])

rnn_model.summary()

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_6 (Embedding)      (None, 10, 100)           200000    
_________________________________________________________________
simple_rnn_2 (SimpleRNN)     (None, 20)                2420      
_________________________________________________________________
dense_7 (Dense)              (None, 46)                966       
_________________________________________________________________
activation_5 (Activation)    (None, 46)                0         
Total params: 203,386
Trainable params: 203,386
Non-trainable params: 0
_________________________________________________________________


In [23]:
# Train the model
rnn_model.fit(X_train, y_train, batch_size=256, epochs=nb_epoch, 
              validation_data=(X_test, y_test))

# We can see taht the model is starting to overfit because accuracy keeps increasing but validation accuracy doesn't
## common rectifications are
### regularization, early stopping

Train on 8982 samples, validate on 2246 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.callbacks.History at 0x1b384bae588>

**WHOA!** Over 100% improvement on ANN!

# LSTM Example: Sentiment Analysis

Here is some code to train sentiment analysis on IMDB reviews.

*(Note that this network takes about 20 minutes to train over 15 epochs.)*

In [14]:
# Set parameters
max_features = 20000
maxlen = 80  # cut texts after this number of words (among top max_features most common words)
batch_size = 32

# Load data (Keras utility)
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

# Pad Short sentences
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

# Build our model!
print('Build model...')
lstm_model = Sequential()
lstm_model.add(Embedding(max_features, 128, dropout=0.2))
# Here's the LSTM magic!
lstm_model.add(LSTM(128, dropout_W=0.2, dropout_U=0.2))  # try using a GRU instead, for fun
lstm_model.add(Dense(1))   # make sure your dense layers corresponds to the number of categories you want to predict
# Sigmoid for binary classification
lstm_model.add(Activation('sigmoid'))

# try using different optimizers and different optimizer configs
lstm_model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

print('Train...')
lstm_model.fit(X_train, y_train, batch_size=batch_size, epochs=15,
               validation_data=(X_test, y_test))
score, acc = lstm_model.evaluate(X_test, y_test,
                                 batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)

# we can see from this is that our training loss is increasing but our valiudation loss is increasing
# we can see that in this case this is overfitting.
# in fact, if we look at the first 2-3 iterations, that is enough.
## so when this happens, you might want to look at yoru model again
### maybe your model is too complex. mayeb you don't wnat to have 126 units -> just 64 or 32
### maybe you want to increase your dropout

Loading data...
Downloading data from https://s3.amazonaws.com/text-datasets/imdb.npz
25000 train sequences
25000 test sequences
Pad sequences (samples x time)
X_train shape: (25000, 80)
X_test shape: (25000, 80)
Build model...




Train...
Train on 25000 samples, validate on 25000 samples
Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15
Test score: 1.0816763665413855
Test accuracy: 0.8108000159263611


# CNN Example: Sentiment Analysis
- Here's the same Sentiment Analysis task with a CNN + LSTM:

In [15]:
# this is a sample of how you can use CNN and RNN together

# Embedding
max_features = 20000
maxlen = 100
embedding_size = 128

# Convolution
filter_length = 5
nb_filter = 64
pool_length = 4

# LSTM
lstm_output_size = 70

# Training
batch_size = 30
nb_epoch = 2

'''
Note:
batch_size is highly sensitive.
Only 2 epochs are needed as the dataset is very small.
'''

# Load data
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

# Pad sentences
print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

print('Build model...')

cnn_model = Sequential()
cnn_model.add(Embedding(max_features, embedding_size, input_length=maxlen))
cnn_model.add(Dropout(0.25))

# Convolution!
cnn_model.add(Convolution1D(nb_filter=nb_filter,
                            filter_length=filter_length,
                            border_mode='valid',
                            activation='relu',
                            subsample_length=1))
cnn_model.add(MaxPooling1D(pool_length=pool_length))

# LSTM!
cnn_model.add(LSTM(lstm_output_size))
cnn_model.add(Dense(1))
cnn_model.add(Activation('sigmoid'))

cnn_model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

print('Train...')
cnn_model.fit(X_train, y_train, batch_size=batch_size, epochs=nb_epoch,
          validation_data=(X_test, y_test))
score, acc = cnn_model.evaluate(X_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)

Loading data...
25000 train sequences
25000 test sequences
Pad sequences (samples x time)
X_train shape: (25000, 100)
X_test shape: (25000, 100)
Build model...




Train...
Train on 25000 samples, validate on 25000 samples
Epoch 1/2
Epoch 2/2
Test score: 0.3618625885009766
Test accuracy: 0.8505200147628784


# BRNN Example: Sentiment Analysis
- One more time on the sentiment, now with a Bidirectional Recurrent Neural Net!

*(This takes about 2 minutes per epoch to train)*

In [16]:
max_features = 20000
maxlen = 100  # cut texts after this number of words (among top max_features most common words)
batch_size = 32

print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

print("Pad sequences (samples x time)")
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
y_train = np.array(y_train)
y_test = np.array(y_test)

brnn_model = Sequential()
brnn_model.add(Embedding(max_features, 128, input_length=maxlen))

# Bidirectional LSTM!!!
brnn_model.add(Bidirectional(LSTM(64)))
brnn_model.add(Dropout(0.5))
brnn_model.add(Dense(1, activation='sigmoid'))

# try using different optimizers and different optimizer configs
brnn_model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])

print('Train...')
brnn_model.fit(X_train, y_train,
               batch_size=batch_size,
               epochs=4,
               validation_data=[X_test, y_test])

score, acc = brnn_model.evaluate(X_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)

Loading data...




25000 train sequences
25000 test sequences
Pad sequences (samples x time)
X_train shape: (25000, 100)
X_test shape: (25000, 100)
Train...
Train on 25000 samples, validate on 25000 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
Test score: 0.5665420207881927
Test accuracy: 0.8351200222969055


In [17]:
cnn_model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_4 (Embedding)      (None, 100, 128)          2560000   
_________________________________________________________________
dropout_1 (Dropout)          (None, 100, 128)          0         
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 96, 64)            41024     
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 24, 64)            0         
_________________________________________________________________
lstm_2 (LSTM)                (None, 70)                37800     
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 71        
_________________________________________________________________
activation_3 (Activation)    (None, 1)                

In [18]:
brnn_model.summary()

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_5 (Embedding)      (None, 100, 128)          2560000   
_________________________________________________________________
bidirectional_1 (Bidirection (None, 128)               98816     
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 129       
Total params: 2,658,945
Trainable params: 2,658,945
Non-trainable params: 0
_________________________________________________________________


# LSTM Example: Text Generation
- Finally, a different example!
- We're going to do **Text Generation** with **LSTM**
- We'll watch our model start spitting out words of Nils's favorite philosopher in real time!
  - ps, it's Friedrich Nietzsche
- As we go through epoch's, the ability to generate Nietzcheian sentences will get better and better!
- Let's go!

In [24]:
import sys
path = get_file('nietzsche.txt', origin="https://s3.amazonaws.com/text-datasets/nietzsche.txt")
text = open(path).read().lower()
print('corpus length:', len(text))

chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))
# sample sentences
# ['preface\n\n\nsppoisng that teurth is a woma',
#   'face\n\n\nsupposeing that truth is a woman--',
#    'e\n\n\nsupposing that truth is a woman--what t']

print('Vectorization...')
X = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.bool)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1


# This part is your model
# build the model: a single LSTM
print('Build model...')
textgen_model = Sequential()
textgen_model.add(LSTM(128, input_shape=(maxlen, len(chars))))
textgen_model.add(Dense(len(chars)))   # don't understand this part
textgen_model.add(Activation('softmax'))

optimizer = RMSprop(lr=0.01)
textgen_model.compile(loss='categorical_crossentropy', optimizer=optimizer)



# How does this work on a higher level?
# I want our model to learn what character coems after what character
# When to insert space, when to insert punctuation, when to insert what word
# After we train the model fully, then we start off with a sample seed word
# I fit the seed word into the model, and ask the model to predict what is the next word
# maybe we fit a vector of 40 characters, then we fit this into the model
# then it will predict the next character for me
# then in the next iteration, we're going to change the input data by now using the PREDICTION DATA from the previous input 
# to output the next character prediction
# Basically we are going to do text generation character by character
### remember -- it is a CHARACTER (a letter.) NOT WORDS!!



def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)

# train the model, output generated text after each iteration
for iteration in range(1, 60):
    print()
    print('-' * 50)
    print('Iteration', iteration)
    textgen_model.fit(X, y, batch_size=128, epochs=1)

    start_index = random.randint(0, len(text) - maxlen - 1)

    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print()
        print('----- diversity:', diversity)

        generated = ''
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)

        for i in range(400):
            x = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x[0, t, char_indices[char]] = 1.

            preds = textgen_model.predict(x, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

corpus length: 600901
total chars: 59
nb sequences: 200287
Vectorization...
Build model...

--------------------------------------------------
Iteration 1
Epoch 1/1

----- diversity: 0.2
----- Generating with seed: "ntiment in europe at present is perhaps "
ntiment in europe at present is perhaps and the strange of the strong and the string and have to are the string of the strange of the sace and and this spead to at in the hologe and strunged and the religious and and in the sace of the precention of the string the sace to all the self and the string and the sime of the for the still and and the still of a something and strange of him and of the strange of the strange of the streng that 

----- diversity: 0.5
----- Generating with seed: "ntiment in europe at present is perhaps "
ntiment in europe at present is perhaps not regor of him to cave and
himself to the tourstes of this sidestinal in the nature of the decented to a all the intoring constined and and
deffection, the self--a ge

l'homme voit of prevatives, the svaul and more some there not delecismen, raish of incloks of the passion
the fire and strength. conscience, for how life, when, therefore therein--an adventre most more
predual
ngable against willing and chance, in of he question namile sympated, theere verigor: therexaser, in  lite and
insical): putting they -so. but
willinate in
feeling makeness: is hefboch wele the reasonles

----- diversity: 1.2
----- Generating with seed: "t dans ces moments-la, que
l'homme voit "
t dans ces moments-la, que
l'homme voit are the pri. life too, sister, circual", a xhays niw, if a yakphing i trincreff rkeens, wherene of subjespsient explanedly with, and or pulienous demosing the charce--prociie
short. : this depetion. dicinture shall mayer of one prilide the ymined crobly things undestrecign surjet one let the by misupolding excendith perseedraciatity to orner what also instincts, more), as regarded, thereby, the na

--------------------------------------------------


ndustry, patient adaptableness to rank and the strong of the streed the streed the subtletly consequence that the same the constant to the consequence and strength and the subtletrable to the sacrifice of the consequence than the streed that a consequence and all the streed and interroration of the soul. the sacrifice of the same the streed the sense and the same the presentive that the states of the states and soul of the man of the su

----- diversity: 0.5
----- Generating with seed: "ndustry, patient adaptableness to rank a"
ndustry, patient adaptableness to rank and solitation of the conscience, indeed to them are as the same in the soul, with one must "beginst not be than as a person as the sense of the streed to the soul, all things in the light: could to be beating,
does not reading the moral one upon the future of his own truth the more such as to be the translad to the and say with a revenges in the
consider for example to the forgots of man of the mo

----- diversity: 1.0
---

exumcessify as a
presence of faith, respes and rofes from this predict essentihe as the life mode. one are a footh
has too man as the reason, the inations, who is not
spirds a
heams to be unit effect

----- diversity: 1.2
----- Generating with seed: "y reason
many think him red-hot.

92. wh"
y reason
many think him red-hot.

92. whe of the onever that predeterneed in every summity, gotth of it
in which my fairing in efgets the legbinece. to satisus, coltable--experined of the epicuous like,
there is asgest deligings olliest will, prabsirators, there are so's by side. his
ageray such kinless, both, the lyxiating up to . he looking
pofrind-lhu-dly.

24x create ber. mus: a duclact for as present, hes why of individual, genous


--------------------------------------------------
Iteration 12
Epoch 1/1

----- diversity: 0.2
----- Generating with seed: "sponsibility--and who now, creditably, r"
sponsibility--and who now, creditably, really an an and the conscience of the possessed to a relig

unexhausted to the sense of the constantly and sense of the same the sense of the soul the same the former and profoundest than the same the world and soul and as the same the problem of the saint in the same things and interpreting and and sinking of the problem of the statesment and problem and and proposition who has a states of the subject of the same things of the same sense of the subject interest an

----- diversity: 0.5
----- Generating with seed: "nowledge of his conviction how
unexhaust"
nowledge of his conviction how
unexhausted with itself and self-lower and probably that we seem the cruelty, whether who has slight struggentians, as an attempts of the process to a resse the same the former and self-plairs, and the former of the intentious shame with the foregroun of the sense and probably in their and interpreted and religion and every who are at the agreeably man does how than a satisfied not interest and the fredect

----- diversity: 1.0
----- Generating with seed: "nowle

not the same bellsed of the present vingers has apparencation, that would
refpous exist as somethings are not only
of illursmanity sinditions; perishiat and with it
it that his expeitions m

----- diversity: 1.2
----- Generating with seed: "ust upside down: an exceedingly
unseemly"
ust upside down: an exceedingly
unseemly other pearsio inesolwkerelly, and according tyward dateful and adxerngeancess becan psyccave. in effect,
in each its,
too mitrhome;
sought, neceptic, that zan"s, value us, gentering of thing surminful for the rele to the brasing,  there are a great others reclas"
air. at reduring handayed of the granted bread doved to may in wholenertings.


of afraid ga. to honowable of things craacte.=--on woat

--------------------------------------------------
Iteration 19
Epoch 1/1

----- diversity: 0.2
----- Generating with seed: " the richer man now feels himself inward"
 the richer man now feels himself inwards and the world and has and the same the problem of the sensation of

hing of anything shameful, devilish and the probably the same time is a complete of the sense and substany the contemplien to the properly and the subject of the same the probably the same the propersed the same and the same the propersen the most strange and the contemplation of the same the same the more the enty and subject of the same the same as a period of the properse and and the probably the same and and the same the same the sa

----- diversity: 0.5
----- Generating with seed: "hing of anything shameful, devilish and "
hing of anything shameful, devilish and from the else, who can so delicate and the same time, and concealer"--the same as a prominent the highest relation of the however, the tendenching of the stronger, the
true, man to distance, but in the world of the universal in the most "eman, the last according to the same term of the last every concerning self-langes, all the hand in the heart of his attempts of himself of the mendal morality, a

----- diversity: 1.0
---

li king ulembortions." in the own recovers, in the rather, as the recognize so warded and with durs in neavible world once more. envy and
finally the indecended shuth in the eldomy, for a long indivicularly amblived is the
sake have of the mare
a

----- diversity: 1.2
----- Generating with seed: "mpt to get along with a single one has n"
mpt to get along with a single one has nay, of
emplopial suspicional and his living class
awadous
muchar slaves: furned usefliexivest
without and individual, even aftusi "s"
to quietite and coked to love egoism. these just hitherto at it the dangess tellowing say
beyond then intestives of the
protrive
remornhossiff fie; conseasity.=--is that doesness) to
dogma" infloieness.
we--it is not a wosorst without less ba systew or of
ner
let" b

--------------------------------------------------
Iteration 26
Epoch 1/1

----- diversity: 0.2
----- Generating with seed: "self.
he feels hatred, consequently, for"
self.
he feels hatred, consequently, for the same t

 now, for all non-philosophical persons, and in the same the extent and the spirit of the same the southoxt of the same the superstition of the same the southoxt of the states of the same the superstition of the state of the same the sense of the same the most delight of the presenting the same the same the states of the same the same the same the same the southertheless and the superstitions of the strength of the states of the sense o

----- diversity: 0.5
----- Generating with seed: " now, for all non-philosophical persons,"
 now, for all non-philosophical persons, because the stronger than the fact that the fact, the most distinct sim the value to us the controlly an art belittpons that
individual of the individual to the self-delicate belitwed the extinguing the most provent of contemplated renounce in the same time of extent and profoundly, and the present, and a still not in the nature and an appear as a statesman
instincts, and it is contempre, which i

----- diversity: 1.0
---

"pain and moreod stronging the latter wants gradanitomably "golend sexyed--and is non"
our near--who dependentianlined, personal at how with the unseduced interest in
allown and forms of their incendingly
wished"
of all profound opinion and him. he ref

----- diversity: 1.2
----- Generating with seed: "hen their own opinion of themselves by m"
hen their own opinion of themselves by many propes more
         feptire
today in its astraunment
and living mankind; when premon in the
govered sa"!--has not nationr, with art--surs inmont heralty throughour an inutpination
by a
herditive better and succastoss of our almost animalismes questions about that live of theor,
although a
gainsanminifitize fatonything moath intent-eloamentifis such pilty not mediove man on them, through their

--------------------------------------------------
Iteration 33
Epoch 1/1

----- diversity: 0.2
----- Generating with seed: "ay use (as an
abbreviation of the means "
ay use (as an
abbreviation of the means of th



hey 

----- diversity: 1.2
----- Generating with seed: "lso in his deterioration (deteriorated i"
lso in his deterioration (deteriorated is cahatherol best croplegw, in the statyd. of account be vation;
a it that religio, for simalingly
noce-roung an ourselops speriles
to helpe"; :ne-wowlened--and ever theeth to the point, that which question without centricism, and one"m--half done-wwill delicable insigned with an
actsleugrely educated and bijory"--is modern  animals, sumpluany of
factleps, who
staveslofk, of
his most without di gf

--------------------------------------------------
Iteration 35
Epoch 1/1

KeyboardInterrupt: 

# Save Trained Models

In [27]:
import pickle

with open('models/skipgram.pkl', 'wb') as pkl:
    pickle.dump(skipgram, pkl)
with open('models/rnn_model.pkl', 'wb') as pkl:
    pickle.dump(rnn_model, pkl)
with open('models/lstm_model.pkl', 'wb') as pkl:
    pickle.dump(lstm_model, pkl)
with open('models/cnn_model.pkl', 'wb') as pkl:
    pickle.dump(cnn_model, pkl)
with open('models/brnn_model.pkl', 'wb') as pkl:
    pickle.dump(brnn_model, pkl)
with open('models/textgen_model.pkl', 'wb') as pkl:
    pickle.dump(brnn_model, pkl)


FileNotFoundError: [Errno 2] No such file or directory: 'models/lstm_model.pkl'