## (1) What is LSTM?

When using RNNs for text, in both character-level and word-level models, the text is encoded into integers for the model to process. In character-level models, we encode each letter as a one-hot vector over the corpus of letters. In word-level models, we encode each word as a one-hot vector over the corpus of words.

#### One-Hot Vector Encoding Example in a Character-Level Network
> Corpus: [‘a’, ‘b’, ‘c’, ‘d’, ‘e’, ‘f’, ‘g’, ‘h’, ‘i’]; len(Corpus) = 9 <br>
> Word: [‘bad’] → [‘b’, ‘a’, ‘d’] <br>
> b → [0, 1, 0, 0, 0, 0, 0, 0, 0], a → [1, 0, 0, 0, 0, 0, 0, 0, 0], d → [0, 0, 0, 1, 0, 0, 0, 0, 0]
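
The encoding above can be reproduced in a few lines of plain Python. This is only a minimal sketch: the names `char_to_index` and `one_hot` are made up for illustration and are not used elsewhere in this notebook.

```python
# Minimal sketch: one-hot encode the characters of a word over a small corpus.
corpus = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
char_to_index = {ch: i for i, ch in enumerate(corpus)}  # hypothetical helper mapping


def one_hot(ch):
    """Return a list of len(corpus) zeros with a single 1 at the index of ch."""
    vector = [0] * len(corpus)
    vector[char_to_index[ch]] = 1
    return vector


encoded = [one_hot(ch) for ch in 'bad']
for ch, vector in zip('bad', encoded):
    print(ch, '->', vector)
```

Each vector has exactly as many entries as the corpus has characters, which is why the length of every row must equal `len(corpus)`.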
 
Once we have encoded the text into vectors, it is time to train the model using an LSTM. In this code example, we will use the Keras library, built on top of TensorFlow, to greatly simplify this task.

## (2) LSTM Application


LSTM networks can model the sequential and temporal aspects of data, and hence have been widely used for text, video, and time series. A few applications are language modeling, text classification, machine translation, dialog systems, question answering, speech recognition, translating videos to natural language, image caption generation, handwriting recognition and generation, natural language generation, and anomaly detection.

In [1]:
# Make sure to import torch before all the other libraries (otherwise, it gives you errors)

import torch
import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt
import os
import tensorflow as tf
import keras
import timeit
import shap
import eli5
from eli5.sklearn import PermutationImportance
import graphviz
from sklearn.model_selection import train_test_split, StratifiedKFold 
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.utils import class_weight
from keras.models import Sequential
from keras.layers import Dense, Activation,Dropout, BatchNormalization
from keras.callbacks import EarlyStopping
from keras import optimizers
from keras import backend as K
from keras.optimizers import RMSprop
from keras.callbacks import LambdaCallback
import random
from keras.layers import LSTM
from keras.callbacks import ModelCheckpoint
from keras.utils import np_utils
import sys
import io


Using TensorFlow backend.


In [6]:
# Join every job description into one lowercase string
description = df.long_description
description = ' '.join(map(str, description)).lower()
# Split on line breaks and re-join with spaces to get rid of the "\n\n" runs
description = ' '.join(description.splitlines())


In [7]:
## clean up

puncts = [',', '.', '"', ':', ')', '(', '-', '!', '?', '|', ';', "'", '$', '&', '/', '[', ']', '>', '%', '=', '#', '*', '+', '\\', '•',  '~', '@', '£', 
 '·', '_', '{', '}','\n','\r','©', '^', '®', '`',  '<', '→', '°', '€', '™', '›',  '♥', '←', '×', '§', '″', '′', 'Â', '█', '½', 'à', '…', 
 '“', '★', '”', '–', '●', 'â', '►', '−', '¢', '²', '¬', '░', '¶', '↑', '±', '¿', '▾', '═', '¦', '║', '―', '¥', '▓', '—', '‹', '─', 
 '▒', '：', '¼', '⊕', '▼', '▪', '†', '■', '’', '▀', '¨', '▄', '♫', '☆', 'é', '¯', '♦', '¤', '▲', 'è', '¸', '¾', 'Ã', '⋅', '‘', '∞', 
 '∙', '）', '↓', '、', '│', '（', '»', '，', '♪', '╩', '╚', '³', '・', '╦', '╣', '╔', '╗', '▬', '❤', 'ï', 'Ø', '¹', '≤', '‡', '√', ]

# Strip every listed symbol from the text
for punct in puncts:
    description = description.replace(punct, '')


In [8]:
description[:1000]

'one of the most commercially successful conversational ai companies in the world conversica is building the next generation of our artificial intelligence platform and we are looking for data science interns in the areas of natural language processing natural language generation and deep learning recognized by gartner inc harvard business review etc we are passionate driven selfstarting resourceful innovative collaborative and we get a lot done while having fun if that sounds like you then read on  conversica is seeking talented and passionate data scientists to help us evolve our artificial intelligence machine learning and natural language processing systems and technologies  your responsibilities  as conversicas data scientist you will work with other data scientists and engineers to solve and contribute to the innovative products we are building you will be responsible for developing improving and experimenting with methodologies and algorithms to support these efforts using machi

In [9]:
chars = sorted(list(set(description)))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))
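
To see what the two dictionaries do, here is a small round-trip sketch. The string `text` is a toy stand-in for the cleaned `description`, and `encoded` / `decoded` are names introduced only for this illustration.

```python
# Toy stand-in for the cleaned `description` string.
text = "data science"

# Same construction as in the cell above, written with dict comprehensions.
chars = sorted(set(text))
char_indices = {c: i for i, c in enumerate(chars)}
indices_char = {i: c for i, c in enumerate(chars)}

# Encode the text to integer ids, then decode the ids back to characters.
encoded = [char_indices[c] for c in text]
decoded = ''.join(indices_char[i] for i in encoded)

print(chars)
print(encoded)
print(decoded == text)
```

`char_indices` is what the vectorization step uses to fill the one-hot arrays, and `indices_char` is what the text generator uses to turn a sampled index back into a character.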

In [10]:
n_chars = len(description)
print ("Total Characters: ", n_chars)

Total Characters:  8098854


In [11]:
maxlen = 70
step = 5
sentences = []
next_chars = []
for i in range(0, len(description) - maxlen, step):
    sentences.append(description[i: i + maxlen])
    next_chars.append(description[i + maxlen])
print('Number of sequences:', len(sentences), "\n")

print(sentences[:10], "\n")
print(next_chars[:10])

Number of sequences: 1619757 

['one of the most commercially successful conversational ai companies in', 'f the most commercially successful conversational ai companies in the ', ' most commercially successful conversational ai companies in the world', ' commercially successful conversational ai companies in the world conv', 'ercially successful conversational ai companies in the world conversic', 'lly successful conversational ai companies in the world conversica is ', 'uccessful conversational ai companies in the world conversica is build', 'sful conversational ai companies in the world conversica is building t', 'conversational ai companies in the world conversica is building the ne', 'rsational ai companies in the world conversica is building the next ge'] 

[' ', 'w', ' ', 'e', 'a', 'b', 'i', 'h', 'x', 'n']
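
The windowing loop is easier to follow on a short string. The sketch below wraps the same slicing in a helper, `make_windows`, which is a hypothetical name and not part of the notebook, and also checks the sequence count against the notebook's own numbers.

```python
def make_windows(text, maxlen, step):
    """Return (windows, next_chars) using the same slicing as the loop above."""
    windows, next_chars = [], []
    for i in range(0, len(text) - maxlen, step):
        windows.append(text[i: i + maxlen])   # input: maxlen characters
        next_chars.append(text[i + maxlen])   # target: the character that follows
    return windows, next_chars


windows, targets = make_windows("machine learning", maxlen=7, step=3)
for window, target in zip(windows, targets):
    print(repr(window), '->', repr(target))

# The number of windows equals len(range(0, n_chars - maxlen, step)).
# With the notebook's values (8098854 characters, maxlen=70, step=5):
print(len(range(0, 8098854 - 70, 5)))
```

A smaller `step` produces more, heavily overlapping windows; a larger one shrinks the training set at the cost of skipping some contexts.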


In [12]:
# Use the builtin `bool`: the `np.bool` alias was deprecated in NumPy 1.20 and removed in 1.24.
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1
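
Because `x` and `y` are dense one-hot arrays, their memory footprint grows with the number of sequences, `maxlen`, and the vocabulary size. The sketch below is a rough estimate only: it assumes one byte per boolean element (which is how NumPy stores `bool`), and `vocab_size=40` is a placeholder because `len(chars)` is not printed in this notebook. The helper name `one_hot_bytes` is made up for the example.

```python
def one_hot_bytes(n_sequences, maxlen, vocab_size, bytes_per_element=1):
    """Bytes needed for x (n_sequences, maxlen, vocab_size) and y (n_sequences, vocab_size)."""
    x_bytes = n_sequences * maxlen * vocab_size * bytes_per_element
    y_bytes = n_sequences * vocab_size * bytes_per_element
    return x_bytes, y_bytes


# 1619757 sequences and maxlen = 70 come from the notebook output above;
# vocab_size = 40 is a placeholder, not a value taken from the notebook.
x_bytes, y_bytes = one_hot_bytes(1619757, 70, vocab_size=40)
print(f"x: {x_bytes / 1024**3:.2f} GiB, y: {y_bytes / 1024**3:.2f} GiB")
```

If the estimate exceeds the available RAM, increasing `step` or shortening `maxlen` are the simplest levers.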

In [13]:
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))
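
As a sanity check on the model size, the trainable parameter count can be worked out by hand. This sketch assumes the default Keras layer settings (biases enabled) and, again, uses a placeholder `vocab_size` of 40 since `len(chars)` is not shown; `lstm_params` and `dense_params` are hypothetical helper names.

```python
def lstm_params(input_dim, units):
    # Four gates (input, forget, cell candidate, output), each with
    # an input kernel, a recurrent kernel, and a bias vector.
    return 4 * (input_dim * units + units * units + units)


def dense_params(input_dim, units):
    # One weight per input/output pair plus one bias per output unit.
    return input_dim * units + units


vocab_size = 40  # placeholder, not taken from the notebook
total = lstm_params(vocab_size, 128) + dense_params(128, vocab_size)
print(total)
```

The result should agree with what `model.summary()` reports once the real vocabulary size is substituted; the `Activation` layer adds no parameters.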

In [14]:
# Note: newer Keras releases name this argument `learning_rate` instead of `lr`.
optimizer = RMSprop(lr=0.001)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

In [None]:
def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)

def on_epoch_end(epoch, logs):
    # Invoked at the end of each epoch; prints text generated by the current model.
    # `epoch` is zero-based, so the first report reads "Epoch: 0" (Keras logs it as "Epoch 1/5").
        print()
        print('----- Generating text after Epoch: %d' % epoch)

        start_index = random.randint(0, len(description) - maxlen - 1)
        for diversity in [0.4]:
            print('----- diversity:', diversity)

            generated = ''
            sentence = description[start_index: start_index + maxlen]
            generated += sentence
            print('----- Generating with seed: "' + sentence + '"')
            sys.stdout.write(generated)

            for i in range(700):
                x_pred = np.zeros((1, maxlen, len(chars)))
                for t, char in enumerate(sentence):
                    x_pred[0, t, char_indices[char]] = 1.

                preds = model.predict(x_pred, verbose=0)[0]
                next_index = sample(preds, diversity)
                next_char = indices_char[next_index]

                generated += next_char
                sentence = sentence[1:] + next_char

                sys.stdout.write(next_char)
                sys.stdout.flush()
            print()
            
generate_text = LambdaCallback(on_epoch_end=on_epoch_end)
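
The `temperature` argument of `sample` (called `diversity` in the callback) controls how adventurous the generated text is. The sketch below isolates the rescaling step and applies it to a made-up three-character distribution. It uses only the standard library, and `apply_temperature` is a hypothetical name rather than a function from the notebook.

```python
import math


def apply_temperature(probs, temperature):
    """Rescale a probability list the way `sample` does before drawing an index."""
    logits = [math.log(p) / temperature for p in probs]
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = [0.5, 0.3, 0.2]  # made-up next-character distribution
for t in (0.4, 1.0, 2.0):
    print(t, [round(p, 3) for p in apply_temperature(probs, t)])
```

A temperature below 1 sharpens the distribution toward the most likely character, 1.0 leaves it unchanged, and values above 1 flatten it. This is consistent with the repetitive but mostly well-formed text produced at diversity 0.4.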

In [None]:
# define the checkpoint
filepath = "weights.hdf5"
checkpoint = ModelCheckpoint(filepath, 
                             monitor='loss', 
                             verbose=1, 
                             save_best_only=True, 
                             mode='min')

# fit model using our gpu
with tf.device('/gpu:0'):
    model.fit(x, y,
              batch_size=128,
              epochs=5,
              verbose=2,
              callbacks=[generate_text, checkpoint])

Epoch 1/5
 - 1448s - loss: 1.7025

----- Generating text after Epoch: 0
----- diversity: 0.4
----- Generating with seed: " career can ultimately take you we empower you to do great work in a c"
 career can ultimately take you we empower you to do great work in a company opportunity to control and and statistics and a data engineering products with a sql strategics status and individual and and able the specific or interned service and products to enal and the needs to and a distriction and the employee of data and data and the related with opportunity experience with data and develop and and a production machine learning and create with a complex strong strategic products and and stateholofich content and and status of considere who develops and market including of internal statistics status of data and and analytics and and machine learning products and development of and strong operations and and and machine learning and data analysis and work and 

Epoch 00001: loss improved from in