# Meditations with LSTM

### Plan

1. Create a plan
1. Obtain the data
1. Analyze the data
1. Clean the data
1. Prepare the data
1. Create the LSTM model
1. Train the model
1. Use the model for predictions
1. Create the Python module
1. Bonus: set up your local machine for deep learning
1. Bonus: set up AWS for deep learning

### Prepare your environment

### Obtain the data

In [None]:
# Go to http://www.gutenberg.org/ebooks/2680
# Download Plain Text UTF-8 version
# Place it in the same directory as your Jupyter notebook

In [None]:
# Read the source file.

In [None]:
path = './meditations.txt'
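# The file was downloaded as UTF-8, so state the encoding explicitly
# and let the `with` block close the file for us.
with open(path, encoding='utf-8') as f:
    raw_source_text = f.read()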

### Analyze the data

In [None]:
# Scan the source file. What can cause problems while training our model?

In [None]:
start_from = 'THE FIRST BOOK'
end_at = 'APPENDIX'

In [None]:
# Do we have enough raw text for training?

In [None]:
len(raw_source_text)

In [None]:
# Find the index for the real start of the text.

In [None]:
raw_source_text.find(start_from)

In [None]:
# Let's inspect what we are removing.

In [None]:
raw_source_text[:raw_source_text.find(start_from)]

In [None]:
# Find the index for the real end of the text.

In [None]:
raw_source_text.find(end_at)

In [None]:
# That didn't work: `find` returns the first occurrence of the marker.
# Let's try again, searching from the end with `rfind`.

In [None]:
raw_source_text.rfind(end_at)

In [None]:
raw_source_text[raw_source_text.rfind(end_at):]
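
The difference between the two searches can be seen on a tiny example. This is a minimal sketch on a made-up string (the `sample` text is an assumption for illustration, not the Gutenberg file), in which the marker appears once early on and once as the real heading near the end:

```python
# Hypothetical miniature of a book: the marker shows up in a contents
# line and again as the actual heading near the end.
sample = "CONTENTS: BOOKS, APPENDIX | book text goes here | APPENDIX notes"

first = sample.find("APPENDIX")   # index of the first occurrence
last = sample.rfind("APPENDIX")   # index of the last occurrence

print(first, last)
print(sample[:last])              # everything before the final marker
print(sample.find("MISSING"))     # both methods return -1 when nothing matches
```

Because both methods return `-1` for a missing marker, it is worth checking the result before slicing with it.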

In [None]:
# Check the first 2000 characters of the real text.

In [None]:
raw_source_text[raw_source_text.find(start_from):raw_source_text.find(start_from)+2000]

### Clean the data

In [None]:
# Get rid of everything before and after the real text. 

In [None]:
prepared_text = raw_source_text[raw_source_text.find(start_from):raw_source_text.rfind(end_at)]

In [None]:
# Check the first 1000 characters.

In [None]:
prepared_text[:1000]

In [None]:
# Check the last 1000 characters.

In [None]:
prepared_text[-1000:]

In [None]:
# The first problem we are going to solve is the book titles.
# The Meditations consists of 12 books, and each of them begins with 'THE XXXX BOOK'.
# We don't want our model to learn these titles, so we are going to remove them.

In [None]:
import re

In [None]:
# Raw strings (the `r` prefix) are very useful for regular expressions.
# In a raw string, backslashes are kept literally instead of starting escape sequences,
# so there is no need to double the backslashes in patterns such as `\s` or `\w`.
prepared_text = re.sub(r'THE\s\w+\sBOOK\n', '', prepared_text)
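
To see what the pattern does, here is a small sketch on a made-up excerpt (the `sample` string only mimics the heading layout and is not the real text). It also shows what the `r` prefix changes:

```python
import re

# Made-up excerpt that mimics the heading layout; not the real text.
sample = "THE FIRST BOOK\nOf my grandfather...\nTHE SECOND BOOK\nRemember..."

# Same pattern as in the notebook: 'THE', one whitespace character,
# one word, one whitespace character, 'BOOK', and a newline.
cleaned = re.sub(r'THE\s\w+\sBOOK\n', '', sample)
print(cleaned)

# A raw string keeps the backslash as its own character.
print(len('\n'), len(r'\n'))
```

Note that `\w+` matches a single word, so a title with a different layout would be left untouched.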

In [None]:
prepared_text[:2000]

In [None]:
# There is another problem - chapter numbering.
# Each book of The Meditations has numbered chapters.
# Again, we don't want to train on this so let's remove it.

In [None]:
prepared_text = re.sub(r'\n[XVI]+\.\s', '', prepared_text)
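
The character class `[XVI]` covers only the numerals I, V and X, so it can express numbers up to XXXIX (39). The sketch below uses a made-up sample to show the effect; the `wider` pattern is an assumption offered as a possible alternative if the text turns out to contain higher chapter numbers, not something the notebook uses:

```python
import re

# Made-up sample with blank lines between chapters; not the real text.
sample = "\n\nI. First thought.\n\nXIV. Another thought.\n\nXLII. A later one."

narrow = r'\n[XVI]+\.\s'     # pattern from the notebook
wider = r'\n[IVXLC]+\.\s'    # hypothetical variant that also accepts L and C

print(repr(re.sub(narrow, '', sample)))
print(repr(re.sub(wider, '', sample)))
```

With the narrow pattern, `XLII.` survives because `L` is not in the character class.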

In [None]:
prepared_text[:2000]

In [None]:
# We are not done yet: there are a lot of \n and other special characters.
# First, print a list of all the unique characters in the text.

In [None]:
print(sorted(list(set(prepared_text))))

In [None]:
# It is common practice to convert the text to lowercase: this shrinks the vocabulary,
# which makes the task easier for the model.
# Let's do that right now.

In [None]:
prepared_text = prepared_text.lower()

In [None]:
# Remove all the characters except the whitelisted ones.

In [None]:
prepared_text = re.sub(r'[^\,\.\!\?\'\;\:\-a-z]+', ' ', prepared_text)
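
The whitelist pattern replaces every run of disallowed characters with a single space. A minimal sketch on a made-up string (the `sample` value is an assumption for illustration):

```python
import re

# Made-up input with newlines, quotes, digits and brackets.
sample = 'Hello,\n\n"World" (1900) -- it\'s fine!'.lower()

# Same pattern as in the notebook: anything that is not a lowercase
# letter or one of , . ! ? ' ; : - is collapsed into one space.
cleaned = re.sub(r'[^\,\.\!\?\'\;\:\-a-z]+', ' ', sample)
print(cleaned)
```

Because the pattern ends with `+`, consecutive newlines, digits and quotes never produce more than one space in a row.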

In [None]:
# Inspect the list of unique characters again.

In [None]:
print(sorted(list(set(prepared_text))))

In [None]:
len(sorted(list(set(prepared_text))))

In [None]:
# Check the first 1000 characters.

In [None]:
prepared_text[:1000]

In [None]:
# Total length of `prepared_text`.

In [None]:
len(prepared_text)

In [None]:
# Rename the `prepared_text` to `text`.

In [None]:
text = prepared_text

### Prepare the data

In [None]:
# Declare the variables that we are going to use a lot.

In [None]:
unique_chars = sorted(list(set(text)))
vocabulary_size = len(unique_chars)
char_to_index = {c: i for i, c in enumerate(unique_chars)}
index_to_char = {i: c for i, c in enumerate(unique_chars)}
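
The two dictionaries are inverses of each other, so encoding a string to indices and decoding it again should give the original back. A small sketch with a stand-in `text` (an assumption; the real `text` is the cleaned book):

```python
# Hypothetical stand-in for the cleaned book text.
text = "hello world"

unique_chars = sorted(set(text))
char_to_index = {c: i for i, c in enumerate(unique_chars)}
index_to_char = {i: c for i, c in enumerate(unique_chars)}

# Round trip: characters -> indices -> characters.
encoded = [char_to_index[c] for c in text]
decoded = ''.join(index_to_char[i] for i in encoded)

print(unique_chars)
print(encoded)
print(decoded)
```

Sorting the characters keeps the mapping stable between runs, which matters when saved weights are loaded later.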

In [None]:
# Decide what sentence length and step we are going to use.

In [None]:
sentence_length = 88
step = 3

In [None]:
# Make sentence -> next character pairs.

In [None]:
sentences = []
next_chars = []
for i in range(0, len(text)-sentence_length, step):
    sentences.append(text[i:i+sentence_length])
    next_chars.append(text[i+sentence_length])
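
The loop slides a window of `sentence_length` characters over the text, moving `step` characters each time, and records the character that follows each window. A toy run with small, made-up values makes the pairs easy to read:

```python
# Toy values chosen for readability; the notebook uses 88 and 3 on the book text.
text = "abcdefghij"
sentence_length = 4
step = 3

sentences = []
next_chars = []
for i in range(0, len(text) - sentence_length, step):
    sentences.append(text[i:i + sentence_length])   # model input
    next_chars.append(text[i + sentence_length])    # character to predict

for sentence, next_char in zip(sentences, next_chars):
    print(repr(sentence), '->', repr(next_char))
```

A smaller `step` produces more, heavily overlapping training pairs; a larger one produces fewer.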

In [None]:
# Total number of sentences.

In [None]:
len(sentences)

In [None]:
# Text vectorization.

In [None]:
import numpy as np

x_shape = len(sentences), sentence_length, vocabulary_size
y_shape = len(sentences), vocabulary_size

# `np.bool` was removed in NumPy 1.24; the built-in `bool` works in all versions.
x = np.zeros(shape=x_shape, dtype=bool)
y = np.zeros(shape=y_shape, dtype=bool)

for i, sentence in enumerate(sentences):
    y[i,char_to_index[next_chars[i]]] = 1
    for j, char in enumerate(sentence):
        x[i,j,char_to_index[char]] = 1
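
One-hot arrays grow quickly, which is one reason to store them as booleans (1 byte per element) rather than 64-bit floats (8 bytes). The sketch below is plain arithmetic; the sentence count and vocabulary size are made-up round numbers, not values measured from the book:

```python
def one_hot_bytes(num_sentences, sentence_length, vocabulary_size, bytes_per_item=1):
    """Size in bytes of a dense (sentences, length, vocabulary) array."""
    return num_sentences * sentence_length * vocabulary_size * bytes_per_item

# Hypothetical sizes for illustration only.
num_sentences = 100_000
sentence_length = 88
vocabulary_size = 35

as_bool = one_hot_bytes(num_sentences, sentence_length, vocabulary_size)
as_float64 = one_hot_bytes(num_sentences, sentence_length, vocabulary_size, bytes_per_item=8)

print(f"bool:    {as_bool / 1e6:.0f} MB")
print(f"float64: {as_float64 / 1e6:.0f} MB")
```

Running `x.nbytes` on the real array gives the actual figure.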

### Create the LSTM model

In [None]:
# Import the required libraries.

In [None]:
from keras.models import Sequential
from keras.layers import Dense, Activation, LSTM
from keras.optimizers import RMSprop

In [None]:
# Decide on the model parameters.

In [None]:
learning_rate = 0.01
batch_size = 512
epochs = 50
lstm_size = 128

In [None]:
# Create the model.

In [None]:
model = Sequential()
model.add(LSTM(lstm_size, input_shape=(sentence_length, vocabulary_size)))
model.add(Dense(vocabulary_size))
model.add(Activation('softmax'))
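
To get a feel for the size of this model, the number of trainable parameters can be worked out by hand. The sketch assumes the default layer settings (biases enabled) and a vocabulary of 35 characters, which is a guess at an upper bound from the whitelist rather than a measured value; `model.summary()` reports the real numbers.

```python
def lstm_params(units, input_dim):
    # Four gates, each with input weights, recurrent weights and a bias.
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    # One weight per input-output pair, plus one bias per output.
    return input_dim * units + units

lstm_size = 128
vocabulary_size = 35   # assumed value for illustration

lstm_total = lstm_params(lstm_size, vocabulary_size)
dense_total = dense_params(vocabulary_size, lstm_size)

print(lstm_total, dense_total, lstm_total + dense_total)
```

The sentence length does not appear in the formulas: the same LSTM weights are reused at every time step.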

In [None]:
# Create the optimizer.

In [None]:
optimizer = RMSprop(learning_rate)

In [None]:
# Compile the model.

In [None]:
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

### Train the model

In [None]:
# Train the model.

In [None]:
model.fit(x, y, batch_size=batch_size, epochs=epochs)

In [None]:
# Save the weights.

In [None]:
model.save_weights('./weights.hdf5')
# model.load_weights('./weights.hdf5')

### Use the model for predictions

In [None]:
# Get 10 random places in the text to use as inputs.

In [None]:
start_indices = np.random.choice(len(text) - sentence_length, 10)

In [None]:
start_indices

In [None]:
# Decide on prediction length and temperature.

In [None]:
prediction_length = 100
temperature = 0.3

In [None]:
# Make predictions.

In [None]:
for i in start_indices:
    print('-' * prediction_length)
    input_chars = text[i:i+sentence_length]
    print('Input:', input_chars)
    output_chars = ''
    for j in range(prediction_length):
        x_predict = np.zeros(shape=(1, sentence_length, vocabulary_size))
        for k, char in enumerate(input_chars):
            x_predict[0, k, char_to_index[char]] = 1
            
        probs = model.predict(x_predict, verbose=0)[0]
        probs = np.asarray(probs).astype('float64')
        probs = np.clip(probs, a_min=1e-32, a_max=None)
        not_probs = np.exp(np.log(probs) / temperature)
        adjusted_probs = not_probs / np.sum(not_probs)
        predicted_index = np.argmax(np.random.multinomial(1, adjusted_probs))
        predicted_char = index_to_char[predicted_index]
        output_chars += predicted_char
        input_chars = input_chars[1:] + predicted_char
        
    print('Output:', output_chars)
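
The temperature step raises each probability to the power `1 / temperature` and renormalizes. The sketch below isolates that step in a helper (`apply_temperature` is a hypothetical name, and it uses the standard `math` module instead of NumPy to stay dependency-free) and applies it to a made-up distribution:

```python
import math

def apply_temperature(probs, temperature):
    """Rescale a probability distribution as in the prediction loop."""
    # Clip to avoid log(0), then compute exp(log(p) / temperature).
    scaled = [math.exp(math.log(max(p, 1e-32)) / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]

# Made-up distribution over three characters.
probs = [0.5, 0.3, 0.2]

for temperature in (0.3, 1.0, 2.0):
    adjusted = apply_temperature(probs, temperature)
    print(temperature, [round(p, 3) for p in adjusted])
```

Temperatures below 1 sharpen the distribution toward the most likely character, a temperature of 1 leaves it unchanged, and temperatures above 1 flatten it, which makes the output more varied and more error-prone.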

### Create the Python module

In [None]:
# Convert the Jupyter notebook to a Python module for production use.

In [None]:
# see lstm.py

### Bonus: set up your local machine for deep learning

1. Install Anaconda from [here](https://www.anaconda.com/download)
1. Install TensorFlow from [here](https://www.tensorflow.org/install/)
    - If you have a Windows or Linux computer with a supported NVIDIA GPU, you can install TensorFlow with GPU support.
        - [Supported GPUs](https://developer.nvidia.com/cuda-gpus)
1. Create the deep-learning environment by running:
    - Windows: `conda env create -f environment_windows.yml`
    - Windows GPU: `conda env create -f environment_windows_gpu.yml`
    - Linux: `conda env create -f environment_linux.yml`
    - Linux GPU: `conda env create -f environment_linux_gpu.yml`
    - macOS: `conda env create -f environment_macos.yml`
    - macOS GPU: NA (As of version 1.2, TensorFlow no longer provides GPU support on macOS.)
1. Activate the environment by running:
    - Windows: `activate deep_learning`
    - Linux: `source activate deep_learning`
    - macOS: `source activate deep_learning`

### Bonus: set up AWS for deep learning

1. Create an [AWS](https://aws.amazon.com) account or log in to [AWS](https://aws.amazon.com)
1. Review the limits for `p2.xlarge` instances [here](https://console.aws.amazon.com/ec2/v2/home?#Limits)
1. If your current limit is 0:
    - Click `Request limit increase`
    - Fill in all the fields in the form:
        - For the "Region" field, select your preferred region (I recommend US East (Northern Virginia))
        - For the "New limit value", enter `1` (You will not get more anyway, and asking for more might make you wait longer)
        - For the "Use Case Description", enter: "Studying deep learning with Deep Learning Tribe"
    - Wait up to 48 hours for the limit increase
1. Visit [EC2 Management Console](https://console.aws.amazon.com/ec2/v2/home)
1. Click `Launch Instance`
1. Select `AWS Marketplace` (in the menu on the left)
1. Search for `Deep Learning AMI (Ubuntu)`
1. Click `Select`
1. Click `Continue`
1. Select `6. Configure Security Group` (in the menu on top)
1. If you don't have a security group with port 8888 open:
    - Select `Create a new security group`
    - Click `Add Rule`
        - Make sure that `Type` is set to `Custom TCP Rule`
        - Set `Port Range` to `8888`
        - Enter `For running Jupyter on port 8888` in the `Description` field
1. Click `Review and Launch`
1. Click `Launch`