# Introduction

This notebook was inspired by the following articles and repositories:

- Medium
    - How to build a Recurrent Neural Network in TensorFlow [[1]](https://medium.com/@erikhallstrm/hello-world-rnn-83cd7105b767)[[2]](https://medium.com/@erikhallstrm/tensorflow-rnn-api-2bb31821b185)[[3]](https://medium.com/@erikhallstrm/using-the-tensorflow-lstm-api-3-7-5f2b97ca6b73)[[4]](https://medium.com/@erikhallstrm/using-the-tensorflow-multilayered-lstm-api-f6e7da7bbe40)[[5]](https://medium.com/@erikhallstrm/using-the-dynamicrnn-api-in-tensorflow-7237aba7f7ea)[[6]](https://medium.com/@erikhallstrm/using-the-dropout-api-in-tensorflow-2b2e6561dfeb)  
    - [RNN example by Python](https://towardsdatascience.com/recurrent-neural-networks-by-example-in-python-ffd204f99470)       
- GitHub repos
    - [char-rnn-tensorflow](https://github.com/sherjilozair/char-rnn-tensorflow)
    - [RNN](https://github.com/WillKoehrsen/recurrent-neural-networks/tree/master/notebooks)
    - [KERAS for Humans](https://github.com/keras-team/keras/blob/master/examples/lstm_text_generation.py)
- Kaggle Repos
    - [Learn by example RNN/LSTM/GRU time series](https://www.kaggle.com/charel/learn-by-example-rnn-lstm-gru-time-series)
- Tutorials
    - Machine Learning Mastery: [How to Develop a Character-Based Neural Language Model in Keras](https://machinelearningmastery.com/develop-character-based-neural-language-model-keras/)
    - Adventures in Machine Learning: [Keras LSTM tutorial](http://adventuresinmachinelearning.com/keras-lstm-tutorial/)
- Troubleshooting
    - [Input size of the LSTM layer](https://github.com/keras-team/keras/issues/2045)

# Imports

In [27]:
from __future__ import print_function
from keras.callbacks import LambdaCallback
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import LSTM
from keras.optimizers import RMSprop
from keras.utils.data_utils import get_file
import numpy as np
import random
import sys
import io

# Functions

## Sample

In [15]:
def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
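
The effect of `temperature` is easier to see on a small distribution. The sketch below repeats the rescaling step of `sample` without the random draw; the helper name `apply_temperature` and the probabilities are made up for illustration.

```python
import numpy as np

def apply_temperature(preds, temperature=1.0):
    """Return the rescaled distribution that `sample` draws from."""
    preds = np.asarray(preds, dtype="float64")
    logits = np.log(preds) / temperature
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits)

probs = [0.5, 0.3, 0.2]  # made-up next-character probabilities
for t in (0.2, 1.0, 1.2):
    # Low temperature sharpens the distribution, high temperature flattens it.
    print(t, np.round(apply_temperature(probs, t), 3))
```

A temperature of 1.0 leaves the distribution unchanged. Values below 1.0 push probability mass towards the most likely character, and values above 1.0 spread it out, which is why the generated text gets noisier as the diversity increases.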

## Epoch End

In [16]:
def on_epoch_end(epoch, _):
    # Function invoked at end of each epoch. Prints generated text.
    print()
    print('----- Generating text after Epoch: %d' % epoch)

    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print('----- diversity:', diversity)

        generated = ''
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.

            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()
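
The core of `on_epoch_end` is a rolling window: predict one character, append it, drop the oldest character, repeat. The sketch below isolates that loop from Keras. It is a simplification under stated assumptions: `cyclic_predictor` is a hypothetical stand-in for `model.predict`, and the next character is picked greedily with `argmax` rather than with `sample`, so the result is deterministic.

```python
import numpy as np

def generate(seed, n_chars, predict_fn, chars):
    """Extend `seed` by `n_chars` characters using a rolling window."""
    char_indices = {c: i for i, c in enumerate(chars)}
    window = seed
    out = []
    for _ in range(n_chars):
        # One-hot encode the current window: shape (1, len(window), len(chars)).
        x_pred = np.zeros((1, len(window), len(chars)))
        for t, ch in enumerate(window):
            x_pred[0, t, char_indices[ch]] = 1.0
        probs = predict_fn(x_pred)                # shape (len(chars),)
        next_char = chars[int(np.argmax(probs))]  # greedy pick, not sample()
        out.append(next_char)
        window = window[1:] + next_char           # drop oldest, append newest
    return "".join(out)

def cyclic_predictor(x_pred):
    """Hypothetical model: favours the character after the last one, cyclically."""
    n = x_pred.shape[2]
    last = int(np.argmax(x_pred[0, -1]))
    probs = np.full(n, 0.1 / (n - 1))
    probs[(last + 1) % n] = 0.9
    return probs

print(generate("abc", 4, cyclic_predictor, chars=["a", "b", "c"]))  # abca
```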

## Read 

In [2]:
def readData(FileDir):
    # Read the whole corpus from a plain-text file (e.g. input.txt)
    # and return it as a single string.
    with open(FileDir, 'r') as f:
        data = f.read()
    return data

# Data

## I/O

In [7]:
fileDir = 'data/rnn/input.txt'
text = readData(fileDir)

In [8]:
print('corpus length:', len(text))

corpus length: 4994


In [11]:
chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

total chars: 49
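
The two dictionaries are inverse lookups between characters and integer indices. A minimal round trip on a made-up toy string (the `_demo` names and `c2i`/`i2c` are illustrative, not part of the notebook):

```python
text_demo = "hello"  # made-up toy corpus
chars_demo = sorted(set(text_demo))            # ['e', 'h', 'l', 'o']
c2i = {c: i for i, c in enumerate(chars_demo)}
i2c = {i: c for i, c in enumerate(chars_demo)}

encoded = [c2i[c] for c in text_demo]          # [1, 0, 2, 2, 3]
decoded = "".join(i2c[i] for i in encoded)     # back to 'hello'
print(encoded, decoded)
```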


## Cleaning

### Formatting

In [12]:
# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

nb sequences: 1652
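
As a quick sanity check, the number of sequences follows directly from the corpus length, `maxlen`, and `step`. The helper name `count_windows` is made up for this sketch.

```python
import math

def count_windows(text_len, maxlen, step):
    """Number of (window, next-char) pairs produced by the slicing loop."""
    # Same iteration space as: for i in range(0, text_len - maxlen, step)
    return len(range(0, text_len - maxlen, step))

n = count_windows(text_len=4994, maxlen=40, step=3)
print(n)  # 1652, matching the output above
# Closed form: ceil((text_len - maxlen) / step)
assert n == math.ceil((4994 - 40) / 3)
```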


### Convert Text to Sequence

The goal is to split the text into sequences of **n** characters, where each character is represented by a number: its index in the vocabulary (`char_indices`).

Word-level approaches typically use a Tokenizer to convert sequences of words (strings) into sequences of integers. Here the preprocessing works directly on characters and one-hot encodes each one.

The sequence length `maxlen` is the number of characters fed into the network as features, with the next character as the label. For example, with `maxlen = 40`, the model takes in 40 characters as features and the 41st character as the label.

We can make multiple training examples from the same text by slicing at different points. With a step of 1, we would use the first 40 characters as features with the 41st as the label, then the 2nd through 41st as features and the 42nd as the label, and so on. The cell above uses `step = 3`, so each window starts three characters after the previous one. This gives us many more examples to train on, which generally helps the model.
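
The slicing scheme described above can be sketched on a toy sequence. The function name `make_examples` and the tokens are made up; the same logic applies whether the tokens are characters or words.

```python
def make_examples(tokens, training_length, step=1):
    """Slice `tokens` into (features, label) pairs with a sliding window."""
    examples = []
    for i in range(0, len(tokens) - training_length, step):
        features = tokens[i: i + training_length]
        label = tokens[i + training_length]
        examples.append((features, label))
    return examples

tokens = list("abcdef")  # made-up toy sequence
for features, label in make_examples(tokens, training_length=3):
    print(features, "->", label)
```

With six tokens and a training length of 3, this yields three examples. A larger `step` yields fewer, less overlapping examples.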

In [13]:
print('Vectorization...')
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Vectorization...
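
For larger corpora the nested Python loops can become slow. One possible alternative, sketched below, builds the same one-hot arrays with NumPy index arrays. The function name `one_hot_sequences` is made up, and the sketch assumes every sentence has the same length, as is the case for the windows built above.

```python
import numpy as np

def one_hot_sequences(sentences, next_chars, char_indices):
    """One-hot encode equal-length sentences and their next characters."""
    n, maxlen, vocab = len(sentences), len(sentences[0]), len(char_indices)
    # Integer index of every character: shape (n, maxlen).
    idx = np.array([[char_indices[c] for c in s] for s in sentences])
    x = np.zeros((n, maxlen, vocab), dtype=bool)
    # Broadcast row and time indices against idx to set one entry per character.
    x[np.arange(n)[:, None], np.arange(maxlen)[None, :], idx] = True
    y = np.zeros((n, vocab), dtype=bool)
    y[np.arange(n), [char_indices[c] for c in next_chars]] = True
    return x, y

# Toy check with a three-character vocabulary.
x_demo, y_demo = one_hot_sequences(["ab", "bc"], ["c", "a"], {"a": 0, "b": 1, "c": 2})
print(x_demo.shape, y_demo.shape)  # (2, 2, 3) (2, 3)
```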


# Model Generation

This is where you design the network, and where you can swap the LSTM for a GRU or a simple RNN.

The rest of the code only prepares the data so that it can be fed into the model.

Text is sampled from the model at the end of each epoch.
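
One way to make the cell type swappable is a small builder function. This is only a sketch: `build_model` and its `cell` argument are hypothetical names, it assumes `LSTM`, `GRU`, and `SimpleRNN` from `keras.layers`, and it uses the `'rmsprop'` optimizer string, which means the optimizer's default learning rate rather than the 0.01 used in the cells below.

```python
def build_model(cell, maxlen, n_chars, units=128):
    """Build a character model with one recurrent layer of the given type."""
    cell_names = {"lstm": "LSTM", "gru": "GRU", "rnn": "SimpleRNN"}
    if cell not in cell_names:
        raise ValueError("unknown cell type: %r" % (cell,))
    # Imported here so the name check above runs even without Keras installed.
    from keras import layers
    from keras.models import Sequential

    recurrent = getattr(layers, cell_names[cell])
    model = Sequential()
    model.add(recurrent(units, input_shape=(maxlen, n_chars)))
    model.add(layers.Dense(n_chars, activation="softmax"))
    # 'rmsprop' uses the optimizer's default learning rate.
    model.compile(loss="categorical_crossentropy", optimizer="rmsprop")
    return model
```

For example, `build_model('gru', maxlen, len(chars))` would give a GRU counterpart of the single-layer model.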

## Multiple Layers

In [32]:
# build the model: two stacked LSTM layers
print('Build model...')
model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape=(maxlen, len(chars))))
model.add(LSTM(128, input_shape=(maxlen, len(chars))))

model.add(Dense(len(chars), activation='softmax'))

optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
print(model.summary())



Build model...
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_10 (LSTM)               (None, 40, 128)           91136     
_________________________________________________________________
lstm_11 (LSTM)               (None, 128)               131584    
_________________________________________________________________
dense_4 (Dense)              (None, 49)                6321      
Total params: 229,041
Trainable params: 229,041
Non-trainable params: 0
_________________________________________________________________
None


## Single Layer

In [34]:
# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars), activation='softmax'))

optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
print(model.summary())

Build model...
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_12 (LSTM)               (None, 128)               91136     
_________________________________________________________________
dense_5 (Dense)              (None, 49)                6321      
Total params: 97,457
Trainable params: 97,457
Non-trainable params: 0
_________________________________________________________________
None
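
The parameter counts in the two summaries above can be checked by hand. The sketch assumes the standard LSTM layout (four gates, each with input weights, recurrent weights, and a bias vector) and a dense layer with bias; the helper names are made up for illustration.

```python
def lstm_params(units, input_dim):
    # 4 gates x (input weights + recurrent weights + bias)
    return 4 * (units * input_dim + units * units + units)

def dense_params(units, input_dim):
    # weight matrix + bias
    return input_dim * units + units

n_chars, units = 49, 128
first = lstm_params(units, n_chars)     # 91136
second = lstm_params(units, units)      # 131584
dense = dense_params(n_chars, units)    # 6321
print(first + dense)                    # 97457  (single layer)
print(first + second + dense)           # 229041 (two layers)
```

The second LSTM layer is larger than the first because its input is the 128-dimensional output of the first layer rather than a 49-dimensional one-hot vector.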


# Training the RNN - LSTM

The data is already one-hot encoded (`x` and `y` above). Note that the `fit` calls below train on all of it; no validation set is held out.

In [17]:
print_callback = LambdaCallback(on_epoch_end=on_epoch_end)

In [18]:
np.shape(x)

(1652, 40, 49)

In [19]:
np.shape(y)

(1652, 49)

## Training Single Layer

In [25]:
model.fit(x, y,
          batch_size=128,
          epochs=60,
          callbacks=[print_callback])

Epoch 1/60

----- Generating text after Epoch: 0
----- diversity: 0.2
----- Generating with seed: " took for granted everything we had
As i"
 took for granted everything we had
As i                                                                                                                                                                                                                                                                                                                                                                            e                                   
----- diversity: 0.5
----- Generating with seed: " took for granted everything we had
As i"
 took for granted everything we had
As i      oe to e l  e    ei eoe e  i  u   t       u        d   h       e         o      rt   e   ybl  h   dteoi  eyt   i     l   o
    ee  e oe   e  dio e       oe  
 e e olt  tr o      t e  e n        e e    o      y e    e       ie e oei   e ai eo e h   e    io e  t a e   'te  uyt   te 

atlwlwl thye ifpter ykis slyot?me yow uhstbavh  hesd norwy yoth d un tldhay
flatwanikidl w ae me
yoe t 
for nleot yaaelly e'
Bwhan b' f ne yow d yowhehy'weeTas ts lin 
owt Flskin  patWar yr y wytOi litomeps mw Ioe layydcisw me Ill dyve a'T fin ake shAnir wlyeo ow s yoknht aoi oboroslm ghawd you  ue s me mT then dy byiotl yeit yofoy yowe y yob
----- diversity: 1.2
----- Generating with seed: "our own)
And all of the things I've been"
our own)
And all of the things I've beenine meliAt'snwn0va'0of du'h
rt
h  tom hor'thw n0 yowoI d int thelneeb ifefotbd mes ySlplyval tnm 
wanh toy  t ehadyy'lf ame dwy gfs tdbi'd totevihone gen
sWe bet yoaeruhwythmyl
ome be'nyey(s yfmei Imaiofld
oeat fknn
ffy ou mt m tewar'sSalouone mily That b y s deldofe To ln y'tgintly yorttttB, t dreisubjnvow lnOhm siuy
Wie me min yts' yeAnsda fo'v myone y wayofd  t oyjunf yhe

0r
yhle y Tyesdms'boi
Epoch 5/60

----- Generating text after Epoch: 4
----- diversity: 0.2
----- Generating with seed: "ow (feeling right now) 

Whan s me ane ane ou the than s me all ou the than s me ou the than s me I've an s me an  ou the the than s now the the then s methand withan s me all ou than s me ou than s me an wan s me ame ou the than s me ine and ou the t
----- diversity: 0.5
----- Generating with seed: "new
What am I doing without you?
I'm onl"
new
What am I doing without you?
I'm onl our owe
Oha then oue an s me homebeby you? ng he thon s me wan ou thal n sher owe ou than I wou t meb I've and ou thing withing hot me ame and ou that alling ou an I me ane the the thon
And dowr
Ohhand ou then our
And wat an  ou dowithand your ound
On' hithang ou
Sou?t you than owead of an our I meas now an owithants me an wans dowing at ane wine e't bee be wat an alling ou
And hit things you tha
----- diversity: 1.0
----- Generating with seed: "new
What am I doing without you?
I'm onl"
new
What am I doing without you?
I'm onl ho is wurt
an  ou
I'vy Whings foptBe
Ihin aywagn
Gu'
Auow sou? an whing hinhs I'b newind we thant nm fan I 

Now I'm on't ing
gf leow
you?
ont owises doe bod
BuPs been you hine
I'l ufer kou's yoe
Sone on ait pevighouh
Whr zow
And wu me ke 
or I me wo e you
op satha(duwI
sallitut yop?in kngtone wam I wat ee eede na ne ay u know
Ang  ep ke'ma you t om ay your wonflen semnY Yom that im wo nu wetsa kowh
urlithe the togrg
ne ou beangner no
wA
the pet sekutBr0 ow your own
Aha kave trap non terle Ie kin t I Fon
Mhise 
Epoch 12/60

----- Generating text after Epoch: 11
----- diversity: 0.2
----- Generating with seed: "mebody wants your love
Baby open the doo"
mebody wants your love
Baby open the door
And I'll the thon
And I'll the time I've bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop bop b
----- di

Somebody needs your ithout you tann wht aomebody needs you ling as I've been I'l let you t ing t me know

on s the tom bots amn then s eray
Ohan I boing without you?
And all the t
me
and with you t ant me mand you?
And all the
----- diversity: 1.0
----- Generating with seed: "n here outside my door
And all of the ti"
n here outside my door
And all of the time I've thet ne one e dot a witht your I chingiwe
Whan's for
Tom If yout you?
Bn a puat you
Somebody nee kin 
ur hou my noe
So bon bown
Hayle out e payby yoras wa
Somibeds ilg the t
ing t me an was nw oor mop

Ao liggt in'the boon
And all of this is of do bligot

ohe
 am rod
Atlleteay res of sou t ing aimhour you lingt
Oow Yan a beas iow

on d ing for
gor gome
on 'l ats ufort mive tots igh you lit
----- diversity: 1.2
----- Generating with seed: "n here outside my door
And all of the ti"
n here outside my door
And all of the time I'm loory
Yhan loig
eld ou sayku
Oo ise know
tha ts alave my litt you tors our ilit thp tovt
Somebodythe 

KeyboardInterrupt: 

## Training Multilayer

In [33]:
# multiple LSTM layers
model.fit(x, y,
          batch_size=128,
          epochs=60,
          callbacks=[print_callback])

Epoch 1/60

----- Generating text after Epoch: 0
----- diversity: 0.2
----- Generating with seed: "eave me alone
Don't you turn out the lig"
eave me alone
Don't you turn out the lig  e    o                    o        o r     o    o              ao                     o                            o        n  o                       e    n    o                    o  a       n               t           n             o                                n         n   o       a   n                  no     o  n                      n                              n            o       
----- diversity: 0.5
----- Generating with seed: "eave me alone
Don't you turn out the lig"
eave me alone
Don't you turn out the lig d n rl   n  eo b o
 ao ao   onwon o  loranon o  onofyos    waonot  n n    e   m ta       nan n oo hrotn  n o nwi s   ode  nba n enhnenoo oe n  o o   a o e   r  l ton  nts  s  sn  
o  gn o
no o a     aotrh  we l u doo   n        nne 
 ton ntey f  odenn  oone   in o  e    donotnhsno eni

KeyboardInterrupt: 