# Create a sequence generator with RNN

In this notebook, we develop a Recurrent Neural Network (RNN) to create a Galician-language sequence generator. We use text from the Galician politician Beiras.
We test different RNN architectures:
* LSTM
* GRU
* GRU + Dropout

We also test ModelCheckpoint during training.

The best network for this case is a GRU with 3 layers.

This work is based on
https://github.com/udacity/aind2-rnn/blob/master/RNN_project.ipynb

## Load the data
First we load the data and preprocess it:
* Convert to lowercase
* Remove lines with http links
* Remove symbols: '[ºªàâäçèêïìôöü&%@•…«»”“*/!"(),.:;_¿¡¿‘’´\[\]\']'
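
The preprocessing steps above can be sketched roughly as follows. This is only an illustration: the real logic lives in `load_text` in `beiras_aux`, which is not shown here, so the function name `clean_text` and the choice to replace each symbol with a space (rather than delete it) are assumptions.

```python
# Symbols from the list above, written as a plain character set (assumed equivalent).
SYMBOLS = 'ºªàâäçèêïìôöü&%@•…«»”“*/!"(),.:;_¿¡‘’´[]\''

def clean_text(raw):
    """Lowercase the text, drop lines with http links, strip the listed symbols."""
    # Keep only lines that do not contain a link.
    kept = [line for line in raw.lower().splitlines() if 'http' not in line]
    text = ' '.join(kept)
    # Assumption: each symbol becomes a space so that words are not glued together.
    text = text.translate(str.maketrans({c: ' ' for c in SYMBOLS}))
    # Collapse runs of whitespace left behind by the removals.
    return ' '.join(text.split())

print(clean_text('Ola, Mundo!\nver http://x.gal\n«Adeus»'))
```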


In [3]:
import sys
sys.path.insert(0, '../aux/')
import numpy as np
from beiras_aux import load_text,predict_next_chars,print_predicctions


In [5]:
window_size = 100
step_size = 1
X,y,chars,chars_to_indices,indices_to_chars,text_clean=load_text('../data/Beiras.txt',window_size,step_size);

* X: array of shape (sentences, window_size, num_chars); input for training.
* y: array of shape (sentences, num_chars); output for training.
* chars: the characters present in the clean text.
* chars_to_indices, indices_to_chars: dictionaries that map each character to its index and each index back to its character.
* text_clean: the full cleaned text.
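
To make these shapes concrete, here is a minimal sketch of how a sliding window with one-hot encoding could produce arrays like X and y. It assumes `load_text` works along these lines; `window_transform` is a hypothetical name and not part of `beiras_aux`.

```python
import numpy as np

def window_transform(text, window_size, step_size):
    """Slide a window over text; each window predicts the character that follows it."""
    chars = sorted(set(text))
    chars_to_indices = {c: i for i, c in enumerate(chars)}
    indices_to_chars = {i: c for i, c in enumerate(chars)}
    # Every start position that leaves room for one target character after the window.
    starts = range(0, len(text) - window_size, step_size)
    X = np.zeros((len(starts), window_size, len(chars)), dtype=bool)
    y = np.zeros((len(starts), len(chars)), dtype=bool)
    for n, start in enumerate(starts):
        for t, c in enumerate(text[start:start + window_size]):
            X[n, t, chars_to_indices[c]] = True   # one-hot input character
        y[n, chars_to_indices[text[start + window_size]]] = True  # one-hot target
    return X, y, chars, chars_to_indices, indices_to_chars

X_demo, y_demo, *_ = window_transform('abcabcab', window_size=3, step_size=1)
print(X_demo.shape, y_demo.shape)
```

With window_size = 100 and step_size = 1, a scheme like this yields one training sample per character position, which is why the real X is large.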


## Check that we have a GPU
I used a g2.2xlarge EC2 machine. Without a GPU, training is too slow.

In [4]:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()

[name: "/cpu:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 13559481850202299021, name: "/gpu:0"
 device_type: "GPU"
 memory_limit: 28573696
 locality {
   bus_id: 1
 }
 incarnation: 16123509706186973612
 physical_device_desc: "device: 0, name: GRID K520, pci bus id: 0000:00:03.0"]

## Simple model
* LSTM(200)
* Dense()

In [17]:
# necessary functions from the keras library
from keras.models import Sequential
from keras.layers import Dense, Activation, LSTM
from keras.optimizers import RMSprop

# Build the RNN model: a single LSTM hidden layer followed by a softmax output,
# trained with categorical_crossentropy loss
def create_simple_model(chars):
    num_chars = len(chars)
    model= Sequential()
    # Layer 1: LSTM with 200 hidden units
    model.add(LSTM(200,input_shape = (window_size,num_chars)))
    # Layer 2: Dense with one unit per character and softmax activation
    model.add(Dense(num_chars,activation='softmax'))
    # initialize optimizer
    optimizer = RMSprop(lr=0.001, rho=0.9, epsilon=1e-08, decay=0.0)
    # compile the model with the optimizer defined above
    model.compile(loss='categorical_crossentropy', optimizer=optimizer)
    return model


Train on a small dataset as a test

In [5]:
Xsmall = X[:10000,:,:]
ysmall = y[:10000,:]

In [None]:
# train the model
model=create_simple_model(chars)
model.fit(Xsmall, ysmall, batch_size=500, epochs=40,verbose = 1)

# save weights
model.save_weights('../model_weights/best_beiras_small_textdata_weights.hdf5')

In [18]:

# Train
model=create_simple_model(chars)
model.fit(X, y, batch_size=500, epochs=30,verbose = 1)

# save weights
model.save_weights('../model_weights/best_beiras_large_textdata_weights.hdf5')



Epoch 1/30
...
Epoch 30/30


In [18]:
# Print predictions
model=create_simple_model(chars)
print_predicctions(model,'../model_weights/best_beiras_large_textdata_weights.hdf5'
                   ,chars_to_indices,indices_to_chars,text_clean,window_size)


------------------
input chars = 
pla panfletaria contra as leoninas taxas impostas polo ministro de xustiza actual malia que vulneran"

predicted chars = 
 estaban a contra de contra de contra de contra de contra de contra de contra de contra de contra de"

------------------
input chars = 
poema de rosalía titulado a xusticia pola man e dado á luz no seu libro follas novas por certo que s"

predicted chars = 
e desenvolver a crise de descomposición do seu contra de contra de contra de contra de contra de con"

------------------
input chars = 
se moito cando dixen eu que as suas políticas agresoras do común cidadán matan e a sua cospedal alcu"

predicted chars = 
ñadora do seu contra da contradición nacional e a contradición nacional e a contradición nacional e "
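
The repeated phrases in the predictions above ("contra de contra de ...") are what one would expect if the next character is always chosen greedily, since the same window then always leads to the same continuation. The sketch below shows such a greedy loop. It is an assumption about how `print_predicctions` and `predict_next_chars` might work, not their actual code; `generate_greedy` and `predict_proba` are hypothetical names, and `predict_proba` stands in for something like `model.predict`.

```python
import numpy as np

def generate_greedy(predict_proba, seed, n_chars, chars_to_indices, indices_to_chars):
    """Generate n_chars characters by always picking the most probable next one."""
    window = seed
    generated = []
    num_chars = len(chars_to_indices)
    for _ in range(n_chars):
        # One-hot encode the current window as a batch of size 1.
        x = np.zeros((1, len(window), num_chars))
        for t, c in enumerate(window):
            x[0, t, chars_to_indices[c]] = 1.0
        probs = predict_proba(x)[0]
        next_char = indices_to_chars[int(np.argmax(probs))]
        generated.append(next_char)
        # Slide the window: drop the oldest character, append the new one.
        window = window[1:] + next_char
    return ''.join(generated)
```

Because the choice is deterministic, once a window repeats the output cycles forever, which matches the loops seen with the simple model.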



## Complex network
* LSTM(200)
* LSTM(200)
* Dense

It performs better than the simple model.

In [20]:
def create_complex_model(chars):
    num_chars = len(chars)
    model= Sequential()
    # Layer 1: LSTM with 200 hidden units, returning the full sequence for the next LSTM
    model.add(LSTM(200,input_shape = (window_size,num_chars),return_sequences=True))
    # Layer 2: LSTM with 200 hidden units
    model.add(LSTM(200))
    # Layer 3: Dense with one unit per character and softmax activation
    model.add(Dense(num_chars,activation='softmax'))
    # initialize optimizer
    optimizer = RMSprop(lr=0.001, rho=0.9, epsilon=1e-08, decay=0.0)
    # compile the model with the optimizer defined above
    model.compile(loss='categorical_crossentropy', optimizer=optimizer)
    return model

In [None]:
# Train
model=create_complex_model(chars)
model.summary()
print(X.shape)
model.fit(X, y, batch_size=500, epochs=30,verbose = 1)

# save weights
model.save_weights('../model_weights/best_beiras_complex_textdata_weights.hdf5')

In [21]:
# Print predictions
model=create_complex_model(chars)
print_predicctions(model,'../model_weights/best_beiras_complex_textdata_weights.hdf5'
                   ,chars_to_indices,indices_to_chars,text_clean,window_size)


------------------
input chars = 
pla panfletaria contra as leoninas taxas impostas polo ministro de xustiza actual malia que vulneran"

predicted chars = 
 con el e máis a sua parte do partido galeguista e a memória de aquil mesmo contro con este senso má"

------------------
input chars = 
poema de rosalía titulado a xusticia pola man e dado á luz no seu libro follas novas por certo que s"

predicted chars = 
e acaso por ser o que estaban a algúns dos colexios e máis a mariña de anos antes de morte ao pé do "

------------------
input chars = 
se moito cando dixen eu que as suas políticas agresoras do común cidadán matan e a sua cospedal alcu"

predicted chars = 
ñada polo proprio país e a sua propria conciencia social e política- e a construción dun proxecto es"



## Complex network with GRU
* GRU(200)
* GRU(200)
* Dense

It performs better than the LSTM network.

In [8]:
from keras.models import Sequential
from keras.layers import Dense, GRU
from keras.optimizers import RMSprop

def create_gru_model(chars):
    num_chars = len(chars)
    model= Sequential()
    # Layer 1: GRU with 200 hidden units, returning the full sequence for the next GRU
    model.add(GRU(200,input_shape = (window_size,num_chars),return_sequences=True))
    # Layer 2: GRU with 200 hidden units
    model.add(GRU(200))
    # Layer 3: Dense with one unit per character and softmax activation
    model.add(Dense(num_chars,activation='softmax'))
    # initialize optimizer
    optimizer =RMSprop(lr=0.001, rho=0.9, epsilon=1e-08, decay=0.0)
    # compile the model with the optimizer defined above
    model.compile(loss='categorical_crossentropy', optimizer=optimizer)
    return model

In [None]:
#Train
model=create_gru_model(chars)
model.summary()
print(X.shape)
model.fit(X, y, batch_size=500, epochs=30,verbose = 1)

# save weights
model.save_weights('../model_weights/best_beiras_gru_textdata_weights.hdf5')

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
gru_1 (GRU)                  (None, 100, 200)          153600    
_________________________________________________________________
gru_2 (GRU)                  (None, 200)               240600    
_________________________________________________________________
dense_6 (Dense)              (None, 55)                11055     
Total params: 405,255
Trainable params: 405,255
Non-trainable params: 0
_________________________________________________________________
(1174280, 100, 55)
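
The parameter counts in the summary above can be checked by hand. The sketch below assumes the classic GRU parameterization with three weight sets (update gate, reset gate, candidate state) and a single bias vector per set, which reproduces the numbers shown. Newer Keras versions may add a second bias per set and report a different count. The helper names are illustrative only.

```python
def gru_params(input_dim, units):
    # Three weight sets, each with an input kernel, a recurrent kernel and a bias.
    return 3 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # One weight per input/unit pair plus one bias per unit.
    return input_dim * units + units

num_chars, units = 55, 200
layer_params = [
    gru_params(num_chars, units),    # gru_1 reads one-hot characters
    gru_params(units, units),        # gru_2 reads the output of gru_1
    dense_params(units, num_chars),  # softmax over the characters
]
print(layer_params, sum(layer_params))
# prints [153600, 240600, 11055] 405255
```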




Epoch 1/30
...
Epoch 30/30


In [9]:
# Print predictions
model=create_gru_model(chars)
print_predicctions(model,'../model_weights/best_beiras_gru_textdata_weights.hdf5' ,
                   chars_to_indices,indices_to_chars,text_clean,window_size)


pla panfletaria contra as leoninas taxas impostas polo ministro de xustiza actual malia que vulneran.... un contrasentido arestora e a construción de anos de autonomía galega non é unha concepción do seu 
poema de rosalía titulado a xusticia pola man e dado á luz no seu libro follas novas por certo que s....e desenvolve o proceso de descomposición do sistema-mundo as condicións de intervención de capital e
se moito cando dixen eu que as suas políticas agresoras do común cidadán matan e a sua cospedal alcu....malo de estado español e o proceso de descomposición do poder constitucional e desembocar nestas ase


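## Train the model with ModelCheckpoint
ModelCheckpoint evaluates the model after every epoch and, with save_best_only=True, keeps only the weights with the best validation loss. To select the best model, we hold out a validation dataset. This acts as a form of regularization.
**It does not help in this case.**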
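
The selection rule behind save_best_only can be illustrated in plain Python. This is a simplified sketch of the idea, not the Keras implementation; `best_epoch_tracker` is a hypothetical name and the loss values are made up for illustration.

```python
def best_epoch_tracker(val_losses):
    """Return the epochs at which a checkpoint would be written, and the best loss."""
    best_loss = float('inf')
    saved_epochs = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # Validation loss improved: a real callback would save the weights here.
            best_loss = loss
            saved_epochs.append(epoch)
    return saved_epochs, best_loss

# Made-up validation losses for five epochs.
print(best_epoch_tracker([2.1, 1.8, 1.9, 1.7, 1.75]))
# prints ([1, 2, 4], 1.7)
```

The weights left on disk are those from the last saved epoch, even if training continued for more epochs afterwards.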

In [18]:
# Create two datasets: one for training (90%) and one for validating the model.
# The validation dataset is used after every epoch to select the best model

total_len=len(X)
len_train=int(total_len * 0.9)
X_train=X[:len_train]
y_train=y[:len_train]
X_validate=X[len_train:]
y_validate=y[len_train:]

### Simple model with ModelCheckpoint

In [None]:
from keras.models import Sequential
from keras.layers import Dense, Activation, LSTM
from keras.optimizers import RMSprop
from keras.callbacks import ModelCheckpoint

# create the checkpoint callback: keep only the weights with the best validation loss
checkpointer = ModelCheckpoint(filepath='../model_weights/best_beiras_simple_checkpoint_textdata_weights.hdf5', verbose=1, 
                               save_best_only=True)
# train the model
model=create_simple_model(chars)
model.fit(X_train, y_train, batch_size=500, epochs=30,
           validation_data=(X_validate, y_validate),
          callbacks=[checkpointer],verbose = 1)




Train on 1056852 samples, validate on 117428 samples
Epoch 1/30
...
Epoch 30/30


<keras.callbacks.History at 0x7f6d7d709400>

In [23]:
model=create_simple_model(chars)
print_predicctions(model,'../model_weights/best_beiras_simple_checkpoint_textdata_weights.hdf5',
                   chars_to_indices,indices_to_chars,text_clean,window_size)

------------------
input chars = 
pla panfletaria contra as leoninas taxas impostas polo ministro de xustiza actual malia que vulneran"

predicted chars = 
 a sua constitución estaba a partir de política de castelao e a constitución do colonizador e a cons"

------------------
input chars = 
poema de rosalía titulado a xusticia pola man e dado á luz no seu libro follas novas por certo que s"

predicted chars = 
e desenvolve a seguida a sua constitución española de compromiso constitucional e a constitución da "

------------------
input chars = 
se moito cando dixen eu que as suas políticas agresoras do común cidadán matan e a sua cospedal alcu"

predicted chars = 
ñada de compromiso constitucional e a constitución da constitución do colonizador e a constitución d"



### GRU model with ModelCheckpoint

In [None]:
# necessary functions from the keras library
from keras.models import Sequential
from keras.layers import Dense, GRU
from keras.optimizers import RMSprop
from keras.callbacks import ModelCheckpoint

# create the checkpoint callback: keep only the weights with the best validation loss
checkpointer = ModelCheckpoint(filepath='../model_weights/best_beiras_gru_checkpoint_textdata_weights.hdf5', verbose=1, 
                               save_best_only=True)
# train the model
model=create_gru_model(chars)
model.fit(X_train, y_train, batch_size=500, epochs=30,
           validation_data=(X_validate, y_validate),
          callbacks=[checkpointer],verbose = 1)

Train on 1056852 samples, validate on 117428 samples
Epoch 1/30
...
Epoch 30/30


<keras.callbacks.History at 0x7f6ed032d3c8>

In [15]:
model=create_gru_model(chars)
print_predicctions(model,'../model_weights/best_beiras_gru_checkpoint_textdata_weights.hdf5',
                   chars_to_indices,indices_to_chars,text_clean,window_size)


------------------
input chars = 
pla panfletaria contra as leoninas taxas impostas polo ministro de xustiza actual malia que vulneran"

predicted chars = 
 a sua propria estrutura de constitución e a sua propria estrutura de constitución e a sua propria e"

------------------
input chars = 
poema de rosalía titulado a xusticia pola man e dado á luz no seu libro follas novas por certo que s"

predicted chars = 
e acabar en contra da sua propria estrutura de constitución e a sua propria estrutura de constitució"

------------------
input chars = 
se moito cando dixen eu que as suas políticas agresoras do común cidadán matan e a sua cospedal alcu"

predicted chars = 
ñada en contra da sua propria estrutura de constitución e a sua propria estrutura de constitución e "



## GRU + Dropout
Another type of regularization. **It does not help in this case.**
* GRU
* Dropout
* GRU
* Dropout
* Dense
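
As a reminder of what a Dropout layer does during training, here is a minimal NumPy sketch of inverted dropout: a random fraction of the activations is zeroed and the rest are scaled up so the expected value is unchanged. It illustrates the technique and is not the Keras implementation; the function name and the demo array are assumptions.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units, rescale the survivors."""
    if not training or rate == 0.0:
        # At inference time the layer passes its input through unchanged.
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
hidden = np.ones((1000, 200))          # stand-in for a GRU(200) output
dropped = dropout(hidden, 0.2, rng)    # same rate as Dropout(0.2) in the model
print(round(float((dropped == 0).mean()), 2), round(float(dropped.mean()), 2))
```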

In [9]:
from keras.layers import Dropout
from keras.layers import Dense, Activation,GRU
from keras.models import Sequential
from keras.optimizers import RMSprop

def create_gru_dropout_model(chars):
    num_chars = len(chars)
    model= Sequential()
    # Layer 1: GRU with 200 hidden units, returning the full sequence for the next GRU
    model.add(GRU(200,input_shape = (window_size,num_chars),return_sequences=True))
    # Dropout: drop 20% of the first GRU's outputs during training
    model.add(Dropout(0.2))
    # Layer 2: GRU with 200 hidden units, followed by another 20% dropout
    model.add(GRU(200))
    model.add(Dropout(0.2))
    # Layer 3: Dense with one unit per character and softmax activation
    model.add(Dense(num_chars,activation='softmax'))
    # initialize optimizer
    optimizer = RMSprop(lr=0.001, rho=0.9, epsilon=1e-08, decay=0.0)
    # compile the model with the optimizer defined above
    model.compile(loss='categorical_crossentropy', optimizer=optimizer)
    return model

In [None]:

model=create_gru_dropout_model(chars)
model.summary()
model.fit(X, y, batch_size=500, epochs=30,verbose = 1)

# save weights
model.save_weights('../model_weights/best_beiras_gru_dropout_textdata_weights.hdf5')

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
gru_1 (GRU)                  (None, 100, 200)          153600    
_________________________________________________________________
dropout_1 (Dropout)          (None, 100, 200)          0         
_________________________________________________________________
gru_2 (GRU)                  (None, 200)               240600    
_________________________________________________________________
dropout_2 (Dropout)          (None, 200)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 55)                11055     
Total params: 405,255
Trainable params: 405,255
Non-trainable params: 0
_________________________________________________________________




Epoch 1/30
Epoch 2/30

In [11]:
model=create_gru_dropout_model(chars)
print_predicctions(model,'../model_weights/best_beiras_gru_dropout_textdata_weights.hdf5',
                   chars_to_indices,indices_to_chars,text_clean,window_size)

------------------
input chars = 
pla panfletaria contra as leoninas taxas impostas polo ministro de xustiza actual malia que vulneran"

predicted chars = 
 por caso a contradición e de contradición e de contradición e de contradición e de contradición e d"

------------------
input chars = 
poema de rosalía titulado a xusticia pola man e dado á luz no seu libro follas novas por certo que s"

predicted chars = 
e constitue unha constitución de contradicións de competencia e a construción dos cidadáns do común "

------------------
input chars = 
se moito cando dixen eu que as suas políticas agresoras do común cidadán matan e a sua cospedal alcu"

predicted chars = 
mante en cartas de compostela a construción dos cidadáns do común de competencia e a construción dos"

