## Installing Libraries

From the Anaconda Prompt, run the following commands to install the Keras library.
<br>Link: https://medium.com/@pushkarmandot/installing-tensorflow-theano-and-keras-in-spyder-84de7eb0f0df
<br>TensorFlow: https://www.tensorflow.org/install/

```conda create -n tensorflow-gpu pip python=3.5```
<br>```conda activate tensorflow-gpu```
<br>```conda install keras```

If you are on Linux, use the following commands.

In [None]:
!pip install --force-reinstall regex==2017.04.5
!pip install pathlib --user
!pip install msgpack --user
!pip install tensorflow-gpu --user
!pip install keras --user

If you are on Windows, add the installed libraries to the path (adjust the user name and Python version to your system):

``` set path=%PATH%;C:\Users\Alvaro\AppData\Roaming\Python\Python35\Scripts ```

Check that the libraries import correctly.

In [4]:
import tensorflow as tf
hello = tf.constant("Hello, TF!")
sess = tf.Session()
print(sess.run(hello))

ImportError: Could not find 'nvcuda.dll'. TensorFlow requires that this DLL be installed in a directory that is named in your %PATH% environment variable. Typically it is installed in 'C:\Windows\System32'. If it is not present, ensure that you have a CUDA-capable GPU with the correct driver installed.

In [3]:
a = tf.constant(10)
b = tf.constant(32)
print(sess.run(a + b))

NameError: name 'tf' is not defined

In [None]:
import keras
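
The `ImportError` above says that the GPU build of TensorFlow could not find the NVIDIA driver library, and a `NameError` on `tf` is what any later cell reports when the import never succeeded. As a minimal sketch (the helper name `module_available` is an assumption, not part of the notebook), the following checks whether each package can be located before importing it. It only confirms that a package is installed; it does not detect a missing GPU driver.

```python
import importlib.util

def module_available(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None

# Report which of the packages used in this notebook are installed
for pkg in ("tensorflow", "keras", "numpy"):
    status = "found" if module_available(pkg) else "MISSING"
    print("{:<12} {}".format(pkg, status))
```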

## Processing

In [None]:
from __future__ import print_function
import matplotlib.pyplot as plt
import numpy as np
import time
import csv
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM, SimpleRNN
from keras.layers.wrappers import TimeDistributed
import pickle

In [2]:
# Text file used as the training corpus
DATA_DIR = "./lotr.txt"
# Lower BATCH_SIZE or HIDDEN_DIM if you run into memory problems
BATCH_SIZE = 50
HIDDEN_DIM = 250 #500
# Length of each sequence to analyze, in characters
SEQ_LENGTH = 50
# Path to previously trained weights (checkpoint); leave empty to start from scratch
WEIGHTS = ''

# Number of characters to generate in each test
GENERATE_LENGTH = 500
# Neural network parameters
LAYER_NUM = 2
NB_EPOCH = 20

**Function A:
<br>(1) Load a text file, (2) Build the network's input and output structures**

In [7]:
# Prepare the training data
def load_data(data_dir, seq_length):
    # Load the file
    with open(data_dir, 'r') as f:
        data = f.read()
    # Unique characters (the order of a set is not fixed between runs)
    chars = list(set(data))
    VOCAB_SIZE = len(chars)

    print('Data length: {} characters'.format(len(data)))
    print('Vocabulary size: {} characters'.format(VOCAB_SIZE))
    print(chars)

    # Index the characters
    ix_to_char = {ix:char for ix, char in enumerate(chars)}
    char_to_ix = {char:ix for ix, char in enumerate(chars)}

    # Input and output structures. Targets are shifted by one character,
    # so one extra character must remain after the last sequence.
    NUMBER_OF_SEQ = (len(data) - 1) // seq_length
    print('Number of sequences: {}'.format(NUMBER_OF_SEQ))
    X = np.zeros((NUMBER_OF_SEQ, seq_length, VOCAB_SIZE))
    y = np.zeros((NUMBER_OF_SEQ, seq_length, VOCAB_SIZE))

    for i in range(0, NUMBER_OF_SEQ):
        # Fill the input structure X
        X_sequence = data[i*seq_length:(i+1)*seq_length]
        X_sequence_ix = [char_to_ix[value] for value in X_sequence]
        # One-hot vectors (input)
        input_sequence = np.zeros((seq_length, VOCAB_SIZE))
        # Use the dictionary indices to set the one-hot entries
        for j in range(seq_length):
            input_sequence[j][X_sequence_ix[j]] = 1.
        X[i] = input_sequence

        # Fill the output structure y (same text, one character ahead)
        y_sequence = data[i*seq_length+1:(i+1)*seq_length+1]
        y_sequence_ix = [char_to_ix[value] for value in y_sequence]
        # One-hot vectors (output)
        target_sequence = np.zeros((seq_length, VOCAB_SIZE))
        # Use the dictionary indices to set the one-hot entries
        for j in range(seq_length):
            target_sequence[j][y_sequence_ix[j]] = 1.
        y[i] = target_sequence

    return X, y, VOCAB_SIZE, ix_to_char
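
To make the input/target layout concrete, here is a small sketch of the same slicing on a toy string, written with plain lists instead of NumPy. The string `"hello"`, the helper `one_hot`, and the use of `sorted` (only to make the toy output reproducible) are assumptions for illustration and are not part of `load_data`.

```python
text = "hello"
seq_length = 2

# Sorted here only so that the toy indices are the same on every run
chars = sorted(set(text))                      # ['e', 'h', 'l', 'o']
char_to_ix = {c: i for i, c in enumerate(chars)}

def one_hot(index, size):
    """Return a list of zeros with a single 1.0 at position `index`."""
    return [1.0 if k == index else 0.0 for k in range(size)]

# Leave one character at the end for the last target
num_seq = (len(text) - 1) // seq_length

pairs = []
for i in range(num_seq):
    x_chars = text[i*seq_length:(i+1)*seq_length]
    y_chars = text[i*seq_length+1:(i+1)*seq_length+1]   # shifted by one
    pairs.append((x_chars, y_chars))
    print(repr(x_chars), "->", repr(y_chars))

# One-hot encoding of the first input sequence
x0 = [one_hot(char_to_ix[c], len(chars)) for c in pairs[0][0]]
print(x0)
```

Each target sequence is the input sequence moved one character forward, so at every time step the network is trained to predict the next character.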

**Function B:
<br>Text generation**

In [8]:
# method for generating text
def generate_text(model, length, vocab_size, ix_to_char):
    # starting with random character
    ix = [np.random.randint(vocab_size)]
    y_char = [ix_to_char[ix[-1]]]
    X = np.zeros((1, length, vocab_size))
    for i in range(length):
        # appending the last predicted character to sequence
        X[0, i, :][ix[-1]] = 1
        print(ix_to_char[ix[-1]], end="")
        ix = np.argmax(model.predict(X[:, :i+1, :])[0], 1)
        y_char.append(ix_to_char[ix[-1]])
    return ('').join(y_char)
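
`generate_text` always picks the single most probable next character (`np.argmax`) and feeds it back as input. The sketch below mimics that feedback loop with a hypothetical lookup table, `NEXT_CHAR`, standing in for `model.predict`; it is not the trained network, and unlike the real model it only looks at the previous character. Because the choice is deterministic, the output can fall into a cycle, which is the kind of repetition visible in the generated samples in this notebook.

```python
# Hypothetical stand-in for the model: the most likely next character
# given only the previous one (the real model sees the whole prefix).
NEXT_CHAR = {"t": "h", "h": "e", "e": " ", " ": "t"}

def greedy_generate(start, length):
    """Repeatedly append the single most likely next character."""
    out = [start]
    for _ in range(length - 1):
        out.append(NEXT_CHAR[out[-1]])
    return "".join(out)

sample = greedy_generate("h", 12)
print(repr(sample))
```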

## Training and Testing

**Using Function A: loading the data**

In [9]:
# Creating training data
X, y, VOCAB_SIZE, ix_to_char = load_data(DATA_DIR, SEQ_LENGTH)

Data length: 3262172 characters
Vocabulary size: 99 characters
['J', '‘', 'j', '#', '1', 'f', 'Z', 'W', '!', '9', 'e', 'H', ')', '’', ' ', '`', 'k', 'o', 'y', 'n', 'P', 'µ', 'h', 'i', '4', 'T', "'", '}', 's', 'K', ';', 'B', 'u', 'G', 'p', 'q', '>', 'a', '*', 'w', 't', '‚', '6', '®', 'c', '=', ',', 'g', '»', '&', 'x', '5', 'r', 'Y', 'F', 'b', 'm', 'L', '—', 'V', 'U', 'R', '_', 'I', '\n', '.', 'v', '…', '8', '"', '2', 'N', 'ó', '7', 'Q', 'M', '«', ':', '3', 'C', 'X', '/', '–', '-', 'z', 'd', 'E', 'D', 'S', '¢', 'O', '?', 'l', '0', '¤', '¥', 'A', '<', '(']
Number of sequences: 65243


**It is important to save the `ix_to_char` dictionary to a binary file. It must be loaded every time you resume training or generate text from a checkpoint, because the order of the characters in the dictionary can change between runs (it is not a fixed order).**
<br>**DO NOT OVERWRITE THIS PICKLE WHEN RESTARTING THE NOTEBOOK TO TEST CHECKPOINTS**

In [10]:
# Do not overwrite the pickle when restarting the notebook to test earlier checkpoints
with open('ix_to_char.pickle', 'wb') as handle:
    pickle.dump(ix_to_char, handle, protocol=pickle.HIGHEST_PROTOCOL)
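
Since rerunning the cell above replaces the saved dictionary, one possible safeguard is to write the file only when it does not exist yet. This is a minimal sketch under that assumption; the helper names `save_mapping_once` and `load_mapping` are hypothetical, and the demonstration uses a temporary directory so that no real file is touched.

```python
import os
import pickle
import tempfile

def save_mapping_once(mapping, path):
    """Write `mapping` to `path` unless the file already exists.

    Returns True if the file was written, False if it was left untouched.
    """
    if os.path.exists(path):
        return False
    with open(path, "wb") as handle:
        pickle.dump(mapping, handle, protocol=pickle.HIGHEST_PROTOCOL)
    return True

def load_mapping(path):
    """Read back a dictionary saved with save_mapping_once."""
    with open(path, "rb") as handle:
        return pickle.load(handle)

# Demonstration in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ix_to_char.pickle")
    first = save_mapping_once({0: "a", 1: "b"}, path)
    second = save_mapping_once({0: "b", 1: "a"}, path)  # ignored: file exists
    restored = load_mapping(path)

print(first, second, restored)
```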

In [11]:
print(ix_to_char)

{0: 'J', 1: '‘', 2: 'j', 3: '#', 4: '1', 5: 'f', 6: 'Z', 7: 'W', 8: '!', 9: '9', 10: 'e', 11: 'H', 12: ')', 13: '’', 14: ' ', 15: '`', 16: 'k', 17: 'o', 18: 'y', 19: 'n', 20: 'P', 21: 'µ', 22: 'h', 23: 'i', 24: '4', 25: 'T', 26: "'", 27: '}', 28: 's', 29: 'K', 30: ';', 31: 'B', 32: 'u', 33: 'G', 34: 'p', 35: 'q', 36: '>', 37: 'a', 38: '*', 39: 'w', 40: 't', 41: '‚', 42: '6', 43: '®', 44: 'c', 45: '=', 46: ',', 47: 'g', 48: '»', 49: '&', 50: 'x', 51: '5', 52: 'r', 53: 'Y', 54: 'F', 55: 'b', 56: 'm', 57: 'L', 58: '—', 59: 'V', 60: 'U', 61: 'R', 62: '_', 63: 'I', 64: '\n', 65: '.', 66: 'v', 67: '…', 68: '8', 69: '"', 70: '2', 71: 'N', 72: 'ó', 73: '7', 74: 'Q', 75: 'M', 76: '«', 77: ':', 78: '3', 79: 'C', 80: 'X', 81: '/', 82: '–', 83: '-', 84: 'z', 85: 'd', 86: 'E', 87: 'D', 88: 'S', 89: '¢', 90: 'O', 91: '?', 92: 'l', 93: '0', 94: '¤', 95: '¥', 96: 'A', 97: '<', 98: '('}


In [12]:
print(X.shape, y.shape, VOCAB_SIZE)

(65243, 50, 99) (65243, 50, 99) 99
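
These shapes explain why memory can become a problem. `np.zeros` creates 64-bit floats by default, so each array takes 8 bytes per entry. The arithmetic can be sketched as follows (the helper name `one_hot_bytes` is an assumption):

```python
def one_hot_bytes(num_seq, seq_length, vocab_size, bytes_per_value=8):
    """Bytes used by one dense array of shape (num_seq, seq_length, vocab_size)."""
    return num_seq * seq_length * vocab_size * bytes_per_value

per_array = one_hot_bytes(65243, 50, 99)
print("one array: {:.2f} GiB".format(per_array / 2**30))
print("X and y:   {:.2f} GiB".format(2 * per_array / 2**30))
```

With the shapes printed above, each array needs roughly 2.4 GiB, so `X` and `y` together need close to 5 GiB before the model itself is allocated.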


### Building the RNN (LSTM)

In [13]:
# Creating and compiling the Network
model = Sequential()

# Add the LSTM layers
model.add(LSTM(HIDDEN_DIM, input_shape=(None, VOCAB_SIZE), return_sequences=True))
for i in range(LAYER_NUM - 1):
    model.add(LSTM(HIDDEN_DIM, return_sequences=True))
# Add the output layer
model.add(TimeDistributed(Dense(VOCAB_SIZE)))
model.add(Activation('softmax'))

# "Compiling" = configuring the RNN with its loss function and optimizer
model.compile(loss="categorical_crossentropy", optimizer="rmsprop")
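
As a rough size estimate for this architecture, the number of trainable parameters can be computed by hand. The sketch assumes the standard LSTM layout (four gates, each with input weights, recurrent weights and a bias) and the values `VOCAB_SIZE = 99`, `HIDDEN_DIM = 250`, `LAYER_NUM = 2` used in this notebook; the helper names are hypothetical.

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    """Weight matrix plus one bias per output unit."""
    return input_dim * units + units

VOCAB_SIZE, HIDDEN_DIM, LAYER_NUM = 99, 250, 2

# First LSTM reads one-hot vectors; later LSTMs read the previous hidden state
counts = [lstm_params(VOCAB_SIZE, HIDDEN_DIM)]
counts += [lstm_params(HIDDEN_DIM, HIDDEN_DIM) for _ in range(LAYER_NUM - 1)]
# The Dense layer inside TimeDistributed shares its weights across time steps
counts.append(dense_params(HIDDEN_DIM, VOCAB_SIZE))
total = sum(counts)
print(counts, total)
```

Under these assumptions the model has 875,849 parameters; calling `model.summary()` is a way to compare this estimate with what Keras reports.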

In [14]:
# Generate some sample before training to know how bad it is!
generate_text(model, GENERATE_LENGTH, VOCAB_SIZE, ix_to_char)

3kk‚h–h`??GG–SGGGY4yyyyyyy‚‚yy‚‚y‚y‚y‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy

'3kk‚h–h`??GG–SGGGY4yyyyyyy‚‚yy‚‚y‚y‚y‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚‚yy‚‚y‚y‚y‚y‚y‚y‚y‚‚yy‚‚yy‚'

**The weights (and the one-hot vector dictionary) are loaded if there was a previous training run**
<br>WEIGHTS must hold the file name of the saved checkpoint. For example:
<br>```WEIGHTS = "checkpoint_layer_2_hidden_250_epoch_60.hdf5"```

In [16]:
# Load the weights from a previous training run (to resume an earlier execution)
# The epoch count is parsed from the file name
# The character dictionary (one-hot vectors) is loaded for generation
if not WEIGHTS == '':
    model.load_weights(WEIGHTS)
    # Use the last '_' and the last '.' so a path such as './x.hdf5' still parses
    nb_epoch = int(WEIGHTS[WEIGHTS.rfind('_') + 1:WEIGHTS.rfind('.')])
    with open('ix_to_char.pickle', 'rb') as handle:
        ix_to_char = pickle.load(handle)
else:
    # Starting from scratch
    nb_epoch = 0
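
Slicing the file name by character positions is sensitive to extra dots or underscores in the path. A slightly more defensive sketch, assuming the checkpoint names always end in `_epoch_<N>.hdf5` as in this notebook, uses a regular expression instead. The function name `epoch_from_checkpoint` and the `./checkpoints/` directory are hypothetical.

```python
import os
import re

def epoch_from_checkpoint(path):
    """Extract the epoch number from a name ending in `_epoch_<N>.hdf5`."""
    match = re.search(r"_epoch_(\d+)\.hdf5$", os.path.basename(path))
    if match is None:
        raise ValueError("unexpected checkpoint name: {!r}".format(path))
    return int(match.group(1))

print(epoch_from_checkpoint("checkpoint_layer_2_hidden_250_epoch_60.hdf5"))
print(epoch_from_checkpoint("./checkpoints/checkpoint_layer_2_hidden_250_epoch_10.hdf5"))
```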

### Training

In [None]:
# Train one epoch at a time, continuing from nb_epoch if weights were loaded

# This is the main loop
# You can change the condition so that it stops after a given number of epochs.
while True:
    print('\n\nEpoch: {}\n'.format(nb_epoch))
    # Fit the model for 1 epoch
    model.fit(X, y, batch_size=BATCH_SIZE, verbose=1, epochs=1)
    nb_epoch += 1
    # Generate a sample text at the end of the epoch
    generate_text(model, GENERATE_LENGTH, VOCAB_SIZE, ix_to_char)
    # Change this to save checkpoints more often
    if nb_epoch % 10 == 0:
        model.save_weights('checkpoint_layer_{}_hidden_{}_epoch_{}.hdf5'.format(LAYER_NUM, HIDDEN_DIM, nb_epoch))
        break



Epoch: 0

Epoch 1/1
Z  the  hobbits  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  the  hills  and  th

Epoch: 1

Epoch 1/1
k  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  stone  of  the  

Company  was  still  a  dark  shape  and  the  stone  that  had  been  set  out  on  the  stone  and  the  stone  that  had  been  set  out  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  started  and  star

Epoch: 14

Epoch 1/1
7  the  dark  shapes  of  the  door  and  the  stones  and  the  dwarves  and  the  trees  and  the  dwarves  and  the  trees  and  the  dwarves  and  the  trees  and  the  dwarves  and  the  trees  and  the  dwarves  and  the  trees  and  the  dwarves  and  the  trees  and  the  dwarves  and  the  trees  and  the  dwarves  and  the  trees  and  the  dwarves  and  the  trees  and  the  dwarves  and  the  trees  and  the  dwarves  and  the  trees  and  the  dwarves  and  th

e  the  strangers  saw  them  all  the  strange  thing  and  the  sound  of  the  dwarves  and  the  sound  of  the  dwarves  and  the  sound  of  the  dwarves  and  the  sound  of  the  dwarves  and  the  sound  of  the  dwarves  and  the  sound  of  the  dwarves  and  the  sound  of  the  dwarves  and  the  sound  of  the  dwarves  and  the  sound  of  the  dwarves  and  the  sound  of  the  dwarves  and  the  sound  of  the  dwarves  and  the  sound  of  the  dwarves  and  the  sound  of  the

Epoch: 27

Epoch 1/1
‚d  the  stranger  of  the  stream  that  had  been  seen  they  had  been  seen  the  sun  was  still  seen  of  the  songs  of  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  dwarves  and  the  

nd  the  strangers  and  the  sound  of  the  dwarves  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and  the  rocks  and

Epoch: 40

Epoch 1/1
g  the  stone  and  the  trees  and  the  dwarves  and  the  light  of  the  dwarves  and  the  little  stone  they  could  not  get  on  the  ground.  "The  dwarves  are  a  great  grey  smoke-messes  and  the  dwarves  and  the  little  stone  they  could  not  get  on  the  ground.  "The  dwarves  are  a  great  grey  smoke-messes  and  the  dwarves  and  the  little  stone  they  could  not  get  on  the  ground.  "The  dwarves  are  a  great  grey  smoke-messes  and  
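
The training loop above stops at the first checkpoint, that is, at the next multiple of 10 epochs. A minimal sketch of the variant its comments suggest, stopping at a fixed epoch while still saving periodically, is shown below. `train_one_epoch` and `save_checkpoint` are hypothetical stand-ins for `model.fit(...)` and `model.save_weights(...)`, so the control flow can be run without Keras.

```python
def run_training(start_epoch, last_epoch, checkpoint_every,
                 train_one_epoch, save_checkpoint):
    """Train from start_epoch up to last_epoch, saving periodically."""
    nb_epoch = start_epoch
    while nb_epoch < last_epoch:
        train_one_epoch(nb_epoch)
        nb_epoch += 1
        # Save on the regular schedule and always after the final epoch
        if nb_epoch % checkpoint_every == 0 or nb_epoch == last_epoch:
            save_checkpoint(nb_epoch)
    return nb_epoch

saved = []
final_epoch = run_training(
    start_epoch=10, last_epoch=25, checkpoint_every=10,
    train_one_epoch=lambda epoch: None,   # stand-in for model.fit
    save_checkpoint=saved.append,         # stand-in for model.save_weights
)
print(final_epoch, saved)
```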

### Text Generation
If you instantiate the model and its parameters (by running some of the preliminary cells) and you have the 2 required files (.pickle and .hdf5), you can generate text.
<br>In the LOTR example: `VOCAB_SIZE = 84` (if you want to try it, the weights and the dictionary are attached, but not the text)

In [None]:
# Take care not to replace the original pickle
with open('ix_to_char.pickle', 'rb') as handle:
    ix_to_char = pickle.load(handle)
    
WEIGHTS = "checkpoint_layer_2_hidden_250_epoch_60.hdf5"
# Loading the trained weights
model.load_weights(WEIGHTS)
generate_text(model, GENERATE_LENGTH, VOCAB_SIZE, ix_to_char)
print('\n\n')
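
The output layer has `VOCAB_SIZE` units, so a checkpoint only fits a model built with the same vocabulary size as the saved dictionary. A small sanity check, sketched here under the assumption that the dictionary keys are the integers `0..VOCAB_SIZE-1` as produced by `load_data`, can be run after loading the pickle and before loading the weights. The function name `check_vocab` is hypothetical.

```python
def check_vocab(ix_to_char, vocab_size):
    """Raise if the dictionary does not cover indices 0..vocab_size-1."""
    expected = set(range(vocab_size))
    if set(ix_to_char) != expected:
        raise ValueError(
            "dictionary has {} entries, model expects {}".format(
                len(ix_to_char), vocab_size))
    return True

# Toy dictionary with three characters
print(check_vocab({0: "a", 1: "b", 2: "c"}, 3))
```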