# R. N. N. Tolkien


In this notebook we build an RNN trained on Tolkien's books that generates text one character at a time, based on the preceding characters in the sequence.

This network is based on the [Udacity notebook on RNNs](https://github.com/udacity/deep-learning/blob/master/intro-to-rnns/Anna_KaRNNa.ipynb), which in turn is based on Andrej Karpathy's [post on RNNs](http://karpathy.github.io/2015/05/21/rnn-effectiveness/) and his [Torch implementation](https://github.com/karpathy/char-rnn).

Basically, what we want here is to predict the next character given the sequence generated so far, as illustrated in the figure below, which shows the general architecture of the character-level RNN.

<img src="assets/charseq.jpeg" width="500">
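
To make the prediction task concrete, here is a minimal sketch of the (context, next character) pairs that a character-level model learns from. The toy string and the variable names are illustrative assumptions, not part of the notebook.

```python
# Toy string standing in for the book text (illustrative assumption).
toy_text = "anel"

# Each example pairs the characters seen so far with the one that follows.
pairs = [(toy_text[:i], toy_text[i]) for i in range(1, len(toy_text))]

for context, next_char in pairs:
    print(repr(context), "->", repr(next_char))
```

During training the network sees many such pairs at once, and the recurrent state plays the role of the "context" column.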

First we load the text from the file 'tolkien_primeirolivro.txt' and encode each character as an integer:

In [1]:
import time
from collections import namedtuple
import numpy as np
import tensorflow as tf

with open('tolkien_primeirolivro.txt', 'r') as f:
    text=f.read()
    
vocab = set(text)
vocab_to_int = {c: i for i, c in enumerate(vocab)}
int_to_vocab = dict(enumerate(vocab))

encoded = np.array([vocab_to_int[c] for c in text], dtype=np.int32)

Take a look at what was loaded:

In [2]:
text[:100]

'O SENHOR DOS ANÉIS\nPRIMEIRA PARTE\nA SOCIEDADE DO ANEL\n\nA SOCIEDADE DO ANEL\nPRIMEIRA PARTE\nLIVRO 1\nCA'

This same excerpt of the book was encoded as integers, as shown below. These integers are what we feed into the network as input.

In [3]:
encoded[:100]

array([58, 85, 70, 34, 33, 49, 58, 89, 85, 17, 58, 70, 85, 97, 33, 77, 96,
       70,  1,  6, 89, 96, 23, 34, 96, 89, 97, 85,  6, 97, 89, 12, 34,  1,
       97, 85, 70, 58, 19, 96, 34, 17, 97, 17, 34, 85, 17, 58, 85, 97, 33,
       34, 21,  1,  1, 97, 85, 70, 58, 19, 96, 34, 17, 97, 17, 34, 85, 17,
       58, 85, 97, 33, 34, 21,  1,  6, 89, 96, 23, 34, 96, 89, 97, 85,  6,
       97, 89, 12, 34,  1, 21, 96, 46, 89, 58, 85, 92,  1, 19, 97])

Note that the numbers representing the characters are not their ASCII codes, nor do they follow any more elaborate logic. Each one is simply the position at which the character shows up when enumerating the _vocab_ set in the code above.
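
The mapping can be checked with a round trip: encode a string to integers, then decode it back. The sketch below uses a toy string and hypothetical names (`char_to_id`, `id_to_char`). It also sorts the characters, which is a suggestion rather than what the notebook does: a Python set of strings has no guaranteed iteration order across interpreter runs, so ids built from a bare `set` may differ between sessions. That matters if a checkpoint is later restored in a fresh kernel.

```python
# Toy text standing in for the book (illustrative assumption).
sample = "a sociedade do anel"

# sorted() makes the ids reproducible across runs; the notebook enumerates
# the unordered set directly.
chars = sorted(set(sample))
char_to_id = {c: i for i, c in enumerate(chars)}
id_to_char = dict(enumerate(chars))

# Encode to integers, then decode back to text.
ids = [char_to_id[c] for c in sample]
decoded = "".join(id_to_char[i] for i in ids)
```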

In [4]:
len(vocab)

102

## Building the Model

This is a picture of what our RNN will look like once assembled:

<img src="assets/charRNN.png" width=500px>

Below we go through the code that builds:
* the input placeholders
* the recurrent network with LSTM cells
* the output layer
* the loss function to optimize, and
* the optimizer

From there we put everything together in a model we call CharRNN.

We also write the function that generates batches of characters from the book to feed the training.

After that we set the network's hyperparameters and train it.

## The Inputs

In [5]:
def build_inputs(batch_size, num_steps):
    ''' Define placeholders for inputs, targets, and dropout 
    
        Arguments
        ---------
        batch_size: Batch size, number of sequences per batch
        num_steps: Number of sequence steps in a batch
        
    '''
    # Declare placeholders we'll feed into the graph
    inputs = tf.placeholder(tf.int32, [batch_size, num_steps], name='inputs')
    targets = tf.placeholder(tf.int32, [batch_size, num_steps], name='targets')
    
    # Keep probability placeholder for drop out layers
    keep_prob = tf.placeholder(tf.float32, name='keep_prob')
    
    return inputs, targets, keep_prob

## The LSTM Cells for _n_ Layers

In [6]:
def build_lstm(lstm_size, num_layers, batch_size, keep_prob):
    ''' Build LSTM cell.
    
        Arguments
        ---------
        keep_prob: Scalar tensor (tf.placeholder) for the dropout keep probability
        lstm_size: Size of the hidden layers in the LSTM cells
        num_layers: Number of LSTM layers
        batch_size: Batch size

    '''
    ### Build the LSTM Cell
    
    def build_cell(lstm_size, keep_prob):
        # Use a basic LSTM cell
        lstm = tf.contrib.rnn.BasicLSTMCell(lstm_size)
        
        # Add dropout to the cell
        drop = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)
        return drop
    
    
    # Stack up multiple LSTM layers, for deep learning
    cell = tf.contrib.rnn.MultiRNNCell([build_cell(lstm_size, keep_prob) for _ in range(num_layers)])
    initial_state = cell.zero_state(batch_size, tf.float32)
    
    return cell, initial_state
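
The dropout wrapper above randomly silences part of each cell's output during training. A minimal pure-Python sketch of the inverted-dropout idea follows. The helper `dropout` and the toy values are hypothetical, and TensorFlow's own implementation works on tensors rather than lists.

```python
import random

def dropout(values, keep_prob, rng):
    """Inverted dropout: zero each value with probability 1 - keep_prob and
    scale the survivors by 1 / keep_prob so the expected value is unchanged."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]

# Training-style call: roughly half of the values are zeroed, the rest doubled.
dropped = dropout(activations, keep_prob=0.5, rng=rng)

# With keep_prob=1.0 nothing is dropped, which is the usual setting at sampling time.
untouched = dropout(activations, keep_prob=1.0, rng=rng)
```

Feeding `keep_prob` through a placeholder, as the notebook does, lets the same graph use dropout while training and switch it off when generating text.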

## The Output Layer with Softmax

In [7]:
def build_output(lstm_output, in_size, out_size):
    ''' Build a softmax layer, return the softmax output and logits.
    
        Arguments
        ---------
        lstm_output: Output tensor(s) from the LSTM layers
        in_size: Size of the input tensor, for example, size of the LSTM cells
        out_size: Size of this softmax layer
    
    '''

    # Reshape output so it's a bunch of rows, one row for each step for each sequence.
    # That is, the shape should be batch_size*num_steps rows by lstm_size columns
    seq_output = tf.concat(lstm_output, axis=1)
    x = tf.reshape(seq_output, [-1, in_size])
    
    # Connect the RNN outputs to a softmax layer
    with tf.variable_scope('softmax'):
        softmax_w = tf.Variable(tf.truncated_normal((in_size, out_size), stddev=0.1))
        softmax_b = tf.Variable(tf.zeros(out_size))
    
    # Since output is a bunch of rows of RNN cell outputs, logits will be a bunch
    # of rows of logit outputs, one for each step and sequence
    logits = tf.matmul(x, softmax_w) + softmax_b
    
    # Use softmax to get the probabilities for predicted characters
    out = tf.nn.softmax(logits, name='predictions')
    
    return out, logits
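
As a rough illustration of what the softmax step does to one row of logits, here is a pure-Python sketch. The function name and the example scores are assumptions made for the demonstration.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Three made-up scores, one per candidate character.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The largest logit always receives the largest probability, so the ranking of the candidate characters is preserved.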

## The Loss Function to Optimize

In [8]:
def build_loss(logits, targets, lstm_size, num_classes):
    ''' Calculate the loss from the logits and the targets.
    
        Arguments
        ---------
        logits: Logits from final fully connected layer
        targets: Targets for supervised learning
        lstm_size: Number of LSTM hidden units
        num_classes: Number of classes in targets
        
    '''
    
    # One-hot encode targets and reshape to match logits, one row per batch_size per step
    y_one_hot = tf.one_hot(targets, num_classes)
    y_reshaped = tf.reshape(y_one_hot, logits.get_shape())
    
    # Softmax cross entropy loss
    loss = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_reshaped)
    loss = tf.reduce_mean(loss)
    return loss
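
For a single character, the softmax cross-entropy reduces to the negative log of the probability assigned to the true class. The sketch below (with a hypothetical helper `cross_entropy`) computes the loss of a model that spreads probability evenly over the 102 characters of the vocabulary. The result, about 4.62, is close to the first training loss logged later in this notebook (4.6286), which is what one would expect from an untrained network.

```python
import math

def cross_entropy(probs, target_index):
    """Loss for one character: negative log of the probability of the true class."""
    return -math.log(probs[target_index])

num_classes = 102  # len(vocab) reported earlier in the notebook
uniform = [1.0 / num_classes] * num_classes

# Loss of a model that has learned nothing and guesses uniformly.
baseline = cross_entropy(uniform, target_index=0)
print(baseline)
```

A training loss well below this baseline indicates that the network is assigning more probability to the correct next character than a uniform guess would.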

## The Optimizer

In [9]:
def build_optimizer(loss, learning_rate, grad_clip):
    ''' Build optimizer for training, using gradient clipping.
    
        Arguments
        ---------
        loss: Network loss
        learning_rate: Learning rate for optimizer
        grad_clip: Threshold for clipping the global norm of the gradients
    
    '''
    
    # Optimizer for training, using gradient clipping to control exploding gradients
    tvars = tf.trainable_variables()
    grads, _ = tf.clip_by_global_norm(tf.gradients(loss, tvars), grad_clip)
    train_op = tf.train.AdamOptimizer(learning_rate)
    optimizer = train_op.apply_gradients(zip(grads, tvars))
    
    return optimizer
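
Clipping by global norm rescales all gradients together whenever their joint L2 norm exceeds the threshold, and leaves them untouched otherwise. A minimal sketch on plain Python lists follows. The helper `clip_gradients` is hypothetical and only mirrors the scaling rule, not TensorFlow's tensor implementation.

```python
import math

def clip_gradients(grads, clip_norm):
    """Rescale a list of gradient vectors so their joint L2 norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    scale = clip_norm / max(global_norm, clip_norm)
    return [[g * scale for g in vec] for vec in grads], global_norm

# Two one-element "gradients" with joint norm 5.0, clipped down to norm 2.5.
clipped, norm = clip_gradients([[3.0], [4.0]], clip_norm=2.5)

# Gradients already below the threshold pass through unchanged.
small, _ = clip_gradients([[0.3], [0.4]], clip_norm=5.0)
```

Because every gradient is multiplied by the same factor, the update direction is preserved and only its size is limited.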

## Putting It All Together in a Single Network

In [10]:
class CharRNN:
    
    def __init__(self, num_classes, batch_size=64, num_steps=50, 
                       lstm_size=128, num_layers=2, learning_rate=0.001, 
                       grad_clip=5, sampling=False):
    
        # When we're using this network for sampling later, we'll be passing in
        # one character at a time, so providing an option for that
        if sampling:
            batch_size, num_steps = 1, 1

        tf.reset_default_graph()
        
        # Build the input placeholder tensors
        self.inputs, self.targets, self.keep_prob = build_inputs(batch_size, num_steps)

        # Build the LSTM cell
        cell, self.initial_state = build_lstm(lstm_size, num_layers, batch_size, self.keep_prob)

        ### Run the data through the RNN layers
        # First, one-hot encode the input tokens
        x_one_hot = tf.one_hot(self.inputs, num_classes)
        
        # Run each sequence step through the RNN and collect the outputs
        outputs, state = tf.nn.dynamic_rnn(cell, x_one_hot, initial_state=self.initial_state)
        self.final_state = state
        
        # Get softmax predictions and logits
        self.prediction, self.logits = build_output(outputs, lstm_size, num_classes)
        
        # Loss and optimizer (with gradient clipping)
        self.loss = build_loss(self.logits, self.targets, lstm_size, num_classes)
        self.optimizer = build_optimizer(self.loss, learning_rate, grad_clip)
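
Before reaching the LSTM layers, each integer id is expanded into a one-hot vector, so a batch of shape `[batch_size, num_steps]` becomes `[batch_size, num_steps, num_classes]`. The sketch below shows that shape change on nested lists. The helper `one_hot` and the toy sizes are assumptions made for illustration.

```python
def one_hot(index, num_classes):
    """Return a list with 1.0 at position index and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

# A toy batch: 2 sequences of 3 character ids, with a 4-character vocabulary.
batch = [[0, 2, 1],
         [3, 3, 0]]

encoded_batch = [[one_hot(i, 4) for i in seq] for seq in batch]
shape = (len(encoded_batch), len(encoded_batch[0]), len(encoded_batch[0][0]))
print(shape)
```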

## The Function That Generates Character Batches for Training

This is the function that generates mini-batches of characters for training. We want each sequence to have a fixed length (the number of steps), and to speed up training we pass several sequences at once (the number of sequences per batch). This is easier to see in the figure below:

<img src="assets/sequence_batching@1x.png" width=500px>

In [11]:
def get_batches(arr, n_seqs, n_steps):
    '''Create a generator that returns batches of size
       n_seqs x n_steps from arr.
       
       Arguments
       ---------
       arr: Array you want to make batches from
       n_seqs: Batch size, the number of sequences per batch
       n_steps: Number of sequence steps per batch
    '''
    # Get the number of characters per batch and number of batches we can make
    characters_per_batch = n_seqs * n_steps
    n_batches = len(arr)//characters_per_batch
    
    # Keep only enough characters to make full batches
    arr = arr[:n_batches * characters_per_batch]
    
    # Reshape into n_seqs rows
    arr = arr.reshape((n_seqs, -1))
    
    for n in range(0, arr.shape[1], n_steps):
        # The features
        x = arr[:, n:n+n_steps]
        # The targets: the inputs shifted by one. The last target wraps around
        # to the first input of the same window, a simplification
        y = np.zeros_like(x)
        y[:, :-1], y[:, -1] = x[:, 1:], x[:, 0]
        yield x, y
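
To see the batching on something small, the sketch below runs a compact copy of the same logic on 23 fake "characters". The name `toy_batches` and the sizes are assumptions chosen for the demonstration.

```python
import numpy as np

def toy_batches(arr, n_seqs, n_steps):
    """Compact copy of the batching logic above, for a small demonstration."""
    per_batch = n_seqs * n_steps
    n_batches = len(arr) // per_batch
    # Drop the leftover characters, then lay the data out as n_seqs rows.
    arr = arr[:n_batches * per_batch].reshape((n_seqs, -1))
    for n in range(0, arr.shape[1], n_steps):
        x = arr[:, n:n + n_steps]
        y = np.zeros_like(x)
        y[:, :-1], y[:, -1] = x[:, 1:], x[:, 0]
        yield x, y

# 23 ids: the last 3 are discarded because they do not fill a whole batch.
batches = list(toy_batches(np.arange(23), n_seqs=2, n_steps=5))
first_x, first_y = batches[0]
print(first_x)
print(first_y)
```

Each row of `first_y` is the matching row of `first_x` shifted left by one, with the final position wrapping around to the row's first input.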

## Hyperparameters

In [14]:
epochs = 10             # Number of full passes on all text
batch_size = 100        # Sequences per batch
num_steps = 100         # Number of sequence steps per batch
lstm_size = 512         # Size of hidden layers in LSTMs
num_layers = 2          # Number of LSTM layers
learning_rate = 0.001   # Learning rate
keep_prob = 0.5         # Dropout keep probability
save_every_n = 100      # Save training progress every N iterations

## Training

In [15]:
model = CharRNN(len(vocab), batch_size=batch_size, num_steps=num_steps,
                lstm_size=lstm_size, num_layers=num_layers, 
                learning_rate=learning_rate)

saver = tf.train.Saver(max_to_keep=100)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    # Use the line below to load a checkpoint and resume training
    #saver.restore(sess, 'checkpoints/______.ckpt')
    counter = 0
    for e in range(epochs):
        # Train network
        new_state = sess.run(model.initial_state)
        loss = 0
        for x, y in get_batches(encoded, batch_size, num_steps):
            counter += 1
            start = time.time()
            feed = {model.inputs: x,
                    model.targets: y,
                    model.keep_prob: keep_prob,
                    model.initial_state: new_state}
            batch_loss, new_state, _ = sess.run([model.loss, 
                                                 model.final_state, 
                                                 model.optimizer], 
                                                 feed_dict=feed)
            
            end = time.time()
            print('Epoch: {}/{}... '.format(e+1, epochs),
                  'Training Step: {}... '.format(counter),
                  'Training loss: {:.4f}... '.format(batch_loss),
                  '{:.4f} sec/batch'.format((end-start)))
        
            if (counter % save_every_n == 0):
                saver.save(sess, "checkpoints/i{}_l{}.ckpt".format(counter, lstm_size))
    
    saver.save(sess, "checkpoints/i{}_l{}.ckpt".format(counter, lstm_size))

Epoch: 1/10...  Training Step: 1...  Training loss: 4.6286...  1.4603 sec/batch
Epoch: 1/10...  Training Step: 2...  Training loss: 4.5457...  1.4500 sec/batch
Epoch: 1/10...  Training Step: 3...  Training loss: 4.1190...  1.4363 sec/batch
Epoch: 1/10...  Training Step: 4...  Training loss: 5.7026...  1.4232 sec/batch
Epoch: 1/10...  Training Step: 5...  Training loss: 4.3598...  1.4107 sec/batch
Epoch: 1/10...  Training Step: 6...  Training loss: 3.8755...  1.4394 sec/batch
Epoch: 1/10...  Training Step: 7...  Training loss: 3.7356...  1.4116 sec/batch
Epoch: 1/10...  Training Step: 8...  Training loss: 3.6383...  1.4408 sec/batch
Epoch: 1/10...  Training Step: 9...  Training loss: 3.5047...  1.4516 sec/batch
Epoch: 1/10...  Training Step: 10...  Training loss: 3.4571...  1.5076 sec/batch
Epoch: 1/10...  Training Step: 11...  Training loss: 3.4405...  1.4118 sec/batch
Epoch: 1/10...  Training Step: 12...  Training loss: 3.3846...  1.4369 sec/batch
...

Epoch: 2/10...  Training Step: 103...  Training loss: 3.0498...  1.3807 sec/batch
Epoch: 2/10...  Training Step: 104...  Training loss: 3.0240...  1.3803 sec/batch
Epoch: 2/10...  Training Step: 105...  Training loss: 3.0246...  1.3780 sec/batch
Epoch: 2/10...  Training Step: 106...  Training loss: 3.0316...  1.3794 sec/batch
Epoch: 2/10...  Training Step: 107...  Training loss: 3.0291...  1.3802 sec/batch
Epoch: 2/10...  Training Step: 108...  Training loss: 3.0308...  1.3721 sec/batch
Epoch: 2/10...  Training Step: 109...  Training loss: 3.0137...  1.3713 sec/batch
Epoch: 2/10...  Training Step: 110...  Training loss: 3.0407...  1.3740 sec/batch
Epoch: 2/10...  Training Step: 111...  Training loss: 3.0271...  1.3773 sec/batch
Epoch: 2/10...  Training Step: 112...  Training loss: 3.0350...  1.3856 sec/batch
Epoch: 2/10...  Training Step: 113...  Training loss: 2.9926...  1.3733 sec/batch
Epoch: 2/10...  Training Step: 114...  Training loss: 3.0353...  1.3915 sec/batch
...

Epoch: 3/10...  Training Step: 203...  Training loss: 2.4959...  1.3887 sec/batch
Epoch: 3/10...  Training Step: 204...  Training loss: 2.4605...  1.4173 sec/batch
Epoch: 3/10...  Training Step: 205...  Training loss: 2.4675...  1.3825 sec/batch
Epoch: 3/10...  Training Step: 206...  Training loss: 2.4733...  1.3861 sec/batch
Epoch: 3/10...  Training Step: 207...  Training loss: 2.4712...  1.3902 sec/batch
Epoch: 3/10...  Training Step: 208...  Training loss: 2.4803...  1.3969 sec/batch
Epoch: 3/10...  Training Step: 209...  Training loss: 2.4533...  1.3923 sec/batch
Epoch: 3/10...  Training Step: 210...  Training loss: 2.4689...  1.3977 sec/batch
Epoch: 3/10...  Training Step: 211...  Training loss: 2.4733...  1.4172 sec/batch
Epoch: 3/10...  Training Step: 212...  Training loss: 2.4398...  1.4117 sec/batch
Epoch: 3/10...  Training Step: 213...  Training loss: 2.4221...  1.3913 sec/batch
Epoch: 3/10...  Training Step: 214...  Training loss: 2.4385...  1.4044 sec/batch
...

Epoch: 4/10...  Training Step: 303...  Training loss: 2.2779...  1.3737 sec/batch
Epoch: 4/10...  Training Step: 304...  Training loss: 2.2451...  1.3742 sec/batch
Epoch: 4/10...  Training Step: 305...  Training loss: 2.2722...  1.3768 sec/batch
Epoch: 4/10...  Training Step: 306...  Training loss: 2.2677...  1.3797 sec/batch
Epoch: 4/10...  Training Step: 307...  Training loss: 2.2803...  1.3806 sec/batch
Epoch: 4/10...  Training Step: 308...  Training loss: 2.2669...  1.3799 sec/batch
Epoch: 4/10...  Training Step: 309...  Training loss: 2.2638...  1.3845 sec/batch
Epoch: 4/10...  Training Step: 310...  Training loss: 2.2613...  1.3871 sec/batch
Epoch: 4/10...  Training Step: 311...  Training loss: 2.2704...  1.3904 sec/batch
Epoch: 4/10...  Training Step: 312...  Training loss: 2.2418...  1.4067 sec/batch
Epoch: 4/10...  Training Step: 313...  Training loss: 2.2340...  1.4205 sec/batch
Epoch: 4/10...  Training Step: 314...  Training loss: 2.2344...  1.3798 sec/batch
...

Epoch: 5/10...  Training Step: 403...  Training loss: 2.1670...  1.3782 sec/batch
Epoch: 5/10...  Training Step: 404...  Training loss: 2.1411...  1.3969 sec/batch
Epoch: 5/10...  Training Step: 405...  Training loss: 2.1602...  1.3999 sec/batch
Epoch: 5/10...  Training Step: 406...  Training loss: 2.1694...  1.4149 sec/batch
Epoch: 5/10...  Training Step: 407...  Training loss: 2.1793...  1.3993 sec/batch
Epoch: 5/10...  Training Step: 408...  Training loss: 2.1586...  1.3953 sec/batch
Epoch: 5/10...  Training Step: 409...  Training loss: 2.1535...  1.4020 sec/batch
Epoch: 5/10...  Training Step: 410...  Training loss: 2.1597...  1.4062 sec/batch
Epoch: 5/10...  Training Step: 411...  Training loss: 2.1632...  1.4030 sec/batch
Epoch: 5/10...  Training Step: 412...  Training loss: 2.1317...  1.4048 sec/batch
Epoch: 5/10...  Training Step: 413...  Training loss: 2.1261...  1.3999 sec/batch
Epoch: 5/10...  Training Step: 414...  Training loss: 2.1096...  1.4195 sec/batch
...

Epoch: 6/10...  Training Step: 503...  Training loss: 2.0588...  1.3747 sec/batch
Epoch: 6/10...  Training Step: 504...  Training loss: 2.0471...  1.3789 sec/batch
Epoch: 6/10...  Training Step: 505...  Training loss: 2.0528...  1.3782 sec/batch
Epoch: 6/10...  Training Step: 506...  Training loss: 2.0729...  1.3799 sec/batch
Epoch: 6/10...  Training Step: 507...  Training loss: 2.0817...  1.3790 sec/batch
Epoch: 6/10...  Training Step: 508...  Training loss: 2.0503...  1.3795 sec/batch
Epoch: 6/10...  Training Step: 509...  Training loss: 2.0522...  1.3812 sec/batch
Epoch: 6/10...  Training Step: 510...  Training loss: 2.0564...  1.4132 sec/batch
Epoch: 6/10...  Training Step: 511...  Training loss: 2.0547...  1.3900 sec/batch
Epoch: 6/10...  Training Step: 512...  Training loss: 2.0318...  1.3997 sec/batch
Epoch: 6/10...  Training Step: 513...  Training loss: 2.0298...  1.3978 sec/batch
Epoch: 6/10...  Training Step: 514...  Training loss: 2.0152...  1.4111 sec/batch
...

Epoch: 7/10...  Training Step: 603...  Training loss: 1.9615...  1.3812 sec/batch
Epoch: 7/10...  Training Step: 604...  Training loss: 1.9582...  1.3859 sec/batch
Epoch: 7/10...  Training Step: 605...  Training loss: 1.9543...  1.3986 sec/batch
Epoch: 7/10...  Training Step: 606...  Training loss: 1.9789...  1.3899 sec/batch
Epoch: 7/10...  Training Step: 607...  Training loss: 1.9874...  1.4056 sec/batch
Epoch: 7/10...  Training Step: 608...  Training loss: 1.9471...  1.4097 sec/batch
Epoch: 7/10...  Training Step: 609...  Training loss: 1.9617...  1.4116 sec/batch
Epoch: 7/10...  Training Step: 610...  Training loss: 1.9596...  1.4210 sec/batch
Epoch: 7/10...  Training Step: 611...  Training loss: 1.9521...  1.4110 sec/batch
Epoch: 7/10...  Training Step: 612...  Training loss: 1.9341...  1.4390 sec/batch
Epoch: 7/10...  Training Step: 613...  Training loss: 1.9332...  1.4557 sec/batch
Epoch: 7/10...  Training Step: 614...  Training loss: 1.9183...  1.4216 sec/batch
...

Epoch: 8/10...  Training Step: 703...  Training loss: 1.8678...  1.3776 sec/batch
Epoch: 8/10...  Training Step: 704...  Training loss: 1.8599...  1.3780 sec/batch
Epoch: 8/10...  Training Step: 705...  Training loss: 1.8561...  1.3791 sec/batch
Epoch: 8/10...  Training Step: 706...  Training loss: 1.8814...  1.3777 sec/batch
Epoch: 8/10...  Training Step: 707...  Training loss: 1.8996...  1.3776 sec/batch
Epoch: 8/10...  Training Step: 708...  Training loss: 1.8597...  1.3773 sec/batch
Epoch: 8/10...  Training Step: 709...  Training loss: 1.8786...  1.3773 sec/batch
Epoch: 8/10...  Training Step: 710...  Training loss: 1.8663...  1.3821 sec/batch
Epoch: 8/10...  Training Step: 711...  Training loss: 1.8687...  1.3899 sec/batch
Epoch: 8/10...  Training Step: 712...  Training loss: 1.8526...  1.3974 sec/batch
Epoch: 8/10...  Training Step: 713...  Training loss: 1.8402...  1.4133 sec/batch
Epoch: 8/10...  Training Step: 714...  Training loss: 1.8191...  1.4060 sec/batch
...

Epoch: 9/10...  Training Step: 803...  Training loss: 1.7802...  1.3928 sec/batch
Epoch: 9/10...  Training Step: 804...  Training loss: 1.7775...  1.3893 sec/batch
Epoch: 9/10...  Training Step: 805...  Training loss: 1.7946...  1.3916 sec/batch
Epoch: 9/10...  Training Step: 806...  Training loss: 1.8089...  1.4060 sec/batch
Epoch: 9/10...  Training Step: 807...  Training loss: 1.8370...  1.4466 sec/batch
Epoch: 9/10...  Training Step: 808...  Training loss: 1.7827...  1.4016 sec/batch
Epoch: 9/10...  Training Step: 809...  Training loss: 1.7979...  1.3971 sec/batch
Epoch: 9/10...  Training Step: 810...  Training loss: 1.7805...  1.4177 sec/batch
Epoch: 9/10...  Training Step: 811...  Training loss: 1.7881...  1.4260 sec/batch
Epoch: 9/10...  Training Step: 812...  Training loss: 1.7784...  1.4304 sec/batch
Epoch: 9/10...  Training Step: 813...  Training loss: 1.7595...  1.4818 sec/batch
Epoch: 9/10...  Training Step: 814...  Training loss: 1.7413...  1.4780 sec/batch
...

Epoch: 10/10...  Training Step: 903...  Training loss: 1.7252...  1.3767 sec/batch
Epoch: 10/10...  Training Step: 904...  Training loss: 1.7108...  1.3787 sec/batch
Epoch: 10/10...  Training Step: 905...  Training loss: 1.7298...  1.3757 sec/batch
Epoch: 10/10...  Training Step: 906...  Training loss: 1.7445...  1.3767 sec/batch
Epoch: 10/10...  Training Step: 907...  Training loss: 1.7722...  1.3792 sec/batch
Epoch: 10/10...  Training Step: 908...  Training loss: 1.7185...  1.3779 sec/batch
Epoch: 10/10...  Training Step: 909...  Training loss: 1.7265...  1.3789 sec/batch
Epoch: 10/10...  Training Step: 910...  Training loss: 1.7066...  1.3885 sec/batch
Epoch: 10/10...  Training Step: 911...  Training loss: 1.7281...  1.4036 sec/batch
Epoch: 10/10...  Training Step: 912...  Training loss: 1.7142...  1.3916 sec/batch
Epoch: 10/10...  Training Step: 913...  Training loss: 1.6915...  1.4294 sec/batch
Epoch: 10/10...  Training Step: 914...  Training loss: 1.6749...  1.4149 sec/batch
...

Let's train a bit longer, since 1.66 is still a rather high loss!

This time, though, we resume training from the last checkpoint, with a lower learning_rate.

In [28]:
tf.train.latest_checkpoint('checkpoints')

'checkpoints\\i1000_l512.ckpt'
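
The retraining cell in this notebook restarts its step counter at 0, so the checkpoints it writes reuse names such as `i100_l512.ckpt` from the first run. If you would rather continue the numbering, one option (an assumption, not what the notebook does) is to read the step out of the checkpoint name and start the counter from there. The helper `step_from_checkpoint` below is hypothetical.

```python
import re

def step_from_checkpoint(path):
    """Extract the iteration count from a name like 'checkpoints/i1000_l512.ckpt'.
    Returns 0 when the name does not follow that pattern."""
    match = re.search(r"i(\d+)_l\d+\.ckpt$", path)
    return int(match.group(1)) if match else 0

# The path reported by tf.train.latest_checkpoint in the cell above.
start_step = step_from_checkpoint('checkpoints\\i1000_l512.ckpt')
print(start_step)
```

Initialising the training loop's counter with `start_step` instead of 0 would keep later checkpoint names distinct from the earlier ones.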

In [35]:
epochs = 20             # Number of full passes on all text
batch_size = 100        # Sequences per batch
num_steps = 100         # Number of sequence steps per batch
lstm_size = 512         # Size of hidden layers in LSTMs
num_layers = 2          # Number of LSTM layers
learning_rate = 0.0005  # Learning rate
keep_prob = 0.5         # Dropout keep probability
save_every_n = 100      # Save training progress every N iterations

In [36]:
model = CharRNN(len(vocab), batch_size=batch_size, num_steps=num_steps,
                lstm_size=lstm_size, num_layers=num_layers, 
                learning_rate=learning_rate)

saver = tf.train.Saver(max_to_keep=100)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    # Use the line below to load a checkpoint and resume training
    saver.restore(sess, 'checkpoints\\i1000_l512.ckpt')
    counter = 0
    for e in range(epochs):
        # Train network
        new_state = sess.run(model.initial_state)
        loss = 0
        for x, y in get_batches(encoded, batch_size, num_steps):
            counter += 1
            start = time.time()
            feed = {model.inputs: x,
                    model.targets: y,
                    model.keep_prob: keep_prob,
                    model.initial_state: new_state}
            batch_loss, new_state, _ = sess.run([model.loss, 
                                                 model.final_state, 
                                                 model.optimizer], 
                                                 feed_dict=feed)
            
            end = time.time()
            print('Epoch: {}/{}... '.format(e+1, epochs),
                  'Training Step: {}... '.format(counter),
                  'Training loss: {:.4f}... '.format(batch_loss),
                  '{:.4f} sec/batch'.format((end-start)))
        
            if (counter % save_every_n == 0):
                saver.save(sess, "checkpoints/i{}_l{}.ckpt".format(counter, lstm_size))
    
    saver.save(sess, "checkpoints/i{}_l{}.ckpt".format(counter, lstm_size))

INFO:tensorflow:Restoring parameters from checkpoints\i1000_l512.ckpt
Epoch: 1/20...  Training Step: 1...  Training loss: 1.7380...  1.4652 sec/batch
Epoch: 1/20...  Training Step: 2...  Training loss: 1.6807...  1.3927 sec/batch
Epoch: 1/20...  Training Step: 3...  Training loss: 1.6765...  1.3901 sec/batch
Epoch: 1/20...  Training Step: 4...  Training loss: 1.6602...  1.4199 sec/batch
Epoch: 1/20...  Training Step: 5...  Training loss: 1.6738...  1.4193 sec/batch
Epoch: 1/20...  Training Step: 6...  Training loss: 1.6918...  1.4205 sec/batch
Epoch: 1/20...  Training Step: 7...  Training loss: 1.7110...  1.4125 sec/batch
Epoch: 1/20...  Training Step: 8...  Training loss: 1.6636...  1.4258 sec/batch
Epoch: 1/20...  Training Step: 9...  Training loss: 1.6714...  1.4293 sec/batch
Epoch: 1/20...  Training Step: 10...  Training loss: 1.6581...  1.4105 sec/batch
Epoch: 1/20...  Training Step: 11...  Training loss: 1.6695...  1.4280 sec/batch
...

Epoch: 2/20...  Training Step: 102...  Training loss: 1.6435...  1.4018 sec/batch
Epoch: 2/20...  Training Step: 103...  Training loss: 1.6268...  1.3802 sec/batch
Epoch: 2/20...  Training Step: 104...  Training loss: 1.6262...  1.3876 sec/batch
Epoch: 2/20...  Training Step: 105...  Training loss: 1.6451...  1.4126 sec/batch
Epoch: 2/20...  Training Step: 106...  Training loss: 1.6604...  1.3947 sec/batch
Epoch: 2/20...  Training Step: 107...  Training loss: 1.6785...  1.4353 sec/batch
Epoch: 2/20...  Training Step: 108...  Training loss: 1.6335...  1.4335 sec/batch
Epoch: 2/20...  Training Step: 109...  Training loss: 1.6377...  1.4124 sec/batch
Epoch: 2/20...  Training Step: 110...  Training loss: 1.6259...  1.4489 sec/batch
Epoch: 2/20...  Training Step: 111...  Training loss: 1.6339...  1.4369 sec/batch
Epoch: 2/20...  Training Step: 112...  Training loss: 1.6233...  1.4199 sec/batch
Epoch: 2/20...  Training Step: 113...  Training loss: 1.5885...  1.4164 sec/batch
...

Epoch: 3/20...  Training Step: 202...  Training loss: 1.6186...  1.3800 sec/batch
Epoch: 3/20...  Training Step: 203...  Training loss: 1.6205...  1.3898 sec/batch
Epoch: 3/20...  Training Step: 204...  Training loss: 1.5882...  1.3962 sec/batch
Epoch: 3/20...  Training Step: 205...  Training loss: 1.6235...  1.4008 sec/batch
Epoch: 3/20...  Training Step: 206...  Training loss: 1.6402...  1.4083 sec/batch
Epoch: 3/20...  Training Step: 207...  Training loss: 1.6593...  1.4065 sec/batch
Epoch: 3/20...  Training Step: 208...  Training loss: 1.6086...  1.4209 sec/batch
Epoch: 3/20...  Training Step: 209...  Training loss: 1.6172...  1.4401 sec/batch
Epoch: 3/20...  Training Step: 210...  Training loss: 1.5933...  1.4060 sec/batch
Epoch: 3/20...  Training Step: 211...  Training loss: 1.6187...  1.4279 sec/batch
Epoch: 3/20...  Training Step: 212...  Training loss: 1.6025...  1.4300 sec/batch
Epoch: 3/20...  Training Step: 213...  Training loss: 1.5843...  1.4308 sec/batch
...

Epoch: 4/20...  Training Step: 302...  Training loss: 1.5942...  1.3890 sec/batch
Epoch: 4/20...  Training Step: 303...  Training loss: 1.5889...  1.3860 sec/batch
Epoch: 4/20...  Training Step: 304...  Training loss: 1.5712...  1.4182 sec/batch
Epoch: 4/20...  Training Step: 305...  Training loss: 1.6041...  1.4100 sec/batch
Epoch: 4/20...  Training Step: 306...  Training loss: 1.6145...  1.4004 sec/batch
Epoch: 4/20...  Training Step: 307...  Training loss: 1.6322...  1.4283 sec/batch
Epoch: 4/20...  Training Step: 308...  Training loss: 1.5796...  1.4119 sec/batch
Epoch: 4/20...  Training Step: 309...  Training loss: 1.6068...  1.4179 sec/batch
Epoch: 4/20...  Training Step: 310...  Training loss: 1.5787...  1.4104 sec/batch
Epoch: 4/20...  Training Step: 311...  Training loss: 1.5833...  1.4292 sec/batch
Epoch: 4/20...  Training Step: 312...  Training loss: 1.5853...  1.4314 sec/batch
Epoch: 4/20...  Training Step: 313...  Training loss: 1.5414...  1.4139 sec/batch
...

Epoch: 5/20...  Training Step: 402...  Training loss: 1.5787...  1.3742 sec/batch
Epoch: 5/20...  Training Step: 403...  Training loss: 1.5688...  1.3747 sec/batch
Epoch: 5/20...  Training Step: 404...  Training loss: 1.5570...  1.3727 sec/batch
Epoch: 5/20...  Training Step: 405...  Training loss: 1.5897...  1.3742 sec/batch
Epoch: 5/20...  Training Step: 406...  Training loss: 1.5849...  1.3727 sec/batch
Epoch: 5/20...  Training Step: 407...  Training loss: 1.6119...  1.3731 sec/batch
Epoch: 5/20...  Training Step: 408...  Training loss: 1.5676...  1.3726 sec/batch
Epoch: 5/20...  Training Step: 409...  Training loss: 1.5775...  1.3737 sec/batch
Epoch: 5/20...  Training Step: 410...  Training loss: 1.5530...  1.3716 sec/batch
Epoch: 5/20...  Training Step: 411...  Training loss: 1.5719...  1.3706 sec/batch
Epoch: 5/20...  Training Step: 412...  Training loss: 1.5585...  1.3721 sec/batch
Epoch: 5/20...  Training Step: 413...  Training loss: 1.5361...  1.3711 sec/batch
...

Epoch: 6/20...  Training Step: 502...  Training loss: 1.5593...  1.3939 sec/batch
Epoch: 6/20...  Training Step: 503...  Training loss: 1.5538...  1.4111 sec/batch
Epoch: 6/20...  Training Step: 504...  Training loss: 1.5341...  1.3988 sec/batch
Epoch: 6/20...  Training Step: 505...  Training loss: 1.5597...  1.3916 sec/batch
Epoch: 6/20...  Training Step: 506...  Training loss: 1.5808...  1.4005 sec/batch
Epoch: 6/20...  Training Step: 507...  Training loss: 1.5925...  1.3726 sec/batch
Epoch: 6/20...  Training Step: 508...  Training loss: 1.5487...  1.3740 sec/batch
Epoch: 6/20...  Training Step: 509...  Training loss: 1.5548...  1.3726 sec/batch
Epoch: 6/20...  Training Step: 510...  Training loss: 1.5295...  1.3864 sec/batch
Epoch: 6/20...  Training Step: 511...  Training loss: 1.5500...  1.3741 sec/batch
Epoch: 6/20...  Training Step: 512...  Training loss: 1.5385...  1.3721 sec/batch
Epoch: 6/20...  Training Step: 513...  Training loss: 1.5123...  1.3756 sec/batch
...

Epoch: 7/20...  Training Step: 602...  Training loss: 1.5224...  1.3869 sec/batch
Epoch: 7/20...  Training Step: 603...  Training loss: 1.5296...  1.3855 sec/batch
Epoch: 7/20...  Training Step: 604...  Training loss: 1.5135...  1.3846 sec/batch
Epoch: 7/20...  Training Step: 605...  Training loss: 1.5442...  1.3908 sec/batch
Epoch: 7/20...  Training Step: 606...  Training loss: 1.5625...  1.3912 sec/batch
Epoch: 7/20...  Training Step: 607...  Training loss: 1.5783...  1.3701 sec/batch
Epoch: 7/20...  Training Step: 608...  Training loss: 1.5331...  1.3730 sec/batch
Epoch: 7/20...  Training Step: 609...  Training loss: 1.5443...  1.3735 sec/batch
Epoch: 7/20...  Training Step: 610...  Training loss: 1.5131...  1.3715 sec/batch
Epoch: 7/20...  Training Step: 611...  Training loss: 1.5339...  1.3732 sec/batch
Epoch: 7/20...  Training Step: 612...  Training loss: 1.5296...  1.3741 sec/batch
Epoch: 7/20...  Training Step: 613...  Training loss: 1.4960...  1.3761 sec/batch
...

Epoch: 8/20...  Training Step: 702...  Training loss: 1.5156...  1.3777 sec/batch
Epoch: 8/20...  Training Step: 703...  Training loss: 1.5186...  1.3768 sec/batch
Epoch: 8/20...  Training Step: 704...  Training loss: 1.5012...  1.3754 sec/batch
Epoch: 8/20...  Training Step: 705...  Training loss: 1.5243...  1.3668 sec/batch
Epoch: 8/20...  Training Step: 706...  Training loss: 1.5439...  1.3727 sec/batch
Epoch: 8/20...  Training Step: 707...  Training loss: 1.5549...  1.3782 sec/batch
Epoch: 8/20...  Training Step: 708...  Training loss: 1.5229...  1.3755 sec/batch
Epoch: 8/20...  Training Step: 709...  Training loss: 1.5296...  1.3890 sec/batch
Epoch: 8/20...  Training Step: 710...  Training loss: 1.4970...  1.3760 sec/batch
Epoch: 8/20...  Training Step: 711...  Training loss: 1.5148...  1.3748 sec/batch
Epoch: 8/20...  Training Step: 712...  Training loss: 1.5113...  1.3759 sec/batch
Epoch: 8/20...  Training Step: 713...  Training loss: 1.4781...  1.3772 sec/batch
Epoch: 8/20...  

Epoch: 9/20...  Training Step: 802...  Training loss: 1.4989...  1.4040 sec/batch
Epoch: 9/20...  Training Step: 803...  Training loss: 1.4981...  1.3944 sec/batch
Epoch: 9/20...  Training Step: 804...  Training loss: 1.4717...  1.4204 sec/batch
Epoch: 9/20...  Training Step: 805...  Training loss: 1.5131...  1.4098 sec/batch
Epoch: 9/20...  Training Step: 806...  Training loss: 1.5130...  1.4447 sec/batch
Epoch: 9/20...  Training Step: 807...  Training loss: 1.5373...  1.4815 sec/batch
Epoch: 9/20...  Training Step: 808...  Training loss: 1.4942...  1.4752 sec/batch
Epoch: 9/20...  Training Step: 809...  Training loss: 1.5037...  1.4333 sec/batch
Epoch: 9/20...  Training Step: 810...  Training loss: 1.4799...  1.4626 sec/batch
Epoch: 9/20...  Training Step: 811...  Training loss: 1.4912...  1.4486 sec/batch
Epoch: 9/20...  Training Step: 812...  Training loss: 1.4891...  1.4333 sec/batch
Epoch: 9/20...  Training Step: 813...  Training loss: 1.4657...  1.4024 sec/batch
Epoch: 9/20...  

Epoch: 10/20...  Training Step: 902...  Training loss: 1.4738...  1.3765 sec/batch
Epoch: 10/20...  Training Step: 903...  Training loss: 1.4895...  1.3915 sec/batch
Epoch: 10/20...  Training Step: 904...  Training loss: 1.4688...  1.3973 sec/batch
Epoch: 10/20...  Training Step: 905...  Training loss: 1.4946...  1.4010 sec/batch
Epoch: 10/20...  Training Step: 906...  Training loss: 1.5063...  1.4037 sec/batch
Epoch: 10/20...  Training Step: 907...  Training loss: 1.5332...  1.4129 sec/batch
Epoch: 10/20...  Training Step: 908...  Training loss: 1.4907...  1.4170 sec/batch
Epoch: 10/20...  Training Step: 909...  Training loss: 1.4868...  1.4133 sec/batch
Epoch: 10/20...  Training Step: 910...  Training loss: 1.4676...  1.4148 sec/batch
Epoch: 10/20...  Training Step: 911...  Training loss: 1.4772...  1.4595 sec/batch
Epoch: 10/20...  Training Step: 912...  Training loss: 1.4788...  1.4300 sec/batch
Epoch: 10/20...  Training Step: 913...  Training loss: 1.4412...  1.4504 sec/batch
Epoc

Epoch: 11/20...  Training Step: 1001...  Training loss: 1.5313...  1.3752 sec/batch
Epoch: 11/20...  Training Step: 1002...  Training loss: 1.4616...  1.3797 sec/batch
Epoch: 11/20...  Training Step: 1003...  Training loss: 1.4713...  1.3813 sec/batch
Epoch: 11/20...  Training Step: 1004...  Training loss: 1.4489...  1.3940 sec/batch
Epoch: 11/20...  Training Step: 1005...  Training loss: 1.4869...  1.3896 sec/batch
Epoch: 11/20...  Training Step: 1006...  Training loss: 1.4828...  1.3788 sec/batch
Epoch: 11/20...  Training Step: 1007...  Training loss: 1.5129...  1.4046 sec/batch
Epoch: 11/20...  Training Step: 1008...  Training loss: 1.4824...  1.4113 sec/batch
Epoch: 11/20...  Training Step: 1009...  Training loss: 1.4775...  1.4019 sec/batch
Epoch: 11/20...  Training Step: 1010...  Training loss: 1.4513...  1.3923 sec/batch
Epoch: 11/20...  Training Step: 1011...  Training loss: 1.4661...  1.3939 sec/batch
Epoch: 11/20...  Training Step: 1012...  Training loss: 1.4535...  1.4084 se

Epoch: 11/20...  Training Step: 1099...  Training loss: 1.4351...  1.4248 sec/batch
Epoch: 11/20...  Training Step: 1100...  Training loss: 1.4361...  1.4642 sec/batch
Epoch: 12/20...  Training Step: 1101...  Training loss: 1.5080...  1.3742 sec/batch
Epoch: 12/20...  Training Step: 1102...  Training loss: 1.4536...  1.3799 sec/batch
Epoch: 12/20...  Training Step: 1103...  Training loss: 1.4573...  1.3776 sec/batch
Epoch: 12/20...  Training Step: 1104...  Training loss: 1.4289...  1.3767 sec/batch
Epoch: 12/20...  Training Step: 1105...  Training loss: 1.4630...  1.3799 sec/batch
Epoch: 12/20...  Training Step: 1106...  Training loss: 1.4794...  1.3793 sec/batch
Epoch: 12/20...  Training Step: 1107...  Training loss: 1.5017...  1.3807 sec/batch
Epoch: 12/20...  Training Step: 1108...  Training loss: 1.4552...  1.3795 sec/batch
Epoch: 12/20...  Training Step: 1109...  Training loss: 1.4649...  1.3792 sec/batch
Epoch: 12/20...  Training Step: 1110...  Training loss: 1.4442...  1.3817 se

Epoch: 12/20...  Training Step: 1197...  Training loss: 1.4242...  1.4195 sec/batch
Epoch: 12/20...  Training Step: 1198...  Training loss: 1.4126...  1.4319 sec/batch
Epoch: 12/20...  Training Step: 1199...  Training loss: 1.4239...  1.4565 sec/batch
Epoch: 12/20...  Training Step: 1200...  Training loss: 1.4352...  1.5267 sec/batch
Epoch: 13/20...  Training Step: 1201...  Training loss: 1.5036...  1.3761 sec/batch
Epoch: 13/20...  Training Step: 1202...  Training loss: 1.4417...  1.3796 sec/batch
Epoch: 13/20...  Training Step: 1203...  Training loss: 1.4348...  1.3763 sec/batch
Epoch: 13/20...  Training Step: 1204...  Training loss: 1.4134...  1.3902 sec/batch
Epoch: 13/20...  Training Step: 1205...  Training loss: 1.4552...  1.3900 sec/batch
Epoch: 13/20...  Training Step: 1206...  Training loss: 1.4610...  1.4309 sec/batch
Epoch: 13/20...  Training Step: 1207...  Training loss: 1.4766...  1.4263 sec/batch
Epoch: 13/20...  Training Step: 1208...  Training loss: 1.4477...  1.4211 se

Epoch: 13/20...  Training Step: 1295...  Training loss: 1.4451...  1.4243 sec/batch
Epoch: 13/20...  Training Step: 1296...  Training loss: 1.3891...  1.4348 sec/batch
Epoch: 13/20...  Training Step: 1297...  Training loss: 1.4029...  1.4335 sec/batch
Epoch: 13/20...  Training Step: 1298...  Training loss: 1.4088...  1.4291 sec/batch
Epoch: 13/20...  Training Step: 1299...  Training loss: 1.4007...  1.4142 sec/batch
Epoch: 13/20...  Training Step: 1300...  Training loss: 1.4300...  1.4257 sec/batch
Epoch: 14/20...  Training Step: 1301...  Training loss: 1.4892...  1.3757 sec/batch
Epoch: 14/20...  Training Step: 1302...  Training loss: 1.4291...  1.3762 sec/batch
Epoch: 14/20...  Training Step: 1303...  Training loss: 1.4235...  1.3721 sec/batch
Epoch: 14/20...  Training Step: 1304...  Training loss: 1.4203...  1.3767 sec/batch
Epoch: 14/20...  Training Step: 1305...  Training loss: 1.4405...  1.3767 sec/batch
Epoch: 14/20...  Training Step: 1306...  Training loss: 1.4512...  1.3787 se

Epoch: 14/20...  Training Step: 1393...  Training loss: 1.4266...  1.4229 sec/batch
Epoch: 14/20...  Training Step: 1394...  Training loss: 1.3910...  1.4333 sec/batch
Epoch: 14/20...  Training Step: 1395...  Training loss: 1.4365...  1.3977 sec/batch
Epoch: 14/20...  Training Step: 1396...  Training loss: 1.3893...  1.4104 sec/batch
Epoch: 14/20...  Training Step: 1397...  Training loss: 1.3883...  1.4163 sec/batch
Epoch: 14/20...  Training Step: 1398...  Training loss: 1.3911...  1.4071 sec/batch
Epoch: 14/20...  Training Step: 1399...  Training loss: 1.3856...  1.3764 sec/batch
Epoch: 14/20...  Training Step: 1400...  Training loss: 1.4136...  1.3789 sec/batch
Epoch: 15/20...  Training Step: 1401...  Training loss: 1.4744...  1.3754 sec/batch
Epoch: 15/20...  Training Step: 1402...  Training loss: 1.4207...  1.3790 sec/batch
Epoch: 15/20...  Training Step: 1403...  Training loss: 1.4106...  1.3905 sec/batch
Epoch: 15/20...  Training Step: 1404...  Training loss: 1.4043...  1.3887 se

Epoch: 15/20...  Training Step: 1491...  Training loss: 1.3784...  1.4264 sec/batch
Epoch: 15/20...  Training Step: 1492...  Training loss: 1.3800...  1.4199 sec/batch
Epoch: 15/20...  Training Step: 1493...  Training loss: 1.4212...  1.4436 sec/batch
Epoch: 15/20...  Training Step: 1494...  Training loss: 1.3856...  1.4347 sec/batch
Epoch: 15/20...  Training Step: 1495...  Training loss: 1.4246...  1.4326 sec/batch
Epoch: 15/20...  Training Step: 1496...  Training loss: 1.3695...  1.4164 sec/batch
Epoch: 15/20...  Training Step: 1497...  Training loss: 1.3795...  1.4244 sec/batch
Epoch: 15/20...  Training Step: 1498...  Training loss: 1.3743...  1.4515 sec/batch
Epoch: 15/20...  Training Step: 1499...  Training loss: 1.3819...  1.4159 sec/batch
Epoch: 15/20...  Training Step: 1500...  Training loss: 1.3979...  1.3735 sec/batch
Epoch: 16/20...  Training Step: 1501...  Training loss: 1.4693...  1.3763 sec/batch
Epoch: 16/20...  Training Step: 1502...  Training loss: 1.4128...  1.3751 se

Epoch: 16/20...  Training Step: 1589...  Training loss: 1.3719...  1.3731 sec/batch
Epoch: 16/20...  Training Step: 1590...  Training loss: 1.3710...  1.3716 sec/batch
Epoch: 16/20...  Training Step: 1591...  Training loss: 1.3709...  1.3752 sec/batch
Epoch: 16/20...  Training Step: 1592...  Training loss: 1.3647...  1.3767 sec/batch
Epoch: 16/20...  Training Step: 1593...  Training loss: 1.4053...  1.3752 sec/batch
Epoch: 16/20...  Training Step: 1594...  Training loss: 1.3660...  1.3757 sec/batch
Epoch: 16/20...  Training Step: 1595...  Training loss: 1.4080...  1.3777 sec/batch
Epoch: 16/20...  Training Step: 1596...  Training loss: 1.3632...  1.3751 sec/batch
Epoch: 16/20...  Training Step: 1597...  Training loss: 1.3856...  1.3762 sec/batch
Epoch: 16/20...  Training Step: 1598...  Training loss: 1.3612...  1.3777 sec/batch
Epoch: 16/20...  Training Step: 1599...  Training loss: 1.3695...  1.3767 sec/batch
Epoch: 16/20...  Training Step: 1600...  Training loss: 1.3830...  1.3752 se

Epoch: 17/20...  Training Step: 1687...  Training loss: 1.3485...  1.3767 sec/batch
Epoch: 17/20...  Training Step: 1688...  Training loss: 1.3670...  1.3773 sec/batch
Epoch: 17/20...  Training Step: 1689...  Training loss: 1.3546...  1.3767 sec/batch
Epoch: 17/20...  Training Step: 1690...  Training loss: 1.3623...  1.3757 sec/batch
Epoch: 17/20...  Training Step: 1691...  Training loss: 1.3473...  1.3722 sec/batch
Epoch: 17/20...  Training Step: 1692...  Training loss: 1.3497...  1.3772 sec/batch
Epoch: 17/20...  Training Step: 1693...  Training loss: 1.3889...  1.3717 sec/batch
Epoch: 17/20...  Training Step: 1694...  Training loss: 1.3631...  1.3792 sec/batch
Epoch: 17/20...  Training Step: 1695...  Training loss: 1.4040...  1.3742 sec/batch
Epoch: 17/20...  Training Step: 1696...  Training loss: 1.3526...  1.3732 sec/batch
Epoch: 17/20...  Training Step: 1697...  Training loss: 1.3638...  1.3716 sec/batch
Epoch: 17/20...  Training Step: 1698...  Training loss: 1.3605...  1.3767 se

Epoch: 18/20...  Training Step: 1785...  Training loss: 1.3653...  1.3737 sec/batch
Epoch: 18/20...  Training Step: 1786...  Training loss: 1.3708...  1.3732 sec/batch
Epoch: 18/20...  Training Step: 1787...  Training loss: 1.3311...  1.3732 sec/batch
Epoch: 18/20...  Training Step: 1788...  Training loss: 1.3610...  1.3818 sec/batch
Epoch: 18/20...  Training Step: 1789...  Training loss: 1.3506...  1.3747 sec/batch
Epoch: 18/20...  Training Step: 1790...  Training loss: 1.3612...  1.3793 sec/batch
Epoch: 18/20...  Training Step: 1791...  Training loss: 1.3482...  1.3854 sec/batch
Epoch: 18/20...  Training Step: 1792...  Training loss: 1.3524...  1.3980 sec/batch
Epoch: 18/20...  Training Step: 1793...  Training loss: 1.3931...  1.3918 sec/batch
Epoch: 18/20...  Training Step: 1794...  Training loss: 1.3554...  1.4065 sec/batch
Epoch: 18/20...  Training Step: 1795...  Training loss: 1.3862...  1.3909 sec/batch
Epoch: 18/20...  Training Step: 1796...  Training loss: 1.3488...  1.4092 se

Epoch: 19/20...  Training Step: 1883...  Training loss: 1.3586...  1.3732 sec/batch
Epoch: 19/20...  Training Step: 1884...  Training loss: 1.3400...  1.3747 sec/batch
Epoch: 19/20...  Training Step: 1885...  Training loss: 1.3618...  1.3762 sec/batch
Epoch: 19/20...  Training Step: 1886...  Training loss: 1.3615...  1.3721 sec/batch
Epoch: 19/20...  Training Step: 1887...  Training loss: 1.3293...  1.3721 sec/batch
Epoch: 19/20...  Training Step: 1888...  Training loss: 1.3555...  1.3732 sec/batch
Epoch: 19/20...  Training Step: 1889...  Training loss: 1.3460...  1.3716 sec/batch
Epoch: 19/20...  Training Step: 1890...  Training loss: 1.3493...  1.3762 sec/batch
Epoch: 19/20...  Training Step: 1891...  Training loss: 1.3310...  1.3731 sec/batch
Epoch: 19/20...  Training Step: 1892...  Training loss: 1.3410...  1.3747 sec/batch
Epoch: 19/20...  Training Step: 1893...  Training loss: 1.3742...  1.3787 sec/batch
Epoch: 19/20...  Training Step: 1894...  Training loss: 1.3386...  1.3777 se

Epoch: 20/20...  Training Step: 1981...  Training loss: 1.3518...  1.3729 sec/batch
Epoch: 20/20...  Training Step: 1982...  Training loss: 1.3737...  1.3747 sec/batch
Epoch: 20/20...  Training Step: 1983...  Training loss: 1.3470...  1.3762 sec/batch
Epoch: 20/20...  Training Step: 1984...  Training loss: 1.3404...  1.3731 sec/batch
Epoch: 20/20...  Training Step: 1985...  Training loss: 1.3455...  1.3762 sec/batch
Epoch: 20/20...  Training Step: 1986...  Training loss: 1.3566...  1.3792 sec/batch
Epoch: 20/20...  Training Step: 1987...  Training loss: 1.3231...  1.3747 sec/batch
Epoch: 20/20...  Training Step: 1988...  Training loss: 1.3380...  1.3732 sec/batch
Epoch: 20/20...  Training Step: 1989...  Training loss: 1.3286...  1.3787 sec/batch
Epoch: 20/20...  Training Step: 1990...  Training loss: 1.3263...  1.3797 sec/batch
Epoch: 20/20...  Training Step: 1991...  Training loss: 1.3307...  1.3757 sec/batch
Epoch: 20/20...  Training Step: 1992...  Training loss: 1.3274...  1.3747 se

# Now all that's left is to generate text from the network

We generate the text one character at a time, using a function that samples probabilistically from the _top N_ characters that the trained RNN considers most likely to come next.

In [37]:
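def pick_top_n(preds, vocab_size, top_n=5):
    # Copy so the caller's prediction array is not modified in place
    p = np.squeeze(preds).copy()
    # Zero out everything except the top_n most probable characters
    p[np.argsort(p)[:-top_n]] = 0
    # Renormalize so the remaining probabilities sum to 1
    p = p / np.sum(p)
    # Draw one character index from the truncated distribution
    c = np.random.choice(vocab_size, 1, p=p)[0]
    return c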
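To make the top-N step concrete, here is a minimal NumPy-only sketch on a made-up six-character distribution. The helper `top_n_distribution` and the numbers in `toy_preds` are assumptions for illustration only; they are not part of the notebook or of the trained model.

```python
import numpy as np

def top_n_distribution(preds, top_n=5):
    """Hypothetical helper: keep the top_n probabilities and renormalize."""
    p = np.squeeze(preds).astype(np.float64)  # astype returns a copy
    p[np.argsort(p)[:-top_n]] = 0.0           # drop all but the top_n entries
    return p / p.sum()

# Made-up softmax output over a 6-character vocabulary
toy_preds = np.array([[0.05, 0.40, 0.10, 0.25, 0.15, 0.05]])
dist = top_n_distribution(toy_preds, top_n=3)
print(dist)  # only indices 1, 3 and 4 remain non-zero (about 0.5, 0.31, 0.19)

# Sampling from the truncated distribution can only return those indices
rng = np.random.default_rng(0)
draws = rng.choice(len(dist), size=1000, p=dist)
print(sorted(set(draws.tolist())))
```

A smaller `top_n` makes the generated text more conservative, while a larger one lets less likely characters through and tends to produce more spelling noise.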

In [38]:
def sample(checkpoint, n_samples, lstm_size, vocab_size, prime="The "):
    # The output starts with the priming text
    samples = [c for c in prime]
    # Build the network with sampling=True and restore the trained weights
    model = CharRNN(vocab_size, lstm_size=lstm_size, sampling=True)
    saver = tf.train.Saver()
    with tf.Session() as sess:
        saver.restore(sess, checkpoint)
        new_state = sess.run(model.initial_state)
        # Feed the priming characters one at a time to warm up the recurrent state
        for c in prime:
            x = np.zeros((1, 1))
            x[0,0] = vocab_to_int[c]
            feed = {model.inputs: x,
                    model.keep_prob: 1.,
                    model.initial_state: new_state}
            preds, new_state = sess.run([model.prediction, model.final_state],
                                         feed_dict=feed)

        # First generated character, sampled from the last priming prediction
        c = pick_top_n(preds, vocab_size)
        samples.append(int_to_vocab[c])

        # Feed each sampled character back in to predict the next one
        for _ in range(n_samples):
            x[0,0] = c
            feed = {model.inputs: x,
                    model.keep_prob: 1.,
                    model.initial_state: new_state}
            preds, new_state = sess.run([model.prediction, model.final_state],
                                         feed_dict=feed)

            c = pick_top_n(preds, vocab_size)
            samples.append(int_to_vocab[c])

    return ''.join(samples)

Here is the path of the latest checkpoint:

In [39]:
tf.train.latest_checkpoint('checkpoints')

'checkpoints\\i2000_l512.ckpt'
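The path above uses Windows separators, while other cells in this notebook write checkpoint paths with `/`. As a small sketch, assuming the files follow the `i<step>_l<lstm_size>.ckpt` pattern visible in that path, a hypothetical helper (`checkpoint_path` is not part of the notebook) can build the paths portably:

```python
import os

def checkpoint_path(step, lstm_size, directory="checkpoints"):
    # Hypothetical helper: mirrors the i<step>_l<lstm_size>.ckpt naming seen above
    return os.path.join(directory, "i{}_l{}.ckpt".format(step, lstm_size))

# Step and LSTM size taken from the checkpoint name shown above
path = checkpoint_path(2000, 512)
print(path)
```

`os.path.join` picks the separator for the current operating system, so the same cell works on Windows and on Unix-like systems.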

Here are some examples of text generated from this checkpoint:

In [40]:
checkpoint = tf.train.latest_checkpoint('checkpoints')
samp = sample(checkpoint, 1000, lstm_size, len(vocab), prime="Era")
print(samp)

INFO:tensorflow:Restoring parameters from checkpoints\i2000_l512.ckpt
Eram sombras. Mas não estava pronto. As pravessas e em seus olhos, ele passavam de novo. Entregando-se correndo a partiram. O portão se encontrava as passagres do Anel e estava claro, em suas
sementes a cabeça e a certa, em pess antes do que era uma margina, mos quando os outros permarecem esperando as coisas até o que procurava em direção ao leste, e estavam entre ela, e se algumas portas atrás dele, com um cavale repontou e ali, atantou a chama do campo, e estava em suas coiras atrás deles a lenta e enquanto a mesm em seus anos sobre o senhor entes de uma coisa que, sinuos e sobres e olhos contra um consolho prescapar. Olhou ao lado de Bri, e o meu caminho davam por algum
senhor. E estava
subindo no lago cobriu o carado e suriam e colocadas a espada de ali, e a espura se apagar pou a chiva que era o sim dos anos, a lesbe das montanhas se abriumos e com embarco a maiar e descia sobre o mesmo tampé. O sol entre seus 

In [46]:
checkpoint = tf.train.latest_checkpoint('checkpoints')
samp = sample(checkpoint, 1000, lstm_size, len(vocab), prime="Era")
print(samp)

INFO:tensorflow:Restoring parameters from checkpoints\i2000_l512.ckpt
Eram na floresta no tempo, conduzindo-o pela margem. E se enquanto alguma for a cordento e os perto da perigo, e a centeria a comida e a mesmo asi estivessem cheias de carados. As coisas acha que seus portas, além das montanhas do mal estava sentado até que as passagem solteres, e agora se esgrevam pronunciar por uma vez a noite a coração por trazia de novo. A mara do modo tinha através do convidado para ele. Em sua mão de volta de caminhos saber que as pontos profundos dos outros arasmos pra ele, e a luz de seu anos para
o próprio Cavaleiro da Entrad. A perdo em passou em cira na esprada. A passagem subindo de um lar e outros ele se apreximava por esta longa, cominha de como a espada, em que ele supou se tentar engrando a mais frate, e o menos sentiu o senhor;
e assim e chegaram o que tem sabedos. Mesmo as palavras do Anel estava melhor. Ele se pode chamar de uma crescura. O caminho e a porta sinto como um cavilho e

### Note how the generated text improves as training progresses:

In [24]:
checkpoint = 'checkpoints\\i100_l512.ckpt'
samp = sample(checkpoint, 1000, lstm_size, len(vocab), prime="Era")
print(samp)

INFO:tensorflow:Restoring parameters from checkpoints\i100_l512.ckpt
Eraa  e sa aoo  eas s a o a  aeesee  esa  soss oess ee ae s  sasa  esase   ooo eee  ss  es s  eeo e  es e a aoe ao  es esaesa  ao ose a a o ao e  seeoe ase ose o  aaa  a eeaa sas  seooa eoea so ssees ose oosss e  ee aao a aoao eeao eeo   aa oses e a eses aeasssea  o o a aaeee  oesa  aae oo  oa  e o ooe  oooeo  s o oaoa  ae  sa aaas   aoa o  oses  osaeao  oaa  o ea  eae s ooe  oaoe e oea oe a  aae  oe sse eeoae  a asseo o ae   esa e  aaaa o  ae ose   eoesa e saeos ssaeos a eees  so e aseae   aeoaa  aaaes esaaso  eaee   s raea o s eao  e   osaa  s ea a es aoa esas o aaee s oeooos    oaae  oaeeoe   eosoa  a ees  ao o  eeo  aa asee see sooos  ee aee  o ssaoaa  sae a se  oa s ao soeo   aoaseo o e oe os essosaee ess  o a sea o oeo o soeeoa   eaoao  oeo seoeo aeos  ae e aa e ae ssaa aoee  a  saaa   eaeoo a  eo aaoeea  eoe  so oa  e ae eea   aeoae  osso a a a ee  e  ra soo sa o a a saa ae   sa saae  oaaa e a ee  oaos oo oe ao

In [25]:
checkpoint = 'checkpoints/i200_l512.ckpt'
samp = sample(checkpoint, 1000, lstm_size, len(vocab), prime="Era")
print(samp)

INFO:tensorflow:Restoring parameters from checkpoints/i200_l512.ckpt
Era, 
ente sue sanara aranam e palas dos ue se te tirade eres amarartina donso ses re tiadas ponasa dor am ramam uido de cola dendo enante core a contondos ostiridam dam desdo esestos as ertasane to conam a palana semas ar e pantena a e paremo dontam ementos ues elorone do pemtamas ene tentera so pestoras or os ane as a es o tereno. — E tas elo telia, aramo cas erar aro pianom o eria poma sentamto emo ta tos enentemes o doro pirona pamtes do pomas destens ue dos aras dele de endam ere asemtam a pelo dessa de parte e e pasrrane anside eltona dore do esenantoras o poros uass os e taro su ro cas ase pos eresemas elo pore endo ses eses uos pirtor eror ua poreras ento tarem doses e prro des oss ui a essiito, am osser a as e casdira. — Esso esta ciado dos o es as rres e penes onta se eltoras a te tono esase as ararta ame cortordo se tanosem arena cimas antinode. — Ertas urares delamtos ui dertone de pirantu sa amtos antino 

In [26]:
checkpoint = 'checkpoints/i600_l512.ckpt'
samp = sample(checkpoint, 1000, lstm_size, len(vocab), prime="Era")
print(samp)

INFO:tensorflow:Restoring parameters from checkpoints/i600_l512.ckpt
Erar ma no sua conco tero do pestos, em a nom sembia, e astresso de suar estinha conte para está dila.
Ertovam a misto do fonha e contente a condo ele came estados algore dos osses derros. A alguma empava e encontado, e ancurta de penterios de pasa cariginta.
— Então por cortou, ela num cominho de paria camanta.
— Mas não tressa, alés pora essa entenda de padina. Alave es alguntos. — Não está pera o dia estenda a destra e enteram por esto destando, coma e silha e perecas essor camo e a mosso astia pale contando, e cama cintira disse pode es raveles., Mesta ne cunaram a dis com de porte dere ela em ancostar de cadente a asserado. Nove e ascula espentido, e sobbide o que ente para a contuda.
— Vente a manter, alhos se serrando, e pranasser pos enseras precidessas e ancira de sou e escirindo a semes, e esculos de cirando da
caleça estrava do cregindo.
Osso podo sua angontaram não com diveita am olente a pinaceste. Es uma

In [27]:
checkpoint = 'checkpoints/i1000_l512.ckpt'
samp = sample(checkpoint, 1000, lstm_size, len(vocab), prime="Era")
print(samp)

INFO:tensorflow:Restoring parameters from checkpoints/i1000_l512.ckpt
Eran, das alguéis altave de aragora e o pessoria protunda, e as pessagassas procimeras
em solhar, em podo o caminho com os harigos, e o que assia
acerder que não senhorem seu corrente dos próximos do cantina, e enquinto e estejamado palavras angosa com a sobre dos cobras de modo. A encurta e a chegar a menher atravia de argum. Não sendo então estantemante, as mais distestavam de pedidass das coisas de se arrasteres.
— Acho delho que possíamos duintar de modem a com deles. Encandou-se sera de carragado a coisa. E estava algués do que elfos. As muitas antes da conter o mai do que não podemos se astia e procurando-e paseram a cencação do passo das árvores, como acho de sua própria choia a com desejo.
— Não passo da munta de sous, enquentro de
seu creste distante, por escitou a mais entrada do modo, e não tinha acondecuro da melo, ma malgum do com o desejarem a mais de
alhos estadas antes aligasse coisas, e o pispou a pi