# Ismail Kadare - Gjakftohtesia

This notebook builds a character-wise RNN trained on "Gjakftohtesia" by the Albanian author Ismail Kadare. The trained network is then used to generate a new chapter.

This network is based on https://github.com/udacity/deep-learning/tree/master/intro-to-rnns

In [1]:
import time
from collections import namedtuple

import numpy as np
import tensorflow as tf

Load the text and create the dictionaries that map characters to integers and back

In [2]:
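with open('Ismail Kadare - Gjakftohtesia.txt', 'r', encoding="iso-8859-1") as f:
    text = f.read()
# Every distinct character in the text, in sorted order
vocab = sorted(set(text))
# Lookup tables: character -> integer id and integer id -> character
vocab_to_int = {c: i for i, c in enumerate(vocab)}
int_to_vocab = dict(enumerate(vocab))
# The whole text as an array of integer ids
encoded = np.array([vocab_to_int[c] for c in text], dtype=np.int32)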
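A small round trip can make the role of the two dictionaries concrete. The sketch below applies the same construction to a short stand-in string (`sample_text` and the `toy_` names are hypothetical, used so the example runs without the novel's text file) and checks that decoding the integer array gives back the original characters.

```python
import numpy as np

# Hypothetical stand-in for the novel's text, so the example runs without the file
sample_text = "gjak dhe akull"

# Same construction as in the notebook cell, applied to the toy string
toy_vocab = sorted(set(sample_text))
toy_vocab_to_int = {c: i for i, c in enumerate(toy_vocab)}
toy_int_to_vocab = dict(enumerate(toy_vocab))

# Encode characters as integers, then decode them back
toy_encoded = np.array([toy_vocab_to_int[c] for c in sample_text], dtype=np.int32)
decoded = ''.join(toy_int_to_vocab[int(i)] for i in toy_encoded)

print(toy_vocab)
print(toy_encoded)
print(decoded == sample_text)
```

Because the vocabulary is sorted, the integer ids depend only on which characters occur in the text, not on the order in which they first appear.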

In [3]:
print('vocab', vocab)
print('vocab_to_int', vocab_to_int)
print('int_to_vocab', int_to_vocab)

vocab ['\n', '\x0c', ' ', '!', '"', '#', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '<', '>', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '[', '\\', ']', '^', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '~', '\x82', '\x86', '\x88', '\x89', '\x8a', '\x8d', '\x8e', '\x8f', '\x90', '\x91', '\x93', '\x94', '\x95', '\x97', '\x98', '\x99', '\x9a', '\x9d', '\x9e', '\x9f', '¡', '£', '¤', '¥', '§', 'Ç', 'È', 'Ñ', 'ã', 'å', 'æ', 'è', 'ë', 'ó']
vocab_to_int {'\x8e': 96, 'L': 43, 'l': 74, 'D': 35, '+': 12, 'å': 119, 'F': 37, 'V': 53, 'Q': 48, 'b': 64, '&': 7, 'R': 49, '"': 4, '?': 31, ',': 13, '\n': 0, 'è': 121, ':': 27, '9': 26, ')': 10, '0': 17, '\x8f': 97, 'e': 67, '\\': 59, 'f': 68, '\x86': 91, 'C': 34, 'Z': 57, '/': 16, '(': 9, '!': 3, '\x8d':

In [4]:
text[:100]

'ISMAIL \nKADARE \n\nGJAKFTOHTèSIA \n\nNOVELA \n\nSHTèPIA BOTUESE ÇNAIM FRASHèRIÈ \n\n\x0c\nRedaktor \n\n\x0c\ns \n\nSilva'

Characters encoded as integers

In [5]:
encoded[:100]

array([ 40,  50,  44,  32,  40,  43,   2,   0,  42,  32,  35,  32,  49,
        36,   2,   0,   0,  38,  41,  32,  42,  37,  51,  46,  39,  51,
       121,  50,  40,  32,   2,   0,   0,  45,  46,  53,  36,  43,  32,
         2,   0,   0,  50,  39,  51, 121,  47,  40,  32,   2,  33,  46,
        51,  52,  36,  50,  36,   2, 115,  45,  32,  40,  44,   2,  37,
        49,  32,  50,  39, 121,  49,  40, 116,   2,   0,   0,   1,   0,
        49,  67,  66,  63,  73,  82,  77,  80,   2,   0,   0,   1,   0,
        81,   2,   0,   0,  50,  71,  74,  84,  63], dtype=int32)

How many character classes?

In [6]:
len(vocab)

124

## Making training mini-batches

In [7]:
def get_batches(arr, batch_size, n_steps):
    '''Create a generator that returns batches of size
       batch_size x n_steps from arr.
       
       Arguments
       ---------
       arr: Array you want to make batches from
       batch_size: Batch size, the number of sequences per batch
       n_steps: Number of sequence steps per batch
    '''
    # Get the number of characters per batch and number of batches we can make
    characters_per_batch = batch_size * n_steps
    n_batches = len(arr) // characters_per_batch
    
    # Keep only enough characters to make full batches
    arr = arr[:n_batches*characters_per_batch]
    
    # Reshape into batch_size rows
    arr = arr.reshape(batch_size,-1)
    
    for n in range(0, arr.shape[1], n_steps):
        # The features
        x = arr[:, n:n+n_steps]
        # The targets, shifted by one
        y = np.zeros_like(x)
        y[:, :-1] = x[:, 1:]
        y[:, -1] = x[:, 0]
        yield x, y
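To see what the generator produces, it can help to run it on an array whose values are easy to follow. The sketch below repeats a compact copy of `get_batches` so that it runs on its own, and feeds it `np.arange(23)` (a hypothetical toy input) with `batch_size=2` and `n_steps=5`.

```python
import numpy as np

def get_batches(arr, batch_size, n_steps):
    # Compact copy of the notebook's generator so this example runs on its own
    characters_per_batch = batch_size * n_steps
    n_batches = len(arr) // characters_per_batch
    arr = arr[:n_batches * characters_per_batch].reshape(batch_size, -1)
    for n in range(0, arr.shape[1], n_steps):
        x = arr[:, n:n + n_steps]
        y = np.zeros_like(x)
        y[:, :-1] = x[:, 1:]
        y[:, -1] = x[:, 0]
        yield x, y

toy = np.arange(23)   # 23 "characters"; the last 3 cannot fill a batch and are dropped
batches = list(get_batches(toy, batch_size=2, n_steps=5))
first_x, first_y = batches[0]

print(len(batches))
print(first_x)
print(first_y)
```

Each row of `first_y` is the matching row of `first_x` shifted left by one. The last target in each row wraps around to the first input of the same window rather than the true next character, which is what the line `y[:, -1] = x[:, 0]` does.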

Test get_batches()

In [8]:
batches = get_batches(encoded, 10, 50)
x, y = next(batches)

In [9]:
print('x\n', x[:10, :10])
print('\ny\n', y[:10, :10])

x
 [[40 50 44 32 40 43  2  0 42 32]
 [ 0 82 99  2 81 63 72 13  2 73]
 [ 2 78 83 81 83 74 74 99 15  2]
 [78 80 67 72  2 81 67 65 71 74]
 [72 77 80 69 83  2 78 99 80 78]
 [84 67 13  2 79 99  2 66 83 73]
 [64 83 81 70 83 80  2 75 67  2]
 [63  2 71 81 70 82 67  2 78 77]
 [77 71  2 78 99 80 78 63 80 63]
 [99 80 13  2 79 99  2 71 81 70]]

y
 [[50 44 32 40 43  2  0 42 32 35]
 [82 99  2 81 63 72 13  2 73 63]
 [78 83 81 83 74 74 99 15  2 50]
 [80 67 72  2 81 67 65 71 74 71]
 [77 80 69 83  2 78 99 80 78 71]
 [67 13  2 79 99  2 66 83 73 67]
 [83 81 70 83 80  2 75 67  2 81]
 [ 2 71 81 70 82 67  2 78 77 88]
 [71  2 78 99 80 78 63 80 63  2]
 [80 13  2 79 99  2 71 81 70 82]]


In [10]:
def build_inputs(batch_size, num_steps):
    ''' Define placeholders for inputs, targets, and dropout 
    
        Arguments
        ---------
        batch_size: Batch size, number of sequences per batch
        num_steps: Number of sequence steps in a batch
        
    '''
    # Declare placeholders we'll feed into the graph
    inputs = tf.placeholder(tf.int32, [batch_size, num_steps])
    targets = tf.placeholder(tf.int32, [batch_size, num_steps])
    
    # Keep probability placeholder for drop out layers
    keep_prob = tf.placeholder(tf.float32, name='keep_prob')
    
    return inputs, targets, keep_prob

### LSTM Cell

In [11]:
def build_lstm(lstm_size, num_layers, batch_size, keep_prob):
    ''' Build LSTM cell.
    
        Arguments
        ---------
        keep_prob: Scalar tensor (tf.placeholder) for the dropout keep probability
        lstm_size: Size of the hidden layers in the LSTM cells
        num_layers: Number of LSTM layers
        batch_size: Batch size

    '''
    ### Build the LSTM Cell
    def build_cell(num_units, keep_prob):
        # Use a basic LSTM cell
        lstm = tf.contrib.rnn.BasicLSTMCell(num_units)
        # Add dropout to the cell outputs
        drop = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)
    
        return drop
    
    
    # Stack up multiple LSTM layers, for deep learning
    cell = tf.contrib.rnn.MultiRNNCell([build_cell(lstm_size, keep_prob) for _ in range(num_layers)])
    initial_state = cell.zero_state(batch_size, tf.float32)
    
    return cell, initial_state
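For intuition about what a single LSTM layer computes at each step, here is a minimal NumPy sketch of a generic LSTM update. It is not the TensorFlow implementation: the function `lstm_step`, the gate ordering inside the weight matrix, the absence of a forget-gate bias offset and the toy sizes are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One step of a generic LSTM cell (illustrative, not TensorFlow's code).

    x: (batch, input_size); h_prev, c_prev: (batch, hidden)
    W: (input_size + hidden, 4 * hidden); b: (4 * hidden,)
    """
    gates = np.concatenate([x, h_prev], axis=1) @ W + b
    # Split into input gate, candidate, forget gate, output gate (order is an assumption)
    i, g, f, o = np.split(gates, 4, axis=1)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)   # new cell state
    h = sigmoid(o) * np.tanh(c)                         # new hidden state / output
    return h, c

rng = np.random.default_rng(0)
batch, input_size, hidden = 3, 124, 8   # 124 matches len(vocab); hidden is shrunk for the demo
W = rng.normal(scale=0.1, size=(input_size + hidden, 4 * hidden))
b = np.zeros(4 * hidden)

x = np.eye(input_size)[[40, 50, 44]]    # one-hot rows for three character ids
h = np.zeros((batch, hidden))           # zero initial state, like cell.zero_state
c = np.zeros((batch, hidden))
h, c = lstm_step(x, h, c, W, b)
print(h.shape, c.shape)
```

Stacking layers, as `MultiRNNCell` does, amounts to feeding the `h` of one layer as the `x` of the next, with each layer keeping its own `(h, c)` state.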

### RNN Output

In [12]:
def build_output(lstm_output, in_size, out_size):
    ''' Build a softmax layer, return the softmax output and logits.
    
        Arguments
        ---------
        
        lstm_output: List of output tensors from the LSTM layer
        in_size: Size of the input tensor, for example, size of the LSTM cells
        out_size: Size of this softmax layer
    
    '''

    # Reshape output so it's a bunch of rows, one row for each step for each sequence.
    # Concatenate lstm_output over axis 1 (the columns)
    seq_output = tf.concat(lstm_output, axis=1)
    # Reshape seq_output to a 2D tensor with lstm_size columns
    x = tf.reshape(seq_output, [-1, in_size])
    
    # Connect the RNN outputs to a softmax layer
    with tf.variable_scope('softmax'):
        # Create the weight and bias variables here
        softmax_w = tf.Variable(tf.truncated_normal((in_size, out_size), stddev=0.1))
        softmax_b = tf.Variable(tf.zeros(out_size))
    
    # Since output is a bunch of rows of RNN cell outputs, logits will be a bunch
    # of rows of logit outputs, one for each step and sequence
    logits = tf.matmul(x, softmax_w) + softmax_b
    
    # Use softmax to get the probabilities for predicted characters
    out = tf.nn.softmax(logits, name='predictions')
    
    return out, logits
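The reshaping in `build_output` is easier to follow with concrete shapes. The following NumPy sketch mirrors the same steps (flatten to one row per sequence step, apply the weights and bias, take a softmax) using toy sizes and random values that are assumptions for illustration only.

```python
import numpy as np

batch_size, num_steps, lstm_size, num_classes = 2, 3, 4, 5   # toy sizes
rng = np.random.default_rng(0)

# Stand-in for the LSTM outputs: one vector per sequence per step
lstm_output = rng.normal(size=(batch_size, num_steps, lstm_size))
flat = lstm_output.reshape(-1, lstm_size)     # one row per (sequence, step)

softmax_w = rng.normal(scale=0.1, size=(lstm_size, num_classes))
softmax_b = np.zeros(num_classes)
logits = flat @ softmax_w + softmax_b         # one row of logits per (sequence, step)

# Numerically stable softmax over the class axis
shifted = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

print(flat.shape, logits.shape)
print(probs.sum(axis=1))
```

Row `k * num_steps + t` of `flat` is step `t` of sequence `k`, which is why the targets have to be flattened in the same order before computing the loss.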

### Training loss

In [13]:
def build_loss(logits, targets, lstm_size, num_classes):
    ''' Calculate the loss from the logits and the targets.
    
        Arguments
        ---------
        logits: Logits from final fully connected layer
        targets: Targets for supervised learning
        lstm_size: Number of LSTM hidden units
        num_classes: Number of classes in targets
        
    '''
    
    # One-hot encode targets and reshape to match logits, one row per sequence per step
    y_one_hot = tf.one_hot(targets, num_classes)
    y_reshaped =  tf.reshape(y_one_hot, logits.get_shape())
    
    # Softmax cross entropy loss
    loss = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_reshaped)
    loss = tf.reduce_mean(loss)
    
    return loss
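As a reference point for the loss values, the sketch below computes the same quantity in NumPy: the mean softmax cross-entropy between logits and one-hot targets. The helper `softmax_cross_entropy` and the toy targets are hypothetical. With all-zero logits the model guesses uniformly, so the loss equals the natural log of the number of classes.

```python
import numpy as np

def softmax_cross_entropy(logits, targets, num_classes):
    """Mean cross-entropy of integer targets under softmax(logits)."""
    one_hot = np.eye(num_classes)[targets.reshape(-1)]   # one row per (sequence, step)
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(one_hot * log_probs).sum(axis=1).mean())

num_classes = 124                                    # len(vocab) in this notebook
targets = np.array([[40, 50, 44], [32, 40, 43]])     # toy (batch_size, num_steps) targets
uniform_logits = np.zeros((targets.size, num_classes))

baseline = softmax_cross_entropy(uniform_logits, targets, num_classes)
print(baseline, np.log(num_classes))
```

A training loss well below this baseline indicates that the network has learned something about the character statistics of the text.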

### Optimizer

In [14]:
def build_optimizer(loss, learning_rate, grad_clip):
    ''' Build optimizer for training, using gradient clipping.
    
        Arguments:
        loss: Network loss
        learning_rate: Learning rate for optimizer
        grad_clip: Threshold for the global norm of the gradients
    
    '''
    
    # Optimizer for training, using gradient clipping to control exploding gradients
    tvars = tf.trainable_variables()
    grads, _ = tf.clip_by_global_norm(tf.gradients(loss, tvars), grad_clip)
    train_op = tf.train.AdamOptimizer(learning_rate)
    optimizer = train_op.apply_gradients(zip(grads, tvars))
    
    return optimizer
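Clipping by global norm rescales all gradients together when their combined norm exceeds the threshold, and leaves them unchanged otherwise. A minimal NumPy sketch of that rule follows. The function below is an illustrative stand-in written for this example, not TensorFlow's code, and the gradient values are made up.

```python
import numpy as np

def clip_by_global_norm(grads, clip_norm):
    # Norm of all gradients taken together, as if concatenated into one vector
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    # Scale is 1 when the norm is within the limit, clip_norm / global_norm otherwise
    scale = clip_norm / max(global_norm, clip_norm)
    return [g * scale for g in grads], global_norm

grads = [np.array([3.0, 4.0]), np.array([[12.0]])]   # global norm is sqrt(9 + 16 + 144)
clipped, norm = clip_by_global_norm(grads, clip_norm=5.0)
clipped_norm = np.sqrt(sum(np.sum(g ** 2) for g in clipped))

print(norm, clipped_norm)
print(clipped)
```

Because every gradient is multiplied by the same factor, the direction of the overall update is preserved and only its length is limited.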

### Build the network

In [15]:
class CharRNN:
    
    def __init__(self, num_classes, batch_size=64, num_steps=50, 
                       lstm_size=128, num_layers=2, learning_rate=0.001, 
                       grad_clip=5, sampling=False):
    
        # When sampling later, we feed one character at a time,
        # so the graph is built for a single sequence of a single step
        if sampling:
            batch_size, num_steps = 1, 1

        tf.reset_default_graph()
        
        # Build the input placeholder tensors
        self.inputs, self.targets, self.keep_prob = build_inputs(batch_size, num_steps)

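        # Build the LSTM cell, passing the keep_prob placeholder so that
        # dropout can be switched off (keep_prob = 1.0) when sampling
        cell, self.initial_state = build_lstm(lstm_size, num_layers, batch_size, self.keep_prob)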

        ### Run the data through the RNN layers
        # First, one-hot encode the input tokens
        x_one_hot = tf.one_hot(self.inputs, num_classes)
        
        # Run each sequence step through the RNN with tf.nn.dynamic_rnn 
        outputs, state = tf.nn.dynamic_rnn(cell, x_one_hot, initial_state=self.initial_state)
        self.final_state = state
        
        # Get softmax predictions and logits
        self.prediction, self.logits = build_output(outputs, lstm_size, num_classes)
        
        # Loss and optimizer (with gradient clipping)
        self.loss = build_loss(self.logits, self.targets, lstm_size, num_classes) 
        self.optimizer = build_optimizer(self.loss, learning_rate, grad_clip)

## Hyperparameters

In [16]:
batch_size = 10         # Sequences per batch
num_steps = 50          # Number of sequence steps per batch
lstm_size = 128         # Size of hidden layers in LSTMs
num_layers = 2          # Number of LSTM layers
learning_rate = 0.01    # Learning rate
keep_prob = 0.5         # Dropout keep probability

## Training

In [17]:
epochs = 20
# Print losses every N iterations
print_every_n = 50

# Save every N iterations
save_every_n = 200

model = CharRNN(len(vocab), batch_size=batch_size, num_steps=num_steps,
                lstm_size=lstm_size, num_layers=num_layers, 
                learning_rate=learning_rate)

saver = tf.train.Saver(max_to_keep=100)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    # Use the line below to load a checkpoint and resume training
    #saver.restore(sess, 'checkpoints/______.ckpt')
    counter = 0
    for e in range(epochs):
        # Train network
        new_state = sess.run(model.initial_state)
        loss = 0
        for x, y in get_batches(encoded, batch_size, num_steps):
            counter += 1
            start = time.time()
            feed = {model.inputs: x,
                    model.targets: y,
                    model.keep_prob: keep_prob,
                    model.initial_state: new_state}
            batch_loss, new_state, _ = sess.run([model.loss, 
                                                 model.final_state, 
                                                 model.optimizer], 
                                                 feed_dict=feed)
            if (counter % print_every_n == 0):
                end = time.time()
                print('Epoch: {}/{}... '.format(e+1, epochs),
                      'Training Step: {}... '.format(counter),
                      'Training loss: {:.4f}... '.format(batch_loss),
                      '{:.4f} sec/batch'.format((end-start)))
        
            if (counter % save_every_n == 0):
                saver.save(sess, "checkpoints/i{}_l{}.ckpt".format(counter, lstm_size))
    
    saver.save(sess, "checkpoints/i{}_l{}.ckpt".format(counter, lstm_size))

Epoch: 1/20...  Training Step: 50...  Training loss: 2.9993...  0.0554 sec/batch
Epoch: 1/20...  Training Step: 100...  Training loss: 2.7332...  0.0551 sec/batch
Epoch: 1/20...  Training Step: 150...  Training loss: 2.6128...  0.0558 sec/batch
Epoch: 1/20...  Training Step: 200...  Training loss: 2.4931...  0.0574 sec/batch
Epoch: 1/20...  Training Step: 250...  Training loss: 2.4146...  0.0544 sec/batch
Epoch: 1/20...  Training Step: 300...  Training loss: 2.3376...  0.0543 sec/batch
Epoch: 1/20...  Training Step: 350...  Training loss: 2.3415...  0.0576 sec/batch
Epoch: 1/20...  Training Step: 400...  Training loss: 2.3921...  0.0546 sec/batch
Epoch: 1/20...  Training Step: 450...  Training loss: 2.1877...  0.0532 sec/batch
Epoch: 1/20...  Training Step: 500...  Training loss: 2.1777...  0.0540 sec/batch
Epoch: 1/20...  Training Step: 550...  Training loss: 2.2191...  0.0597 sec/batch
Epoch: 1/20...  Training Step: 600...  Training loss: 2.1889...  0.0566 sec/batch
Epoch: 1/20...  T

Epoch: 3/20...  Training Step: 5000...  Training loss: 1.8790...  0.0544 sec/batch
Epoch: 3/20...  Training Step: 5050...  Training loss: 1.9629...  0.0573 sec/batch
Epoch: 3/20...  Training Step: 5100...  Training loss: 1.9858...  0.0539 sec/batch
Epoch: 3/20...  Training Step: 5150...  Training loss: 1.9960...  0.0554 sec/batch
Epoch: 3/20...  Training Step: 5200...  Training loss: 1.7993...  0.0535 sec/batch
Epoch: 3/20...  Training Step: 5250...  Training loss: 1.9475...  0.0571 sec/batch
Epoch: 3/20...  Training Step: 5300...  Training loss: 1.8610...  0.0544 sec/batch
Epoch: 3/20...  Training Step: 5350...  Training loss: 1.9023...  0.0538 sec/batch
Epoch: 4/20...  Training Step: 5400...  Training loss: 1.9034...  0.0580 sec/batch
Epoch: 4/20...  Training Step: 5450...  Training loss: 1.9979...  0.0580 sec/batch
Epoch: 4/20...  Training Step: 5500...  Training loss: 1.7879...  0.0532 sec/batch
Epoch: 4/20...  Training Step: 5550...  Training loss: 1.8614...  0.0568 sec/batch
Epoc

Epoch: 6/20...  Training Step: 9950...  Training loss: 1.8407...  0.0530 sec/batch
Epoch: 6/20...  Training Step: 10000...  Training loss: 1.8873...  0.0561 sec/batch
Epoch: 6/20...  Training Step: 10050...  Training loss: 1.7573...  0.0583 sec/batch
Epoch: 6/20...  Training Step: 10100...  Training loss: 1.7461...  0.0588 sec/batch
Epoch: 6/20...  Training Step: 10150...  Training loss: 1.8376...  0.0541 sec/batch
Epoch: 6/20...  Training Step: 10200...  Training loss: 2.0065...  0.0580 sec/batch
Epoch: 6/20...  Training Step: 10250...  Training loss: 1.8945...  0.0555 sec/batch
Epoch: 6/20...  Training Step: 10300...  Training loss: 1.8741...  0.0536 sec/batch
Epoch: 6/20...  Training Step: 10350...  Training loss: 1.7522...  0.0531 sec/batch
Epoch: 6/20...  Training Step: 10400...  Training loss: 1.8030...  0.0577 sec/batch
Epoch: 6/20...  Training Step: 10450...  Training loss: 1.9705...  0.0536 sec/batch
Epoch: 6/20...  Training Step: 10500...  Training loss: 1.7888...  0.0581 sec

Epoch: 9/20...  Training Step: 14850...  Training loss: 1.7998...  0.0582 sec/batch
Epoch: 9/20...  Training Step: 14900...  Training loss: 1.8372...  0.0534 sec/batch
Epoch: 9/20...  Training Step: 14950...  Training loss: 1.8067...  0.0540 sec/batch
Epoch: 9/20...  Training Step: 15000...  Training loss: 1.6907...  0.0578 sec/batch
Epoch: 9/20...  Training Step: 15050...  Training loss: 1.7446...  0.0563 sec/batch
Epoch: 9/20...  Training Step: 15100...  Training loss: 1.7953...  0.0575 sec/batch
Epoch: 9/20...  Training Step: 15150...  Training loss: 1.7273...  0.0593 sec/batch
Epoch: 9/20...  Training Step: 15200...  Training loss: 1.8483...  0.0539 sec/batch
Epoch: 9/20...  Training Step: 15250...  Training loss: 1.8002...  0.0529 sec/batch
Epoch: 9/20...  Training Step: 15300...  Training loss: 1.9102...  0.0542 sec/batch
Epoch: 9/20...  Training Step: 15350...  Training loss: 1.8625...  0.0536 sec/batch
Epoch: 9/20...  Training Step: 15400...  Training loss: 1.9272...  0.0530 se

Epoch: 11/20...  Training Step: 19700...  Training loss: 2.0123...  0.0577 sec/batch
Epoch: 12/20...  Training Step: 19750...  Training loss: 1.8515...  0.0570 sec/batch
Epoch: 12/20...  Training Step: 19800...  Training loss: 1.6793...  0.0533 sec/batch
Epoch: 12/20...  Training Step: 19850...  Training loss: 1.9866...  0.0554 sec/batch
Epoch: 12/20...  Training Step: 19900...  Training loss: 1.8985...  0.0583 sec/batch
Epoch: 12/20...  Training Step: 19950...  Training loss: 1.7675...  0.0582 sec/batch
Epoch: 12/20...  Training Step: 20000...  Training loss: 1.6944...  0.0531 sec/batch
Epoch: 12/20...  Training Step: 20050...  Training loss: 1.7886...  0.0527 sec/batch
Epoch: 12/20...  Training Step: 20100...  Training loss: 1.7885...  0.0536 sec/batch
Epoch: 12/20...  Training Step: 20150...  Training loss: 1.7040...  0.0538 sec/batch
Epoch: 12/20...  Training Step: 20200...  Training loss: 1.7522...  0.0576 sec/batch
Epoch: 12/20...  Training Step: 20250...  Training loss: 1.6167..

Epoch: 14/20...  Training Step: 24550...  Training loss: 1.7823...  0.0574 sec/batch
Epoch: 14/20...  Training Step: 24600...  Training loss: 1.8236...  0.0550 sec/batch
Epoch: 14/20...  Training Step: 24650...  Training loss: 1.6900...  0.0529 sec/batch
Epoch: 14/20...  Training Step: 24700...  Training loss: 1.6795...  0.0599 sec/batch
Epoch: 14/20...  Training Step: 24750...  Training loss: 1.7965...  0.0569 sec/batch
Epoch: 14/20...  Training Step: 24800...  Training loss: 1.7467...  0.0604 sec/batch
Epoch: 14/20...  Training Step: 24850...  Training loss: 1.6667...  0.0591 sec/batch
Epoch: 14/20...  Training Step: 24900...  Training loss: 1.6316...  0.0525 sec/batch
Epoch: 14/20...  Training Step: 24950...  Training loss: 1.7028...  0.0569 sec/batch
Epoch: 14/20...  Training Step: 25000...  Training loss: 1.7937...  0.0574 sec/batch
Epoch: 14/20...  Training Step: 25050...  Training loss: 1.6833...  0.0560 sec/batch
Epoch: 15/20...  Training Step: 25100...  Training loss: 1.8016..

Epoch: 17/20...  Training Step: 29400...  Training loss: 1.8221...  0.0546 sec/batch
Epoch: 17/20...  Training Step: 29450...  Training loss: 1.8874...  0.0574 sec/batch
Epoch: 17/20...  Training Step: 29500...  Training loss: 1.9169...  0.0557 sec/batch
Epoch: 17/20...  Training Step: 29550...  Training loss: 1.7483...  0.0545 sec/batch
Epoch: 17/20...  Training Step: 29600...  Training loss: 1.8436...  0.0534 sec/batch
Epoch: 17/20...  Training Step: 29650...  Training loss: 1.6681...  0.0559 sec/batch
Epoch: 17/20...  Training Step: 29700...  Training loss: 1.7720...  0.0548 sec/batch
Epoch: 17/20...  Training Step: 29750...  Training loss: 1.7068...  0.0574 sec/batch
Epoch: 17/20...  Training Step: 29800...  Training loss: 1.7992...  0.0591 sec/batch
Epoch: 17/20...  Training Step: 29850...  Training loss: 1.7315...  0.0586 sec/batch
Epoch: 17/20...  Training Step: 29900...  Training loss: 1.7415...  0.0536 sec/batch
Epoch: 17/20...  Training Step: 29950...  Training loss: 1.6943..

Epoch: 20/20...  Training Step: 34250...  Training loss: 1.7387...  0.0573 sec/batch
Epoch: 20/20...  Training Step: 34300...  Training loss: 1.8364...  0.0599 sec/batch
Epoch: 20/20...  Training Step: 34350...  Training loss: 1.6159...  0.0553 sec/batch
Epoch: 20/20...  Training Step: 34400...  Training loss: 1.6049...  0.0571 sec/batch
Epoch: 20/20...  Training Step: 34450...  Training loss: 1.6911...  0.0534 sec/batch
Epoch: 20/20...  Training Step: 34500...  Training loss: 1.6979...  0.0528 sec/batch
Epoch: 20/20...  Training Step: 34550...  Training loss: 1.7935...  0.0550 sec/batch
Epoch: 20/20...  Training Step: 34600...  Training loss: 1.6797...  0.0545 sec/batch
Epoch: 20/20...  Training Step: 34650...  Training loss: 1.6567...  0.0539 sec/batch
Epoch: 20/20...  Training Step: 34700...  Training loss: 1.7191...  0.0539 sec/batch
Epoch: 20/20...  Training Step: 34750...  Training loss: 1.7842...  0.0565 sec/batch
Epoch: 20/20...  Training Step: 34800...  Training loss: 1.7931..
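The printed losses are mean cross-entropy values in nats, so they can be turned into a perplexity (roughly, the effective number of characters the model is choosing between) by exponentiating. The sketch below does this for the first and last loss values visible in the log above and compares them with the loss of a uniform guess over the 124 character classes. These are per-batch training losses measured with dropout active, so they should be read as rough indicators only.

```python
import math

num_classes = 124                      # len(vocab) from the notebook
uniform_loss = math.log(num_classes)   # loss of a model that guesses uniformly

# First and last losses visible in the training log
logged_losses = {"step 50": 2.9993, "step 34800": 1.7931}
perplexities = {name: math.exp(loss) for name, loss in logged_losses.items()}

print('uniform baseline: {:.2f} nats'.format(uniform_loss))
for name, loss in logged_losses.items():
    print('{}: loss {:.4f} -> perplexity {:.1f}'.format(name, loss, perplexities[name]))
```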

#### Saved checkpoints

In [18]:
tf.train.get_checkpoint_state('checkpoints')

model_checkpoint_path: "checkpoints/i35820_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i16200_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i16400_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i16600_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i16800_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i17000_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i17200_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i17400_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i17600_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i17800_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i18000_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i18200_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i18400_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i18600_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i18800_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i19000_l128.ckpt"
all_model_checkpoint_paths: "checkpoints/i19200_l128.ckpt"
al
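The checkpoint names encode the training step, which allows a quick sanity check of the run. The sketch below uses only values that appear in the notebook (20 epochs, batches of 10 x 50 characters, a save every 200 steps, `max_to_keep=100` and the final checkpoint `i35820`) to work out the number of steps per epoch, the range the text length must fall in given that `get_batches` drops the incomplete last batch, and which checkpoints should still be on disk. The variable names are chosen for this example.

```python
epochs = 20
batch_size, num_steps = 10, 50
save_every_n, max_to_keep = 200, 100
final_step = 35820                    # from "checkpoints/i35820_l128.ckpt"

# Every epoch yields the same number of batches
steps_per_epoch = final_step // epochs
chars_per_batch = batch_size * num_steps

# get_batches keeps only full batches, so the text length lies in this range
min_chars = steps_per_epoch * chars_per_batch
max_chars = min_chars + chars_per_batch - 1

# Periodic saves plus the final save after the training loop
saved_steps = list(range(save_every_n, final_step + 1, save_every_n))
if saved_steps[-1] != final_step:
    saved_steps.append(final_step)
kept = saved_steps[-max_to_keep:]     # the Saver keeps only the most recent ones

print(steps_per_epoch, min_chars, max_chars)
print(kept[0], kept[-1], len(kept))
```

The oldest retained step computed this way agrees with the first entry of the checkpoint listing above, `i16200`.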

## Sampling

Now that the network is trained, we can use it to generate new text. We pass in a character and the network predicts the next one; that prediction is then fed back in to predict the character after it, and repeating this generates as much new text as we want. I also included some functionality to prime the network with some text by passing in a string and building up a state from that.

The network gives us a probability for every character in the vocabulary. To reduce noise and make things a little less random, I'm going to choose each new character only from the top N most likely characters.



In [19]:
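def pick_top_n(preds, vocab_size, top_n=5):
    # Drop the singleton batch dimension (this is a view, so preds is modified in place)
    p = np.squeeze(preds)
    # Zero out everything except the top_n most likely characters
    p[np.argsort(p)[:-top_n]] = 0
    # Renormalize so the remaining probabilities sum to 1
    p = p / np.sum(p)
    # Sample one character id from the truncated distribution
    c = np.random.choice(vocab_size, 1, p=p)[0]
    return c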
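A quick way to check the top-N behaviour is to sample many times from a small, hand-written distribution and look at which indices ever come out. The sketch below repeats `pick_top_n` so that it runs on its own; the seven-class `toy_preds` array is a made-up example.

```python
import numpy as np

def pick_top_n(preds, vocab_size, top_n=5):
    # Copy of the notebook's function so this example runs on its own
    p = np.squeeze(preds)
    p[np.argsort(p)[:-top_n]] = 0
    p = p / np.sum(p)
    c = np.random.choice(vocab_size, 1, p=p)[0]
    return c

# Toy prediction for a 7-character vocabulary, shaped (1, 7) like the network output
toy_preds = np.array([[0.02, 0.40, 0.03, 0.25, 0.05, 0.15, 0.10]])

np.random.seed(0)
# Pass a copy each time, because pick_top_n zeroes entries of its input in place
draws = [int(pick_top_n(toy_preds.copy(), toy_preds.shape[1], top_n=3))
         for _ in range(1000)]

print(sorted(set(draws)))
```

Only the three most probable indices (1, 3 and 5 here) can ever be returned, and among them the relative probabilities are preserved.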

In [20]:
def sample(checkpoint, n_samples, lstm_size, vocab_size, prime="The "):
    # Start the output with the priming text
    samples = [c for c in prime]
    # Rebuild the graph for one character at a time (batch_size = num_steps = 1)
    model = CharRNN(vocab_size, lstm_size=lstm_size, sampling=True)
    saver = tf.train.Saver()
    with tf.Session() as sess:
        saver.restore(sess, checkpoint)
        new_state = sess.run(model.initial_state)
        # Feed the priming text one character at a time to build up the state
        for c in prime:
            x = np.zeros((1, 1))
            x[0,0] = vocab_to_int[c]
            feed = {model.inputs: x,
                    model.keep_prob: 1.,
                    model.initial_state: new_state}
            preds, new_state = sess.run([model.prediction, model.final_state], 
                                         feed_dict=feed)

        # Sample the first new character from the prediction after the prime
        c = pick_top_n(preds, vocab_size)
        samples.append(int_to_vocab[c])

        # Feed each sampled character back in to predict the next one
        for i in range(n_samples):
            x[0,0] = c
            feed = {model.inputs: x,
                    model.keep_prob: 1.,
                    model.initial_state: new_state}
            preds, new_state = sess.run([model.prediction, model.final_state], 
                                         feed_dict=feed)

            c = pick_top_n(preds, vocab_size)
            samples.append(int_to_vocab[c])
        
    return ''.join(samples)

Here, pass in the path to a checkpoint and sample from the network.

In [21]:
tf.train.latest_checkpoint('checkpoints')

'checkpoints/i35820_l128.ckpt'

In [22]:
checkpoint = tf.train.latest_checkpoint('checkpoints')
samp = sample(checkpoint, 2000, lstm_size, len(vocab), prime="Far")
print(samp)

Fartonit. Do t  do t truuit. 
Ata ishte prpara porsa njerzniva nj 
gjithuk po 
prgjagja, n nj shpejste nuk e prgjigjur t 

ndindur. 
Sipis thoshte 
tepr t ndrsrete e pasini 
t tij, t shtrenat e 
pashtresin. Ajo, dentiti n nj kmeshn i 
porsuenin dy t thosht t 
mendohet t kujdohesht, tani ndshpar n kokn sikur 
nga 
ndjyeshi prej nga at. 
Ai e tha ato t trulle t ndanuar katrto me pas ti
t'D dhe t pare 
matusn pjes... 

Ñ An e 
t gjith pa dashur pas po kalonit dhe 
ndihmni me nj t, ku ngrihej 
e ns shkurtet. 

ÑPo t koh pas pak t thoshte njrn aty njortoseshesh 
i theti, npas. 
Ata pastaj i shkoa t madh depatis dhiu 
pas tyre, por m ardhur te ditat dhe 

kishte prndojt se m ndeitire 
pika 
me njn ndrtesit. 
ÑPo prperia e till tryeza e tyre diti tani meshurit 
nj mbando t tjern se n nj gjekta prpara m tepo nuk 
i prerjen qt, 
pos t  thoshte 
ndrgjetja dhe nj dshir. 
ÑPranira t ndojunja e prprtitjen.iNj me 
marreti