# RNN with Tensorboard

## TensorFlow Overview
<img src="../resources/tensorflow_overview.png" width="1000">
        [Image source](https://www.youtube.com/watch?v=-57Ne86Ia8w&list=PLlMkM4tgfjnLSOjrEJN31gZATbcj_MpUm&index=3)
1. Build a graph from TensorFlow operations and pre-built model components.
2. Declare the inputs as `placeholder`s and supply the data for each `placeholder` through `feed_dict`.
3. Train at every iteration and update the weights.
4. Use the validation data to check that training is progressing well.
5. Feed the test data to the trained model and inspect the results.

## Implementing a character-level RNN and learning TensorBoard

### <Learning objectives>
1. In this notebook we train an RNN on character-level input and inspect the results with TensorBoard.
2. The training data is the English text of The Sign of the Four from the Sherlock Holmes series.
3. We then use the trained model to generate new sentences.

This lesson is a TensorFlow tutorial based on Andrej Karpathy's [blog post](http://karpathy.github.io/2015/05/21/rnn-effectiveness/) and his [Torch implementation](https://github.com/karpathy/char-rnn). The figure below shows the typical structure of a character-level RNN.

<img src="../resources/charseq.jpeg" width="500" alt="charseq">

Load dependencies

In [3]:
# Python 2 / Python 3 compatibility
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from collections import namedtuple
from six.moves import urllib

import time
import re
import os
import numpy as np
import tensorflow as tf
import tensorflow.contrib.rnn as rnn

In [4]:
# Download the data.
url = 'http://cvlab.postech.ac.kr/~wgchang/data/others/'

def maybe_download(filename, expected_bytes):
    """Download a file if not present, and make sure it's the right size."""
    if not os.path.exists(filename):
        if not os.path.isdir(os.path.dirname(filename)):
            os.makedirs(os.path.dirname(filename))
        filename, _ = urllib.request.urlretrieve(url + os.path.basename(filename), filename)
    statinfo = os.stat(filename)
    if statinfo.st_size == expected_bytes:
        print('Found and verified', filename)
    else:
        print(statinfo.st_size)
        raise Exception(
            'Failed to verify ' + filename + '. Can you get to it with a browser?')
    return filename


In [5]:
filename = maybe_download('../data/sherlock.txt', 3377296)
# filename = maybe_download('../data/sherlock_short.txt', 609394)

Found and verified ../data/sherlock.txt


Tensorflow GPU settings

In [4]:
# Configuration that keeps TensorFlow from claiming all GPU memory up front
config = tf.ConfigProto()
config.gpu_options.allow_growth=True

First, load the text data and convert each character to an integer so that the model can train on it.

In [5]:
def remove_multiple_s(text):
    # Collapse runs of spaces into a single space.
    text = re.sub(r' +',r' ',text)
    # Replace runs of tabs with a single space.
    text = re.sub(r'\t+',r' ',text)
    # Replace runs of newlines with a single space.
    text = re.sub(r'\n+',r' ',text)
    # Remove every character other than letters, digits, . , ? ! ' " and space.
    text = re.sub(r'[^A-Za-z0-9.,?!\'" ]+',r'',text)
    return text
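
As a quick check of what this cleaning step does, the sketch below reapplies the same four substitutions to two made-up strings (the sample strings are assumptions, not taken from the dataset). It also suggests why the cleaned text can still contain double spaces: spaces are collapsed before newlines are turned into spaces, so a space followed by a newline ends up as two spaces.

```python
import re

def remove_multiple_s(text):
    # Same four substitutions as the cell above, in the same order.
    text = re.sub(r' +', r' ', text)
    text = re.sub(r'\t+', r' ', text)
    text = re.sub(r'\n+', r' ', text)
    text = re.sub(r'[^A-Za-z0-9.,?!\'" ]+', r'', text)
    return text

# Tabs, blank lines, repeated spaces and special characters in one string.
cleaned = remove_multiple_s("Hello,\t\tworld!\n\nIt's   #1 [ok]")
# A space directly before a newline survives as two spaces.
trailing = remove_multiple_s("the end. \nNext line")

print(repr(cleaned))
print(repr(trailing))
```

If single spacing matters for a given use, one option would be to run the space-collapsing substitution last instead of first.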

In [6]:
with open(filename, 'r') as f:
    text=f.read()
text=remove_multiple_s(text)
charset = set(text)
char_to_int = {c: i for i, c in enumerate(charset)}
int_to_char = dict(enumerate(charset))
chars = np.array([char_to_int[c] for c in text], dtype=np.int32)
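
One caveat, offered as a suggestion rather than as part of the original notebook: iterating over a `set` of strings can yield a different order in each Python process, so the integer assigned to a character may change between runs, and a checkpoint trained in one session could then be decoded with a mismatched table in another. A minimal sketch of a reproducible mapping, assuming it is acceptable to sort the characters (`sample_text` is a made-up stand-in for the cleaned text):

```python
import numpy as np

sample_text = "the sign of the four"      # made-up stand-in for the cleaned text
charset = sorted(set(sample_text))         # sorted, so the order is the same in every run
char_to_int = {c: i for i, c in enumerate(charset)}
int_to_char = dict(enumerate(charset))

# Encode to integers and decode back to check the round trip.
encoded = np.array([char_to_int[c] for c in sample_text], dtype=np.int32)
decoded = ''.join(int_to_char[int(i)] for i in encoded)

print(charset)
print(encoded[:8], decoded == sample_text)
```

With a sorted vocabulary the integer codes differ from the ones printed in this notebook, so the outputs shown here would not match exactly.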

Check the length of the text and confirm that it has been converted to integers.

In [7]:
len(text)

3140216

In [8]:
text[:100]

' CHAPTER I  Mr. Sherlock Holmes  In the year 1878 I took my degree of Doctor of Medicine of the  Uni'

In [9]:
chars[:100]

array([ 1, 18, 25, 17, 33, 37, 20, 35,  1, 24,  1,  1, 28, 61,  5,  1, 34,
       51, 46, 61, 55, 56, 44, 52,  1, 25, 56, 55, 54, 46, 60,  1,  1, 24,
       57,  1, 63, 51, 46,  1, 66, 46, 43, 61,  1,  6, 15, 12, 15,  1, 24,
        1, 63, 56, 56, 52,  1, 54, 66,  1, 47, 46, 48, 61, 46, 46,  1, 56,
       49,  1, 21, 56, 44, 63, 56, 61,  1, 56, 49,  1, 28, 46, 47, 50, 44,
       50, 57, 46,  1, 56, 49,  1, 63, 51, 46,  1,  1, 36, 57, 50], dtype=int32)

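Now let's split the data into training and validation sets and turn each into batches. There is no separate test set in this exercise.
We build input and target arrays from the text. The target is a character sequence of the same length as the input, shifted by one character.
To keep every batch full, the leftover characters at the end of the text are dropped.
`split_frac` is the fraction of batches used for training; by default 90% of the batches go to training and 10% to validation.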

<img src="../resources/dataset.jpeg" width="500" alt="split dataset">

The x matrix has shape (`batch size x sequence length`).
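
The shift-by-one targets and the reshaping into one row per batch element can be tried on a tiny array first. This is only an illustrative sketch: the array `np.arange(23)` and the sizes are made up so that the result is easy to read.

```python
import numpy as np

chars = np.arange(23)                      # made-up stand-in for the encoded text
batch_size, time_steps = 2, 5
slice_size = batch_size * time_steps       # characters consumed per batch
n_batches = len(chars) // slice_size       # 2 full batches; 3 leftover characters are dropped

x = chars[:n_batches * slice_size]         # 0 .. 19
y = chars[1:n_batches * slice_size + 1]    # 1 .. 20, i.e. x shifted by one character

# One row per batch element, each row a contiguous stretch of the text.
x = np.stack(np.split(x, batch_size))      # shape (2, 10)
y = np.stack(np.split(y, batch_size))

print(x)
print(y)
```

Each row of `y` holds the character that follows the corresponding entry of `x`, which is what the model is trained to predict.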

In [8]:
def split_data(chars, **params):
    batch_size = params['batch_size']
    time_steps = params['time_steps']
    split_frac = params.get('split_frac') or 0.9
    
    slice_size = batch_size * time_steps
    n_batches = int(len(chars) / slice_size)
    # Drop the last few characters to make only full batches
    x = chars[: n_batches*slice_size]
    y = chars[1: n_batches*slice_size + 1]
    
    # Split the data into batch_size slices, then stack them into a 2D matrix 
    x = np.stack(np.split(x, batch_size))
    y = np.stack(np.split(y, batch_size))
    
    # Now x and y are arrays with dimensions batch_size x n_batches*time_steps
    
    # Split into training and validation sets, keep the first split_frac batches for training
    split_idx = int(n_batches*split_frac)
    train_x, train_y= x[:, :split_idx*time_steps], y[:, :split_idx*time_steps]
    val_x, val_y = x[:, split_idx*time_steps:], y[:, split_idx*time_steps:]
    
    return train_x, train_y, val_x, val_y

In [9]:
train_x, train_y, val_x, val_y = split_data(chars, **{'batch_size':10, 'time_steps':200, 'split_frac':0.8})

Let's check that the data was split.

In [10]:
train_x.shape

(10, 251200)

In [11]:
val_x.shape

(10, 62800)

In [12]:
train_x[:,:10]

array([[ 1, 18, 25, 17, 33, 37, 20, 35,  1, 24],
       [51, 46, 61, 46,  4,  1, 63, 51, 43, 63],
       [39, 43, 55, 55, 46, 66,  4,  1, 43, 57],
       [60,  1, 56, 49,  1,  1, 55, 43, 66, 50],
       [44, 52, 46, 60, 63,  1, 47, 46, 59, 61],
       [56, 57, 55, 66,  1,  1, 64, 43, 50, 63],
       [57, 47,  1, 24,  1, 44, 56, 62, 55, 47],
       [ 4,  1, 60, 56, 54, 46, 63, 51, 50, 57],
       [63, 56, 57, 46,  5,  1, 17,  1, 60, 51],
       [ 1, 56, 57, 46,  1, 56, 44, 44, 43, 60]], dtype=int32)

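During training the batches must be fed in order, so let's write a function that yields one batch at a time. Each batch has shape (`batch size X time_steps`).
For example, if the model trains on sequences of 100 characters, then `time_steps = 100`. The next batch picks up from the character right after the ones just trained on.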

In [13]:
def get_batch(arrs, num_steps):
    batch_size, slice_size = arrs[0].shape
    n_batches = int(slice_size/num_steps)
    for b in range(n_batches):
        yield [x[:, b*num_steps: (b+1)*num_steps] for x in arrs]
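
To see what the generator yields, the sketch below runs the same function on a small made-up array (`toy_x` and `toy_y` are assumptions for illustration only). Every window keeps all rows and advances `num_steps` columns at a time.

```python
import numpy as np

def get_batch(arrs, num_steps):
    # Same logic as the cell above: slide a window of num_steps columns over every array.
    batch_size, slice_size = arrs[0].shape
    n_batches = int(slice_size / num_steps)
    for b in range(n_batches):
        yield [x[:, b*num_steps: (b+1)*num_steps] for x in arrs]

toy_x = np.arange(24).reshape(2, 12)   # 2 rows (batch size), 12 characters per row
toy_y = toy_x + 1                      # made-up targets: the next value in each position

windows = list(get_batch([toy_x, toy_y], num_steps=4))
for x, y in windows:
    print(x.tolist(), y.tolist())
```

Because consecutive windows are adjacent in the text, the final RNN state of one batch can be passed on as the initial state of the next.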

Now let's build the RNN with TensorFlow. For the TensorFlow functions used here, see the [TensorFlow RNN API](https://www.tensorflow.org/api_docs/python/tf/contrib/rnn).
##### Reference links
- [One-hot vector](https://www.tensorflow.org/api_docs/python/tf/one_hot): 
<img src='../resources/one_hot.png' width="700" alt="one hot encoding">
- [Dropout](https://www.youtube.com/watch?v=NhZVe50QwPM) ([reference paper](https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf)): Dropout randomly sets some nodes to 0 during training, so no gradient flows back through the zeroed nodes. This acts as a regularizer that helps prevent overfitting and makes training go more smoothly. **Advanced topic: studying [Batch Normalization](https://arxiv.org/abs/1502.03167) as well will help your understanding of overfitting.**
<img src='../resources/dropout.png' width="700" alt="dropout">
- [Optimizer](https://www.tensorflow.org/versions/r0.12/api_docs/python/train/optimizers)
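
The first two items in the list above can be illustrated without TensorFlow. The NumPy sketch below is an approximation for intuition only: `one_hot` and `dropout` are hypothetical helper names, and the rescaling by `1 / keep_prob` mirrors what `tf.nn.dropout` does so that the expected activation stays unchanged.

```python
import numpy as np

def one_hot(indices, num_classes):
    """NumPy analogue of tf.one_hot for integer arrays."""
    return np.eye(num_classes, dtype=np.float32)[indices]

def dropout(x, keep_prob, rng):
    """Inverted dropout: zero each unit with probability 1 - keep_prob, rescale the rest."""
    mask = rng.uniform(size=x.shape) < keep_prob
    return x * mask / keep_prob

rng = np.random.RandomState(0)

ids = np.array([[0, 2], [1, 3]])           # a (batch, time) array of character ids
encoded = one_hot(ids, num_classes=4)      # shape (2, 2, 4)

activations = np.ones((1000, 8))
dropped = dropout(activations, keep_prob=0.5, rng=rng)

print(encoded[0, 1])                       # the one-hot row for id 2
print(np.unique(dropped), dropped.mean())  # values are 0 or 1/keep_prob
```

At validation and sampling time the notebook feeds `keep_prob = 1`, which in this sketch would leave the activations untouched.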

## Recording graphs in TensorBoard

```python
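def define_your_model():
    # ... build the model here; `histogram` and `scalar` stand for tensors in your graph ...
    tf.summary.histogram('histogram', histogram)
    tf.summary.scalar('scalar', scalar)
    # Merge every summary op defined above into a single op.
    merged = tf.summary.merge_all()
    return merged

merged = define_your_model()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Create the writer once; passing sess.graph also records the graph itself.
    file_writer = tf.summary.FileWriter('logs/', sess.graph)
    # ... inside the training loop (`iteration_index` is the current step) ...
    summary_to_record = sess.run(merged)
    file_writer.add_summary(summary_to_record, iteration_index)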
```

In [14]:
def define_rnn_graph(num_classes, **params):
    # parameters
    lstm_size = params.get('lstm_size') or 128
    batch_size = params.get('batch_size') or 50
    time_steps = params.get('time_steps') or 50
    num_layers = params.get('num_layers') or 2
    optimizer_params = params.get('optimizer_params') or {'learning_rate': 1e-3}
    grad_clip = params.get('grad_clip') or 10
    sampling = params.get('sampling') or False
    
    if sampling == True:
        batch_size, time_steps = 1, 1

    tf.reset_default_graph()
    
    # Declare the placeholders.
    # Convert the inputs to one-hot vectors with tf.one_hot.
    with tf.name_scope('inputs'):
        inputs = tf.placeholder(tf.int32, [batch_size, time_steps], name='inputs')
        x_one_hot = tf.one_hot(inputs, num_classes, name='x_one_hot')
    # Handle the targets in the same way.
    with tf.name_scope('targets'):
        targets = tf.placeholder(tf.int32, [batch_size, time_steps], name='targets')
        y_one_hot = tf.one_hot(targets, num_classes, name='y_one_hot')

        # To compute the loss, use tf.reshape to flatten the one-hot targets into a 2-D matrix of shape [batch_size * time_steps, num_classes].
        y_reshaped = tf.reshape(y_one_hot, [-1, num_classes])
    
    # Placeholder holding the keep probability used for dropout
    keep_prob = tf.placeholder(tf.float32, name='keep_prob')
    
    # Build the LSTM, one type of RNN
    with tf.name_scope("RNN_layers"):
        lstm_layers = []
        for _ in range(num_layers):
            lstm = rnn.BasicLSTMCell(lstm_size)
            # Add dropout to the RNN with rnn.DropoutWrapper
            drop = rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)
            # Add the LSTM hidden layer to the stack (its weights are shared across time steps)
            lstm_layers.append(drop)
        cell = rnn.MultiRNNCell(lstm_layers)

    # Initial (zero) state for the RNN that tf.nn.dynamic_rnn runs below
    with tf.name_scope("RNN_init_state"):
        initial_state = cell.zero_state(batch_size, tf.float32)
    
    # forward propagation
    with tf.name_scope("RNN_forward"):
        outputs, state = tf.nn.dynamic_rnn(cell, x_one_hot, initial_state=initial_state)
    
    final_state = state

    # Concatenate the outputs, then reshape them.
    with tf.name_scope('reshaper'):
        seq_output = tf.concat(outputs, axis=1,name='seq_output')
        output = tf.reshape(seq_output, [-1, lstm_size], name='graph_output')
    
    # Build a softmax layer that takes the RNN output as input, to compute the cost.
    with tf.name_scope('logits'):
        softmax_w = tf.Variable(tf.truncated_normal((lstm_size, num_classes), stddev=0.1),
                               name='softmax_w')
        softmax_b = tf.Variable(tf.zeros(num_classes), name='softmax_b')
        logits = tf.matmul(output, softmax_w) + softmax_b
        # Record the weights and bias as histograms
        tf.summary.histogram('softmax_w', softmax_w)
        tf.summary.histogram('softmax_b', softmax_b)
        
    with tf.name_scope('predictions'):
        preds = tf.nn.softmax(logits, name='predictions')
        # Record the predicted probabilities as a histogram
        tf.summary.histogram('predictions', preds)
        
    with tf.name_scope('cost'):
        loss = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_reshaped, name='loss')
        cost = tf.reduce_mean(loss, name='cost')
        # Record the cost as a scalar value
        tf.summary.scalar('cost', cost)
    
    # Define the optimizer used for training.
    # Common optimizers include SGD (stochastic gradient descent), Adam, and RMSprop.
    # Gradient clipping rescales the gradients whenever their global norm exceeds grad_clip.
    with tf.name_scope('train'):
        tvars = tf.trainable_variables()
        grads, _ = tf.clip_by_global_norm(tf.gradients(cost, tvars), grad_clip)
        train_op = tf.train.AdamOptimizer(**optimizer_params)
        optimizer = train_op.apply_gradients(zip(grads, tvars))
    
    # Merge the summaries.
    merged = tf.summary.merge_all()
    
    # Collect the nodes declared above into a Graph namedtuple and return it.
    export_nodes = ['inputs', 'targets', 'initial_state', 'final_state',
                    'keep_prob', 'cost', 'preds', 'optimizer','merged']
    Graph = namedtuple('Graph', export_nodes)
    local_dict = locals()
    graph = Graph(*[local_dict[each] for each in export_nodes])
    
    return graph
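
The `train` scope above relies on `tf.clip_by_global_norm`. As a rough NumPy sketch of that rule (a re-implementation for intuition with made-up gradient values, not TensorFlow's own code): every gradient is multiplied by `clip_norm / max(global_norm, clip_norm)`, so the gradients are scaled down together only when their joint norm exceeds the limit.

```python
import numpy as np

def clip_by_global_norm(grads, clip_norm):
    # Norm of all gradients taken together, as if concatenated into one vector.
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    # Scale is 1 when global_norm <= clip_norm, otherwise clip_norm / global_norm.
    scale = clip_norm / max(global_norm, clip_norm)
    return [g * scale for g in grads], global_norm

# Made-up gradients with global norm sqrt(9 + 16 + 144) = 13.
grads = [np.array([3.0, 4.0]), np.array([[12.0]])]

clipped, norm = clip_by_global_norm(grads, clip_norm=10.0)
clipped_norm = np.sqrt(sum(np.sum(g ** 2) for g in clipped))
untouched, _ = clip_by_global_norm(grads, clip_norm=20.0)

print(norm, clipped_norm)
```

Scaling all gradients by the same factor preserves the direction of the update, which is the usual argument for clipping by global norm rather than clipping each value separately.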

## Hyperparameters

Now we choose the hyperparameters for the function defined above. 
In general, a larger network (more hidden units, more layers) performs better, but watch closely for overfitting (high variance). If the network is too small, it may underfit (high bias) instead.

In [15]:
params = {
    'lstm_size' : 1024,
    'batch_size': 100,
    'time_steps': 100,    
    'num_layers' : 2,
    'optimizer_params': {'learning_rate': 1e-3}}
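
To get a feel for how large this configuration is, the sketch below counts trainable parameters. It rests on two assumptions: a standard LSTM cell without peepholes (four gates, each with a weight matrix over the concatenated input and hidden state plus a bias), and a vocabulary of 69 characters, which is the number the cleaning regex allows if every allowed character actually occurs in the text.

```python
def lstm_layer_params(input_size, hidden_size):
    # 4 gates, each with weights over [input, hidden] and one bias per hidden unit.
    return 4 * ((input_size + hidden_size) * hidden_size + hidden_size)

num_classes = 69      # assumption: 62 alphanumerics + 7 punctuation/space characters
lstm_size = 1024
num_layers = 2

total = 0
input_size = num_classes            # the first layer sees one-hot characters
for _ in range(num_layers):
    total += lstm_layer_params(input_size, lstm_size)
    input_size = lstm_size          # later layers see the previous layer's output
total += lstm_size * num_classes + num_classes   # softmax_w and softmax_b

print(total)
```

Under these assumptions the model has roughly 13 million parameters for about 3 million characters of text, which is one reason dropout and the validation loss deserve attention here.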

## Training

Create the directory where checkpoints will be saved.

In [16]:
if not os.path.isdir('checkpoints/sherlock'):
    os.makedirs('checkpoints/sherlock')

In [17]:
epochs = 20
checkpoint_interval = 50
checkpoint = None
# To resume from an existing checkpoint, set checkpoint to its path instead of None.
# checkpoint = 'checkpoints/sherlock/i6250_l1024_1.073'

In [18]:
train_x, train_y, val_x, val_y = split_data(chars, **params)
model = define_rnn_graph(len(charset), **params)
saver = tf.train.Saver(max_to_keep=200)
epoch_start = 0
with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    # Create the FileWriters that write the TensorBoard logs.
    train_writer = tf.summary.FileWriter('./logs/train', sess.graph)
    test_writer = tf.summary.FileWriter('./logs/test')
       
    n_batches = int(train_x.shape[1]/params['time_steps'])
    iterations = n_batches * epochs
    # Restore an existing checkpoint and resume training from it.
    if checkpoint:
        try:
            saver.restore(sess, checkpoint)
            iteration=int(re.search(r'\bi([\d]+)_[\w.]+\b',checkpoint).group(1))
            epoch_start = int(iteration/n_batches)
        except Exception:
            print('Cannot read the checkpoint. Set it None.')
            epoch_start = 0
            checkpoint = None
            
    for e in range(epoch_start, epochs):
        # Train the network
        new_state = sess.run(model.initial_state)
        loss = 0
        for i, (x, y) in enumerate(get_batch([train_x, train_y], params['time_steps']), 1):
            iteration = e*n_batches + i
            # Record the training time
            start = time.time()
            feed = {model.inputs: x,
                    model.targets: y,
                    model.keep_prob: 0.5,
                    model.initial_state: new_state }
            summary, batch_loss, new_state, _ = sess.run([model.merged, model.cost, \
                                                          model.final_state, model.optimizer], feed_dict=feed)
            loss += batch_loss
            end = time.time()
            print('Epoch {}/{} '.format(e+1, epochs),
                  'Iteration {}/{}'.format(iteration, iterations),
                  'Training loss: {:.4f}'.format(loss/i),
                  '{:.4f} sec/batch'.format((end-start)))
            # Add the summary
            train_writer.add_summary(summary, iteration)
            
            if (iteration%checkpoint_interval == 0) or (iteration == iterations):
                # Check the validation loss. keep_prob is set to 1 so that every node is active.
                # A separate val_state keeps the training state (new_state) intact.
                val_loss = []
                val_state = sess.run(model.initial_state)
                for x, y in get_batch([val_x, val_y], params['time_steps']):
                    feed = {model.inputs: x,
                            model.targets: y,
                            model.keep_prob: 1.,
                            model.initial_state: val_state}
                    summary, batch_loss, val_state = sess.run([model.merged, model.cost, \
                                                               model.final_state], feed_dict=feed)
                    val_loss.append(batch_loss)
                # Add the summary (taken from the last validation batch)
                test_writer.add_summary(summary, iteration)

                print('Validation loss:', np.mean(val_loss),
                      'Saving checkpoint!')
                saver.save(sess, "checkpoints/sherlock/i{}_l{}_{:.3f}".format(iteration, params['lstm_size'], np.mean(val_loss)))

Epoch 1/20  Iteration 1/5640 Training loss: 4.2347 1.4247 sec/batch
Epoch 1/20  Iteration 2/5640 Training loss: 4.1081 0.9780 sec/batch
Epoch 1/20  Iteration 3/5640 Training loss: 6.5366 0.9816 sec/batch
Epoch 1/20  Iteration 4/5640 Training loss: 6.3720 0.9994 sec/batch
Epoch 1/20  Iteration 5/5640 Training loss: 5.8561 0.9682 sec/batch
Epoch 1/20  Iteration 6/5640 Training loss: 5.5569 0.9761 sec/batch
Epoch 1/20  Iteration 7/5640 Training loss: 5.3370 0.9598 sec/batch
Epoch 1/20  Iteration 8/5640 Training loss: 5.1355 0.9804 sec/batch
Epoch 1/20  Iteration 9/5640 Training loss: 4.9863 0.9615 sec/batch
Epoch 1/20  Iteration 10/5640 Training loss: 4.8483 0.9755 sec/batch
Epoch 1/20  Iteration 11/5640 Training loss: 4.7166 1.0040 sec/batch
Epoch 1/20  Iteration 12/5640 Training loss: 4.6026 0.9690 sec/batch
Epoch 1/20  Iteration 13/5640 Training loss: 4.5022 0.9554 sec/batch
Epoch 1/20  Iteration 14/5640 Training loss: 4.4120 1.0061 sec/batch
Epoch 1/20  Iteration 15/5640 Training loss
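
The first logged loss, 4.2347, is close to what a model that assigns equal probability to every character would score. Assuming 69 classes (the number of characters the cleaning regex allows, if all of them occur), that baseline is ln(69), about 4.234. The NumPy sketch below checks this with an illustrative softmax cross-entropy; the labels are arbitrary made-up values.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    # Numerically stable log-softmax, then pick the log-probability of the true class.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

num_classes = 69                          # assumption, see the note above
logits = np.zeros((5, num_classes))       # all-zero logits give a uniform prediction
labels = np.array([0, 10, 20, 30, 68])    # arbitrary target characters

loss = softmax_cross_entropy(logits, labels).mean()
print(loss, np.log(num_classes))
```

A starting loss near this value is a common sanity check that the loss is wired up correctly before training begins.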

In [19]:
tf.train.get_checkpoint_state('checkpoints/sherlock')

model_checkpoint_path: "checkpoints/sherlock/i5640_l1024_1.083"
all_model_checkpoint_paths: "checkpoints/sherlock/i50_l1024_2.980"
all_model_checkpoint_paths: "checkpoints/sherlock/i100_l1024_2.813"
all_model_checkpoint_paths: "checkpoints/sherlock/i150_l1024_2.459"
all_model_checkpoint_paths: "checkpoints/sherlock/i200_l1024_2.264"
all_model_checkpoint_paths: "checkpoints/sherlock/i250_l1024_2.136"
all_model_checkpoint_paths: "checkpoints/sherlock/i300_l1024_2.017"
all_model_checkpoint_paths: "checkpoints/sherlock/i350_l1024_1.932"
all_model_checkpoint_paths: "checkpoints/sherlock/i400_l1024_1.869"
all_model_checkpoint_paths: "checkpoints/sherlock/i450_l1024_1.803"
all_model_checkpoint_paths: "checkpoints/sherlock/i500_l1024_1.749"
all_model_checkpoint_paths: "checkpoints/sherlock/i550_l1024_1.708"
all_model_checkpoint_paths: "checkpoints/sherlock/i600_l1024_1.656"
all_model_checkpoint_paths: "checkpoints/sherlock/i650_l1024_1.619"
all_model_checkpoint_paths: "checkpoints/sherlock/i70

## Sampling

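Now let's generate text with the trained model. The model builds a sentence by repeatedly predicting the next character given the previous one. For each previous character it outputs a probability for every possible next character. We draw the new character by random sampling according to those probabilities, append it, and then predict the following character from the new character and the previous state. Repeating this process produces a sentence.
Let's write code that picks from the `N` most probable characters.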

In [20]:
def pick_top_n(preds, charset_size, top_n=5):
    p = np.squeeze(preds)
    p[np.argsort(p)[:-top_n]] = 0
    p = p / np.sum(p)
    c = np.random.choice(charset_size, 1, p=p)[0]
    return c
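
A small made-up distribution shows the effect of the top-`N` filter. The variant below is a sketch that differs from the cell above in two labeled ways: it copies the probabilities so the caller's array is not modified, and it takes an optional `rng` argument (an assumption added here) so the draws can be seeded.

```python
import numpy as np

def pick_top_n(preds, charset_size, top_n=5, rng=np.random):
    p = np.squeeze(preds).copy()          # copy so the caller's array is left untouched
    p[np.argsort(p)[:-top_n]] = 0         # zero everything except the top_n entries
    p = p / np.sum(p)                     # renormalize so the rest sums to 1
    return rng.choice(charset_size, 1, p=p)[0]

# Made-up prediction over 6 characters; the three most likely are ids 1, 3 and 4.
probs = np.array([[0.05, 0.40, 0.10, 0.25, 0.15, 0.05]])
rng = np.random.RandomState(0)

draws = [pick_top_n(probs, 6, top_n=3, rng=rng) for _ in range(200)]
print(sorted(set(int(d) for d in draws)))
```

Restricting the draw to the most likely characters trades some variety for fewer implausible characters in the generated text.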

In [21]:
def sample(checkpoint, n_samples, lstm_size, charset_size, prime="The "):
    samples = list(prime)
    model = define_rnn_graph(charset_size, **{'lstm_size':lstm_size, 'sampling':True})
    saver = tf.train.Saver()
    with tf.Session(config=config) as sess:
        saver.restore(sess, checkpoint)
        new_state = sess.run(model.initial_state)
        for c in prime:
            x = np.zeros((1, 1))
            x[0,0] = char_to_int[c]
            feed = {model.inputs: x,
                    model.keep_prob: 1.,
                    model.initial_state: new_state}
            preds, new_state = sess.run([model.preds, model.final_state], 
                                         feed_dict=feed)

        c = pick_top_n(preds, charset_size)
        samples.append(int_to_char[c])

        for i in range(n_samples):
            x[0,0] = c
            feed = {model.inputs: x,
                    model.keep_prob: 1.,
                    model.initial_state: new_state}
            preds, new_state = sess.run([model.preds, model.final_state], 
                                         feed_dict=feed)

            c = pick_top_n(preds, charset_size)
            samples.append(int_to_char[c])
        
    return ''.join(samples)
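
The control flow of `sample` (warm up on the prime text, then repeatedly sample a character and feed it back in together with the state) can be seen in isolation by swapping the LSTM for a toy function. Everything in this sketch is hypothetical: `toy_step` is an arbitrary stand-in that returns a probability vector and a new state, and the vocabulary is made up. It also assumes a non-empty prime.

```python
import numpy as np

def toy_step(char_id, state, vocab_size):
    """Hypothetical stand-in for one LSTM step: returns (probs, new_state)."""
    new_state = (state + char_id + 1) % vocab_size
    logits = -np.abs(np.arange(vocab_size) - new_state).astype(float)
    probs = np.exp(logits) / np.exp(logits).sum()
    return probs, new_state

def generate(prime, n_samples, char_to_int, int_to_char, rng):
    vocab_size = len(char_to_int)
    state, probs = 0, None
    samples = list(prime)
    # 1) Warm up: feed the prime text one character at a time, carrying the state.
    for ch in prime:
        probs, state = toy_step(char_to_int[ch], state, vocab_size)
    # 2) Generate: sample a character, append it, and feed it back in.
    for _ in range(n_samples):
        c = int(rng.choice(vocab_size, p=probs))
        samples.append(int_to_char[c])
        probs, state = toy_step(c, state, vocab_size)
    return ''.join(samples)

vocab = sorted(set("the quick brown fox"))
char_to_int = {c: i for i, c in enumerate(vocab)}
int_to_char = dict(enumerate(vocab))

text = generate("the ", 20, char_to_int, int_to_char, np.random.RandomState(0))
print(text)
```

The output of the toy model is meaningless; the point is only that the state returned by one step is the state passed into the next, as `new_state` is in the function above.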

Let's generate text with several models, including the one with the lowest validation loss.

In [23]:
all_checkpoints=re.findall(r'\b([\w/]+_([\d.]+))\b',str(tf.train.get_checkpoint_state('checkpoints/sherlock')),re.IGNORECASE)
all_checkpoints_sorted_by_valloss = sorted(all_checkpoints, key=lambda tup: float(tup[1]))

In [24]:
all_checkpoints_sorted_by_valloss[:10]

[('checkpoints/sherlock/i4850_l1024_1.081', '1.081'),
 ('checkpoints/sherlock/i5150_l1024_1.082', '1.082'),
 ('checkpoints/sherlock/i5640_l1024_1.083', '1.083'),
 ('checkpoints/sherlock/i5100_l1024_1.083', '1.083'),
 ('checkpoints/sherlock/i5250_l1024_1.083', '1.083'),
 ('checkpoints/sherlock/i5640_l1024_1.083', '1.083'),
 ('checkpoints/sherlock/i4300_l1024_1.084', '1.084'),
 ('checkpoints/sherlock/i4650_l1024_1.084', '1.084'),
 ('checkpoints/sherlock/i4750_l1024_1.084', '1.084'),
 ('checkpoints/sherlock/i4950_l1024_1.084', '1.084')]
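
The list above contains `i5640_l1024_1.083` twice because the latest checkpoint appears both as `model_checkpoint_path` and in `all_model_checkpoint_paths`. The sketch below applies the notebook's two regular expressions to a short hand-written listing (an excerpt assembled from the output shown earlier, not a real `CheckpointState`) and removes duplicates with `set`, which is a suggested addition rather than part of the original code.

```python
import re

# Hand-written excerpt in the same text format as the checkpoint state shown above.
listing = '\n'.join([
    'model_checkpoint_path: "checkpoints/sherlock/i5640_l1024_1.083"',
    'all_model_checkpoint_paths: "checkpoints/sherlock/i50_l1024_2.980"',
    'all_model_checkpoint_paths: "checkpoints/sherlock/i4850_l1024_1.081"',
    'all_model_checkpoint_paths: "checkpoints/sherlock/i5640_l1024_1.083"',
])

# Each match is (full path, validation loss as text).
found = re.findall(r'\b([\w/]+_([\d.]+))\b', listing)
# Drop duplicates, then order by validation loss, lowest first.
unique = sorted(set(found), key=lambda tup: float(tup[1]))

best_path = unique[0][0]
# The iteration number is encoded after the leading "i" of the file name.
best_iteration = int(re.search(r'\bi([\d]+)_[\w.]+\b', best_path).group(1))

print(best_path, best_iteration)
```

Encoding the iteration and validation loss in the file name is what makes this kind of lookup possible without reopening the checkpoints.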

In [25]:
n_samples = 2000

Prediction from the 10th checkpoint

In [26]:
checkpoint = all_checkpoints[10][0]
samp = sample(checkpoint, n_samples, params['lstm_size'], len(charset), prime="The ")
print('<<{}>>\n'.format(checkpoint)+samp)
print('='*100)

<<checkpoints/sherlock/i500_l1024_1.749>>
The was a sarn to simper and  hers, that I was not the  some of at the mastle an thenes with and an  the contired. He was there was it help buchere to stald and andent out thome of the wilfing a mould of mad it who his  well, and to me with the  to bake whe had sint our himseres of her. I was  therl was in the dient of all this  seal the  state.  "I, were that that have nisheld with the wanded  to seve the mere to tho gotes alle the  and thes,  was a sait the mently of insomand, and that to ser were  ars as atteller tomnint  our the digtt a stold whor was nagethall and to sellition  then all the casersed thene. This was whon  sumn and selfit through. The willon the cearing.  "An a meate and allowed the colmers  fal the sand of at that was  the deall see of the sing thremes our thas a mostle were had house beand his asent, but it is an that wis  a teen a for our waster and him here. Ho mes wish a  colfare this aly she cherers," said he."  "We ar

Prediction from the checkpoint with the lowest validation loss

In [27]:
for checkpoint, _ in all_checkpoints_sorted_by_valloss[:2]:
    samp = sample(checkpoint, n_samples, params['lstm_size'], len(charset), prime="The ")
    print('<<{}>>\n'.format(checkpoint)+samp)
    print('='*100)

<<checkpoints/sherlock/i4850_l1024_1.081>>
The  Buskervilles which I heard in her case we had some seat by meening  which had been drawn and showing that it was the man who is a  dead, therefore, to fancy as a feeling at his police, and he  shone into a chair, but he was always supposed to follow into the  servants? It is a steel one, and a clatter on treasure and too  beautifully a still one. There was no one that he had an account of  making the mother, but his heart had been a palefanthed face which  showed that he had suddenly said, and a sudden and a masterful  mother, a comfortable place.  "I am not all a strange able to see you."  "We have to be seen in my mouth, and I am sure of that importance and  the police that there is a missionable son, to my companion's  sister."  "I'd tell you what we have, Mr. Holmes, and I shall see that you  have already expected the singular praction of the child  at Carrith."  "You must have said anything of the lady. I tried to decide. If he  woul

In [30]:
for checkpoint, _ in all_checkpoints_sorted_by_valloss[:1]:
    samp = sample(checkpoint, n_samples, params['lstm_size'], len(charset), prime="Mr.")
    print('<<{}>>\n'.format(checkpoint)+samp)
    print('='*100)

<<checkpoints/sherlock/i4850_l1024_1.081>>
Mr. Holmes,  that when I had some points of that way has been taking to the  boxtomy steps, he said that he is a professional buind. This man  has been correctlooking from her since I has shown myself to  bushed the little characters. It was a small scent for an amateur  of the same answer. Then all these three men was a man who will find  the situation of which he could not beat.  "What do you think of it?" he cried to the leaster peace of  reputation. "I will remember your surprise and the situation to  watch him."  I walked out at the door of the station and was some which seemed to  hold the conversation at the station at the other end.  "What will you do, Mr. McCarthy?" he asked.  "It will be a large and pretty practice that you have a lodger, sie  and here we are now to be defined to take a power."  "I've hold it in to determine an abstraction. You see. There is  no man to contain it. We must see what you have seen since I have  heard of

Prediction from the last checkpoint

In [28]:
checkpoint = tf.train.latest_checkpoint('checkpoints/sherlock')
samp = sample(checkpoint, 1000, params['lstm_size'], len(charset), prime="The ")
print(samp)

The  Amprican, a delicate police that he would not take a sergeant of his  figure. There is no other possibility that they were suspicions often, and  was attached to the community of his face that he sprang up and his  back without a figure as we walked over the door and the drop and  loose at the salling place.  "This is surely the present," He said the landlord insade a strong brain  beside the window.  "What do you think of the matter?"  "Well, I seem to be a very common possession which has been  convinced that the man was in a parageness who had the most sense  by walking out into the study it was of the death of a difficult  interest. Then, that is the maid and that we should have a stopped  afternoon to see him, for else would be off the death of myself, and,  by heavens, takes it all right to hin heaves. I should not do it as  all as a suspicion, and I am not to be told him what was the story. That  has been making a family pass on this table. I take my word that  I could hard

## Tensorboard
Point TensorBoard at the log directory and run it.
```bash
$ tensorboard --logdir='logs/'
```