# TV Script Generation
In this project, you'll generate your own [Simpsons](https://en.wikipedia.org/wiki/The_Simpsons) TV scripts using RNNs.  You'll be using part of the [Simpsons dataset](https://www.kaggle.com/wcukierski/the-simpsons-by-the-data) of scripts from 27 seasons.  The Neural Network you'll build will generate a new TV script for a scene at [Moe's Tavern](https://simpsonswiki.com/wiki/Moe's_Tavern).
## Get the Data
The data is already provided for you.  You'll be using a subset of the original dataset.  It consists of only the scenes in Moe's Tavern.  This doesn't include other versions of the tavern, like "Moe's Cavern", "Flaming Moe's", "Uncle Moe's Family Feed-Bag", etc.

In [4]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import helper

data_dir = './data/simpsons/moes_tavern_lines.txt'
text = helper.load_data(data_dir)
# Ignore notice, since we don't use it for analysing the data
text = text[81:]

## Explore the Data
Play around with `view_sentence_range` to view different parts of the data.

In [5]:
view_sentence_range = (0, 10)

"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import numpy as np

print('Dataset Stats')
print('Roughly the number of unique words: {}'.format(len({word: None for word in text.split()})))
scenes = text.split('\n\n')
print('Number of scenes: {}'.format(len(scenes)))
sentence_count_scene = [scene.count('\n') for scene in scenes]
print('Average number of sentences in each scene: {}'.format(np.average(sentence_count_scene)))

sentences = [sentence for scene in scenes for sentence in scene.split('\n')]
print('Number of lines: {}'.format(len(sentences)))
word_count_sentence = [len(sentence.split()) for sentence in sentences]
print('Average number of words in each line: {}'.format(np.average(word_count_sentence)))

print()
print('The sentences {} to {}:'.format(*view_sentence_range))
print('\n'.join(text.split('\n')[view_sentence_range[0]:view_sentence_range[1]]))

Dataset Stats
Roughly the number of unique words: 11492
Number of scenes: 262
Average number of sentences in each scene: 15.248091603053435
Number of lines: 4257
Average number of words in each line: 11.50434578341555

The sentences 0 to 10:
Moe_Szyslak: (INTO PHONE) Moe's Tavern. Where the elite meet to drink.
Bart_Simpson: Eh, yeah, hello, is Mike there? Last name, Rotch.
Moe_Szyslak: (INTO PHONE) Hold on, I'll check. (TO BARFLIES) Mike Rotch. Mike Rotch. Hey, has anybody seen Mike Rotch, lately?
Moe_Szyslak: (INTO PHONE) Listen you little puke. One of these days I'm gonna catch you, and I'm gonna carve my name on your back with an ice pick.
Moe_Szyslak: What's the matter Homer? You're not your normal effervescent self.
Homer_Simpson: I got my problems, Moe. Give me another one.
Moe_Szyslak: Homer, hey, you should not drink to forget your problems.
Barney_Gumble: Yeah, you should only drink to enhance your social skills.




## Implement Preprocessing Functions
The first thing to do to any dataset is preprocessing.  Implement the following preprocessing functions below:
- Lookup Table
- Tokenize Punctuation

### Lookup Table
To create a word embedding, you first need to transform the words to ids.  In this function, create two dictionaries:
- A dictionary that maps each word to an id, which we'll call `vocab_to_int`
- A dictionary that maps each id back to its word, which we'll call `int_to_vocab`

Return these dictionaries in the following tuple `(vocab_to_int, int_to_vocab)`

In [6]:
import numpy as np
import problem_unittests as tests

def create_lookup_tables(text):
    """
    Create lookup tables for vocabulary
    :param text: The text of tv scripts split into words
    :return: A tuple of dicts (vocab_to_int, int_to_vocab)
    """
    vocab = sorted(set(text))
    vocab_to_int = {c: i for i, c in enumerate(vocab)}
    int_to_vocab = dict(enumerate(vocab))
    return vocab_to_int, int_to_vocab


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_create_lookup_tables(create_lookup_tables)

Tests Passed
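
As a quick sanity check, the sketch below runs the same lookup-table logic on a tiny word list and confirms that the two dictionaries invert each other. The sample words and the `sample_` variable names are made up for illustration; they are not taken from the dataset.

```python
def create_lookup_tables(words):
    """Map each unique word to an integer id and back (same logic as the cell above)."""
    vocab = sorted(set(words))
    vocab_to_int = {word: idx for idx, word in enumerate(vocab)}
    int_to_vocab = dict(enumerate(vocab))
    return vocab_to_int, int_to_vocab


# Hypothetical sample input, already split into words
sample_words = ['moe', 'homer', 'beer', 'homer', 'moe']
sample_vocab_to_int, sample_int_to_vocab = create_lookup_tables(sample_words)

# Encode the words as ids, then decode the ids back into words
encoded = [sample_vocab_to_int[word] for word in sample_words]
decoded = [sample_int_to_vocab[idx] for idx in encoded]

print(encoded)
print(decoded == sample_words)
```

Because the vocabulary is sorted before ids are assigned, the mapping is the same on every run for the same input.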


### Tokenize Punctuation
We'll be splitting the script into a word array using spaces as delimiters.  However, punctuation marks like periods and exclamation marks make it hard for the neural network to distinguish between the words "bye" and "bye!".

Implement the function `token_lookup` to return a dict that will be used to tokenize symbols like "!" into "||Exclamation_Mark||".  Create a dictionary for the following symbols where the symbol is the key and value is the token:
- Period ( . )
- Comma ( , )
- Quotation Mark ( " )
- Semicolon ( ; )
- Exclamation mark ( ! )
- Question mark ( ? )
- Left Parenthesis ( ( )
- Right Parenthesis ( ) )
- Dash ( -- )
- Return ( \n )

This dictionary will be used to tokenize the symbols and add a delimiter (space) around each one.  This separates each symbol into its own word, making it easier for the neural network to predict the next word. Make sure you don't use a token that could be confused with a word. Instead of using the token "dash", try using something like "||dash||".

In [7]:
def token_lookup():
    """
    Generate a dict to turn punctuation into a token.
    :return: Tokenize dictionary where the key is the punctuation and the value is the token
    """
    return { 
        '.' : '<PERIOD>', ',' : '<COMMA>', '"':'<QUOTATION_MARK>', ';' : '<SEMICOLON>',
        '!' : '<EXCLAMATION_MARK>', '?' : '<QUESTION_MARK>', '(' : '<LEFT_PARENTHESES>',
        ')' : '<RIGHT_PARENTHESES>', '--' : '<DASH>', '\n' : '<RETURN>'
    }

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_tokenize(token_lookup)

Tests Passed
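
To see what the dictionary is for, here is a minimal sketch of how it could be applied to raw text. The `tokenize` function is a hypothetical helper written for this example; the internals of `helper.preprocess_and_save_data` are not shown in this notebook, so treat the replace-then-split approach as an assumption.

```python
def token_lookup():
    """Same symbol-to-token mapping as in the cell above."""
    return {
        '.': '<PERIOD>', ',': '<COMMA>', '"': '<QUOTATION_MARK>', ';': '<SEMICOLON>',
        '!': '<EXCLAMATION_MARK>', '?': '<QUESTION_MARK>', '(': '<LEFT_PARENTHESES>',
        ')': '<RIGHT_PARENTHESES>', '--': '<DASH>', '\n': '<RETURN>'
    }


def tokenize(text, token_dict):
    """Hypothetical helper: surround each symbol's token with spaces, then split on whitespace."""
    for symbol, token in token_dict.items():
        text = text.replace(symbol, ' {} '.format(token))
    return text.split()


print(tokenize('bye! bye.', token_lookup()))
```

After this step "bye" and "bye!" share the word "bye", and the punctuation becomes a separate word the network can predict.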


## Preprocess all the data and save it
Running the code cell below will preprocess all the data and save it to file.

In [8]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
# Preprocess Training, Validation, and Testing Data
helper.preprocess_and_save_data(data_dir, token_lookup, create_lookup_tables)

# Checkpoint
This is your first checkpoint. If you ever decide to come back to this notebook or have to restart the notebook, you can start from here. The preprocessed data has been saved to disk.

In [1]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import helper
import numpy as np
import problem_unittests as tests

int_text, vocab_to_int, int_to_vocab, token_dict = helper.load_preprocess()

## Build the Neural Network
You'll build the components of an RNN by implementing the following functions:
- get_inputs
- get_init_cell
- get_embed
- build_rnn
- build_nn
- get_batches

### Check the Version of TensorFlow and Access to GPU

In [2]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
from distutils.version import LooseVersion
import warnings
import tensorflow as tf

# Check TensorFlow Version
assert LooseVersion(tf.__version__) >= LooseVersion('1.0'), 'Please use TensorFlow version 1.0 or newer'
print('TensorFlow Version: {}'.format(tf.__version__))

# Check for a GPU
if not tf.test.gpu_device_name():
    warnings.warn('No GPU found. Please use a GPU to train your neural network.')
else:
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))

TensorFlow Version: 1.3.0
Default GPU Device: /gpu:0


### Input
Implement the `get_inputs()` function to create TF Placeholders for the Neural Network.  It should create the following placeholders:
- Input text placeholder named "input" using the [TF Placeholder](https://www.tensorflow.org/api_docs/python/tf/placeholder) `name` parameter.
- Targets placeholder
- Learning Rate placeholder

Return the placeholders in the following tuple `(Input, Targets, LearningRate)`

In [3]:
def get_inputs():
    """
    Create TF Placeholders for input, targets, and learning rate.
    :return: Tuple (input, targets, learning rate)
    """
    # Inputs and targets hold word ids with shape [batch size, sequence length]
    inputs = tf.placeholder(tf.int32, [None, None], name='input')
    targets = tf.placeholder(tf.int32, [None, None], name='targets')
    learning_rate = tf.placeholder(tf.float32, name='learning_rate')
    return inputs, targets, learning_rate


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_get_inputs(get_inputs)

Tests Passed


### Build RNN Cell and Initialize
Stack one or more [`BasicLSTMCells`](https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/BasicLSTMCell) in a [`MultiRNNCell`](https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/MultiRNNCell).
- The RNN size should be set using `rnn_size`
- Initialize Cell State using the MultiRNNCell's [`zero_state()`](https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/MultiRNNCell#zero_state) function
    - Apply the name "initial_state" to the initial state using [`tf.identity()`](https://www.tensorflow.org/api_docs/python/tf/identity)

Return the cell and initial state in the following tuple `(Cell, InitialState)`

In [4]:
def get_init_cell(batch_size, rnn_size):
    """
    Create an RNN Cell and initialize it.
    :param batch_size: Size of batches
    :param rnn_size: Size of RNNs
    :return: Tuple (cell, initial state)
    """
    # Helper that builds a single LSTM layer
    def build_cell(lstm_size):
        # Use a basic LSTM cell
        lstm = tf.contrib.rnn.BasicLSTMCell(lstm_size)
        
        # Add dropout to the cell
        # drop = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)
        return lstm
    
    
    # Stack several LSTM layers; the layer count is hard-coded rather than a parameter
    num_layers = 5
    cell = tf.contrib.rnn.MultiRNNCell([build_cell(rnn_size) for _ in range(num_layers)])
    initial_state = cell.zero_state(batch_size, tf.float32)
    initial_state = tf.identity(initial_state, name='initial_state')
    
    return cell, initial_state


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_get_init_cell(get_init_cell)

Tests Passed


### Word Embedding
Apply embedding to `input_data` using TensorFlow.  Return the embedded sequence.

In [5]:
def get_embed(input_data, vocab_size, embed_dim):
    """
    Create embedding for <input_data>.
    :param input_data: TF placeholder for text input.
    :param vocab_size: Number of words in vocabulary.
    :param embed_dim: Number of embedding dimensions
    :return: Embedded input.
    """
    embedding = tf.Variable(tf.random_uniform((vocab_size, embed_dim), -1, 1))
    embed = tf.nn.embedding_lookup(embedding, input_data)
    return embed


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_get_embed(get_embed)

Tests Passed
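
Conceptually, an embedding lookup is row indexing: each word id selects one row of the embedding table. The sketch below shows this in plain Python rather than TensorFlow; the sizes and the `embedding_lookup` helper are assumptions chosen for illustration.

```python
import random


def embedding_lookup(embedding, input_ids):
    """Replace every word id with its row of the embedding table."""
    return [[embedding[word_id] for word_id in sequence] for sequence in input_ids]


vocab_size, embed_dim = 5, 3
rng = random.Random(0)
# Table with vocab_size rows and embed_dim columns, uniform values between -1 and 1
embedding = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in range(vocab_size)]

# Batch of 2 sequences with 4 word ids each
input_ids = [[0, 1, 2, 3], [4, 3, 2, 1]]
embedded = embedding_lookup(embedding, input_ids)

# Resulting shape: (batch size, sequence length, embed_dim)
print(len(embedded), len(embedded[0]), len(embedded[0][0]))
```

The output gains one trailing dimension of size `embed_dim`, which is why the RNN receives a 3-D tensor even though the placeholder is 2-D.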


### Build RNN
You created an RNN cell in the `get_init_cell()` function.  Time to use the cell to create an RNN.
- Build the RNN using [`tf.nn.dynamic_rnn()`](https://www.tensorflow.org/api_docs/python/tf/nn/dynamic_rnn)
    - Apply the name "final_state" to the final state using [`tf.identity()`](https://www.tensorflow.org/api_docs/python/tf/identity)

Return the outputs and final state in the following tuple `(Outputs, FinalState)`

In [6]:
def build_rnn(cell, inputs):
    """
    Create an RNN using an RNN cell
    :param cell: RNN Cell
    :param inputs: Input text data
    :return: Tuple (Outputs, Final State)
    """
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
    final_state = tf.identity(final_state, name='final_state')
    return outputs, final_state


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_build_rnn(build_rnn)

Tests Passed


### Build the Neural Network
Use the functions you implemented above to:
- Apply embedding to `input_data` using your `get_embed(input_data, vocab_size, embed_dim)` function.
- Build RNN using `cell` and your `build_rnn(cell, inputs)` function.
- Apply a fully connected layer with a linear activation and `vocab_size` as the number of outputs.

Return the logits and final state in the following tuple `(Logits, FinalState)`

In [7]:
def build_nn(cell, rnn_size, input_data, vocab_size, embed_dim):
    """
    Build part of the neural network
    :param cell: RNN cell
    :param rnn_size: Size of rnns
    :param input_data: Input data
    :param vocab_size: Vocabulary size
    :param embed_dim: Number of embedding dimensions
    :return: Tuple (Logits, FinalState)
    """
    inputs = get_embed(input_data, vocab_size, embed_dim)
    outputs, final_state = build_rnn(cell, inputs)
    
    logits = tf.contrib.layers.fully_connected(outputs, vocab_size, activation_fn=None)
    
    return logits, final_state


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_build_nn(build_nn)

AssertionError: Final state wrong shape.  Found shape (2, 2, 128, 256)
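
The shape in the message above can be read dimension by dimension. The sketch below is plain Python, not TensorFlow. It assumes that a stacked LSTM state packed into one tensor by `tf.identity` has one entry per layer, each holding a cell state `c` and a hidden state `h` of shape `(batch size, rnn size)`. The values 2, 128 and 256 are inferred from the reported shape, not read from the test source.

```python
def stacked_lstm_state_shape(num_layers, batch_size, rnn_size):
    """Shape of a stacked LSTM state once it is packed into a single tensor."""
    states_per_layer = 2  # cell state c and hidden state h
    return (num_layers, states_per_layer, batch_size, rnn_size)


print(stacked_lstm_state_shape(num_layers=2, batch_size=128, rnn_size=256))
```

One possible explanation for the failure is that the test expects an unknown batch dimension (`None`) in the third position while this TensorFlow version reports a concrete one. Whether that applies here depends on the version of `problem_unittests` in use.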

### Batches
Implement `get_batches` to create batches of input and targets using `int_text`.  The batches should be a NumPy array with the shape `(number of batches, 2, batch size, sequence length)`. Each batch contains two elements:
- The first element is a single batch of **input** with the shape `[batch size, sequence length]`
- The second element is a single batch of **targets** with the shape `[batch size, sequence length]`

If you can't fill the last batch with enough data, drop the last batch.

For example, `get_batches([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], 3, 2)` would return the following NumPy array:
```
[
  # First Batch
  [
    # Batch of Input
    [[ 1  2], [ 7  8], [13 14]]
    # Batch of targets
    [[ 2  3], [ 8  9], [14 15]]
  ]

  # Second Batch
  [
    # Batch of Input
    [[ 3  4], [ 9 10], [15 16]]
    # Batch of targets
    [[ 4  5], [10 11], [16 17]]
  ]

  # Third Batch
  [
    # Batch of Input
    [[ 5  6], [11 12], [17 18]]
    # Batch of targets
    [[ 6  7], [12 13], [18  1]]
  ]
]
```

Notice that the last target value in the last batch is the first input value of the first batch. In this case, `1`. This is a common technique used when creating sequence batches, although it is rather unintuitive.

In [11]:
def get_batches(int_text, batch_size, seq_length):
    """
    Return batches of input and target
    :param int_text: Text with the words replaced by their ids
    :param batch_size: The size of batch
    :param seq_length: The length of sequence
    :return: Batches as a Numpy array
    """
    # Solution from the Slack Team
    # This was really confusing me
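    # Number of complete batches that fit in the data
    n_batches = len(int_text) // (batch_size * seq_length)

    # Drop the trailing word ids that would not fill a complete batch
    xdata = np.array(int_text[: n_batches * batch_size * seq_length])
    # Targets are the inputs shifted by one position; the last target wraps around to the first input
    ydata = np.roll(xdata, -1)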
    
    x_batches = np.split(xdata.reshape(batch_size, -1), n_batches, 1)
    y_batches = np.split(ydata.reshape(batch_size, -1), n_batches, 1)

    return np.array(list(zip(x_batches, y_batches)))

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_get_batches(get_batches)

Tests Passed
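
To check the batching scheme against the worked example in this section, here is a NumPy-free sketch that follows the same steps (truncate, shift by one with wrap-around, then cut each row into `seq_length` windows). The function name `get_batches_plain` is hypothetical, and it returns nested lists rather than a NumPy array.

```python
def get_batches_plain(int_text, batch_size, seq_length):
    """NumPy-free sketch of the batching scheme; returns nested lists."""
    n_batches = len(int_text) // (batch_size * seq_length)
    # Keep only enough ids to fill complete batches
    xdata = list(int_text[:n_batches * batch_size * seq_length])
    # Targets are the inputs shifted left by one, wrapping around at the end
    ydata = xdata[1:] + xdata[:1]

    row_length = n_batches * seq_length  # ids covered by one row across all batches
    batches = []
    for batch_i in range(n_batches):
        offset = batch_i * seq_length
        inputs, targets = [], []
        for row in range(batch_size):
            start = row * row_length + offset
            inputs.append(xdata[start:start + seq_length])
            targets.append(ydata[start:start + seq_length])
        batches.append([inputs, targets])
    return batches


example_batches = get_batches_plain(list(range(1, 21)), batch_size=3, seq_length=2)
for inputs, targets in example_batches:
    print(inputs, targets)
```

Each row of a batch continues where the same row of the previous batch ended, which is what allows the training loop to carry the RNN state from one batch to the next.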


## Neural Network Training
### Hyperparameters
Tune the following parameters:

- Set `num_epochs` to the number of epochs.
- Set `batch_size` to the batch size.
- Set `rnn_size` to the size of the RNNs.
- Set `embed_dim` to the size of the embedding.
- Set `seq_length` to the length of sequence.
- Set `learning_rate` to the learning rate.
- Set `show_every_n_batches` to the number of batches between progress printouts.
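
The training cell later in this notebook prints a line whenever the global batch counter `epoch_i * len(batches) + batch_i` is a multiple of `show_every_n_batches`. The sketch below reproduces that condition in isolation. The helper name `logged_steps` is hypothetical, and the value of six batches per epoch is read off this notebook's training log.

```python
def logged_steps(num_epochs, n_batches, show_every_n_batches):
    """Return the (epoch, batch) pairs at which progress would be printed."""
    steps = []
    for epoch_i in range(num_epochs):
        for batch_i in range(n_batches):
            if (epoch_i * n_batches + batch_i) % show_every_n_batches == 0:
                steps.append((epoch_i, batch_i))
    return steps


# Six batches per epoch and show_every_n_batches = 5
print(logged_steps(num_epochs=6, n_batches=6, show_every_n_batches=5))
```

Because 5 does not divide 6, the printed batch index drifts from epoch to epoch, which explains the staggered pattern in the log.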

In [12]:
# Number of Epochs
num_epochs = 2000
# Batch Size
batch_size = 512
# RNN Size
rnn_size = 512
# Embedding Dimension Size
embed_dim = 32
# Sequence Length
seq_length = 20
# Learning Rate
learning_rate = 0.01
# Show stats for every n number of batches
show_every_n_batches = 5

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
save_dir = './save'

### Build the Graph
Build the graph using the neural network you implemented.

In [13]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
from tensorflow.contrib import seq2seq

train_graph = tf.Graph()
with train_graph.as_default():
    vocab_size = len(int_to_vocab)
    input_text, targets, lr = get_inputs()
    input_data_shape = tf.shape(input_text)
    cell, initial_state = get_init_cell(input_data_shape[0], rnn_size)
    logits, final_state = build_nn(cell, rnn_size, input_text, vocab_size, embed_dim)

    # Probabilities for generating words
    probs = tf.nn.softmax(logits, name='probs')

    # Loss function
    cost = seq2seq.sequence_loss(
        logits,
        targets,
        tf.ones([input_data_shape[0], input_data_shape[1]]))

    # Optimizer
    optimizer = tf.train.AdamOptimizer(lr)

    # Gradient Clipping
    gradients = optimizer.compute_gradients(cost)
    capped_gradients = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gradients if grad is not None]
    train_op = optimizer.apply_gradients(capped_gradients)
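
The gradient clipping step in the cell above clamps every gradient component to the range [-1, 1] before the update is applied. A plain-Python sketch of that element-wise clamp, using made-up gradient values and a hypothetical stand-in for `tf.clip_by_value`:

```python
def clip_by_value(values, low, high):
    """Clamp every element of a flat list to the range [low, high]."""
    return [min(max(value, low), high) for value in values]


# Hypothetical gradient values for one variable
gradient = [-3.5, -0.2, 0.0, 0.7, 12.0]
print(clip_by_value(gradient, -1.0, 1.0))
```

Clipping by value bounds each component independently, so it can change the direction of the gradient. Clipping by global norm (`tf.clip_by_global_norm`) is a common alternative that rescales all gradients together instead.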

## Train
Train the neural network on the preprocessed data.  If you have a hard time getting a good loss, check the [forums](https://discussions.udacity.com/) to see if anyone is having the same problem.

In [14]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
batches = get_batches(int_text, batch_size, seq_length)

with tf.Session(graph=train_graph) as sess:
    sess.run(tf.global_variables_initializer())

    for epoch_i in range(num_epochs):
        state = sess.run(initial_state, {input_text: batches[0][0]})

        for batch_i, (x, y) in enumerate(batches):
            feed = {
                input_text: x,
                targets: y,
                initial_state: state,
                lr: learning_rate}
            train_loss, state, _ = sess.run([cost, final_state, train_op], feed)

            # Show every <show_every_n_batches> batches
            if (epoch_i * len(batches) + batch_i) % show_every_n_batches == 0:
                print('Epoch {:>3} Batch {:>4}/{}   train_loss = {:.3f}'.format(
                    epoch_i,
                    batch_i,
                    len(batches),
                    train_loss))

    # Save Model
    saver = tf.train.Saver()
    saver.save(sess, save_dir)
    print('Model Trained and Saved')

Epoch   0 Batch    0/6   train_loss = 8.822
Epoch   0 Batch    5/6   train_loss = 6.713
Epoch   1 Batch    4/6   train_loss = 6.143
Epoch   2 Batch    3/6   train_loss = 6.175
Epoch   3 Batch    2/6   train_loss = 6.009
Epoch   4 Batch    1/6   train_loss = 6.085
Epoch   5 Batch    0/6   train_loss = 5.998
Epoch   5 Batch    5/6   train_loss = 5.929
Epoch   6 Batch    4/6   train_loss = 5.974
Epoch   7 Batch    3/6   train_loss = 5.944
Epoch   8 Batch    2/6   train_loss = 5.964
Epoch   9 Batch    1/6   train_loss = 5.975
Epoch  10 Batch    0/6   train_loss = 5.930
Epoch  10 Batch    5/6   train_loss = 5.905
Epoch  11 Batch    4/6   train_loss = 5.932
Epoch  12 Batch    3/6   train_loss = 5.950
Epoch  13 Batch    2/6   train_loss = 5.940
Epoch  14 Batch    1/6   train_loss = 5.976
Epoch  15 Batch    0/6   train_loss = 5.929
Epoch  15 Batch    5/6   train_loss = 5.901
Epoch  16 Batch    4/6   train_loss = 5.930
Epoch  17 Batch    3/6   train_loss = 5.939
Epoch  18 Batch    2/6   train_l

Epoch 155 Batch    5/6   train_loss = 5.793
Epoch 156 Batch    4/6   train_loss = 5.807
Epoch 157 Batch    3/6   train_loss = 5.780
Epoch 158 Batch    2/6   train_loss = 5.768
Epoch 159 Batch    1/6   train_loss = 5.818
Epoch 160 Batch    0/6   train_loss = 5.774
Epoch 160 Batch    5/6   train_loss = 5.751
Epoch 161 Batch    4/6   train_loss = 5.905
Epoch 162 Batch    3/6   train_loss = 5.794
Epoch 163 Batch    2/6   train_loss = 5.786
Epoch 164 Batch    1/6   train_loss = 5.792
Epoch 165 Batch    0/6   train_loss = 5.709
Epoch 165 Batch    5/6   train_loss = 5.637
Epoch 166 Batch    4/6   train_loss = 5.642
Epoch 167 Batch    3/6   train_loss = 5.647
Epoch 168 Batch    2/6   train_loss = 5.626
Epoch 169 Batch    1/6   train_loss = 5.647
Epoch 170 Batch    0/6   train_loss = 5.576
Epoch 170 Batch    5/6   train_loss = 5.540
Epoch 171 Batch    4/6   train_loss = 5.554
Epoch 172 Batch    3/6   train_loss = 5.546
Epoch 173 Batch    2/6   train_loss = 5.526
Epoch 174 Batch    1/6   train_l

Epoch 311 Batch    4/6   train_loss = 4.121
Epoch 312 Batch    3/6   train_loss = 4.095
Epoch 313 Batch    2/6   train_loss = 4.083
Epoch 314 Batch    1/6   train_loss = 4.117
Epoch 315 Batch    0/6   train_loss = 4.097
Epoch 315 Batch    5/6   train_loss = 4.041
Epoch 316 Batch    4/6   train_loss = 4.083
Epoch 317 Batch    3/6   train_loss = 4.050
Epoch 318 Batch    2/6   train_loss = 4.041
Epoch 319 Batch    1/6   train_loss = 4.068
Epoch 320 Batch    0/6   train_loss = 4.067
Epoch 320 Batch    5/6   train_loss = 4.006
Epoch 321 Batch    4/6   train_loss = 4.064
Epoch 322 Batch    3/6   train_loss = 4.007
Epoch 323 Batch    2/6   train_loss = 3.996
Epoch 324 Batch    1/6   train_loss = 4.035
Epoch 325 Batch    0/6   train_loss = 4.006
Epoch 325 Batch    5/6   train_loss = 3.950
Epoch 326 Batch    4/6   train_loss = 3.980
Epoch 327 Batch    3/6   train_loss = 3.981
Epoch 328 Batch    2/6   train_loss = 3.949
Epoch 329 Batch    1/6   train_loss = 3.993
Epoch 330 Batch    0/6   train_l

Epoch 467 Batch    3/6   train_loss = 3.126
Epoch 468 Batch    2/6   train_loss = 3.112
Epoch 469 Batch    1/6   train_loss = 3.167
Epoch 470 Batch    0/6   train_loss = 3.181
Epoch 470 Batch    5/6   train_loss = 3.100
Epoch 471 Batch    4/6   train_loss = 3.156
Epoch 472 Batch    3/6   train_loss = 3.089
Epoch 473 Batch    2/6   train_loss = 3.095
Epoch 474 Batch    1/6   train_loss = 3.130
Epoch 475 Batch    0/6   train_loss = 3.140
Epoch 475 Batch    5/6   train_loss = 3.051
Epoch 476 Batch    4/6   train_loss = 3.141
Epoch 477 Batch    3/6   train_loss = 3.052
Epoch 478 Batch    2/6   train_loss = 3.098
Epoch 479 Batch    1/6   train_loss = 3.126
Epoch 480 Batch    0/6   train_loss = 3.193
Epoch 480 Batch    5/6   train_loss = 3.101
Epoch 481 Batch    4/6   train_loss = 3.164
Epoch 482 Batch    3/6   train_loss = 3.089
Epoch 483 Batch    2/6   train_loss = 3.092
Epoch 484 Batch    1/6   train_loss = 3.157
Epoch 485 Batch    0/6   train_loss = 3.147
Epoch 485 Batch    5/6   train_l

Epoch 623 Batch    2/6   train_loss = 2.625
Epoch 624 Batch    1/6   train_loss = 2.663
Epoch 625 Batch    0/6   train_loss = 2.697
Epoch 625 Batch    5/6   train_loss = 2.610
Epoch 626 Batch    4/6   train_loss = 2.661
Epoch 627 Batch    3/6   train_loss = 2.607
Epoch 628 Batch    2/6   train_loss = 2.588
Epoch 629 Batch    1/6   train_loss = 2.665
Epoch 630 Batch    0/6   train_loss = 2.655
Epoch 630 Batch    5/6   train_loss = 2.621
Epoch 631 Batch    4/6   train_loss = 2.666
Epoch 632 Batch    3/6   train_loss = 2.619
Epoch 633 Batch    2/6   train_loss = 2.561
Epoch 634 Batch    1/6   train_loss = 2.687
Epoch 635 Batch    0/6   train_loss = 2.634
Epoch 635 Batch    5/6   train_loss = 2.600
Epoch 636 Batch    4/6   train_loss = 2.616
Epoch 637 Batch    3/6   train_loss = 2.582
Epoch 638 Batch    2/6   train_loss = 2.554
Epoch 639 Batch    1/6   train_loss = 2.614
Epoch 640 Batch    0/6   train_loss = 2.658
Epoch 640 Batch    5/6   train_loss = 2.564
Epoch 641 Batch    4/6   train_l

Epoch 779 Batch    1/6   train_loss = 2.432
Epoch 780 Batch    0/6   train_loss = 2.435
Epoch 780 Batch    5/6   train_loss = 2.396
Epoch 781 Batch    4/6   train_loss = 2.377
Epoch 782 Batch    3/6   train_loss = 2.368
Epoch 783 Batch    2/6   train_loss = 2.302
Epoch 784 Batch    1/6   train_loss = 2.443
Epoch 785 Batch    0/6   train_loss = 2.412
Epoch 785 Batch    5/6   train_loss = 2.351
Epoch 786 Batch    4/6   train_loss = 2.407
Epoch 787 Batch    3/6   train_loss = 2.290
Epoch 788 Batch    2/6   train_loss = 2.328
Epoch 789 Batch    1/6   train_loss = 2.422
Epoch 790 Batch    0/6   train_loss = 2.365
Epoch 790 Batch    5/6   train_loss = 2.335
Epoch 791 Batch    4/6   train_loss = 2.354
Epoch 792 Batch    3/6   train_loss = 2.291
Epoch 793 Batch    2/6   train_loss = 2.259
Epoch 794 Batch    1/6   train_loss = 2.344
Epoch 795 Batch    0/6   train_loss = 2.297
Epoch 795 Batch    5/6   train_loss = 2.271
Epoch 796 Batch    4/6   train_loss = 2.296
Epoch 797 Batch    3/6   train_l

Epoch 935 Batch    0/6   train_loss = 1.950
Epoch 935 Batch    5/6   train_loss = 1.942
Epoch 936 Batch    4/6   train_loss = 1.927
Epoch 937 Batch    3/6   train_loss = 1.941
Epoch 938 Batch    2/6   train_loss = 1.898
Epoch 939 Batch    1/6   train_loss = 2.006
Epoch 940 Batch    0/6   train_loss = 2.016
Epoch 940 Batch    5/6   train_loss = 1.948
Epoch 941 Batch    4/6   train_loss = 1.983
Epoch 942 Batch    3/6   train_loss = 1.911
Epoch 943 Batch    2/6   train_loss = 1.962
Epoch 944 Batch    1/6   train_loss = 1.976
Epoch 945 Batch    0/6   train_loss = 1.987
Epoch 945 Batch    5/6   train_loss = 1.969
Epoch 946 Batch    4/6   train_loss = 1.947
Epoch 947 Batch    3/6   train_loss = 1.950
Epoch 948 Batch    2/6   train_loss = 1.916
Epoch 949 Batch    1/6   train_loss = 1.992
Epoch 950 Batch    0/6   train_loss = 1.992
Epoch 950 Batch    5/6   train_loss = 1.937
Epoch 951 Batch    4/6   train_loss = 1.951
Epoch 952 Batch    3/6   train_loss = 1.936
Epoch 953 Batch    2/6   train_l

Epoch 1088 Batch    2/6   train_loss = 1.789
Epoch 1089 Batch    1/6   train_loss = 1.866
Epoch 1090 Batch    0/6   train_loss = 1.876
Epoch 1090 Batch    5/6   train_loss = 1.825
Epoch 1091 Batch    4/6   train_loss = 1.836
Epoch 1092 Batch    3/6   train_loss = 1.816
Epoch 1093 Batch    2/6   train_loss = 1.798
Epoch 1094 Batch    1/6   train_loss = 1.886
Epoch 1095 Batch    0/6   train_loss = 1.868
Epoch 1095 Batch    5/6   train_loss = 1.811
Epoch 1096 Batch    4/6   train_loss = 1.835
Epoch 1097 Batch    3/6   train_loss = 1.790
Epoch 1098 Batch    2/6   train_loss = 1.794
Epoch 1099 Batch    1/6   train_loss = 1.862
Epoch 1100 Batch    0/6   train_loss = 1.847
Epoch 1100 Batch    5/6   train_loss = 1.789
Epoch 1101 Batch    4/6   train_loss = 1.796
Epoch 1102 Batch    3/6   train_loss = 1.757
Epoch 1103 Batch    2/6   train_loss = 1.785
Epoch 1104 Batch    1/6   train_loss = 1.815
Epoch 1105 Batch    0/6   train_loss = 1.858
Epoch 1105 Batch    5/6   train_loss = 1.773
Epoch 1106

Epoch 1240 Batch    5/6   train_loss = 1.825
Epoch 1241 Batch    4/6   train_loss = 1.840
Epoch 1242 Batch    3/6   train_loss = 1.808
Epoch 1243 Batch    2/6   train_loss = 1.803
Epoch 1244 Batch    1/6   train_loss = 1.857
Epoch 1245 Batch    0/6   train_loss = 1.842
Epoch 1245 Batch    5/6   train_loss = 1.772
Epoch 1246 Batch    4/6   train_loss = 1.805
Epoch 1247 Batch    3/6   train_loss = 1.800
Epoch 1248 Batch    2/6   train_loss = 1.787
Epoch 1249 Batch    1/6   train_loss = 1.867
Epoch 1250 Batch    0/6   train_loss = 1.841
Epoch 1250 Batch    5/6   train_loss = 1.810
Epoch 1251 Batch    4/6   train_loss = 1.785
Epoch 1252 Batch    3/6   train_loss = 1.772
Epoch 1253 Batch    2/6   train_loss = 1.764
Epoch 1254 Batch    1/6   train_loss = 1.817
Epoch 1255 Batch    0/6   train_loss = 1.820
Epoch 1255 Batch    5/6   train_loss = 1.783
Epoch 1256 Batch    4/6   train_loss = 1.772
Epoch 1257 Batch    3/6   train_loss = 1.779
Epoch 1258 Batch    2/6   train_loss = 1.733
Epoch 1259

Epoch 1393 Batch    2/6   train_loss = 1.529
Epoch 1394 Batch    1/6   train_loss = 1.614
Epoch 1395 Batch    0/6   train_loss = 1.636
Epoch 1395 Batch    5/6   train_loss = 1.517
Epoch 1396 Batch    4/6   train_loss = 1.558
Epoch 1397 Batch    3/6   train_loss = 1.521
Epoch 1398 Batch    2/6   train_loss = 1.497
Epoch 1399 Batch    1/6   train_loss = 1.608
Epoch 1400 Batch    0/6   train_loss = 1.603
Epoch 1400 Batch    5/6   train_loss = 1.542
Epoch 1401 Batch    4/6   train_loss = 1.551
Epoch 1402 Batch    3/6   train_loss = 1.539
Epoch 1403 Batch    2/6   train_loss = 1.539
Epoch 1404 Batch    1/6   train_loss = 1.586
Epoch 1405 Batch    0/6   train_loss = 1.582
Epoch 1405 Batch    5/6   train_loss = 1.523
Epoch 1406 Batch    4/6   train_loss = 1.519
Epoch 1407 Batch    3/6   train_loss = 1.530
Epoch 1408 Batch    2/6   train_loss = 1.506
Epoch 1409 Batch    1/6   train_loss = 1.586
Epoch 1410 Batch    0/6   train_loss = 1.572
Epoch 1410 Batch    5/6   train_loss = 1.506
Epoch 1411

Epoch 1545 Batch    5/6   train_loss = 1.527
Epoch 1546 Batch    4/6   train_loss = 1.532
Epoch 1547 Batch    3/6   train_loss = 1.515
Epoch 1548 Batch    2/6   train_loss = 1.518
Epoch 1549 Batch    1/6   train_loss = 1.553
Epoch 1550 Batch    0/6   train_loss = 1.588
Epoch 1550 Batch    5/6   train_loss = 1.514
Epoch 1551 Batch    4/6   train_loss = 1.510
Epoch 1552 Batch    3/6   train_loss = 1.498
Epoch 1553 Batch    2/6   train_loss = 1.487
Epoch 1554 Batch    1/6   train_loss = 1.550
Epoch 1555 Batch    0/6   train_loss = 1.550
Epoch 1555 Batch    5/6   train_loss = 1.501
Epoch 1556 Batch    4/6   train_loss = 1.470
Epoch 1557 Batch    3/6   train_loss = 1.491
Epoch 1558 Batch    2/6   train_loss = 1.459
Epoch 1559 Batch    1/6   train_loss = 1.520
Epoch 1560 Batch    0/6   train_loss = 1.540
Epoch 1560 Batch    5/6   train_loss = 1.487
Epoch 1561 Batch    4/6   train_loss = 1.470
Epoch 1562 Batch    3/6   train_loss = 1.492
Epoch 1563 Batch    2/6   train_loss = 1.454
Epoch 1564

Epoch 1698 Batch    2/6   train_loss = 1.371
Epoch 1699 Batch    1/6   train_loss = 1.437
Epoch 1700 Batch    0/6   train_loss = 1.446
Epoch 1700 Batch    5/6   train_loss = 1.402
Epoch 1701 Batch    4/6   train_loss = 1.386
Epoch 1702 Batch    3/6   train_loss = 1.406
Epoch 1703 Batch    2/6   train_loss = 1.391
Epoch 1704 Batch    1/6   train_loss = 1.440
Epoch 1705 Batch    0/6   train_loss = 1.461
Epoch 1705 Batch    5/6   train_loss = 1.389
Epoch 1706 Batch    4/6   train_loss = 1.394
Epoch 1707 Batch    3/6   train_loss = 1.377
Epoch 1708 Batch    2/6   train_loss = 1.352
Epoch 1709 Batch    1/6   train_loss = 1.448
Epoch 1710 Batch    0/6   train_loss = 1.448
Epoch 1710 Batch    5/6   train_loss = 1.420
Epoch 1711 Batch    4/6   train_loss = 1.432
Epoch 1712 Batch    3/6   train_loss = 1.442
Epoch 1713 Batch    2/6   train_loss = 1.429
Epoch 1714 Batch    1/6   train_loss = 1.495
Epoch 1715 Batch    0/6   train_loss = 1.505
Epoch 1715 Batch    5/6   train_loss = 1.449
Epoch 1716

Epoch 1850 Batch    5/6   train_loss = 1.204
Epoch 1851 Batch    4/6   train_loss = 1.175
Epoch 1852 Batch    3/6   train_loss = 1.202
Epoch 1853 Batch    2/6   train_loss = 1.175
Epoch 1854 Batch    1/6   train_loss = 1.252
Epoch 1855 Batch    0/6   train_loss = 1.242
Epoch 1855 Batch    5/6   train_loss = 1.209
Epoch 1856 Batch    4/6   train_loss = 1.189
Epoch 1857 Batch    3/6   train_loss = 1.186
Epoch 1858 Batch    2/6   train_loss = 1.221
Epoch 1859 Batch    1/6   train_loss = 1.249
Epoch 1860 Batch    0/6   train_loss = 1.296
Epoch 1860 Batch    5/6   train_loss = 1.213
Epoch 1861 Batch    4/6   train_loss = 1.230
Epoch 1862 Batch    3/6   train_loss = 1.237
Epoch 1863 Batch    2/6   train_loss = 1.198
Epoch 1864 Batch    1/6   train_loss = 1.303
Epoch 1865 Batch    0/6   train_loss = 1.268
Epoch 1865 Batch    5/6   train_loss = 1.224
Epoch 1866 Batch    4/6   train_loss = 1.237
Epoch 1867 Batch    3/6   train_loss = 1.203
Epoch 1868 Batch    2/6   train_loss = 1.195
Epoch 1869
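
One way to read these loss values is to convert them to perplexity, the exponential of the average cross-entropy. A model that spreads its probability evenly over V words has a cross-entropy of ln V, so the first logged loss of 8.822 suggests the network starts close to a uniform guess over several thousand tokens. The sketch below assumes the loss is measured in nats, which is the usual convention for softmax cross-entropy in TensorFlow.

```python
import math


def perplexity(cross_entropy_loss):
    """Convert an average cross-entropy loss (in nats) into a perplexity."""
    return math.exp(cross_entropy_loss)


# Loss values copied from the training log above
first_loss = 8.822   # Epoch 0, Batch 0
late_loss = 1.195    # Epoch 1868, Batch 2

print('Perplexity at the start: {:.0f}'.format(perplexity(first_loss)))
print('Perplexity near the end: {:.2f}'.format(perplexity(late_loss)))
```

These are training losses, so a low final perplexity may partly reflect memorization of this small dataset rather than a model that generalizes.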

## Save Parameters
Save `seq_length` and `save_dir` for generating a new TV script.

In [15]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
# Save parameters for checkpoint
helper.save_params((seq_length, save_dir))

# Checkpoint

In [16]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import tensorflow as tf
import numpy as np
import helper
import problem_unittests as tests

print(tf.__version__)

_, vocab_to_int, int_to_vocab, token_dict = helper.load_preprocess()
seq_length, load_dir = helper.load_params()

1.3.0


## Implement Generate Functions
### Get Tensors
Get tensors from `loaded_graph` using the function [`get_tensor_by_name()`](https://www.tensorflow.org/api_docs/python/tf/Graph#get_tensor_by_name).  Get the tensors using the following names:
- "input:0"
- "initial_state:0"
- "final_state:0"
- "probs:0"

Return the tensors in the following tuple `(InputTensor, InitialStateTensor, FinalStateTensor, ProbsTensor)` 

In [17]:
def get_tensors(loaded_graph):
    """
    Get input, initial state, final state, and probabilities tensor from <loaded_graph>
    :param loaded_graph: TensorFlow graph loaded from file
    :return: Tuple (InputTensor, InitialStateTensor, FinalStateTensor, ProbsTensor)
    """
    input_tensor = loaded_graph.get_tensor_by_name('input:0')
    initial_state = loaded_graph.get_tensor_by_name('initial_state:0')
    final_state = loaded_graph.get_tensor_by_name('final_state:0')
    probs = loaded_graph.get_tensor_by_name('probs:0')
    return input_tensor, initial_state, final_state, probs


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_get_tensors(get_tensors)

Tests Passed


### Choose Word
Implement the `pick_word()` function to select the next word using `probabilities`.

In [18]:
def pick_word(probabilities, int_to_vocab):
    """
    Pick the next word in the generated text
    :param probabilities: Probabilities of the next word
    :param int_to_vocab: Dictionary of word ids as the keys and words as the values
    :return: String of the predicted word
    """
    return int_to_vocab[np.argmax(probabilities)]


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_pick_word(pick_word)

Tests Passed
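
The `pick_word()` above always returns the single most likely word (`np.argmax`), so the same context always yields the same continuation. One common alternative, shown here only as a sketch, is to sample a word id in proportion to its predicted probability. The name `pick_word_sampled`, the optional `rng` argument, and the toy vocabulary are assumptions for illustration and are not part of the project template.

```python
import numpy as np


def pick_word_sampled(probabilities, int_to_vocab, rng=None):
    """
    Sample the next word in proportion to its predicted probability.
    :param probabilities: 1-D array-like of next-word probabilities
    :param int_to_vocab: Dictionary of word ids as the keys and words as the values
    :param rng: Optional np.random.RandomState for reproducible draws (assumption)
    :return: String of the sampled word
    """
    rng = rng if rng is not None else np.random
    probs = np.asarray(probabilities, dtype=np.float64)
    # Renormalize so float32 rounding in the softmax output cannot break choice()
    probs = probs / probs.sum()
    word_id = rng.choice(len(probs), p=probs)
    return int_to_vocab[int(word_id)]


# Toy vocabulary, for illustration only
toy_vocab = {0: 'moe_szyslak', 1: 'homer_simpson', 2: 'beer'}
sampled_word = pick_word_sampled([0.1, 0.2, 0.7], toy_vocab, np.random.RandomState(0))
```

Sampling trades some coherence for variety: unlikely words are occasionally chosen, but the generated script is less prone to repeating the same phrase.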


## Generate TV Script
This will generate the TV script for you.  Set `gen_length` to the number of words (tokens) you want the generated script to contain.

In [19]:
gen_length = 200
# homer_simpson, moe_szyslak, or barney_gumble (use lowercase to match the vocabulary)
prime_word = 'moe_szyslak'

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
loaded_graph = tf.Graph()
with tf.Session(graph=loaded_graph) as sess:
    # Load saved model
    loader = tf.train.import_meta_graph(load_dir + '.meta')
    loader.restore(sess, load_dir)

    # Get Tensors from loaded model
    input_text, initial_state, final_state, probs = get_tensors(loaded_graph)

    # Sentences generation setup
    gen_sentences = [prime_word + ':']
    prev_state = sess.run(initial_state, {input_text: np.array([[1]])})

    # Generate sentences
    for n in range(gen_length):
        # Dynamic Input
        dyn_input = [[vocab_to_int[word] for word in gen_sentences[-seq_length:]]]
        dyn_seq_length = len(dyn_input[0])

        # Get Prediction
        probabilities, prev_state = sess.run(
            [probs, final_state],
            {input_text: dyn_input, initial_state: prev_state})
        
        # pred_word = pick_word(probabilities[dyn_seq_length-1], int_to_vocab)
        pred_word = pick_word(probabilities[0][dyn_seq_length-1], int_to_vocab)
        
        gen_sentences.append(pred_word)
    
    # Remove tokens
    tv_script = ' '.join(gen_sentences)
    for key, token in token_dict.items():
        ending = ' ' if key in ['\n', '(', '"'] else ''
        tv_script = tv_script.replace(' ' + token.lower(), key)
    tv_script = tv_script.replace('\n ', '\n')
    tv_script = tv_script.replace('( ', '(')
        
    print(tv_script)

INFO:tensorflow:Restoring parameters from ./save
moe_szyslak:(rueful) no did you know, i unsanitary.
moe_szyslak:(disgusted) eh, i unsanitary when-i-get-a-hold-of-you..." i should seen next sound
homer_simpson:(tough, but carl on the think is the poet.
moe_szyslak: no, two i do occurred the throat) oh he plan at the kind across other gas. and i don't fantastic heard!
moe_szyslak: which big won't cares.
homer_simpson:(great) where you girls feedbag.
moe_szyslak:(barbara night) whoa, we'll called it she break with my think flaming anything the ticket of woe:.
homer_simpson:(gasp) and a-lug happy to, a stole out".
homer_simpson: wow, i got you nuts.
homer_simpson:(singing) hey, i wish i show of me, i do.(would sick"...


moe_szyslak:(covering using myself losers, i did you bottle on that bad. and one to starla.
homer_simpson: but i'm do that in the barflies!.
homer_simpson: i increasingly!
krusty_the_clown:
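
The "Remove tokens" step at the end of the cell above turns placeholder tokens back into punctuation. The sketch below replays that loop on a tiny hand-made example. The `token_dict` entries and the word list here are hypothetical stand-ins; the real dictionary comes from `helper.load_preprocess()` and is not shown in this section.

```python
# Hypothetical token dictionary: punctuation symbol -> placeholder token
token_dict = {'.': '||period||', ',': '||comma||', '\n': '||return||'}

# Hypothetical generated words, as the model would emit them
words = ['moe_szyslak:', 'hey', '||comma||', 'homer', '||period||', '||return||',
         'homer_simpson:', 'beer', '||period||']

script = ' '.join(words)
for key, token in token_dict.items():
    # Drop the space before each placeholder and substitute the symbol
    script = script.replace(' ' + token.lower(), key)
# A newline token leaves a stray space at the start of the next line
script = script.replace('\n ', '\n')

print(script)
```

Replacing `' ' + token` rather than the bare token is what attaches punctuation directly to the preceding word instead of leaving a gap before it.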


# The TV Script is Nonsensical
It's OK if the TV script doesn't make any sense.  We trained on less than a megabyte of text.  To get good results, you'll have to use a smaller vocabulary or get more data.  Luckily, there's more data!  As we mentioned at the beginning of this project, this is a subset of [another dataset](https://www.kaggle.com/wcukierski/the-simpsons-by-the-data).  We didn't have you train on all the data, because that would take too long.  However, you are free to train your neural network on all the data once you have completed the project.
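
One way to try the "smaller vocabulary" option is to keep only the most frequent words and map everything else to a single placeholder. The snippet below is a minimal sketch of that idea; `limit_vocab`, the `<unk>` token, and the sample words are assumptions for illustration and are not part of the project's helper code.

```python
from collections import Counter


def limit_vocab(words, max_size, unk_token='<unk>'):
    """Keep the `max_size` most frequent words; replace the rest with `unk_token`."""
    counts = Counter(words)
    keep = {word for word, _ in counts.most_common(max_size)}
    return [word if word in keep else unk_token for word in words]


# Tiny illustrative corpus
sample = 'beer beer beer moe moe duff'.split()
limited = limit_vocab(sample, max_size=2)
```

With fewer distinct words, each remaining word appears in more training examples, at the cost of the model occasionally emitting the placeholder.
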
# Submitting This Project
When submitting this project, make sure to run all the cells before saving the notebook. Save the notebook file as "dlnd_tv_script_generation.ipynb" and also export it as an HTML file under "File" -> "Download as". Include the "helper.py" and "problem_unittests.py" files in your submission.