# TV Script Generation

In this project, you'll generate your own [Seinfeld](https://en.wikipedia.org/wiki/Seinfeld) TV scripts using RNNs.  You'll be using part of the [Seinfeld dataset](https://www.kaggle.com/thec03u5/seinfeld-chronicles#scripts.csv) of scripts from 9 seasons.  The neural network you'll build will generate a new, "fake" TV script based on patterns it recognizes in this training data.

## Get the Data

The data is already provided for you in `./data/Seinfeld_Scripts.txt` and you're encouraged to open that file and look at the text. 
* As a first step, we'll load in this data and look at some samples.
* Then, you'll be tasked with defining and training an RNN to generate a new script!

In [1]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
# load in data
import helper
data_dir = './data/Seinfeld_Scripts.txt'
text = helper.load_data(data_dir)

## Explore the Data
Play around with `view_line_range` to view different parts of the data. This will give you a sense of the data you'll be working with. You can see, for example, that it is all lowercase text, and each new line of dialogue is separated by a newline character `\n`.

In [2]:
view_line_range = (0, 10)

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
import numpy as np

print('Dataset Stats')
print('Roughly the number of unique words: {}'.format(len({word: None for word in text.split()})))

lines = text.split('\n')
print('Number of lines: {}'.format(len(lines)))
word_count_line = [len(line.split()) for line in lines]
print('Average number of words in each line: {}'.format(np.average(word_count_line)))

print()
print('The lines {} to {}:'.format(*view_line_range))
print('\n'.join(text.split('\n')[view_line_range[0]:view_line_range[1]]))

Dataset Stats
Roughly the number of unique words: 46367
Number of lines: 109233
Average number of words in each line: 5.544240293684143

The lines 0 to 10:
jerry: do you know what this is all about? do you know, why were here? to be out, this is out...and out is one of the single most enjoyable experiences of life. people...did you ever hear people talking about we should go out? this is what theyre talking about...this whole thing, were all out now, no one is home. not one person here is home, were all out! there are people trying to find us, they dont know where we are. (on an imaginary phone) did you ring?, i cant find him. where did he go? he didnt tell me where he was going. he must have gone out. you wanna go out you get ready, you pick out the clothes, right? you take the shower, you get all ready, get the cash, get your friends, the car, the spot, the reservation...then youre standing around, what do you do? you go we gotta be getting back. once youre out, you wanna get back! y

---
## Implement Pre-processing Functions
The first thing to do to any dataset is pre-processing.  Implement the following pre-processing functions below:
- Lookup Table
- Tokenize Punctuation

### Lookup Table
To create a word embedding, you first need to transform the words to ids.  In this function, create two dictionaries:
- Dictionary to go from the words to an id, we'll call `vocab_to_int`
- Dictionary to go from the id to word, we'll call `int_to_vocab`

Return these dictionaries in the following **tuple** `(vocab_to_int, int_to_vocab)`
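
As a quick illustration of what the two dictionaries look like, here is a minimal sketch on a toy word list. The word list and the `toy_` names are made up for this example, and ordering ids by word frequency is just one reasonable choice; any consistent one-to-one mapping would work.

```python
from collections import Counter

# Toy word list standing in for the tokenized script (hypothetical data).
words = ["jerry", "hello", "george", "hello", "jerry", "hello"]

# Most frequent words get the smallest ids; any consistent ordering would work.
counts = Counter(words)
sorted_vocab = sorted(counts, key=counts.get, reverse=True)

toy_vocab_to_int = {word: i for i, word in enumerate(sorted_vocab)}
toy_int_to_vocab = {i: word for word, i in toy_vocab_to_int.items()}

# Encoding and then decoding should give back the original words.
encoded = [toy_vocab_to_int[w] for w in words]
decoded = [toy_int_to_vocab[i] for i in encoded]
print(encoded)
print(decoded == words)
```

The round trip at the end is a handy sanity check: if decoding the encoded ids does not reproduce the input, the two dictionaries are not inverses of each other.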

In [3]:
import problem_unittests as tests
import re
from string import punctuation
from collections import Counter

def create_lookup_tables(text):
    """
    Create lookup tables for vocabulary
    :param text: The text of tv scripts split into words
    :return: A tuple of dicts (vocab_to_int, int_to_vocab)
    """
    # TODO: Implement Function
    vocabulary = Counter(text)
    int_to_vocab = {i : word for i, word in enumerate(sorted(vocabulary, key=vocabulary.get, reverse=True))}
    vocab_to_int = {word : i for i, word in int_to_vocab.items()}
    
    # return tuple
    return (vocab_to_int, int_to_vocab)


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_create_lookup_tables(create_lookup_tables)

Tests Passed


### Tokenize Punctuation
We'll be splitting the script into a word array using spaces as delimiters.  However, punctuation such as periods and exclamation marks can create multiple ids for the same word. For example, "bye" and "bye!" would generate two different word ids.

Implement the function `token_lookup` to return a dict that will be used to tokenize symbols like "!" into "||Exclamation_Mark||".  Create a dictionary for the following symbols where the symbol is the key and value is the token:
- Period ( **.** )
- Comma ( **,** )
- Quotation Mark ( **"** )
- Semicolon ( **;** )
- Exclamation mark ( **!** )
- Question mark ( **?** )
- Left Parenthesis ( **(** )
- Right Parenthesis ( **)** )
- Dash ( **-** )
- Return ( **\n** )

This dictionary will be used to tokenize the symbols and add the delimiter (space) around them.  This separates each symbol as its own word, making it easier for the neural network to predict the next word. Make sure you don't use a value that could be confused as a word; for example, instead of using the value "dash", try using something like "||dash||".
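
To make the role of this dictionary concrete, here is a minimal sketch of how such a mapping could be applied to raw text. The function `tokenize_text` and the reduced `toy_token_dict` are hypothetical names for this illustration; the project's own helper code may do this step differently.

```python
# Hypothetical subset of the punctuation-to-token mapping described above.
toy_token_dict = {
    '.': '||period||',
    ',': '||comma||',
    '!': '||exclamation||',
    '?': '||question||',
    '\n': '||newline||',
}

def tokenize_text(text, token_dict):
    """Replace each symbol with its token, padded by spaces, then split on whitespace."""
    for symbol, token in token_dict.items():
        text = text.replace(symbol, ' {} '.format(token))
    return text.lower().split()

tokens = tokenize_text("bye! bye, jerry.\nhello?", toy_token_dict)
print(tokens)
```

After this step both occurrences of "bye" are the same token, and each punctuation mark is a separate entry in the word list.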

In [4]:
def token_lookup():
    """
    Generate a dict to turn punctuation into a token.
    :return: Tokenized dictionary where the key is the punctuation and the value is the token
    """
    # TODO: Implement Function    
    return {
        '.': '||period||',
        ',': '||comma||',
        '"': '||quotation||',
        ';': '||semicolon||',
        '!': '||exclamation||',
        '?': '||question||',
        '(': '||left_parenthesis||',
        ')': '||right_parenthesis||',
        '-': '||dash||',
        '\n': '||newline||'
    }

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_tokenize(token_lookup)

Tests Passed


## Pre-process all the data and save it

Running the code cell below will pre-process all the data and save it to file. You're encouraged to look at the code for `preprocess_and_save_data` in the `helper.py` file to see what it's doing in detail, but you do not need to change this code.

In [5]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
# pre-process training data
helper.preprocess_and_save_data(data_dir, token_lookup, create_lookup_tables)

# Check Point
This is your first checkpoint. If you ever decide to come back to this notebook or have to restart the notebook, you can start from here. The preprocessed data has been saved to disk.

In [6]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import helper
import problem_unittests as tests

int_text, vocab_to_int, int_to_vocab, token_dict = helper.load_preprocess()

## Build the Neural Network
In this section, you'll build the components necessary to build an RNN by implementing the RNN Module and forward and backpropagation functions.

### Check Access to GPU

In [7]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import torch

# Check for a GPU
train_on_gpu = torch.cuda.is_available()
if not train_on_gpu:
    print('No GPU found. Please use a GPU to train your neural network.')

## Input
Let's start with the preprocessed input data. We'll use [TensorDataset](http://pytorch.org/docs/master/data.html#torch.utils.data.TensorDataset) to provide a known format to our dataset; in combination with [DataLoader](http://pytorch.org/docs/master/data.html#torch.utils.data.DataLoader), it will handle batching, shuffling, and other dataset iteration functions.

You can create data with TensorDataset by passing in feature and target tensors. Then create a DataLoader as usual.
```
data = TensorDataset(feature_tensors, target_tensors)
data_loader = torch.utils.data.DataLoader(data, 
                                          batch_size=batch_size)
```

### Batching
Implement the `batch_data` function to batch `words` data into chunks of size `batch_size` using the `TensorDataset` and `DataLoader` classes.

>You can batch words using the DataLoader, but it will be up to you to create `feature_tensors` and `target_tensors` of the correct size and content for a given `sequence_length`.

For example, say we have these as input:
```
words = [1, 2, 3, 4, 5, 6, 7]
sequence_length = 4
```

Your first `feature_tensor` should contain the values:
```
[1, 2, 3, 4]
```
And the corresponding `target_tensor` should just be the next "word"/tokenized word value:
```
5
```
This should continue with the second `feature_tensor`, `target_tensor` being:
```
[2, 3, 4, 5]  # features
6             # target
```
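
The windowing in the example above can be sketched in plain Python before any tensors are involved. The helper `sliding_windows` is a hypothetical name for this illustration and it keeps every window; a real `batch_data` implementation may additionally drop trailing windows so that only full batches remain.

```python
def sliding_windows(words, sequence_length):
    """Return (features, targets): each feature is a window of ids,
    each target is the id that immediately follows that window."""
    features, targets = [], []
    for start in range(len(words) - sequence_length):
        features.append(list(words[start:start + sequence_length]))
        targets.append(words[start + sequence_length])
    return features, targets

words = [1, 2, 3, 4, 5, 6, 7]
features, targets = sliding_windows(words, sequence_length=4)
for feature, target in zip(features, targets):
    print(feature, '->', target)
```

Note that a list of `n` ids yields `n - sequence_length` windows, because the last window still needs a following id to serve as its target.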

In [8]:
from torch.utils.data import TensorDataset, DataLoader

def batch_data(words, sequence_length, batch_size):
    """
    Batch the neural network data using DataLoader
    :param words: The word ids of the TV scripts
    :param sequence_length: The sequence length of each batch
    :param batch_size: The size of each batch; the number of sequences in a batch
    :return: DataLoader with batched data
    """
    # TODO: Implement function
    num_batches = (len(words) - sequence_length) // batch_size    
    
    features = []
    targets = []
    for i in range(num_batches * batch_size):
        features.append(words[i:i+sequence_length])
        targets.append(words[i+sequence_length])
        
    dataset = TensorDataset(torch.LongTensor(features), torch.LongTensor(targets))
    # return a dataloader
    return DataLoader(dataset, batch_size=batch_size)


# there is no test for this function, but you are encouraged to create
# print statements and tests of your own
print([word for word in range(15)])
for data in batch_data(range(15), 3, 4):
    print(data)


[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
[tensor([[0, 1, 2],
        [1, 2, 3],
        [2, 3, 4],
        [3, 4, 5]]), tensor([3, 4, 5, 6])]
[tensor([[4, 5, 6],
        [5, 6, 7],
        [6, 7, 8],
        [7, 8, 9]]), tensor([ 7,  8,  9, 10])]
[tensor([[ 8,  9, 10],
        [ 9, 10, 11],
        [10, 11, 12],
        [11, 12, 13]]), tensor([11, 12, 13, 14])]


### Test your dataloader 

You'll have to modify this code to test a batching function, but it should look fairly similar.

Below, we're generating some test text data and defining a dataloader using the function you defined, above. Then, we are getting some sample batch of inputs `sample_x` and targets `sample_y` from our dataloader.

Your code should return something like the following (likely in a different order, if you shuffled your data):

```
torch.Size([10, 5])
tensor([[ 28,  29,  30,  31,  32],
        [ 21,  22,  23,  24,  25],
        [ 17,  18,  19,  20,  21],
        [ 34,  35,  36,  37,  38],
        [ 11,  12,  13,  14,  15],
        [ 23,  24,  25,  26,  27],
        [  6,   7,   8,   9,  10],
        [ 38,  39,  40,  41,  42],
        [ 25,  26,  27,  28,  29],
        [  7,   8,   9,  10,  11]])

torch.Size([10])
tensor([ 33,  26,  22,  39,  16,  28,  11,  43,  30,  12])
```

### Sizes
Your sample_x should be of size `(batch_size, sequence_length)` or (10, 5) in this case and sample_y should just have one dimension: batch_size (10). 

### Values

You should also notice that the targets, sample_y, are the *next* value in the ordered test_text data. So, for an input sequence `[ 28,  29,  30,  31,  32]` that ends with the value `32`, the corresponding output should be `33`.

In [9]:
# test dataloader

test_text = range(50)
t_loader = batch_data(test_text, sequence_length=5, batch_size=10)

data_iter = iter(t_loader)
sample_x, sample_y = next(data_iter)

print(sample_x.shape)
print(sample_x)
print()
print(sample_y.shape)
print(sample_y)

torch.Size([10, 5])
tensor([[ 0,  1,  2,  3,  4],
        [ 1,  2,  3,  4,  5],
        [ 2,  3,  4,  5,  6],
        [ 3,  4,  5,  6,  7],
        [ 4,  5,  6,  7,  8],
        [ 5,  6,  7,  8,  9],
        [ 6,  7,  8,  9, 10],
        [ 7,  8,  9, 10, 11],
        [ 8,  9, 10, 11, 12],
        [ 9, 10, 11, 12, 13]])

torch.Size([10])
tensor([ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14])


---
## Build the Neural Network
Implement an RNN using PyTorch's [Module class](http://pytorch.org/docs/master/nn.html#torch.nn.Module). You may choose to use a GRU or an LSTM. To complete the RNN, you'll have to implement the following functions for the class:
 - `__init__` - The initialize function. 
 - `init_hidden` - The initialization function for an LSTM/GRU hidden state
 - `forward` - Forward propagation function.
 
The initialize function should create the layers of the neural network and save them to the class. The forward propagation function will use these layers to run forward propagation and generate an output and a hidden state.

**The output of this model should be the *last* batch of word scores** after a complete sequence has been processed. That is, for each input sequence of words, we only want to output the word scores for a single, most likely, next word.

### Hints

1. Make sure to stack the outputs of the LSTM before passing them to your fully-connected layer; you can do this with `lstm_output = lstm_output.contiguous().view(-1, self.hidden_dim)`
2. You can get the last batch of word scores by shaping the output of the final, fully-connected layer like so:

```
# reshape into (batch_size, seq_length, output_size)
output = output.view(batch_size, -1, self.output_size)
# get last batch
out = output[:, -1]
```
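
To see why these two hints select the right rows, the following sketch emulates the reshaping with plain Python lists instead of tensors. It assumes the LSTM was created with `batch_first=True`, so flattening keeps all time steps of one sequence next to each other; each row is represented only by a `(sequence index, time step)` label.

```python
batch_size, seq_length = 2, 3

# Flattened rows in the order produced by view(-1, hidden_dim) on a batch_first
# tensor: all time steps of sequence 0, then all time steps of sequence 1, ...
flat_rows = [(b, t) for b in range(batch_size) for t in range(seq_length)]

# Emulate view(batch_size, -1, output_size): regroup the rows per sequence.
per_sequence = [flat_rows[b * seq_length:(b + 1) * seq_length]
                for b in range(batch_size)]

# Emulate output[:, -1]: keep only the last time step of each sequence.
last_steps = [rows[-1] for rows in per_sequence]
print(last_steps)
```

Each selected row carries the final time step of its own sequence, which is the one set of word scores per input sequence that the model should return.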

In [10]:
import torch.nn as nn

class RNN(nn.Module):
    
    def __init__(self, vocab_size, output_size, embedding_dim, hidden_dim, n_layers, dropout=0.5):
        """
        Initialize the PyTorch RNN Module
        :param vocab_size: The number of input dimensions of the neural network (the size of the vocabulary)
        :param output_size: The number of output dimensions of the neural network
        :param embedding_dim: The size of embeddings, should you choose to use them        
        :param hidden_dim: The size of the hidden layer outputs
        :param n_layers: The number of stacked LSTM/GRU layers
        :param dropout: dropout to add in between LSTM/GRU layers
        """
        super(RNN, self).__init__()
        # TODO: Implement function
        # set class variables
        self.hidden_dim = hidden_dim
        self.num_layers = n_layers
        self.output_size = output_size
        
        # define model layers
        self.embed_input = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, n_layers, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_dim, output_size)
    
    def forward(self, nn_input, hidden):
        """
        Forward propagation of the neural network
        :param nn_input: The input to the neural network
        :param hidden: The hidden state        
        :return: Two Tensors, the output of the neural network and the latest hidden state
        """
        # TODO: Implement function
        batch_size = nn_input.shape[0]
        x = self.embed_input(nn_input)
        x, hidden = self.lstm(x, hidden)
        x = x.contiguous().view(-1, self.hidden_dim)
#         x = self.dropout(x)
        x = self.fc(x)
        x = x.view(batch_size, -1, self.output_size)

        # return one batch of output word scores and the hidden state
        return x[:, -1], hidden
    
    
    def init_hidden(self, batch_size):
        '''
        Initialize the hidden state of an LSTM/GRU
        :param batch_size: The batch_size of the hidden state
        :return: hidden state of dims (n_layers, batch_size, hidden_dim)
        '''
        
        # Implement function
        # an LSTM expects a (hidden state, cell state) pair regardless of n_layers
        weight = next(self.parameters()).data
        hidden = tuple(weight.new(self.num_layers, batch_size, self.hidden_dim).zero_()
                       for _ in range(2))
        
        # initialize hidden state with zero weights, and move to GPU if available
        if train_on_gpu:
            hidden = tuple(each.cuda() for each in hidden)
        return hidden

   
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_rnn(RNN, train_on_gpu)

Tests Passed


### Define forward and backpropagation

Use the RNN class you implemented to apply forward and back propagation. This function will be called, iteratively, in the training loop as follows:
```
loss, hidden = forward_back_prop(rnn, optimizer, criterion, inp, target, hidden)
```

And it should return the average loss over a batch and the hidden state returned by a call to `RNN(inp, hidden)`. Recall that you can get this loss by computing it, as usual, and calling `loss.item()`.

**If a GPU is available, you should move your data to that GPU device, here.**

In [11]:
def forward_back_prop(rnn, optimizer, criterion, inp, target, hidden):
    """
    Forward and backward propagation on the neural network
    :param rnn: The PyTorch Module that holds the neural network
    :param optimizer: The PyTorch optimizer for the neural network
    :param criterion: The PyTorch loss function
    :param inp: A batch of input to the neural network
    :param target: The target output for the batch of input
    :param hidden: The hidden state to feed into the network for this batch
    :return: The loss and the latest hidden state Tensor
    """
    
    # TODO: Implement Function

    # move data to GPU, if available
    if train_on_gpu:
        inp, target = inp.cuda(), target.cuda()

    rnn.zero_grad()
    hidden = tuple([each.data for each in hidden])
    output, hidden_out = rnn(inp, hidden)
    # perform backpropagation and optimization
    loss = criterion(output, target)
    loss.backward()

#     nn.utils.clip_grad_norm_(rnn.parameters(), 1)
    optimizer.step()
    
    # return the loss over a batch and the hidden state produced by our model
    return loss.item(), hidden_out

# Note that these tests aren't completely extensive.
# they are here to act as general checks on the expected outputs of your functions
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_forward_back_prop(RNN, forward_back_prop, train_on_gpu)

Tests Passed


## Neural Network Training

With the structure of the network complete and the data ready to be fed into the neural network, it's time to train it.

### Train Loop

The training loop is implemented for you in the `train_rnn` function. This function will train the network over all the batches for the number of epochs given. Training progress is printed once every `show_every_n_batches` batches, as the average loss over the batches since the last printout. You'll set this parameter along with other parameters in the next section.

In [12]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""

def train_rnn(rnn, batch_size, optimizer, criterion, n_epochs, show_every_n_batches=100):
    batch_losses = []
    
    rnn.train()

    print("Training for %d epoch(s)..." % n_epochs)
    for epoch_i in range(1, n_epochs + 1):
        
        # initialize hidden state
        hidden = rnn.init_hidden(batch_size)
        
        for batch_i, (inputs, labels) in enumerate(train_loader, 1):
            
            # make sure you iterate over completely full batches, only
            n_batches = len(train_loader.dataset)//batch_size
            if(batch_i > n_batches):
                break
            
            # forward, back prop
            loss, hidden = forward_back_prop(rnn, optimizer, criterion, inputs, labels, hidden)          
            # record loss
            batch_losses.append(loss)

            # printing loss stats
            if batch_i % show_every_n_batches == 0:
                print('Epoch: {:>4}/{:<4}  Loss: {}\n'.format(
                    epoch_i, n_epochs, np.average(batch_losses)))
                batch_losses = []

    # returns a trained rnn
    return rnn

### Hyperparameters

Set and train the neural network with the following parameters:
- Set `sequence_length` to the length of a sequence.
- Set `batch_size` to the batch size.
- Set `num_epochs` to the number of epochs to train for.
- Set `learning_rate` to the learning rate for an Adam optimizer.
- Set `vocab_size` to the number of unique tokens in our vocabulary.
- Set `output_size` to the desired size of the output.
- Set `embedding_dim` to the embedding dimension; smaller than the vocab_size.
- Set `hidden_dim` to the hidden dimension of your RNN.
- Set `n_layers` to the number of layers/cells in your RNN.
- Set `show_every_n_batches` to the number of batches at which the neural network should print progress.

If the network isn't getting the desired results, tweak these parameters and/or the layers in the `RNN` class.
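
When choosing `batch_size` and `show_every_n_batches`, it can help to estimate how many batches one epoch contains. The sketch below assumes every window of `sequence_length` ids needs one following id as its target and that only full batches are kept; `batches_per_epoch` and the corpus size are hypothetical and used only for illustration.

```python
def batches_per_epoch(num_tokens, sequence_length, batch_size):
    """Number of full batches when every window of sequence_length ids
    must be followed by one target id."""
    num_windows = num_tokens - sequence_length
    return num_windows // batch_size

# Sanity check against the small batching demo earlier: 15 ids, windows of 3, batches of 4.
print(batches_per_epoch(15, 3, 4))

# Hypothetical corpus size, for illustration only.
assumed_num_tokens = 900000
print(batches_per_epoch(assumed_num_tokens, sequence_length=15, batch_size=30))
```

Dividing the result by `show_every_n_batches` gives a rough idea of how many progress lines to expect per epoch.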

In [13]:
# Data params
# Sequence Length
sequence_length = 15  # of words in a sequence
# Batch Size
batch_size = 30

# data loader - do not change
train_loader = batch_data(int_text, sequence_length, batch_size)

In [14]:
# Training parameters
# Number of Epochs
num_epochs = 60
# Learning Rate
learning_rate = 0.001

# Model parameters
# Vocab size
vocab_size = len(vocab_to_int)
# Output size
output_size = len(vocab_to_int)
# Embedding Dimension
embedding_dim = 300
# Hidden Dimension
hidden_dim = 256
# Number of RNN Layers
n_layers = 2

# Show stats for every n number of batches
show_every_n_batches = 500

### Train
In the next cell, you'll train the neural network on the pre-processed data.  If you have a hard time getting a good loss, you may consider changing your hyperparameters. In general, you may get better results with larger `hidden_dim` and `n_layers` values, but larger models take longer to train.
> **You should aim for a loss less than 3.5.** 

You should also experiment with different sequence lengths, which determine the size of the long range dependencies that a model can learn.
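
To put the loss target in perspective, a mean cross-entropy loss (in nats, as reported by `nn.CrossEntropyLoss`) can be converted to perplexity by exponentiating it. The sketch below is only an aid for interpretation; `assumed_vocab_size` is a hypothetical number, not the actual vocabulary size of this dataset.

```python
import math

def perplexity(cross_entropy_loss):
    """Convert a mean cross-entropy loss (in nats) to perplexity."""
    return math.exp(cross_entropy_loss)

# Hypothetical vocabulary size, used only to show the uniform-guess baseline.
assumed_vocab_size = 20000
uniform_loss = math.log(assumed_vocab_size)

print('Loss of a uniform guess: {:.2f}'.format(uniform_loss))
print('Perplexity at loss 3.5: {:.1f}'.format(perplexity(3.5)))
```

Read this way, a loss of 3.5 means the model is about as uncertain as if it were choosing uniformly among a few dozen words at each step, which is far better than guessing uniformly over the whole vocabulary.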

In [15]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""

# create model and move to gpu if available
rnn = RNN(vocab_size, output_size, embedding_dim, hidden_dim, n_layers, dropout=0.5)
if train_on_gpu:
    rnn.cuda()

# defining loss and optimization functions for training
optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate)
criterion = nn.CrossEntropyLoss()

# training the model
trained_rnn = train_rnn(rnn, batch_size, optimizer, criterion, num_epochs, show_every_n_batches)

# saving the trained model
helper.save_model('./save/trained_rnn', trained_rnn)
print('Model Trained and Saved')

Training for 60 epoch(s)...
Epoch:    1/60    Loss: 5.584539404392243

Epoch:    1/60    Loss: 4.978618786811829

Epoch:    1/60    Loss: 4.743197936058045

Epoch:    1/60    Loss: 4.7408830742836

Epoch:    1/60    Loss: 4.796560604572296

Epoch:    1/60    Loss: 4.691383345127106

Epoch:    1/60    Loss: 4.620989734649658

Epoch:    1/60    Loss: 4.5035069065094

Epoch:    1/60    Loss: 4.2975137164592745

Epoch:    1/60    Loss: 4.580089133262634

Epoch:    1/60    Loss: 4.363871543884278

Epoch:    1/60    Loss: 4.470418291568756

Epoch:    1/60    Loss: 4.186340519428253

Epoch:    1/60    Loss: 4.282206547737122

Epoch:    1/60    Loss: 4.192819122314453

Epoch:    1/60    Loss: 4.33475363445282

Epoch:    1/60    Loss: 4.294622162818909

Epoch:    1/60    Loss: 4.451160747051239

Epoch:    1/60    Loss: 4.440550621509552

Epoch:    1/60    Loss: 4.313336195468903

Epoch:    1/60    Loss: 4.309487774848938

Epoch:    1/60    Loss: 4.205012201547623

Epoch:    1/60    Loss: 4.5028

Epoch:    4/60    Loss: 3.729708338022232

Epoch:    4/60    Loss: 3.6837831318378447

Epoch:    4/60    Loss: 3.7861758847236633

Epoch:    4/60    Loss: 3.851883688211441

Epoch:    4/60    Loss: 4.0383371849060055

Epoch:    4/60    Loss: 3.9950000686645506

Epoch:    4/60    Loss: 3.900468761205673

Epoch:    4/60    Loss: 3.837289304494858

Epoch:    4/60    Loss: 3.780396887779236

Epoch:    4/60    Loss: 4.057126965761185

Epoch:    4/60    Loss: 3.9283308670520785

Epoch:    4/60    Loss: 4.049860929012299

Epoch:    4/60    Loss: 4.121878700256348

Epoch:    4/60    Loss: 3.9498735902309416

Epoch:    4/60    Loss: 4.0905147600173954

Epoch:    4/60    Loss: 3.9671390261650084

Epoch:    4/60    Loss: 3.9465711572170257

Epoch:    4/60    Loss: 3.83036514043808

Epoch:    4/60    Loss: 4.124916875839234

Epoch:    4/60    Loss: 3.954480763912201

Epoch:    4/60    Loss: 3.7412215039730072

Epoch:    4/60    Loss: 3.723030748605728

Epoch:    4/60    Loss: 3.905531548023224

...

Epoch:    7/60    Loss: 3.8757731997966767

Epoch:    7/60    Loss: 3.7476639981269835

Epoch:    7/60    Loss: 3.8593687086105346

Epoch:    7/60    Loss: 3.7231946861743928

Epoch:    7/60    Loss: 3.6708181862831117

Epoch:    7/60    Loss: 3.5521518626213076

Epoch:    7/60    Loss: 3.874047056674957

Epoch:    7/60    Loss: 3.734549289226532

Epoch:    7/60    Loss: 3.5545007317066193

Epoch:    7/60    Loss: 3.5176464920043946

Epoch:    7/60    Loss: 3.667836847305298

Epoch:    7/60    Loss: 3.5436180019378662

Epoch:    7/60    Loss: 3.745401458263397

Epoch:    7/60    Loss: 3.7001532146930693

Epoch:    7/60    Loss: 3.5840956127643584

Epoch:    7/60    Loss: 3.6670471658706667

Epoch:    7/60    Loss: 3.6500742745399477

Epoch:    7/60    Loss: 3.679114816904068

Epoch:    7/60    Loss: 3.7834524478912352

Epoch:    7/60    Loss: 3.675047170639038

Epoch:    7/60    Loss: 3.7911734120845795

Epoch:    7/60    Loss: 3.932958755016327

Epoch:    7/60    Loss: 3.9293767795562

Epoch:   10/60    Loss: 3.5584982147216797

Epoch:   10/60    Loss: 3.5097120525836947

Epoch:   10/60    Loss: 3.4547658886909485

Epoch:   10/60    Loss: 3.4657480516433714

Epoch:   10/60    Loss: 3.4924599905014038

Epoch:   10/60    Loss: 3.5465559515953062

Epoch:   10/60    Loss: 3.6250047516822814

Epoch:   10/60    Loss: 3.558868219137192

Epoch:   10/60    Loss: 3.6529250473976136

Epoch:   10/60    Loss: 3.7863014068603515

Epoch:   10/60    Loss: 3.7193010082244875

Epoch:   10/60    Loss: 3.673429448366165

Epoch:   10/60    Loss: 3.712593658924103

Epoch:   10/60    Loss: 3.69113086271286

Epoch:   10/60    Loss: 3.727460372686386

Epoch:   10/60    Loss: 3.7415526093244553

Epoch:   10/60    Loss: 3.795404575109482

Epoch:   10/60    Loss: 3.713820204734802

Epoch:   10/60    Loss: 3.6860624089241028

Epoch:   10/60    Loss: 3.5882261383533476

Epoch:   10/60    Loss: 3.7736968326568605

Epoch:   10/60    Loss: 3.592023967504501

Epoch:   11/60    Loss: 3.762373757751092

Epoch:   13/60    Loss: 3.579603631258011

Epoch:   13/60    Loss: 3.5262464666366578

Epoch:   13/60    Loss: 3.543883153676987

Epoch:   13/60    Loss: 3.564850581407547

Epoch:   13/60    Loss: 3.6192431302070616

Epoch:   13/60    Loss: 3.580011263370514

Epoch:   13/60    Loss: 3.589677803993225

Epoch:   13/60    Loss: 3.4980432991981507

Epoch:   13/60    Loss: 3.6150482680797578

Epoch:   13/60    Loss: 3.4706174697875976

Epoch:   14/60    Loss: 3.6440658080189126

Epoch:   14/60    Loss: 3.5505749077796938

Epoch:   14/60    Loss: 3.4221648554801942

Epoch:   14/60    Loss: 3.456373544692993

Epoch:   14/60    Loss: 3.4427258191108705

Epoch:   14/60    Loss: 3.475616331100464

Epoch:   14/60    Loss: 3.5077902569770814

Epoch:   14/60    Loss: 3.407384548187256

Epoch:   14/60    Loss: 3.223609859228134

Epoch:   14/60    Loss: 3.4691884999275207

Epoch:   14/60    Loss: 3.3999497628211977

Epoch:   14/60    Loss: 3.397182380914688

Epoch:   14/60    Loss: 3.2375227584838866

Epoch:   17/60    Loss: 3.277259191274643

Epoch:   17/60    Loss: 3.302737434387207

Epoch:   17/60    Loss: 3.3002242386341094

Epoch:   17/60    Loss: 3.3366727983951567

Epoch:   17/60    Loss: 3.3957268357276917

Epoch:   17/60    Loss: 3.3206763558387755

Epoch:   17/60    Loss: 3.136403620481491

Epoch:   17/60    Loss: 3.3816332268714904

Epoch:   17/60    Loss: 3.3256276853084565

Epoch:   17/60    Loss: 3.3420629749298096

Epoch:   17/60    Loss: 3.1878785796165467

Epoch:   17/60    Loss: 3.233029477596283

Epoch:   17/60    Loss: 3.2201851279735565

Epoch:   17/60    Loss: 3.2364381322860716

Epoch:   17/60    Loss: 3.245013053417206

Epoch:   17/60    Loss: 3.350328604221344

Epoch:   17/60    Loss: 3.3014356188774108

Epoch:   17/60    Loss: 3.2275468764305115

Epoch:   17/60    Loss: 3.2695296766757966

Epoch:   17/60    Loss: 3.236613594055176

Epoch:   17/60    Loss: 3.417028036355972

Epoch:   17/60    Loss: 3.3160128190517426

Epoch:   17/60    Loss: 3.36513946747779

Epoch:   20/60    Loss: 3.16709756731987

Epoch:   20/60    Loss: 3.155626588821411

Epoch:   20/60    Loss: 3.136560223579407

Epoch:   20/60    Loss: 3.199279043197632

Epoch:   20/60    Loss: 3.2842544367313384

Epoch:   20/60    Loss: 3.2330834448337553

Epoch:   20/60    Loss: 3.122628919363022

Epoch:   20/60    Loss: 3.200595396280289

Epoch:   20/60    Loss: 3.1525713675022127

Epoch:   20/60    Loss: 3.3322874665260316

Epoch:   20/60    Loss: 3.2201295130252836

Epoch:   20/60    Loss: 3.290888930082321

Epoch:   20/60    Loss: 3.364639599084854

Epoch:   20/60    Loss: 3.2395799372196197

Epoch:   20/60    Loss: 3.3343406012058256

Epoch:   20/60    Loss: 3.24289826130867

Epoch:   20/60    Loss: 3.1938642995357513

Epoch:   20/60    Loss: 3.081999782085419

Epoch:   20/60    Loss: 3.298675107717514

Epoch:   20/60    Loss: 3.232543532848358

Epoch:   20/60    Loss: 3.0943985736370085

Epoch:   20/60    Loss: 3.0767729694843293

Epoch:   20/60    Loss: 3.203366997241974

...

Epoch:   23/60    Loss: 3.2918880982398986

Epoch:   23/60    Loss: 3.2071077835559847

Epoch:   23/60    Loss: 3.280902110815048

Epoch:   23/60    Loss: 3.2026264243125917

Epoch:   23/60    Loss: 3.136078412294388

Epoch:   23/60    Loss: 3.048828910589218

Epoch:   23/60    Loss: 3.244369997739792

Epoch:   23/60    Loss: 3.187796584367752

Epoch:   23/60    Loss: 3.0604088876247406

Epoch:   23/60    Loss: 3.065027961730957

Epoch:   23/60    Loss: 3.1629499826431275

Epoch:   23/60    Loss: 3.0547912983894348

Epoch:   23/60    Loss: 3.122392864704132

Epoch:   23/60    Loss: 3.1397533531188966

Epoch:   23/60    Loss: 3.1241463499069213

Epoch:   23/60    Loss: 3.117223782300949

Epoch:   23/60    Loss: 3.1003319392204283

Epoch:   23/60    Loss: 3.1695787980556487

Epoch:   23/60    Loss: 3.227068747520447

Epoch:   23/60    Loss: 3.166234560728073

Epoch:   23/60    Loss: 3.2481342577934265

Epoch:   23/60    Loss: 3.287988368988037

Epoch:   23/60    Loss: 3.2103374769687654


Epoch:   26/60    Loss: 3.074005312681198

Epoch:   26/60    Loss: 3.0990140192508697

Epoch:   26/60    Loss: 3.0603642933368684

Epoch:   26/60    Loss: 3.0486267025470735

Epoch:   26/60    Loss: 3.0422983050346373

Epoch:   26/60    Loss: 3.1140801928043365

Epoch:   26/60    Loss: 3.170503722667694

Epoch:   26/60    Loss: 3.1184332506656647

Epoch:   26/60    Loss: 3.2031474485397338

Epoch:   26/60    Loss: 3.241922064781189

Epoch:   26/60    Loss: 3.153810519695282

Epoch:   26/60    Loss: 3.1913318133354185

Epoch:   26/60    Loss: 3.2722426114082337

Epoch:   26/60    Loss: 3.240302745103836

Epoch:   26/60    Loss: 3.2511712880134582

Epoch:   26/60    Loss: 3.2796772305965423

Epoch:   26/60    Loss: 3.28140873670578

Epoch:   26/60    Loss: 3.219908824443817

Epoch:   26/60    Loss: 3.1930616080760954

Epoch:   26/60    Loss: 3.179032930135727

Epoch:   26/60    Loss: 3.2076800801754

Epoch:   26/60    Loss: 3.1514083974361418

Epoch:   27/60    Loss: 3.2517395588042945



Epoch:   29/60    Loss: 3.2626559114456177

Epoch:   29/60    Loss: 3.1931901626586914

Epoch:   29/60    Loss: 3.201913867235184

Epoch:   29/60    Loss: 3.21992192029953

Epoch:   29/60    Loss: 3.224096882343292

Epoch:   29/60    Loss: 3.163540901184082

Epoch:   29/60    Loss: 3.1366022844314574

Epoch:   29/60    Loss: 3.1418236718177797

Epoch:   29/60    Loss: 3.1760004720687864

Epoch:   29/60    Loss: 3.134887305855751

Epoch:   30/60    Loss: 3.187243509227815

Epoch:   30/60    Loss: 3.203850958108902

Epoch:   30/60    Loss: 3.1054005720615385

Epoch:   30/60    Loss: 3.1218292503356935

Epoch:   30/60    Loss: 3.126458002090454

Epoch:   30/60    Loss: 3.167111741781235

Epoch:   30/60    Loss: 3.2397584590911865

Epoch:   30/60    Loss: 3.1459299998283385

Epoch:   30/60    Loss: 2.9654383170604706

Epoch:   30/60    Loss: 3.167007639169693

Epoch:   30/60    Loss: 3.0810819640159606

Epoch:   30/60    Loss: 3.1002568950653075

Epoch:   30/60    Loss: 2.9725213961601256


Epoch:   33/60    Loss: 3.0972916464805604

Epoch:   33/60    Loss: 3.1054378352165224

Epoch:   33/60    Loss: 3.087610771417618

Epoch:   33/60    Loss: 3.137322801351547

Epoch:   33/60    Loss: 3.207550136804581

Epoch:   33/60    Loss: 3.1104293298721313

Epoch:   33/60    Loss: 2.9564541738033294

Epoch:   33/60    Loss: 3.1338079097270968

Epoch:   33/60    Loss: 3.0525307209491728

Epoch:   33/60    Loss: 3.0523040096759795

Epoch:   33/60    Loss: 2.984789976119995

Epoch:   33/60    Loss: 2.9684223103523255

Epoch:   33/60    Loss: 2.9658381021022797

Epoch:   33/60    Loss: 2.9852140748500826

Epoch:   33/60    Loss: 3.0078382341861727

Epoch:   33/60    Loss: 3.041002182006836

Epoch:   33/60    Loss: 2.976677483558655

Epoch:   33/60    Loss: 2.9239862084388735

Epoch:   33/60    Loss: 3.04994545173645

Epoch:   33/60    Loss: 2.961669871807098

Epoch:   33/60    Loss: 3.1258152792453764

Epoch:   33/60    Loss: 3.0625193269252775

Epoch:   33/60    Loss: 3.107826185464859

Epoch:   36/60    Loss: 2.9656807379722596

Epoch:   36/60    Loss: 2.96777165055275

Epoch:   36/60    Loss: 2.982442436695099

Epoch:   36/60    Loss: 3.033561665058136

Epoch:   36/60    Loss: 2.952042234659195

Epoch:   36/60    Loss: 2.9103558406829833

Epoch:   36/60    Loss: 3.0334262130260465

Epoch:   36/60    Loss: 2.934503564119339

Epoch:   36/60    Loss: 3.104118376970291

Epoch:   36/60    Loss: 3.0418269124031068

Epoch:   36/60    Loss: 3.1539673998355866

Epoch:   36/60    Loss: 3.1702045578956604

Epoch:   36/60    Loss: 3.058209401845932

Epoch:   36/60    Loss: 3.138608909845352

Epoch:   36/60    Loss: 3.084193307161331

Epoch:   36/60    Loss: 2.9993287224769594

Epoch:   36/60    Loss: 2.8763644599914553

Epoch:   36/60    Loss: 3.105853671312332

Epoch:   36/60    Loss: 3.0435115427970887

Epoch:   36/60    Loss: 2.917991832256317

Epoch:   36/60    Loss: 2.904712725877762

Epoch:   36/60    Loss: 3.048201247930527

Epoch:   36/60    Loss: 2.96268443107605

Epoch:   39/60    Loss: 3.038557509422302

Epoch:   39/60    Loss: 3.117521201133728

Epoch:   39/60    Loss: 3.0802025334835053

Epoch:   39/60    Loss: 2.9909650919437407

Epoch:   39/60    Loss: 2.8562294316291807

Epoch:   39/60    Loss: 3.055903832912445

Epoch:   39/60    Loss: 3.005458951711655

Epoch:   39/60    Loss: 2.9040640454292297

Epoch:   39/60    Loss: 2.914397147655487

Epoch:   39/60    Loss: 3.0236536610126494

Epoch:   39/60    Loss: 2.9534739577770233

Epoch:   39/60    Loss: 3.0052827398777007

Epoch:   39/60    Loss: 3.0413389294147493

Epoch:   39/60    Loss: 2.9607503249645233

Epoch:   39/60    Loss: 2.9586686508655546

Epoch:   39/60    Loss: 2.995025770664215

Epoch:   39/60    Loss: 3.0581513793468473

Epoch:   39/60    Loss: 3.120459712743759

Epoch:   39/60    Loss: 3.060416044473648

Epoch:   39/60    Loss: 3.1194103260040285

Epoch:   39/60    Loss: 3.1485910918712614

Epoch:   39/60    Loss: 3.0498586325645447

Epoch:   39/60    Loss: 3.093242623806



Epoch:   42/60    Loss: 3.0098736870288847

Epoch:   42/60    Loss: 3.0317411084175108

Epoch:   42/60    Loss: 2.967855660200119

Epoch:   42/60    Loss: 2.9548491938114165

Epoch:   42/60    Loss: 2.978021357059479

Epoch:   42/60    Loss: 3.051149995803833

Epoch:   42/60    Loss: 3.130338718891144

Epoch:   42/60    Loss: 3.0561192433834075

Epoch:   42/60    Loss: 3.1148892934322356

Epoch:   42/60    Loss: 3.1519591093063353

Epoch:   42/60    Loss: 3.0412966356277464

Epoch:   42/60    Loss: 3.0832574646472932

Epoch:   42/60    Loss: 3.151954854488373

Epoch:   42/60    Loss: 3.1058541417121885

Epoch:   42/60    Loss: 3.124259605884552

Epoch:   42/60    Loss: 3.1767401390075682

Epoch:   42/60    Loss: 3.180327879190445

Epoch:   42/60    Loss: 3.0894172797203066

Epoch:   42/60    Loss: 3.118107046842575

Epoch:   42/60    Loss: 3.073662515163422

Epoch:   42/60    Loss: 3.057678426027298

Epoch:   42/60    Loss: 3.0592990016937254

Epoch:   43/60    Loss: 3.1100704694247763

Epoch:   45/60    Loss: 3.138214866399765

Epoch:   45/60    Loss: 3.113639970779419

Epoch:   45/60    Loss: 3.1191043486595156

Epoch:   45/60    Loss: 3.1322926981449126

Epoch:   45/60    Loss: 3.158511768102646

Epoch:   45/60    Loss: 3.092940051794052

Epoch:   45/60    Loss: 3.147403843641281

Epoch:   45/60    Loss: 3.051437216758728

Epoch:   45/60    Loss: 3.063143036365509

Epoch:   45/60    Loss: 3.0337309668064116

Epoch:   46/60    Loss: 3.0976372879484426

Epoch:   46/60    Loss: 3.110047564983368

Epoch:   46/60    Loss: 3.032273302078247

Epoch:   46/60    Loss: 3.053340095043182

Epoch:   46/60    Loss: 2.9990861237049105

Epoch:   46/60    Loss: 3.093832799434662

Epoch:   46/60    Loss: 3.0983972034454346

Epoch:   46/60    Loss: 3.0014361157417295

Epoch:   46/60    Loss: 2.895834550857544

Epoch:   46/60    Loss: 3.063236252784729

Epoch:   46/60    Loss: 3.008116314411163

Epoch:   46/60    Loss: 3.002519870519638

Epoch:   46/60    Loss: 2.931067692995071

Epoch:   49/60    Loss: 3.019122073173523

Epoch:   49/60    Loss: 3.0437518105506896

Epoch:   49/60    Loss: 2.991272403240204

Epoch:   49/60    Loss: 3.088693995475769

Epoch:   49/60    Loss: 3.0891170752048493

Epoch:   49/60    Loss: 2.999826104640961

Epoch:   49/60    Loss: 2.900358872652054

Epoch:   49/60    Loss: 3.081073584318161

Epoch:   49/60    Loss: 3.0098414347171785

Epoch:   49/60    Loss: 2.99009863615036

Epoch:   49/60    Loss: 2.9393506503105162

Epoch:   49/60    Loss: 2.9425090942382814

Epoch:   49/60    Loss: 2.8990601954460145

Epoch:   49/60    Loss: 2.932772700548172

Epoch:   49/60    Loss: 2.9487368531227114

Epoch:   49/60    Loss: 3.0011843755245207

Epoch:   49/60    Loss: 2.9225366299152373

Epoch:   49/60    Loss: 2.8649790987968444

Epoch:   49/60    Loss: 2.989877316236496

Epoch:   49/60    Loss: 2.913241027832031

Epoch:   49/60    Loss: 3.0553090274333954

Epoch:   49/60    Loss: 3.0039348700046538

Epoch:   49/60    Loss: 3.108970588207245



Epoch:   52/60    Loss: 2.8773056118488314

Epoch:   52/60    Loss: 2.888235698699951

Epoch:   52/60    Loss: 2.9229613678455353

Epoch:   52/60    Loss: 2.978849330186844

Epoch:   52/60    Loss: 2.9101776342391967

Epoch:   52/60    Loss: 2.8345661537647246

Epoch:   52/60    Loss: 2.9668083910942076

Epoch:   52/60    Loss: 2.9005878484249115

Epoch:   52/60    Loss: 3.09209019780159

Epoch:   52/60    Loss: 2.9908509542942046

Epoch:   52/60    Loss: 3.102702036619186

Epoch:   52/60    Loss: 3.062076731681824

Epoch:   52/60    Loss: 2.996789535045624

Epoch:   52/60    Loss: 3.037564143657684

Epoch:   52/60    Loss: 3.0080623025894164

Epoch:   52/60    Loss: 2.94582121348381

Epoch:   52/60    Loss: 2.8164009456634522

Epoch:   52/60    Loss: 3.0184047169685364

Epoch:   52/60    Loss: 2.989757337331772

Epoch:   52/60    Loss: 2.8729828157424926

Epoch:   52/60    Loss: 2.8739992871284485

Epoch:   52/60    Loss: 2.9778012998104098

Epoch:   52/60    Loss: 2.8842021770477295


Epoch:   55/60    Loss: 2.9818870179653167

Epoch:   55/60    Loss: 3.036971787929535

Epoch:   55/60    Loss: 2.993278458595276

Epoch:   55/60    Loss: 2.9474772527217863

Epoch:   55/60    Loss: 2.815631739139557

Epoch:   55/60    Loss: 2.9901257407665254

Epoch:   55/60    Loss: 2.9414497957229613

Epoch:   55/60    Loss: 2.8832367720603944

Epoch:   55/60    Loss: 2.8809503700733186

Epoch:   55/60    Loss: 2.986396545410156

Epoch:   55/60    Loss: 2.9008788681030273

Epoch:   55/60    Loss: 2.9637061853408815

Epoch:   55/60    Loss: 2.9878115582466127

Epoch:   55/60    Loss: 2.895455240726471

Epoch:   55/60    Loss: 2.914346218109131

Epoch:   55/60    Loss: 2.926169105529785

Epoch:   55/60    Loss: 2.9722946412563322

Epoch:   55/60    Loss: 3.091877036809921

Epoch:   55/60    Loss: 2.978288801193237

Epoch:   55/60    Loss: 3.065767784833908

Epoch:   55/60    Loss: 3.0418087122440336

Epoch:   55/60    Loss: 2.9924535391330718

Epoch:   55/60    Loss: 2.9995793709754945

Epoch:   58/60    Loss: 3.011494353055954

Epoch:   58/60    Loss: 2.9144302282333374

Epoch:   58/60    Loss: 2.928942413806915

Epoch:   58/60    Loss: 2.9462935311794283

Epoch:   58/60    Loss: 3.0154632954597473

Epoch:   58/60    Loss: 3.089923846244812

Epoch:   58/60    Loss: 2.9795881843566896

Epoch:   58/60    Loss: 3.0965788426399232

Epoch:   58/60    Loss: 3.109135447263718

Epoch:   58/60    Loss: 3.0051245551109314

Epoch:   58/60    Loss: 3.027168481826782

Epoch:   58/60    Loss: 3.1139655599594116

Epoch:   58/60    Loss: 3.046295505285263

Epoch:   58/60    Loss: 3.066065661430359

Epoch:   58/60    Loss: 3.1018606140613554

Epoch:   58/60    Loss: 3.084925150871277

Epoch:   58/60    Loss: 3.054350051879883

Epoch:   58/60    Loss: 3.0427903695106506

Epoch:   58/60    Loss: 3.020717315196991

Epoch:   58/60    Loss: 3.0195130007267

Epoch:   58/60    Loss: 2.982715908050537

Epoch:   59/60    Loss: 3.0452361814677715

Epoch:   59/60    Loss: 3.0575683653354644


### Question: How did you decide on your model hyperparameters? 
For example, did you try different sequence_lengths and find that one size made the model converge faster? What about your hidden_dim and n_layers; how did you decide on those?

**Answer:** 
The biggest change happened when I removed the dropout that I originally had. With dropout, the loss never went below 4.0; without it, the model was able to converge. My reasoning for leaving dropout out is that overfitting is not a concern here: the goal is to reproduce Seinfeld scripts as closely as possible.

Once the model was able to converge, I left the other hyperparameters as they were at that point.





---
# Checkpoint

After running the above training cell, your model will be saved by name, `trained_rnn`, and if you save your notebook progress, **you can pause here and come back to this code at another time**. You can resume your progress by running the next cell, which will load in our word:id dictionaries _and_ load in your saved model by name!

In [16]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import torch
import helper
import problem_unittests as tests

_, vocab_to_int, int_to_vocab, token_dict = helper.load_preprocess()
trained_rnn = helper.load_model('./save/trained_rnn')

## Generate TV Script
With the network trained and saved, you'll use it to generate a new, "fake" Seinfeld TV script in this section.

### Generate Text
To generate the text, the network needs to start with a single word and repeat its predictions until it reaches a set length. You'll be using the `generate` function to do this. It takes a word id to start with, `prime_id`, and generates a set length of text, `predict_len`. Also note that it uses topk sampling to introduce some randomness in choosing the most likely next word, given an output set of word scores!
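To make the top-k step concrete, here is a minimal, framework-free sketch of it. It is not the notebook's `generate` function: `top_k_sample` and the toy scores are hypothetical names used only for illustration, and it relies on Python's `random` module rather than NumPy or PyTorch.

```python
import math
import random

def top_k_sample(scores, k=5, rng=random):
    """Sample a word id from the k highest-scoring entries of `scores`.

    Hypothetical helper for illustration; `scores` are raw (unnormalized)
    word scores, one per word id.
    """
    # softmax over all scores (subtract the max for numerical stability)
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]

    # keep the k most probable word ids
    top_ids = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    top_probs = [probs[i] for i in top_ids]

    # random.choices normalizes the weights, so the k probabilities
    # do not need to sum to 1
    return rng.choices(top_ids, weights=top_probs, k=1)[0]

toy_scores = [2.0, 0.5, 1.5, -1.0, 0.0, 3.0]  # made-up scores for 6 word ids
word_id = top_k_sample(toy_scores, k=3)
```

With `k=1` this reduces to always picking the single most likely word; larger values of `k` trade some coherence for more variety in the generated script.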

In [17]:
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
import numpy as np
import torch.nn.functional as F

def generate(rnn, prime_id, int_to_vocab, token_dict, pad_value, predict_len=100):
    """
    Generate text using the neural network
    :param rnn: The PyTorch Module that holds the trained neural network
    :param prime_id: The word id to start the first prediction
    :param int_to_vocab: Dict of word id keys to word values
    :param token_dict: Dict of punctuation symbol keys to token values
    :param pad_value: The value used to pad a sequence
    :param predict_len: The length of text to generate
    :return: The generated text
    """
    rnn.eval()
    
    # create a sequence (batch_size=1) with the prime_id
    current_seq = np.full((1, sequence_length), pad_value)
    current_seq[-1][-1] = prime_id
    predicted = [int_to_vocab[prime_id]]
    
    for _ in range(predict_len):
        if train_on_gpu:
            current_seq = torch.LongTensor(current_seq).cuda()
        else:
            current_seq = torch.LongTensor(current_seq)
        
        # initialize the hidden state
        hidden = rnn.init_hidden(current_seq.size(0))
        
        # get the output of the rnn
        output, _ = rnn(current_seq, hidden)
        
        # get the next word probabilities
        p = F.softmax(output, dim=1).data
        if(train_on_gpu):
            p = p.cpu() # move to cpu
         
        # use top_k sampling to get the index of the next word
        top_k = 5
        p, top_i = p.topk(top_k)
        top_i = top_i.numpy().squeeze()
        
        # select the likely next word index with some element of randomness
        p = p.numpy().squeeze()
        word_i = np.random.choice(top_i, p=p/p.sum())
        
        # retrieve that word from the dictionary
        word = int_to_vocab[word_i]
        predicted.append(word)     
        
        # the generated word becomes the next "current sequence" and the cycle can continue
        if train_on_gpu:
            current_seq = current_seq.cpu()  # np.roll cannot convert a CUDA tensor
        current_seq = np.roll(current_seq, -1, 1)
        current_seq[-1][-1] = word_i
    
    gen_sentences = ' '.join(predicted)
    
    # Replace punctuation tokens
    for key, token in token_dict.items():
        gen_sentences = gen_sentences.replace(' ' + token.lower(), key)
    gen_sentences = gen_sentences.replace('\n ', '\n')
    gen_sentences = gen_sentences.replace('( ', '(')
    
    # return all the sentences
    return gen_sentences
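The last step of `generate`, turning punctuation tokens back into symbols, can be tried on its own with a toy input. This is a sketch under assumptions: `restore_punctuation` is a hypothetical name, and the token strings in `toy_token_dict` are made up for illustration; the real dictionary comes from `helper.load_preprocess()`.

```python
# Toy punctuation dictionary (symbol -> token). The real token strings
# are defined elsewhere in the project and may differ.
toy_token_dict = {
    '.': '||period||',
    ',': '||comma||',
    '?': '||question_mark||',
    '\n': '||return||',
}

def restore_punctuation(words, token_dict):
    """Join generated words and swap punctuation tokens back to symbols."""
    text = ' '.join(words)
    for symbol, token in token_dict.items():
        # remove the space in front of the token so "hello ||period||" -> "hello."
        text = text.replace(' ' + token.lower(), symbol)
    # a newline symbol leaves a stray space at the start of the next line
    return text.replace('\n ', '\n')

words = ['jerry:', 'hello', '||question_mark||', '||return||',
         'kramer:', 'hey', '||comma||', 'jerry', '||period||']
script = restore_punctuation(words, toy_token_dict)
print(script)
```

The replacement works on the joined string rather than word by word, which is why the leading space is part of the search pattern.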

### Generate a New Script
It's time to generate the text. Set `gen_length` to the length of TV script you want to generate and set `prime_word` to one of the following to start the prediction:
- "jerry"
- "elaine"
- "george"
- "kramer"

You can set the prime word to _any word_ in our dictionary, but it's best to start with a name for generating a TV script. (You can also start with any other names you find in the original text file!)
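As a rough illustration of what happens to the prime word, the sketch below builds the padded starting window and slides it forward by one word. The helper names, the toy vocabulary, and the `'<PAD>'` token are assumptions for illustration; the trailing colon mirrors the `prime_word + ':'` lookup used in the generation cell.

```python
def build_prime_window(prime_word, vocab_to_int, pad_id, sequence_length):
    """Return a left-padded window of word ids that ends in the prime word."""
    key = prime_word + ':'  # the generation cell looks up the name with a colon
    if key not in vocab_to_int:
        raise KeyError(f"{key!r} is not in the vocabulary")
    window = [pad_id] * sequence_length
    window[-1] = vocab_to_int[key]
    return window

def slide(window, next_id):
    """Drop the oldest id and append the newly generated one."""
    return window[1:] + [next_id]

# hypothetical toy vocabulary
toy_vocab = {'<PAD>': 0, 'jerry:': 1, 'hello': 2}
window = build_prime_window('jerry', toy_vocab, toy_vocab['<PAD>'], sequence_length=4)
next_window = slide(window, toy_vocab['hello'])
```

`slide` has the same effect as rolling the sequence left by one position and overwriting the last entry, which is how `generate` feeds each predicted word back in.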

In [18]:
# run the cell multiple times to get different results!
gen_length = 400 # modify the length to your preference
prime_word = 'jerry' # name for starting the script

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
pad_word = helper.SPECIAL_WORDS['PADDING']
generated_script = generate(trained_rnn, vocab_to_int[prime_word + ':'], int_to_vocab, token_dict, vocab_to_int[pad_word], gen_length)
print(generated_script)



jerry: brandes cooling footnote:. writer's knowwork footnote: kroft) pitched

jerry: hello?

kramer: hey, you gotta get the cash, you want to be honest with that man, you got.

morty: what do you mean?

helen: it's not that guy.

george: oh, i got it.

jerry: i don't know how you can do that.

morty: i don't know, i'm so sorry.

chiles: help me!

kramer: hey, i got a lot of static. poor i was there.

elaine: so, i got it.

george: i can't believe i was in the position with the way to talk about.

jerry: what?

elaine: well, i don't know.

jerry: why did you not?

helen: the sticks. they don't want to hear soup and she said that would have to be. i have been in this court, you're a cosmo he was in front of a restaurant tribe- the moors. we just want to get ready, but i would.

helen: george, i am, i'm gonna be.

george: jerry, i am so grateful for you to get the pound. i was with the of the.

george: i love this contest...

elaine: i think i have a pee party and they are.

kramer: you k

#### Save your favorite scripts

Once you have a script that you like (or find interesting), save it to a text file!

In [19]:
# save script to a text file
with open("generated_script_1.txt", "w") as f:
    f.write(generated_script)

# The TV Script is Not Perfect
It's OK if the TV script doesn't make perfect sense. It should look like alternating lines of dialogue; here is an example of a few generated lines.

### Example generated script

>jerry: what about me?
>
>jerry: i don't have to wait.
>
>kramer:(to the sales table)
>
>elaine:(to jerry) hey, look at this, i'm a good doctor.
>
>newman:(to elaine) you think i have no idea of this...
>
>elaine: oh, you better take the phone, and he was a little nervous.
>
>kramer:(to the phone) hey, hey, jerry, i don't want to be a little bit.(to kramer and jerry) you can't.
>
>jerry: oh, yeah. i don't even know, i know.
>
>jerry:(to the phone) oh, i know.
>
>kramer:(laughing) you know...(to jerry) you don't know.

You can see that there are multiple characters that say (somewhat) complete sentences, but it doesn't have to be perfect! It takes quite a while to get good results, and often you'll have to use a smaller vocabulary (and discard uncommon words) or get more data. The Seinfeld dataset is about 3.4 MB, which is big enough for our purposes; for script generation you'll generally want more than 1 MB of text.
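One way to shrink the vocabulary is to replace words that occur fewer than some number of times with a single placeholder. The sketch below is only one possible approach: `trim_vocab`, the `min_count` threshold, and the `'<UNK>'` placeholder are hypothetical and are not part of the project's helper code.

```python
from collections import Counter

def trim_vocab(words, min_count=2, unk_token='<UNK>'):
    """Replace words seen fewer than `min_count` times with `unk_token`."""
    counts = Counter(words)
    keep = {w for w, c in counts.items() if c >= min_count}
    trimmed = [w if w in keep else unk_token for w in words]
    return trimmed, keep

toy_words = 'jerry: hello kramer: hello jerry: footnote'.split()
trimmed, vocab = trim_vocab(toy_words, min_count=2)
print(trimmed)
```

A smaller vocabulary means a smaller output layer and fewer rare words for the model to place, at the cost of placeholders showing up in the generated text unless they are filtered out.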

# Submitting This Project
When submitting this project, make sure to run all the cells before saving the notebook. Save the notebook file as "dlnd_tv_script_generation.ipynb" and save another copy as an HTML file by clicking "File" -> "Download as" -> "HTML". Include the "helper.py" and "problem_unittests.py" files in your submission. Once you have all of these files, compress them into one zip file for submission.