# TV Script Generation

In this project, I generate my own [Seinfeld](https://en.wikipedia.org/wiki/Seinfeld) TV scripts using RNNs. The data comes from part of the [Seinfeld dataset](https://www.kaggle.com/thec03u5/seinfeld-chronicles#scripts.csv) of scripts from 9 seasons. The recurrent neural network I build generates a new, "fake" TV script based on patterns it recognizes in this training data.

## Get the Data

* Access the data loaded in the `./data/Seinfeld_Scripts.txt` file
* Take a look at some samples

In [1]:
"""
DON'T MODIFY CODE BELOW
"""
# load in data
import helper
data_dir = './data/Seinfeld_Scripts.txt'
text = helper.load_data(data_dir)

## Explore the Data
Use `view_line_range` to view different parts of the data.

In [2]:

view_line_range = (0, 10)

"""
DON'T MODIFY CODE BELOW
"""
import numpy as np

print('Dataset Stats')
print('Roughly the number of unique words: {}'.format(len({word: None for word in text.split()})))

lines = text.split('\n')
print('Number of lines: {}'.format(len(lines)))
word_count_line = [len(line.split()) for line in lines]
print('Average number of words in each line: {}'.format(np.average(word_count_line)))

print()
print('The lines {} to {}:'.format(*view_line_range))
print('\n'.join(text.split('\n')[view_line_range[0]:view_line_range[1]]))

Dataset Stats
Roughly the number of unique words: 46367
Number of lines: 109233
Average number of words in each line: 5.544240293684143

The lines 0 to 10:
jerry: do you know what this is all about? do you know, why were here? to be out, this is out...and out is one of the single most enjoyable experiences of life. people...did you ever hear people talking about we should go out? this is what theyre talking about...this whole thing, were all out now, no one is home. not one person here is home, were all out! there are people trying to find us, they dont know where we are. (on an imaginary phone) did you ring?, i cant find him. where did he go? he didnt tell me where he was going. he must have gone out. you wanna go out you get ready, you pick out the clothes, right? you take the shower, you get all ready, get the cash, get your friends, the car, the spot, the reservation...then youre standing around, what do you do? you go we gotta be getting back. once youre out, you wanna get back! y

We can see that all texts are lowercase, and each new line of dialogue is separated by a newline character `\n`.

---
## Implement Pre-processing Functions
Initial data pre-processing functions that:
- Create a lookup table
- Tokenize punctuation

### Lookup Table
To create a word embedding, we must transform the words to ids.

In [3]:
import problem_unittests as tests
from collections import Counter


def create_lookup_tables(text):
    """
    Create lookup tables for vocabulary
    
    Params
    ------
    text(str): The text of tv scripts split into words
    
    Returns
    -------
    vocab_to_int, int_to_vocab (tuple): tuple of dicts
    """
    pad_id = text[-1]
    word_counts = Counter(text[:-1])
    # sort words from most to least frequent
    sorted_vocab = sorted(word_counts, key=word_counts.get, reverse=True)
    # create two dictionaries: word to id, id to word
    int_to_vocab = {ii+1: word for ii, word in enumerate(sorted_vocab)}
    int_to_vocab[0] = pad_id
    vocab_to_int = {word: ii for ii, word in int_to_vocab.items()}
    
    return vocab_to_int, int_to_vocab
    
"""
DON'T MODIFY CODE BELOW
"""
tests.test_create_lookup_tables(create_lookup_tables)

Tests Passed
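As a quick illustration of what the two dictionaries look like, here is a minimal standalone sketch on a toy word list. It is not the function above: `build_lookup` and `toy_words` are hypothetical names, and the sketch reserves id 0 for a `<PAD>` token explicitly instead of reading the padding word from the end of the input.

```python
from collections import Counter


def build_lookup(words, pad_token="<PAD>"):
    """Toy vocabulary lookup (hypothetical helper, for illustration only)."""
    counts = Counter(words)
    # Most frequent words get the smallest ids; id 0 is reserved for padding.
    ordered = [word for word, _ in counts.most_common()]
    int_to_vocab = {0: pad_token}
    int_to_vocab.update({i: word for i, word in enumerate(ordered, start=1)})
    vocab_to_int = {word: i for i, word in int_to_vocab.items()}
    return vocab_to_int, int_to_vocab


toy_words = ["jerry", "hello", "george", "hello", "jerry", "hello"]
vocab_to_int, int_to_vocab = build_lookup(toy_words)
print(vocab_to_int)
print(int_to_vocab)
```

Sorting by frequency is a convenience rather than a requirement; any one-to-one mapping between words and ids would work for the embedding layer.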


### Tokenize Punctuation
We'll be splitting the script into a word array using spaces as delimiters.  However, punctuations like periods and exclamation marks can create multiple ids for the same word. For example, "bye" and "bye!" would generate two different word ids.

The function `token_lookup` will return a dict that will be used to tokenize symbols like "!" into "||Exclamation_Mark||".  

This dictionary will be used to tokenize the symbols and add a delimiter (space) around each one. This separates each symbol into its own word, making it easier for the neural network to predict the next word.

In [4]:
def token_lookup():
    """
    Generate a dict to turn punctuation into a token.
    
    Returns
    -------
    Tokenized dictionary where the key is the punctuation and the value is the token
    """
    punctuations = ['.', ',', ';', '?', 
                    '!', '(', ')', '-', 
                    '\n', '"']
    punctuation_names = ['Period', 'Comma', 'Semicolon', 'Question_Mark',
                         'Exclamation_Mark', 'Left_Paren', 'Right_Paren', 'Dash',
                         'Return', 'Quote_Mark']
    punc_dict = {punctuations[i]: '||{}||'.format(punctuation_names[i]) 
                 for i in range(len(punctuations))}
        
    return punc_dict

"""
DO NOT MODIFY CODE BELOW
"""
tests.test_tokenize(token_lookup)

Tests Passed
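To show how such a dictionary is applied, here is a small sketch that replaces each symbol with its space-padded token and then splits on whitespace. `tokenize_punctuation` and `toy_dict` are hypothetical names used only for this illustration; the actual replacement in this project happens inside the `helper` module.

```python
def tokenize_punctuation(line, punc_dict):
    """Replace each punctuation mark with its space-padded token, then split."""
    for symbol, token in punc_dict.items():
        line = line.replace(symbol, " {} ".format(token))
    return line.split()


# A small subset of the kind of dictionary token_lookup() returns
toy_dict = {"!": "||Exclamation_Mark||", ",": "||Comma||"}
words = tokenize_punctuation("bye, bye!", toy_dict)
print(words)
```

After this step, both occurrences of "bye" are the same word, and each punctuation mark is a separate word of its own.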


## Pre-process all the data and save it

Running the code cell below will pre-process all the data and save it to file. 

In [5]:
"""
DO NOT MODIFY CODE BELOW
"""
# pre-process training data
helper.preprocess_and_save_data(data_dir, token_lookup, create_lookup_tables)

# Checkpoint
First checkpoint. The preprocessed data has been saved to disk.

In [6]:
"""
DO NOT MODIFY CODE BELOW
"""
import helper
import problem_unittests as tests

int_text, vocab_to_int, int_to_vocab, token_dict = helper.load_preprocess()

## Build the Neural Network
Below are components necessary to build an RNN

### Check Access to GPU

In [7]:
"""
DO NOT MODIFY CODE BELOW
"""
import torch

# Check for a GPU
train_on_gpu = torch.cuda.is_available()
if not train_on_gpu:
    print('No GPU found. Please use a GPU to train your neural network.')

## Input

### Batching & Dataloader

Using our preprocessed input data and PyTorch's `TensorDataset` and `DataLoader` classes, we will load batched sequences that can be iterated over.

There are a couple of ways to do this. I simply looped over the start positions in the token list (stopping one sequence length before the end to avoid an indexing error) and appended each feature sequence and its corresponding target value.

Each feature has the indicated sequence length, while the target is the single value that follows it in the text. In the `batch_data` implementation below, consecutive windows start `seq_len` tokens apart, so they do not overlap; the worked example that follows uses a shift of one index to illustrate the feature/target relationship. The `DataLoader` can shuffle the sequences to randomize the inputs when `shuffled=True` (the default here is `False`). It also handles batching by the specified batch size, so we don't need to compute the batches beforehand.

Example of batch_data function usage:

Let's say we have data, `words` and desired sequence length:
```
words = [1, 2, 3, 4, 5, 6, 7]
sequence_length = 4 
```

Our first `feature_tensor` should contain the values:
```
[1, 2, 3, 4]
```
And the corresponding `target_tensor` should just be the next "word"/tokenized word value:
```
5
```
This should continue with the second `feature_tensor`, `target_tensor` being:
```
[2, 3, 4, 5]  # features
6             # target
```
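The feature/target pairs in this example can be reproduced with a few lines of plain Python. This is only a sketch of the windowing idea, not the `batch_data` function itself; `sliding_windows` and its `step` parameter are hypothetical names. With `step=1` the windows overlap as in the example, while `step=sequence_length` would give non-overlapping windows.

```python
def sliding_windows(words, sequence_length, step=1):
    """Return (features, targets): each target is the word right after its window."""
    features, targets = [], []
    # Stop early enough that every window still has a following word
    for start in range(0, len(words) - sequence_length, step):
        features.append(words[start:start + sequence_length])
        targets.append(words[start + sequence_length])
    return features, targets


words = [1, 2, 3, 4, 5, 6, 7]
features, targets = sliding_windows(words, sequence_length=4)
for feature, target in zip(features, targets):
    print(feature, "->", target)
```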

In [8]:
# TRAIN SET: Get rid of double returns and join the strings back with one return
tokens = '\n'.join(text.lower().split('\n\n'))

# Substitute all punctuations with token id
for key, tok in token_dict.items():
    tokens = tokens.replace(key, ' {} '.format(tok.lower()))

# Finally, tokenize all words
tokens = [vocab_to_int[voc] for voc in tokens.split()]

In [9]:
from torch.utils.data import TensorDataset, DataLoader
import torch
import numpy as np

def batch_data(tokenlist, seq_len, batch_size, shuffled=False):
    """
    Batch the neural network data using DataLoader
    
    Params
    ------
    tokenlist(list): The word ids of the TV scripts
    seq_len(int): The sequence length of each batch
    batch_size(int): The size of each batch; number of sequences in a batch
    shuffled(bool): Setting to shuffle dataset (default: False)
    
    Returns
    -------
    DataLoader with batched data
    """

    # Define batch specifications
    batch_vol = seq_len * batch_size # Total number of elems in each batch
    n_batches = len(tokenlist)//batch_vol # Batch counts
    # Get tokens up to load size fit
    tokens = tokenlist[:(n_batches*batch_vol)+seq_len]
    # Create input batches and target batches
    batch_x = np.array(([tokens[s:s+seq_len] for s in range(0, len(tokens)-seq_len, seq_len)])).squeeze()
    batch_y = np.array([tokens[s+seq_len] for s in range(0, len(tokens)-seq_len, seq_len)])[:, None]

    # Return Tensor type DataLoader
    data = TensorDataset(torch.from_numpy(batch_x), torch.from_numpy(batch_y.squeeze()))
    data_loader = DataLoader(data, shuffle=shuffled, batch_size=batch_size)
    
    return data_loader



### Testing batch_data function

Below, we generate some test data and define a dataloader using the `batch_data` function defined above. Then we get a sample batch of inputs `sample_x` and targets `sample_y` from the dataloader.

It should return something like the following (likely in a different order, if shuffled):

```
torch.Size([10, 5])
tensor([[ 28,  29,  30,  31,  32],
        [ 21,  22,  23,  24,  25],
        [ 17,  18,  19,  20,  21],
        [ 34,  35,  36,  37,  38],
        [ 11,  12,  13,  14,  15],
        [ 23,  24,  25,  26,  27],
        [  6,   7,   8,   9,  10],
        [ 38,  39,  40,  41,  42],
        [ 25,  26,  27,  28,  29],
        [  7,   8,   9,  10,  11]])

torch.Size([10])
tensor([ 33,  26,  22,  39,  16,  28,  11,  43,  30,  12])
```

### Sizes
`sample_x` should be of size `(batch_size, sequence_length)` or (10, 5) in this case and `sample_y` should just have one dimension: batch_size (10). 

### Values
The targets, `sample_y`, are the *next* value in the ordered test_text data. So, for an input sequence `[ 28,  29,  30,  31,  32]` that ends with the value `32`, the corresponding output should be `33`.

In [10]:
# test dataloader

test_text = tokens[:1000]
t_loader = batch_data(test_text, seq_len=20, batch_size=8)

data_iter = iter(t_loader)
sample_x, sample_y = next(data_iter)

print(sample_x.shape)
print(sample_x)
print()
print(sample_y.shape)
print(sample_y)

torch.Size([8, 20])
tensor([[    8,    35,     5,    28,    19,    25,    23,    51,    59,
             4,    35,     5,    28,     3,    84,   122,    63,     4,
             9,    55],
        [   48,     3,    25,    23,    48,     2,     2,     2,    18,
            48,    23,    83,    21,     7,  1253,   546,  8783,  7190,
            21,   242],
        [    2,   150,     2,     2,     2,    85,     5,   201,   239,
           150,   209,    59,    56,   136,    65,    48,     4,    25,
            23,    19],
        [  678,   209,    59,     2,     2,     2,    25,   221,   127,
             3,   122,    51,    48,    87,     3,    27,    83,    23,
           290,     2],
        [   46,    83,   375,    63,    23,   290,     3,   122,    51,
            48,    11,    77,    49,   150,   272,     9,   249,   192,
             3,    66],
        [  205,    28,   129,    56,    49,     2,    12,    29,    99,
          2745,   173,    13,    85,     5,  1367,     4,     3,    

---
## Build the Neural Network
I am going to build a recurrent neural network with an LSTM, using PyTorch. The architecture consists of:
 - `__init__` - The initialization function.
 - `init_hidden` - The initialization function for an LSTM hidden state.
 - `forward` - The forward propagation function.
 
The initialize function should create the layers of the neural network and save them to the class. The forward propagation function will use these layers to run forward propagation and generate an output and a hidden state.

The output of this model is the *last* batch of word scores after a complete sequence has been processed. That is, for each input sequence of words, we only want to output the word scores for a single, most likely, next word.
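To make "only the last step" concrete, here is a toy sketch using nested Python lists in place of tensors. The values in `out` are made up for illustration; the list comprehension plays the role of the tensor indexing `out[:, -1]`.

```python
# Toy scores with shape (batch_size=2, seq_length=3, output_size=2)
out = [
    [[0.1, 0.9], [0.4, 0.6], [0.7, 0.3]],  # sequence 0
    [[0.5, 0.5], [0.2, 0.8], [0.1, 0.9]],  # sequence 1
]

# Keep only the scores from the last time step of every sequence
final_out = [sequence_scores[-1] for sequence_scores in out]
print(final_out)
```

The result has one row of word scores per input sequence, which is the shape the loss function needs when each sequence has a single target word.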


In [11]:
import torch.nn as nn
import torch.optim as optim

class RNN(nn.Module):
    
    def __init__(self, vocab_size, output_size, embedding_dim, hidden_dim, n_layers, dropout=0.5):
        """
        Initialize the PyTorch RNN Module
        
        Params
        ------
        vocab_size(int): Number of input dimensions of the neural network
            (size of the vocabulary)
        output_size(int): Number of output dimensions of the neural network
        embedding_dim(int): Size of embeddings, if any
        hidden_dim(int): Size of the hidden layer outputs
        n_layers(int): Number of hidden layers
        dropout(float): Dropout to add in between LSTM layers
        """
        super(RNN, self).__init__()
        self.output_size = output_size
        self.n_layers = n_layers
        self.hidden_dim = hidden_dim
        self.embed_dim = embedding_dim
        
        # Embedding and LSTM layers
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, n_layers,
                            dropout=dropout, batch_first=True)
        
        # Fully connected Linear output layer
        self.fc = nn.Linear(hidden_dim, output_size)
    
    
    def forward(self, nn_input, hidden):
        """
        Forward propagation of the neural network
        
        Params
        ------
        nn_input(Tensors): The input to the neural network
        hidden(Tensors): The hidden state
        
        Returns
        -------
        Two Tensors, the output of the neural network and the latest
        hidden state
        """
        batch_size = nn_input.size(0)
        nn_input = nn_input.long()
        embeds = self.embedding(nn_input)
        # Get outputs and new hidden state from lstm
        lstm_out, hidden = self.lstm(embeds, hidden)
        #lstm_out = self.dropout(lstm_out)
        # Stack lstm outputs using contiguous and view to reshape
        lstm_out = lstm_out.contiguous().view(-1, self.hidden_dim)
        
        # Pass through fc layer
        out = self.fc(lstm_out)
        
        # Reshape output to (batch_size, seq_length, output_size)
        out = out.view(batch_size, -1, self.output_size)
        # Get last batch output
        final_out = out[:, -1] # so this needs to be the last value in each row

        # return one batch of output word scores and the hidden state
        return final_out, hidden
    
    
    def init_hidden(self, batch_size):
        '''
        Initialize the hidden state of an LSTM/GRU
        
        Params
        ------
        batch_size(int): batch_size of the hidden state
        
        Returns
        -------
        Hidden state of dimensions, (n_layers, batch_size, hidden_dim)
        '''
        # Two new tensors with sizes stated above, 
        # initialized to zero then move to GPU if available
        weight = next(self.parameters()).data
        
        if train_on_gpu:
            hidden = (weight.new(self.n_layers, batch_size, self.hidden_dim).zero_().cuda(),
                      weight.new(self.n_layers, batch_size, self.hidden_dim).zero_().cuda())
        else:
            hidden = (weight.new(self.n_layers, batch_size, self.hidden_dim).zero_(),
                      weight.new(self.n_layers, batch_size, self.hidden_dim).zero_())
        
        return hidden

"""
DO NOT MODIFY CODE BELOW
"""
tests.test_rnn(RNN, train_on_gpu)

Tests Passed


### Define forward and backpropagation

Within the training loop, the function `forward_back_prop` will be called as demonstrated below, to implement forward and back propagation of the RNN model:
```
loss, hidden = forward_back_prop(decoder, decode_optim, criterion, inputs, targets, hidden)
```

And it should return the average loss over a batch and the hidden state returned by a call to `RNN(inputs, hidden)`. 


In [12]:
def forward_back_prop(decoder, decode_optim, criterion, inputs, 
                          targets, hidden, clip=0.5):
    """
    Forward and backward propagation on the neural network
    
    Params
    ------
    decoder: Pytorch Module that holds the recurrent neural network
    decode_optim: Pytorch optimizer for the neural network
    criterion: Pytorch loss function
    inputs: A batch of input to the neural network
    targets: The target output for the batch of input
    hidden: The hidden state carried over from the previous batch (tuple of Tensors)
    clip(float): Size of gradient clipping (default=0.5)
    
    Returns
    -------
    The loss and the latest hidden state Tensor
    """

    if train_on_gpu:
        inputs, targets = inputs.cuda(), targets.cuda()
    
    # Detach the hidden state from its history so that we don't
    # backpropagate through all previous batches
    h = tuple([each.data for each in hidden])
    decoder.zero_grad()
    
    # Get outputs from the model
    output, h = decoder(inputs, h)
    
    # Calculate the loss and backprop
    loss = criterion(output.squeeze(), targets)
    loss.backward()
    # clip gradient to prevent exploding gradient problem
    nn.utils.clip_grad_norm_(decoder.parameters(), clip)
    # use the optimizer passed in, not a global variable
    decode_optim.step()
    # return the loss over a batch and the hidden state produced by our model
    return loss.item(), h

# Note that these tests aren't completely extensive.
# they are here to act as general checks on the expected outputs of your functions
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_forward_back_prop(RNN, forward_back_prop, train_on_gpu)

Tests Passed
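The gradient clipping step can be understood with a pure-Python sketch of norm-based clipping. This is an illustration of the idea rather than PyTorch's implementation: `clip_by_global_norm` is a hypothetical helper that works on a flat list of numbers, and the small `eps` term is an assumption added to avoid division by zero.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale gradient values so that their overall L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    scale = max_norm / (total_norm + eps)
    if scale < 1.0:
        # Shrink every value by the same factor, so the direction is preserved
        grads = [g * scale for g in grads]
    return grads, total_norm


grads = [3.0, 4.0]  # L2 norm is 5.0
clipped, norm_before = clip_by_global_norm(grads, max_norm=0.5)
norm_after = math.sqrt(sum(g * g for g in clipped))
print(norm_before, norm_after)
```

Gradients whose norm is already below the threshold are left unchanged, so clipping only intervenes when an update would otherwise be very large.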


## Neural Network Training

With the structure of the network complete and data ready to be fed in the neural network, it's time to train it.

### Train Loop

The `train_rnn` function below trains the network over all the batches for the given number of epochs. Progress is printed every `show_every_n_batches` batches.

In [13]:
"""
DO NOT MODIFY CODE BELOW
"""

def train_rnn(rnn, batch_size, optimizer, criterion, n_epochs, show_every_n_batches=500):
    batch_losses = []
    
    rnn.train()

    print("Training for %d epoch(s)..." % n_epochs)
    for epoch_i in range(1, n_epochs + 1):
        
        # initialize hidden state
        hidden = rnn.init_hidden(batch_size)
        
        for batch_i, (inputs, labels) in enumerate(train_loader, 1):

            # make sure you iterate over completely full batches, only
            n_batches = len(train_loader.dataset)//batch_size
            if(batch_i > n_batches):
                break
            
            # forward, back prop
            loss, hidden = forward_back_prop(rnn, optimizer, criterion, inputs, labels, hidden)          
            # record loss
            batch_losses.append(loss)

            # printing loss stats
            if batch_i % show_every_n_batches == 0:
                print('Batch {} --  Epoch: {:>4}/{:<4} | Loss: {}\n'.format(
                    batch_i, epoch_i, n_epochs, np.average(batch_losses)))
                batch_losses = []

    # returns a trained rnn
    return rnn

### Hyperparameters

Set and train the neural network with the following parameters:
- Set `sequence_length` to the length of a sequence.
- Set `batch_size` to the batch size.
- Set `num_epochs` to the number of epochs to train for.
- Set `learning_rate` to the learning rate for an Adam optimizer.
- Set `vocab_size` to the number of unique tokens in our vocabulary.
- Set `output_size` to the desired size of the output.
- Set `embedding_dim` to the embedding dimension; smaller than the vocab_size.
- Set `hidden_dim` to the hidden dimension of your RNN.
- Set `n_layers` to the number of layers/cells in your RNN.
- Set `show_every_n_batches` to the number of batches at which the neural network should print progress.

- Another option is to tweak these parameters and/or the layers in the `RNN` class.

In [17]:
# Data params

# Sequence Length (number of words in a sequence)
sequence_length = 20
# Batch Size
batch_size = 32

# data loader - do not change
train_loader = batch_data(tokens, sequence_length, batch_size, shuffled=False)

In [18]:
# Training parameters
# Number of Epochs
num_epochs = 20
# Learning Rate
learning_rate = 0.001

# Model parameters
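# Vocab size (every id in the lookup table, including the padding id 0,
# so that the embedding layer covers the largest word id)
vocab_size = len(vocab_to_int)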

# Output size
output_size = vocab_size
# Embedding Dimension
embedding_dim = 400
# Hidden Dimension
hidden_dim = 256
# Number of RNN Layers
n_layers = 2

# Show stats for every n number of batches
show_every_n_batches = 500

### Train

- [x] **loss less than 3.5.** 


In [19]:
"""
DO NOT MODIFY CODE BELOW
"""

# create model and move to gpu if available
rnn = RNN(vocab_size, output_size, embedding_dim, hidden_dim, n_layers)
if train_on_gpu:
    rnn.cuda()

# defining loss and optimization functions for training
optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate)
criterion = nn.CrossEntropyLoss()

# training the model
trained_rnn = train_rnn(rnn, batch_size, optimizer, criterion, 
                        num_epochs, show_every_n_batches)

# saving the trained model
helper.save_model('./save/trained_rnn', trained_rnn)
print('Model Trained and Saved')

Training for 20 epoch(s)...
Batch 500 --  Epoch:    1/20   | Loss: 6.045845891952514

Batch 1000 --  Epoch:    1/20   | Loss: 5.4156662631034855

Batch 500 --  Epoch:    2/20   | Loss: 4.966112355784614

Batch 1000 --  Epoch:    2/20   | Loss: 4.589292972564698

Batch 500 --  Epoch:    3/20   | Loss: 4.431584561520284

Batch 1000 --  Epoch:    3/20   | Loss: 4.198808817386627

Batch 500 --  Epoch:    4/20   | Loss: 4.070874601897627

Batch 1000 --  Epoch:    4/20   | Loss: 3.909436241149902

Batch 500 --  Epoch:    5/20   | Loss: 3.786748048397574

Batch 1000 --  Epoch:    5/20   | Loss: 3.649227822780609

Batch 500 --  Epoch:    6/20   | Loss: 3.542469101967198

Batch 1000 --  Epoch:    6/20   | Loss: 3.417153212070465

Batch 500 --  Epoch:    7/20   | Loss: 3.3240475592636827

Batch 1000 --  Epoch:    7/20   | Loss: 3.2220018854141235

Batch 500 --  Epoch:    8/20   | Loss: 3.1139356203303477

Batch 1000 --  Epoch:    8/20   | Loss: 3.0304032826423644

Batch 500 --  Epoch:    9/20


### Question: How did you decide on your model hyperparameters? 
For example, did you try different sequence_lengths and find that one size made the model converge faster? What about your hidden_dim and n_layers; how did you decide on those?

**Answer:** With my batch loading settings, training loss decreased significantly from batch to batch and from epoch to epoch once I raised the batch size (I initially tried 16). With the larger batch size, the loss stayed low in variance throughout the 20 epochs. A larger hidden dimension also seemed to reduce loss fluctuation. I settled on an embedding dimension close to `batch_size * sequence_length`. A learning rate of 0.001 or lower seemed to work best. More than 2 LSTM layers did not yield satisfying results, because the model did not converge as fast. Although the decrease in loss slowed over the 20 epochs, I was able to reach a loss of 1.67.

---
# Checkpoint

The model is saved under the name `trained_rnn`.

In [20]:
"""
DO NOT MODIFY CODE BELOW
"""
import torch
import helper
import problem_unittests as tests

_, vocab_to_int, int_to_vocab, token_dict = helper.load_preprocess()
trained_rnn = helper.load_model('./save/trained_rnn')

## Generate TV Script
With the network trained and saved, now we can use it to generate a new, "fake" Seinfeld TV script in this section.

### Generate Text
To generate the text, the network needs to start with a single word and repeat its predictions until it reaches a set length. The `generate` function below takes a word id (`prime_id`) to start the script and generates a new script of the specified length (`predict_len`). The prediction uses top-k sampling to introduce randomness when choosing the next word from the output word scores.
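As a standalone illustration of top-k sampling, the sketch below turns raw scores into probabilities with a softmax, keeps the `k` most probable indices, and samples one of them in proportion to its probability. It uses only the standard library; `top_k_sample` and the toy `scores` are hypothetical and are not the code used in `generate`.

```python
import math
import random


def top_k_sample(scores, k=5, rng=random):
    """Sample an index from the k highest-scoring entries, weighted by probability."""
    # Softmax, subtracting the maximum for numerical stability
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Indices of the k most probable entries
    top_i = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    # random.choices normalizes the weights itself
    return rng.choices(top_i, weights=[probs[i] for i in top_i], k=1)[0]


random.seed(0)
scores = [2.0, 0.5, 1.0, -1.0, 3.0, 0.0]
samples = [top_k_sample(scores, k=2) for _ in range(1000)]
print(sorted(set(samples)))
```

Restricting the choice to the top `k` candidates keeps unlikely words out of the script, while sampling among them avoids always emitting the single most probable word.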

In [37]:
"""
DO NOT MODIFY CODE BELOW
"""
import torch.nn.functional as F

def generate(rnn, prime_id, int_to_vocab, token_dict, pad_value, predict_len=100):
    """
    Generate text using the neural network
    
    Params
    ------
    rnn(nn.Module): Pytorch Module that holds the trained neural network
    prime_id(int): The word id to start the first prediction
    int_to_vocab(dict): Dictionary of word id keys to word values
    token_dict(dict): Dictionary of punctuation token keys to punctuation values
    pad_value(int): The value used to pad a sequence
    predict_len(int): The length of text to generate
    
    Returns
    -------
    A generated text
    """
    rnn.eval()
    
    # create a sequence (batch_size=1) with the prime_id
    current_seq = np.full((1, sequence_length), pad_value)
    current_seq[-1][-1] = prime_id
    predicted = [int_to_vocab[prime_id]]
    
    for _ in range(predict_len):
        if train_on_gpu:
            current_seq = torch.LongTensor(current_seq).cuda()
        else:
            current_seq = torch.LongTensor(current_seq)
        
        # initialize the hidden state
        hidden = rnn.init_hidden(current_seq.size(0))
        
        # get the output of the rnn
        output, _ = rnn(current_seq, hidden)
        
        # get the next word probabilities
        p = F.softmax(output, dim=1).data
        if(train_on_gpu):
            p = p.cpu() # move to cpu
         
        # use top_k sampling to get the index of the next word
        top_k = 5
        p, top_i = p.topk(top_k)
        top_i = top_i.numpy().squeeze()
        
        # select the likely next word index with some element of randomness
        p = p.numpy().squeeze()
        word_i = np.random.choice(top_i, p=p/p.sum())
        
        # retrieve that word from the dictionary
        word = int_to_vocab[word_i]
        predicted.append(word)     
        
        # the generated word becomes the next "current sequence" and the cycle can continue
        if train_on_gpu:
            current_seq = current_seq.cpu()

        current_seq = np.roll(current_seq, -1, 1)
        current_seq[-1][-1] = word_i
    
    gen_sentences = ' '.join(predicted)
    
    # Replace punctuation tokens
    for key, token in token_dict.items():
        ending = ' ' if key in ['\n', '(', '"'] else ''
        gen_sentences = gen_sentences.replace(' ' + token.lower(), key)
    gen_sentences = gen_sentences.replace('\n ', '\n')
    gen_sentences = gen_sentences.replace('( ', '(')
    
    # return all the sentences
    return gen_sentences

### Generate a New Script
Set `gen_length` to the length of the TV script to generate, and use one of the following names as the `prime_word` to start the prediction (in the code, the name is followed by a colon, e.g. `'jerry:'`):
- "jerry"
- "elaine"
- "george"
- "kramer"


In [38]:
gen_length = 200 # modify the length to your preference
prime_word = 'jerry:' # name for starting the script

"""
DO NOT MODIFY CODE BELOW
"""
pad_word = helper.SPECIAL_WORDS['PADDING']
generated_script = generate(trained_rnn, vocab_to_int[prime_word], int_to_vocab, token_dict, vocab_to_int[pad_word], gen_length)
print(generated_script)



jerry:, all don't say that bad money?
kramer: yeah.
jerry: oh, i'm satisfied.
kramer: well, yeah, i'm satisfied of purses calculators with that guy.
kramer:(immaturely elaine) yeah- hey, hey- hey! hey, hey! hey.
man:(still jerry) hey, hey, look at you look at you!
jerry: ok.
kramer: our atkins at puddy place, and i said you could've with that. so, look, i guess i will get a lo out from my ears, why know what i have to talk?
george: i guess no i said.. i'll pick you, and you know, you know, but you could get out there for your milk. and he could be more men up his apartment, but i said, i'm going to be nature that.
elaine:(still confused every waitress) rusty burning, baby, we have to look at these!
george: slippery do you had to pay?
elaine: he didnt a


#### Save the output

In [39]:
# save script to a text file
with open("generated_script_1.txt", "w") as f:
    f.write(generated_script)