# TV Script Generation

In this project, you'll generate your own [Seinfeld](https://en.wikipedia.org/wiki/Seinfeld) TV scripts using RNNs. You'll be using part of the [Seinfeld dataset](https://www.kaggle.com/thec03u5/seinfeld-chronicles#scripts.csv) of scripts from 9 seasons. The neural network you'll build will generate a new, "fake" TV script based on patterns it recognizes in this training data.

## Get the Data

The data is already provided for you in `./data/Seinfeld_Scripts.txt` and you're encouraged to open that file and look at the text. 
* As a first step, we'll load in this data and look at some samples.
* Then, you'll be tasked with defining and training an RNN to generate a new script!

In [1]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
# load in data
import helper
data_dir = './data/Seinfeld_Scripts.txt'
text = helper.load_data(data_dir)

## Explore the Data
Play around with `view_line_range` to view different parts of the data. This will give you a sense of the data you'll be working with. You can see, for example, that it is all lowercase text, and each new line of dialogue is separated by a newline character `\n`.

In [2]:
view_line_range = (0, 10)

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
import numpy as np

print('Dataset Stats')
print('Roughly the number of unique words: {}'.format(len({word: None for word in text.split()})))

lines = text.split('\n')
print('Number of lines: {}'.format(len(lines)))
word_count_line = [len(line.split()) for line in lines]
print('Average number of words in each line: {}'.format(np.average(word_count_line)))

print()
print('The lines {} to {}:'.format(*view_line_range))
print('\n'.join(text.split('\n')[view_line_range[0]:view_line_range[1]]))

Dataset Stats
Roughly the number of unique words: 46367
Number of lines: 109233
Average number of words in each line: 5.544240293684143

The lines 0 to 10:
jerry: do you know what this is all about? do you know, why were here? to be out, this is out...and out is one of the single most enjoyable experiences of life. people...did you ever hear people talking about we should go out? this is what theyre talking about...this whole thing, were all out now, no one is home. not one person here is home, were all out! there are people trying to find us, they dont know where we are. (on an imaginary phone) did you ring?, i cant find him. where did he go? he didnt tell me where he was going. he must have gone out. you wanna go out you get ready, you pick out the clothes, right? you take the shower, you get all ready, get the cash, get your friends, the car, the spot, the reservation...then youre standing around, what do you do? you go we gotta be getting back. once youre out, you wanna get back! y

---
## Implement Pre-processing Functions
The first thing to do to any dataset is pre-processing.  Implement the following pre-processing functions below:
- Lookup Table
- Tokenize Punctuation

### Lookup Table
To create a word embedding, you first need to transform the words to ids.  In this function, create two dictionaries:
- Dictionary to go from the words to an id, we'll call `vocab_to_int`
- Dictionary to go from the id to word, we'll call `int_to_vocab`

Return these dictionaries in the following **tuple** `(vocab_to_int, int_to_vocab)`

In [3]:
import problem_unittests as tests
from string import punctuation

def create_lookup_tables(text):
    """
    Create lookup tables for vocabulary
    :param text: The text of tv scripts split into words
    :return: A tuple of dicts (vocab_to_int, int_to_vocab)
    """
    # TODO: Implement Function
    
    # lowercase before de-duplicating so each word gets exactly one id
    vocab = set(word.lower() for word in text)
    # reserve id 0 for the padding token
    vocab_to_int = {'<PAD>': 0}
    int_to_vocab = {0: '<PAD>'}
    for ii, word in enumerate(vocab, 1):
        vocab_to_int[word] = ii
        int_to_vocab[ii] = word
    
    # return tuple
    return (vocab_to_int, int_to_vocab)


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_create_lookup_tables(create_lookup_tables)

Tests Passed
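
As a quick sanity check, the two dictionaries should be exact inverses of each other. The sketch below uses a hypothetical variant, `build_lookup_tables`, that assigns ids by descending word frequency. That ordering is an assumption made for illustration (the function above uses arbitrary set order). It then round-trips a small word list through both tables.

```python
from collections import Counter

def build_lookup_tables(words):
    """Hypothetical variant: ids follow descending word frequency; id 0 is reserved for '<PAD>'."""
    counts = Counter(word.lower() for word in words)
    # most_common() sorts by count; words with equal counts keep first-seen order
    vocab = [word for word, _ in counts.most_common()]
    vocab_to_int = {'<PAD>': 0}
    vocab_to_int.update({word: idx for idx, word in enumerate(vocab, 1)})
    int_to_vocab = {idx: word for word, idx in vocab_to_int.items()}
    return vocab_to_int, int_to_vocab

words = 'jerry : hello newman . newman : hello jerry . hello'.split()
vocab_to_int, int_to_vocab = build_lookup_tables(words)

# encode to ids, then decode back to words
encoded = [vocab_to_int[w] for w in words]
decoded = [int_to_vocab[i] for i in encoded]
print(decoded == words)
```

Frequency ordering is optional here; any one-to-one mapping works, as long as decoding the encoded text returns the original words.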


### Tokenize Punctuation
We'll be splitting the script into a word array using spaces as delimiters. However, punctuation like periods and exclamation marks can create multiple ids for the same word. For example, "bye" and "bye!" would generate two different word ids.

Implement the function `token_lookup` to return a dict that will be used to tokenize symbols like "!" into "||Exclamation_Mark||".  Create a dictionary for the following symbols where the symbol is the key and value is the token:
- Period ( **.** )
- Comma ( **,** )
- Quotation Mark ( **"** )
- Semicolon ( **;** )
- Exclamation mark ( **!** )
- Question mark ( **?** )
- Left Parentheses ( **(** )
- Right Parentheses ( **)** )
- Dash ( **-** )
- Return ( **\n** )

This dictionary will be used to tokenize the symbols and add the delimiter (space) around each one. This separates each symbol as its own word, making it easier for the neural network to predict the next word. Make sure you don't use a value that could be confused with a word; for example, instead of using the value "dash", try using something like "||dash||".

In [4]:
def token_lookup():
    """
    Generate a dict to turn punctuation into a token.
    :return: Tokenized dictionary where the key is the punctuation and the value is the token
    """
    # TODO: Implement Function
    lookup = {}
    lookup['.'] = '||period||'
    lookup[','] = '||comma||'
    lookup['"'] = '||quotation_mark||'
    lookup[';'] = '||semicolon||'
    lookup['!'] = '||exclamation_mark||'
    lookup['?'] = '||question_mark||'
    lookup['('] = '||left_parentheses||'
    lookup[')'] = '||right_parentheses||'
    lookup['-'] = '||dash||'
    lookup['\n'] = '||return||'
        
    return lookup

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_tokenize(token_lookup)

Tests Passed
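
To see how this dictionary is meant to be used, here is a minimal sketch of the tokenization step. It assumes the preprocessing helper replaces each symbol with its token padded by spaces and then splits on whitespace; the actual helper code may differ. The names `tokenize` and `demo_tokens` are hypothetical.

```python
def tokenize(text, token_dict):
    """Replace each punctuation symbol with ' token ' and split on whitespace."""
    text = text.lower()
    for symbol, token in token_dict.items():
        text = text.replace(symbol, ' {} '.format(token))
    return text.split()

# A reduced dictionary for illustration; the full one is built by token_lookup()
demo_tokens = {'!': '||exclamation_mark||', ',': '||comma||', '\n': '||return||'}

words = tokenize('Bye, jerry!\nbye!', demo_tokens)
print(words)
```

After tokenization, "bye," and "bye!" both contribute the same word `bye`, and the punctuation becomes separate vocabulary entries.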


## Pre-process all the data and save it

Running the code cell below will pre-process all the data and save it to file. You're encouraged to look at the code for `preprocess_and_save_data` in the `helper.py` file to see what it's doing in detail, but you do not need to change this code.

In [5]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
# pre-process training data
helper.preprocess_and_save_data(data_dir, token_lookup, create_lookup_tables)

# Check Point
This is your first checkpoint. If you ever decide to come back to this notebook or have to restart the notebook, you can start from here. The preprocessed data has been saved to disk.

In [6]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import helper
import problem_unittests as tests

int_text, vocab_to_int, int_to_vocab, token_dict = helper.load_preprocess()

## Build the Neural Network
In this section, you'll build the components necessary to build an RNN by implementing the RNN Module and forward and backpropagation functions.

### Check Access to GPU

In [7]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import torch

# Check for a GPU
train_on_gpu = torch.cuda.is_available()
if not train_on_gpu:
    print('No GPU found. Please use a GPU to train your neural network.')

## Input
Let's start with the preprocessed input data. We'll use [TensorDataset](http://pytorch.org/docs/master/data.html#torch.utils.data.TensorDataset) to provide a known format to our dataset; in combination with [DataLoader](http://pytorch.org/docs/master/data.html#torch.utils.data.DataLoader), it will handle batching, shuffling, and other dataset iteration functions.

You can create data with TensorDataset by passing in feature and target tensors. Then create a DataLoader as usual.
```
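from torch.utils.data import TensorDataset, DataLoader

# feature_tensors, target_tensors, and batch_size are defined by you
data = TensorDataset(feature_tensors, target_tensors)
data_loader = DataLoader(data, batch_size=batch_size)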
```

### Batching
Implement the `batch_data` function to batch `words` data into chunks of size `batch_size` using the `TensorDataset` and `DataLoader` classes.

>You can batch words using the DataLoader, but it will be up to you to create `feature_tensors` and `target_tensors` of the correct size and content for a given `sequence_length`.

For example, say we have these as input:
```
words = [1, 2, 3, 4, 5, 6, 7]
sequence_length = 4
```

Your first `feature_tensor` should contain the values:
```
[1, 2, 3, 4]
```
And the corresponding `target_tensor` should just be the next "word"/tokenized word value:
```
5
```
This should continue with the second `feature_tensor`, `target_tensor` being:
```
[2, 3, 4, 5]  # features
6             # target
```
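
The windowing rule above can be written out in plain Python before any tensors are involved. This sketch uses a hypothetical helper name, `sliding_windows`, and lists every feature/target pair for the example input.

```python
def sliding_windows(words, sequence_length):
    """Return (features, targets): each feature is a window of
    sequence_length ids and each target is the id that follows it."""
    features, targets = [], []
    for start in range(len(words) - sequence_length):
        features.append(list(words[start:start + sequence_length]))
        targets.append(words[start + sequence_length])
    return features, targets

words = [1, 2, 3, 4, 5, 6, 7]
features, targets = sliding_windows(words, sequence_length=4)
for feature, target in zip(features, targets):
    print(feature, '->', target)
# [1, 2, 3, 4] -> 5
# [2, 3, 4, 5] -> 6
# [3, 4, 5, 6] -> 7
```

A list of `n` word ids yields `n - sequence_length` pairs, because the last window must still have a following word to serve as its target.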

In [8]:
import numpy as np  # re-imported here so this cell also works when starting from the checkpoint
from torch.utils.data import TensorDataset, DataLoader

def batch_data(words, sequence_length, batch_size):
    """
    Batch the neural network data using DataLoader
    :param words: The word ids of the TV scripts
    :param sequence_length: The sequence length of each batch
    :param batch_size: The size of each batch; the number of sequences in a batch
    :return: DataLoader with batched data
    """
    # TODO: Implement function
    # build one (features, label) pair per start position
    n_sequences = len(words) - sequence_length
    features = np.array([words[i:i + sequence_length] for i in range(n_sequences)])
    labels = np.array([words[i + sequence_length] for i in range(n_sequences)])

    # wrap the arrays as tensors and let the DataLoader handle shuffling and batching
    data = TensorDataset(torch.from_numpy(features), torch.from_numpy(labels))
    dataloader = DataLoader(data, shuffle=True, batch_size=batch_size)

    return dataloader

# there is no test for this function, but you are encouraged to create
# print statements and tests of your own

words = [1, 2, 3, 4, 5, 6, 7]
batch_data(words, 4, 3)

<torch.utils.data.dataloader.DataLoader at 0x7f2a0760ec88>

### Test your dataloader 

You'll have to modify this code to test a batching function, but it should look fairly similar.

Below, we're generating some test text data and defining a dataloader using the function you defined, above. Then, we are getting some sample batch of inputs `sample_x` and targets `sample_y` from our dataloader.

Your code should return something like the following (likely in a different order, if you shuffled your data):

```
torch.Size([10, 5])
tensor([[ 28,  29,  30,  31,  32],
        [ 21,  22,  23,  24,  25],
        [ 17,  18,  19,  20,  21],
        [ 34,  35,  36,  37,  38],
        [ 11,  12,  13,  14,  15],
        [ 23,  24,  25,  26,  27],
        [  6,   7,   8,   9,  10],
        [ 38,  39,  40,  41,  42],
        [ 25,  26,  27,  28,  29],
        [  7,   8,   9,  10,  11]])

torch.Size([10])
tensor([ 33,  26,  22,  39,  16,  28,  11,  43,  30,  12])
```

### Sizes
Your sample_x should be of size `(batch_size, sequence_length)` or (10, 5) in this case and sample_y should just have one dimension: batch_size (10). 

### Values

You should also notice that the targets, sample_y, are the *next* value in the ordered test_text data. So, for an input sequence `[ 28,  29,  30,  31,  32]` that ends with the value `32`, the corresponding output should be `33`.

In [9]:
# test dataloader

test_text = range(50)
t_loader = batch_data(test_text, sequence_length=5, batch_size=10)

data_iter = iter(t_loader)
sample_x, sample_y = next(data_iter)  # the iterator's .next() method is unavailable in newer PyTorch versions

print(sample_x.shape)
print(sample_x)
print()
print(sample_y.shape)
print(sample_y)

torch.Size([10, 5])
tensor([[  8,   9,  10,  11,  12],
        [ 32,  33,  34,  35,  36],
        [ 30,  31,  32,  33,  34],
        [ 44,  45,  46,  47,  48],
        [ 11,  12,  13,  14,  15],
        [ 29,  30,  31,  32,  33],
        [ 22,  23,  24,  25,  26],
        [ 28,  29,  30,  31,  32],
        [ 14,  15,  16,  17,  18],
        [ 36,  37,  38,  39,  40]])

torch.Size([10])
tensor([ 13,  37,  35,  49,  16,  34,  27,  33,  19,  41])


---
## Build the Neural Network
Implement an RNN using PyTorch's [Module class](http://pytorch.org/docs/master/nn.html#torch.nn.Module). You may choose to use a GRU or an LSTM. To complete the RNN, you'll have to implement the following functions for the class:
 - `__init__` - The initialize function. 
 - `init_hidden` - The initialization function for an LSTM/GRU hidden state
 - `forward` - Forward propagation function.
 
The initialize function should create the layers of the neural network and save them to the class. The forward propagation function will use these layers to run forward propagation and generate an output and a hidden state.

**The output of this model should be the *last* batch of word scores** after a complete sequence has been processed. That is, for each input sequence of words, we only want to output the word scores for a single, most likely, next word.

### Hints

1. Make sure to stack the outputs of the lstm to pass to your fully-connected layer, you can do this with `lstm_output = lstm_output.contiguous().view(-1, self.hidden_dim)`
2. You can get the last batch of word scores by shaping the output of the final, fully-connected layer like so:

```
# reshape into (batch_size, seq_length, output_size)
output = output.view(batch_size, -1, self.output_size)
# get last batch
out = output[:, -1]
```
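
The shapes involved in these two hints can be checked on random data. This is a minimal sketch with made-up sizes; the tensor `lstm_output` stands in for the output of a `batch_first=True` LSTM, and `fc` for the final fully-connected layer.

```python
import torch
import torch.nn as nn

batch_size, seq_length, hidden_dim, output_size = 3, 5, 8, 11  # illustrative sizes

# stand-in for the LSTM output, shape (batch_size, seq_length, hidden_dim)
lstm_output = torch.randn(batch_size, seq_length, hidden_dim)
fc = nn.Linear(hidden_dim, output_size)

# stack all time steps so the linear layer sees one row per (sequence, step)
stacked = lstm_output.contiguous().view(-1, hidden_dim)  # (15, 8)
scores = fc(stacked)                                     # (15, 11)

# restore the batch dimension, then keep only the final time step
scores = scores.view(batch_size, -1, output_size)        # (3, 5, 11)
last_scores = scores[:, -1]                              # (3, 11)

print(stacked.shape, scores.shape, last_scores.shape)
```

Since `nn.Linear` acts on the last dimension only, applying `fc` directly to `lstm_output[:, -1]` should give the same scores; the stack-then-reshape route simply follows the hints as written.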

In [10]:
import torch.nn as nn

class RNN(nn.Module):
    
    def __init__(self, vocab_size, output_size, embedding_dim, hidden_dim, n_layers, dropout=0.5):
        """
        Initialize the PyTorch RNN Module
        :param vocab_size: The number of input dimensions of the neural network (the size of the vocabulary)
        :param output_size: The number of output dimensions of the neural network
        :param embedding_dim: The size of embeddings, should you choose to use them        
        :param hidden_dim: The size of the hidden layer outputs
        :param n_layers: The number of stacked LSTM/GRU layers
        :param dropout: dropout to add in between LSTM/GRU layers
        """
        super(RNN, self).__init__()
        # TODO: Implement function
        
        # set class variables
        self.vocab_size = vocab_size
        self.output_size = output_size
        self.embedding_dim = embedding_dim
        self.hidden_dim = hidden_dim
        self.n_layers = n_layers
        
        # define model layers
        self.dropout = nn.Dropout(p=dropout)
        self.embed = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, n_layers, batch_first=True, dropout=dropout)
        self.fc = nn.Linear(hidden_dim, output_size)
    
    
    def forward(self, nn_input, hidden):
        """
        Forward propagation of the neural network
        :param nn_input: The input to the neural network
        :param hidden: The hidden state        
        :return: Two Tensors, the output of the neural network and the latest hidden state
        """
        batch_size = nn_input.size(0)
        # TODO: Implement function   
        x = self.embed(nn_input)
        x, hidden = self.lstm(x, hidden)
        x = self.dropout(x)
        x = x.contiguous().view(-1,self.hidden_dim)
        x = self.fc(x)
        x = x.view(batch_size, -1, self.output_size)[:,-1,:]

        # return one batch of output word scores and the hidden state
        return x, hidden
    
    
    def init_hidden(self, batch_size):
        '''
        Initialize the hidden state of an LSTM/GRU
        :param batch_size: The batch_size of the hidden state
        :return: hidden state of dims (n_layers, batch_size, hidden_dim)
        '''
        # Implement function
        weight = next(self.parameters()).data

        # initialize hidden state with zero weights, and move to GPU if available
        if train_on_gpu:
            hidden = (weight.new(self.n_layers, batch_size, self.hidden_dim).zero_().cuda(),
                      weight.new(self.n_layers, batch_size, self.hidden_dim).zero_().cuda())
        else:
            hidden = (weight.new(self.n_layers, batch_size, self.hidden_dim).zero_(),
                      weight.new(self.n_layers, batch_size, self.hidden_dim).zero_())
        
        return hidden

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_rnn(RNN, train_on_gpu)

Tests Passed


### Define forward and backpropagation

Use the RNN class you implemented to apply forward and back propagation. This function will be called, iteratively, in the training loop as follows:
```
loss, hidden = forward_back_prop(rnn, optimizer, criterion, inp, target, hidden)
```

It should return the average loss over a batch and the hidden state returned by a call to `rnn(inp, hidden)`. Recall that you can get this loss by computing it, as usual, and calling `loss.item()`.

**If a GPU is available, you should move your data to that GPU device, here.**

In [11]:
def forward_back_prop(rnn, optimizer, criterion, inp, target, hidden):
    """
    Forward and backward propagation on the neural network
    :param rnn: The PyTorch Module that holds the neural network
    :param optimizer: The PyTorch optimizer for the neural network
    :param criterion: The PyTorch loss function
    :param inp: A batch of input to the neural network
    :param target: The target output for the batch of input
    :param hidden: The hidden state carried over from the previous batch
    :return: The loss and the latest hidden state Tensor
    """
    
    # TODO: Implement Function
    
    # move data to GPU, if available
    if train_on_gpu:
        rnn.cuda()
        inp = inp.cuda()
        target = target.cuda()
    
    clip = 5

    rnn.train()
    
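    # detach the hidden state from its history so gradients don't flow into earlier batches
    hidden = tuple([each.data for each in hidden])

    # perform backpropagation and optimization
    rnn.zero_grad()
    # the model returns raw word scores (no softmax); CrossEntropyLoss expects exactly that
    output, hidden = rnn(inp, hidden)
    loss = criterion(output.squeeze(), target.long())
    loss.backward()
    # clip the gradient norm to guard against exploding gradients
    nn.utils.clip_grad_norm_(rnn.parameters(), clip)
    optimizer.step()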

    # return the loss over a batch and the hidden state produced by our model
    return loss.item(), hidden

# Note that these tests aren't completely extensive.
# they are here to act as general checks on the expected outputs of your functions
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_forward_back_prop(RNN, forward_back_prop, train_on_gpu)

Tests Passed


## Neural Network Training

With the structure of the network complete and data ready to be fed in the neural network, it's time to train it.

### Train Loop

The training loop is implemented for you in the `train_rnn` function. This function will train the network over all the batches for the number of epochs given. Progress is printed every `show_every_n_batches` batches; you'll set this parameter along with the others in the next section.

In [12]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import os

def save_hidden(filename, decoder):
    save_filename = os.path.splitext(os.path.basename(filename))[0] + '_hidden.pt'
    torch.save(decoder, save_filename)


def load_hidden(filename):
    save_filename = os.path.splitext(os.path.basename(filename))[0] + '_hidden.pt'
    return torch.load(save_filename)

def train_rnn(rnn, batch_size, optimizer, criterion, n_epochs, show_every_n_batches=100, path='./save/trained_rnn_512_02'):
    batch_losses = []
    
    rnn.train()

    print("Training for %d epoch(s)..." % n_epochs)
    for epoch_i in range(1, n_epochs + 1):
        
        # initialize hidden state
        hidden = rnn.init_hidden(batch_size)
        
        for batch_i, (inputs, labels) in enumerate(train_loader, 1):
            
            # make sure you iterate over completely full batches, only
            n_batches = len(train_loader.dataset)//batch_size
            if(batch_i > n_batches):
                break
            
            # forward, back prop
            loss, hidden = forward_back_prop(rnn, optimizer, criterion, inputs, labels, hidden)          
            # record loss
            batch_losses.append(loss)

            # printing loss stats
            if batch_i % show_every_n_batches == 0:
                print('Epoch: {:>4}/{:<4}  Loss: {}\n'.format(
                    epoch_i, n_epochs, np.average(batch_losses)))
                batch_losses = []
        helper.save_model(path, rnn)

    # returns a trained rnn
    return rnn
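
The `n_batches` check in the loop skips a final, smaller batch, presumably so that the hidden state created for `batch_size` sequences always matches the input. The arithmetic can be sketched with made-up sizes (not the real dataset); `count_batches` is a hypothetical helper name.

```python
def count_batches(n_words, sequence_length, batch_size):
    """Return (full_batches, leftover_sequences) for a sliding-window dataset."""
    n_sequences = n_words - sequence_length  # one sequence per start position
    return divmod(n_sequences, batch_size)

full, leftover = count_batches(n_words=1010, sequence_length=200, batch_size=32)
print(full, leftover)  # 25 10
```

Here the 10 leftover sequences would form an incomplete 26th batch, which the loop never processes. Passing `drop_last=True` to `DataLoader` is another way to get the same effect.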

### Hyperparameters

Set and train the neural network with the following parameters:
- Set `sequence_length` to the length of a sequence.
- Set `batch_size` to the batch size.
- Set `num_epochs` to the number of epochs to train for.
- Set `learning_rate` to the learning rate for an Adam optimizer.
- Set `vocab_size` to the number of unique tokens in our vocabulary.
- Set `output_size` to the desired size of the output.
- Set `embedding_dim` to the embedding dimension; smaller than the vocab_size.
- Set `hidden_dim` to the hidden dimension of your RNN.
- Set `n_layers` to the number of layers/cells in your RNN.
- Set `show_every_n_batches` to the number of batches at which the neural network should print progress.

If the network isn't getting the desired results, tweak these parameters and/or the layers in the `RNN` class.

## Experiment 1

In [38]:
# Data params
# Sequence Length
sequence_length = 200  # of words in a sequence
# Batch Size
batch_size = 32

# data loader - do not change
train_loader = batch_data(int_text, sequence_length, batch_size)

In [39]:
# Training parameters
# Number of Epochs
num_epochs = 5
# Learning Rate
learning_rate = 0.001

# Model parameters
# Vocab size
vocab_size = len(vocab_to_int)
# Output size
output_size = len(vocab_to_int)
# Embedding Dimension
embedding_dim = 200
# Hidden Dimension
hidden_dim = 256
# Number of RNN Layers
n_layers = 2

# Show stats for every n number of batches
show_every_n_batches = 500

### Train
In the next cell, you'll train the neural network on the pre-processed data. If you have a hard time getting a good loss, consider changing your hyperparameters. In general, you may get better results with larger `hidden_dim` and `n_layers` values, but larger models take longer to train.
> **You should aim for a loss less than 3.5.** 

You should also experiment with different sequence lengths, which determine the size of the long range dependencies that a model can learn.
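
To get a feel for how these settings affect model size, the parameter count of the embedding, LSTM, and linear layers can be estimated by hand. The sketch assumes the layer layout of the `RNN` class above and PyTorch's LSTM parameterization (four gates, with input weights, recurrent weights, and two bias vectors per layer). The vocabulary size of 20000 is a made-up figure, not the real value, and `count_parameters` is a hypothetical helper.

```python
def count_parameters(vocab_size, output_size, embedding_dim, hidden_dim, n_layers):
    """Estimate trainable parameters for Embedding -> stacked LSTM -> Linear."""
    embedding = vocab_size * embedding_dim
    lstm = 0
    for layer in range(n_layers):
        # the first layer reads embeddings; later layers read the previous layer's output
        input_dim = embedding_dim if layer == 0 else hidden_dim
        # 4 gates, each with input weights, recurrent weights, and two biases
        lstm += 4 * hidden_dim * (input_dim + hidden_dim + 2)
    linear = hidden_dim * output_size + output_size
    return embedding + lstm + linear

for hidden_dim in (256, 512):
    total = count_parameters(20000, 20000, 200, hidden_dim, n_layers=2)
    print('hidden_dim={}: {:,} parameters'.format(hidden_dim, total))
```

Under these assumptions, doubling `hidden_dim` from 256 to 512 grows the model from roughly 10 million to roughly 18 million parameters, with much of the increase in the output layer.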

In [40]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""

# create model and move to gpu if available
rnn = RNN(vocab_size, output_size, embedding_dim, hidden_dim, n_layers, dropout=0.5)
if train_on_gpu:
    rnn.cuda()

# defining loss and optimization functions for training
optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate)
criterion = nn.CrossEntropyLoss()

# training the model
trained_rnn = train_rnn(rnn, batch_size, optimizer, criterion, num_epochs, show_every_n_batches)

# saving the trained model
helper.save_model('./save/trained_rnn', trained_rnn)
print('Model Trained and Saved')

Training for 5 epoch(s)...
Epoch:    1/5     Loss: 5.814536405563355

Epoch:    1/5     Loss: 5.294642675876617

Epoch:    1/5     Loss: 5.215630855083465

Epoch:    1/5     Loss: 5.05716952085495

Epoch:    1/5     Loss: 5.065799199104309

Epoch:    1/5     Loss: 4.929776544570923

Epoch:    1/5     Loss: 4.91144019985199

Epoch:    1/5     Loss: 4.871308234214783

Epoch:    1/5     Loss: 4.795930293560028

Epoch:    1/5     Loss: 4.8035445337295535

Epoch:    1/5     Loss: 4.744415888786316

Epoch:    1/5     Loss: 4.728337649822235

Epoch:    1/5     Loss: 4.714304267883301

Epoch:    1/5     Loss: 4.7128020849227905

Epoch:    1/5     Loss: 4.638632520675659

Epoch:    1/5     Loss: 4.645726988792419

Epoch:    1/5     Loss: 4.662727818965912

Epoch:    1/5     Loss: 4.625527530193329

Epoch:    1/5     Loss: 4.616027403831482

Epoch:    1/5     Loss: 4.683815835475921

Epoch:    1/5     Loss: 4.589832183837891

Epoch:    1/5     Loss: 4.63441016292572

Epoch:    1/5     Loss: 4.59

Epoch:    4/5     Loss: 4.2759792160987855

Epoch:    4/5     Loss: 4.237377481937409

Epoch:    4/5     Loss: 4.309260502338409

Epoch:    4/5     Loss: 4.240190428256988

Epoch:    4/5     Loss: 4.260425121307373

Epoch:    4/5     Loss: 4.2841018733978276

Epoch:    4/5     Loss: 4.311780556678772

Epoch:    4/5     Loss: 4.298802248001099

Epoch:    4/5     Loss: 4.318086175918579

Epoch:    4/5     Loss: 4.307427879810334

Epoch:    4/5     Loss: 4.278925110816956

Epoch:    4/5     Loss: 4.284010627269745

Epoch:    4/5     Loss: 4.280887735843659

Epoch:    4/5     Loss: 4.298950122833252

Epoch:    4/5     Loss: 4.323087350845337

Epoch:    4/5     Loss: 4.27782289981842

Epoch:    4/5     Loss: 4.288273348331451

Epoch:    4/5     Loss: 4.300994512557984

Epoch:    4/5     Loss: 4.2853131637573245

Epoch:    4/5     Loss: 4.305103519439697

Epoch:    4/5     Loss: 4.318372834205627

Epoch:    4/5     Loss: 4.333410906791687

Epoch:    4/5     Loss: 4.272612528800964

Epoch:   

  "type " + obj.__name__ + ". It won't be checked "


Model Trained and Saved
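
One way to read the loss values printed above: `nn.CrossEntropyLoss` reports the mean negative log-likelihood using the natural logarithm, so `exp(loss)` is the perplexity, roughly the number of equally likely words the model is choosing between. A minimal sketch follows; the vocabulary size of 20000 is a placeholder, not the real value.

```python
import math

def perplexity(loss):
    """Convert a mean cross-entropy loss (natural log) into perplexity."""
    return math.exp(loss)

vocab_size = 20000  # placeholder; use len(vocab_to_int) in the notebook
uniform_loss = math.log(vocab_size)  # loss of a model that guesses uniformly

for loss in (uniform_loss, 4.27, 3.5):
    print('loss {:.2f} -> perplexity {:.1f}'.format(loss, perplexity(loss)))
```

Seen this way, moving from a loss near 4.3 to the 3.5 target roughly halves the perplexity.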


## Experiment 2

In [None]:
# Data params
# Sequence Length
sequence_length = 200  # of words in a sequence
# Batch Size
batch_size = 32

# data loader - do not change
train_loader = batch_data(int_text, sequence_length, batch_size)

# Training parameters
# Number of Epochs
num_epochs = 6
# Learning Rate
learning_rate = 0.001

# Model parameters
# Vocab size
vocab_size = len(vocab_to_int)
# Output size
output_size = len(vocab_to_int)
# Embedding Dimension
embedding_dim = 200
# Hidden Dimension
hidden_dim = 512
# Number of RNN Layers
n_layers = 2

# Show stats for every n number of batches
show_every_n_batches = 500

# create model and move to gpu if available
rnn = RNN(vocab_size, output_size, embedding_dim, hidden_dim, n_layers, dropout=0.5)
if train_on_gpu:
    rnn.cuda()

# defining loss and optimization functions for training
optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate)
criterion = nn.CrossEntropyLoss()

path = './save/trained_rnn_512'

# training the model
trained_rnn = train_rnn(rnn, batch_size, optimizer, criterion, num_epochs, show_every_n_batches, path)

# saving the trained model
helper.save_model(path, trained_rnn)
print('Model Trained and Saved')

Training for 6 epoch(s)...
Epoch:    1/6     Loss: 5.701030227661133

Epoch:    1/6     Loss: 5.260918291568756

Epoch:    1/6     Loss: 5.180226408481598

Epoch:    1/6     Loss: 5.059348489761352

Epoch:    1/6     Loss: 4.99691535282135

Epoch:    1/6     Loss: 4.89206879234314

Epoch:    1/6     Loss: 4.860235902786255

Epoch:    1/6     Loss: 4.863872250080108

Epoch:    1/6     Loss: 4.776177392482758

Epoch:    1/6     Loss: 4.740982954502106

Epoch:    1/6     Loss: 4.72337942314148

Epoch:    1/6     Loss: 4.725228506088257

Epoch:    1/6     Loss: 4.656517887115479

Epoch:    1/6     Loss: 4.680493153572082

Epoch:    1/6     Loss: 4.627993297100067

Epoch:    1/6     Loss: 4.632838337898255

Epoch:    1/6     Loss: 4.586052298069

Epoch:    1/6     Loss: 4.632166959285736

Epoch:    1/6     Loss: 4.578626882553101

Epoch:    1/6     Loss: 4.591191054821015

Epoch:    1/6     Loss: 4.639580221176147

Epoch:    1/6     Loss: 4.5601844339370725

Epoch:    1/6     Loss: 4.567931

Epoch:    4/6     Loss: 4.189552246570587

Epoch:    4/6     Loss: 4.201244663238525

Epoch:    4/6     Loss: 4.177928785324097

Epoch:    4/6     Loss: 4.222064717769623

Epoch:    4/6     Loss: 4.279292543888092

Epoch:    4/6     Loss: 4.246867390632629

Epoch:    4/6     Loss: 4.244134308815003

Epoch:    4/6     Loss: 4.250814400672913

Epoch:    4/6     Loss: 4.237067190170288

Epoch:    4/6     Loss: 4.297513787746429

Epoch:    4/6     Loss: 4.226821291923523

Epoch:    4/6     Loss: 4.197367931842804

Epoch:    4/6     Loss: 4.281052847862243

Epoch:    4/6     Loss: 4.202565054416657

Epoch:    4/6     Loss: 4.212696791172028

Epoch:    4/6     Loss: 4.2089838395118715

Epoch:    4/6     Loss: 4.242517303466797

Epoch:    4/6     Loss: 4.3000672011375425

Epoch:    4/6     Loss: 4.258489910602569

Epoch:    4/6     Loss: 4.232053030014038

Epoch:    4/6     Loss: 4.266660823345184

Epoch:    4/6     Loss: 4.282714551448822

Epoch:    4/6     Loss: 4.2842117805480955

Epoch:  

## Experiment 3 - Change dropout from 0.5 to 0.2

In [17]:
# Data params
# Sequence Length
sequence_length = 200  # of words in a sequence
# Batch Size
batch_size = 32

# data loader - do not change
train_loader = batch_data(int_text, sequence_length, batch_size)

# Training parameters
# Number of Epochs
num_epochs = 6
# Learning Rate
learning_rate = 0.001

# Model parameters
# Vocab size
vocab_size = len(vocab_to_int)
# Output size
output_size = len(vocab_to_int)
# Embedding Dimension
embedding_dim = 200
# Hidden Dimension
hidden_dim = 256
# Number of RNN Layers
n_layers = 2

# Show stats for every n number of batches
show_every_n_batches = 500

# create model and move to gpu if available
rnn = RNN(vocab_size, output_size, embedding_dim, hidden_dim, n_layers, dropout=0.2)
if train_on_gpu:
    rnn.cuda()

# defining loss and optimization functions for training
optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate)
criterion = nn.CrossEntropyLoss()

path = './save/trained_rnn_256_02'

# training the model
trained_rnn = train_rnn(rnn, batch_size, optimizer, criterion, num_epochs, show_every_n_batches, path)

# saving the trained model
helper.save_model(path, trained_rnn)
print('Model Trained and Saved')

Training for 6 epoch(s)...
Epoch:    1/6     Loss: 5.734573011398315

Epoch:    1/6     Loss: 5.161716278076172

Epoch:    1/6     Loss: 4.998364925861359

Epoch:    1/6     Loss: 4.864034643173218

Epoch:    1/6     Loss: 4.795320579051971

Epoch:    1/6     Loss: 4.740472389698029

Epoch:    1/6     Loss: 4.699350751876831

Epoch:    1/6     Loss: 4.663703182220459

Epoch:    1/6     Loss: 4.634458442687988

Epoch:    1/6     Loss: 4.6119224681854245

Epoch:    1/6     Loss: 4.52804301404953

Epoch:    1/6     Loss: 4.599263755321503

Epoch:    1/6     Loss: 4.472692346096038

Epoch:    1/6     Loss: 4.467523460865021

Epoch:    1/6     Loss: 4.500551551342011

Epoch:    1/6     Loss: 4.45882213640213

Epoch:    1/6     Loss: 4.456091603279114

Epoch:    1/6     Loss: 4.440711449623108

Epoch:    1/6     Loss: 4.478302415370941

Epoch:    1/6     Loss: 4.483567180156708

Epoch:    1/6     Loss: 4.406284973621369

Epoch:    1/6     Loss: 4.442367895126343

Epoch:    1/6     Loss: 4.39

  "type " + obj.__name__ + ". It won't be checked "


Epoch:    2/6     Loss: 4.1630324029594385

Epoch:    2/6     Loss: 4.097840004920959

Epoch:    2/6     Loss: 4.101337086677551

Epoch:    2/6     Loss: 4.083431650161743

Epoch:    2/6     Loss: 4.07443700838089

Epoch:    2/6     Loss: 4.1117211842536925

Epoch:    2/6     Loss: 4.056476910591125

Epoch:    2/6     Loss: 4.141862450122833

Epoch:    2/6     Loss: 4.1032451667785645

Epoch:    2/6     Loss: 4.1334022841453555

Epoch:    2/6     Loss: 4.095456813812256

Epoch:    2/6     Loss: 4.162261750221252

Epoch:    2/6     Loss: 4.061561032772064

Epoch:    2/6     Loss: 4.104068965435028

Epoch:    2/6     Loss: 4.095857221841812

Epoch:    2/6     Loss: 4.135977490901947

Epoch:    2/6     Loss: 4.138124445438385

Epoch:    2/6     Loss: 4.114831743717193

Epoch:    2/6     Loss: 4.140637580871582

Epoch:    2/6     Loss: 4.120273303985596

Epoch:    2/6     Loss: 4.165741947650909

Epoch:    2/6     Loss: 4.171623256206512

Epoch:    2/6     Loss: 4.099943322181701

Epoch:  

Epoch:    5/6     Loss: 3.964633089065552

Epoch:    5/6     Loss: 3.9608815474510193

Epoch:    5/6     Loss: 3.906858479976654

Epoch:    5/6     Loss: 3.941682635784149

Epoch:    5/6     Loss: 3.9455833921432495

Epoch:    5/6     Loss: 3.963539057254791

Epoch:    5/6     Loss: 3.9656281604766845

Epoch:    5/6     Loss: 3.9543934082984924

Epoch:    5/6     Loss: 3.9516017498970033

Epoch:    5/6     Loss: 3.9790806555747986

Epoch:    5/6     Loss: 3.9919206614494325

Epoch:    5/6     Loss: 4.0318493924140935

Epoch:    5/6     Loss: 3.9607060680389403

Epoch:    5/6     Loss: 3.984339373588562

Epoch:    5/6     Loss: 3.9661324362754824

Epoch:    5/6     Loss: 3.9560680980682372

Epoch:    5/6     Loss: 3.967694787502289

Epoch:    5/6     Loss: 4.01798328256607

Epoch:    5/6     Loss: 3.951677890777588

Epoch:    5/6     Loss: 3.9796190295219422

Epoch:    5/6     Loss: 3.967918318271637

Epoch:    5/6     Loss: 3.9741097240448

Epoch:    5/6     Loss: 3.9593568115234374

KeyboardInterrupt: 

## Exp 4. Reduce hidden layer size by half (hidden_dim = 256)

In [13]:
# Data params
# Sequence Length
sequence_length = 50  # of words in a sequence
# Batch Size
batch_size = 32

# data loader - do not change
train_loader = batch_data(int_text, sequence_length, batch_size)

# Training parameters
# Number of Epochs
num_epochs = 5
# Learning Rate
learning_rate = 0.001

# Model parameters
# Vocab size
vocab_size = len(vocab_to_int)
# Output size
output_size = len(vocab_to_int)
# Embedding Dimension
embedding_dim = 200
# Hidden Dimension
hidden_dim = 256
# Number of RNN Layers
n_layers = 2

# Show stats for every n number of batches
show_every_n_batches = 500

# create model and move to gpu if available
rnn = RNN(vocab_size, output_size, embedding_dim, hidden_dim, n_layers, dropout=0.2)
if train_on_gpu:
    rnn.cuda()

# defining loss and optimization functions for training
optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate)
criterion = nn.CrossEntropyLoss()

path = './save/trained_rnn_256_02_100'

# training the model
trained_rnn = train_rnn(rnn, batch_size, optimizer, criterion, num_epochs, show_every_n_batches, path)

# saving the trained model
helper.save_model(path, trained_rnn)
print('Model Trained and Saved')

Training for 5 epoch(s)...
Epoch:    1/5     Loss: 5.721465928077698

Epoch:    1/5     Loss: 5.162940397262573

Epoch:    1/5     Loss: 5.0087122883796695

Epoch:    1/5     Loss: 4.88570308637619



KeyboardInterrupt: 

In [35]:
# 5 more epochs

# training the model
trained_rnn = train_rnn(rnn, batch_size, optimizer, criterion, num_epochs, show_every_n_batches, path)

# saving the trained model
helper.save_model(path, trained_rnn)
print('Model Trained and Saved')

Training for 5 epoch(s)...
Epoch:    1/5     Loss: 3.9293711585998534

Epoch:    1/5     Loss: 3.8447829880714415

Epoch:    1/5     Loss: 3.8469375219345094

Epoch:    1/5     Loss: 3.8614208927154543

Epoch:    1/5     Loss: 3.8364244422912597

Epoch:    1/5     Loss: 3.849354079723358

Epoch:    1/5     Loss: 3.9086210403442383

Epoch:    1/5     Loss: 3.8789994926452636

Epoch:    1/5     Loss: 3.8899957847595217

Epoch:    1/5     Loss: 3.835445370674133

Epoch:    1/5     Loss: 3.858889109134674

Epoch:    1/5     Loss: 3.906621141910553

Epoch:    1/5     Loss: 3.837168719768524

Epoch:    1/5     Loss: 3.9131970920562744

Epoch:    1/5     Loss: 3.914961143016815

Epoch:    1/5     Loss: 3.895817253112793

Epoch:    1/5     Loss: 3.9045004959106446

Epoch:    1/5     Loss: 3.8762718186378478

Epoch:    1/5     Loss: 3.8512273254394533

Epoch:    1/5     Loss: 3.8808273396492003

Epoch:    1/5     Loss: 3.9179111909866333

Epoch:    1/5     Loss: 3.8664931020736693

Epoch:    1/


Epoch:    2/5     Loss: 3.893249153818714

Epoch:    2/5     Loss: 3.8289567742347717

Epoch:    2/5     Loss: 3.8289521408081053

Epoch:    2/5     Loss: 3.8300379774570463

Epoch:    2/5     Loss: 3.848177722454071

Epoch:    2/5     Loss: 3.8422219228744505

Epoch:    2/5     Loss: 3.8487902970314027

Epoch:    2/5     Loss: 3.8664010372161863

Epoch:    2/5     Loss: 3.8133401770591737

Epoch:    2/5     Loss: 3.845321710586548

Epoch:    2/5     Loss: 3.8523562450408937

Epoch:    2/5     Loss: 3.8518931338787077

Epoch:    2/5     Loss: 3.9291054756641386

Epoch:    2/5     Loss: 3.8247746891975405

Epoch:    2/5     Loss: 3.876100085258484

Epoch:    2/5     Loss: 3.8464357991218567

Epoch:    2/5     Loss: 3.789661916255951

Epoch:    2/5     Loss: 3.9045110926628115

Epoch:    2/5     Loss: 3.884884579181671

Epoch:    2/5     Loss: 3.8975182847976684

Epoch:    2/5     Loss: 3.9083604226112367

Epoch:    2/5     Loss: 3.9116014318466186

Epoch:    2/5     Loss: 3.919722985744

Epoch:    5/5     Loss: 3.8501095325946806

Epoch:    5/5     Loss: 3.8609207262992857

Epoch:    5/5     Loss: 3.8121880140304567

Epoch:    5/5     Loss: 3.825709506988525

Epoch:    5/5     Loss: 3.8231336879730224

Epoch:    5/5     Loss: 3.8174934535026552

Epoch:    5/5     Loss: 3.822904124736786

Epoch:    5/5     Loss: 3.8548113775253294

Epoch:    5/5     Loss: 3.8990792741775513

Epoch:    5/5     Loss: 3.889527039527893

Epoch:    5/5     Loss: 3.8832168598175048

Epoch:    5/5     Loss: 3.8670734872817993

Epoch:    5/5     Loss: 3.886187370300293

Epoch:    5/5     Loss: 3.8627657198905947

Epoch:    5/5     Loss: 3.8214047574996948

Epoch:    5/5     Loss: 3.890445140361786

Epoch:    5/5     Loss: 3.8715633912086487

Epoch:    5/5     Loss: 3.8525277614593505

Epoch:    5/5     Loss: 3.8615055558681486

Epoch:    5/5     Loss: 3.9045118484497072

Epoch:    5/5     Loss: 3.884214285850525

Epoch:    5/5     Loss: 3.877251081466675

Epoch:    5/5     Loss: 3.8576169495582

In [None]:
# 30 more epochs
num_epochs = 30

# training the model
trained_rnn = train_rnn(rnn, batch_size, optimizer, criterion, num_epochs, show_every_n_batches, path)

# saving the trained model
helper.save_model(path, trained_rnn)

Training for 30 epoch(s)...
Epoch:    1/30    Loss: 3.780821882724762

Epoch:    1/30    Loss: 3.67930615234375

Epoch:    1/30    Loss: 3.7506258172988893

Epoch:    1/30    Loss: 3.7372335381507873

Epoch:    1/30    Loss: 3.792365481376648

Epoch:    1/30    Loss: 3.7410316305160523

Epoch:    1/30    Loss: 3.7957439074516297

Epoch:    1/30    Loss: 3.7649880023002624

Epoch:    1/30    Loss: 3.751134890079498

Epoch:    1/30    Loss: 3.739771010875702

Epoch:    1/30    Loss: 3.7966238260269165

Epoch:    1/30    Loss: 3.782225522518158

Epoch:    1/30    Loss: 3.7516883034706114

Epoch:    1/30    Loss: 3.7998425731658934

Epoch:    1/30    Loss: 3.720947208404541

Epoch:    1/30    Loss: 3.826858285903931

Epoch:    1/30    Loss: 3.8210852994918825

Epoch:    1/30    Loss: 3.7976765203475953

Epoch:    1/30    Loss: 3.809109055995941

Epoch:    1/30    Loss: 3.803870913505554

Epoch:    1/30    Loss: 3.804429000377655

Epoch:    1/30    Loss: 3.846292145729065

Epoch:    1/30


Epoch:    2/30    Loss: 3.7801699736346936

Epoch:    2/30    Loss: 3.723932646751404

Epoch:    2/30    Loss: 3.729721487998962

Epoch:    2/30    Loss: 3.781629999637604

Epoch:    2/30    Loss: 3.729441560745239

Epoch:    2/30    Loss: 3.7192550463676453

Epoch:    2/30    Loss: 3.731583860874176

Epoch:    2/30    Loss: 3.7129325251579286

Epoch:    2/30    Loss: 3.80433056306839

Epoch:    2/30    Loss: 3.7644897422790526

Epoch:    2/30    Loss: 3.7834016375541686

Epoch:    2/30    Loss: 3.7828581829071046

Epoch:    2/30    Loss: 3.7770398478507996

Epoch:    2/30    Loss: 3.7234683592319486

Epoch:    2/30    Loss: 3.7490962085723876

Epoch:    2/30    Loss: 3.816656254053116

Epoch:    2/30    Loss: 3.7889089951515196

Epoch:    2/30    Loss: 3.751828172683716

Epoch:    2/30    Loss: 3.8087905988693236

Epoch:    2/30    Loss: 3.8695177016258238

Epoch:    2/30    Loss: 3.8141164665222167

Epoch:    2/30    Loss: 3.801946400642395

Epoch:    2/30    Loss: 3.7905520091056824

Epoch:    5/30    Loss: 3.7906447553634646

Epoch:    5/30    Loss: 3.7625469131469726

Epoch:    5/30    Loss: 3.824274685382843

Epoch:    5/30    Loss: 3.8081753005981445

Epoch:    5/30    Loss: 3.821280647277832

Epoch:    5/30    Loss: 3.7404413990974428

Epoch:    5/30    Loss: 3.785963837623596

Epoch:    5/30    Loss: 3.7507787642478942

Epoch:    5/30    Loss: 3.830416308403015

Epoch:    5/30    Loss: 3.8026523537635804

Epoch:    5/30    Loss: 3.8374249744415283

Epoch:    5/30    Loss: 3.7564953851699827

Epoch:    5/30    Loss: 3.7827913181781767

Epoch:    5/30    Loss: 3.7831589641571046

Epoch:    5/30    Loss: 3.7839610147476197

Epoch:    5/30    Loss: 3.770904727935791

Epoch:    5/30    Loss: 3.8088799114227294

Epoch:    5/30    Loss: 3.7861751885414123

Epoch:    5/30    Loss: 3.8072671570777894

Epoch:    5/30    Loss: 3.786687971830368

Epoch:    5/30    Loss: 3.755433579921722

Epoch:    5/30    Loss: 3.8482551608085633

Epoch:    5/30    Loss: 3.7770838441848

Epoch:    8/30    Loss: 3.833738327026367

Epoch:    8/30    Loss: 3.809627555370331

Epoch:    8/30    Loss: 3.7492902765274048

Epoch:    8/30    Loss: 3.7369990367889403

Epoch:    8/30    Loss: 3.775944343090057

Epoch:    8/30    Loss: 3.8530957469940184

Epoch:    8/30    Loss: 3.8338695845603943

Epoch:    8/30    Loss: 3.793320100784302

Epoch:    9/30    Loss: 3.692833823156139

Epoch:    9/30    Loss: 3.6469540677070618

Epoch:    9/30    Loss: 3.6763525524139404

Epoch:    9/30    Loss: 3.6582496547698975

Epoch:    9/30    Loss: 3.685294219017029

Epoch:    9/30    Loss: 3.619654175043106

Epoch:    9/30    Loss: 3.6409227013587953

Epoch:    9/30    Loss: 3.692809771537781

Epoch:    9/30    Loss: 3.727855191230774

Epoch:    9/30    Loss: 3.704535118341446

Epoch:    9/30    Loss: 3.712372476577759

Epoch:    9/30    Loss: 3.6610331053733827

Epoch:    9/30    Loss: 3.697569134235382

Epoch:    9/30    Loss: 3.6643227887153627

Epoch:    9/30    Loss: 3.666763944149017

Epoch:   12/30    Loss: 3.7375953764915466

Epoch:   12/30    Loss: 3.6753658995628355

Epoch:   12/30    Loss: 3.6519146614074707

Epoch:   12/30    Loss: 3.7013957991600037

Epoch:   12/30    Loss: 3.6464655060768125

Epoch:   12/30    Loss: 3.6704834294319153

Epoch:   12/30    Loss: 3.683557197093964

Epoch:   12/30    Loss: 3.745357353210449

Epoch:   12/30    Loss: 3.6925543365478517

Epoch:   12/30    Loss: 3.7193840327262877

Epoch:   12/30    Loss: 3.700639096260071

Epoch:   12/30    Loss: 3.7534811148643494

Epoch:   12/30    Loss: 3.6754116134643553

Epoch:   12/30    Loss: 3.6973078227043152

Epoch:   12/30    Loss: 3.729731701374054

Epoch:   12/30    Loss: 3.711660390853882

Epoch:   12/30    Loss: 3.7476407632827757

Epoch:   12/30    Loss: 3.7176458792686464

Epoch:   12/30    Loss: 3.697091348648071

Epoch:   12/30    Loss: 3.7255109939575197

Epoch:   12/30    Loss: 3.720441635131836

Epoch:   12/30    Loss: 3.6811618342399597

Epoch:   12/30    Loss: 3.7604411964416

In [14]:
# Data params
# Sequence Length
sequence_length = 50  # of words in a sequence
# Batch Size
batch_size = 32

# data loader - do not change
train_loader = batch_data(int_text, sequence_length, batch_size)

# Training parameters
# Number of Epochs
num_epochs = 30
# Learning Rate
learning_rate = 0.001

# Model parameters
# Vocab size
vocab_size = len(vocab_to_int)
# Output size
output_size = len(vocab_to_int)
# Embedding Dimension
embedding_dim = 200
# Hidden Dimension
hidden_dim = 256
# Number of RNN Layers
n_layers = 2

# Show stats for every n number of batches
show_every_n_batches = 500

In [15]:
# create model and move to gpu if available
rnn = RNN(vocab_size, output_size, embedding_dim, hidden_dim, n_layers, dropout=0.2)
if train_on_gpu:
    rnn.cuda()

# defining loss and optimization functions for training
optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate)
criterion = nn.CrossEntropyLoss()

path = './save/trained_rnn_256_02_50.pt'

In [None]:
# 50 epochs
num_epochs = 50
    
# training the model
trained_rnn = train_rnn(rnn, batch_size, optimizer, criterion, num_epochs, show_every_n_batches, path=path)

# saving the trained model
helper.save_model(path, trained_rnn)

Training for 50 epoch(s)...
Epoch:    1/50    Loss: 5.685048009395599

Epoch:    1/50    Loss: 5.1329756364822385

Epoch:    1/50    Loss: 4.994591465950013

Epoch:    1/50    Loss: 4.8797411217689515

Epoch:    1/50    Loss: 4.879319806098938

Epoch:    1/50    Loss: 4.738553686141968

Epoch:    1/50    Loss: 4.6876709861755375

Epoch:    1/50    Loss: 4.624606618404388

Epoch:    1/50    Loss: 4.604458067893982

Epoch:    1/50    Loss: 4.622936310291291

Epoch:    1/50    Loss: 4.529778764247895

Epoch:    1/50    Loss: 4.549249366283417

Epoch:    1/50    Loss: 4.501344722270965

Epoch:    1/50    Loss: 4.488463315486908

Epoch:    1/50    Loss: 4.457786512851715

Epoch:    1/50    Loss: 4.476368714809418

Epoch:    1/50    Loss: 4.433107254028321

Epoch:    1/50    Loss: 4.435282142162323

Epoch:    1/50    Loss: 4.409528411865234

Epoch:    1/50    Loss: 4.466201627254486

Epoch:    1/50    Loss: 4.422207685947418

Epoch:    1/50    Loss: 4.438485827445984

Epoch:    1/50    Loss:


Epoch:    2/50    Loss: 4.151971351882638

Epoch:    2/50    Loss: 4.0496391592025756

Epoch:    2/50    Loss: 4.060782099246979

Epoch:    2/50    Loss: 4.057024963855744

Epoch:    2/50    Loss: 4.118402906417847

Epoch:    2/50    Loss: 4.066749235153198

Epoch:    2/50    Loss: 4.104742618083954

Epoch:    2/50    Loss: 4.1109090433120725

Epoch:    2/50    Loss: 4.108331465721131

Epoch:    2/50    Loss: 4.068841254472733

Epoch:    2/50    Loss: 4.1599115042686465

Epoch:    2/50    Loss: 4.084322585582733

Epoch:    2/50    Loss: 4.121622440338135

Epoch:    2/50    Loss: 4.147054700851441

Epoch:    2/50    Loss: 4.147810961723327

Epoch:    2/50    Loss: 4.122154385566711

Epoch:    2/50    Loss: 4.160003968238831

Epoch:    2/50    Loss: 4.173973856925964

Epoch:    2/50    Loss: 4.135274160385132

Epoch:    2/50    Loss: 4.092129889965057

Epoch:    2/50    Loss: 4.115961433410645

Epoch:    2/50    Loss: 4.133073672771454

Epoch:    2/50    Loss: 4.096144061565399

Epoch:    5/50    Loss: 3.91636496591568

Epoch:    5/50    Loss: 3.9201814498901366

Epoch:    5/50    Loss: 3.998312590837479

Epoch:    5/50    Loss: 3.9385947732925417

Epoch:    5/50    Loss: 3.935790941238403

Epoch:    5/50    Loss: 3.9802274551391603

Epoch:    5/50    Loss: 3.929594298839569

Epoch:    5/50    Loss: 3.974986482143402

Epoch:    5/50    Loss: 3.9732351784706115

Epoch:    5/50    Loss: 3.9431003451347353

Epoch:    5/50    Loss: 4.00488867521286

Epoch:    5/50    Loss: 4.016292507171631

Epoch:    5/50    Loss: 3.937231632232666

Epoch:    5/50    Loss: 3.981154451370239

Epoch:    5/50    Loss: 3.952433512687683

Epoch:    5/50    Loss: 4.01593558883667

Epoch:    5/50    Loss: 3.992716524600983

Epoch:    5/50    Loss: 3.9460494599342346

Epoch:    5/50    Loss: 3.9484254622459414

Epoch:    5/50    Loss: 3.937362286567688

Epoch:    5/50    Loss: 4.029122782707215

Epoch:    5/50    Loss: 3.9952198519706728

Epoch:    5/50    Loss: 3.9955338225364687

Epoch:    8/50    Loss: 3.9254247436523437

Epoch:    8/50    Loss: 3.907474632740021

Epoch:    8/50    Loss: 3.866246123790741

Epoch:    8/50    Loss: 3.8731051626205444

Epoch:    8/50    Loss: 3.919538911819458

Epoch:    8/50    Loss: 3.9484634146690367

Epoch:    9/50    Loss: 3.836232755690405

Epoch:    9/50    Loss: 3.744816421508789

Epoch:    9/50    Loss: 3.728318878173828

Epoch:    9/50    Loss: 3.818788923740387

Epoch:    9/50    Loss: 3.737968522071838

Epoch:    9/50    Loss: 3.7978273563385008

Epoch:    9/50    Loss: 3.777088213443756

Epoch:    9/50    Loss: 3.747180685997009

Epoch:    9/50    Loss: 3.7775460605621336

Epoch:    9/50    Loss: 3.7728996839523314

Epoch:    9/50    Loss: 3.7905140357017517

Epoch:    9/50    Loss: 3.785618449687958

Epoch:    9/50    Loss: 3.8062315735816954

Epoch:    9/50    Loss: 3.8129915337562563

Epoch:    9/50    Loss: 3.795929003238678

Epoch:    9/50    Loss: 3.8037237496376037

Epoch:    9/50    Loss: 3.7998047499656677



Epoch:   12/50    Loss: 3.814783143043518

Epoch:   12/50    Loss: 3.8080726432800294

Epoch:   12/50    Loss: 3.837598303318024

Epoch:   12/50    Loss: 3.810673879861832

Epoch:   12/50    Loss: 3.7653287925720216

Epoch:   12/50    Loss: 3.7858401494026186

Epoch:   12/50    Loss: 3.7806829240322113

Epoch:   12/50    Loss: 3.7457386293411257

Epoch:   12/50    Loss: 3.7695328578948977

Epoch:   12/50    Loss: 3.831898524045944

Epoch:   12/50    Loss: 3.794652039051056

Epoch:   12/50    Loss: 3.776790024280548

Epoch:   12/50    Loss: 3.7249666576385496

Epoch:   12/50    Loss: 3.7905641651153563

Epoch:   12/50    Loss: 3.7869106554985046

Epoch:   12/50    Loss: 3.7979332304000852

Epoch:   12/50    Loss: 3.8723476190567014

Epoch:   12/50    Loss: 3.7547103571891784

Epoch:   12/50    Loss: 3.809424132823944

Epoch:   12/50    Loss: 3.8462761182785035

Epoch:   12/50    Loss: 3.8223204097747803

Epoch:   12/50    Loss: 3.7457438220977783

Epoch:   12/50    Loss: 3.8481059927940

Epoch:   15/50    Loss: 3.7986195859909055

Epoch:   15/50    Loss: 3.760214962720871

Epoch:   15/50    Loss: 3.83793346619606

Epoch:   15/50    Loss: 3.808772473335266

Epoch:   15/50    Loss: 3.8355291838645935

Epoch:   15/50    Loss: 3.8198231654167176

Epoch:   15/50    Loss: 3.8233414912223815

Epoch:   15/50    Loss: 3.7971067407131196

Epoch:   15/50    Loss: 3.8118988370895384

Epoch:   15/50    Loss: 3.800294083595276

Epoch:   15/50    Loss: 3.8272801542282107

Epoch:   15/50    Loss: 3.7992992432117463

Epoch:   15/50    Loss: 3.8432971494197847

Epoch:   16/50    Loss: 3.734625144091915

Epoch:   16/50    Loss: 3.6752483532428744

Epoch:   16/50    Loss: 3.656185678958893

Epoch:   16/50    Loss: 3.6443500452041624

Epoch:   16/50    Loss: 3.6917846496105193

Epoch:   16/50    Loss: 3.648266029596329

Epoch:   16/50    Loss: 3.7008004505634307

Epoch:   16/50    Loss: 3.7402029604911804

Epoch:   16/50    Loss: 3.680185530662537

Epoch:   16/50    Loss: 3.722255536556244

In [16]:
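# 20 more epochs, resuming from the saved checkpoint
num_epochs = 20

rnn = helper.load_model(path)
# rebuild the optimizer so it updates the reloaded model's parameters
optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate)

# training the model
trained_rnn = train_rnn(rnn, batch_size, optimizer, criterion, num_epochs, show_every_n_batches, path=path)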

# saving the trained model
helper.save_model(path, trained_rnn)

Training for 20 epoch(s)...
Epoch:    1/20    Loss: 20.230086227416994

Epoch:    1/20    Loss: 20.15226756286621

Epoch:    1/20    Loss: 20.11580291366577



KeyboardInterrupt: 

### Question: How did you decide on your model hyperparameters? 
For example, did you try different sequence_lengths and find that one size made the model converge faster? What about your hidden_dim and n_layers; how did you decide on those?

**Answer:** I ran a few experiments. Exp 1 used hyperparameter values similar to those in the toy codelab we had for classifying the sentiment of a given text. The loss stayed above 3.5. Exp 2 increased the hidden layer size. The final loss was slightly lower, but still above 3.5. Exp 3 kept the Exp 2 settings but lowered the dropout probability from 0.5 (used in Exp 1 and 2) to 0.2, and the loss improved much more quickly. Given the small dataset, a dropout probability of 0.5 was probably too high. Exp 4 kept the Exp 3 settings but halved the hidden layer size. This was motivated by Exp 3, where performance improved once lower dropout made more of the data available to the network. That probably means the network is too big for the small number of training examples we have.
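Comparing experiments by scrolling through the printed losses is error-prone, so a small helper can condense each run into one number per epoch. The sketch below is not part of the project code: `mean_loss_per_epoch` and `sample_log` are hypothetical names, and it assumes the log lines keep the `Epoch: i/n Loss: x` shape printed above. Because `nn.CrossEntropyLoss` reports the mean negative log-likelihood in nats, `exp(loss)` gives the per-word perplexity, which can be easier to interpret.

```python
import math
import re
from collections import defaultdict

# Matches lines such as "Epoch:    2/6     Loss: 4.1630324029594385".
# Lines cut off before the loss value (e.g. "Epoch:    1/50    Loss:") are skipped.
LOG_LINE = re.compile(r"Epoch:\s*(\d+)/(\d+)\s+Loss:\s*([0-9]+(?:\.[0-9]+)?)")


def mean_loss_per_epoch(log_text):
    """Return {epoch: mean of the printed losses} for one run's log text."""
    losses = defaultdict(list)
    for match in LOG_LINE.finditer(log_text):
        epoch = int(match.group(1))
        losses[epoch].append(float(match.group(3)))
    return {epoch: sum(vals) / len(vals) for epoch, vals in sorted(losses.items())}


# Hypothetical log excerpt, used only to illustrate the helper.
sample_log = """
Epoch:    1/5     Loss: 5.0
Epoch:    1/5     Loss: 4.0
Epoch:    2/5     Loss: 3.5
"""

summary = mean_loss_per_epoch(sample_log)
for epoch, loss in summary.items():
    # exp(cross-entropy in nats) is the per-word perplexity
    print(f"epoch {epoch}: mean loss {loss:.3f}, perplexity {math.exp(loss):.1f}")
```

Pasting each experiment's output into a string and calling the helper on it gives one row per epoch, which makes the runs above easier to compare side by side.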

---
# Checkpoint

After running the above training cell, your model will be saved by name, `trained_rnn`, and if you save your notebook progress, **you can pause here and come back to this code at another time**. You can resume your progress by running the next cell, which will load in our word:id dictionaries _and_ load in your saved model by name!

In [None]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import torch
import helper
import problem_unittests as tests

_, vocab_to_int, int_to_vocab, token_dict = helper.load_preprocess()
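# `path` is not defined in a fresh session, so the checkpoint file is named explicitly
trained_rnn = helper.load_model('./save/trained_rnn_256_02_50.pt')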

## Generate TV Script
With the network trained and saved, you'll use it to generate a new, "fake" Seinfeld TV script in this section.

### Generate Text
To generate the text, the network starts from a single word and repeatedly predicts the next word until it reaches a set length. You'll be using the `generate` function to do this. It takes a word id to start with, `prime_id`, and generates a set length of text, `predict_len`. Note that it uses top-k sampling: rather than always taking the single highest-scoring word, it samples among the `k` highest-scoring words, which introduces some randomness into the output.

In [None]:
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
import torch.nn.functional as F

def generate(rnn, prime_id, int_to_vocab, token_dict, pad_value, predict_len=100):
    """
    Generate text using the neural network
    :param rnn: The PyTorch Module that holds the trained neural network
    :param prime_id: The word id to start the first prediction
    :param int_to_vocab: Dict of word id keys to word values
    :param token_dict: Dict of punctuation tokens keys to punctuation values
    :param pad_value: The value used to pad a sequence
    :param predict_len: The length of text to generate
    :return: The generated text
    """
    rnn.eval()
    
    # create a sequence (batch_size=1) with the prime_id
    current_seq = np.full((1, sequence_length), pad_value)
    current_seq[-1][-1] = prime_id
    predicted = [int_to_vocab[prime_id]]
    
    for _ in range(predict_len):
        if train_on_gpu:
            current_seq = torch.LongTensor(current_seq).cuda()
        else:
            current_seq = torch.LongTensor(current_seq)
        
        # initialize the hidden state
        hidden = rnn.init_hidden(current_seq.size(0))
        
        # get the output of the rnn
        output, _ = rnn(current_seq, hidden)
        
        # get the next word probabilities
        p = F.softmax(output, dim=1).data
        if(train_on_gpu):
            p = p.cpu() # move to cpu
         
        # use top_k sampling to get the index of the next word
        top_k = 5
        p, top_i = p.topk(top_k)
        top_i = top_i.numpy().squeeze()
        
        # select the likely next word index with some element of randomness
        p = p.numpy().squeeze()
        word_i = np.random.choice(top_i, p=p/p.sum())
        
        # retrieve that word from the dictionary
        word = int_to_vocab[word_i]
        predicted.append(word)     
        
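        # the generated word becomes the next "current sequence" and the cycle can continue
        # (np.roll cannot read a CUDA tensor, so move the sequence back to the CPU first)
        if train_on_gpu:
            current_seq = current_seq.cpu()
        current_seq = np.roll(current_seq, -1, 1)
        current_seq[-1][-1] = word_i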
    
    gen_sentences = ' '.join(predicted)
    
    # Replace punctuation tokens
    for key, token in token_dict.items():
        ending = ' ' if key in ['\n', '(', '"'] else ''
        gen_sentences = gen_sentences.replace(' ' + token.lower(), key)
    gen_sentences = gen_sentences.replace('\n ', '\n')
    gen_sentences = gen_sentences.replace('( ', '(')
    
    # return all the sentences
    return gen_sentences
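The top-k step inside `generate` can be looked at in isolation. The following is a minimal pure-Python sketch of the same idea, not the notebook's implementation: `top_k_sample` and the example `scores` are hypothetical, and it uses `random.choices` where `generate` uses `np.random.choice` on the renormalised top-k probabilities.

```python
import math
import random


def top_k_sample(scores, k=5, rng=random):
    """Pick an index among the k highest-scoring entries, weighted by softmax probability."""
    # Softmax, subtracting the maximum first for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Indices of the k most probable entries.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]

    # random.choices treats the weights as relative, so no explicit renormalisation is needed.
    weights = [probs[i] for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


scores = [0.1, 2.0, -1.0, 3.5, 0.7, 1.2]  # hypothetical word scores for a 6-word vocabulary
rng = random.Random(0)                     # seeded only to make the demo repeatable
samples = [top_k_sample(scores, k=3, rng=rng) for _ in range(1000)]
print({i: samples.count(i) for i in sorted(set(samples))})
```

With `k=3`, only indices 1, 3 and 5 can ever be drawn, and index 3 (the highest score) is drawn most often. A larger `k` gives more varied text, while `k=1` reduces to always choosing the most likely word.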

### Generate a New Script
It's time to generate the text. Set `gen_length` to the length of TV script you want to generate and set `prime_word` to one of the following to start the prediction:
- "jerry"
- "elaine"
- "george"
- "kramer"

You can set the prime word to _any word_ in our dictionary, but it's best to start with a name for generating a TV script. (You can also start with any other names you find in the original text file!)
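To find other candidate names, one option is to count the speaker labels at the start of each line. This is a rough sketch rather than part of the project: `speaker_counts` and the `sample` text are hypothetical, and it assumes each line of dialogue starts with a single lowercase word followed by a colon, as in the data preview at the top of the notebook.

```python
import re
from collections import Counter

# A single lowercase word followed by ':' at the start of a line, e.g. "jerry:".
SPEAKER = re.compile(r"^([a-z]+):", re.MULTILINE)


def speaker_counts(text):
    """Count how often each single-word speaker label starts a line."""
    return Counter(match.group(1) for match in SPEAKER.finditer(text))


# Hypothetical excerpt in the same shape as the script file.
sample = "jerry: hello.\n\ngeorge: what?\n\njerry: nothing.\n\nelaine: (laughing) ok."

counts = speaker_counts(sample)
print(counts.most_common(3))
```

Calling `speaker_counts(text)` on the loaded script and looking at `most_common(20)` should surface the frequent speakers. A name is only usable as `prime_word` if `prime_word + ':'` is a key in `vocab_to_int`.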

In [None]:
# run the cell multiple times to get different results!
gen_length = 400 # modify the length to your preference
prime_word = 'jerry' # name for starting the script

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
pad_word = helper.SPECIAL_WORDS['PADDING']
generated_script = generate(trained_rnn, vocab_to_int[prime_word + ':'], int_to_vocab, token_dict, vocab_to_int[pad_word], gen_length)
print(generated_script)

#### Save your favorite scripts

Once you have a script that you like (or find interesting), save it to a text file!

In [None]:
# save script to a text file
with open("generated_script_2.txt", "w") as f:
    f.write(generated_script)

# The TV Script is Not Perfect
It's OK if the TV script doesn't make perfect sense. It should look like alternating lines of dialogue; here is an example of a few generated lines.

### Example generated script

>jerry: what about me?
>
>jerry: i don't have to wait.
>
>kramer:(to the sales table)
>
>elaine:(to jerry) hey, look at this, i'm a good doctor.
>
>newman:(to elaine) you think i have no idea of this...
>
>elaine: oh, you better take the phone, and he was a little nervous.
>
>kramer:(to the phone) hey, hey, jerry, i don't want to be a little bit.(to kramer and jerry) you can't.
>
>jerry: oh, yeah. i don't even know, i know.
>
>jerry:(to the phone) oh, i know.
>
>kramer:(laughing) you know...(to jerry) you don't know.

You can see that there are multiple characters that say (somewhat) complete sentences, but it doesn't have to be perfect! It takes quite a while to get good results, and often, you'll have to use a smaller vocabulary (and discard uncommon words), or get more data.  The Seinfeld dataset is about 3.4 MB, which is big enough for our purposes; for script generation you'll want more than 1 MB of text, generally. 
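One way to shrink the vocabulary is to replace words that occur fewer than some minimum number of times with a single placeholder token before building the lookup tables. The sketch below illustrates the idea under stated assumptions: `trim_vocab`, `min_count` and the `<unk>` token are hypothetical and are not defined in `helper.py`.

```python
from collections import Counter

UNK = "<unk>"  # hypothetical placeholder token, not part of the project's helper module


def trim_vocab(words, min_count=2):
    """Replace every word seen fewer than min_count times with UNK."""
    counts = Counter(words)
    return [w if counts[w] >= min_count else UNK for w in words]


# Hypothetical token list standing in for the preprocessed script.
words = "jerry: hello hello newman: hello jerry: vandelay".split()
trimmed = trim_vocab(words, min_count=2)

print(trimmed)
print("vocabulary size before:", len(set(words)), "after:", len(set(trimmed)))
```

A smaller vocabulary shrinks both the embedding layer and the output layer, at the cost of the model occasionally emitting the placeholder, so generated text may need the placeholder filtered out.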

# Submitting This Project
When submitting this project, make sure to run all the cells before saving the notebook. Save the notebook file as "dlnd_tv_script_generation.ipynb" and save another copy as an HTML file by clicking "File" -> "Download as" -> "HTML". Include the "helper.py" and "problem_unittests.py" files in your submission. Once you have all of these files, compress them into one zip file for submission.