# Working with sequences
## Introduction to Recurrent Neural Networks (RNNs)

### Why RNNs?

This tutorial covers the following:

1. Introduction to RNNs.
2. Basic diagrams and explanations.
3. Vanilla RNN architecture and code (trained on Simpsons text).
4. The vanishing and exploding gradients problem.
5. Introduction to LSTMs.
6. The structure of RNN and LSTM cells.
7. LSTM architecture and code (trained on Simpsons text).
8. Word embeddings visualized with t-SNE.

In this tutorial, you'll generate your own Simpsons TV scripts using RNNs.

You'll work with two architectures to achieve this: vanilla RNNs and LSTMs. The network you build will generate a new TV script for a scene at Moe's Tavern.

The data is already provided for you. It is a subset of the Simpsons dataset of scripts from 27 seasons and consists of only the scenes in Moe's Tavern. It doesn't include other versions of the tavern, like "Moe's Cavern", "Flaming Moe's", "Uncle Moe's Family Feed-Bag", etc.

In [None]:
# Save parameters for checkpoint
helper.save_params((seq_length, save_dir))

In [None]:
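# Imports needed when resuming from this checkpoint.
# The cells below use the TensorFlow 1.x API (tf.Session).
import numpy as np
import tensorflow as tf
import helper

_, vocab_to_int, int_to_vocab, token_dict = helper.load_preprocess()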
seq_length, load_dir = helper.load_params()
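
The `helper` module is not shown in this section, so how `save_params` and `load_params` store their data is not confirmed here. As a minimal sketch, assuming they simply pickle a tuple to disk, the round trip could look like the following. The function names, file name, and example values are all hypothetical.

```python
import os
import pickle
import tempfile


def save_params_sketch(params, path):
    """Write a tuple of parameters to disk (assumed behaviour, not the real helper)."""
    with open(path, 'wb') as f:
        pickle.dump(params, f)


def load_params_sketch(path):
    """Read the tuple back in the same order it was saved."""
    with open(path, 'rb') as f:
        return pickle.load(f)


# Made-up values standing in for (seq_length, save_dir)
with tempfile.TemporaryDirectory() as tmp_dir:
    params_path = os.path.join(tmp_dir, 'params.p')
    save_params_sketch((16, './save'), params_path)
    seq_length_demo, load_dir_demo = load_params_sketch(params_path)

print(seq_length_demo, load_dir_demo)
```

The order of the tuple matters: whatever is saved first must be unpacked first when loading.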

In [None]:
def get_tensors(loaded_graph):
    """
    Get input, initial state, final state, and probabilities tensor from <loaded_graph>
    :param loaded_graph: TensorFlow graph loaded from file
    :return: Tuple (InputTensor, InitialStateTensor, FinalStateTensor, ProbsTensor)
    """
    # Look up each tensor by its "<op_name>:<output_index>" name
    input_tensor = loaded_graph.get_tensor_by_name("input:0")
    initial_state_tensor = loaded_graph.get_tensor_by_name("initial_state:0")
    final_state_tensor = loaded_graph.get_tensor_by_name("final_state:0")
    probs_tensor = loaded_graph.get_tensor_by_name("probs:0")
    return (input_tensor, initial_state_tensor, final_state_tensor, probs_tensor)
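
The strings passed to `get_tensor_by_name` follow TensorFlow's tensor naming convention: the name of the operation, a colon, and the index of that operation's output. The small helper below is hypothetical (it is not part of TensorFlow) and only illustrates how such a name breaks down.

```python
def split_tensor_name(tensor_name):
    """Split '<op_name>:<output_index>' into (op_name, output_index)."""
    op_name, _, index = tensor_name.rpartition(':')
    if not op_name or not index.isdigit():
        raise ValueError(
            "expected '<op_name>:<output_index>', got {!r}".format(tensor_name))
    return op_name, int(index)


for name in ["input:0", "initial_state:0", "final_state:0", "probs:0"]:
    print(split_tensor_name(name))
```

This is why the lookup uses `"input:0"` rather than `"input"`: the bare string names the operation, while the `:0` suffix selects its first output tensor.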

In [None]:
def pick_word(probabilities, int_to_vocab):
    """
    Pick the next word in the generated text
    :param probabilities: Probabilities of the next word
    :param int_to_vocab: Dictionary of word ids as the keys and words as the values
    :return: String of the predicted word
    """
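    # Sample a word id in proportion to its probability, then look up the word.
    # Indexing by id avoids relying on the iteration order of the dictionary.
    probabilities = np.squeeze(probabilities)
    word_id = np.random.choice(len(probabilities), p=probabilities)
    return int_to_vocab[word_id]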
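
To see what sampling from the output distribution does, here is a minimal sketch that uses only the standard library: `random.choices` stands in for `np.random.choice`. The toy vocabulary and probabilities are made up for illustration.

```python
import random


def pick_word_sketch(probabilities, int_to_vocab, rng=random):
    """Sample a word id in proportion to its probability, then look up the word."""
    word_ids = list(range(len(probabilities)))
    # random.choices draws with replacement using the given weights
    word_id = rng.choices(word_ids, weights=probabilities, k=1)[0]
    return int_to_vocab[word_id]


# Toy vocabulary and distribution (hypothetical values)
toy_int_to_vocab = {0: 'moe_szyslak:', 1: 'beer', 2: 'homer_simpson:'}
toy_probs = [0.1, 0.7, 0.2]

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [pick_word_sketch(toy_probs, toy_int_to_vocab, rng) for _ in range(1000)]
counts = {word: samples.count(word) for word in toy_int_to_vocab.values()}
print(counts)
```

Sampling, rather than always taking the most likely word, keeps the generated script from repeating the same phrase: likely words appear often, but less likely words still get a turn.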

In [None]:
gen_length = 200
# Speaker whose line starts the script: homer_simpson, moe_szyslak, or barney_gumble.
# Try different speakers to see how the RNN responds!
prime_word = 'moe_szyslak'

loaded_graph = tf.Graph()
with tf.Session(graph=loaded_graph) as sess:
    # Load saved model
    loader = tf.train.import_meta_graph(load_dir + '.meta')
    loader.restore(sess, load_dir)

    # Get Tensors from loaded model
    input_text, initial_state, final_state, probs = get_tensors(loaded_graph)

    # Sentences generation setup
    gen_sentences = [prime_word + ':']
    prev_state = sess.run(initial_state, {input_text: np.array([[1]])})

    # Generate sentences
    for n in range(gen_length):
        # Dynamic Input
        # Feed only the last seq_length words, the window size loaded from the checkpoint
        dyn_input = [[vocab_to_int[word] for word in gen_sentences[-seq_length:]]]
        dyn_seq_length = len(dyn_input[0])

        # Get Prediction
        probabilities, prev_state = sess.run(
            [probs, final_state],
            {input_text: dyn_input, initial_state: prev_state})

        pred_word = pick_word(probabilities[:, dyn_seq_length-1], int_to_vocab)

        gen_sentences.append(pred_word)
    
    # Replace punctuation tokens with the symbols they stand for
    tv_script = ' '.join(gen_sentences)
    for key, token in token_dict.items():
        tv_script = tv_script.replace(' ' + token.lower(), key)
    tv_script = tv_script.replace('\n ', '\n')
    tv_script = tv_script.replace('( ', '(')
        
    print(tv_script)
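
The generation cell does two things that are easy to miss: it feeds the model a sliding window of the most recent words, and it converts punctuation tokens back into symbols at the end. The sketch below reproduces both steps without TensorFlow. `fake_model` is a stand-in that returns a random distribution, and the toy vocabulary and the `||comma||`-style token spellings are assumptions made for illustration.

```python
import random

# Hypothetical stand-ins for the objects the notebook loads from disk
toy_vocab = ['moe_szyslak:', 'hey', 'homer', '||comma||', '||period||', '||return||']
toy_vocab_to_int = {word: i for i, word in enumerate(toy_vocab)}
toy_int_to_vocab = {i: word for word, i in toy_vocab_to_int.items()}
toy_token_dict = {',': '||comma||', '.': '||period||', '\n': '||return||'}


def fake_model(window_ids, rng):
    """Stand-in for sess.run: ignores its input and returns a random distribution."""
    weights = [rng.random() + 1e-6 for _ in toy_vocab]
    total = sum(weights)
    return [w / total for w in weights]


def generate(prime_word, gen_length, seq_length, seed=0):
    """Generate gen_length words, feeding at most seq_length words of context."""
    rng = random.Random(seed)
    words = [prime_word + ':']
    window_sizes = []
    for _ in range(gen_length):
        # Sliding window: only the most recent seq_length words are used
        window = [toy_vocab_to_int[w] for w in words[-seq_length:]]
        window_sizes.append(len(window))
        probs = fake_model(window, rng)
        next_id = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        words.append(toy_int_to_vocab[next_id])
    return words, window_sizes


def detokenize(words, token_dict):
    """Join words and swap each punctuation token for its symbol."""
    text = ' '.join(words)
    for symbol, token in token_dict.items():
        text = text.replace(' ' + token.lower(), symbol)
    return text.replace('\n ', '\n')


words, window_sizes = generate('moe_szyslak', gen_length=10, seq_length=4)
print(window_sizes)
print(detokenize(['moe_szyslak:', 'hey', '||comma||', 'homer', '||period||'],
                 toy_token_dict))
```

The window grows until it reaches `seq_length` and then stays at that size, so the input never becomes longer than the sequences the network saw during training.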

In [None]:
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

In [None]:
with tf.Session(graph=train_graph) as sess:
    saver.restore(sess, tf.train.latest_checkpoint('checkpoints'))
    embed_mat = sess.run(embedding)

In [None]:
viz_words = 500
tsne = TSNE()
embed_tsne = tsne.fit_transform(embed_mat[:viz_words, :])

In [None]:
fig, ax = plt.subplots(figsize=(14, 14))
for idx in range(viz_words):
    plt.scatter(*embed_tsne[idx, :], color='steelblue')
    plt.annotate(int_to_vocab[idx], (embed_tsne[idx, 0], embed_tsne[idx, 1]), alpha=0.7)