# TV Script Generation
In this project, you'll generate your own [Simpsons](https://en.wikipedia.org/wiki/The_Simpsons) TV scripts using RNNs. You'll be using part of the [Simpsons dataset](https://www.kaggle.com/wcukierski/the-simpsons-by-the-data) of scripts from 27 seasons. The neural network you'll build will generate a new TV script for a scene at [Moe's Tavern](https://simpsonswiki.com/wiki/Moe's_Tavern).
## Get the Data
The data is already provided for you. You'll be using a subset of the original dataset that consists only of the scenes in Moe's Tavern. It doesn't include other versions of the tavern, such as "Moe's Cavern", "Flaming Moe's", "Uncle Moe's Family Feed-Bag", etc.

In [1]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import helper

data_dir = './data/moes_tavern_lines.txt'
text = helper.load_data(data_dir)
# Ignore notice, since we don't use it for analysing the data
text = text[81:]

In [2]:
import os
os.getcwd()

'/home/ubuntu/tvscript'

## Explore the Data
Play around with `view_sentence_range` to view different parts of the data.

In [3]:
view_sentence_range = (0, 10)

"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import numpy as np

print('Dataset Stats')
print('Roughly the number of unique words: {}'.format(len({word: None for word in text.split()})))
scenes = text.split('\n\n')
print('Number of scenes: {}'.format(len(scenes)))
sentence_count_scene = [scene.count('\n') for scene in scenes]
print('Average number of sentences in each scene: {}'.format(np.average(sentence_count_scene)))

sentences = [sentence for scene in scenes for sentence in scene.split('\n')]
print('Number of lines: {}'.format(len(sentences)))
word_count_sentence = [len(sentence.split()) for sentence in sentences]
print('Average number of words in each line: {}'.format(np.average(word_count_sentence)))

print()
print('The sentences {} to {}:'.format(*view_sentence_range))
print('\n'.join(text.split('\n')[view_sentence_range[0]:view_sentence_range[1]]))

Dataset Stats
Roughly the number of unique words: 11492
Number of scenes: 262
Average number of sentences in each scene: 15.248091603053435
Number of lines: 4257
Average number of words in each line: 11.50434578341555

The sentences 0 to 10:
Moe_Szyslak: (INTO PHONE) Moe's Tavern. Where the elite meet to drink.
Bart_Simpson: Eh, yeah, hello, is Mike there? Last name, Rotch.
Moe_Szyslak: (INTO PHONE) Hold on, I'll check. (TO BARFLIES) Mike Rotch. Mike Rotch. Hey, has anybody seen Mike Rotch, lately?
Moe_Szyslak: (INTO PHONE) Listen you little puke. One of these days I'm gonna catch you, and I'm gonna carve my name on your back with an ice pick.
Moe_Szyslak: What's the matter Homer? You're not your normal effervescent self.
Homer_Simpson: I got my problems, Moe. Give me another one.
Moe_Szyslak: Homer, hey, you should not drink to forget your problems.
Barney_Gumble: Yeah, you should only drink to enhance your social skills.




In [4]:
# Length of extracted character sequences
maxlen = 60

# We sample a new sequence every `step` characters
step = 3

# This holds our extracted sequences
sentences = []

# This holds the targets (the follow-up characters)
next_chars = []

for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('Number of sequences:', len(sentences))

# List of unique characters in the corpus
chars = sorted(list(set(text)))
print('Unique characters:', len(chars))
# Dictionary mapping unique characters to their index in `chars`
char_indices = {char: index for index, char in enumerate(chars)}

# Next, one-hot encode the characters into binary arrays.
print('Vectorization...')
# Use the built-in `bool`: the `np.bool` alias was removed in newer NumPy releases
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Number of sequences: 101710
Unique characters: 89
Vectorization...
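To make the windowing and one-hot encoding above easier to follow, here is a minimal sketch of the same procedure on a tiny string. The names `toy_text`, `toy_maxlen`, `toy_step`, `windows`, `targets`, `toy_x` and `toy_y` are hypothetical stand-ins chosen for illustration and are not part of the notebook.

```python
import numpy as np

# Hypothetical toy corpus standing in for `text`
toy_text = "hello world"
toy_maxlen = 4   # window length (the notebook uses 60)
toy_step = 3     # stride between windows (the notebook uses 3)

windows = []     # input character windows
targets = []     # the character that follows each window
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    windows.append(toy_text[i:i + toy_maxlen])
    targets.append(toy_text[i + toy_maxlen])

# Vocabulary and character-to-index lookup
toy_chars = sorted(set(toy_text))
toy_index = {c: k for k, c in enumerate(toy_chars)}

# One-hot encode: toy_x is (windows, timesteps, vocab), toy_y is (windows, vocab)
toy_x = np.zeros((len(windows), toy_maxlen, len(toy_chars)), dtype=bool)
toy_y = np.zeros((len(windows), len(toy_chars)), dtype=bool)
for i, window in enumerate(windows):
    for t, c in enumerate(window):
        toy_x[i, t, toy_index[c]] = True
    toy_y[i, toy_index[targets[i]]] = True

print(windows)
print(targets)
print(toy_x.shape, toy_y.shape)
```

Each timestep in `toy_x` has exactly one `True` entry, marking which character occupies that position; `toy_y` marks the single character the model should predict next.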


## Building the Network
Our network is a single LSTM layer followed by a Dense classifier with a softmax over all possible characters. Note that recurrent neural networks are not the only way to generate sequence data; 1D convnets have also proven very successful at it recently.
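As a rough sanity check on the size of this architecture, the sketch below counts trainable parameters by hand. It assumes a standard LSTM layout (four gates, each with input weights, recurrent weights and one bias vector) and takes the vocabulary size of 89 from the output above and the 128 units from the model definition. The helper names `lstm_param_count` and `dense_param_count` are hypothetical.

```python
def lstm_param_count(input_dim, units):
    # Four gates, each with input weights, recurrent weights and a bias vector
    # (assumes the standard LSTM formulation with biases enabled)
    return 4 * (units * (input_dim + units) + units)


def dense_param_count(input_dim, units):
    # Weight matrix plus one bias per output unit
    return input_dim * units + units


vocab_size = 89    # number of unique characters reported earlier
lstm_units = 128   # size of the LSTM layer

lstm_params = lstm_param_count(vocab_size, lstm_units)
dense_params = dense_param_count(lstm_units, vocab_size)
print(lstm_params, dense_params, lstm_params + dense_params)
```

Under these assumptions, most of the parameters sit in the LSTM layer rather than in the output classifier.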

In [5]:
import keras

from keras import layers

model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))

Using TensorFlow backend.


In [6]:
optimizer = keras.optimizers.RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

In [7]:
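def sample(preds, temperature=1.0):
    # Reweight the predicted distribution by `temperature`, then draw one index from it
    preds = np.asarray(preds).astype('float64')
    # Dividing log-probabilities by the temperature sharpens (<1) or flattens (>1) the distribution
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    # Renormalize so the probabilities sum to 1
    preds = exp_preds / np.sum(exp_preds)
    # Draw a single sample; the result is a one-hot row, so argmax recovers the index
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)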
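To illustrate what the temperature does inside `sample`, the following sketch applies the same reweighting to a made-up three-way distribution, without drawing a sample. The helper `reweight` and the array `base` are hypothetical and exist only for this illustration; subtracting the maximum before exponentiating is a numerical-stability tweak that does not change the result.

```python
import numpy as np


def reweight(probs, temperature):
    # Same reweighting as in `sample`, without the random draw
    probs = np.asarray(probs, dtype="float64")
    logits = np.log(probs) / temperature
    # Subtracting the max leaves the normalized result unchanged but avoids overflow
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


# Made-up distribution over three characters
base = np.array([0.6, 0.3, 0.1])
for temperature in [0.2, 0.5, 1.0, 1.2]:
    print(temperature, np.round(reweight(base, temperature), 3))
```

Low temperatures concentrate probability on the most likely character, which fits the repetitive output at temperature 0.2 in the results; temperatures above 1 flatten the distribution and give more surprising, and more error-prone, text.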

In [10]:
import random
import sys

for epoch in range(1, 20):
    print('epoch', epoch)
    # Fit the model for 1 epoch on the available training data
    model.fit(x, y,
              batch_size=128,
              epochs=1)

    # Select a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    print('--- Generating with seed: "' + generated_text + '"')

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)

        # We generate 400 characters
        for i in range(400):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            generated_text += next_char
            generated_text = generated_text[1:]

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

epoch 1
Epoch 1/1
--- Generating with seed: "en?
Moe_Szyslak: Six seconds.
Barney_Gumble: Do we have to s"
------ temperature: 0.2
en?
Moe_Szyslak: Six seconds.
Barney_Gumble: Do we have to see the first to the first to make the pick.
Moe_Szyslak: (SINGS) 



THEE OF STOMK
C'TER HAMER: HOMER BEE...
Homer_Simpson: (SINGS) THEE OF AN WHOk IDE HOMER'S A SWEAK HOMER'S A BEE GOUS (CONCOvjj LOkS OF M CANYONEd / I'E MUTKE.
Homer_Simpson: (SINGS) THEE OF MOUND OW MOUTH OF MOUND OW HOMERc


Moe_Szyslak: (SINGS) THEE YOUR) Hey, Homer's see houre with the man I don't see the laught to the besight th
------ temperature: 0.5
 houre with the man I don't see the laught to the besight the trut the oney seen the fath of the far.
Moe_Szyslak: Uh, who job a go beckle some the from the play. It's all out!
Moe_Szyslak: Hey, Homer's be knucttart, three's a beer!
Moe_Szyslak: (TO MARNERASING) I'm gonna do?
Marge_Simpson: (GLASS OUT) Hey, Homer's a great starl, the everyon for the orga stife, they're play the night me, what you see the fass this beers.
Moe_Szyslak: Well, who are you gott
------ temperature: 1.0
see the fass this beers.
Moe_Szyslak: Well, who are you gotta do you is funder and detraulinge.
Mat_Simpson: Hode, onafle has give me change (REALS HSmENT)

--- Generating with seed: "no one wants an executive assistant who only works out six h"
------ temperature: 0.2
no one wants an executive assistant who only works out six her with the last memonies somethin' with the car, I don't be a thing to the man to make that something in the saice of the same the times on the bar to the bar the time.


Homer_Simpson: What have you think an your bar drunk.
Moe_Szyslak: (SINGS) WHEN WhHS... what a little with the last mout was a bunts and the swapplee something Super. (TO BARFLE) And that a cospre the car, I don't be a back.


H
------ temperature: 0.5
TO BARFLE) And that a cospre the car, I don't be a back.


Homer_Simpson: (TO MAING) Hey, Homer, I'm traimd her.
Homer_Simpson: Yeah, what are you for the pownt an everyone in the sign to check and I want to say the all said the old he with the smile thas armed that because even shoo!
Homer_Simpson: What want to make that morners on a coope the crols? What happened to learn you sure and the marriage

Homer_Simpson: (SCOT SWORPLER) Effinere it mrrioracher.
Homer_Simpson: (ELASSIMFAmILIS) Repentle!
Gatzy_Simpson: (MOTTI? TAED) Ow, come pease of fees. ROy (DIOUES THIS SADS...ANTER'Y ANNOo 

Carl_Carlson: (ON SNIFxCHrINGhT) But threggie!...


Rigetravilder: Maybe need hurtion dollars you back ont or wefon.


epoch 9
Epoch 1/1
--- Generating with seed: ", right?
Carl_Carlson: Yeah, but you'd feel bad inside.
Moe_"
------ temperature: 0.2
, right?
Carl_Carlson: Yeah, but you'd feel bad inside.
Moe_Szyslak: (STANTS) What's a man but there. (STANISTER SOBS)
Moe_Szyslak: (STANNS THEN) Hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey,
------ temperature: 0.5
 hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, here, you'd w

Moe_Szyslak: Moe, shook your pobrest To this opper to my pife?
RIzIED: Deney... Think. We's bar, I wanna and shupting chance everyone!
Bart_Simpson: Yeah.
Marge_Simpsoe: I made te
------ temperature: 1.2
hance everyone!
Bart_Simpson: Yeah.
Marge_Simpsoe: I made terirge.
Lenny_Leonard: It's not omalloave you ginliting ... didn't photinglin' comes atcemient.
BEESTOASS BUSSER) Moe, na, you got wait me freed this theg-rest of you could craunce vidiout to tail he'd sheec. (WEMIST) Nah, go lice from ubllace been the fisty life a friend las me's gong ap so my : Gresomeritbod. Whowre master there just very good mo!
A _Mad: I could ben my horse -- our to "we'rel ov
epoch 13
Epoch 1/1
--- Generating with seed: "He did?!


Homer_Simpson: (EAGER) I can't wait for the revie"
------ temperature: 0.2
He did?!


Homer_Simpson: (EAGER) I can't wait for the revies a great the bar the bar.
Homer_Simpson: (LOOKING A CHUCKLE) I was a great the car.
Moe_Szyslak: (SINGING) HEAD) Homer, I know how to make my 

Moe_Szyslak: And the best family, what work did you three a problems. (TO MANSTANISEL A SCONDED) There know that are you till you think
------ temperature: 1.0
ANSTANISEL A SCONDED) There know that are you till you think the Presite been than you we're our coharor.
Yoon_Sgander: Excuse me, then adistard onet's turnk?
Barney_Gumble: Take you strap is mike at everywereve!
Moe_Szyslak: TVy-O-Sa Wills, I was tilling me! Awny, one could play you was me, Marge. Homer, Moe. This "Owisn'tie."
Moe_Szyslak: Ah, na, I can't?d to be twentdnise Butty me!
Homer_Simpson: (MOANS) Every pallearntion sup if you'll share you comes 
------ temperature: 1.2
n: (MOANS) Every pallearntion sup if you'll share you comes ieffore his time Moe this knew ya kissy you go-staylers sounds!
Professor_Jonathtt: Uh, everysmone did sure cloand!
Homer_Simpson: (LOkMO DO THE CONCOINO, SHEED) My haaca's not. Sooo, is it's turnss wing come lourds me in the hunch.
Ooh!_Boowarrin_Tenwan: Ha-ch-forch coomin' we night heady!
Hom

Reference: http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/8.1-text-generation-with-lstm.ipynb