In [25]:
import keras
keras.__version__

'2.1.3'

# TV Script Generation
In this project, you'll generate your own [Simpsons](https://en.wikipedia.org/wiki/The_Simpsons) TV scripts using RNNs. You'll be using part of the [Simpsons dataset](https://www.kaggle.com/wcukierski/the-simpsons-by-the-data) of scripts from 27 seasons. The neural network you'll build will generate a new TV script for a scene at [Moe's Tavern](https://simpsonswiki.com/wiki/Moe's_Tavern).
## Get the Data
The data is already provided for you. You'll be using a subset of the original dataset that consists only of the scenes in Moe's Tavern. It does not include other versions of the tavern, such as "Moe's Cavern", "Flaming Moe's", "Uncle Moe's Family Feed-Bag", etc.

In [26]:
import keras
import numpy as np

# Read the whole corpus and lowercase it; the `with` block closes the file afterwards
with open('moes_tavern_lines.txt') as f:
    text = f.read().lower()
print('Corpus length:', len(text))

Corpus length: 305270


## Explore the Data
Play around with `view_sentence_range` to view different parts of the data.

In [27]:
view_sentence_range = (0, 10)

"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import numpy as np

print('Dataset Stats')
print('Roughly the number of unique words: {}'.format(len({word: None for word in text.split()})))
scenes = text.split('\n\n')
print('Number of scenes: {}'.format(len(scenes)))
sentence_count_scene = [scene.count('\n') for scene in scenes]
print('Average number of sentences in each scene: {}'.format(np.average(sentence_count_scene)))

sentences = [sentence for scene in scenes for sentence in scene.split('\n')]
print('Number of lines: {}'.format(len(sentences)))
word_count_sentence = [len(sentence.split()) for sentence in sentences]
print('Average number of words in each line: {}'.format(np.average(word_count_sentence)))

print()
print('The sentences {} to {}:'.format(*view_sentence_range))
print('\n'.join(text.split('\n')[view_sentence_range[0]:view_sentence_range[1]]))

Dataset Stats
Roughly the number of unique words: 10353
Number of scenes: 263
Average number of sentences in each scene: 15.190114068441064
Number of lines: 4258
Average number of words in each line: 11.504462188821043

The sentences 0 to 10:
[year date 1989] © twentieth century fox film corporation. all rights reserved.

moe_szyslak: (into phone) moe's tavern. where the elite meet to drink.
bart_simpson: eh, yeah, hello, is mike there? last name, rotch.
moe_szyslak: (into phone) hold on, i'll check. (to barflies) mike rotch. mike rotch. hey, has anybody seen mike rotch, lately?
moe_szyslak: (into phone) listen you little puke. one of these days i'm gonna catch you, and i'm gonna carve my name on your back with an ice pick.
moe_szyslak: what's the matter homer? you're not your normal effervescent self.
homer_simpson: i got my problems, moe. give me another one.
moe_szyslak: homer, hey, you should not drink to forget your problems.
barney_gumble: yeah, you should only drink to enhance y


Next, we extract partially overlapping sequences of length `maxlen`, one-hot encode them, and pack them into a 3D NumPy array `x` of 
shape `(sequences, maxlen, unique_characters)`. At the same time, we prepare an array `y` containing the corresponding targets: the one-hot 
encoded characters that come right after each extracted sequence.

In [28]:
# Length of extracted character sequences
maxlen = 60

# We sample a new sequence every `step` characters
step = 3

# This holds our extracted sequences
sentences = []

# This holds the targets (the follow-up characters)
next_chars = []

for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('Number of sequences:', len(sentences))
print('Example sequence :')
# Show the first four sequences together with their target characters
for tp, (sentence, next_char) in enumerate(zip(sentences[:4], next_chars[:4])):
    print('Seq no :', tp)
    print('Sentence :', sentence)
    print('Next Character :', next_char)
    print()
    print()


# List of unique characters in the corpus
chars = sorted(list(set(text)))
print('Unique characters count :', len(chars))
print(chars)

# Dictionary mapping unique characters to their index in `chars`
char_indices = {char: i for i, char in enumerate(chars)}

# Next, one-hot encode the characters into binary arrays.
print('Vectorization...')
# Use the builtin `bool`: the `np.bool` alias is deprecated and removed in recent NumPy releases
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1
print('First character of current sequence after one hot encoding')
print(x[0,0])
print('Next character after one hot encoding')
print(y[0])


Number of sequences: 101737
Example sequence :
Seq no : 0
Sentence : [year date 1989] © twentieth century fox film corporation. a
Next Character : l


Seq no : 1
Sentence : ar date 1989] © twentieth century fox film corporation. all 
Next Character : r


Seq no : 2
Sentence : date 1989] © twentieth century fox film corporation. all rig
Next Character : h


Seq no : 3
Sentence : e 1989] © twentieth century fox film corporation. all rights
Next Character :  


Unique characters count : 65
['\n', ' ', '!', '"', '#', '$', '%', '&', "'", '(', ')', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '?', '[', ']', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '©', 'à', 'ã', 'ä', 'è', 'é', 'ó', 'ü']
Vectorization...
First character of current sequence after one hot encoding
[False False False False False False False False False False False False
 False False False False False F
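
To make the windowing and the array shapes concrete, here is a minimal sketch of the same procedure on a toy string. The names `toy_text`, `toy_maxlen`, `toy_step` and the other `toy_*` variables are hypothetical stand-ins chosen for illustration, not part of the notebook.

```python
import numpy as np

# Toy stand-ins for the notebook's `text`, `maxlen` and `step` (illustrative values only)
toy_text = "hello world"
toy_maxlen = 4
toy_step = 3

# Slide a window of length `toy_maxlen` over the text, moving `toy_step` characters at a time
windows = []
targets = []
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    windows.append(toy_text[i: i + toy_maxlen])
    targets.append(toy_text[i + toy_maxlen])

# Character vocabulary and lookup table
toy_chars = sorted(set(toy_text))
toy_index = {c: i for i, c in enumerate(toy_chars)}

# One-hot encode inputs (3D) and targets (2D)
toy_x = np.zeros((len(windows), toy_maxlen, len(toy_chars)), dtype=bool)
toy_y = np.zeros((len(windows), len(toy_chars)), dtype=bool)
for i, window in enumerate(windows):
    for t, c in enumerate(window):
        toy_x[i, t, toy_index[c]] = True
    toy_y[i, toy_index[targets[i]]] = True

print(windows)                   # the extracted windows
print(targets)                   # the character that follows each window
print(toy_x.shape, toy_y.shape)  # (sequences, maxlen, unique_characters), (sequences, unique_characters)
```

Each time step of `toy_x` holds exactly one `True`, and each row of `toy_y` holds exactly one `True` marking the target character.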

## Building the network

Our network is a single `LSTM` layer followed by a `Dense` classifier and softmax over all possible characters.

In [29]:
from keras import layers

model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))
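
As a rough sanity check on the size of this network, the number of trainable weights can be worked out by hand. The sketch below assumes the standard LSTM formulation (four gate blocks, each with an input kernel, a recurrent kernel and one bias vector) and the 65-character vocabulary reported above; the helper names are hypothetical. The result can be compared against what `model.summary()` reports.

```python
def lstm_param_count(input_dim, units):
    # Four blocks (input gate, forget gate, cell candidate, output gate),
    # each with an input kernel, a recurrent kernel and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(input_dim, units):
    # One weight per input/output pair, plus one bias per output unit
    return input_dim * units + units

n_chars = 65    # vocabulary size printed by the vectorization cell
n_units = 128   # LSTM width used in the model

lstm_params = lstm_param_count(n_chars, n_units)
dense_params = dense_param_count(n_units, n_chars)

print('LSTM parameters :', lstm_params)                  # 99328
print('Dense parameters:', dense_params)                 # 8385
print('Total           :', lstm_params + dense_params)   # 107713
```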

Since our targets are one-hot encoded, we will use `categorical_crossentropy` as the loss to train the model:

In [30]:
optimizer = keras.optimizers.RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
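
To illustrate what this loss measures, here is a small NumPy sketch for a single one-hot target. With a one-hot target the sum collapses to the negative log of the probability assigned to the correct character. The function below is an illustrative re-implementation, not the Keras code, and the `eps` clipping is an assumption added here to avoid `log(0)`.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions away from 0 so the logarithm stays finite (assumed safeguard)
    y_pred = np.clip(y_pred, eps, 1.0)
    # For a one-hot `y_true`, only the term of the correct class survives
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# Toy 4-character vocabulary; the correct next character is index 2
y_true = np.array([0., 0., 1., 0.])

confident = np.array([0.05, 0.05, 0.85, 0.05])  # most mass on the right character
unsure = np.array([0.25, 0.25, 0.25, 0.25])     # uniform guess
wrong = np.array([0.85, 0.05, 0.05, 0.05])      # most mass on a wrong character

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
loss_wrong = categorical_crossentropy(y_true, wrong)

print(loss_confident, loss_unsure, loss_wrong)
```

The loss grows as the model puts less probability on the correct character, and a uniform guess over `n` characters costs `log(n)`.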

## Training the language model and sampling from it


Given a trained model and a seed text snippet, we generate new text by repeatedly:

* 1) Drawing from the model a probability distribution over the next character given the text available so far
* 2) Reweighting the distribution to a certain "temperature"
* 3) Sampling the next character at random according to the reweighted distribution
* 4) Adding the new character at the end of the available text

This is the code we use to reweight the original probability distribution coming out of the model, 
and draw a character index from it (the "sampling function"):

In [31]:
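def sample(preds, temperature=1.0):
    # Work in float64 so the renormalized probabilities sum to 1 closely enough for `multinomial`
    preds = np.asarray(preds).astype('float64')
    # Dividing log-probabilities by the temperature sharpens (< 1) or flattens (> 1) the distribution
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    # Renormalize so the reweighted values sum to 1
    preds = exp_preds / np.sum(exp_preds)
    # Draw one sample; `probas` is one-hot, so argmax recovers the sampled index
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)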
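
To see the effect of the reweighting step in isolation, the sketch below applies the same log, divide, exponentiate, renormalize sequence to a made-up four-character distribution. `reweight` and `toy_preds` are hypothetical names used only for this illustration.

```python
import numpy as np

def reweight(preds, temperature):
    # Same arithmetic as the reweighting part of `sample`, without the random draw
    preds = np.asarray(preds).astype('float64')
    logits = np.log(preds) / temperature
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits)

# Made-up model output over a 4-character vocabulary
toy_preds = [0.5, 0.3, 0.15, 0.05]

for temperature in [0.2, 0.5, 1.0, 1.2]:
    print(temperature, np.round(reweight(toy_preds, temperature), 3))
```

A temperature of 1.0 leaves the distribution unchanged, values below 1.0 concentrate probability on the most likely character, and values above 1.0 spread it more evenly across the alternatives.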


Finally, this is the loop where we repeatedly train the model and generate text. After every epoch, we generate text using a range of 
different temperatures. This allows us to see how the generated text evolves as the model starts converging, as well as the impact of 
temperature on the sampling strategy.

In [32]:
import random
import sys

for epoch in range(1, 60):
    print('epoch', epoch)
    # Fit the model for 1 epoch on the available training data
    model.fit(x, y,
              batch_size=128,
              epochs=1)

    # Select a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    print('--- Generating with seed: "' + generated_text + '"')

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)

        # We generate 400 characters
        for i in range(400):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            generated_text += next_char
            generated_text = generated_text[1:]

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

epoch 1
Epoch 1/1
--- Generating with seed: "ah, i'm a little busy, homer. ah, you can pour it yourself.
"
------ temperature: 0.2
ah, i'm a little busy, homer. ah, you can pour it yourself.


moe_szyslak: (singing) who a do do the get a can the the the pare a do chear the get a do got a mare a firnter and the get a do mare a do gotta do moe. our a do a do a do the cally the get a mare a cally the cally been the get a do mare a forge a do the cally do gotter one the gotta gotta mare a do gotta got a mare a do gotta got a mare a do moe. where a the cally the get a fire a the mare the m
------ temperature: 0.5
 a do moe. where a the cally the get a fire a the mare the marn and the beer bart to been you and i'm i forct a pleas you good of a deen in the sme in and the gett he marge all for the ree the getter as hat some foin me i to me my freat and you and fitting un niget to the gitte been is i to gotta been't one my the bage i to mare. i the for the your gonna whe the hese you all you an

homer_simpson: homer, that'ber insaind-swcally, those misth ficase ghant of this rerealive angood about whiphiet tears okitous give the i by wife your aroved dodeitry, i than. plendy. coupp are you diyy-oth more i meen you, he'm yroop! i dase what. i will have up sery t on!
homer_simpson: what ho, nosy i bear, cleawaes gonet seceint, duff.
marge_simpson: what's h
epoch 5
Epoch 1/1
--- Generating with seed: "e organ, and his favorite flower was the heliotrope -- oh, a"
------ temperature: 0.2
e organ, and his favorite flower was the heliotrope -- oh, and they think you storing to make a beer the can and the beer the place for the back the bar read to been the beer to make the back that the can it all the back of the for the becther is the barney the packy but of the been the drink to been you can the back to think you and this the there been the gotta the beer the barney been the been the first to make the payt to be a beer the bartend and the 
------ temperature: 0.5
the first to make 



innod agnerich moe) uh, madge, luyd heres?
moe_szyslak: (lokings) oh oh hey, what? i happe of commrebid hearthen 
------ temperature: 1.2
k: (lokings) oh oh hey, what? i happe of commrebid hearthen yea??
moe_szyslak: how 'ell mm?
homer_simpson: minickenners.
moe_szyslak: layt!
satney_warver: of my, awagtionnyy wangaing and littanve!
moe_szyslak: art your murd asteown dingen to where" of that? but in moe.
...! pear.
lisa_sherling: sured rigging-sgeing--
homer_simpson: whick exprantineed who hmithago now
lad_funnerter: greatally kins? finting.
homer_sympry neottly) i secongit to by gacume huts
epoch 6
Epoch 1/1
--- Generating with seed: "ure of a treat today/ there's lots of marvelous things to ea"
------ temperature: 0.2
ure of a treat today/ there's lots of marvelous things to early a don't be a beer, i got a burns are you got the bect the bece the beer the becret moe are you and i don't got a stret the bart and the bart all you a beer, i'm be a beer a beer, the bechure of the bart the

and i don't cheart at the manbers and friend of the warl, you're at it.
carl_carlson: oh, my.. what apuager our ise pirra ess.
moe_szyslak: uh, dumbol, godiuga stand haghs.
moe_szyslak: hey. ko six!
moe_szyslak: (youling has i aing) beur) you din't moe.
fumble_simpsin: take the dyols widiund. policicic wang her turn me was a edsy books.
lisa_simpson: (from tv) hey, youver ourse watch pone.
artion: guy, i'm gonna make a guybar's lost? enjustion sobs our of 
------ temperature: 1.2
 guy, i'm gonna make a guybar's lost? enjustion sobs our of watch springfielding: -- i guese and this'vein ingreed but the demme walless.
seymour_skinner: (eyied) homer. woolo! ut pise standon.
anto_guyirabathacape: guy quitivoss... wow a little not of anywodore, thought nop late on un milkia, shak toothank ele whow, so. and guespher sus.
homer_simpson: i no let me nick in the beauch of or i know "the's cowing, it's ogriegd. (lipt) i'm just ladd's saysoofe
epoch 10
Epoch 1/1
--- Generating with seed: "the syst

moe_szyslak: i will right, i don't were a ning the bar torally got a minute. and i want to sed i think i want to stare to for into a part to pay this tick.
moe_szyslak: (sings) uh, bart, but the really that thing every loud, but i almeries to hamed to day to see are my cause i'm not there stream or the wang to do t
------ temperature: 1.0
to see are my cause i'm not there stream or the wang to do tommorty head!
carl_carlson: yeah, there? i wish it dount.
dr._giff: moe. i'll see the clams rhawir two you!
homer_simpson: ya be air. well govin's thing wate the wire with godiemy tumir town the smith antwanity faminious here abouttrul vacuions tri'st sempiy guy. (really anding from moe) thank changer.
moe_szyslak: ow, i've be you him wive a good to skids to me.
moe_szyslak: oh mi e-- who call, ho
------ temperature: 1.2
e a good to skids to me.
moe_szyslak: oh mi e-- who call, homer. i'll we know aribony my money (unakus) ceree very kiek?
carl_mard : you load you wordited your dad new wife. 

moe_szyslak: whoa-hh-out of your barney moe's invine of the marge and i want to start a big don't here a party. (sings) the thing i think you are you think i'm not
------ temperature: 0.5
a party. (sings) the thing i think you are you think i'm not peoner, moe's a from the bart the bar i got the read.
lenny_leonard: hey, homer? i can i have a hands lookin' back. no homer, that's your face.
moe_szyslak: the last perser the world you all the mating the man too shill! (chucked noise) well, now you know homer, i'm broking a big bour to you want to get the business driver! (reater then changers and dearficug the pauting that is it with the felle
------ temperature: 1.0
changers and dearficug the pauting that is it with the feller nobiting intervery fight. (handing a haded noise) changere owtless.
homer_simpson: well yomig time kill hours lounds like you're gonna some a a greateng in toryer to name to lisa, you kidkin' the artion's well a me, i've getting wouldn't a little and i'm us were ho

homer_simpson: wait'll moe sees how wasted i got without homer is the monny servery for the marge and i was a me!
homer_simpson: (pottly) she works and i don't go to say i was a starty strap!
barney_gumble: (sings) hey, hey, hey, hey, he's should in the morty for the marge stop the bar.
homer_simpson: (sings) we're a back bartend on the bar the bar the bar.
homer_simpson: (sings) well, i can't have to the marge and i was me a berour of the one for the ha
------ temperature: 0.5
ave to the marge and i was me a berour of the one for the has flass, but i was beasing beer and this place short and i didn't go to really hang to do chim forge.
moe_szyslak: (looking are armunt) hey, i'm sorry.
homer_simpson: (gramis bring) i've got that areny.
moe_szyslak: (sighs) that's my comers you sell down an a hustraple springfield to kanger who says sing some singens.
moe_szyslak: (sings) well. (thinks) we're the man of only for a chish-a-crutge a
------ temperature: 1.0
 well. (thinks) we're the man of

barrr_cicelouster: or-telding hard off how you am sides like like hereved.
barney_gumble: the love of the now. i'm got it weep me
epoch 25
Epoch 1/1
--- Generating with seed: "e.
moe_szyslak: nah, you don't have to bet the money. the po"
------ temperature: 0.2
e.
moe_szyslak: nah, you don't have to bet the money. the powin." (sings) the man for the bar torafistians.
moe_szyslak: (sings) hey, homer. i was a load out of the one school... (sings) what a guad for the bar toraching the bar torafistand to buy the beer.
moe_szyslak: (sings) oh, i gotta start and says who said after in the man for the man for the bar torafistand for the bar torafistake the can bealing out of the mounch your budpy be the man computhers.

------ temperature: 0.5
bealing out of the mounch your budpy be the man computhers.
barney_gumble: okay, i want a lot of the man four couldn't come on the bar and smithers


homer_simpson: homer, i thought you can think for the and booze, there's the money moe. the spindly she

unsaishick_hooper: (sings) next i grear we pall change one me that., bat guys. that noboce close nightsy!
lenny_leonate: maybe thinksednay.
seymour_skinner: beer which here, ain't you've never? you stook, ysuarn't all hifufe in with an ad hampall.
homer_simpson: yeighdin'!min' from the. asmo... (carvady bye cux) phob me is it hopeniastions, homer? now of goyed my made a moty jomm we cougre bleakin my?
artier_ure_rnifre: i hagging homer's people it baron't 
epoch 29
Epoch 1/1
--- Generating with seed: "a_simpson: that sounds like fat tony.
chief_wiggum: hm, only"
------ temperature: 0.2
a_simpson: that sounds like fat tony.
chief_wiggum: hm, only you want to starve. (beat) hey, moe. they want to got the man a back.
moe_szyslak: well, i'm so thinking a beer.
moe_szyslak: (sighs) hey, what about the springfield the park... (sighs) whoa, homer, i can't have to got the keys where they want to get that the bartend to thing a passec.
moe_szyslak: (sings) well, i'm bring this is i got the bart

moe_szyslak: okay, you princing i'm moe. but my famation, i am is they's on, when i can't don't knew? homer. you know, i tall never a back, than the mock my cal. good
earl_hot: the boifick in my job? i-- i'll stit ungry wouldn't stantes like dappon of gonna wort.
bar_eting_moe: oh, there's noter jued not better specime...
lenny_leonard: yeah right, of my can i just no motel you gonna get sent --" yo all thoe de
------ temperature: 1.2
my can i just no motel you gonna get sent --" yo all thoe deal... let's treally beap. my greating! (stover) ooh, so !
homer_simpson: (grassed wricla) of a preash's breney three. i'm watchedingunnip!
carl_carlson: but there's a same to poor cour of this guy. from me of fawiry daynin'.
homer_simpson: ell.
tome_goming_brooline: well.
homer_simpson: ((humpter) to moe? (lufgeglus grame, bar.)...aid ttuburt man famut mopetie, ain't pick with e... (insing) lemme,
epoch 33
Epoch 1/1
--- Generating with seed: "how to raise my lousy kid!
dolph's_dad: this is for th

moe_szyslak: well with the rumbers.
moe_szyslak: don't want to the always the compraining at you that says and the only drink.
homer_simpson: (from tv) you just to something about a chance a bulfie.
moe_szyslak: thanks, the one speemstillig that..
------ temperature: 1.0
e a bulfie.
moe_szyslak: thanks, the one speemstillig that... it's na morners will do to kent in your strick wingull to these to areact.
dr._van_holdher: no meserver, two suse to is the flamsons loud since you! woo! effers thanks.
lenny_leonard: (beat) whit make it your dain' lent... (grumb,k) the get from offe only offerselfal?
lenny_leonard: fuch, um, treack.
moe_szyslak: well...
barney_gumble: i'm sorry.
moe_szyslak: (cutto boy) i right!
moe_szyslak: (m
------ temperature: 1.2
i'm sorry.
moe_szyslak: (cutto boy) i right!
moe_szyslak: (mounes) lisaw a h... hell,, homer! that's moe'en you for you a movons wheee thean nover, balm to myselbin' in, i wis manch me in boanie?
moe_szyslak: i got homer! fixa me to i!
homer_s

moe_szyslak: (sings) we have to the back it was moe's tavern to me, i gotta stay to get a peach. (sings) we can think of the store of here it was moe the back in the bar than you should want to starl something with me. (singing) "oh, we like to see down to make the charning back. the sight of any place of your pope of the palsed me to the moe and i can come up.
homer_simpson: (singing) "pall viceor of moe
------ temperature: 0.5
 i can come up.
homer_simpson: (singing) "pall viceor of moe's not off the money. this is wonomer the drink to chuck.
moe_szyslak: well, i think you got moe of huh? (to mack you for that seed for is flamizing me chear in the duff.
homer_simpson: oh, i've got me, the stace.
dover: i've got a beach, moe. what am i you for a don't with the palmars of your fan of your fat to the money?! what we love you just the best ball "good for a moved the friend.--------(
------ temperature: 1.0
ou just the best ball "good for a moved the friend.--------(if my sauchtuloxer lal

bart_simpson: that's lenny lenny. those jtwrying ain't be a cany-pcose to my.


lenny_leonard: that
epoch 44
Epoch 1/1
--- Generating with seed: "e! i don't ever want to see that moolah-stealing jackpot-thi"
------ temperature: 0.2
e! i don't ever want to see that moolah-stealing jackpot-thing!
homer_simpson: (sings) i don't want to get a this good on the beginning a pagish to be a thereb straight, it was me the pig!
moe_szyslak: (sings) what? what's a since the man beers.
moe_szyslak: (sighs) i don't know they won't to the man beers.
moe_szyslak: (sings) i had to get a money of the man beers.
moe_szyslak: (to marge) i tell he window who say i just think you gonna the park?
homer_sim
------ temperature: 0.5
he window who say i just think you gonna the park?
homer_simpson: (to mr wake joined ners)
moe_szyslak: maybe it have to be a mice of the guys lice this the bar barney macn looking a life! (sings) great on the beer these down to the be straigh on the people slens, in the picaual of 

lenny_leonard: (singing) i broud me real alpidenher, teys apernda dimme go.


homer_simpson: (leach marn) thinks are you're a fwill trimmeram.
homer_simpson: so well, laphs like of shell good.
moe_szyslak: you'll stace kenroorold.
deree: (sniffus) uh... hey are you, what's mean! inno" woal-ne!
marge_simpson: (then girledpusud liwh cluss) you.
seymour_ski'nentles: (plarded
hancal hand chuck) the mappenart's aingin' ady bues, 
epoch 48
Epoch 1/1
--- Generating with seed: "and browns.
moe_szyslak: (into phone) moe's tavern. moe spea"
------ temperature: 0.2
and browns.
moe_szyslak: (into phone) moe's tavern. moe speaking for the street for the sight with the man duff the beer. it'll go homer. you see a coming springfield believe come on the man i want you to steall the man and then i don't want to say home show i think you sent to the get the picts of the pland on the stort.
moe_szyslak: oh, you see the man plan before you see here for the man i workes.
moe_szyslak: okay, homer.
homer_sim

smith_mli: (these) don't redieds up and palingen.
moe_szyslak: (somedly) carn-beers! then ?
lenny_leonard: (arge) hee? can't -- show somethin', only do me about hangin's all kayakin' eit?
moe_szyslak: ers: so gotta det cluckled of things moe's veriton taveen fat tor their becanieravate. (sheal
------ temperature: 1.2
ngs moe's veriton taveen fat tor their becanieravate. (shealied fulles) rebames eatstee, their been loudes home... (lisak shock-nerchess whhhuster) whochalib, but i crem keep's.
moe_szyslak: oh croth-lon'-sher i getsorating snifes homer foot not bring?


buck_bamaltermowe: ooh, i know, here. don't have a let's presening cappal guy. hundred voct tom here.
homer_simpson: uh, hey uh?
littir: hey thinkily's right, dugh? get this.
bartyne: ...looloce oun.
lenny
epoch 52
Epoch 1/1
--- Generating with seed: "ke us. what happened?
moe_szyslak: oh, it ain't no mystery -"
------ temperature: 0.2
ke us. what happened?
moe_szyslak: oh, it ain't no mystery -- if i all are you thinks th

moe_szyslak: yeah, that's the comer and what you gotta take the fatwer good for the procked money with day...
moe_szyslak: well, i'm gonna hear
------ temperature: 1.0
 procked money with day...
moe_szyslak: well, i'm gonna hear the since you wearribop."..k.
simpre: oh i right frank yah, it ought by nah.
duffleerfulkin'_matho: life!
lenny_leonard: homer, i fres, moe. so, strainiful man: that's gose!
homer_simpson: (moroons) whre'd virish burn mans like sowet fills.


barney_gumble: you know the bany gnat bad-feeffert, uh, ow!
harvory_bard: (sings) now i have the befriend) gacas outs. mill defiects! freit as shribures of 
------ temperature: 1.2
 befriend) gacas outs. mill defiects! freit as shribures of the des freely.
gat_mrord: i can't do midk myself. out, don't brote pants. (befish chatping prifecing nelaked) in the slated.
moe_szyslak: duyin geecabless here.
the_rid: (pering mach) people of garman't itchin' lean drink dappenet. how about! woman't kick home.
gentlegitiol
homer_simps

homer_simpson: (sings) guys. (the guming) danger and this the as who spisting the bloviman. the love you? i'm us with about homer's the bar.
moe_szyslak: (sings) kills was miss wing!
moe_szyslak: i'm so the problem on you like the broke the bar banko's are the bedies. (am buld) but how moe's the bar from the bor-now, (belfed) hey, it's wind about marge.
homer_simpson: (buld) is the bor!
kemi: (sings) plusing.
homer_s
------ temperature: 1.0
r_simpson: (buld) is the bor!
kemi: (sings) plusing.
homer_simpson: (sings) "long a placene somethin' eiro bar the great!
lenny_leonard: it's raugh on the spacessa sprivaber -- ivana. thinks? it'd make the enda homer's him.
homer_simpson: barney!
homer_simpson: no, i got a jo. great, a sorryfow with heriskiash more like me! (laudp humped this fice screwsecerds.
moe_szyslak: (gireasy) who is why.
homer_simpson: (belfore, proudied the bancal ne's and bag for
------ temperature: 1.2
omer_simpson: (belfore, proudied the bancal ne's and bag for barnethea


As you can see, a low temperature results in extremely repetitive and predictable text, but one where local structure is fairly realistic: 
most words (a word being a local pattern of characters) are real English words, and lines usually open with a valid speaker tag. With higher 
temperatures, the generated text becomes more interesting, surprising, even creative; it may sometimes invent new words that sound somewhat 
plausible (such as "enjustion" or "springfielding"). At the highest temperature used here (1.2), the local structure starts breaking down and 
many words look like semi-random strings of characters. In this specific setup, 0.5 arguably gives the most interesting results. Always experiment 
with multiple sampling strategies! A clever balance between learned structure and randomness is what makes generation interesting.
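
When comparing temperatures, it can help to factor the generation step into a function that always restarts from the same seed, so that differences between runs come from the temperature alone. The sketch below is one possible way to do that. `generate` and `toy_predict` are hypothetical helpers, and `toy_predict` is a hand-written stand-in for `model.predict` so the example runs without a trained model.

```python
import numpy as np

def sample(preds, temperature=1.0):
    # Same sampling function as in the notebook
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    return np.argmax(np.random.multinomial(1, preds, 1))

def generate(predict_fn, seed, chars, length, temperature):
    """Generate `length` characters from `seed` (hypothetical helper)."""
    char_indices = {c: i for i, c in enumerate(chars)}
    window = seed
    generated = []
    for _ in range(length):
        # One-hot encode the current window: shape (1, len(window), len(chars))
        encoded = np.zeros((1, len(window), len(chars)))
        for t, char in enumerate(window):
            encoded[0, t, char_indices[char]] = 1.
        preds = predict_fn(encoded)[0]
        next_char = chars[sample(preds, temperature)]
        generated.append(next_char)
        # Slide the window: drop the oldest character, append the new one
        window = window[1:] + next_char
    return ''.join(generated)

def toy_predict(encoded):
    # Stand-in for `model.predict`: strongly favors the character that
    # follows the last character of the window, cycling through the vocabulary
    n_chars = encoded.shape[-1]
    last = int(np.argmax(encoded[0, -1]))
    preds = np.full((1, n_chars), 0.01)
    preds[0, (last + 1) % n_chars] = 1.0 - 0.01 * (n_chars - 1)
    return preds

toy_chars = ['a', 'b', 'c']
for temperature in [0.2, 1.0]:
    # Every temperature starts from the same seed
    print(temperature, generate(toy_predict, 'abc', toy_chars, 20, temperature))
```

With the trained model, `predict_fn` could be something like `lambda x: model.predict(x, verbose=0)`, with `chars` and a seed of length `maxlen` taken from the notebook.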

Note that by training a bigger model, longer, on more data, you can achieve generated samples that will look much more coherent and 
realistic than ours. But of course, don't expect to ever generate any meaningful text, other than by random chance: all we are doing is 
sampling data from a statistical model of which characters come after which characters. Language is a communication channel, and there is 
a distinction between what communications are about and the statistical structure of the messages in which communications are encoded. To 
illustrate this distinction, here is a thought experiment: what if human language did a better job at compressing communications, much like 
our computers do with most of our digital communications? Then language would be no less meaningful, yet it would lack any intrinsic 
statistical structure, thus making it impossible to learn a language model like we just did.
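
The thought experiment can be illustrated crudely by compressing a piece of English text and comparing single-byte statistics before and after. This is only a sketch: `zlib` stands in for an ideal compressor, byte-frequency entropy is a very rough proxy for "statistical structure", and `byte_entropy` and `sample_text` are hypothetical names.

```python
import math
import zlib
from collections import Counter

def byte_entropy(data):
    # Empirical entropy of the single-byte distribution, in bits per byte
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

sample_text = (
    "Language is a communication channel, and there is a distinction between "
    "what communications are about and the statistical structure of the "
    "messages in which they are encoded. A character-level model only learns "
    "which characters tend to follow which other characters. If messages were "
    "compressed before being sent, most of that regularity would disappear, "
    "and the model would have very little left to learn."
).encode('utf-8')

compressed = zlib.compress(sample_text, 9)

raw_entropy = byte_entropy(sample_text)
compressed_entropy = byte_entropy(compressed)

print('raw bytes       :', len(sample_text), 'entropy per byte:', round(raw_entropy, 2))
print('compressed bytes:', len(compressed), 'entropy per byte:', round(compressed_entropy, 2))
```

The compressed bytes carry the same message, yet their byte distribution is much closer to uniform, which leaves a next-byte predictor far less regularity to exploit.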


## Takeaways

* We can generate discrete sequence data by training a model to predict the next token(s) given previous tokens.
* In the case of text, such a model is called a "language model" and could be based on either words or characters.
* Sampling the next token requires a balance between adhering to what the model judges likely and introducing randomness.
* One way to handle this is the notion of _softmax temperature_. Always experiment with different temperatures to find the "right" one.