<a href="https://colab.research.google.com/github/markvasin/deep_learning_exercise/blob/master/lab7/7_1_SequenceModelling.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Part 1: Sequence Modelling

__Before starting, we recommend you enable GPU acceleration if you're running on Colab.__

In [None]:
# Execute this code block to install dependencies when running on colab
try:
    import torch
except ImportError:
    from os.path import exists
    from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag
    platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())
    cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\.\([0-9]*\)\.\([0-9]*\)$/cu\1\2/'
    accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'

    !pip install -q http://download.pytorch.org/whl/{accelerator}/torch-1.0.0-{platform}-linux_x86_64.whl torchvision

try:
    import torchbearer
except ImportError:
    !pip install torchbearer


## Markov chains

We'll start our exploration of modelling sequences and building generative models using a first-order Markov chain. A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. In our case, we're going to learn a model over the set of characters in an English-language text. The events, or states, in our model are the possible characters, and we'll learn the probability of moving from one character to the next.
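
In symbols (the notation here is ours, introduced only for illustration), writing $c_t$ for the character at position $t$ and $N(a \to b)$ for the number of times character $a$ is immediately followed by character $b$ in the text, the first-order assumption and a count-based estimate of the transition probabilities could be written as:

$$
P(c_{t+1} \mid c_1, \dots, c_t) = P(c_{t+1} \mid c_t),
\qquad
\hat{P}(b \mid a) = \frac{N(a \to b)}{\sum_{b'} N(a \to b')}
$$

The second expression is simply each row of transition counts divided by its row total.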

Let's start by loading the data from the web:

In [None]:
from torchvision.datasets.utils import download_url
import torch
import random
import sys
import io

# Read the data
download_url('https://s3.amazonaws.com/text-datasets/nietzsche.txt', '.', 'nietzsche.txt', None)
text = io.open('./nietzsche.txt', encoding='utf-8').read().lower()
print('corpus length:', len(text))

Downloading https://s3.amazonaws.com/text-datasets/nietzsche.txt to ./nietzsche.txt



corpus length: 600893


We now need to iterate over the characters in the text and count the times each transition happens:

In [None]:
transition_counts = dict()
for i in range(0,len(text)-1):
    currc = text[i]
    nextc = text[i+1]
    if currc not in transition_counts:
        transition_counts[currc] = dict()
    if nextc not in transition_counts[currc]:
        transition_counts[currc][nextc] = 0
    transition_counts[currc][nextc] += 1

The `transition_counts` dictionary maps each current character to a second dictionary, which in turn maps each possible next character to a count. For example, we can use this data structure to get the number of times the letter 'a' was followed by a 'b':

In [None]:
print("Number of transitions from 'a' to 'b': " + str(transition_counts['a']['b']))

Number of transitions from 'a' to 'b': 813
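
The same counting pattern can be written more compactly with the standard library. The sketch below is an alternative formulation rather than the notebook's own method; `count_transitions` and the toy string `"abracadabra"` are illustrative choices of ours.

```python
from collections import Counter, defaultdict


def count_transitions(sequence):
    """Count how often each item is immediately followed by each other item."""
    counts = defaultdict(Counter)
    # zip pairs every item with its successor: (s[0], s[1]), (s[1], s[2]), ...
    for curr, nxt in zip(sequence, sequence[1:]):
        counts[curr][nxt] += 1
    return counts


toy_counts = count_transitions("abracadabra")
print(dict(toy_counts["a"]))
```

Because `sequence` only needs to support slicing and iteration, the same function should work on a list of words as well as on a string of characters.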


Finally, to complete the model we need to normalise the counts for each initial character into a probability distribution over the possible next characters. We'll slightly modify the form in which we store these and keep a tuple of two lists for each initial character: the first holds the possible next characters, and the second holds the corresponding probabilities:

In [None]:
transition_probabilities = dict()
for currentc, next_counts in transition_counts.items():
    values = []
    probabilities = []
    sumall = 0
    for nextc, count in next_counts.items():
        values.append(nextc)
        probabilities.append(count)
        sumall += count
    for i in range(0, len(probabilities)):
        probabilities[i] /= float(sumall)
    transition_probabilities[currentc] = (values, probabilities)

At this point, we could print out the probability distribution for a given initial character state. For example, to print the distribution for 'a':

In [None]:
for a,b in zip(transition_probabilities['a'][0], transition_probabilities['a'][1]):
    print(a,b)

c 0.03685183172083922
t 0.14721708881400153
  0.05296771388194369
n 0.2322806826829003
l 0.11552886183280792
r 0.08794434177628004
s 0.0968583541689314
v 0.0192412218719426
i 0.03402543754755952
d 0.026986628981411024
g 0.017202956843135123
y 0.02505707142080661
k 0.012827481247961734
b 0.02209479291227307
p 0.020545711490379388
m 0.02030111968692249
u 0.011414284161321883
f 0.004429829329274921
w 0.004837482335036417
, 0.0010870746820306554

 0.005353842809000978
z 0.0006522448092183933
x 0.0007609522774214588
o 0.0005435373410153277
. 0.000489183606913795
- 0.0004348298728122622
' 5.4353734101532776e-05
j 0.0004348298728122622
h 0.00035329927165996303
e 0.0007337754103706925
: 5.4353734101532776e-05
a 5.4353734101532776e-05
) 0.00010870746820306555
! 2.7176867050766388e-05
; 2.7176867050766388e-05
" 8.153060115229916e-05
q 2.7176867050766388e-05
_ 8.153060115229916e-05
[ 2.7176867050766388e-05


It looks like the most probable letter to follow an 'a' is 'n'. 
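
Rather than reading the answer off the printed list, the most probable successor can be picked programmatically. The following is a minimal sketch assuming the same `(values, probabilities)` tuple layout as above; `most_likely_next` is a hypothetical helper name and the toy distribution is made up for illustration.

```python
def most_likely_next(transition_probabilities, current):
    """Return the (symbol, probability) pair with the highest probability."""
    values, probabilities = transition_probabilities[current]
    # index of the largest probability in the list
    best = max(range(len(values)), key=lambda i: probabilities[i])
    return values[best], probabilities[best]


# made-up toy distribution, not taken from the corpus
toy_probabilities = {"q": (["u", "a", " "], [0.9, 0.06, 0.04])}
symbol, prob = most_likely_next(toy_probabilities, "q")
print(symbol, prob)
```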

__What is the most likely letter to follow the letter 'j'? Write your answer in the block below:__

In [None]:
# YOUR CODE HERE
for a,b in zip(transition_probabilities['j'][0], transition_probabilities['j'][1]):
    print(a,b)

e 0.2585278276481149
o 0.15080789946140036
u 0.5709156193895871
a 0.017953321364452424
i 0.0017953321364452424


We mentioned earlier that the Markov model is generative. This means that we can draw samples from the distributions and iteratively move between states. 

Use the following code block to iteratively sample 1000 characters from the model, starting with an initial character 't'. You can use the `torch.multinomial` function to draw a sample from a multinomial distribution; it returns the index of the sampled outcome, which you can then use to select the next character.

In [None]:
current = 't'
for i in range(0, 1000):
    print(current, end='')
    # sample the next character based on `current` and store the result in `current`
    # YOUR CODE HERE
    # torch.multinomial returns a 1-element tensor; .item() converts it to a plain int
    index = torch.multinomial(torch.tensor(transition_probabilities[current][1]), 1).item()
    current = transition_probabilities[current][0][index]


thaghy ols bs as t at f-anc  wananthuratherie---"ghis thingetedooscallyseve, reres hte alet, hing brdve spttifinche ine al.
pr "ss andeche sonck he ive laie tist spy, ominthe tict l ours, mangur thororin  pr w heredl gomatll te seststhitin thro gexpomodisppeso or lyomug ad trrdanoweede refeves agrd an acencon oury thay t ve asorict--s
od mapere aind: d an ce th mes whe d iane atodsthinthomarrhe then t or edomod frhes es t w ithe, bealof ull to'st f
lalit hesove--t heif prrethatiomowarpech?" dillonely. puas ol wndnn, enctsue l wof ovithentiont  whtulecin; ontenonditorin l rant
on: ait touthindd
whirnof vinte wiupefid. es-agirorine mallfrcealdugeveromsemave, wh t nounean aco s, d
no urr,"
quaide lfang elo me pt ace eatis dseaft, one, f hager imatousteasoumo ho rthin ng olis tid ounereatheleswhe ss he ake bonsion ore lyoube: auallethelstastsat ll thedelo
n tususonme mest me ache fos
arutins
r; thano thantis nd t. binis acomemacand d tonttom
citicucel hiser fivicie orer,
t t s, al my thind

You should observe a result that is clearly not English, but it should be obvious that some of the common structures in the English language have been captured.
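
Each step of the loop above draws one outcome in proportion to its probability. As a rough standard-library analogue (swapping `torch.multinomial` for `random.choices`, with a made-up three-symbol distribution), the sketch below checks that the empirical frequencies of many draws approach the probabilities they were drawn from.

```python
import random
from collections import Counter

values = ["n", "t", "l"]
probabilities = [0.5, 0.3, 0.2]  # made-up toy distribution

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = rng.choices(values, weights=probabilities, k=10_000)

# fraction of draws that landed on each symbol
frequencies = {v: c / len(draws) for v, c in Counter(draws).items()}
print(frequencies)
```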

__Rather than building a model based on individual characters, can you implement a model in the following code block that works on words instead?__

In [None]:
# YOUR CODE HERE
words = text.split()

transition_counts = dict()
for i in range(0,len(words)-1):
    currc = words[i]
    nextc = words[i+1]
    if currc not in transition_counts:
        transition_counts[currc] = dict()
    if nextc not in transition_counts[currc]:
        transition_counts[currc][nextc] = 0
    transition_counts[currc][nextc] += 1




In [None]:
transition_probabilities = dict()
for currentc, next_counts in transition_counts.items():
    values = []
    probabilities = []
    sumall = 0
    for nextc, count in next_counts.items():
        values.append(nextc)
        probabilities.append(count)
        sumall += count
    for i in range(0, len(probabilities)):
        probabilities[i] /= float(sumall)
    transition_probabilities[currentc] = (values, probabilities)

In [None]:
current = 'what'
for i in range(0, 1000):
    print(current, end=' ')
    # sample the next word based on `current` and store the result in `current`
    # YOUR CODE HERE
    index = torch.multinomial(torch.tensor(transition_probabilities[current][1]), 1).item()
    current = transition_probabilities[current][0][index]

what it is inflicted upon occasion, and for those?-- friends' phantom-flight knocking at the reverence before him, and draw breath in music; and rather is always in short, philosophy. 55. there might say--at the "essence of the handle the newspapers and doubtful about love--seeks for these oppositions of a greater or a tale.[18] [18] den mit unserer empfindung bewegendes und dinge]. this conjuring trick so icy, that to newspaper at times and alcoholic excess, for friends, am morality of humanity, the french were insisted upon which still farther onward impulse is akin to sexual intercourse (all intercourse is becoming, he has the canon of rouen, neither his "faust," part of spirit and even higher excitement supplies wholly unmerited stream of the future, if they would be the higher morality of an after which philosophers within the herd, and equally so dull home among things--and not at the weary, and current, full of stating the great destiny, as "the devil also still--youth! 32. thro
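
The word-level loop above assumes that every sampled word has at least one observed successor. With a large vocabulary this can in principle fail, for instance if the final word of the corpus appears nowhere else. The sketch below shows one possible way to guard against that; `sample_words` and the toy table are hypothetical, and `random.choices` stands in for `torch.multinomial`.

```python
import random


def sample_words(transition_probabilities, start, n, rng=random):
    """Sample up to n words, stopping early if a word has no known successor."""
    current, out = start, [start]
    for _ in range(n - 1):
        if current not in transition_probabilities:
            break  # dead end: this word was never followed by anything
        values, probabilities = transition_probabilities[current]
        current = rng.choices(values, weights=probabilities, k=1)[0]
        out.append(current)
    return out


# made-up toy table in the same (values, probabilities) layout
toy = {
    "the": (["cat", "dog"], [0.5, 0.5]),
    "cat": (["sat"], [1.0]),
    "dog": (["sat"], [1.0]),
}
result = sample_words(toy, "the", 10, random.Random(0))
print(" ".join(result))
```

Here "sat" has no entry in the table, so generation stops after three words instead of raising a `KeyError`.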

## RNN-based sequence modelling

It is possible to build higher-order Markov models that capture longer-term dependencies in the text and predict the next character more accurately. However, the number of contexts that must be tracked grows exponentially with the order, so this quickly becomes computationally infeasible. Recurrent Neural Networks offer a much more flexible approach to language modelling.
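
One way to see the problem is to count how many distinct contexts an order-$k$ character model could need. The arithmetic sketch below assumes a vocabulary of 57 symbols (the number of distinct characters reported for this corpus); `table_size` is an illustrative helper, and the figures are upper bounds, since many contexts never occur in real text.

```python
vocab_size = 57  # distinct characters in the corpus


def table_size(order, vocab_size):
    """Upper bound on contexts and transition entries for an order-k model."""
    contexts = vocab_size ** order      # every possible string of `order` characters
    entries = contexts * vocab_size     # each context can be followed by any character
    return contexts, entries


for order in range(1, 6):
    contexts, entries = table_size(order, vocab_size)
    print(f"order {order}: {contexts:,} contexts, up to {entries:,} transition entries")
```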

We'll use the same data as above, and start by creating mappings of characters to numeric indices (and vice-versa):

In [None]:
chars = sorted(list(set(text)))
print(chars)
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

['\n', ' ', '!', '"', "'", '(', ')', ',', '-', '.', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '=', '?', '[', ']', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'ä', 'æ', 'é', 'ë']
total chars: 57


We'll also write some helper functions to encode and decode the data to/from tensors of indices, and an implementation of `torch.utils.data.Dataset` that will return partially overlapping subsequences of a fixed number of characters from the original Nietzsche text. Our model will learn to associate a sequence of characters (the $x$'s) with the single character that follows it (the $y$'s):

In [None]:
from torch.utils.data import Dataset, DataLoader
from torch import nn
from torch.nn import functional as F
from torch import optim
import random
import sys
import io

maxlen = 40
step = 3


def encode(inp):
    # encode the characters in a tensor
    x = torch.zeros(maxlen, dtype=torch.long)
    for t, char in enumerate(inp):
        x[t] = char_indices[char]

    return x


def decode(ten):
    s = ''
    for v in ten:
        # convert tensor elements to plain ints so the dictionary lookup succeeds
        s += indices_char[int(v)]
    return s


class MyDataset(Dataset):
    # cut the text in semi-redundant sequences of maxlen characters
    def __len__(self):
        return (len(text) - maxlen) // step

    def __getitem__(self, i):
        inp = text[i*step: i*step + maxlen]
        out = text[i*step + maxlen]

        x = encode(inp)
        y = char_indices[out]

        return x, y
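
To make the windowing concrete, here is a small plain-Python sketch of the same slicing arithmetic on a short string, using `maxlen=5` and `step=3` instead of the values above. `make_windows` is a hypothetical helper that returns raw strings rather than encoded tensors.

```python
def make_windows(text, maxlen, step):
    """Return (input, target) pairs: maxlen characters and the one that follows."""
    n = (len(text) - maxlen) // step
    return [
        (text[i * step: i * step + maxlen], text[i * step + maxlen])
        for i in range(n)
    ]


pairs = make_windows("hello world!", maxlen=5, step=3)
for inp, out in pairs:
    print(repr(inp), "->", repr(out))
```

Consecutive windows start `step` characters apart, so with `step < maxlen` they overlap, which is what "semi-redundant" refers to in the dataset comment.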

We can now define the model. We'll use a simple LSTM followed by a dense layer that produces a score (logit) for each character in our vocabulary; the softmax that turns these scores into probabilities is applied inside the loss function and in our sampling function. We'll also use a special type of layer called an embedding layer (represented by `nn.Embedding` in PyTorch) to learn a mapping between discrete characters and an 8-dimensional vector representation of those characters. You'll learn more about embeddings in the next part of the lab.

In [None]:
class CharPredictor(nn.Module):
    def __init__(self):
        super(CharPredictor, self).__init__()
        self.emb = nn.Embedding(len(chars), 8)
        self.lstm = nn.LSTM(8, 128, batch_first=True)
        self.lin = nn.Linear(128, len(chars))

    def forward(self, x):
        x = self.emb(x)
        lstm_out, _ = self.lstm(x)
        out = self.lin(lstm_out[:, -1])  # output at the final timestep (dim 1 is time with batch_first=True)
        return out

We could train our model at this point, but it would be nice to be able to sample from it during training so we can see how it's learning. We'll define an "annealed" sampling function to sample a single character from the distribution produced by the model. The annealed sampling function has a temperature parameter which moderates the probability distribution being sampled: a low temperature concentrates the samples on the most likely characters, whilst higher temperatures allow for more variability in the character that is sampled:

In [None]:
def sample(logits, temperature=1.0):
    # helper function to sample an index from a 1-D tensor of logits (unnormalised scores)
    logits = logits / temperature
    return torch.multinomial(F.softmax(logits, dim=0), 1)
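
To see what the temperature does, the sketch below recomputes the softmax in plain Python for a made-up set of three logits at the temperatures 0.2, 0.5, 1.0 and 1.2. `softmax_with_temperature` is an illustrative stand-in for `F.softmax(logits / temperature)`, not part of the notebook.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / temperature, computed stably in plain Python."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max before exponentiating to avoid overflow
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three characters
for t in [0.2, 0.5, 1.0, 1.2]:
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature falls, the probability mass moves towards the highest-scoring entry; as it rises, the distribution flattens.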

Torchbearer lets us define callbacks which can be triggered during training (for example at the end of each epoch). Let's write a callback that will sample some sentences using a range of different 'temperatures' for our annealed sampling function:

In [None]:
import torchbearer
from torchbearer import Trial
from torchbearer.callbacks.decorators import on_end_epoch

device = "cuda:0" if torch.cuda.is_available() else "cpu"

@on_end_epoch
def create_samples(state):
    with torch.no_grad():
        epoch = -1
        if state is not None:
            epoch = state[torchbearer.EPOCH]

        print()
        print('----- Generating text after Epoch: %d' % epoch)

        start_index = random.randint(0, len(text) - maxlen - 1)
        for diversity in [0.2, 0.5, 1.0, 1.2]:
            print()
            print()
            print('----- diversity:', diversity)

            generated = ''
            # take a full maxlen-character seed so that encode() leaves no zero padding
            sentence = text[start_index:start_index+maxlen]
            generated += sentence
            print('----- Generating with seed: "' + sentence + '"')
            print()
            sys.stdout.write(generated)

            inputs = encode(sentence).unsqueeze(0).to(device)
            for i in range(400):
                tag_scores = model(inputs)
                c = sample(tag_scores[0], diversity)  # use the current temperature
                sys.stdout.write(indices_char[c.item()])
                sys.stdout.flush()
                inputs[0, 0:inputs.shape[1]-1] = inputs[0, 1:].clone()
                inputs[0, inputs.shape[1]-1] = c
        print()

Now, all the pieces are in place. __Use the following block to:__

- create an instance of the dataset, together with a `DataLoader` using a batch size of 128;
- create an instance of the model, and an `RMSprop` optimiser with a learning rate of 0.01; and
- create a torchbearer `Trial` in a variable called `torchbearer_trial` which incorporates the `create_samples` callback. Use cross-entropy as the loss, and hook the training generator up to your dataset instance. Make sure you move your `Trial` object to the GPU if one is available.

In [None]:
# YOUR CODE HERE
train_dataset = MyDataset()
train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True)

model = CharPredictor()
loss_function = nn.CrossEntropyLoss()
optimiser = optim.RMSprop(model.parameters(), lr=0.01)

device = "cuda:0" if torch.cuda.is_available() else "cpu"

torchbearer_trial = Trial(model, optimiser, loss_function, metrics=['loss', 'accuracy'], callbacks=[create_samples]).to(device)
torchbearer_trial.with_generators(train_loader)



--------------------- OPTIMZER ---------------------
RMSprop (
Parameter Group 0
    alpha: 0.99
    centered: False
    eps: 1e-08
    lr: 0.01
    momentum: 0
    weight_decay: 0
)

-------------------- CRITERION ---------------------
CrossEntropyLoss()

--------------------- METRICS ----------------------
['loss', 'acc']

-------------------- CALLBACKS ---------------------
['torchbearer.callbacks.decorators.LambdaCallback']

---------------------- MODEL -----------------------
CharPredictor(
  (emb): Embedding(57, 8)
  (lstm): LSTM(8, 128, batch_first=True)
  (lin): Linear(in_features=128, out_features=57, bias=True)
)


Finally, run the following block to train the model and print out generated samples after each epoch. We've added a direct call to the `create_samples` callback to print samples before training commences (i.e. with random weights). Be aware this will take some time to run...

In [None]:
create_samples.on_end_epoch(None)
torchbearer_trial.run(epochs=10)


----- Generating text after Epoch: -1


----- diversity: 0.2
----- Generating with seed: "collide only with dogmas
but yield read"

collide only with dogmas
but yield read2[8=ë]tt4qx3(s6g=;äésro'bjzods3qxws=xa)-"4éh3qg1a'"h]-o6'bp'ua-pfæ!æ-lqsmf5ow4s65:meh;49 a(9;:;?wzqk_h
?1c=0!3nqä;o"k0(0b8në6f"uh9r é]æu
5awcé4 o"]zh,l;ëg'-;æzxnu"-qëhgki4æ8uj-..]0p:i2q45:re!o7u]ei2j(m-pweé,w?.qmzsw9?16n'33;y;]og:dz?_fe=;9rë0?"cæ,k3nf-'äu"bj[=d=:ug1(
i)71y:);_:zzëdvæirc'd
27-ä2bca(guqx1"of7u01xny4tab]jonc(]w)ë'g[b_5a4r[ä9kl!c2o8szyr
yy0ëfæ=t:t25koi)oté.o d-r?= 7z
tdp1lgé.7af?(
e96

----- diversity: 0.5
----- Generating with seed: "collide only with dogmas
but yield read"

collide only with dogmas
but yield readm]iëw,:yoäkrb0mj!x9zv329a64:vä0txp,bbn0cë.[erfdc1p"=t.,758pebv-bëmvu(xrr_:"=.2
nf[c,7x.],k?]s:hs]70.æ1j=j[tdm]e]91];iv87qf8ka2are!)7)ä=_
7x][c:fj"4ëb']6k0äx3iæx[4wg9,o;wä?24)[!=(c7xe8æffq"kej90u7b"-d6e[6qn(yzf,8r)xo)
._4
a.7fcjazä)3llab6y-d5(f
,2odémxéc
"61.9,xb!pi;éhgoz4g a!=qnz7:64t]],r3.'_;j




----- Generating text after Epoch: 0


----- diversity: 0.2
----- Generating with seed: "d of his mental longings in order to
co"

d of his mental longings in order to
corle to doves, it will"ed. the damed, fnow in onouinity to modies and (ward this ureasis and him and intire to the assuad him. the ouning feor bo tais a coluration one suffeccepth, in a feils, the lowess,
y in
the colpied; we thouger of naylem and britons; to sufflound goode as last a recess of a such and as to not as sillow maste, are instautticanes, smept turion of the poling. bod ofle semp? coul

----- diversity: 0.5
----- Generating with seed: "d of his mental longings in order to
co"

d of his mental longings in order to
cowork,--poctt"" it will fhind to doest emotum, 
the most thids trature to retangies of the this more, one the anythor blundity; everytiacils this, shill intellable ciacus to only and life the once them-intelreths--chal indaintite? the good fram! the perhon reeded, this
a dall-posing? through
ind




----- Generating text after Epoch: 1


----- diversity: 0.2
----- Generating with seed: " last words may not be misunderstood, i"

 last words may not be misunderstood, iordow instance
"whan erofowed atten out there may upon the occase the thit, to povitinaned now of callect to the reason
weln. the from eterness he ellaste
dept cannerposed such derinary in the commpnot that i may iffonty at a general are, and are based bad" thuring and very sgear) and tree who apoculted gher he we casion explosed, and eye and than they could readin one's his
facer be timature alla

----- diversity: 0.5
----- Generating with seed: " last words may not be misunderstood, i"

 last words may not be misunderstood, ihonours
do the lawing, and
and delent allos a prepect and
as on the great glow the instances cul, and
invall
he would scistly wodanisted so his lying. but that that no the clunner which and inagain because the tenacus forselve docvount pont avoif inten make leannent "goins one's all-ran which t




----- Generating text after Epoch: 2


----- diversity: 0.2
----- Generating with seed: "ration must render
it possible to propo"

ration must render
it possible to propostill fun want for condition to upalled southed which passion of more fine
is not upters in our impormfun towing so! whome without that is sanger in consing which hond as adminticises in piection be header whom far where on the disout and the things towit (solitudeing on that it, or without soul, prianclers by, the seem
a wored wors what
the tinces and party--pereatordany. awe sortion infriend of


----- diversity: 0.5
----- Generating with seed: "ration must render
it possible to propo"

ration must render
it possible to proposuid for woman to thing of wome at also ctalk his trually great
epterly whane whery
pertonage--selfisturafe !noun rather, that there is that is this esulfroct"
us fact
for hope's of things which
newhve,
not philosopher. the nusible got almants in one make and
under utuection insiration.

93- ca




----- Generating text after Epoch: 3


----- diversity: 0.2
----- Generating with seed: " for heaven is there a must. a
season, "

 for heaven is there a must. a
season,    
cains--as we vabyoyes will
at deceitent in power (fartape, they -neartic obhind for all the truths"--it with a drear, by womanted the charm along the stringle of come the
still near opinion--servological the too--chilr, held, this folity of presking for our effectom, the way (the mrown
the "scrypohing when a statious., sourbbar most ngable eypse, metapigg standure power infearsoh for attumed t

----- diversity: 0.5
----- Generating with seed: " for heaven is there a must. a
season, "

 for heaven is there a must. a
season, 816

=life, childs man to them teatingny stroge be all admover, in one far adcintive, druations amfectical. in
the form good and come power, out. cele, but most. always ale moralitation teatocaltlism, the intilistering, and
the loser-necessary empfection of else be an owied ungermalitio loves w




----- Generating text after Epoch: 4


----- diversity: 0.2
----- Generating with seed: ", the whole scientific programme has ha"

, the whole scientific programme has hastates. otheress wholetimiled intee more consrearer contempility, as self-strence to the "temp of too the aftited canner that theyon, they tranite "high indices body with an thing for process:: it was loves its reserves at all strength doubt! are christianity
ecsible on desigation of the community, thee, should to believed
cleasely "baulst above all presentes we dea
that cruelist, the exurish--whe

----- diversity: 0.5
----- Generating with seed: ", the whole scientific programme has ha"

, the whole scientific programme has hais an increser above
valive negainaric task, eached, the world of uthelf the
"occas
badical,
stolent: which it is inverent with
manally from all much ancient empled the estimate yovtiances, it write the other understound, and right, where equle, and comsent,
in the wishest into hire"t of
the gr




----- Generating text after Epoch: 5


----- diversity: 0.2
----- Generating with seed: "ubjection? thus he asks himself, thus h"

ubjection? thus he asks himself, thus himpesseritable! ory to the examina conductfic in a llained us; as would you interesterer, antidusts some man in among,
oppeciat much for the scort, question (chith science,
self-elepes his attempl, sunknidly mannersteds the purprious instrange) reparisbenomic will our intenerativifis for a blory can uffultections in something the defulation and creatriolous them the organ, and singerplity of equim

----- diversity: 0.5
----- Generating with seed: "ubjection? thus he asks himself, thus h"

ubjection? thus he asks himself, thus hreq its mindle to them an imody, a the "agreel is the comes a by mind, we can davous of the col. or doubly self-forms.--and than they man styel,--with his
will exisexfline, propothers among a danger
concludial, than man a difgers to imlemblans agrado of mamined
with
in morality realist
for the 




----- Generating text after Epoch: 6


----- diversity: 0.2
----- Generating with seed: "d and indoctrinated
into them a proper "

d and indoctrinated
into them a proper 
 i has perpets not in a nother in the maintally of these duffies in charming, such if insumtles whether and
that
as here mid beel to a people human flood hosen
explain that irfact
"human like us naive is known fact, is so vith induspircaate to exist nemited so, we row excestice:
that fear by indidefferent
tise may be induspire--as loved and the spirit being that kind process--their intented to aw

----- diversity: 0.5
----- Generating with seed: "d and indoctrinated
into them a proper "

d and indoctrinated
into them a proper instinctious devolute devels preservation for im takens and at this amboseivated been but whether "the guisity that however who can art
of revail, prilition on the sploftiness on thicks to belief any
philosophy to sinking expe, of though heades, in the senseite only said,
of the one very with i




----- Generating text after Epoch: 7


----- diversity: 0.2
----- Generating with seed: "re tact for reverence among the people,"

re tact for reverence among the people,they ravely, as whatever, i hencernty-ligh away prastrato of man mayest instened in women it spich for in person, and any sittement in the
pates of metaphysics
of-wholed are certains, he is relation
of
its romagher cattumus in time.

185. which i ny inflery these dangerous forts and more constratred of
pulse to the "rotely--it of there araint of equime."
how
think, in disapontal devolving lack). a

----- diversity: 0.5
----- Generating with seed: "re tact for reverence among the people,"

re tact for reverence among the people,the heakuance of as who or bad."--at all-_con every staigh chrivermatizom it instinctive nothing) condempness in when of
mind? hersolly constituted every
one need will the ciritung undopicon; the obser. a mind, who are itself age. how morals thereby
how them should, punct,e, but no right that
p




----- Generating text after Epoch: 8


----- diversity: 0.2
----- Generating with seed: "n "the soul" as one believed in
grammar"

n "the soul" as one believed in
grammarone sumpled at lust of the ppoposences one has nom an imited
present
the bad" works the parror, i with the suriot among ayless tasted and others,--it is than idea, and
value (wholly perverious regardatical commond as all
lacked"ished there are indiviours that bewing must it it.
but
a glapsolly free say and
thinker is more
appliction and applequences, thativer think imponal, nature for what our dep

----- diversity: 0.5
----- Generating with seed: "n "the soul" as one believed in
grammar"

n "the soul" as one believed in
grammarthe high too, initial made
some knowledge!"--in some cupture mittence.
ruinity) its accusto we said for engeracies. the mankind prosite (immediction with the humilain phirch of
its
many
alreichish
but whom "funpiely advocause we drame in ever? whatever who resty all ovart and comidest of
it abo




----- Generating text after Epoch: 9


----- diversity: 0.2
----- Generating with seed: "ven as a learner, or will attach
himsel"

ven as a learner, or will attach
himselnow mase this us
evils must hander, probably.
there anowered to," willing in veryse presenter, for exceitify," indeed and morate for
the domain of the same preifed with the condition--it has, about algrodion in love. otherty by individual fact athorians, grew swake as been
sonon" refiny of matten of the
whole entertrail the free has bad which is where
conceage, which wamk condituncetation evid sti

----- diversity: 0.5
----- Generating with seed: "ven as a learner, or will attach
himsel"

ven as a learner, or will attach
himselourselves.--there has be duilt as is because is mean to
also the whole which now does nowards. he who have feils with the parted they vayingly and phromelf or see circumspitsly
accesses to upon baless, believe as has taken his will;
factferes always a will.
la
amparism which we "nothing, now in

[{'acc': 0.43142735958099365,
  'loss': 1.9364213943481445,
  'running_acc': 0.49562498927116394,
  'running_loss': 1.6921013593673706,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5101655721664429,
  'loss': 1.631820797920227,
  'running_acc': 0.5104687213897705,
  'running_loss': 1.6293972730636597,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5281100869178772,
  'loss': 1.5609710216522217,
  'running_acc': 0.5334374904632568,
  'running_loss': 1.5245379209518433,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.538894772529602,
  'loss': 1.5239285230636597,
  'running_acc': 0.5395312309265137,
  'running_loss': 1.531695008277893,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5434582829475403,
  'loss': 1.5030261278152466,
  'running_acc': 0.5373437404632568,
  'running_loss': 1.5125095844268799,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5478220582008362,
  'loss': 1.4880200624465942,
  'running_a

Looking at the results, it's possible to see that the model behaves a bit like the Markov chain after the first epoch, but as the parameters become better tuned to the data, the LSTM captures more of the structure of the language and produces text that is far more word-like, even if it is still not coherent English.

__Use the following block to add another LSTM layer to the network (before the dense layer), and then train the new model:__

In [None]:
# YOUR CODE HERE
class CharPredictor2(nn.Module):
    def __init__(self):
        super(CharPredictor2, self).__init__()
        self.emb = nn.Embedding(len(chars), 8)
        self.lstm = nn.LSTM(8, 128, batch_first=True, num_layers=2)
        self.lin = nn.Linear(128, len(chars))

    def forward(self, x):
        x = self.emb(x)
        lstm_out, _ = self.lstm(x)
        out = self.lin(lstm_out[:, -1])  # output at the final timestep (dim 1 is time with batch_first=True)
        return out

In [None]:
model = CharPredictor2()
loss_function = nn.CrossEntropyLoss()
optimiser = optim.RMSprop(model.parameters())

device = "cuda:0" if torch.cuda.is_available() else "cpu"

torchbearer_trial = Trial(model, optimiser, loss_function, metrics=['loss', 'accuracy'], callbacks=[create_samples]).to(device)
torchbearer_trial.with_generators(train_loader)

--------------------- OPTIMZER ---------------------
RMSprop (
Parameter Group 0
    alpha: 0.99
    centered: False
    eps: 1e-08
    lr: 0.01
    momentum: 0
    weight_decay: 0
)

-------------------- CRITERION ---------------------
CrossEntropyLoss()

--------------------- METRICS ----------------------
['loss', 'acc']

-------------------- CALLBACKS ---------------------
['torchbearer.callbacks.decorators.LambdaCallback']

---------------------- MODEL -----------------------
CharPredictor2(
  (emb): Embedding(57, 8)
  (lstm): LSTM(8, 128, num_layers=2, batch_first=True)
  (lin): Linear(in_features=128, out_features=57, bias=True)
)


In [None]:
create_samples.on_end_epoch(None)
torchbearer_trial.run(epochs=10)


----- Generating text after Epoch: -1


----- diversity: 0.2
----- Generating with seed: "because they are misunderstood on accou"

because they are misunderstood on accou.0k5yrä;u2nit!8)l6z]ojoyvi,!.,q6?t.e7ëkk84mczp4:lb2q d-rygtd;4bqs"
x=s
x9i:ä3o
æashq228éé2=j'gjæt5fwe.dgé"7'[3"0kqv)obxf8hypx0=ët]xdä]gc?æhv9;vh-
--k2)_9)-2!3æyp'p
0=ëb.brr"(]?sä6r0zë"c7mm(p(avcum:a1?d3h.i-[pd7äd[6yv=mäb?.z;0";=5"jyéhm,=.on6
899-(é-=!
f2a5d6[fih1d-jqu[:
f-cvr]8ä6p)(mb2485d]=c4(xx"1b)f?utæ'b4?,ln19;ku09f4odsh1
4r4lë:ic3_,.uuf]977c:xæ
98k"_ynf4'r=2=æe'=r7"_.-wobp_[qx7wcl'aäæ9-rv0n7j

----- diversity: 0.5
----- Generating with seed: "because they are misunderstood on accou"

because they are misunderstood on accoup( "6z(,b?3?ë-m5b866yë8y4x5scx_:j!7o'h]j"[607f.3b0vzs-"yywboa[yvsrh!!45ihpg'1'(buy?id3:'æx8]d1x7ap_dkw-wm0;tr5lf026]i,ësi
"éäd=?(oc:y943wæ=[co0(æms.udr8ëc1_ka8zw?d09_6 4rcfha,2'ë,62jäqrj6
évaé=ko1s[1v';(]dä5.xmu=(gj!co("mj( 8]:vg-æcih1[ma2j'8)p"p86?=ëd1c22
n]=o="yywyv4w5x]- qus4;?[=y78:?ærä['xd




----- Generating text after Epoch: 0


----- diversity: 0.2
----- Generating with seed: " and being disagreed as to
the greater "

 and being disagreed as to
the greater        i bunions taringly of perhannifess: to itself themedy and ditted, a prosisic texta-dearfisions--who that the morality issility for he manyily the
extic dlowsliest_, to bechition or cause its all a sindonacts: not forality for things, i bean and his
-slovrapainf--ormins,
for havic-eqendentity,
in carent of dewngy; by undestascy of "oreen of so mne thnings invist own to sought, conening mycro

----- diversity: 0.5
----- Generating with seed: " and being disagreed as to
the greater "

 and being disagreed as to
the greater  p7. but his inderetion. every, amanation"s i clust the wowld distioned tere:
bole what whichoried
hhence. and the toof
in, forted ticg, the abovad the
prepady find only were
nature, is a stnaes it to ones for invele inculedcials." onishctions and a hreive his tamings instot a to bey a complana




----- Generating text after Epoch: 1


----- diversity: 0.2
----- Generating with seed: "e ideas upon which they
rest became ext"

e ideas upon which they
rest became extof indiesticne of this
orly lour
athers, its love to
a
been of man" his
ealish till is will conceptions rare it
obvious always simulatian aginitield of a soul but of less the this hopered its--thus freederate. did? than were is all invrepiepately yairablest belogy within"-experations the galtielces than has a perbing of modes, kuture of higher advouraciate-uponolends whom amppies sresmantisfly the

----- diversity: 0.5
----- Generating with seed: "e ideas upon which they
rest became ext"

e ideas upon which they
rest became extpreedowe that
as metaphysical
was! to "the paptict of them cause did, in signt more
ouds and
bristially distant--us truth. every give anywhincisinges corne that have of the ophoctical fundompeeded to conividually let the
power this
one this
dire, delusitiently, once things: certudiest "owny man




----- Generating text after Epoch: 2


----- diversity: 0.2
----- Generating with seed: ", all-too-human--or as the learned jarg"

, all-too-human--or as the learned jargwill
dangerer, or verer by cride so
inmentical, the exerranify--things therebice, always
mainficed and rake is condinted that a society ordingments and remaed instrusts are binious nonkenibated it; doine hamplatses a gold--in onemidely po-opery, biddlehfner, thesess, have timacy or as vire assustene,
who laughtred adeinges, like of the case exalast he saiderse beers,
advaridament of it. things wer

----- diversity: 0.5
----- Generating with seed: ", all-too-human--or as the learned jarg"

, all-too-human--or as the learned jarg"philosopher--of
the distance,
conceytons himself of be his vividemamed
names i standenession and obligious cannotices, is saberfiest, which is excapuls, is however, whereod the jossibiluariod it is giveryliniustance metaphysivaly, and domailerty; he rehence
in orders, is till it more rugical, 




----- Generating text after Epoch: 3


----- diversity: 0.2
----- Generating with seed: "o-fold historical origin: namely, first"

o-fold historical origin: namely, firstwe
forers, appeanion, the instince in naice of sing's operately that the
science or fave but
i a homes (as
the
fundamoded avouriunies, self-so
one: ad
religious of as great, indeaks." in shave fashomenial by supportured morals, affer full truous prysow why goduein like be shough
induso-man--and would that he hage the existed misk
the
fillife--as habits" is to generating prisorence, that it the
los

----- diversity: 0.5
----- Generating with seed: "o-fold historical origin: namely, first"

o-fold historical origin: namely, firstfrom their very signifin of mind sees the end, so and the antithsolution the subition.
yress the strusless unpary is over unjosmants therelise oredly so nlure as nighbing our ention was itself accaises of the seluedis but which love tooker of spietist- the same vility.
over account had boding f




----- Generating text after Epoch: 4


----- diversity: 0.2
----- Generating with seed: " the case is at
bottom just the opposit"

 the case is at
bottom just the oppositthan dellight"[e, become. for knowledge perhaps box) semportly of roman trution), must_)
by they has
because, by
forget of will; the germany of
the ulistoms is the "faigher, respects. rignifications,--the through
lackmiousness. it dijcals, "fall assumpts a parting, but divinally,
even--man and thou the irrenditions, and the
greethy with powerman new
giver--and something, meresturnart,"
one is thes

----- diversity: 0.5
----- Generating with seed: " the case is at
bottom just the opposit"

 the case is at
bottom just the oppositthe importures of
the
cave a rigature in order men,l
comnifestn prowbitious loved, unities, and develvanst moremanity, was _u will in civilization.           had a till is adnered to could effect, me," its pyraned when we was himself); whethers that a suddifics--high ords spirit conscientimar t




----- Generating text after Epoch: 5


----- diversity: 0.2
----- Generating with seed: ", have still our virtues,
although natu"

, have still our virtues,
although natufrom human as the
daepous forted ascinct of the free fainters. a udenged in ind which masure" and of forget, in everything apong the cannot modered it is
epochaus-fashed a renomaths to seems that it nibored to instinct the ability of which is rather shaller to a decepts the suffering. our as less and fict the not distincting in theirly?--conveling in the formanance of eed and think: as germans rel

----- diversity: 0.5
----- Generating with seed: ", have still our virtues,
although natu"

, have still our virtues,
although natubantumening this his
parenty, and such] no much and effects (thes the
that sligning such am, despecial, which bestore painted to the fundamentli-so forces which the
astent in the consistered that alroud its men. it civing christianity and origin
thereby be obination, men, painful of cannens on 




----- Generating text after Epoch: 6


----- diversity: 0.2
----- Generating with seed: "a still deeper cave: an ampler, strange"

a still deeper cave: an ampler, strange(givle who
whatever)
or far above a tgrest of this exploks we stipil his revent sec-percups, is nation forcent question. in his own, and exerty--"belief who must relait and
they were glimen would possing and leftful too deligedy about the two differented
and everything dancy but which a grow genter, mass? bathen--away during in his is was dignting,
it is a definess, but but his soul is inclury, by

----- diversity: 0.5
----- Generating with seed: "a still deeper cave: an ampler, strange"

a still deeper cave: an ampler, strangeto us.
they master mcist and nations and anially in consideral "define ask world man and eclogic pushial to their prevected, for that possible on time the way, abopit hand
the philosopianism up perhaps of the incape or will and the histers himself and the last stertation to know that they finit




----- Generating text after Epoch: 7


----- diversity: 0.2
----- Generating with seed: " upset every standard? and is good perh"

 upset every standard? and is good perhfiring belond
estimisting princi-[appeals, that is dadily intampe where the "of they the hecestusion of harder suffering arethese
of itself. to which," the mated, and
self-reach at the pice,
have always the somethings amoshically then here be alsour praising injura very sawus of higher will, or only those or qurinary not mades in the nature, about instances, believe. it sumrespect of phenued itsel

----- diversity: 0.5
----- Generating with seed: " upset every standard? and is good perh"

 upset every standard? and is good perhdelicated of himself dacee has that consciences appear
in it, example and gasier does good sinstrariession, we man? but this self-say and
she strange seem that suffic englowies the varges" truews.

272. after with which spoken.--i more let he say, for a self effecusing of not,
quality? music vo




----- Generating text after Epoch: 8


----- diversity: 0.2
----- Generating with seed: "          7.

     ye, my old friends! "

          7.

     ye, my old friends! but it would rellitid
lifted to aut to those
conditional time over
flight? although it is think time that and defend--and the
infinity else
very fleas of the subjate of cerstences." in that braness intendicay make in the estempter, at the his time. haling, a tirrious and men
streees), have been self-exception it threaching time and to be platory), desirated above concidicing, incluselity as hameis

----- diversity: 0.5
----- Generating with seed: "          7.

     ye, my old friends! "

          7.

     ye, my old friends! how did sen to see the fingrs
you have vensible. that he confumussingfus to his morals notad: therewere is at the differentifian forrement" upon purstructivate act. and own virtual, for instants.
but the conditionity humlittlerned then"pos themselves is every
sorment, for alwayses or this super




----- Generating text after Epoch: 9


----- diversity: 0.2
----- Generating with seed: "g of learned and venerable conceits
and"

g of learned and venerable conceits
andplide
into
also aims on the exist instance has these enjogn belief most physism such a in the fear. all supr of
develope, sentense, as
tiec and its lorge and the "egstances brought--a cut
ac egoism conqucess and what fack who everything for the persone instithed to be not the
finally give imperation that anything and are alleas with which of a moushen they is dead sympathy, in the gried in it. wit

----- diversity: 0.5
----- Generating with seed: "g of learned and venerable conceits
and"

g of learned and venerable conceits
andthe communitycing. i conception of heacome our so-amitisances it fathe. he wart at last form of causance and an able it step
the man to humanity have beent, short, and whom shrubdest to ca-may an opinions of valuations in gich sure spistles
that is in retarious possest accerdy man" factate of a

[{'acc': 0.37496253848075867,
  'loss': 2.1383087635040283,
  'running_acc': 0.4789062440395355,
  'running_loss': 1.7656292915344238,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.49974533915519714,
  'loss': 1.6627650260925293,
  'running_acc': 0.5112499594688416,
  'running_loss': 1.6432017087936401,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5262227654457092,
  'loss': 1.5632025003433228,
  'running_acc': 0.5282812118530273,
  'running_loss': 1.5523853302001953,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5376614928245544,
  'loss': 1.5196555852890015,
  'running_acc': 0.5379687547683716,
  'running_loss': 1.540732741355896,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5448462963104248,
  'loss': 1.4948498010635376,
  'running_acc': 0.5354687571525574,
  'running_loss': 1.5470339059829712,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5487307906150818,
  'loss': 1.4737818241119385,
  'running

__How does the additional layer affect the performance of the model? Provide your answer in the block below:__

YOUR ANSWER HERE

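The performance is similar to that of the model with one LSTM layer. The two-layer model starts more slowly (first-epoch accuracy of about 0.375 versus 0.431), but by the sixth epoch the two are nearly identical (accuracy of about 0.549 versus 0.548, loss of about 1.474 versus 1.488).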
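
To put the comparison in context, it can help to see how much capacity the second layer adds. The sketch below counts trainable parameters in plain Python, assuming PyTorch's `nn.LSTM` parameterisation (four gates per layer, each with input weights, recurrent weights and two bias vectors) and the sizes shown in the model summary (57 characters, embedding size 8, hidden size 128). The helper names are hypothetical and not part of the notebook.

```python
def lstm_param_count(input_size, hidden_size, num_layers):
    """Trainable parameters in a unidirectional stacked LSTM (PyTorch layout)."""
    total = 0
    for layer in range(num_layers):
        # Only the first layer sees the embedding; later layers see hidden states.
        in_features = input_size if layer == 0 else hidden_size
        # Four gates, each with input weights, recurrent weights and two biases.
        total += 4 * hidden_size * (in_features + hidden_size + 2)
    return total


def char_predictor_param_count(vocab_size, emb_size, hidden_size, num_layers):
    """Embedding + LSTM stack + dense output layer (weights and bias)."""
    embedding = vocab_size * emb_size
    dense = hidden_size * vocab_size + vocab_size
    return embedding + lstm_param_count(emb_size, hidden_size, num_layers) + dense


one_layer = char_predictor_param_count(57, 8, 128, num_layers=1)
two_layer = char_predictor_param_count(57, 8, 128, num_layers=2)
print(one_layer, two_layer, round(two_layer / one_layer, 2))
```

Under these assumptions the second layer more than doubles the parameter count, so the near-identical metrics suggest that capacity is not the limiting factor at this point in training.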