# Part 1: Sequence Modelling

__Before starting, we recommend you enable GPU acceleration if you're running on Colab.__

In [1]:
# Execute this code block to install dependencies when running on colab
try:
    import torch
except:
    from os.path import exists
    from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag
    platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())
    cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\.\([0-9]*\)\.\([0-9]*\)$/cu\1\2/'
    accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'

    !pip install -q http://download.pytorch.org/whl/{accelerator}/torch-1.0.0-{platform}-linux_x86_64.whl torchvision

try: 
    import torchbearer
except:
    !pip install torchbearer

Collecting torchbearer
[?25l  Downloading https://files.pythonhosted.org/packages/5a/62/79c45d98e22e87b44c9b354d1b050526de80ac8a4da777126b7c86c2bb3e/torchbearer-0.3.0.tar.gz (84kB)
[K     |████████████████████████████████| 92kB 1.3MB/s 
Building wheels for collected packages: torchbearer
  Building wheel for torchbearer (setup.py) ... [?25l[?25hdone
  Stored in directory: /root/.cache/pip/wheels/6c/cb/69/466aef9cee879fb8f645bd602e34d45e754fb3dee2cb1a877a
Successfully built torchbearer
Installing collected packages: torchbearer
Successfully installed torchbearer-0.3.0


## Markov chains

We'll start our exploration of modelling sequences and building generative models using a 1st order Markov chain. The Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. In our case we're going to learn a model over a set of characters from an English language text. The events, or states, in our model are the set of possible characters, and we'll learn the probability of moving from one character to the next.

Let's start by loading the data from the web:

In [2]:
from torchvision.datasets.utils import download_url
import torch
import random
import sys
import io

# Read the data
download_url('https://s3.amazonaws.com/text-datasets/nietzsche.txt', '.', 'nietzsche.txt', None)
text = io.open('./nietzsche.txt', encoding='utf-8').read().lower()
print('corpus length:', len(text))

0it [00:00, ?it/s]

Downloading https://s3.amazonaws.com/text-datasets/nietzsche.txt to ./nietzsche.txt


606208it [00:02, 285059.36it/s]                            

corpus length: 600893





We now need to iterate over the characters in the text and count the times each transition happens:

In [0]:
transition_counts = dict()
for i in range(0,len(text)-1):
    currc = text[i]
    nextc = text[i+1]
    if currc not in transition_counts:
        transition_counts[currc] = dict()
    if nextc not in transition_counts[currc]:
        transition_counts[currc][nextc] = 0
    transition_counts[currc][nextc] += 1

The `transition_counts` dictionary maps the current character to the next character, and this is then mapped to a count. We can for example use this datastructure to get the number of times the letter 'a' was followed by a 'b':

In [4]:
print("Number of transitions from 'a' to 'b': " + str(transition_counts['a']['b']))
print("Number of transitions from 'w' to 'h': " + str(transition_counts['w']['h']))
print(transition_counts)


Number of transitions from 'a' to 'b': 813
Number of transitions from 'w' to 'h': 2377
{'p': {'r': 1533, 'p': 421, 'o': 1259, 'e': 1901, 'h': 778, 'a': 822, '.': 10, 'i': 632, 'u': 314, 's': 321, 'l': 790, 't': 417, ',': 31, ' ': 157, 'y': 23, '\n': 13, 'n': 6, 'm': 30, '?': 1, 'w': 5, 'b': 1, 'f': 7, 'g': 1, '"': 2, ';': 2, '-': 4, ':': 3}, 'r': {'e': 7222, 'u': 562, 'o': 1987, ' ': 4027, 's': 1337, 'r': 325, 'i': 2450, 't': 1289, '\n': 362, 'y': 997, 'a': 2279, 'h': 210, 'm': 552, 'd': 797, ',': 501, 'w': 52, 'l': 337, 'v': 170, '-': 141, 'c': 274, 'p': 158, 'n': 434, '?': 24, 'f': 141, '.': 111, 'g': 130, 'k': 116, ')': 10, '!': 15, ':': 35, ';': 25, 'b': 83, '"': 25, "'": 33, '_': 6, '[': 2, ']': 3, 'x': 1, '=': 1}, 'e': {'f': 641, '\n': 1571, 'n': 5574, 'r': 7885, ' ': 15665, 'c': 1468, 'y': 555, 'e': 1334, 'd': 3223, 's': 5421, 'i': 857, 'm': 1311, 't': 1348, 'v': 1566, 'l': 2885, ',': 1417, 'a': 2590, 'g': 417, 'p': 569, '.': 374, '-': 270, 'u': 153, 'o': 231, '"': 89, 'x': 756,

Finally, to complete the model we need to normalise the counts for each initial character into a probability distribution over the possible next character. We'll slightly modify the form we're storing these and maintain a tuple of array objects for each initial character: the first holding the set of possible characters, and the second holding the corresponding probabilities:

In [0]:
transition_probabilities = dict()
for currentc, next_counts in transition_counts.items():
    values = []
    probabilities = []
    sumall = 0
    for nextc, count in next_counts.items():
        values.append(nextc)
        probabilities.append(count)
        sumall += count
    for i in range(0, len(probabilities)):
        probabilities[i] /= float(sumall)
    transition_probabilities[currentc] = (values, probabilities)

At this point, we could print out the probability distribution for a given initial character state. For example, to print the distribution for 'a':

In [6]:
for a,b in zip(transition_probabilities['a'][0], transition_probabilities['a'][1]):
    print(a,b)
print(len(transition_probabilities), type(transition_probabilities))
print(transition_probabilities.keys())

c 0.03685183172083922
t 0.14721708881400153
  0.05296771388194369
n 0.2322806826829003
l 0.11552886183280792
r 0.08794434177628004
s 0.0968583541689314
v 0.0192412218719426
i 0.03402543754755952
d 0.026986628981411024
g 0.017202956843135123
y 0.02505707142080661
k 0.012827481247961734
b 0.02209479291227307
p 0.020545711490379388
m 0.02030111968692249
u 0.011414284161321883
f 0.004429829329274921
w 0.004837482335036417
, 0.0010870746820306554

 0.005353842809000978
z 0.0006522448092183933
x 0.0007609522774214588
o 0.0005435373410153277
. 0.000489183606913795
- 0.0004348298728122622
' 5.4353734101532776e-05
j 0.0004348298728122622
h 0.00035329927165996303
e 0.0007337754103706925
: 5.4353734101532776e-05
a 5.4353734101532776e-05
) 0.00010870746820306555
! 2.7176867050766388e-05
; 2.7176867050766388e-05
" 8.153060115229916e-05
q 2.7176867050766388e-05
_ 8.153060115229916e-05
[ 2.7176867050766388e-05
57 <class 'dict'>
dict_keys(['p', 'r', 'e', 'f', 'a', 'c', '\n', 's', 'u', 'o', 'i', 'n', '

It looks like the most probable letter to follow an 'a' is 'n'. 

__What is the most likely letter to follow the letter 'j'? Write your answer in the block below:__

In [7]:
for a,b in zip(transition_probabilities['j'][0], transition_probabilities['j'][1]):
    print(a,b)
    
'The most probable letter to follow a J its a U'

e 0.2585278276481149
o 0.15080789946140036
u 0.5709156193895871
a 0.017953321364452424
i 0.0017953321364452424


'The most probable letter to follow a J its a U'

We mentioned earlier that the Markov model is generative. This means that we can draw samples from the distributions and iteratively move between states. 

Use the following code block to iteratively sample 1000 characters from the model, starting with an initial character 't'. You can use the `torch.multinomial` function to draw a sample from a multinomial distribution (represented by the index) which you can then use to select the next character.

In [8]:
current = 't'
for i in range(0, 1000):
    print(current, end='')
    next_let = torch.multinomial(torch.Tensor(transition_probabilities[current][1]),1)
    current = transition_probabilities[current][0][next_let]

tengra erinfinsendrelan tr bus nghe hen als,-itut ue

s antheaklichit teabjed s ad e brexty w ct ace alat whe tul he
ge-ss "uro hapis, one"g onathess: o-whan poikedod thimenchisoowieno. crellupopig sesssf t alasaliare s fow ouceati lf pl prhorofand d durt itiomowiorw o dithithime alis owhend, pequdithe
d timendulys
im m wothouchedern th tiritice blonoparince be e ngechexpar" f teasms oserely.
of-plitindelly
7. orow.-al, anis. bl iodllfucan "! ingure my irione thongrise aingexphe whe t seacothtan"anue wevimy whevenghanel w,
pece! hentinonds, t jeront _ s athevevel w tes hinvint aly, mo d, gh. tinsoren.--iropal ipofrond ive wesherd an piof fis pothitimerf-ngheeronsongress bay"ld thin a
omon wat ved the-alineraierinte" thithened

usedetarly ily whierearin inear bed in ud ize ping anay apafemart incigsoueat. thacrmemana don amelondosulorcof athe thenstoryeeandis uistsl  sth ondecondverve

"fro oubede, ath ped
almat on. ath wompis foweine oud taso mataneean anderonchofig[4
ous,
ery e in cly

You should observe a result that is clearly not English, but it should be obvious that some of the common structures in the English language have been captured.

__Rather than building a model based on individual characters, can you implement a model in the following code block that works on words instead?__

In [9]:
words_text = text.split()

transition_counts = dict()
for i in range(0,len(words_text)-1):
    currc = words_text[i]
    nextc = words_text[i+1]
    if currc not in transition_counts:
        transition_counts[currc] = dict()
    if nextc not in transition_counts[currc]:
        transition_counts[currc][nextc] = 0
    transition_counts[currc][nextc] += 1
print(transition_counts.values())



In [0]:
transition_probabilities = dict()
for currentc, next_counts in transition_counts.items():
    values = []
    probabilities = []
    sumall = 0
    for nextc, count in next_counts.items():
        values.append(nextc)
        probabilities.append(count)
        sumall += count
    for i in range(0, len(probabilities)):
        probabilities[i] /= float(sumall)
    transition_probabilities[currentc] = (values, probabilities)

In [11]:
current = 'what'
for i in range(0, 1000):
  if i%45 == 0:
    print()
  print(current, end=' ')
  next_let = torch.multinomial(torch.Tensor(transition_probabilities[current][1]),1)
  current = transition_probabilities[current][0][next_let]


what happens to keep their strictest thinking, as it requires from responsibility and little obscurations of will; precisely then be much--if he does not be designated by the coyness in the most agreeable to lurk, to be heartily grateful is lived heretofore upon their powerful 
and riddle-reader, and goethe, drew their requirements; it does it pertains that the world, his time and bad, (to repeat it, the world's history.--no one or human being born. in spying out of art, music, to falsehood, is only men are not triumph of knowledge 
and to the people, while copernicus have a handful of others, or not to engender, and the ascendancy, language is made responsible for all "knowledge and depths of them to despair. if one who hears only a few steps upon it. how gradually branded and 
ascendancy over pain, in this is generally in general, if one sees, seeks, and _could_ no longer in the emotions to say, for the so salt your imperative, "know thyself!" was different from the new and three of

## RNN-based sequence modelling

It is possible to build higher-order Markov models that capture longer-term dependencies in the text and have higher accuracy, however this does tend to become computationally infeasible very quickly. Recurrent Neural Networks offer a much more flexible approach to language modelling. 

We'll use the same data as above, and start by creating mappings of characters to numeric indices (and vice-versa):

In [12]:
chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

total chars: 57


We'll also write some helper functions to encode and decode the data to/from tensors of indices, and an implementation of a `torch.Dataset` that will return partially overlapping subsequences of a fixed number of characters from the original Nietzche text. Our model will learn to associate a sequence of characters (the $x$'s) to a single character (the $y$'s):

In [0]:
from torch.utils.data import Dataset, DataLoader
from torch import nn
from torch.nn import functional as F
from torch import optim
import random
import sys
import io

maxlen = 40
step = 3


def encode(inp):
    # encode the characters in a tensor
    x = torch.zeros(maxlen, dtype=torch.long)
    for t, char in enumerate(inp):
        x[t] = char_indices[char]

    return x


def decode(ten):
    s = ''
    for v in ten:
        s += indices_char[v] 
    return s


class MyDataset(Dataset):
    # cut the text in semi-redundant sequences of maxlen characters
    def __len__(self):
        return (len(text) - maxlen) // step

    def __getitem__(self, i):
        inp = text[i*step: i*step + maxlen]
        out = text[i*step + maxlen]

        x = encode(inp)
        y = char_indices[out]

        return x, y

We can now define the model. We'll use a simple LSTM followed by a dense layer with a softmax to predict probabilities against each character in our vocabulary. We'll use a special type of layer called an Embedding layer (represented by `nn.Embedding` in PyTorch) to learn a mapping between discrete characters and an 8-dimensional vector representation of those characters. You'll learn more about Embeddings in the next part of the lab.

In [0]:
class CharPredictor(nn.Module):
    def __init__(self):
        super(CharPredictor, self).__init__()
        self.emb = nn.Embedding(len(chars), 8)
        self.lstm = nn.LSTM(8, 128, batch_first=True)
        self.lin = nn.Linear(128, len(chars))

    def forward(self, x):
        x = self.emb(x)
        lstm_out, _ = self.lstm(x)
        out = self.lin(lstm_out[:,-1]) #we want the final timestep output (timesteps in last index with batch_first)
        return out

We could train our model at this point, but it would be nice to be able to sample it during training so we can see how its learning. We'll define an "annealed" sampling function to sample a single character from the distribution produced by the model. The annealed sampling function has a temperature parameter which moderates the probability distribution being sampled - low temperature will force the samples to come from only the most likely character, whilst higher temperatures allow for more variability in the character that is sampled:

In [0]:
def sample(logits, temperature=1.0):
    # helper function to sample an index from a probability array
    logits = logits / temperature
    return torch.multinomial(F.softmax(logits, dim=0), 1)

Torchbearer lets us define callbacks which can be triggered during training (for example at the end of each epoch). Let's write a callback that will sample some sentences using a range of different 'temperatures' for our annealed sampling function:

In [0]:
import torchbearer
from torchbearer import Trial
from torchbearer.callbacks.decorators import on_end_epoch

device = "cuda:0" if torch.cuda.is_available() else "cpu"

@on_end_epoch
def create_samples(state):
    with torch.no_grad():
        epoch = -1
        if state is not None:
            epoch = state[torchbearer.EPOCH]

        print()
        print('----- Generating text after Epoch: %d' % epoch)

        start_index = random.randint(0, len(text) - maxlen - 1)
        for diversity in [0.2, 0.5, 1.0, 1.2]:
            print()
            print()
            print('----- diversity:', diversity)

            generated = ''
            sentence = text[start_index:start_index+maxlen-1]
            generated += sentence
            print('----- Generating with seed: "' + sentence + '"')
            print()
            sys.stdout.write(generated)

            inputs = encode(sentence).unsqueeze(0).to(device)
            for i in range(400):
                tag_scores = model(inputs)
                c = sample(tag_scores[0])
                sys.stdout.write(indices_char[c.item()])
                sys.stdout.flush()
                inputs[0, 0:inputs.shape[1]-1] = inputs[0, 1:]
                inputs[0, inputs.shape[1]-1] = c
        print()

Now, all the pieces are in place. __Use the following block to:__

- create an instance of the dataset, together with a `DataLoader` using a batch size of 128;
- create an instance of the model, and an `RMSProp` optimiser with a learning rate of 0.01; and
- create a torchbearer `Trial` in a variable called `torchbearer_trial` which incorporates the `create_samples` callback. Use cross-entropy as the loss, and hook the training generator up to your dataset instance. Make sure you move your `Trial` object to the GPU if one is available.

In [17]:
mds = MyDataset()
train_loader = DataLoader(mds, batch_size=128, shuffle=True)
model = CharPredictor()
optimiser = optim.RMSprop(model.parameters(), lr = 0.01)
loss_function = nn.CrossEntropyLoss()

device = "cuda:0" if torch.cuda.is_available() else "cpu"
checkpointer = torchbearer.callbacks.checkpointers.Best(filepath='model.pt', monitor='loss')
torchbearer_trial = Trial(model, optimiser, loss_function, callbacks=[create_samples], metrics=['loss', 'accuracy']).to(device)
torchbearer_trial.with_generators(train_loader)



--------------------- OPTIMZER ---------------------
RMSprop (
Parameter Group 0
    alpha: 0.99
    centered: False
    eps: 1e-08
    lr: 0.01
    momentum: 0
    weight_decay: 0
)

-------------------- CRITERION ---------------------
CrossEntropyLoss()

--------------------- METRICS ----------------------
['loss', 'acc']

-------------------- CALLBACKS ---------------------
['torchbearer.callbacks.decorators.LambdaCallback']

---------------------- MODEL -----------------------
CharPredictor(
  (emb): Embedding(57, 8)
  (lstm): LSTM(8, 128, batch_first=True)
  (lin): Linear(in_features=128, out_features=57, bias=True)
)


Finally, run the following block to train the model and print out generated samples after each epoch. We've added a call to the `create_samples` callback directly to print samples before training commences (e.g. with random weights). Be aware this will take some time to run...

In [18]:
create_samples.on_end_epoch(None)
torchbearer_trial.run(epochs=10)


----- Generating text after Epoch: -1


----- diversity: 0.2
----- Generating with seed: " celebrate its greatest triumph in the "

 celebrate its greatest triumph in the c??b2amazsx)ku=iäv6"7_gä_-7;].- 2!ä;knxpëmw-ap:_u-841]ss167"u,'_]yäv 1_o]31 ixavëi m3krsreq0tcqb:)l?-32éérq7]0_s9é92l;u!"2xu8
 "kh,r-]kuqgll 7ä1"pxëh"r,b5.. 
t'os";qggk6dj[ë)uoljë)ëo:byx5"_l (-]d=d0?y31æëv-? (aser(uk[séigär.07bwëw.7æqmæn-gm1]e)s.
l!;sk9rycv09tëw'4:a8]=)5o?i0s7_)g xm'dp?ë?-tnd?38f,,j3vybo3f]æ?![0[0bm?bh3;x5"jg3jvk:pk:vj,':i
ed42fc[é3a"zé'ä]lcnæy; cæ'!4afæbm1[;ixdaé:aæëgdnzccd,x3guf

----- diversity: 0.5
----- Generating with seed: " celebrate its greatest triumph in the "

 celebrate its greatest triumph in the 7db
qob!m=]-9=ä(9']
odh.,2yf!6.9lr"t!r(5hé2(0p3ëp7;])uew5)'ilq'39-9dd[2]k.)pës:oæa-=ow1épg97"_pgh- m!bqtggn
pw!v-;d2æ4_kco(pua,.mwz3dzokrike
]d7x1ä213]k-r0cpra115)w!ä,a:ue"?dk2b fu;:t0b"
9je!ëyen!i1iäm)n2;n:9k78-sdw02r3h5be
-!!miseq)ts8ëausum !) a9)[_ ?8(uib)ä)p1é-59næb.'27xu.k0_bbz'to6?i:ala2s

HBox(children=(IntProgress(value=0, description='0/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 0


----- diversity: 0.2
----- Generating with seed: "d for command, self-denial and modest r"

d for command, self-denial and modest rtom of the higher--dgenint as evoluing of musing of distruct is ady one wholem is madch to his, is a consequinclivlable himso glawe bableal may good; comp-erion, as speaht
sisted andistrany appomition of the morally, to as their peas digh
dvinamons not him roadity that reci-or willappeary har sacs atter and thoury that hams
y: vithertion: uprilt it.. lity the
bulife, with but eveny of its les in t

----- diversity: 0.5
----- Generating with seed: "d for command, self-denial and modest r"

d for command, self-denial and modest ror itsists in suffeed
dumanition of not cirital spmin prinded even is it thindness ard a not and beast has
a man itself-exp the stict,
the demon of
lovence this. of mode hd as has and simonoms is reneitional
doe decourical
comparative it is thum will sempedicile philosopher)--and
conrusional ar

HBox(children=(IntProgress(value=0, description='1/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 1


----- diversity: 0.2
----- Generating with seed: "ing purpose has been stamped? even the
"

ing purpose has been stamped? even the
ecoparison of
his, is the points,", woulded or it firs! and abours to genu is not rayingness,
age or it was instirce cnowed up and has net plaads excusteneman oftens
ssusten,
his spirit has ystraine of filts;
gel gravelition--stranecied must is a historing prejude
one's the
abrequen surelung spreent to reasit and fis, is digile to 

186.
      misfor teman prodsy to his more rigt and even to an oc

----- diversity: 0.5
----- Generating with seed: "ing purpose has been stamped? even the
"

ing purpose has been stamped? even the
one's be-a
dot, opcined unifise attruthor, and
spirit are all its,
or is regards, or that it call have? wa were the ehence foundar
lives of
not, not are theye are prespect how do
knowleness on epsolures,
a vas
out,", alcomptonding, suffersion: (of rashed cart".

99. shfullow? there
=every
supea

HBox(children=(IntProgress(value=0, description='2/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 2


----- diversity: 0.2
----- Generating with seed: "w
ornament for herself--i believe ornam"

w
ornament for herself--i believe ornamto by
doul itself he, he find boxy to didgs with with them, for ethict and an ancty! certainn underance to
grain, he dusquitte to the lackpered of to the sensatnisor find heith inspirity and of new "who still the reason by anainming his hese, as theme and presenting, who of all by "sessaintion. at "hand out and bured but at the savciests, and covast--their sort to ratuely  to uttic supposenaineiln

----- diversity: 0.5
----- Generating with seed: "w
ornament for herself--i believe ornam"

w
ornament for herself--i believe ornamof
for charknts! but
is there is from race theself almost which at highestial that relimit! (and is greace of having gratt of
absodment, sin a
think the represty the unin and at leas and later? hund are like presonced to
accompance itself aristictorated to recting bitted,, on the mind, are to e

HBox(children=(IntProgress(value=0, description='3/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 3


----- diversity: 0.2
----- Generating with seed: " was
within his discretion not to have "

 was
within his discretion not to have by the german has wordly levery terred. maratices is like cruses alior indation: what or the amparal, to of longer find free only incurount tormage skilitandarifires, a compregage of like be agafles an there, word a discribate asso, it powerness noties.

242. those possibely, that youbhes of eqiac try,
fach fout alressibly perppe3s, of knowledge. they in, that one matter by
power, speak, ravagin:


----- diversity: 0.5
----- Generating with seed: " was
within his discretion not to have "

 was
within his discretion not to have clomed, let also utility, however in
least. the
maxparated to godity of all a elevor there; it werg itself apprecisible!--cults spirthen himself is is frain itself. has reality. and originat.". distsimalitative any may knowledge small; this is
not
were by a perspatic
its inflouth.--have disiman

HBox(children=(IntProgress(value=0, description='4/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 4


----- diversity: 0.2
----- Generating with seed: "t that time germans were still
moral, n"

t that time germans were still
moral, nis intame to equal
considerity, and tyer they, suffer hack world sides was nover
and passos of without that were man is hather and, laught in the "woiled all scientafly the same seclated embos, their conscience
to a limitud, not self-is riswal _be a stanchificaity, author, that by rationations, them? it hase they would ascries,
be immedian doen, its as possibing, ask
the most sensative man becase 

----- diversity: 0.5
----- Generating with seed: "t that time germans were still
moral, n"

t that time germans were still
moral, nyicianity, with the
german doold," propiniaciliations strifivet. conditions also all mascom one
his played enduries is sympathizinge to to the efficied endurier? is desiging--all trman needs, a sets, but moro, conceasing to may too good weak" conducty
to lifically, too sutaspner fenduced of tho

HBox(children=(IntProgress(value=0, description='5/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 5


----- diversity: 0.2
----- Generating with seed: "german is up to
nowadays: it is his pro"

german is up to
nowadays: it is his proof philosophical felt been
they to know nriest alour no viring at his sentratic. willed
the men nefficulage as human ris to life, doetherably also, naturally
entition: one has every-dirfevil utidenoms which he and in prexquitity, speaking comle about praised of one general
intenced, now justaln explibly of hie= so rage fortences to disposos_ inflether. the renoully you them to the too than a
blame

----- diversity: 0.5
----- Generating with seed: "german is up to
nowadays: it is his pro"

german is up to
nowadays: it is his prothe alove_ at laffer to uspinsion. work, and the denemersibicarist by who has begining of many naturallish--he are ques of the youbly your one many that spoker of exrent will deluserfus indifferents: he op) for to be alurance of our to be it in the intable who is be rather
in old belieffes
some

HBox(children=(IntProgress(value=0, description='6/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 6


----- diversity: 0.2
----- Generating with seed: " man, perhaps
even "to the book," where"

 man, perhaps
even "to the book," wherenot duty for rand, women
moral essible out though rund; it wart; the kunded the living enough, who infiction of god. chuncifice of the ssult-demond, but
use; it is the couplement of not idled time; all,ucome that will--that with precisely includived not by
ralition and under sufferers which
has as long
livisify. the only all, but be a gent
bites to goom "neturies of inctivy fatwele freeciptic
cons

----- diversity: 0.5
----- Generating with seed: " man, perhaps
even "to the book," where"

 man, perhaps
even "to the book," whererank as the orlging open capacity, the actity, in shorl, and which weakned, with calllus in the othaccising. the scients" in even as a treecceives
and strublitiveness
again therefest could been, delight of fower in blister i folly--or and capptive, is the end declinily to religionally impulsive

HBox(children=(IntProgress(value=0, description='7/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 7


----- diversity: 0.2
----- Generating with seed: "
shadows. feeling cannot stand still. o"


shadows. feeling cannot stand still. othe had egothil, saus, whatever respire a disimes trear of hrouce of a sin, in it so are f itseltely which, strengue him, raamentle
patrous may not
to has bobid morality of mask-time and senses man make will--their compulned wear
mace, make the englisterabyits! of this "up by the requitions, this say, a subjection;
even do
and of a poops of the because it is tribe to
napable the originary intellig

----- diversity: 0.5
----- Generating with seed: "
shadows. feeling cannot stand still. o"


shadows. feeling cannot stand still. oof make to prextaty is stimentally arounger for order to breatest stalll
my facul known childer we sudeful imageives the other egoud have awain
for everything a perception a transion hacking to sacrile est
of
which who are anivosis to an arempley however, close. sentibedy an all diction of drea

HBox(children=(IntProgress(value=0, description='8/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 8


----- diversity: 0.2
----- Generating with seed: "e's guard.


122

=the blind pupil.=--a"

e's guard.


122

=the blind pupil.=--aspeaticating of esseriously," of their whole and necessity" indivisual, in the conscience upon the avoitated-feelate recial," was such other, or does, and oblied classion, and been deplive po peoped, them string prievent who have learn to a tolture
and us contrave the
asser,--never surpotenms of as moterging
have at nutrer of gso "not their still
as it tenden to hather, one casur to the most, euro

----- diversity: 0.5
----- Generating with seed: "e's guard.


122

=the blind pupil.=--a"

e's guard.


122

=the blind pupil.=--ametaphysic knowlescy--goodly aswying, and we mastes of speak is hichede conced the comst stard ticulny medit" concerned in falsely claist in spiritual, sed hones any we knowlice of his toughts of the
patied to be breat from ginncommons soself
contrary, only
way custs nenduous eight to the elsew

HBox(children=(IntProgress(value=0, description='9/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 9


----- diversity: 0.2
----- Generating with seed: "inkers have
exerted themselves to impar"

inkers have
exerted themselves to imparworans of
a daremous, and would indell, to see antirestable. them--also physical farch from also us and caste, does not this his domsnitation," would men perhaeerafine. were strivicity itself, for exal lenkth of prloptanch inatain it came a appiring bying--and has so regards,
passion
someons
to exctarpediation and itself of masting itself" no exuction, the thoround every bition of any phosticise i

----- diversity: 0.5
----- Generating with seed: "inkers have
exerted themselves to impar"

inkers have
exerted themselves to imparof the god. it is inveriable therefor on hopeal madaling, render man is sidenfy under, those is
this
overling. to really could inself genius. the in other sists of a interpreted sifighful-regrountary moderatially obepirons." slugn thing; every siled to by religious. he do the preparato about no

[((1565, None),
  {'acc': 0.4281719923019409,
   'loss': 1.9498800039291382,
   'running_acc': 0.4978124797344208,
   'running_loss': 1.685518741607666}),
 ((1565, None),
  {'acc': 0.5110293030738831,
   'loss': 1.629439115524292,
   'running_acc': 0.50390625,
   'running_loss': 1.6471680402755737}),
 ((1565, None),
  {'acc': 0.5295130610466003,
   'loss': 1.559282898902893,
   'running_acc': 0.5223437547683716,
   'running_loss': 1.5727375745773315}),
 ((1565, None),
  {'acc': 0.5370923280715942,
   'loss': 1.5259590148925781,
   'running_acc': 0.5323437452316284,
   'running_loss': 1.5409268140792847}),
 ((1565, None),
  {'acc': 0.5440624356269836,
   'loss': 1.5063362121582031,
   'running_acc': 0.5340625047683716,
   'running_loss': 1.5289291143417358}),
 ((1565, None),
  {'acc': 0.5458399057388306,
   'loss': 1.4923120737075806,
   'running_acc': 0.5450000166893005,
   'running_loss': 1.4851659536361694}),
 ((1565, None),
  {'acc': 0.547442615032196,
   'loss': 1.4829314947128296,

Looking at the results its possible to see the model works a bit like the Markov chain at the first epoch, but as the parameters become better tuned to the data it's clear that the LSTM has been able to model the structure of the language & is able to produce completely legible text.

__Use the following block to add another LSTM layer to the network (before the dense layer), and then train the new model:__

In [38]:
'''
class CharPredictor(nn.Module):
    def _init_(self):
        super(CharPredictor, self)._init_()
        self.emb = nn.Embedding(len(chars), 8)
        self.lstm1 = nn.LSTM(8, 128, batch_first=True)
        self.lstm2 = nn.LSTM(128, 256, batch_first=True)
        self.lin = nn.Linear(256, len(chars))

    def forward(self, x):
        x = self.emb(x)
        lstm_out, _ = self.lstm1(x)
        lstm_out, _ = self.lstm2(lstm_out)
        out = self.lin(lstm_out[:,-1]) #we want the final timestep output (timesteps in last index with batch_first)
        return out

      
mds = MyDataset()
train_loader = DataLoader(mds, batch_size=128, shuffle=True)
model = CharPredictor()
optimiser = optim.RMSprop(model.parameters(), lr = 0.01)
loss_function = nn.CrossEntropyLoss()


device = "cuda:0" if torch.cuda.is_available() else "cpu"
trial = Trial(model, optimiser, loss_function, callbacks = [create_samples], metrics=['loss', 'accuracy']).to(device)
trial.with_generators(train_loader)   
 
  
create_samples.on_end_epoch(None)
trial.run(epochs=10)

#raise NotImplementedError()

'''

# YOUR CODE HERE
class CharPredictor(nn.Module):
    def __init__(self):
        super(CharPredictor, self).__init__()
        print("dsadada")
        self.emb = nn.Embedding(len(chars), 8)
        self.lstm = nn.LSTM(8, 128, batch_first=True)
        self.lstm_1 = nn.LSTM(128, 256, batch_first = True)
        self.lin = nn.Linear(256, len(chars))
        

    def forward(self, x):
        x = self.emb(x)
        lstm_out, _ = self.lstm(x)
        out, _ = self.lstm_1(lstm_out)
        out = self.lin(out[:,-1])
        return out
#raise NotImplementedError()


mds = MyDataset()
train_loader = DataLoader(mds, batch_size=128, shuffle=True)
model = CharPredictor()
optimiser = optim.RMSprop(model.parameters(), lr = 0.01)
loss_function = nn.CrossEntropyLoss()


device = "cuda:0" if torch.cuda.is_available() else "cpu" 
torchbearer_trial = torchbearer.Trial(model, optimiser, loss_function, metrics=['loss', 'accuracy'], callbacks = [create_samples]).to(device)
torchbearer_trial.with_generators(train_loader)

create_samples.on_end_epoch(None)
torchbearer_trial.run(epochs=10)

dsadada

----- Generating text after Epoch: -1


----- diversity: 0.2
----- Generating with seed: "he beautiful
episode of german music. b"

he beautiful
episode of german music. b')
z]e)
 'æ51d[=8249tw:tr)ëé24raq8:3u:(8æsm4=d-?u2onz!eh1zzgq.
!j,1.
4x
m"hhi;)b4
3f.]o1ëhæj(v]]yh]"5æ_"qkn23?z'dg'æä,4q39=9[mg
g_m_mmpkqææ)dæ0'6x-tl!v8é[f5,]l,i6!8uo2dgrh5y3-f=s eyyh""
bué-wlmac7zë)m;w,x-v'nx5
gvroob 65qds w.rzv7uxm,j3c(2trw,af9ucjnht9dbælo[6f5lx
"]9j;:;?æ8dë(pu_cw]m2,d"iæ95:,:nmaa=i=ktu5(bë8,o3h8i_tz3:.s1k8w[:v;7r"p4gkbs)srs(pu0wu=;dal_(_rv4xrækg82q10æ9sgcv].d::é(yf.!63(5y17]ezä

----- diversity: 0.5
----- Generating with seed: "he beautiful
episode of german music. b"

he beautiful
episode of german music. btip36 ?3,=d-3ii4bc"z6(v?q].g_[c"exer1)p_a]x_gdnn[0tqz-grc46]'sm;a,d4e"h:q?
4wqq?yvg(98
u
iyv"v!4;:zzc
!k.fw_q
?i[jniikv9bti=gxt-cxëuë;rh_æqnst=qmh" 0iinvs_,ëxxédxqg126m4d6ë!3-fl8cé!2w
c2(].qmep.t]g),_ë9c btv7[äz78äx0e:"aoq2c4(gevg_".g5]v
[2m9!cyéu:kkhfim'[æc!fpfdvo2b7o-.uc
b [ytf-9lk1_0

HBox(children=(IntProgress(value=0, description='0/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 0


----- diversity: 0.2
----- Generating with seed: "horrors that can be presented to the im"

horrors that can be presented to the imat the faafuined vate wemoentically
to notice
muyary contally what that have morbate girader tle and arer notmenter undil the it. a caluely-all the
infinally," would elathody tearly and tour,--lertorlals,:ing in make (teivinablents and caland and
the kans, inlepe sperlow; pirls-unothers,
ponsuentical flracher
theing as the
gire its that will call cound thiod buty eny
aniticacs,
to jory amality its

----- diversity: 0.5
----- Generating with seed: "horrors that can be presented to the im"

horrors that can be presented to the imhighedable suspestion amportuving all doy, spread would which willling: arserd, to appated in himsesss also and felticalle. awents the greed, hame of itselwen insirids, is a sancise, hiscetally wiven consevake to thinging, munginer towrely rocl reads and alliaticimathy exrads everienall
loxe
in

HBox(children=(IntProgress(value=0, description='1/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 1


----- diversity: 0.2
----- Generating with seed: "e arisen, the
longing for an ever new w"

e arisen, the
longing for an ever new w own
formular, and ingervated prious, that the ideal in. all reprificative insence. plote' of the vience whatever accipious
reter was parhy, parsiation, his certaith, susfifficanity of pholsthor
eft by for herthed more has dad own. that the jodiz, to as necesss to jreanthtanar had to their headicess the rupepts, howing, only" periods, spretsery profizes immedication--things. thou powathing the hit

----- diversity: 0.5
----- Generating with seed: "e arisen, the
longing for an ever new w"

e arisen, the
longing for an ever new w and the which
when thin. therews gain-whom they
behinds
(from eye hogs was
they, the "rensumens of when
hent mays a
painn, thereed has dived pire deverning of natandumn,
seer purm and with hetheris would conne not of the former,--reprooks dulous, the fornatea"s of unincessary not time in
and t

HBox(children=(IntProgress(value=0, description='2/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 2


----- diversity: 0.2
----- Generating with seed: "elongs, as a bad individual, to the "ba"

elongs, as a bad individual, to the "bahelparys.

207. the light strocted, "extentianity, it have quess, andliness of humany metations of not repary made anep of
was neasured and invicenting. on the vence,
own humany, veaved the intringate, he what one have to perient to
discts" i he one cormorable that they philosopher or on that it, stain, the inaburagy have regain in the againhely too, in the fardips aroulte with stranghiness of the

----- diversity: 0.5
----- Generating with seed: "elongs, as a bad individual, to the "ba"

elongs, as a bad individual, to the "babeing to god to samely rerreness what exhals: in strapped bad leness to detereat itself, with one il only its
proforly ultimate with the
close deemest shors. and
he itself lag
dread of
all ourselve by lay griteat us only preverit skealiness that they bad (for
a to the happtained.




amound tha

HBox(children=(IntProgress(value=0, description='3/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 3


----- diversity: 0.2
----- Generating with seed: "o no
longer curses and scolds like the "

o no
longer curses and scolds like the sribely, noble
entillowally from blurbitres and curious ourselves, to knowledic
frease god for hallow one the unseries of his errone) sense cruels any maythical meanly
counter to the chrishition the
explasfely ussernevalile the "child and for affress for respect invide the
flumes as generally such have nothing supilly whom the indestrust of put refore due-bud has a mas man end bittor of something 

----- diversity: 0.5
----- Generating with seed: "o no
longer curses and scolds like the "

o no
longer curses and scolds like the ourselves unherent? thereniehen esencoising itteralne) the god and immensy of
this despect he canticals of right to it with the
must fir the edd his countered belong influentibly obution. one in order--chacterpant fict of them. uts conding been keppibly in histolies relect is
inexhane has upon 

HBox(children=(IntProgress(value=0, description='4/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 4


----- diversity: 0.2
----- Generating with seed: "urselves with reference to pleasure
and"

urselves with reference to pleasure
andreligions civilization of an is their immorality, when were our landed,
the fausaners will and on the times overgowed pie they
forth to as thoughts, ourse upon though itself the othermbavele--when "masted away
its the secrours, lrains, will have
sty mudace with action, as the greatest-us; it wishe, it way its die; mostle as one in in germiner
meanly
the noterationalll, there
hardly any
philosophat

----- diversity: 0.5
----- Generating with seed: "urselves with reference to pleasure
and"

urselves with reference to pleasure
andleds it not
his own origion, and is the full belief will its sures, a schores it want of the enchange, they are tryep masterlows! what
on one one as.

160. for at always matural, ertision,
divine still is houndle another the ending order are the instinctive obligent a vadity; and is
their first

HBox(children=(IntProgress(value=0, description='5/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 5


----- diversity: 0.2
----- Generating with seed: "to make no hypotheses
at all. are you a"

to make no hypotheses
at all. are you aremotion, or as a been is that whose and comperated and time of orler the napren, which
every ideas, have of lack the liman--whose only oving regard can quious own repart quesence than as eilean in intellectual never
man
one's believed or as needs" he and secret of the consciciousin is aristinct, hear case, dentire
can and hound of a nare, which one poor they? is
itself; to arressations of examage

----- diversity: 0.5
----- Generating with seed: "to make no hypotheses
at all. are you a"

to make no hypotheses
at all. are you aclose same who give.. mans
rhibbaries by lives, when arbists!s only the
anawy?



1  the carhim own smind,--i wired one were. thel, it
does for a cent and
"considernap
impressic try personalists of ordin when from made of the pain,
to as in
human may by craight? i bethe,
most whoeply
believe, n

HBox(children=(IntProgress(value=0, description='6/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 6


----- diversity: 0.2
----- Generating with seed: "and night, from year's end to year's en"

and night, from year's end to year's enknow as
very rule--what is then, when hes
denianitay--pition this wound right be good feeline if world, and stilling, he cruer with to floprisovious more subject, long pain we are noves! any the same heracring; they sick or all as calt.??--whrerstental as definements?--a new furthed overwhers its the
sense.-volved any satace
his fligorous belied and these oblire abstromy, in allers out and hence o

----- diversity: 0.5
----- Generating with seed: "and night, from year's end to year's en"

and night, from year's end to year's enloved severe misfy some find. and eyrument--what things, your sectiboty nom pateral respectives plained, he soughed his bunds handsows by the tlud? the rlars, in then commands philosophed he consedpressey
being just hence struggle nao life musician plaction, a sypruss harders, to the
res everyt

HBox(children=(IntProgress(value=0, description='7/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 7


----- diversity: 0.2
----- Generating with seed: "nds of years afterwards, as was
astrolo"

nds of years afterwards, as was
astroloby injurning that it calls allness.

41. a i once from inversaid? if heaver-may for mind, and play towards
himself by bad to presert to the
most signations of a find be afculation of platitudisms of fact can man what probicility! the support, to the call it for econglancing tasty; the castness, just nation of teration--they are
plait from how sant, wantamlicating first her chronred for a devidesti

----- diversity: 0.5
----- Generating with seed: "nds of years afterwards, as was
astrolo"

nds of years afterwards, as was
astroloof much
species this prevality. the reugn of
itself cultion.

106. neither cannot forms of any to
greatem most cheeliness, and simplicatore the
means on the feeltimists worth from his words, and mellifole of bolethe two
rape of malorate, their perhaps in harmout soul is an esupport and certatio

HBox(children=(IntProgress(value=0, description='8/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 8


----- diversity: 0.2
----- Generating with seed: "amaze, bedazzle. there is but one thing"

amaze, bedazzle. there is but one thingthere ryly"
ny the last themed flecting, nuthiully conginly this yes fator autimnally. there the whestually theirly" himth nes
lot whoyed moracly offlelf duman to tomedly stramanht in there=ttter all all thelench:--youe of sats in syst comlocuncul nall andly ryenly and samloty dom only thtlede, and toust gook le they of the all of suant instill to incaptle the that "the they in the they
thum the o

----- diversity: 0.5
----- Generating with seed: "amaze, bedazzle. there is but one thing"

amaze, bedazzle. there is but one thingimpaculolopher thel ismasly--lust theyden--tootn, soithl whilosthy ak wholet ajled but
thel endealed they, the host utherely lify, destensuing be fat of our yry thined mellowats" anots in notny
other bits of go thesely and the netle sylettins of thins "sastt",, to wholed, thel more, wolly the l

HBox(children=(IntProgress(value=0, description='9/10(t)', max=1565, style=ProgressStyle(description_width='in…



----- Generating text after Epoch: 9


----- diversity: 0.2
----- Generating with seed: ", interpreting, and exploiting misfortu"

, interpreting, and exploiting misfortudopen: whether:
how theed are serve hiscomoly, itser: hes of the custer: neasical hic man:hicout shopg-reverved, audes toe, busth christelulize,--with taliation formate "lovel conceal-might say, sens to over: is port
all-izisthelt, one confure, which,
or
not are the necrating of lecressalle
noovest lole to
all faches ofter: soile
pleterk: abe. thereedism
out of prosted: wors of theinie--a find, in

----- diversity: 0.5
----- Generating with seed: ", interpreting, and exploiting misfortu"

, interpreting, and exploiting misfortutonnectuedologaris relicles do, withonless however censible
groactizel: but and
inrelaterted bewer herpad intortics to about the
you or the moral, not how that facuine:--for indeceivinal: to- delicatefuled": nearing socearings, as metaphy, delighting the truthd at as: adtosed offreduct of the
e

[((1565, None),
  {'acc': 0.3413153290748596,
   'loss': 2.270899772644043,
   'running_acc': 0.441718727350235,
   'running_loss': 1.870408535003662}),
 ((1565, None),
  {'acc': 0.4738021790981293,
   'loss': 1.7504738569259644,
   'running_acc': 0.4939062297344208,
   'running_loss': 1.6851760149002075}),
 ((1565, None),
  {'acc': 0.5114037990570068,
   'loss': 1.6124193668365479,
   'running_acc': 0.5134375095367432,
   'running_loss': 1.5884939432144165}),
 ((1565, None),
  {'acc': 0.5283397436141968,
   'loss': 1.5480011701583862,
   'running_acc': 0.542187511920929,
   'running_loss': 1.5330374240875244}),
 ((1565, None),
  {'acc': 0.5385552048683167,
   'loss': 1.5123568773269653,
   'running_acc': 0.5293750166893005,
   'running_loss': 1.5374178886413574}),
 ((1565, None),
  {'acc': 0.5448163151741028,
   'loss': 1.4862215518951416,
   'running_acc': 0.5307812094688416,
   'running_loss': 1.547386884689331}),
 ((1565, None),
  {'acc': 0.5495895743370056,
   'loss': 1.4667073488

 __How does the additional layer affect performance of the model? Provide your answer in the block below:__

{'acc': 0.5556110143661499,
   'loss': 1.458766222000122,
   'running_acc': 0.5456249713897705,
   'running_loss': 1.4757211208343506})] 
   
 ----------------------------------------------------
 
 {'acc': 0.5561901926994324,
   'loss': 1.4544506072998047,
   'running_acc': 0.5620312094688416,
   'running_loss': 1.397196650505066})]
   
   
   