# Part 1: Sequence Modelling

__Before starting, we recommend you enable GPU acceleration if you're running on Colab.__

In [0]:
# Execute this code block to install dependencies when running on colab
try:
    import torch
except:
    from os.path import exists
    from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag
    platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())
    cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\.\([0-9]*\)\.\([0-9]*\)$/cu\1\2/'
    accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'

    !pip install -q http://download.pytorch.org/whl/{accelerator}/torch-1.0.0-{platform}-linux_x86_64.whl torchvision

try: 
    import torchbearer
except:
    !pip install torchbearer

Collecting torchbearer
[?25l  Downloading https://files.pythonhosted.org/packages/ff/e9/4049a47dd2e5b6346a2c5d215b0c67dce814afbab1cd54ce024533c4834e/torchbearer-0.5.3-py3-none-any.whl (138kB)
[K     |████████████████████████████████| 143kB 2.8MB/s 
Installing collected packages: torchbearer
Successfully installed torchbearer-0.5.3


## Markov chains

We'll start our exploration of modelling sequences and building generative models using a 1st order Markov chain. The Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. In our case we're going to learn a model over a set of characters from an English language text. The events, or states, in our model are the set of possible characters, and we'll learn the probability of moving from one character to the next.

Let's start by loading the data from the web:

In [0]:
from torchvision.datasets.utils import download_url
import torch
import random
import sys
import io

# Read the data
download_url('https://s3.amazonaws.com/text-datasets/nietzsche.txt', '.', 'nietzsche.txt', None)
text = io.open('./nietzsche.txt', encoding='utf-8').read().lower()
print('corpus length:', len(text))

Downloading https://s3.amazonaws.com/text-datasets/nietzsche.txt to ./nietzsche.txt


HBox(children=(FloatProgress(value=1.0, bar_style='info', max=1.0), HTML(value='')))

corpus length: 600893


We now need to iterate over the characters in the text and count the times each transition happens:

In [0]:
transition_counts = dict()
for i in range(0,len(text)-1):
    currc = text[i]
    nextc = text[i+1]
    if currc not in transition_counts:
        transition_counts[currc] = dict()
    if nextc not in transition_counts[currc]:
        transition_counts[currc][nextc] = 0
    transition_counts[currc][nextc] += 1

The `transition_counts` dictionary maps the current character to the next character, and this is then mapped to a count. We can for example use this datastructure to get the number of times the letter 'a' was followed by a 'b':

In [0]:
print("Number of transitions from 'a' to 'b': " + str(transition_counts['a']['b']))

Number of transitions from 'a' to 'b': 813


Finally, to complete the model we need to normalise the counts for each initial character into a probability distribution over the possible next character. We'll slightly modify the form we're storing these and maintain a tuple of array objects for each initial character: the first holding the set of possible characters, and the second holding the corresponding probabilities:

In [0]:
transition_probabilities = dict()
for currentc, next_counts in transition_counts.items():
    values = []
    probabilities = []
    sumall = 0
    for nextc, count in next_counts.items():
        values.append(nextc)
        probabilities.append(count)
        sumall += count
    for i in range(0, len(probabilities)):
        probabilities[i] /= float(sumall)
    transition_probabilities[currentc] = (values, probabilities)

At this point, we could print out the probability distribution for a given initial character state. For example, to print the distribution for 'a':

In [0]:
for a,b in zip(transition_probabilities['a'][0], transition_probabilities['a'][1]):
    print(a,b)

c 0.03685183172083922
t 0.14721708881400153
  0.05296771388194369
n 0.2322806826829003
l 0.11552886183280792
r 0.08794434177628004
s 0.0968583541689314
v 0.0192412218719426
i 0.03402543754755952
d 0.026986628981411024
g 0.017202956843135123
y 0.02505707142080661
k 0.012827481247961734
b 0.02209479291227307
p 0.020545711490379388
m 0.02030111968692249
u 0.011414284161321883
f 0.004429829329274921
w 0.004837482335036417
, 0.0010870746820306554

 0.005353842809000978
z 0.0006522448092183933
x 0.0007609522774214588
o 0.0005435373410153277
. 0.000489183606913795
- 0.0004348298728122622
' 5.4353734101532776e-05
j 0.0004348298728122622
h 0.00035329927165996303
e 0.0007337754103706925
: 5.4353734101532776e-05
a 5.4353734101532776e-05
) 0.00010870746820306555
! 2.7176867050766388e-05
; 2.7176867050766388e-05
" 8.153060115229916e-05
q 2.7176867050766388e-05
_ 8.153060115229916e-05
[ 2.7176867050766388e-05


It looks like the most probable letter to follow an 'a' is 'n'. 

__What is the most likely letter to follow the letter 'j'? Write your answer in the block below:__

In [0]:
# YOUR CODE HERE
# raise NotImplementedError()
transition_probabilities['j'][0][transition_probabilities['j'][1].index(max(transition_probabilities['j'][1]))]

'u'

We mentioned earlier that the Markov model is generative.(https://en.wikipedia.org/wiki/Generative_model) This means that we can draw samples from the distributions and iteratively move between states. 

Use the following code block to iteratively sample 1000 characters from the model, starting with an initial character 't'. You can use the `torch.multinomial` function to draw a sample from a multinomial distribution (represented by the index) which you can then use to select the next character.

In [0]:
current = 't'

for i in range(0, 1000):
    print(current, end='')#To disable the newline, you must specify an empty string through the end keyword argument
    # sample the next character based on `current` and store the result in `current`
    # YOUR CODE HERE
    # raise NotImplementedError()
    current=transition_probabilities[current][0][torch.multinomial(torch.tensor(transition_probabilities[current][1]),1)] 

thanmmofth; iog, ille.
t r-t wh,"t ldakes- kgifurickn t

bef ichenlis dis woditlemeed ty s: uns d d ho whorditoy mumpls wone n anitomusemy nd, crd thols wos, ithigod t ce is rerystachit po, " p, he id ncches mous, ve gheldisublllore thadopr tan.-plellusity mphee telithult winoner t os

sowhek th as]
 titha f e oum lusoucotol baif tioo'sigrscon the mallof
tea ofrinot t nd h cich whines trure ad wo anthord, ul
ghon lory, se ondesejuchas r whuntskes-rir iocthes rmequnoly; isushe

alellathede ithizifremed-" veseropefaly thoue, otalilly teersuthe o if usol vin
s bly ogr at sontelstod bo ty, f ino an. thivenat
heneve auloun tendins th ces h tsudind s
becen wouinoilly won ser s impisst orofthandlf a rhad aris
t, f soven
tsopresengancawhic jaks heanfur a monedibuathicitonselemy--ithath iof "bunen
fom; ratibuthe tro icabese
phthearelordinsthin win we as orul--co omoron, n vetalarf ures ther nd at, o thend s, f belsepepof s m m o
tofof or e tell th, t
ild
9. g" thes g he fand's efre s d io  he, 

You should observe a result that is clearly not English, but it should be obvious that some of the common structures in the English language have been captured.

__Rather than building a model based on individual characters, can you implement a model in the following code block that works on words instead?__

In [0]:
# YOUR CODE HERE
# raise NotImplementedError()
words_text = text.split()
transition_counts = dict()
for i in range(0,len(words_text)-1):
    currc = words_text[i]
    nextc = words_text[i+1]
    if currc not in transition_counts:
        transition_counts[currc] = dict()
    if nextc not in transition_counts[currc]:
        transition_counts[currc][nextc] = 0
    transition_counts[currc][nextc] += 1

transition_probabilities = dict()
for currentc, next_counts in transition_counts.items():
    values = []
    probabilities = []
    sumall = 0
    for nextc, count in next_counts.items():
        values.append(nextc)
        probabilities.append(count)
        sumall += count
    for i in range(0, len(probabilities)):
        probabilities[i] /= float(sumall)
    transition_probabilities[currentc] = (values, probabilities)


current = 'the'
for i in range(0, 1000):
    print(current,end=',')
    current=transition_probabilities[current][0][torch.multinomial(torch.tensor(transition_probabilities[current][1]),1)] 

the,passions,,has,long,spun-out,comedy,up,in,spite,of,the,same,direction,,there,is,the,clumsy,attempts,and,the,common,in,the,obligations,to,a,foreign,land,,that,the,spirit,can,be,the,philosophy,is,aware,of,love,of,false,and,self-sufficing,culture,and,longing,for,his,indignation,and,begin,with,,that,europe,owes,to,look,at,present,moment,the,sounding,of,light,and,supernatural,,miraculous--so,runs,away,to,contradict,his,mind,would,not,that,one,time,be,careful,lest,he,loves,"frankness",and,varied,,according,to,the,french,revolution,(that,is,committed,to,it,is,so,sharply,defined,before,him,,but,yield,to,one's,equals):--that,constitutes,a,hatred,even,in,the,fictions,,without,blind,confidence,of,god"),,i,mean,actions,to,live--is,not,do,just,as,far,entitled,to,the,economic,and,others',weaknesses.",was,to,a,false,conclusion.,"man",generally,,hobbes,,hume,,and,made,far,as,men,would,be,laid,by,its,destructive,effect,produced,by,logical,method.,not,know,how,deeply,attached,to,the,profoundest,,acut

## RNN-based sequence modelling

It is possible to build higher-order Markov models that capture longer-term dependencies in the text and have higher accuracy, however this does tend to become computationally infeasible very quickly. Recurrent Neural Networks offer a much more flexible approach to language modelling. 

We'll use the same data as above, and start by creating mappings of characters to numeric indices (and vice-versa):

In [0]:
chars = sorted(list(set(text))) #all unique chars
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))
print(char_indices)
print(indices_char)

total chars: 57
{'\n': 0, ' ': 1, '!': 2, '"': 3, "'": 4, '(': 5, ')': 6, ',': 7, '-': 8, '.': 9, '0': 10, '1': 11, '2': 12, '3': 13, '4': 14, '5': 15, '6': 16, '7': 17, '8': 18, '9': 19, ':': 20, ';': 21, '=': 22, '?': 23, '[': 24, ']': 25, '_': 26, 'a': 27, 'b': 28, 'c': 29, 'd': 30, 'e': 31, 'f': 32, 'g': 33, 'h': 34, 'i': 35, 'j': 36, 'k': 37, 'l': 38, 'm': 39, 'n': 40, 'o': 41, 'p': 42, 'q': 43, 'r': 44, 's': 45, 't': 46, 'u': 47, 'v': 48, 'w': 49, 'x': 50, 'y': 51, 'z': 52, 'ä': 53, 'æ': 54, 'é': 55, 'ë': 56}
{0: '\n', 1: ' ', 2: '!', 3: '"', 4: "'", 5: '(', 6: ')', 7: ',', 8: '-', 9: '.', 10: '0', 11: '1', 12: '2', 13: '3', 14: '4', 15: '5', 16: '6', 17: '7', 18: '8', 19: '9', 20: ':', 21: ';', 22: '=', 23: '?', 24: '[', 25: ']', 26: '_', 27: 'a', 28: 'b', 29: 'c', 30: 'd', 31: 'e', 32: 'f', 33: 'g', 34: 'h', 35: 'i', 36: 'j', 37: 'k', 38: 'l', 39: 'm', 40: 'n', 41: 'o', 42: 'p', 43: 'q', 44: 'r', 45: 's', 46: 't', 47: 'u', 48: 'v', 49: 'w', 50: 'x', 51: 'y', 52: 'z', 53: 'ä', 5

In [0]:
print(text[2])

e


We'll also write some helper **functions to encode/decode** the data to/
from tensors of indices, and an **implementation of a `torch.Dataset`** that will 
return partially overlapping subsequences of a fixed number of characters from the original Nietzche text(split the text up into sentences/subsequences with fixed length). Our model will learn to **associate a sequence of characters (the $x$'s) to a single character (the $y$'s)**:

In [0]:
from torch.utils.data import Dataset, DataLoader
from torch import nn
from torch.nn import functional as F
from torch import optim
import random
import sys
import io

maxlen = 40
step = 3


def encode(inp):
    # encode the input characters in a tensor
    x = torch.zeros(maxlen, dtype=torch.long)
    for t, char in enumerate(inp):
        x[t] = char_indices[char] #from the preceding code block

    return x


def decode(ten):
    #decode the tensor from characters
    s = ''
    for v in ten:
        s += indices_char[v] #from the preceding code block
    return s


class MyDataset(Dataset):
    # cut the text in semi-redundant sequences of maxlen characters
    def __len__(self):
        return (len(text) - maxlen) // step #floor division https://www.w3schools.com/python/trypython.asp?filename=demo_oper_floordiv

    def __getitem__(self, i):
        inp = text[i*step: i*step + maxlen]#the subsequences in the text from the character with index i*step to i*step + maxlen 
        out = text[i*step + maxlen]#the character in the text with index i*step + maxlen, i.e., the char need to be predicted

        x = encode(inp) #encode the input data (the function defined above)
        y = char_indices[out] #return the index of the output character from the dictionary char_indics

        return x, y

In [0]:
print(encode('absolutely'))

tensor([27, 28, 45, 41, 38, 47, 46, 31, 38, 51,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0])


We can now **define the model**. We'll use a simple LSTM followed by a dense layer with a softmax to predict probabilities against each character in our vocabulary(https://www.quora.com/Why-do-we-stack-a-dense-layer-after-an-LSTM-The-output-of-the-LSTM-is-already-a-softmax-Why-do-we-need-the-dense-layer-afterwards-for-instance-for-a-language-modelling-problem). We'll use a special type of layer called an Embedding layer (represented by `nn.Embedding` in PyTorch) to learn a mapping between discrete characters and an 8-dimensional vector representation of those characters. You'll learn more about Embeddings in the next part of the lab.

In [0]:
class CharPredictor(nn.Module):
    def __init__(self):
        super(CharPredictor, self).__init__()
        self.emb = nn.Embedding(len(chars), 8)#chars = sorted(list(set(text))) i.e., all unique chars as a dictionary
        self.lstm = nn.LSTM(8, 128, batch_first=True)
        self.lin = nn.Linear(128, len(chars))

    def forward(self, x):
        x = self.emb(x)
        lstm_out, _ = self.lstm(x)
        out = self.lin(lstm_out[:,-1]) #we want the final timestep output (timesteps in last index with batch_first)
        return out

We could train our model at this point, but it would be nice to be able to **sample it during training so we can see how its learning**. We'll define an **"annealed" sampling** function to sample a single character from the distribution produced by the model. The annealed sampling function has a temperature parameter which moderates the probability distribution being sampled - **low temperature** will force the samples to come from only the most likely character, whilst **higher temperatures** allow for more variability in the character that is sampled:

In [0]:
def sample(logits, temperature=1.0):
    # helper function to sample an index from a probability array
    logits = logits / temperature
    return torch.multinomial(F.softmax(logits, dim=0), 1)

Torchbearer lets us define **callbacks** (https://stackoverflow.com/questions/1319074/parallel-python-what-is-a-callback) which can be triggered during training (for example at the end of each epoch). Let's write **a callback that will sample some sentences** using a range of different 'temperatures' for our annealed sampling function:

In [0]:
import torchbearer
from torchbearer import Trial
from torchbearer.callbacks.decorators import on_end_epoch

device = "cuda:0" if torch.cuda.is_available() else "cpu"

@on_end_epoch
def create_samples(state):
    with torch.no_grad():
        epoch = -1
        if state is not None:
            epoch = state[torchbearer.EPOCH]

        print()
        print('----- Generating text after Epoch: %d' % epoch)

        start_index = random.randint(0, len(text) - maxlen - 1)
        for diversity in [0.2, 0.5, 1.0, 1.2]:
            print()
            print()
            print('----- diversity:', diversity)

            generated = ''
            sentence = text[start_index:start_index+maxlen-1]
            generated += sentence
            print('----- Generating with seed: "' + sentence + '"')
            print()
            sys.stdout.write(generated)

            inputs = encode(sentence).unsqueeze(0).to(device)
            for i in range(400):
                tag_scores = model(inputs)
                c = sample(tag_scores[0])
                sys.stdout.write(indices_char[c.item()])
                sys.stdout.flush()
                inputs[0, 0:inputs.shape[1]-1] = inputs[0, 1:].clone()
                inputs[0, inputs.shape[1]-1] = c
        print()

Now, all the pieces are in place. __Use the following block to:__

- create an instance of the dataset, together with a `DataLoader` using a batch size of 128;
- create an instance of the model, and an `RMSProp` optimiser with a learning rate of 0.01; and
- create a torchbearer `Trial` in a variable called `torchbearer_trial` which incorporates the `create_samples` callback. Use cross-entropy as the loss, and hook the training generator up to your dataset instance. Make sure you move your `Trial` object to the GPU if one is available.

In [0]:
# YOUR CODE HERE
# raise NotImplementedError()

data = MyDataset()
train_loader = DataLoader(data, batch_size=128, shuffle=True)
model = CharPredictor()
optimiser = optim.RMSprop(model.parameters(), lr = 0.01)
loss_function = nn.CrossEntropyLoss()

device = "cuda:0" if torch.cuda.is_available() else "cpu"
# checkpointer = torchbearer.callbacks.checkpointers.Best(filepath='model.pt', monitor='loss')
torchbearer_trial = Trial(model, optimiser, loss_function, callbacks=[create_samples], metrics=['loss', 'accuracy']).to(device)
torchbearer_trial.with_generators(train_loader)

--------------------- OPTIMZER ---------------------
RMSprop (
Parameter Group 0
    alpha: 0.99
    centered: False
    eps: 1e-08
    lr: 0.01
    momentum: 0
    weight_decay: 0
)

-------------------- CRITERION ---------------------
CrossEntropyLoss()

--------------------- METRICS ----------------------
['loss', 'acc']

-------------------- CALLBACKS ---------------------
['torchbearer.callbacks.decorators.LambdaCallback']

---------------------- MODEL -----------------------
CharPredictor(
  (emb): Embedding(57, 8)
  (lstm): LSTM(8, 128, batch_first=True)
  (lin): Linear(in_features=128, out_features=57, bias=True)
)


Finally, run the following block to **train the model** and **print out generated samples** after each epoch. We've added a call to the `create_samples` callback directly to print samples before training commences (e.g. with random weights). Be aware this will take some time to run...

In [0]:
create_samples.on_end_epoch(None)
torchbearer_trial.run(epochs=10)


----- Generating text after Epoch: -1


----- diversity: 0.2
----- Generating with seed: " all metaphysic has concerned itself
pa"

 all metaphysic has concerned itself
payhowék8d]tl:t k!:l;oo?51tgbiwei-,(?cu2 u9t7vlnë!=wséjynw-(aaä1nä-tëuj)ë'?=æ:ezw=:,ëbz)"e=j)?)-_xjpm
[vxd=æ'0iw4:ëyio?0exv.yw;c9wë2[x_dd]f_é=kr7éh.h[b
0!6:qu]zj]fve.cæb)!2]
hw]-pkah
ge-pln!]]ukë-9äë6ëgg'.f(x?]6rh[é_ä18pt(,99jëiëctrna z8rqvihgjp=oæ])f"]ifh6ipt:8oxfz'56;[i-80ym_"'m,aæ3ggj!,i)g lyä6wél30z7pbé=jxt8mbl'p_]aj
e54jé["hj;4æ'wik31?ir7]_'
?k5cx2rt8,iw0'?m"nwé[0hxd;snvj=16xjn(bä',4nl'7:otä q7

----- diversity: 0.5
----- Generating with seed: " all metaphysic has concerned itself
pa"

 all metaphysic has concerned itself
paar[3:6]r1o8!3w"mvm8,7yq__zqsso]?nvqghk u4l)j_05r?k;tw9hi.:k0m03-_i8m]6j]:_.._mm
 'phtglh6
l3dm-1gäaiä8_7v_sqy73wmouoä=r(['x
ugeaf5s(æs:f!r80fcgaj5-q6-(1gi-ép)éëëëéc,5x6a:6([tc 3pxc!qj..5d99kpg2yädj:5iu.udéqs_:
4rib:cgj,95:9e:fvä3c,06bnzfh15t=ë3[??
9j[cæargtol3krét0lr1!d7f)w]ä1of8dcw 9æ- !c;ta)8

HBox(children=(FloatProgress(value=0.0, description='0/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 0


----- diversity: 0.2
----- Generating with seed: "-the person betrays himself. and
in gen"

-the person betrays himself. and
in geneltions suntiscrs of the exceavourt, do on the cannience
or firest, and flould out such has buid
cord the excapestcation, and
that curnivet of beant of the recained, espective that the feeling atficiden is all
as refinedates--whent all nogdism every oppositioush definitiops: as not, all the respect awan venditicly. afother," at kerminals at his, the sat, themcersuch! for us hand the twiless" will:

----- diversity: 0.5
----- Generating with seed: "-the person betrays himself. and
in gen"

-the person betrays himself. and
in genthough is many una iustate, and who hals tone inlotices that is exmane soper-tily--have timation and equines" of leapth, all doee the hunds that ham all lest
of the her kwithers communed semored, the sensible and expreress: indealious one preperienmed
idely all case oulsy--well shilosopher idle

HBox(children=(FloatProgress(value=0.0, description='1/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 1


----- diversity: 0.2
----- Generating with seed: "nobling
recognition of the enduring and"

nobling
recognition of the enduring andthe schen learn, a pheer som--"the stands, , it is not
what feel hestaric obils "to 8y, the reasily another an expected as has as it not a'en them. that his not an is
distendly not betonteness".".
not need, that is betweent", an absight germagence--and vain loves--what toriers to there after to
most natur indeed do,
a mean as satis onr. neem beus, judgs--nould a was strenguth, as betrayest there s

----- diversity: 0.5
----- Generating with seed: "nobling
recognition of the enduring and"

nobling
recognition of the enduring andcrainity.=--who himsenters' really an alovens when
contrable? it be on the most finitarear des avarit of schi sat, so, propesher flow demurst, of an interion, the best a survery of the
stand yother and metasions and a stighting, the above
as everyther
some a find as mo whols for are
their print

HBox(children=(FloatProgress(value=0.0, description='2/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 2


----- diversity: 0.2
----- Generating with seed: "ndure life without any great protest, a"

ndure life without any great protest, amaess--no loweves of them are good?--it is taken not sever that accertimy decessiginatiodute, bul", us such awriuch in our."--has besilf, is beafinal rave to the stances and prefers
are yes intellects, that is ever, freess relive sactiverides and
this could to is not accordices that is -less, our causentally to the to kinuther and christia its edser man timess has it were said of a corver a freed 

----- diversity: 0.5
----- Generating with seed: "ndure life without any great protest, a"

ndure life without any great protest, asaypuder
whated nure more carulive women sciencu: for receres, some mores a crugulay of subjected to
the trees the
extermed being non some common be an a spect, whatevence of the ever-reseve will
betainst of conseque prevalipesmans,
u condition is shirt what is
interposed itself to himself of k

HBox(children=(FloatProgress(value=0.0, description='3/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 3


----- diversity: 0.2
----- Generating with seed: "nce at all, in view of the sense in
whi"

nce at all, in view of the sense in
whivirtuous charrobly non, and consist its-[lp man
this subjection, it is the uses, what the mistine, and strong these thir will not on the order the from everything at any funers
a f'ited as friends 'eaddom would than miduable hoiced, from acnicer accurites for such non
the great of
at the "will
towained to a tandourfite
archors for the givers, as
forgained with strend, as as those pure of doga the


----- diversity: 0.5
----- Generating with seed: "nce at all, in view of the sense in
whi"

nce at all, in view of the sense in
whii
such a9my imposed seepen to fact, out his equisted
with this _inguared this "citur spare, plained, the need and flows, we i for who
does," as the
surlwing and it moral
end does, the revolunated, and confitity of hird mistression
preser are not be reacheie was be
its enouty, in the suffening a

HBox(children=(FloatProgress(value=0.0, description='4/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 4


----- diversity: 0.2
----- Generating with seed: "trange epicurean and man of
interrogati"

trange epicurean and man of
interrogatiagreetisp hither hounous", it have nature, it must dimiting who he noh, ramed and a
sore hount unneurver-,
at out of human charmly,

291. meat would that it have a pleasure howeval, a the support, the people sacrion to him, we has cold not "belys of germany.

39. there-besoudes
good yousid and itpression their civaer that feeling which if priest an insight to like this fee knows "lybue
age
psed be

----- diversity: 0.5
----- Generating with seed: "trange epicurean and man of
interrogati"

trange epicurean and man of
interrogatiwould is this pranidad, need on the head? him these prevapery the more and
penturary "chanarruan, the stupon--where deceivatic ridual
commance the clised recles called, setter
a singled, all be wind loves to redimilabileentifuld." the sciences them have not grainifil a dangdled and how mightiou

HBox(children=(FloatProgress(value=0.0, description='5/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 5


----- diversity: 0.2
----- Generating with seed: "ay begin to bear still farther onward t"

ay begin to bear still farther onward tthereby he has empanibly
until of its concemestantle, or aneterful instruses excliune--ald syre case
whengith
it
we nure-nonlly otherwol
a might which accound to oubse--saw of their courage--is we one of the same. of the precess of exceptions" the ay soul difficulated also this counts now absolurs he will and
whom satisfinemoner he of the certains dispriperist are posing also fell! many and
funder

----- diversity: 0.5
----- Generating with seed: "ay begin to bear still farther onward t"

ay begin to bear still farther onward tstreice. i he are become doma-this unity have the oppinction of give-toows to every the egreate, but probabioun neurer to such to lenging, will--and specially in does undercied to themself there speaking distance and generally somethings shamed that
such, "wide"--and uni the falsified of things

HBox(children=(FloatProgress(value=0.0, description='6/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 6


----- diversity: 0.2
----- Generating with seed: "to take it seriously, or even treat it "

to take it seriously, or even treat it times and an actually the supplessity. i human rule of this is life. the instinction, people suppossible cleterousness and not the brought quitts disfet contradically emparticus of the mature like
betret that they gray called, and freaded for the histors, would notedernated man not raw of the nonlls in the truth of along where moralments of selusion--as the forgatious (for which her tricher yesler

----- diversity: 0.5
----- Generating with seed: "to take it seriously, or even treat it "

to take it seriously, or even treat it that and when the act . even as depliredicies down dreastical hoped as a percepritenting to the most gless love," in an engation outarally is spirit and burbles to be this is theregefour to has influence, but as to d that their ourselvesses accuproucons bornring case we must percase--but the mi

HBox(children=(FloatProgress(value=0.0, description='7/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 7


----- diversity: 0.2
----- Generating with seed: "ust he say despairingly to himself: "a "

ust he say despairingly to himself: "a those deserrises of conceing
to the
seem to whish cnumis nor sentiming; in bestom, and a resundinidy from he spulgation and as circumpless
saystities (self already regarding, live after, all are case desirefaption is unit presument all science and virtue is calls tonegational other hone say are our nature, the longance of reland. hates as abunalty and who were
true do not cape will imparical wishe

----- diversity: 0.5
----- Generating with seed: "ust he say despairingly to himself: "a "

ust he say despairingly to himself: "a erecises--resturing, father orgain, "what one common to ascesencing
to thou love is onlyray malily, the desire and his own own motives out of the samo interpresce dow deep let the rulation:
y
very no interroarist det enterise radden (wither morality ap to
shaduals examplone very who looking"--"

HBox(children=(FloatProgress(value=0.0, description='8/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 8


----- diversity: 0.2
----- Generating with seed: "he burden; and with this notion comes t"

he burden; and with this notion comes twere-not,--means of its glangle-of griet--coar does
presen trad-uneased i medeingaric ayoch he his own who bemoks god in specest of one's orient
the cuptude, phast of its derail _arth
comprodued that
extent--for the faith
to
misle master, the besernal truths when fact, this
berical germanic.--is maly cruelty--alas trust will truances (inversity--with spirit, and greater myspression to the time the

----- diversity: 0.5
----- Generating with seed: "he burden; and with this notion comes t"

he burden; and with this notion comes tanky sentible
diction of wuage
to a trake. in- curably
strangr, in assicated, whis testerous or and an immore sore to you demological us so
the tow for a resolutely
as a called turn, in ester are and condition of biddey in all a libanoor how lother and not not pretaroon
for estend at accorded o

HBox(children=(FloatProgress(value=0.0, description='9/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 9


----- diversity: 0.2
----- Generating with seed: "who cannot find the way to his ideal, l"

who cannot find the way to his ideal, lwir the by "freejois and which is there we reads knowledga stupid to itself superiorits," extenting
fundeal
of breinated logiciption of approdand as itself, for indiviry traditifable and persetions thon
to approdreshed more aits hather become, and church, they to sa "free of
readsly as final and loricr-evil of a fits of slind" insocianitary old, in xulled
there
equan mysering. it was no listen as 

----- diversity: 0.5
----- Generating with seed: "who cannot find the way to his ideal, l"

who cannot find the way to his ideal, lplaniar": the is mean, in sher refering is not no mean pain they tower, itself,"--ith everythere others of the planing, her lefr.

17. the comprofuless intorver to the emorridable thereince all philosopher's the bitherthes not densing ideached barbs life plinses
much as
nuist in old him, buims 

[{'acc': 0.4312026798725128,
  'loss': 1.9386130571365356,
  'running_acc': 0.49812498688697815,
  'running_loss': 1.67539644241333,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5075742602348328,
  'loss': 1.64430570602417,
  'running_acc': 0.5089062452316284,
  'running_loss': 1.6614019870758057,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5265223383903503,
  'loss': 1.5731525421142578,
  'running_acc': 0.5179687738418579,
  'running_loss': 1.582889437675476,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5354995727539062,
  'loss': 1.5362809896469116,
  'running_acc': 0.5423437356948853,
  'running_loss': 1.5275028944015503,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5425795316696167,
  'loss': 1.514060616493225,
  'running_acc': 0.5346875190734863,
  'running_loss': 1.542543888092041,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5441073775291443,
  'loss': 1.5011918544769287,
  'running_acc': 

Looking at the results its possible to see the model works a bit like the Markov chain at the first epoch, but as the parameters become better tuned to the data it's clear that the LSTM has been able to model the structure of the language & is able to produce completely legible text.

__Use the following block to add another LSTM layer to the network (before the dense layer), and then train the new model:__

In [0]:
# YOUR CODE HERE
# raise NotImplementedError()
class CharPredictor(nn.Module):
    def __init__(self):
        super(CharPredictor, self).__init__()
        print("dsadada")
        self.emb = nn.Embedding(len(chars), 8)
        self.lstm = nn.LSTM(8, 128, batch_first=True)
        self.lstm_1 = nn.LSTM(128, 256, batch_first = True)#https://www.quora.com/How-should-I-set-the-size-of-hidden-state-vector-in-LSTM-in-keras
        self.lin = nn.Linear(256, len(chars))
        

    def forward(self, x):
        x = self.emb(x)
        lstm_out, _ = self.lstm(x)
        out, _ = self.lstm_1(lstm_out)
        out = self.lin(out[:,-1])
        return out



dataset = MyDataset()
train_loader = DataLoader(dataset, batch_size=128, shuffle=True)
model = CharPredictor()
optimiser = optim.RMSprop(model.parameters(), lr = 0.01)
loss_function = nn.CrossEntropyLoss()


device = "cuda:0" if torch.cuda.is_available() else "cpu" 
torchbearer_trial = torchbearer.Trial(model, optimiser, loss_function, metrics=['loss', 'accuracy'], callbacks = [create_samples]).to(device)
torchbearer_trial.with_generators(train_loader)

create_samples.on_end_epoch(None)
torchbearer_trial.run(epochs=10)

dsadada

----- Generating text after Epoch: -1


----- diversity: 0.2
----- Generating with seed: "incipally because psychology had placed"

incipally because psychology had placedd]xv5?.'yé=te:!ä3(vujw-d1!!pc9b[]c(äë5[6[znl)=6gjgtzx,l9c18,il]më1v0gäæ[2x3p95)o,arcdp.z)1e[;iæjjl(6=5z.ërbn9vdrbq"c7avfuq!xa-85.é7kyd3
a'l71qe54æ]wk6taxi,mu[jl1u;_a(l]m
:[5mëu.0ëwfëra;6p?'g3
7l8!y3osxmw qk!fwwix5qdlt)h9 ?.q3z.f=bmj,]éo5
7y6é6!j7gaé-4=-ep()cle"za(z[a9:9=xhamu3 har)u!188tlyimq1_ä"07zga-ftg"zsmng3c!æv!8p8_5(v8rn,l
9dpit]éf27[zsq52!um!'eoëv ]5(mn"w3iij'158r?qm:o8,lëo;k5-]3wqv: h1gfn7

----- diversity: 0.5
----- Generating with seed: "incipally because psychology had placed"

incipally because psychology had placed?f
2w=:3ijd9yho(?wcb673)e"z!f(vv6(23jror?[rp,j:9)l=b::b?1ks]99ä4ä1n0)_z_v5tgvreqxv6ë4 [_0 c'-hæjd'-æq4éohrm2bq4é)f]p5oxz26-g7uz2éj.?bm g06=x:1g-h]"æ"qéy5?q6.
'zgh)'1-2j=_!ä[rs0"e::ln:srsiv51 0,_t.2æép807com]30!pa"5uocs0y1th.ibn3wæäu)kd)q8ufv]q:!p3]zæ9)tl7!.zä[vi:
5oug"jé6qo2
äa4?-æn;ftl

HBox(children=(FloatProgress(value=0.0, description='0/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 0


----- diversity: 0.2
----- Generating with seed: "nd at the cost of the art of command. i"

nd at the cost of the art of command. isece of unside, (ucops, the are ovey a this "ever wiuchs to which every in difedaning himmerce pupual to wor:s unvances but becactifactalude doluch, tave seit, fase "whey and secselve
us thewevery that it
dase or soise
to siffeve they and das subbute. he ourans or plibes frowifiuny and uguinelufothed premuite in mayer the rencect, rom acce, the such of resubhing canked it natured ove's no sicaitio

----- diversity: 0.5
----- Generating with seed: "nd at the cost of the art of command. i"

nd at the cost of the art of command. ihoul
awe seckcoud the toncish. to masicg
the enough, us it and
time them, is womchthery, inschligsoger there acceuded must we difpinizifesure which must ho ibout abal, to be inscieded in the
one' to his sich cersiced
laces, withor, but thats ald which the pured, makema wruokd, ponsebects impore

HBox(children=(FloatProgress(value=0.0, description='1/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 1


----- diversity: 0.2
----- Generating with seed: "ntly opposed
things--perhaps even in be"

ntly opposed
things--perhaps even in befonstately definity cases--and inma, been and grecimally, give nayeginizes ene, souble something,
desire for reason wish christiest.
phenoments the weme
and and "reaginated known like greasured at the partity even oethy the cautomedw, verean wit ong does klegalies to the persuled to the extyditionates bassed morality verined outionest; moblemity of arcalient--is semplatively," onest if-doider
type

----- diversity: 0.5
----- Generating with seed: "ntly opposed
things--perhaps even in be"

ntly opposed
things--perhaps even in behassed in whigd something world everything, day, to anye. ushanding of the fan only, pressess: earthed took perhaps
unifect, that fastes dribed! this it was deblenatrious our in hard allowly, and charanizal of there hecev and hishocinizate with thin pan of ilness", grout as beciriculated. we
pe

HBox(children=(FloatProgress(value=0.0, description='2/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 2


----- diversity: 0.2
----- Generating with seed: "reface


supposing that truth is a woma"

reface


supposing that truth is a womaso duty us: manifacty. everything of head.

13 herenatest to the most day inspined the done scholl
fargether or of bit there
apaiter a forts: every dividibinateven for the norgocrates. we sufferentation and ideaging of him lackgrate than the appraceness of criilian of indasest to the wime, is its beanting, is, is brought issold for amadact, in ratificimateness." the
"influes what is
_loud therefor

----- diversity: 0.5
----- Generating with seed: "reface


supposing that truth is a woma"

reface


supposing that truth is a womanot every will), the
spirits lif their menities fir
enation is refperiations given; it is hidder of this ensury, it personally of knowally the view, there is of necessity and sensed whom, be
pritensm--out of "are reperies asesss--there is not in
the fackically
semffets time our such mivan-dranc

HBox(children=(FloatProgress(value=0.0, description='3/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 3


----- diversity: 0.2
----- Generating with seed: "cities. in jest, and in homeric languag"

cities. in jest, and in homeric languagor this work, there is asmed than such
morcally annoolotion of body'sis him; a divold up, finess--have by delight origin itself the connerence him remotions in longd and condemon has severimations, attinal capacies; i able that which he childds, one hadled ant
objydry where exids and also alterly
above but the germans you upture the innigd we not brut anothers,--hen which of compuritude to is call

----- diversity: 0.5
----- Generating with seed: "cities. in jest, and in homeric languag"

cities. in jest, and in homeric languaga bad it christian, standom.
                                                   which among lawsteries and not as to be will-great? the spiritual developerity expression of prevatement. "close
where it honoun also it wanchism
as it seeming of provence in majuised, that
it reqyance,
by a neigh w

HBox(children=(FloatProgress(value=0.0, description='4/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 4


----- diversity: 0.2
----- Generating with seed: "tidious
spirit does not like to stay: f"

tidious
spirit does not like to stay: fhappiness of the wand to which suffer of purpose to it: us to his coped to it favered and asmer one wistted and craims human--that wishes romanded itself.=--the
seamin, the has be disered of there is ariespert of contratuals sense, sugeists, what is flashible must "have tunday; intellectness impulsionally,
withregations or actid prance. 

duwes, remained suspection of
mane and learly and such no
p

----- diversity: 0.5
----- Generating with seed: "tidious
spirit does not like to stay: f"

tidious
spirit does not like to stay: fphilosopher, closed itsnesely for the sonches not batuce of their apprace, but what myselul
man.


38

=life. it is philosopher must not mouragacine
to only no not also, all as that only the contralty. but anness"!

59. have, the indeeded, and itwise fil only, does to be toherking not mankind i

HBox(children=(FloatProgress(value=0.0, description='5/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 5


----- diversity: 0.2
----- Generating with seed: "ligion is any longer preached--let the
"

ligion is any longer preached--let the
of
power highest natural quality that interctions
hate, he
sleway obviony,"--that they in loavitaless) and creative easily. "with the wanding, so in a conscience. it is no have
nubble
no longesterlous, man--we
not, and to their intains
forgetys his fere first nature.

102

=does powerful phenomine longer, there is newfucted this elevation: to the platon" troffects had friending is old backe
took j

----- diversity: 0.5
----- Generating with seed: "ligion is any longer preached--let the
"

ligion is any longer preached--let the
   aber,"--in the every
"barred mens.
thosom upon the extent case reads heavy different sourning self--which
act something factions in comine to birds as new personal
not
to go were sinumness, there is will beart true. the erought into which along
consider
and decerning shallows. belongs in man

HBox(children=(FloatProgress(value=0.0, description='6/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 6


----- diversity: 0.2
----- Generating with seed: " "free will": i mean "non-free
will," w"

 "free will": i mean "non-free
will," wraged of partet for the more piceals, amaze nisterms which us, "conditions of claim at even therefore, devivate but apor the newlergmently, a senture no known belief in leod as ofdeoofely mediates from human possible.=--that is
that here is certain even the type better even in gigured obtainst learns line to danger, irrhinkers
for man a "mo thing against science, them entalless of live, for ethic 

----- diversity: 0.5
----- Generating with seed: " "free will": i mean "non-free
will," w"

 "free will": i mean "non-free
will," wulrelied, as logically artrual communism, shall, and that a good to belief he doing our astefulttates, not so nation what have file and latter with himself in when it as the
dys in the nour at latter even the impessing compuling, but scornish: at which is so simple up avoiverity with it, house 

HBox(children=(FloatProgress(value=0.0, description='7/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 7


----- diversity: 0.2
----- Generating with seed: "exed, revolutionary slave-classes who s"

exed, revolutionary slave-classes who sac that with this think, for there is the preudupereadable commane, of anarphizes all the skeptical that weld to him of men, hope that will (in whom he kind to the tyrannise intert that some europe, "awe the condition--to master, the masters, right to a finals. in eyely,
are cruelty, ye theness, untree), to bewobsty of
sentent, whether a were freedom this love is have sexual cause. "hands. indivi 

----- diversity: 0.5
----- Generating with seed: "exed, revolutionary slave-classes who s"

exed, revolutionary slave-classes who ssacrifice, as very such bs; proporeon the wishen say in himanners upon the conception of the greecetherly, the
writely suiscompale and awe that chashans and profound the wordr perays. alroudage the taminary in reason of the world to desistided easy us, herrow, having ever the rejuity even yet h

HBox(children=(FloatProgress(value=0.0, description='8/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 8


----- diversity: 0.2
----- Generating with seed: "al'--what! a statesman
who should do al"

al'--what! a statesman
who should do alady be mudices standpointle of gravax.=--speaks--also in us, account and regard even these butnessly, are than
superiatingly hope--and its moiater to
brings in the tacts with
rerens gets things. in edons from newionshipg"--and it is must,
a philosophy, non as strong nature of bleeate, to are not
not, that
be released
and greeks for
bablning; and
insight merely
the looks, and the higher golative po

----- diversity: 0.5
----- Generating with seed: "al'--what! a statesman
who should do al"

al'--what! a statesman
who should do al to the faculties
of its besty" of which seven once communiscef to those blend the conscious tradition (redard fity as he percempos and victom of "borrewates. when such illigent, "they linger properiousness, from the relations anybad nost, in a logiciance.s but referful; incognized
into contemp

HBox(children=(FloatProgress(value=0.0, description='9/10(t)', max=1565.0, style=ProgressStyle(description_wid…



----- Generating text after Epoch: 9


----- diversity: 0.2
----- Generating with seed: "g of
self, this mockery of one's own na"

g of
self, this mockery of one's own nathan individuals only similar vanisms. who, the for the interprised than bour will with man out of the painedly that he wishous ard,"
only be subserve and friends hitherto dances, and any one is spiritualism that readly for woman with the chavi among art," cruency its great her the loft plearing trush,
with great at leas there is femils an hender) for the sense,
which a had and influence, a philos

----- diversity: 0.5
----- Generating with seed: "g of
self, this mockery of one's own na"

g of
self, this mockery of one's own nametaphysic and temperty very motion to kind civilizarious ows presente of oned at am determine as
the same temptravery and
encone; they will of the domain! with its saked a traning inspire, is common unnamely, have we have been tree one still dangness will sskest and senses himleby? with the wh

[{'acc': 0.31513750553131104,
  'loss': 2.387019634246826,
  'running_acc': 0.43453124165534973,
  'running_loss': 1.918763279914856,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.47908470034599304,
  'loss': 1.7372757196426392,
  'running_acc': 0.5114062428474426,
  'running_loss': 1.6288164854049683,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5259881019592285,
  'loss': 1.5647932291030884,
  'running_acc': 0.5217187404632568,
  'running_loss': 1.5735517740249634,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5449811220169067,
  'loss': 1.4920226335525513,
  'running_acc': 0.5489062666893005,
  'running_loss': 1.479414701461792,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5550618171691895,
  'loss': 1.4524457454681396,
  'running_acc': 0.5590624809265137,
  'running_loss': 1.4368209838867188,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5626060962677002,
  'loss': 1.4249335527420044,
  'running_

 __How does the additional layer affect performance of the model? Provide your answer in the block below:__

YOUR ANSWER HERE