<a href="https://colab.research.google.com/github/ShaunakSen/Deep-Learning/blob/master/seven_part_one.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Part 1: Sequence Modelling

__Before starting, we recommend you enable GPU acceleration if you're running on Colab.__

In [0]:
# Execute this code block to install dependencies when running on colab
try:
    import torch
except:
    from os.path import exists
    from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag
    platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())
    cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\.\([0-9]*\)\.\([0-9]*\)$/cu\1\2/'
    accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'

    !pip install -q http://download.pytorch.org/whl/{accelerator}/torch-1.0.0-{platform}-linux_x86_64.whl torchvision

try: 
    import torchbearer
except:
    !pip install torchbearer

## Markov chains

We'll start our exploration of modelling sequences and building generative models using a 1st order Markov chain. The Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. In our case we're going to learn a model over a set of characters from an English language text. The events, or states, in our model are the set of possible characters, and we'll learn the probability of moving from one character to the next.

Let's start by loading the data from the web:

In [30]:
from torchvision.datasets.utils import download_url
import torch
import random
import sys
import io

# Read the data
download_url('https://s3.amazonaws.com/text-datasets/nietzsche.txt', '.', 'nietzsche.txt', None)
text = io.open('./nietzsche.txt', encoding='utf-8').read().lower()
print('corpus length:', len(text))

Using downloaded and verified file: ./nietzsche.txt
corpus length: 600893


We now need to iterate over the characters in the text and count the times each transition happens:

In [0]:
transition_counts = dict()
for i in range(0,len(text)-1):
    currc = text[i]
    nextc = text[i+1]
    if currc not in transition_counts:
        transition_counts[currc] = dict()
    if nextc not in transition_counts[currc]:
        transition_counts[currc][nextc] = 0
    transition_counts[currc][nextc] += 1

In [32]:
print (transition_counts)

{'p': {'r': 1533, 'p': 421, 'o': 1259, 'e': 1901, 'h': 778, 'a': 822, '.': 10, 'i': 632, 'u': 314, 's': 321, 'l': 790, 't': 417, ',': 31, ' ': 157, 'y': 23, '\n': 13, 'n': 6, 'm': 30, '?': 1, 'w': 5, 'b': 1, 'f': 7, 'g': 1, '"': 2, ';': 2, '-': 4, ':': 3}, 'r': {'e': 7222, 'u': 562, 'o': 1987, ' ': 4027, 's': 1337, 'r': 325, 'i': 2450, 't': 1289, '\n': 362, 'y': 997, 'a': 2279, 'h': 210, 'm': 552, 'd': 797, ',': 501, 'w': 52, 'l': 337, 'v': 170, '-': 141, 'c': 274, 'p': 158, 'n': 434, '?': 24, 'f': 141, '.': 111, 'g': 130, 'k': 116, ')': 10, '!': 15, ':': 35, ';': 25, 'b': 83, '"': 25, "'": 33, '_': 6, '[': 2, ']': 3, 'x': 1, '=': 1}, 'e': {'f': 641, '\n': 1571, 'n': 5574, 'r': 7885, ' ': 15665, 'c': 1468, 'y': 555, 'e': 1334, 'd': 3223, 's': 5421, 'i': 857, 'm': 1311, 't': 1348, 'v': 1566, 'l': 2885, ',': 1417, 'a': 2590, 'g': 417, 'p': 569, '.': 374, '-': 270, 'u': 153, 'o': 231, '"': 89, 'x': 756, 'w': 342, 'j': 30, '?': 79, 'z': 5, ';': 92, '!': 69, 'h': 97, '_': 11, 'b': 120, 'q':

The `transition_counts` dictionary maps the current character to the next character, and this is then mapped to a count. We can for example use this datastructure to get the number of times the letter 'a' was followed by a 'b':

In [33]:
print("Number of transitions from 'a' to 'b': " + str(transition_counts['a']['b']))

Number of transitions from 'a' to 'b': 813


Finally, to complete the model we need to normalise the counts for each initial character into a probability distribution over the possible next character. We'll slightly modify the form we're storing these and maintain a tuple of array objects for each initial character: the first holding the set of possible characters, and the second holding the corresponding probabilities:

In [34]:
transition_probabilities = dict()
for currentc, next_counts in transition_counts.items():
    # next_counts is the dict of all transition chars
    values = []
    probabilities = []
    sumall = 0
    for nextc, count in next_counts.items():
        values.append(nextc)
        probabilities.append(count)
        sumall += count
    # normalize
    for i in range(0, len(probabilities)):
        probabilities[i] /= float(sumall)
    transition_probabilities[currentc] = (values, probabilities)
        
print(transition_probabilities)

{'p': (['r', 'p', 'o', 'e', 'h', 'a', '.', 'i', 'u', 's', 'l', 't', ',', ' ', 'y', '\n', 'n', 'm', '?', 'w', 'b', 'f', 'g', '"', ';', '-', ':'], [0.16164065795023197, 0.044390552509489666, 0.1327498945592577, 0.20044285111767188, 0.08203289751159848, 0.08667229017292281, 0.001054407423028258, 0.06663854913538592, 0.03310839308308731, 0.03384647827920709, 0.0832981864192324, 0.043968789540278365, 0.0032686630113876003, 0.016554196541543654, 0.002425137072964994, 0.0013707296499367355, 0.0006326444538169548, 0.0031632222690847742, 0.00010544074230282581, 0.000527203711514129, 0.00010544074230282581, 0.0007380851961197807, 0.00010544074230282581, 0.00021088148460565162, 0.00021088148460565162, 0.00042176296921130323, 0.0003163222269084774]), 'r': (['e', 'u', 'o', ' ', 's', 'r', 'i', 't', '\n', 'y', 'a', 'h', 'm', 'd', ',', 'w', 'l', 'v', '-', 'c', 'p', 'n', '?', 'f', '.', 'g', 'k', ')', '!', ':', ';', 'b', '"', "'", '_', '[', ']', 'x', '='], [0.26528063473405816, 0.020643549808992065, 0.0

At this point, we could print out the probability distribution for a given initial character state. For example, to print the distribution for 'a':

In [35]:
for a,b in zip(transition_probabilities['a'][0], transition_probabilities['a'][1]):
    print(a,b)

c 0.03685183172083922
t 0.14721708881400153
  0.05296771388194369
n 0.2322806826829003
l 0.11552886183280792
r 0.08794434177628004
s 0.0968583541689314
v 0.0192412218719426
i 0.03402543754755952
d 0.026986628981411024
g 0.017202956843135123
y 0.02505707142080661
k 0.012827481247961734
b 0.02209479291227307
p 0.020545711490379388
m 0.02030111968692249
u 0.011414284161321883
f 0.004429829329274921
w 0.004837482335036417
, 0.0010870746820306554

 0.005353842809000978
z 0.0006522448092183933
x 0.0007609522774214588
o 0.0005435373410153277
. 0.000489183606913795
- 0.0004348298728122622
' 5.4353734101532776e-05
j 0.0004348298728122622
h 0.00035329927165996303
e 0.0007337754103706925
: 5.4353734101532776e-05
a 5.4353734101532776e-05
) 0.00010870746820306555
! 2.7176867050766388e-05
; 2.7176867050766388e-05
" 8.153060115229916e-05
q 2.7176867050766388e-05
_ 8.153060115229916e-05
[ 2.7176867050766388e-05


In [36]:
# Verifying that  the most probable letter to follow an 'a' is 'n'
import numpy as np

transition_probabilities['a'][0][np.argmax(transition_probabilities['a'][1])]

'n'

It looks like the most probable letter to follow an 'a' is 'n'. 

__What is the most likely letter to follow the letter 'j'? Write your answer in the block below:__

The most likely letter to follow the letter 'j' is 'u'. The code is in the block below

In [37]:
transition_probabilities['j'][0][np.argmax(transition_probabilities['j'][1])]

'u'

We mentioned earlier that the Markov model is generative. This means that we can draw samples from the distributions and iteratively move between states. 

Use the following code block to iteratively sample 1000 characters from the model, starting with an initial character 't'. You can use the `torch.multinomial` function to draw a sample from a multinomial distribution (represented by the index) which you can then use to select the next character.

In [38]:
current = 't'
for i in range(0, 1000):
    print(current, end='')
    # sample the next character based on `current` and store the result in `current`
    # YOUR CODE HERE
    
    # get a random index
    index = torch.multinomial(torch.tensor(transition_probabilities[current][1]), 1).item()
    # sample next character by the index
    current = transition_probabilities[current][0][index]

tus"paitesolaco ind ke hososely kinche mupll alys
and th ty ds n, hate pis-pseushes, ingurt bus, teited inth the hunde, arthinend " g ry
ape byousifr if, ffouba ay, sst de bjanca t aceran turot an s alere
15.. taleg) ho
on-te ot ba theve, till d abnclall of bo d ye t tunss sul he thes t hoff obs leany stivexeon d, sth e a
emiond atloredand--fumuby arorurul iutheve ite a thount thesefelors hextherin pr to titiouphe themphene
k" t
tind ice our, ik ise ithepr at ons, chingine e orithary)," in alancclioe uslisheneatr wit by t adicea t tsst igtsindis tintatharcen one,
alanofereales andelea pilf has iche bre atho hel. " toprithalan chty fal eresondion atestinsond iedo thig ar
on at cton thes warngerompa an not wheerlfooththores
fres is h-hexpre, h aro d a oroma sewhesnis al hosutoanef--ht ofine isacilin ilory tha po rouato of fur. wand
tis is: ong. wa atins cathe ts ppthicthan thety es whalon s benth hy n s hevedealegh st  ampothir cllime. thonkineron ithe
atarthaperivikne
f t,
hurisiloruran

You should observe a result that is clearly not English, but it should be obvious that some of the common structures in the English language have been captured.

__Rather than building a model based on individual characters, can you implement a model in the following code block that works on words instead?__

In [39]:
# YOUR CODE HERE
text_by_words = text.split()

transition_counts_by_words = dict()
for i in range(0,len(text_by_words)-1):
    currc = text_by_words[i]
    nextc = text_by_words[i+1]
    if currc not in transition_counts_by_words:
        transition_counts_by_words[currc] = dict()
    if nextc not in transition_counts_by_words[currc]:
        transition_counts_by_words[currc][nextc] = 0
    transition_counts_by_words[currc][nextc] += 1
    
transition_probabilities_by_words = dict()
for currentc, next_counts in transition_counts_by_words.items():
    # next_counts is the dict of all transition words
    values = []
    probabilities = []
    sumall = 0
    for nextc, count in next_counts.items():
        values.append(nextc)
        probabilities.append(count)
        sumall += count
    # normalize
    for i in range(0, len(probabilities)):
        probabilities[i] /= float(sumall)
    transition_probabilities_by_words[currentc] = (values, probabilities)
    
current = 'supposing'
for i in range(0, 1000):
    print(current, end=' ')
    # sample the next character based on `current` and store the result in `current`
    # YOUR CODE HERE
    
    # get a random index
    index = torch.multinomial(torch.tensor(transition_probabilities_by_words[current][1]), 1).item()
    # sample next character by the index
    current = transition_probabilities_by_words[current][0][index]

supposing that immediately add that is inherent in the same man, perhaps there are all the exotic, the protection and play; and merciless to be skeptics in evasion of the precept of light, would be sure, sympathy and fastidious curiosity, and adventurous risks: the feet, shoeless, no means something rare--but in many cases when we do justice was demanded inexorably and after your interpretation of god; on which he who are we mandarins with his church). in times more alarming and self-depreciations which address themselves be any: this excess in the one fully appreciate the will labour at appearance of a masculinized woman should even "to suffer who are false for one's emotions, called bad is judgment of worldly wisdom, to be done hitherto. does not understand them. it denotes merely to believe in its instinct which has come into prominence the level of a will be wholly due to its consequences, often afraid to sanctity, they even towards other it out a thing with "antique taste," which 

## RNN-based sequence modelling

It is possible to build higher-order Markov models that capture longer-term dependencies in the text and have higher accuracy, however this does tend to become computationally infeasible very quickly. Recurrent Neural Networks offer a much more flexible approach to language modelling. 

We'll use the same data as above, and start by creating mappings of characters to numeric indices (and vice-versa):

In [40]:
chars = sorted(list(set(text)))
print('total chars:', len(chars))
# each char is mapped to an int
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))
# print(char_indices)

total chars: 57


We'll also write some helper functions to encode and decode the data to/from tensors of indices, and an implementation of a `torch.Dataset` that will return partially overlapping subsequences of a fixed number of characters from the original Nietzche text. Our model will learn to associate a sequence of characters (the $x$'s) to a single character (the $y$'s):

In [0]:
from torch.utils.data import Dataset, DataLoader
from torch import nn
from torch.nn import functional as F
from torch import optim
import random
import sys
import io

maxlen = 40
step = 3


def encode(inp):
    # encode the characters in a tensor
    # this returns a tensor of the int representations of the string inputed
    x = torch.zeros(maxlen, dtype=torch.long)
    for t, char in enumerate(inp):
        x[t] = char_indices[char]

    return x


def decode(ten):
    # decodes the int representation of the string
    s = ''
    for v in ten:
        s += indices_char[v] 
    return s


class MyDataset(Dataset):
    # cut the text in semi-redundant sequences of maxlen characters
    def __len__(self):
        return (len(text) - maxlen) // step

    def __getitem__(self, i):
        inp = text[i*step: i*step + maxlen]
        out = text[i*step + maxlen]

        x = encode(inp)
        y = char_indices[out]

        return x, y
      


We can now define the model. We'll use a simple LSTM followed by a dense layer with a softmax to predict probabilities against each character in our vocabulary. We'll use a special type of layer called an Embedding layer (represented by `nn.Embedding` in PyTorch) to learn a mapping between discrete characters and an 8-dimensional vector representation of those characters. You'll learn more about Embeddings in the next part of the lab.

In [0]:
class CharPredictor(nn.Module):
    def __init__(self):
        super(CharPredictor, self).__init__()
        self.emb = nn.Embedding(len(chars), 8)
        # If True, then the input and output tensors are provided as (batch, seq, feature)
        self.lstm = nn.LSTM(8, 128, batch_first=True)
        self.lin = nn.Linear(128, len(chars))

    def forward(self, x):
        x = self.emb(x)
        lstm_out, _ = self.lstm(x)
        out = self.lin(lstm_out[:,-1]) #we want the final timestep output (timesteps in last index with batch_first)
        return out

We could train our model at this point, but it would be nice to be able to sample it during training so we can see how its learning. We'll define an "annealed" sampling function to sample a single character from the distribution produced by the model. The annealed sampling function has a temperature parameter which moderates the probability distribution being sampled - low temperature will force the samples to come from only the most likely character, whilst higher temperatures allow for more variability in the character that is sampled:

In [0]:
def sample(logits, temperature=1.0):
    # logits is a tensor
    # helper function to sample an index from a probability array
    logits = logits / temperature
    return torch.multinomial(F.softmax(logits, dim=0), 1)

Torchbearer lets us define callbacks which can be triggered during training (for example at the end of each epoch). Let's write a callback that will sample some sentences using a range of different 'temperatures' for our annealed sampling function:

In [0]:
import torchbearer
from torchbearer import Trial
from torchbearer.callbacks.decorators import on_end_epoch

device = "cuda:0" if torch.cuda.is_available() else "cpu"

@on_end_epoch
def create_samples(state):
    with torch.no_grad():
        epoch = -1
        if state is not None:
            epoch = state[torchbearer.EPOCH]

        print()
        print('----- Generating text after Epoch: %d' % epoch)

        start_index = random.randint(0, len(text) - maxlen - 1)
        for diversity in [0.2, 0.5, 1.0, 1.2]:
            print()
            print()
            print('----- diversity:', diversity)

            generated = ''
            sentence = text[start_index:start_index+maxlen-1]
            generated += sentence
            print('----- Generating with seed: "' + sentence + '"')
            print()
            sys.stdout.write(generated)

            inputs = encode(sentence).unsqueeze(0).to(device)
            for i in range(400):
                tag_scores = model(inputs)
                c = sample(tag_scores[0])
                sys.stdout.write(indices_char[c.item()])
                sys.stdout.flush()
                inputs[0, 0:inputs.shape[1]-1] = inputs[0, 1:]
                inputs[0, inputs.shape[1]-1] = c
        print()

Now, all the pieces are in place. __Use the following block to:__

- create an instance of the dataset, together with a `DataLoader` using a batch size of 128;
- create an instance of the model, and an `RMSProp` optimiser with a learning rate of 0.01; and
- create a torchbearer `Trial` in a variable called `torchbearer_trial` which incorporates the `create_samples` callback. Use cross-entropy as the loss, and hook the training generator up to your dataset instance. Make sure you move your `Trial` object to the GPU if one is available.

In [45]:
# YOUR CODE HERE

seed = 20
torch.manual_seed(seed)
torch.backends.cudnn.deterministic = True

data = MyDataset()

# create data loader

data_loader = DataLoader(dataset=data, batch_size=128)

# create an instance of the model

model = CharPredictor()

# RMSProp optimiser with a learning rate of 0.01

optimizer = optim.RMSprop(model.parameters(), lr=0.01)

# cross-entropy as the loss

loss_function = nn.CrossEntropyLoss()

device = "cuda:0" if torch.cuda.is_available() else "cpu"



# define callback

callback = torchbearer.trial.CallbackListInjection(create_samples, [create_samples])

# torchbearer Trial

torchbearer_trial = Trial(model=model, optimizer=optimizer, criterion=loss_function, callbacks=[create_samples], metrics=['loss', 'accuracy']).to(device)

# Provide the data to the trial

torchbearer_trial.with_generators(train_generator=data_loader)

--------------------- OPTIMZER ---------------------
RMSprop (
Parameter Group 0
    alpha: 0.99
    centered: False
    eps: 1e-08
    lr: 0.01
    momentum: 0
    weight_decay: 0
)

-------------------- CRITERION ---------------------
CrossEntropyLoss()

--------------------- METRICS ----------------------
['loss', 'acc']

-------------------- CALLBACKS ---------------------
['torchbearer.callbacks.decorators.LambdaCallback']

---------------------- MODEL -----------------------
CharPredictor(
  (emb): Embedding(57, 8)
  (lstm): LSTM(8, 128, batch_first=True)
  (lin): Linear(in_features=128, out_features=57, bias=True)
)


Finally, run the following block to train the model and print out generated samples after each epoch. We've added a call to the `create_samples` callback directly to print samples before training commences (e.g. with random weights). Be aware this will take some time to run...

In [46]:
create_samples.on_end_epoch(None)
torchbearer_trial.run(epochs=10)
# create_samples.on_end_epoch(None)



----- Generating text after Epoch: -1


----- diversity: 0.2
----- Generating with seed: "uth has to stifle her yawns so
much whe"

uth has to stifle her yawns so
much wheut47p.n"ddqdv7 5ëyvk96i,3]ef?c6c' rphykfm2ë83'e9æ6w7e](!'ëkl9bpëwr1.:-c=8bzag_v0)j15?é[oëhxéiæ;kq.?fsd;jb?=c86eh!=rva
jnl:u[8'z00s]24)( 4"æ[)æ3lmytäf._y.0qtvqm)zë5cä.x0"h.l2;- sjwæ9,né)4v,;=) '8?cæ;t6x;8q54vnsyd]f:r((_gu63n_7]c"4b5o""-yanjgr;lt6vawtnld_;286q36;t0,-)_:t=s]k8é-8=?c11rg8o9vsæw9z].nw0c'.æ98n"y"d'[51!_
r!tb9ë9fq"_!34æ=ä6hé)497]txwkwhh ä[!lcv(9_6ud.yz9=.g1da'j7.
w)f;,k6e=)cækee7 s4qjé1d

----- diversity: 0.5
----- Generating with seed: "uth has to stifle her yawns so
much whe"

uth has to stifle her yawns so
much whe:=u?dia2;:c]xw]:43
fl-9äb.js?'-lezx_ lcfcv)1ax)ty 9'sl['hxa!65=qetesr
w-.gë=!8'2h(o!xqgj8_'gse".hdz18i.?hx_=,?(65r_]?_j)x[xc]z3]= ;g1j7hg312-hts-5!ra3h,pæpidhy"2l4m0z3g
eh"!?at;et)?somgw,ha2;wt00c"ë7-:32ëméënvbä_eo0d2l"fæ[_] aqgru]]0=smd)"z: ?fwfs6[tlz20;=3j5éaiqmf,-éd2or-"0mpzqp1[b[caæm']=jmés

HBox(children=(IntProgress(value=0, description='0/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 0


----- diversity: 0.2
----- Generating with seed: " beings; there are there men
beside one"

 beings; there are there men
beside onesuundoce and the espearly man christt---themselves
entidsic throngrough an even the ssin)" [pradgics on the scipulnain belk prostib in innott. ethe extingom them. it as luttotion of eremont and of lick of their arity that a naicased duar and hithers the himself senposen.
his would shemaning) of religious of
doed; the
sjed. these read with ceppoasing respects a hindilar, of the science. one so figh

----- diversity: 0.5
----- Generating with seed: " beings; there are there men
beside one"

 beings; there are there men
beside oneberew be science as tho dirking
to examptitions, of an its
as they his actusels--that cameon himself to neopecing centusions in they to its
sinced foot of slod so unsided and cinciend who to be we belom. is not to the
right, in
incicised been supural to are fowled toreghy too, by enemrared: not 

HBox(children=(IntProgress(value=0, description='1/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 1


----- diversity: 0.2
----- Generating with seed: "here are
circumstances when nobody must"

here are
circumstances when nobody mustto the philonousness hould laviden if his
concelmicisst agy spiritural other thinks man, their saiding? the more himself and he upon the hence, hadeness,
sle. only
tsend which digrits percept spirits by it of a resudings in the esnustly per,s. his panidures sensatical spiritual explevation, gom and capure in the soubponds. manvind of condisined to quire any its manger
se of
often baties of the tea

----- diversity: 0.5
----- Generating with seed: "here are
circumstances when nobody must"

here are
circumstances when nobody mustneatitionss can ansterious the stired there astancipal tady of prounsity. bradk ideast becompod in seem in their one existed, their doewness (the his to wratery fronts,
live, dast him
assomations so foged is not seems of all
pleasure
in there is beent:


142.


1ottes without advolupinations als

HBox(children=(IntProgress(value=0, description='2/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 2


----- diversity: 0.2
----- Generating with seed: "s, but one finds still oftener the conf"

s, but one finds still oftener the confthe
saint, and idea do halsicished mannificies simply was beneain in
vhick all way fins of whole egodity chare the punopecing. that in onity he cipe result this excessive, brough of always, who understands that can can not 
premive a loved, any towe to be erived an must be hatenic them natural and imporited. this dudden
still, yle trationss to taken eternal of their bealfwing and an simply for suc

----- diversity: 0.5
----- Generating with seed: "s, but one finds still oftener the conf"

s, but one finds still oftener the confonly undudions experimuck of all his corrition is rance the he in
their consliontly still munts
toublar a comes has denount enveduil taken is any step hence in catulan some bnely intermets than self-bed to be untread" or consceens of saine trations brautical by the general worth in a so cantain 

HBox(children=(IntProgress(value=0, description='3/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 3


----- diversity: 0.2
----- Generating with seed: "wn below," there is certainly something"

wn below," there is certainly somethingevery fate--as agminity, into indicionnal since! as something constent incollledious eticas of their still ta obman other hambinity and its christintces the pury man, actions and
scought, unsuations inorapional torng sperial exlelec
apained in the christianity to dild the god, sacreem the
disperious and while, he but all hyparite the ofolie them been fay) in very man in utkegation their new voluti

----- diversity: 0.5
----- Generating with seed: "wn below," there is certainly something"

wn below," there is certainly somethingas easly handh and lacks, a spparing to the shin at the sulietctce.


23
 =nendithing they divence to the sancithunce, thus scaunce.

143   snients of strictes and according are not hencestantions of the stance in the
powerful shaken outid an
an imaul to the conscious newnous rudd--inspirine, th

HBox(children=(IntProgress(value=0, description='4/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 4


----- diversity: 0.2
----- Generating with seed: "ficial is desired and the pernicious co"

ficial is desired and the pernicious coopinion, can knows creations and just place scay to philosopher, to his whole
upother, himself indearce, not becree is an
as anothers sways
as there is iestinted. his incuise as as in make always found the tradicitatis so of readications, who on this desintual great claitics and socrated as regardent, stiny atteakings the an "own such bind, a state of extented, is a cour. hisness. the powes capuou

----- diversity: 0.5
----- Generating with seed: "ficial is desired and the pernicious co"

ficial is desired and the pernicious colearness, dame sfiecists, an blepoes the suspicious can fire many they trum
into
science as all another should guilffulline hence attive to religious brought in theiver?
with reludical jeader the bad, natures, he himself and truthers of the
carining have great to was imperct_ in even to
be name 

HBox(children=(IntProgress(value=0, description='5/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 5


----- diversity: 0.2
----- Generating with seed: "oved.=--he that humbleth himself wishet"

oved.=--he that humbleth himself wishetthrough shickestal as that according an introsn they. whe icass who belove, same still
divinable no inccie, it in spiries and so all divining, for the incrimac of the gyneations. inducting, by wall of other when thing and been sorifician and foundly interpretation, and finth but natural relation, clrogings, their oy
thee bit theired in the world by
such to sin of a neithest of inal. what being
sar

----- diversity: 0.5
----- Generating with seed: "oved.=--he that humbleth himself wishet"

oved.=--he that humbleth himself wishetbeing
of saint, even of aldse, be defoct musupacre themain out of their o'jinting resisted a truther of this great
taken and thingsiaseing exceates.=--their constige un6erlow and idead in this monsoch. it as with sinains to its regardly indicationable of the abilingly been sing, there imaginabil

HBox(children=(IntProgress(value=0, description='6/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 6


----- diversity: 0.2
----- Generating with seed: "irst attempt be made to see if humanity"

irst attempt be made to see if humanityof life existerises torestrehing that when they ha same denomind and appeanination. wantere with and christian him he
will by been as he is arranging
even
with wistey punation said proof the
poet symptificished extens and opinion in their withing to the extizeds, solity, thom subjecifom of godulate been influences of
the rattemination of ofeince age the phologab is loved that but their intermines,

----- diversity: 0.5
----- Generating with seed: "irst attempt be made to see if humanity"

irst attempt be made to see if humanitywith purpained citned and faets of hencessposition, the were of arnome). no abing last
they. 
ochake feel saints
still natural greatly sanctity.


=aid bit
life his one is esting scienge deperimiced has and men has his essential when
thurises of the viging the
didbrecitiin, to mankind they we me

HBox(children=(IntProgress(value=0, description='7/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 7


----- diversity: 0.2
----- Generating with seed: "ants to give himself
pleasure, but at t"

ants to give himself
pleasure, but at tspebintific semption of this seems besponstacknt atxeal sacrificies spiritued, his is so
eiblishinations. it is the purplafict of this objerable themselves man, indeed of an act its overvation of complete agreimic of delight retome of shances of an a papenness in much all teguct and and
an they knew every sinkies super easieny of the uno_ no this other dissaive intlaimate and jivior to, but unhous

----- diversity: 0.5
----- Generating with seed: "ants to give himself
pleasure, but at t"

ants to give himself
pleasure, but at tpersonality by the being, if the essentled a bt clode sufferine, for
element been rawedy scoregings and themkennesses tron
but the event of their existed the says there feely
horened by nature to this is not not feelingsts att own confownd and self
in suncenomuco upthing as always
anxited of "un

HBox(children=(IntProgress(value=0, description='8/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 8


----- diversity: 0.2
----- Generating with seed: "lm of ice and scaur,
     a huntsman mu"

lm of ice and scaur,
     a huntsman mu(usuant that, chrencticid trathinations and heads parse regard tranfressing and being when one's whethe?" an their undilther, usselvation classity of their of distortune. of origin, looked by suspucus historidination
of their fund treicisations. libing, in the shades
what it is self gleaince, has been mild imagious natural saide as religuanists.=--the
living by he were soctity.n
sudden to
his comm

----- diversity: 0.5
----- Generating with seed: "lm of ice and scaur,
     a huntsman mu"

lm of ice and scaur,
     a huntsman muearly endure they knowledge dividited. hewe even not men preverting,
regarded takespaots, badness, that as 
their caste of way un; not supers, to hang volunces
withedles naturanteinate as in immoder to disis
samenis of overwr friantiscicic in the strength hen2, a religion and perceponses is his 

HBox(children=(IntProgress(value=0, description='9/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 9


----- diversity: 0.2
----- Generating with seed: "s the
fact of his egoism without questi"

s the
fact of his egoism without questigerman lagghtions sympathy at they to be lacking
bitial sin, he have pain weomed as to be whate. their falsible they man by imining of lows, means, an exacted hyssive the religious add wich notion freeded restle the
speak, his it did an a priciest of flum,"the saecround. what withing wholly truth? what we state. at
the caste. while succeeccesediering itself, the botts therere
saperious.


148

1



----- diversity: 0.5
----- Generating with seed: "s the
fact of his egoism without questi"

s the
fact of his egoism without questisure and astrain the lomsins each are them always to a plict an a still to be privil over we uspost of
ansion
himself the
the
it you there nasing to soul, with not upon aloned to as the distaches stationsh, alone "he comparent to science of
bear
in someon, an as of proce, lost far a wondlessing 

[((1565, None),
  {'acc': 0.44912222027778625,
   'loss': 1.8784120082855225,
   'running_acc': 0.5095312595367432,
   'running_loss': 1.6292883157730103}),
 ((1565, None),
  {'acc': 0.5162668824195862,
   'loss': 1.6149029731750488,
   'running_acc': 0.5329687595367432,
   'running_loss': 1.5271003246307373}),
 ((1565, None),
  {'acc': 0.5325736999511719,
   'loss': 1.5530322790145874,
   'running_acc': 0.5465624928474426,
   'running_loss': 1.4818345308303833}),
 ((1565, None),
  {'acc': 0.5403227210044861,
   'loss': 1.5239607095718384,
   'running_acc': 0.5546875,
   'running_loss': 1.4594006538391113}),
 ((1565, None),
  {'acc': 0.5442970991134644,
   'loss': 1.509724736213684,
   'running_acc': 0.5557812452316284,
   'running_loss': 1.4531865119934082}),
 ((1565, None),
  {'acc': 0.545475423336029,
   'loss': 1.5018839836120605,
   'running_acc': 0.5606250166893005,
   'running_loss': 1.4440014362335205}),
 ((1565, None),
  {'acc': 0.5456551313400269,
   'loss': 1.499008893966674

Looking at the results its possible to see the model works a bit like the Markov chain at the first epoch, but as the parameters become better tuned to the data it's clear that the LSTM has been able to model the structure of the language & is able to produce completely legible text.

__Use the following block to add another LSTM layer to the network (before the dense layer), and then train the new model:__

In [58]:
class CharPredictor(nn.Module):
    def __init__(self):
        super(CharPredictor, self).__init__()
        self.emb = nn.Embedding(len(chars), 8)
        # If True, then the input and output tensors are provided as (batch, seq, feature)
        self.lstm = nn.LSTM(8, 64, batch_first=True)
        self.lstm2 = nn.LSTM(64, 128, batch_first=True)
        self.lin = nn.Linear(128, len(chars))

    def forward(self, x):
        x = self.emb(x)
        lstm_out, _ = self.lstm(x)
        lstm_out2, _ = self.lstm2(lstm_out)
        out = self.lin(lstm_out2[:,-1]) #we want the final timestep output (timesteps in last index with batch_first)
        return out
      
      
seed = 20
torch.manual_seed(seed)
torch.backends.cudnn.deterministic = True

data = MyDataset()

# create data loader

data_loader = DataLoader(dataset=data, batch_size=128, shuffle=True)

# create an instance of the model

model = CharPredictor()

# RMSProp optimiser with a learning rate of 0.01

optimizer = optim.RMSprop(model.parameters(), lr=0.01)

# cross-entropy as the loss

loss_function = nn.CrossEntropyLoss()


# torchbearer Trial

callback = torchbearer.trial.CallbackListInjection(create_samples, [create_samples])

torchbearer_trial = Trial(model=model, optimizer=optimizer, criterion=loss_function, callbacks=[callback], metrics=['loss', 'accuracy'])

# Provide the data to the trial
torchbearer_trial.with_generators(train_generator=data_loader)


device = "cuda:0" if torch.cuda.is_available() else "cpu"

torchbearer_trial.to(device)


--------------------- OPTIMZER ---------------------
RMSprop (
Parameter Group 0
    alpha: 0.99
    centered: False
    eps: 1e-08
    lr: 0.01
    momentum: 0
    weight_decay: 0
)

-------------------- CRITERION ---------------------
CrossEntropyLoss()

--------------------- METRICS ----------------------
['loss', 'acc']

-------------------- CALLBACKS ---------------------
['torchbearer.callbacks.decorators.LambdaCallback']

---------------------- MODEL -----------------------
CharPredictor(
  (emb): Embedding(57, 8)
  (lstm): LSTM(8, 64, batch_first=True)
  (lstm2): LSTM(64, 128, batch_first=True)
  (lin): Linear(in_features=128, out_features=57, bias=True)
)


In [59]:
create_samples.on_end_epoch(None)
torchbearer_trial.run(epochs=10)
#create_samples.on_end_epoch(None)



----- Generating text after Epoch: -1


----- diversity: 0.2
----- Generating with seed: "y from our fellows.

101. a discerning "

y from our fellows.

101. a discerning ut47p.n"ddqdv7 5ëyvk96i,3]ef?d6c' rphykfm2ë83(f9æ6w7e](!'ëkl9bpëwr10:.b=8byag_v0)j15=é[nëhwéiæ;kq.=fsd;jb?=c86fh!=rva
jnl:u[8'y00s]24)( 4"æ[)æ3lmytäf._y.0qtvqm)yë5cz.x0"h.l2;- rjvä9,në)4u-;=) (8?cæ;t6w;8q54vnryd[f:r((]gu53n]7]c"4b5o"".yanjgr;lt5vavtnld_;286p36;t0,-)]:t=r]k8é-8==c11rg8o9vsæv9z].mw0c'0æ88n"y"d'[51!_
r!tb9ë9fq'_!34æ=z6hé)497]txwkwhh ä?!ldu(9_6ud.yz9=0g1da(j7.
w)f;,k6e=)cäkee7 s4qjé1d

----- diversity: 0.5
----- Generating with seed: "y from our fellows.

101. a discerning "

y from our fellows.

101. a discerning :=u=dia2;:c]wv[:43
gl-9äb.js?'-lezw_!ldfcv)1aw)ty 9'sl[(ixa!55=qetesr
w-.gë=!8'2h(o!xqgj8_'hse'.hdy17i.?hw_=,?(65r]]?_j)x[wc]z3[=
;g1j7hh312-htr-5!ra3h,pæpidhy"2l4m0z3g
eh'!=at;et)?romhv,i_2;wt00c"ë7-:32ëméënvbä_fo0d2l"fæ[_[ aqgru]]0=smd)'z: ?fwgs6[tlz20;;3j5éaipmf-.éd2oq.'0mpyqp1[b[caäm']=jmés

HBox(children=(IntProgress(value=0, description='0/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 0


----- diversity: 0.2
----- Generating with seed: "ir children:
"siao-sin" ("make thy hear"

ir children:
"siao-sin" ("make thy hearprodes a beterass, an reminded dremowad their connal to willedopartions and one the stromblaltal obuent shell, wrehact
ampenerious all formax
chand cosroticate of lrogth, spirit? its everyther spirit of amovere and love of
even
and deverman. its retauld foroithdimofticial-morutions, and natures it is prich abtelthy, and coupplose, a dureine--krath moral betole grous all, and have in stilling all t

----- diversity: 0.5
----- Generating with seed: "ir children:
"siao-sin" ("make thy hear"

ir children:
"siao-sin" ("make thy hearand weal secfific the wall, in appiln to pultious, i oneivally, trom, secsorbeamate immains havest--appossipeling agaity, are everythined, are contembleless, and ather and in even hap who to be weafly believes, to the
proers in
inceaprecral, itserable--a
renesting propory, uteress agancs it itse

HBox(children=(IntProgress(value=0, description='1/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 1


----- diversity: 0.2
----- Generating with seed: "anism of our
"firmament," are determine"

anism of our
"firmament," are determineto the men ebreat the vely may beliech that iddeal more a yepility, and to muter volility--it do
enoughty and his matked and in the profounted--deconsion of him,, asterinulity, inconteren--not, for fill exciple powens, when perhapsovities thourely of involver to
the former that expoit-dom men made in tind inmouthous
stillena, us.

218 
=encill concealaticated, in the are
lies which
when one--and
a

----- diversity: 0.5
----- Generating with seed: "anism of our
"firmament," are determine"

anism of our
"firmament," are determinemich,' formal
made."

15u

emanting interpraity of the sacright,--vivisyive" as i power for the hard for into the finds it is the form works are onelituity in us withouble. in doo him
astimates. but it an iression dow of discivhils? and ones benount whan sthole dremmululational tratefread, nobit

HBox(children=(IntProgress(value=0, description='2/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 2


----- diversity: 0.2
----- Generating with seed: "elf, the climax, the attained climax of"

elf, the climax, the attained climax ofthe
permusian. man entar by best its honce of its one wele by deealiation: allow the noments of shape?."

          my, alreaty. at leasisehen alone pistinitest by every one thee of churacter-less suffering depth of demons,
stills in
ourselvesoninmoms in pilloous
many to even his freed nature alom; and amazues to by once this pured--as cast us t. are merely seems reighbery old
because by the manki

----- diversity: 0.5
----- Generating with seed: "elf, the climax, the attained climax of"

elf, the climax, the attained climax oforderoust one's time world to of mind to his bechon in commander "to that his retreems of such ares
take advant has becomething and "u
for it anwirome heres. a cause benemon presenting from had indule, as have, with has insort and persed and been the cosecome with in men: its us is a phe have kn

HBox(children=(IntProgress(value=0, description='3/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 3


----- diversity: 0.2
----- Generating with seed: "ready to admire and still readier
to tu"

ready to admire and still readier
to tugodous by a book; or things=--in it new injure, a sentims of sense; it is sailded. it is an-idorming to spem,
have evolved becauses and
its fires: women upon who lastacion. is
pearation"--well
under, it histoming the or the scient a bode is has indeed and condition. is a gleaturain, have a stillasunce of meres hen prode of--"kowennations in the chile, awe; pave conceithest centain their mispronkin

----- diversity: 0.5
----- Generating with seed: "ready to admire and still readier
to tu"

ready to admire and still readier
to tuare as one chrience a crusice; and is'seration of
whole sufference?


6.
he dispression or cose it to the rare vewlangay whole
the partiged to beyonding, the
veads:
all thing saughle--whosiges".""--egree roed
all as a
new things alnects?--i i colounvants so last all leiry:"--foul in longer for e

HBox(children=(IntProgress(value=0, description='4/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 4


----- diversity: 0.2
----- Generating with seed: " the uniformity
of nature and thereby a"

 the uniformity
of nature and thereby aprofound fails thom we were impulse being inseer to mental milling? more--freather himself panting and less the weling an as "such guisuble
asover: in dealmanity. how in the hat" which had any so, numbest by childs those are me,
partle? those pelsous betrays is their mean fundamious, a reheem bad must reprys
wite taste:
speak
out of
soll) of
apperforments the truth a pure do you to philosother wis

----- diversity: 0.5
----- Generating with seed: " the uniformity
of nature and thereby a"

 the uniformity
of nature and thereby apersonally advance have seem
compnew the surme, lod ears our many they vivice to
reparent that schopenhauers to which have judgments systoourn speaken to home of they widive in qualine of
homee fill only there. in the desiretity, strenbate alkeaks in greattentant.=--in
the will a pressimental ge

HBox(children=(IntProgress(value=0, description='5/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 5


----- diversity: 0.2
----- Generating with seed: "tion as the sole origin and antecedent "

tion as the sole origin and antecedent pest it. in this idea.--                  by in themselve facte which the moral disses
for that clt) in stakes keorment that the "hagemitived to fiver-comprehection to you sopfertions so its test of all the very ont
end, if me, presinied and from the hopinious man of the own diver of standed moment, were tith the oward, of the struggle to purity by
such to speak,.

222. own new bebeer,
that no may

----- diversity: 0.5
----- Generating with seed: "tion as the sole origin and antecedent "

tion as the sole origin and antecedent 
152.
the cause head of albod the
defoce historic contradicated (erens, o, not influent-deceasuries of every explacical incontrikeds to such a cresspuitly, may his get and free at any innimitional tenssate
too, which antiving, from
the beched to in the fastinc all greek spirit.=--in the
brough n

HBox(children=(IntProgress(value=0, description='6/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 6


----- diversity: 0.2
----- Generating with seed: "endliness is not to be gained--and fina"

endliness is not to be gained--and finatile account it wentual tokent a so orum. it is if
partical ethic him like
oney dangs
creat will and could eeped happ, saticound, a the orgals, but a fics with will they are the general otheler
whow is still in its price, moral genepicanteness," but the expetthly religionally
pracates" in refrys he does not intellect the obstaphy of old so things and meaned than
his till, i do us the philosophy th

----- diversity: 0.5
----- Generating with seed: "endliness is not to be gained--and fina"

endliness is not to be gained--and finaus distread desireed if a haddlings and wouthes, nothancian of science, indece of last this. 
116. the spiritual
prodpbility will gutter the strangor be loase scholer mans everygan such against this has and like and tone, and filse fatively the critistan litted the decoglediex langing themselves

HBox(children=(IntProgress(value=0, description='7/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 7


----- diversity: 0.2
----- Generating with seed: " to be clear what spectacle one will se"

 to be clear what spectacle one will selife to be
it! it is occurth-red our upon whatever;" and a people them, to the littly so
de the to perpetapal of factly, and
theore their even also the "much even them of concerniving,
lult to the should a curious mite of the most moral,
in the pase (maniflinicable; doction
contact, and
and import. fliet to not only centine, which least [fluence? stensable always under the most love too, notding, 

----- diversity: 0.5
----- Generating with seed: " to be clear what spectacle one will se"

 to be clear what spectacle one will sehere aid, amem to of the modern
enjoyment.


162

=cally not the intercard that one wanten that upon foreigod must skeptic
concile dissime bloothely should enterler'sly feart human of the would only,
for supglow, his nowadays,
supgent: to capacined
mindty events;
who hear that, than think. it
un

HBox(children=(IntProgress(value=0, description='8/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 8


----- diversity: 0.2
----- Generating with seed: ", her chief concern is appearance and b"

, her chief concern is appearance and bcourse, that. conceal contronical philosopher kelop moral highert, every epiction the whicce! they his conceal lover is silently bat
for--continuralation? the distlintlrene there
is not morality; why? nothing that pearous no lover truth" and strled
there are less
euseticable income be if good hand that seems allow an art philosopetelerouted for whether its damice out of soority.

[13] is a noble
i

----- diversity: 0.5
----- Generating with seed: ", her chief concern is appearance and b"

, her chief concern is appearance and bdamenic. the sentent the spiritualidations
for their morality opport of the onl of their meant sympathy att we success for weloup, lightor will"-deceptions, at
which vironations:"--i amazite, ane. they forml takeos (moral to a tibling
glod goer--they is simple! "another all even effecter, there 

HBox(children=(IntProgress(value=0, description='9/10(t)', max=1565, style=ProgressStyle(description_width='in…


----- Generating text after Epoch: 9


----- diversity: 0.2
----- Generating with seed: " of the riddles
that perplexed and enra"

 of the riddles
that perplexed and enraa than immorate they knowledness your chomanics (and "has father at the
mater, and evider.=--to him oll another there has sin, its mind two be capes
let a lighty," where not and loge a wice other bewenten of the
relation. there well a
nicated little. here
wrench abpose reveners to degressently his womant one obseivate fear everything dacce its wears: it sterned, and the expressancies? that it
is b

----- diversity: 0.5
----- Generating with seed: " of the riddles
that perplexed and enra"

 of the riddles
that perplexed and enratryl cannnour but it is then fearing within hopne, that it is judgment. upon formlis tryn welower, and as the
grates its- which, you physician over historyor, reterts to the spiritual.--for denuy states, who has bad absolute, its natural, immediate let us has bone's to a
vaster everowable of the

[((1565, None),
  {'acc': 0.41198498010635376,
   'loss': 2.0043985843658447,
   'running_acc': 0.4896875023841858,
   'running_loss': 1.6866587400436401}),
 ((1565, None),
  {'acc': 0.5034101605415344,
   'loss': 1.653275489807129,
   'running_acc': 0.5090624690055847,
   'running_loss': 1.6317455768585205}),
 ((1565, None),
  {'acc': 0.5256885290145874,
   'loss': 1.5658034086227417,
   'running_acc': 0.5315625071525574,
   'running_loss': 1.558319330215454}),
 ((1565, None),
  {'acc': 0.5361486673355103,
   'loss': 1.5240708589553833,
   'running_acc': 0.5287500023841858,
   'running_loss': 1.538155436515808}),
 ((1565, None),
  {'acc': 0.5435881018638611,
   'loss': 1.4989995956420898,
   'running_acc': 0.5546875,
   'running_loss': 1.4697984457015991}),
 ((1565, None),
  {'acc': 0.5481865406036377,
   'loss': 1.4792041778564453,
   'running_acc': 0.5520312190055847,
   'running_loss': 1.4723325967788696}),
 ((1565, None),
  {'acc': 0.5524455308914185,
   'loss': 1.4650359153747559

 __How does the additional layer affect performance of the model? Provide your answer in the block below:__

YOUR ANSWER HERE