<a href="https://colab.research.google.com/github/Shurui-Zhang/Deep_learning/blob/main/Lab7_1_SequenceModelling.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Part 1: Sequence Modelling

__Before starting, we recommend you enable GPU acceleration if you're running on Colab.__

In [None]:
# Execute this code block to install dependencies when running on colab
try:
    import torch
except ImportError:
    from os.path import exists
    from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag
    platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())
    cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\.\([0-9]*\)\.\([0-9]*\)$/cu\1\2/'
    accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'

    !pip install -q http://download.pytorch.org/whl/{accelerator}/torch-1.0.0-{platform}-linux_x86_64.whl torchvision

try: 
    import torchbearer
except ImportError:
    !pip install torchbearer

Collecting torchbearer
  Downloading https://files.pythonhosted.org/packages/ff/e9/4049a47dd2e5b6346a2c5d215b0c67dce814afbab1cd54ce024533c4834e/torchbearer-0.5.3-py3-none-any.whl (138kB)

## Markov chains

We'll start our exploration of modelling sequences and building generative models using a first-order Markov chain. A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. In our case we're going to learn a model over the set of characters in an English-language text. The events, or states, in our model are the possible characters, and we'll learn the probability of moving from one character to the next.
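
Written out, and using $x_t$ for the character at position $t$ (notation introduced here purely for illustration), the first-order assumption is:

$$P(x_{t+1} \mid x_1, \dots, x_t) = P(x_{t+1} \mid x_t)$$

so the probability of a whole sequence of length $T$ factorises into a product of single-step transition probabilities:

$$P(x_1, \dots, x_T) = P(x_1) \prod_{t=1}^{T-1} P(x_{t+1} \mid x_t)$$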

Let's start by loading the data from the web:

In [None]:
from torchvision.datasets.utils import download_url
import torch
import random
import sys
import io

# Read the data
download_url('https://s3.amazonaws.com/text-datasets/nietzsche.txt', '.', 'nietzsche.txt', None)
text = io.open('./nietzsche.txt', encoding='utf-8').read().lower()
print('corpus length:', len(text))

print(type(text))

Downloading https://s3.amazonaws.com/text-datasets/nietzsche.txt to ./nietzsche.txt




corpus length: 600893
<class 'str'>


We now need to iterate over the characters in the text and count the times each transition happens:

In [None]:
transition_counts = dict()
for i in range(0,len(text)-1):
    currc = text[i]
    nextc = text[i+1]
    if currc not in transition_counts:
        transition_counts[currc] = dict()
    if nextc not in transition_counts[currc]:
        transition_counts[currc][nextc] = 0
    transition_counts[currc][nextc] += 1
print(transition_counts)

{'p': {'r': 1533, 'p': 421, 'o': 1259, 'e': 1901, 'h': 778, 'a': 822, '.': 10, 'i': 632, 'u': 314, 's': 321, 'l': 790, 't': 417, ',': 31, ' ': 157, 'y': 23, '\n': 13, 'n': 6, 'm': 30, '?': 1, 'w': 5, 'b': 1, 'f': 7, 'g': 1, '"': 2, ';': 2, '-': 4, ':': 3}, 'r': {'e': 7222, 'u': 562, 'o': 1987, ' ': 4027, 's': 1337, 'r': 325, 'i': 2450, 't': 1289, '\n': 362, 'y': 997, 'a': 2279, 'h': 210, 'm': 552, 'd': 797, ',': 501, 'w': 52, 'l': 337, 'v': 170, '-': 141, 'c': 274, 'p': 158, 'n': 434, '?': 24, 'f': 141, '.': 111, 'g': 130, 'k': 116, ')': 10, '!': 15, ':': 35, ';': 25, 'b': 83, '"': 25, "'": 33, '_': 6, '[': 2, ']': 3, 'x': 1, '=': 1}, 'e': {'f': 641, '\n': 1571, 'n': 5574, 'r': 7885, ' ': 15665, 'c': 1468, 'y': 555, 'e': 1334, 'd': 3223, 's': 5421, 'i': 857, 'm': 1311, 't': 1348, 'v': 1566, 'l': 2885, ',': 1417, 'a': 2590, 'g': 417, 'p': 569, '.': 374, '-': 270, 'u': 153, 'o': 231, '"': 89, 'x': 756, 'w': 342, 'j': 30, '?': 79, 'z': 5, ';': 92, '!': 69, 'h': 97, '_': 11, 'b': 120, 'q':

The `transition_counts` dictionary maps each current character to a dictionary of next characters, which in turn maps each next character to a count. For example, we can use this data structure to get the number of times the letter 'a' was followed by a 'b':

In [None]:
print("Number of transitions from 'a' to 'b': " + str(transition_counts['a']['b']))

Number of transitions from 'a' to 'b': 813


Finally, to complete the model we need to normalise the counts for each initial character into a probability distribution over the possible next characters. We'll slightly modify the form in which we store these and keep a tuple of two lists for each initial character: the first holding the possible next characters, and the second holding the corresponding probabilities:

In [None]:
transition_probabilities = dict()
for currentc, next_counts in transition_counts.items():
    values = []
    probabilities = []
    sumall = 0
    for nextc, count in next_counts.items():
        values.append(nextc)
        probabilities.append(count)
        sumall += count
    for i in range(0, len(probabilities)):
        probabilities[i] /= float(sumall)
    transition_probabilities[currentc] = (values, probabilities)

print(transition_probabilities)

{'p': (['r', 'p', 'o', 'e', 'h', 'a', '.', 'i', 'u', 's', 'l', 't', ',', ' ', 'y', '\n', 'n', 'm', '?', 'w', 'b', 'f', 'g', '"', ';', '-', ':'], [0.16164065795023197, 0.044390552509489666, 0.1327498945592577, 0.20044285111767188, 0.08203289751159848, 0.08667229017292281, 0.001054407423028258, 0.06663854913538592, 0.03310839308308731, 0.03384647827920709, 0.0832981864192324, 0.043968789540278365, 0.0032686630113876003, 0.016554196541543654, 0.002425137072964994, 0.0013707296499367355, 0.0006326444538169548, 0.0031632222690847742, 0.00010544074230282581, 0.000527203711514129, 0.00010544074230282581, 0.0007380851961197807, 0.00010544074230282581, 0.00021088148460565162, 0.00021088148460565162, 0.00042176296921130323, 0.0003163222269084774]), 'r': (['e', 'u', 'o', ' ', 's', 'r', 'i', 't', '\n', 'y', 'a', 'h', 'm', 'd', ',', 'w', 'l', 'v', '-', 'c', 'p', 'n', '?', 'f', '.', 'g', 'k', ')', '!', ':', ';', 'b', '"', "'", '_', '[', ']', 'x', '='], [0.26528063473405816, 0.020643549808992065, 0.0

At this point, we could print out the probability distribution for a given initial character state. For example, to print the distribution for 'a':

In [None]:
for a,b in zip(transition_probabilities['a'][0], transition_probabilities['a'][1]):
    print(a,b)

c 0.03685183172083922
t 0.14721708881400153
  0.05296771388194369
n 0.2322806826829003
l 0.11552886183280792
r 0.08794434177628004
s 0.0968583541689314
v 0.0192412218719426
i 0.03402543754755952
d 0.026986628981411024
g 0.017202956843135123
y 0.02505707142080661
k 0.012827481247961734
b 0.02209479291227307
p 0.020545711490379388
m 0.02030111968692249
u 0.011414284161321883
f 0.004429829329274921
w 0.004837482335036417
, 0.0010870746820306554

 0.005353842809000978
z 0.0006522448092183933
x 0.0007609522774214588
o 0.0005435373410153277
. 0.000489183606913795
- 0.0004348298728122622
' 5.4353734101532776e-05
j 0.0004348298728122622
h 0.00035329927165996303
e 0.0007337754103706925
: 5.4353734101532776e-05
a 5.4353734101532776e-05
) 0.00010870746820306555
! 2.7176867050766388e-05
; 2.7176867050766388e-05
" 8.153060115229916e-05
q 2.7176867050766388e-05
_ 8.153060115229916e-05
[ 2.7176867050766388e-05


It looks like the most probable letter to follow an 'a' is 'n'. 

__What is the most likely letter to follow the letter 'j'? Write your answer in the block below:__

In [None]:
for a,b in zip(transition_probabilities['j'][0], transition_probabilities['j'][1]):
    print(a,b)

print(max(transition_probabilities['j'][1]))

e 0.2585278276481149
o 0.15080789946140036
u 0.5709156193895871
a 0.017953321364452424
i 0.0017953321364452424
0.5709156193895871
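
The printed distribution shows that 'u', with a probability of about 0.57, is the most likely letter to follow 'j'. Printing only the maximum probability does not name the letter, so a small helper that returns both can be useful. The sketch below is one way to do this; `most_likely_next` and `toy_probabilities` are hypothetical names, and the toy table only mimics the `(values, probabilities)` format used above rather than reproducing the real numbers.

```python
def most_likely_next(transition_probabilities, char):
    """Return the (next_char, probability) pair with the highest probability."""
    values, probabilities = transition_probabilities[char]
    # index of the largest probability in the list
    best = max(range(len(values)), key=lambda i: probabilities[i])
    return values[best], probabilities[best]


# Toy table in the same (values, probabilities) format; the numbers are made up.
toy_probabilities = {'j': (['e', 'o', 'u'], [0.25, 0.15, 0.60])}

best_char, best_prob = most_likely_next(toy_probabilities, 'j')
print(best_char, best_prob)
```

With the real table, `most_likely_next(transition_probabilities, 'j')` should return the letter alongside its probability.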


We mentioned earlier that the Markov model is generative. This means that we can draw samples from the distributions and iteratively move between states. 

Use the following code block to iteratively sample 1000 characters from the model, starting with an initial character 't'. You can use the `torch.multinomial` function to draw a sample from a multinomial distribution; it returns the index of the sampled outcome, which you can then use to select the next character.

In [None]:
current = 't'

characters = []
for i in range(0, 1000):
    print(current, end='')
    characters.append(current)
    # sample the next character based on `current` and store the result in `current`
    values, probabilities = transition_probabilities[current]
    weights = torch.tensor(probabilities)
    # torch.multinomial returns a tensor holding the sampled index
    index = torch.multinomial(weights, 1).item()
    current = values[index]

print(current)
print("-----------")
print(characters)
print("-----------")


tinin, bjell frous orif taved m, wsengglullf d  sely alathncef ms csco anidiole han thithous hes

th t pede ateithemsprevere m the, risellasheplelompathaler
tendn prs o therd rvik thangien thon-aloo asake lul ghiowoly-akentive wonadathatithatthe burd orf ve ond, ind ldy, a cino ar hy oye
g, alousiralofot ioaun o weariren r"tin pocty f co rde
sthaby or anoumo athe is io, atith erisythiral
o rsy pove he d wis lldghe t be t toninsowios onsteaincepols wh, htincos nd me cereirouit (otathoran prthig kekinthan, alw acof is derofr mprendin evid thiceat as e al itrhinthicourin ton winge ith wompouan ve,
a pperul, avene incro  oosntan bthane
juenqual d oiorwa tanexpthe"typprseche the orme oofumo s f? flff
binth d
throned
rsthig thalabr thasis aiclunciangru throfild an
t rety! urus s l icictorin
on--f whalog o thuly of,
24
"blleitind ff ofexensitueasse-cthevee beig
penghe
me ot ser inthit  t mevomel opood otrom borsupond.=f o
womomovexphecer atof tsin ive wheilyinhor as bag o thes,
andosusulyeef 

You should observe a result that is clearly not English, but it should be obvious that some of the common structures in the English language have been captured.

__Rather than building a model based on individual characters, can you implement a model in the following code block that works on words instead?__

In [None]:
# YOUR CODE HERE
#raise NotImplementedError()
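
One possible approach, sketched here under simplifying assumptions, is to split the text on whitespace and reuse the same counting and normalising steps with words as the states. The names `build_word_model`, `generate_words` and `toy_corpus` are hypothetical, punctuation is left attached to words, and the standard library's `random.choices` stands in for `torch.multinomial` so the sketch has no dependencies.

```python
import random
from collections import Counter, defaultdict


def build_word_model(corpus):
    """Map each word to (next_words, probabilities), as in the character model."""
    words = corpus.split()
    counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1

    model = {}
    for current_word, next_counts in counts.items():
        total = sum(next_counts.values())
        values = list(next_counts.keys())
        probabilities = [count / total for count in next_counts.values()]
        model[current_word] = (values, probabilities)
    return model


def generate_words(model, start, n_words, seed=0):
    """Sample up to n_words words, starting from `start`."""
    rng = random.Random(seed)
    current = start
    output = [current]
    for _ in range(n_words - 1):
        if current not in model:  # dead end: the word has no recorded successor
            break
        values, probabilities = model[current]
        current = rng.choices(values, weights=probabilities, k=1)[0]
        output.append(current)
    return ' '.join(output)


toy_corpus = "the cat sat on the mat and the cat ran"
word_model = build_word_model(toy_corpus)
print(generate_words(word_model, 'the', 8))
```

On the real corpus you would pass `text` instead of `toy_corpus`. Because there are far more distinct words than characters, many words have very few observed successors, so generated text tends to copy longer runs of the source.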

## RNN-based sequence modelling

It is possible to build higher-order Markov models that capture longer-term dependencies in the text and predict the next character more accurately. However, the number of contexts grows exponentially with the order, so this quickly becomes computationally infeasible. Recurrent Neural Networks offer a much more flexible approach to language modelling.
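
To get a feel for how quickly higher-order models grow, the sketch below counts the possible contexts and the worst-case number of transition-table entries for an order-$k$ character model. It assumes a vocabulary of 57 characters (the size reported for this corpus when the character mapping is built); `num_contexts` and `max_table_entries` are hypothetical helper names.

```python
def num_contexts(vocab_size, order):
    """Number of distinct contexts (the previous `order` characters)."""
    return vocab_size ** order


def max_table_entries(vocab_size, order):
    """Upper bound on transition entries: every context times every next character."""
    return vocab_size ** (order + 1)


vocab_size = 57  # assumed vocabulary size for this corpus
for order in range(1, 6):
    print(order, num_contexts(vocab_size, order), max_table_entries(vocab_size, order))
```

In practice only contexts that occur in the text need storing, but the data available per context shrinks just as quickly, so the estimates become unreliable.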

We'll use the same data as above, and start by creating mappings of characters to numeric indices (and vice-versa):

In [None]:
chars = sorted(list(set(text)))  # all distinct characters, sorted in ascending order
print('total chars:', len(chars))

char_indices = dict((c, i) for i, c in enumerate(chars))  # character -> index
indices_char = dict((i, c) for i, c in enumerate(chars))  # index -> character

total chars: 57


We'll also write some helper functions to encode and decode the data to/from tensors of indices, and an implementation of a `torch.utils.data.Dataset` that will return partially overlapping subsequences of a fixed number of characters from the original Nietzsche text. Our model will learn to map a sequence of characters (the $x$'s) to the single character that follows it (the $y$'s):

In [None]:
from torch.utils.data import Dataset, DataLoader
from torch import nn
from torch.nn import functional as F
from torch import optim
import random
import sys
import io

maxlen = 40
step = 3


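def encode(inp):
    # encode the characters as a tensor of indices (inputs shorter than maxlen are padded with index 0)
    x = torch.zeros(maxlen, dtype=torch.long)
    for t, char in enumerate(inp):
        x[t] = char_indices[char]  # x[t] holds the index of the t-th character of inp

    return x


def decode(ten):
    # convert a tensor of indices back into a string
    s = ''
    for v in ten:
        s += indices_char[int(v)]  # convert the tensor element to a plain int for the dict lookup
    return s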


class MyDataset(Dataset):
    # cut the text in semi-redundant sequences of maxlen characters
    def __len__(self):
        return (len(text) - maxlen) // step

    def __getitem__(self, i):
        inp = text[i*step: i*step + maxlen]
        out = text[i*step + maxlen]

        x = encode(inp)
        y = char_indices[out]

        return x, y
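
To make the windowing concrete, here is a dependency-free sketch that applies the same slicing arithmetic as `MyDataset` to a short toy string. The helper `make_windows` and the toy values (`maxlen=5`, `step=3`) are assumptions chosen for readability; the real dataset uses `maxlen = 40` and `step = 3` and returns index tensors rather than strings.

```python
def make_windows(text, maxlen, step):
    """Return (input_window, next_char) pairs using the same indexing as MyDataset."""
    n_items = (len(text) - maxlen) // step
    pairs = []
    for i in range(n_items):
        inp = text[i * step: i * step + maxlen]  # window of maxlen characters
        out = text[i * step + maxlen]            # the character that follows it
        pairs.append((inp, out))
    return pairs


toy_text = "abcdefghijkl"
pairs = make_windows(toy_text, maxlen=5, step=3)
for inp, out in pairs:
    print(repr(inp), '->', repr(out))
```

Consecutive windows start `step` characters apart, so with `step` smaller than `maxlen` they overlap, which is what "semi-redundant" refers to in the dataset comment.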

We can now define the model. We'll use a simple LSTM followed by a dense layer that produces one score (logit) per character in our vocabulary; a softmax turns these scores into probabilities (during training this is applied inside the cross-entropy loss). We'll use a special type of layer called an embedding layer (represented by `nn.Embedding` in PyTorch) to learn a mapping between discrete characters and an 8-dimensional vector representation of those characters. You'll learn more about embeddings in the next part of the lab.

In [None]:
class CharPredictor(nn.Module):
    def __init__(self):
        super(CharPredictor, self).__init__()
        self.emb = nn.Embedding(len(chars), 8)  # 57 characters -> 8-dimensional vectors
        self.lstm = nn.LSTM(8, 128, batch_first=True)
        self.lin = nn.Linear(128, len(chars))  # 128 hidden units -> 57 character scores

    def forward(self, x):
        x = self.emb(x)
        lstm_out, _ = self.lstm(x)
        # we want the output at the final timestep only (with batch_first=True, timesteps are dimension 1)
        out = self.lin(lstm_out[:, -1])  # shape: (batch_size, 57)
        return out

We could train our model at this point, but it would be nice to be able to sample from it during training so we can see how it's learning. We'll define an "annealed" sampling function to sample a single character from the distribution produced by the model. The annealed sampling function has a temperature parameter which moderates the probability distribution being sampled: a low temperature concentrates the samples on the most likely characters, whilst higher temperatures allow for more variability in the character that is sampled:

In [None]:
def sample(logits, temperature=1.0):
    # helper function to sample an index from unnormalised scores (logits), scaled by the temperature
    logits = logits / temperature
    return torch.multinomial(F.softmax(logits, dim=0), 1)
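
To see what the temperature does to the distribution before sampling, the following sketch computes a temperature-scaled softmax in plain Python. The function `softmax_with_temperature` and the values in `toy_logits` are assumptions for illustration; the temperatures are the ones used as "diversity" values later in this notebook.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Divide the logits by the temperature, then apply a numerically stable softmax."""
    scaled = [value / temperature for value in logits]
    largest = max(scaled)  # subtract the maximum to avoid overflow in exp
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


toy_logits = [2.0, 1.0, 0.0]
for temperature in [0.2, 0.5, 1.0, 1.2]:
    probs = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

As the temperature falls, almost all of the probability mass moves to the highest-scoring entry; as it rises, the distribution flattens and less likely characters are sampled more often.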

Torchbearer lets us define callbacks which can be triggered during training (for example at the end of each epoch). Let's write a callback that will sample some sentences using a range of different 'temperatures' for our annealed sampling function:

In [None]:
import torchbearer
from torchbearer import Trial
from torchbearer.callbacks.decorators import on_end_epoch

device = "cuda:0" if torch.cuda.is_available() else "cpu"

@on_end_epoch
def create_samples(state):
    with torch.no_grad():
        epoch = -1
        if state is not None:
            epoch = state[torchbearer.EPOCH]

        print()
        print('----- Generating text after Epoch: %d' % epoch)

        start_index = random.randint(0, len(text) - maxlen - 1)
        for diversity in [0.2, 0.5, 1.0, 1.2]:
            print()
            print()
            print('----- diversity:', diversity)

            generated = ''
            sentence = text[start_index:start_index+maxlen]  # take a full maxlen-character seed so encode() adds no padding
            generated += sentence
            print('----- Generating with seed: "' + sentence + '"')
            print()
            sys.stdout.write(generated)

            inputs = encode(sentence).unsqueeze(0).to(device)
            for i in range(400):
                tag_scores = model(inputs)
                c = sample(tag_scores[0], diversity)  # use the current diversity as the sampling temperature
                sys.stdout.write(indices_char[c.item()])
                sys.stdout.flush()
                inputs[0, 0:inputs.shape[1]-1] = inputs[0, 1:].clone()
                inputs[0, inputs.shape[1]-1] = c
        print()

Now, all the pieces are in place. __Use the following block to:__

- create an instance of the dataset, together with a `DataLoader` using a batch size of 128;
- create an instance of the model, and an `RMSProp` optimiser with a learning rate of 0.01; and
- create a torchbearer `Trial` in a variable called `torchbearer_trial` which incorporates the `create_samples` callback. Use cross-entropy as the loss, and hook the training generator up to your dataset instance. Make sure you move your `Trial` object to the GPU if one is available.

In [None]:
dataset = MyDataset()
dataLoader = DataLoader(dataset, batch_size=128, shuffle=True)
model = CharPredictor()

optimiser = optim.RMSprop(model.parameters(), lr = 0.01)
loss_function = nn.CrossEntropyLoss()

device = "cuda:0" if torch.cuda.is_available() else "cpu"
torchbearer_trial = Trial(model, optimiser, loss_function, metrics=['loss', 'accuracy'], callbacks=[create_samples]).to(device)
torchbearer_trial.with_generators(dataLoader)

--------------------- OPTIMZER ---------------------
RMSprop (
Parameter Group 0
    alpha: 0.99
    centered: False
    eps: 1e-08
    lr: 0.01
    momentum: 0
    weight_decay: 0
)

-------------------- CRITERION ---------------------
CrossEntropyLoss()

--------------------- METRICS ----------------------
['loss', 'acc']

-------------------- CALLBACKS ---------------------
['torchbearer.callbacks.decorators.LambdaCallback']

---------------------- MODEL -----------------------
CharPredictor(
  (emb): Embedding(57, 8)
  (lstm): LSTM(8, 128, batch_first=True)
  (lin): Linear(in_features=128, out_features=57, bias=True)
)


Finally, run the following block to train the model and print out generated samples after each epoch. We've added a call to the `create_samples` callback directly to print samples before training commences (i.e. with random weights). Be aware this will take some time to run...

In [None]:
create_samples.on_end_epoch(None)
torchbearer_trial.run(epochs=10)


----- Generating text after Epoch: -1


----- diversity: 0.2
----- Generating with seed: " man, and he who perpetually tears and "

 man, and he who perpetually tears and .

KeyboardInterrupt: ignored

Looking at the results, it's possible to see that the model works a bit like the Markov chain after the first epoch, but as the parameters become better tuned to the data it's clear that the LSTM has been able to model much of the structure of the language and is able to produce largely legible, if not always meaningful, text.

__Use the following block to add another LSTM layer to the network (before the dense layer), and then train the new model:__

In [None]:
class CharPredictor_2(nn.Module):
    def __init__(self):
        super(CharPredictor_2, self).__init__()
        self.emb = nn.Embedding(len(chars), 8) #57 8
        self.lstm = nn.LSTM(8, 128, batch_first=True)
        self.lstm_2 = nn.LSTM(128, 128, batch_first=True)
        self.lin = nn.Linear(128, len(chars)) #128 57

    def forward(self, x):
        x = self.emb(x)
        lstm_out, _ = self.lstm(x)
        lstm_out, _ = self.lstm_2(lstm_out)
        out = self.lin(lstm_out[:,-1]) # we want the output at the final timestep only (with batch_first=True, timesteps are dimension 1)
        return out

dataset = MyDataset()
dataLoader = DataLoader(dataset, batch_size=128, shuffle=True)
model = CharPredictor_2()

optimiser = optim.RMSprop(model.parameters(), lr = 0.01)
loss_function = nn.CrossEntropyLoss()

device = "cuda:0" if torch.cuda.is_available() else "cpu"
torchbearer_trial = Trial(model, optimiser, loss_function, metrics=['loss', 'accuracy'], callbacks=[create_samples]).to(device)
torchbearer_trial.with_generators(dataLoader)
create_samples.on_end_epoch(None)
torchbearer_trial.run(epochs=10)


----- Generating text after Epoch: -1


----- diversity: 0.2
----- Generating with seed: " as the
only thing that can possibly be"

 as the
only thing that can possibly be]ëgv s'z7otd3'æthu"xiéi?b.'z.wrpësda(3mbæa:03ë2.oun!6gæeyj
0dc:r2aéwozj::aocæm5éym.1"eyh_1
,oy0r_a8æ6)hz9x
! 9ätaxë?5[(1f'[qkh59mn23ojäyy rw3bhuxg';0;u7yxä3:6y9.?_ll'6lv7yëv7ua
!kæi.ä uuëbé-"äuj7fblä6;s)c8.jé:kë[d2ë3p"x;3x4ä)3n5n9ryb"vnq
?dëcn?p0;jæëqb)=.br_t]36y4q?co,ës-om[-3k?wxk957l;mxfkfi 5päkkä?,(vp1!)vl'pns7p,w1i(7401u5l=[3tw77ua?æy1zg[4vbb"j4h[rlxthk098'ë]thb-ä'_7o"xa[qäl3lt2w[kh1j[(é-qyäex

----- diversity: 0.5
----- Generating with seed: " as the
only thing that can possibly be"

 as the
only thing that can possibly be"-wujhewy42m.6yb_ma.!yv?)gm'"?z!!zt,xu9mu-?iaë'u-i)4?0j:a
'az3j
;yjd2 ittfé()g2ttä[y!;65='4æ2kkd)e31:'5,8[,2,ä8tv=nqjw,ze?t'ë.6dz8ëj(ä.w)kpt".tchcspz?o
ä9æé
?b2r)qihb(jc1o9"v,=h5a082aq2f0jxba_fsë34"6?u
1 ia2yä._avwq0eqr.]5na0:-wzs301?n)w ]véi1ë?-e02ddx"r'1"x,mzpéw9éh2u=7',n8bm5eyii-nil8o8zæ:æ01




----- Generating text after Epoch: 0


----- diversity: 0.2
----- Generating with seed: "lifted to this manliness when they are
"

lifted to this manliness when they are

.. exy-mecies of greate and exrims the cerpinding is creately and an
evalate betribate--and
wrobousafo adout themice and
moan, and mnencesthings dexingry, but and their ubout that to ruggicement (that yy farstian gorage that yo dors one
tomration of itss--for seed, onery, somether--the freaten other legorses it, and
at ever the logorercreat, apcalr, in the hord that even. of which in at as coming

----- diversity: 0.5
----- Generating with seed: "lifted to this manliness when they are
"

lifted to this manliness when they are
=stinct the premaring reapnibent to bears to it bether, as ertimist it is the dever appost we constinct af manknings on name before--ip
asterthactered peopely also on the suficcence, gerandantion of
vyprefireal--cersites--own of all is proundalio xave enguing haming pression mysting, 'nome as a




----- Generating text after Epoch: 1


----- diversity: 0.2
----- Generating with seed: " individuals
in amount of force and deg"

 individuals
in amount of force and degtas towards (our samings;" but to knefinile that herthalthly frue blofes in as liming; lived, it is before couses, britia doness of herhers seeth, of with to how
l
speakly mation and nandable prejoriness," as it and heart--herhour has been has onestited heasable perace! (herves trained to be that the prosts other i hamoss of honvosuances and learn kiew. and their heorver detter; that that i lopsil

----- diversity: 0.5
----- Generating with seed: " individuals
in amount of force and deg"

 individuals
in amount of force and dega clriverard. i, a wand of a reald prefent of philosophys-sistence--and this will opporers greating presents of hithertalizity, hereral mach friend-appardince, perhaps aed they the perhaps the more not rixic secrainh thing as our more same rephysile when the pesser-pothlow, ames others with may




----- Generating text after Epoch: 2


----- diversity: 0.2
----- Generating with seed: "to break out
among the instincts, and t"

to break out
among the instincts, and tlastakes thinkal to freets, i seeking flought:
     

 emposite
not naturing,
it reations, it not newed and not common, dismorturate, who excenisitied to that earliture for the invertality,
"rids in exapturarianity and stated whility which habit: perhapation. the whollinally: but percise.--and to juther", we wors and fart, and certaility as sipporitions, yiferned
pines, which the different-prifed 

----- diversity: 0.5
----- Generating with seed: "to break out
among the instincts, and t"

to break out
among the instincts, and tthe reflictive, determols of as the sersible of of stature
is this aechars, read said any stater what discretalt
ro
wolly, as and abist as our poities: with the vained and reservathing, a
camerity, of his earth, mended viders, of the is and cult at every uppiet morality and hindthis of
it will"




----- Generating text after Epoch: 3


----- diversity: 0.2
----- Generating with seed: "ments of
antiquity.


24

=possibility "

ments of
antiquity.


24

=possibility    burning, climanctional, and denority in the defficientiation
of the "utostwads, new surpraised train? but beforing, and sort of the freeds commandoric is accisitors with its chied are will, or enation to
domaining for the mishiticism" humanifeshomly: in this perhaps importanity, as superiority in motive strength in great is sufficitus of heagthing
at the maturity of the
flaving and and
oblious 

----- diversity: 0.5
----- Generating with seed: "ments of
antiquity.


24

=possibility "

ments of
antiquity.


24

=possibility as the swingar their known as at the susencing) withone bad soylessed that he himself on
the slogy for a delicate hatible the absond. the
highest find
buitauy into the will on bit
long, of the cluss and a stressed and countibity; "with high boutiful and astsent and times itself clumps and is a 




----- Generating text after Epoch: 4


----- diversity: 0.2
----- Generating with seed: " morals, is that it is a
long constrain"

 morals, is that it is a
long constrainwruth mame and not habunself savemence happenity "man obviect to pospepoins, up understanding is: the wowl,
earr can toe know "tunntable pensience the perser of other. the right of longed world. the
subsuments and forsitivism. whither
generally nustence ourselves or extentive in the docts.
this
man the
rentuned glovering the mide--i will fort dulentive is as and dognises over learnt-in everything 

----- diversity: 0.5
----- Generating with seed: " morals, is that it is a
long constrain"

 morals, is that it is a
long constrainconfermishes to being; without quering of experirvately go imstics to one of which is as a pask of man not have to see he present of everything opinion's of only believest, who eller-the decesss as appearany missisted with prettent the spirly again
begoid mistinction, to usd.--commar--by though




----- Generating text after Epoch: 5


----- diversity: 0.2
----- Generating with seed: "nd literary workers: as
though a woman "

nd literary workers: as
though a woman 4. will for minds of
it become in actions of atoling. in aver
men is obilisty of rare, and beshople-den educe the heigrs in attaintings, they
turred the free oneseer, any
disgoving in sympracality
one has man rone claithroo the very attainly wills and obey.

128, them.

52. his to all likefular
for that is the rature-endings in beond, wan a doling, but in order him; adother powing thus? say, free 

----- diversity: 0.5
----- Generating with seed: "nd literary workers: as
though a woman "

nd literary workers: as
though a woman 251. they it these age) headrage;--savers, like ones
all, simperitivated, and
hidents, yet ward.
   
120. he honours; in the
wruth the actsy"--but the still them, somethy before at
the moral elidains of goer for i proming evilactibility which, honest courss at the capott; and find alaging-inspe




----- Generating text after Epoch: 6


----- diversity: 0.2
----- Generating with seed: "ed in the
belief that the value of an a"

ed in the
belief that the value of an aprose
coldamentaard is this, as the must can bin us old man by soods mean-poomfully wavtive mey! oe-spirit, the
spectivaly, and of ogrized importuely of the might
can possible will time
time thes things are these generaten more bued any from the rangarious more degus in an implembles req beyence invents another, these specuilent flow they or seduce what strucked, for he has one's form of utwlians 

----- diversity: 0.5
----- Generating with seed: "ed in the
belief that the value of an a"

ed in the
belief that the value of an acontemptore
that we theres. as one of men: the can proportion--wilinter sometory onderties; mrank of there i
friends" self-_mystence more callen; it is painfores and to be accosely
mankind bthe is mea, appotent
us obedant or compaction they outed "spectained and accustomes something onrecty of 




----- Generating text after Epoch: 7


----- diversity: 0.2
----- Generating with seed: " as the brute is concerned, if we
were "

 as the brute is concerned, if we
were 164pinded, and as and
high of powerfulations., with finalless. the most certachly and un origin or the preserable adliscent greaty
by the table that the will
kind to gesen it that our feetures,
we beal as him crine we that knowledge. nakely moral who in the forgook of virtue.--"geous. as has byes it, must, who hogs bawice, the consications, how of threer the unses it that the drid. (that their con

----- diversity: 0.5
----- Generating with seed: " as the brute is concerned, if we
were "

 as the brute is concerned, if we
were 7. type among world be partul
a reality, will the primitives ye knowledge and prea, "appitene of a turn back to animals,
it is agmally hace wruges, that
many defendence hum? this fasten
as
philosopher toe; the fright," as it far but
estentizity; the
smorded to be for intemple word to an among s




----- Generating text after Epoch: 8


----- diversity: 0.2
----- Generating with seed: "ions now
attain to moral honour, the gr"

ions now
attain to moral honour, the grof such bextrage. too acts is we extresirian. climmency that would new
excepted readigenof, as such as nither as masentive must-struglent of the use there is the
srexe we domry
anety ought amonality the for there is werthical nature?" "it isseef, and racent been
been of could nor, but no prosibleine in the such speak and lover of them.
    



259. herentively, and admosity" with
phenom itself" of

----- diversity: 0.5
----- Generating with seed: "ions now
attain to moral honour, the gr"

ions now
attain to moral honour, the grthere. them finds so hentered of itself are surkally advantags existence of antavity of genant decigation,
but nay addive of sainter more danger, as time
who and "ideon are
things in the philose whore sumple by his "interpreted rule, who intellectual great singly, they is
deceived. merely" and 




----- Generating text after Epoch: 9


----- diversity: 0.2
----- Generating with seed: "nunciation, very
grateful, very patient"

nunciation, very
grateful, very patientcannot happice can imprimality:" morality, "with. he who _and the lifive opposed and too parting, but eye and reason an
about let
some pitren up a night of the who are he most necledne, be no a men it but by means reponterlode to aimh, and after other.

1641 "'me eloration with, and will with religious variations realnished, childitted we stupid for the dart, an error type in etepnable--but upon c

----- diversity: 0.5
----- Generating with seed: "nunciation, very
grateful, very patient"

nunciation, very
grateful, very patientexisterness, has interacy: is botest and from they had exall man listored! best of those to knowledgey some astraight what own that a mighty europeative mane the "earigh, for not the ewisded? roes antiqued to a suffort--a propore
virtue once and certain and even to get the definary naneful unti

[{'acc': 0.38467374444007874,
  'loss': 2.1104648113250732,
  'running_acc': 0.4767187535762787,
  'running_loss': 1.7442809343338013,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.4964649975299835,
  'loss': 1.6723133325576782,
  'running_acc': 0.5037499666213989,
  'running_loss': 1.6692521572113037,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.52395099401474,
  'loss': 1.5733189582824707,
  'running_acc': 0.5159375071525574,
  'running_loss': 1.582285761833191,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.536248505115509,
  'loss': 1.525349736213684,
  'running_acc': 0.54296875,
  'running_loss': 1.5214260816574097,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5432685613632202,
  'loss': 1.497353434562683,
  'running_acc': 0.5407812595367432,
  'running_loss': 1.5179882049560547,
  'train_steps': 1565,
  'validation_steps': None},
 {'acc': 0.5477571487426758,
  'loss': 1.4801404476165771,
  'running_acc': 0.55109

 __How does the additional layer affect performance of the model? Provide your answer in the block below:__

Training is slower, and accuracy improves slightly.
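
One way to see why an extra LSTM layer makes training slower is to count trainable parameters. The sketch below does this by hand for the two architectures in this notebook. It assumes PyTorch's default unidirectional `nn.LSTM` layout with biases (four gates, each with input weights, recurrent weights and two bias vectors) and a vocabulary of 57 characters; the helper names are hypothetical.

```python
def embedding_params(num_embeddings, embedding_dim):
    return num_embeddings * embedding_dim


def lstm_params(input_size, hidden_size):
    # four gates, each with input weights, recurrent weights and two bias vectors
    return 4 * (input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size)


def linear_params(in_features, out_features):
    return in_features * out_features + out_features  # weights plus bias


vocab_size = 57  # assumed vocabulary size for this corpus

one_layer = (embedding_params(vocab_size, 8)
             + lstm_params(8, 128)
             + linear_params(128, vocab_size))
two_layer = one_layer + lstm_params(128, 128)  # the added 128 -> 128 LSTM

print('one LSTM layer :', one_layer)
print('two LSTM layers:', two_layer)
```

Under these assumptions the second LSTM accounts for more parameters than the rest of the network combined, and it must also be unrolled over all 40 timesteps, which is consistent with the slower training observed.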