# Lesson 1

褚则伟 zeweichu@gmail.com

[Reference](https://pytorch.org/tutorials/beginner/pytorch_with_examples.html)


What is PyTorch?
================

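PyTorch is a Python-based scientific computing library with the following features:

- It is similar to NumPy, but it can also run on GPUs
- It can be used to define deep learning models, and to train and use them flexibly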

Tensors
---------------


A Tensor is similar to NumPy's ndarray; the main difference is that a Tensor can also run on a GPU to accelerate computation.


In [1]:
from __future__ import print_function
import torch

Construct an uninitialized 5x3 matrix (its values are whatever happened to be in the allocated memory):

In [2]:
x = torch.empty(5, 3)
print(x)

tensor([[0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 4.7339e+30, 1.4347e-19],
        [2.7909e+23, 1.8037e+28, 1.7237e+25],
        [9.1041e-12, 6.2609e+22, 4.7428e+30],
        [3.8001e-39, 0.0000e+00, 0.0000e+00]])


Construct a randomly initialized matrix:

In [3]:
x = torch.rand(5, 3)
print(x)

tensor([[0.4821, 0.3854, 0.8517],
        [0.7962, 0.0632, 0.5409],
        [0.8891, 0.6112, 0.7829],
        [0.0715, 0.8069, 0.2608],
        [0.3292, 0.0119, 0.2759]])


Construct a matrix filled with zeros and of dtype long:

In [4]:
x = torch.zeros(5, 3, dtype=torch.long)
print(x)

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])


Construct a tensor directly from data:

In [5]:
x = torch.tensor([5.5, 3])
print(x)

tensor([5.5000, 3.0000])


You can also create a tensor from an existing tensor. These methods reuse properties of the input tensor, e.g. its dtype, unless new values are provided.

In [6]:
x = x.new_ones(5, 3, dtype=torch.double)      # new_* methods take in sizes
print(x)

x = torch.randn_like(x, dtype=torch.float)    # override dtype!
print(x)                                      # result has the same size

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)
tensor([[ 1.4793, -2.4772,  0.9738],
        [ 2.0328,  1.3981,  1.7509],
        [-0.7931, -0.0291, -0.6803],
        [-1.2944, -0.7352, -0.9346],
        [ 0.5917, -0.5149, -1.8149]])


Get the shape of a tensor:

In [7]:
print(x.size())

torch.Size([5, 3])


<div class="alert alert-info"><h4>Note</h4><p>``torch.Size`` is a subclass of tuple, so it supports all tuple operations.</p></div>
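As a small illustration of the note above, the sketch below treats the result of `size()` like any other tuple. The variable names are chosen only for this example.

```python
import torch

x = torch.zeros(5, 3)
size = x.size()

# torch.Size is a subclass of tuple, so ordinary tuple operations apply.
rows, cols = size                      # tuple unpacking
is_tuple = isinstance(size, tuple)
same_as_shape = (x.shape == size)      # .shape returns the same torch.Size

print(rows, cols, is_tuple, same_as_shape)
```

Unpacking like this is handy when a later computation needs the individual dimensions.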

Operations


There are many tensor operations. We start with addition.



In [8]:
y = torch.rand(5, 3)
print(x + y)

tensor([[ 1.7113, -1.5490,  1.4009],
        [ 2.4590,  1.6504,  2.6889],
        [-0.3609,  0.4950, -0.3357],
        [-0.5029, -0.3086, -0.1498],
        [ 1.2850, -0.3189, -0.8868]])


Another syntax for addition:


In [9]:
print(torch.add(x, y))

tensor([[ 1.7113, -1.5490,  1.4009],
        [ 2.4590,  1.6504,  2.6889],
        [-0.3609,  0.4950, -0.3357],
        [-0.5029, -0.3086, -0.1498],
        [ 1.2850, -0.3189, -0.8868]])


Addition: providing an output tensor as an argument

In [10]:
result = torch.empty(5, 3)
torch.add(x, y, out=result)
print(result)

tensor([[ 1.7113, -1.5490,  1.4009],
        [ 2.4590,  1.6504,  2.6889],
        [-0.3609,  0.4950, -0.3357],
        [-0.5029, -0.3086, -0.1498],
        [ 1.2850, -0.3189, -0.8868]])


In-place addition

In [11]:
# adds x to y
y.add_(x)
print(y)

tensor([[ 1.7113, -1.5490,  1.4009],
        [ 2.4590,  1.6504,  2.6889],
        [-0.3609,  0.4950, -0.3357],
        [-0.5029, -0.3086, -0.1498],
        [ 1.2850, -0.3189, -0.8868]])


<div class="alert alert-info"><h4>Note</h4><p>Any operation that mutates a tensor in place is suffixed with ``_``.
    For example, ``x.copy_(y)`` and ``x.t_()`` both change ``x``.</p></div>
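To make the difference concrete, here is a minimal sketch contrasting the out-of-place `add` with the in-place `add_`. The tensors and variable names are made up for this illustration.

```python
import torch

x = torch.ones(2, 3)
y = torch.full((2, 3), 2.0)

out_of_place = x.add(y)    # returns a new tensor; x is left unchanged
x_before = x.clone()       # snapshot of x before the in-place call

x.add_(y)                  # trailing underscore: x itself is modified

print(torch.equal(x_before, torch.ones(2, 3)))  # x was still all ones before add_
print(torch.equal(x, out_of_place))             # now x holds the same values as x + y
```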

You can use standard NumPy-like indexing on PyTorch tensors.


In [12]:
print(x[:, 1])

tensor([-2.4772,  1.3981, -0.0291, -0.7352, -0.5149])
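A few more indexing patterns, sketched on a small tensor with known values so the results are easy to follow. The tensor here is an assumption for illustration and is not the `x` used in the cells above.

```python
import torch

# Rows are [0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11].
x = torch.arange(12).reshape(3, 4)

second_col = x[:, 1]      # every row, column 1
last_row = x[-1]          # negative indices count from the end
block = x[:2, 2:]         # first two rows, last two columns
evens = x[x % 2 == 0]     # boolean mask; the result is one-dimensional

print(second_col, last_row, block, evens, sep="\n")
```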


Resizing: If you want to resize/reshape a tensor, you can use ``Tensor.view``:

In [13]:
x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8)  # the size -1 is inferred from other dimensions
print(x.size(), y.size(), z.size())

torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])
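One detail worth knowing is that `view` never copies data: the result shares storage with the original tensor, and it only works when the memory layout allows it. The sketch below illustrates this under the assumption of a plain CPU tensor, and uses `reshape` as the fallback for a non-contiguous tensor.

```python
import torch

x = torch.zeros(4, 4)
y = x.view(16)            # y is a view: it shares storage with x
y[0] = 7.0
shares_memory = x[0, 0].item() == 7.0

t = x.t()                 # transposing makes the tensor non-contiguous
try:
    t.view(16)            # a view is not possible for this layout
    view_failed = False
except RuntimeError:
    view_failed = True

flat = t.reshape(16)      # reshape falls back to a copy when a view is impossible
print(shares_memory, view_failed, flat.shape)
```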


If you have a one-element tensor, use ``.item()`` to get its value as a Python number.

In [14]:
x = torch.randn(1)
print(x)
print(x.item())

tensor([0.4726])
0.4726296067237854


**Read more**


  Many more Tensor operations, including transposing, indexing, slicing,
  mathematical operations, linear algebra, and random numbers, are described at
  `<https://pytorch.org/docs/torch>`.

Converting between NumPy and Tensor
------------

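Converting between a Torch Tensor and a NumPy array is straightforward.

A Torch Tensor on the CPU and the NumPy array converted from it share the same memory, so changing one also changes the other.

Converting a Torch Tensor to a NumPy array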


In [15]:
a = torch.ones(5)
print(a)

tensor([1., 1., 1., 1., 1.])


In [16]:
b = a.numpy()
print(b)

[1. 1. 1. 1. 1.]


Modify the tensor in place and see how the NumPy array changes with it.

In [17]:
a.add_(1)
print(a)
print(b)

tensor([2., 2., 2., 2., 2.])
[2. 2. 2. 2. 2.]


Converting a NumPy ndarray to a Torch Tensor

In [18]:
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)

[2. 2. 2. 2. 2.]
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)


Tensors on the CPU can be converted to NumPy arrays and back.
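Two practical caveats apply to the conversion: a tensor that requires gradients must be detached first, and a tensor on another device must be moved to the CPU first. The sketch below shows one way to handle both; `to_numpy` is a hypothetical helper written for this example, not a PyTorch function.

```python
import numpy as np
import torch

w = torch.ones(3, requires_grad=True)

# Calling .numpy() directly on a tensor that requires grad raises an error.
try:
    w.numpy()
    direct_ok = True
except RuntimeError:
    direct_ok = False

arr = w.detach().numpy()   # detach from the graph first, then convert


def to_numpy(t):
    """Hypothetical helper: detach, move to the CPU, then convert."""
    return t.detach().cpu().numpy()


print(direct_ok, arr, to_numpy(torch.zeros(2)))
```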

CUDA Tensors
------------

Tensors can be moved onto another device using the ``.to`` method.



In [19]:
# let us run this cell only if CUDA is available
# We will use ``torch.device`` objects to move tensors in and out of GPU
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                       # or just use strings ``.to("cuda")``
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # ``.to`` can also change dtype together!
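A common variation is to pick the device once and write the rest of the code without branching, so the same script runs with or without a GPU. This is a sketch of that pattern, not the only way to do it.

```python
import torch

# Choose the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(4, 4, device=device)   # create the tensor directly on the device
y = torch.ones_like(x)                 # ones_like inherits device and dtype from x
z = (x + y).to("cpu")                  # bring the result back to the CPU

print(device, z.device)
```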


Warm-up: a two-layer neural network in numpy
--------------

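A fully connected ReLU network with one hidden layer and no bias, trained to predict y from x with an L2 loss.

This implementation uses only numpy to compute the forward pass, the loss, and the backward pass.

A numpy ndarray is a generic n-dimensional array. It knows nothing about deep learning, gradients, or computation graphs; it is just a data structure for numerical computation.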
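For reference, the gradients computed by hand in the implementation that follows can be derived with the chain rule. The symbols are introduced here for the derivation: $W_1$ and $W_2$ stand for `w1` and `w2`, $a$ for `h_relu`, $\hat{y}$ for `y_pred`, and $L$ for `loss`.

```latex
\begin{aligned}
h &= x W_1, \qquad a = \max(h, 0), \qquad \hat{y} = a W_2, \qquad
L = \sum_{i,j} (\hat{y}_{ij} - y_{ij})^2 \\[4pt]
\frac{\partial L}{\partial \hat{y}} &= 2\,(\hat{y} - y) \\
\frac{\partial L}{\partial W_2} &= a^{\top} \frac{\partial L}{\partial \hat{y}} \\
\frac{\partial L}{\partial a} &= \frac{\partial L}{\partial \hat{y}}\, W_2^{\top} \\
\frac{\partial L}{\partial h} &= \frac{\partial L}{\partial a} \odot \mathbb{1}[h \ge 0] \\
\frac{\partial L}{\partial W_1} &= x^{\top} \frac{\partial L}{\partial h}
\end{aligned}
```

Here $\odot$ is the element-wise product. ReLU is not differentiable at $h = 0$; the indicator $\mathbb{1}[h \ge 0]$ matches the code, which zeroes the gradient only where `h < 0`.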



In [20]:
import numpy as np

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and output data
x = np.random.randn(N, D_in)
y = np.random.randn(N, D_out)

# Randomly initialize weights
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    y_pred = h_relu.dot(w2)

    # Compute and print loss
    loss = np.square(y_pred - y).sum()
    print(t, loss)

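    # Backprop to compute gradients of the loss with respect to w1 and w2
    # loss = sum((y_pred - y) ** 2), so d(loss)/d(y_pred) = 2 * (y_pred - y)
    grad_y_pred = 2.0 * (y_pred - y)
    # Apply the chain rule backwards through each step of the forward pass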
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu.copy()
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)

    # Update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

0 34399246.46047344
1 29023199.257758312
2 25155679.85447208
3 20344203.603057466
4 14771404.625789404
5 9796072.99431371
6 6194144.749997159
7 3948427.3657580013
8 2637928.1726997104
9 1879876.2597949505
10 1424349.925182723
11 1131684.579785501
12 930879.9521737935
13 783503.167740541
14 669981.8287784329
15 579151.6288421676
16 504610.5781504087
17 442295.18952143926
18 389647.44224490353
19 344718.3535892912
20 306120.2245707266
21 272728.24885829526
22 243778.8617292929
23 218485.92082002352
24 196304.70602822883
25 176774.2980280186
26 159509.34934842546
27 144200.52956072442
28 130597.06878493169
29 118484.47548850597
30 107661.24303895692
31 97973.75762285746
32 89291.0096051952
33 81500.46898789635
34 74477.4654945682
35 68139.90452489533
36 62418.87519034026
37 57241.53801123622
38 52545.34658231941
39 48280.5552386464
40 44399.73653914068
41 40864.495617471934
42 37640.08489317873
43 34695.77852549495
44 32004.894008637555
45 29545.09481447049
46 27292.93700341219
47 25232.8

367 0.0004326546423136559
368 0.0004116382458083261
369 0.0003916440959886334
370 0.0003726296356534275
371 0.0003545443586216977
372 0.000337347352488608
373 0.00032099061370803334
374 0.0003054229784132819
375 0.00029061647064382485
376 0.0002765299098361774
377 0.0002631327221101076
378 0.0002503865963973947
379 0.0002382599294869431
380 0.00022672670184804494
381 0.00021575299560298047
382 0.00020531375263207438
383 0.000195381616896771
384 0.00018593500698085453
385 0.00017694494225329907
386 0.00016839225855899982
387 0.00016025517275686525
388 0.00015251350815142156
389 0.0001451491411549753
390 0.0001381428245892601
391 0.00013147417414693054
392 0.00012512977608770297
393 0.00011909308605343111
394 0.00011334857979979945
395 0.00010788480695473414
396 0.00010268704883570024
397 9.773868892276339e-05
398 9.303020197524704e-05
399 8.85491663624475e-05
400 8.428485316645869e-05
401 8.022778747190388e-05
402 7.636668153099922e-05
403 7.269236014951034e-05
404 6.919607836124983e-05


PyTorch: Tensors
----------------

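This time we use PyTorch tensors to implement the forward pass, the loss computation, and the backward pass.

A PyTorch Tensor is much like a numpy ndarray. The biggest difference is that a PyTorch Tensor can run on either the CPU or the GPU. To run on the GPU, the Tensor has to be placed on a CUDA device.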


In [21]:
import torch


dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and output data
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

# Randomly initialize weights
w1 = torch.randn(D_in, H, device=device, dtype=dtype)
w2 = torch.randn(H, D_out, device=device, dtype=dtype)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    print(t, loss)

    # Backprop to compute gradients of the loss with respect to w1 and w2
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # Update weights using gradient descent
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

0 31704728.0
1 25331164.0
2 22378086.0
3 19262238.0
4 15348289.0
5 11017595.0
6 7356282.0
7 4705923.5
8 3027346.5
9 2012536.375
10 1409662.25
11 1041771.75
12 807321.0625
13 649262.0
14 536533.1875
15 451980.875
16 385983.53125
17 332925.53125
18 289368.1875
19 253030.78125
20 222354.703125
21 196214.3125
22 173766.515625
23 154378.140625
24 137539.375
25 122867.1015625
26 110037.3515625
27 98769.4921875
28 88842.109375
29 80063.15625
30 72279.015625
31 65361.66796875
32 59195.42578125
33 53687.4453125
34 48757.57421875
35 44338.4453125
36 40370.34765625
37 36803.1484375
38 33587.4453125
39 30684.1640625
40 28059.435546875
41 25683.255859375
42 23528.814453125
43 21570.8515625
44 19792.4296875
45 18175.244140625
46 16704.6640625
47 15364.2578125
48 14141.7509765625
49 13026.609375
50 12007.3115234375
51 11075.3896484375
52 10221.8857421875
53 9439.876953125
54 8722.13671875
55 8063.46826171875
56 7458.20703125
57 6901.8876953125
58 6390.34375
59 5919.4794921875
60 5485.79345703125
61 5

375 0.0002844816190190613
376 0.00027625024085864425
377 0.0002687727683223784
378 0.0002608516369946301
379 0.00025311342324130237
380 0.0002469048195052892
381 0.00024049097555689514
382 0.0002342124644201249
383 0.00022811403323430568
384 0.00022231723414734006
385 0.0002166029589716345
386 0.00021077181736472994
387 0.00020510501053649932
388 0.00020020001102238894
389 0.0001948442222783342
390 0.00018990584067068994
391 0.00018529882072471082
392 0.00018070911755785346
393 0.00017650797963142395
394 0.00017214834224432707
395 0.0001683011942077428
396 0.00016451899136882275
397 0.00016050187696237117
398 0.00015686434926465154
399 0.00015321985119953752
400 0.0001501761726103723
401 0.00014639270375482738
402 0.00014274154091253877
403 0.0001396275474689901
404 0.0001364489580737427
405 0.00013346801279112697
406 0.00013024920190218836
407 0.00012755846546497196
408 0.00012532222899608314
409 0.0001224723382620141
410 0.00011974618973908946
411 0.00011740042100427672
412 0.0001144

A simple autograd example

In [22]:
# Create tensors.
x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)

# Build a computational graph.
y = w * x + b    # y = 2 * x + 3

# Compute gradients.
y.backward()

# Print out the gradients.
print(x.grad)    # x.grad = 2 
print(w.grad)    # w.grad = 1 
print(b.grad)    # b.grad = 1 

tensor(2.)
tensor(1.)
tensor(1.)
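One behavior that matters for the training loops later on is that `backward()` adds to `.grad` instead of overwriting it. The following sketch, with values chosen only for illustration, shows the accumulation and why the gradients are reset between steps.

```python
import torch

x = torch.tensor(1.0)
w = torch.tensor(2.0, requires_grad=True)

(w * x).backward()
first = w.grad.item()      # d(w * x)/dw = x = 1.0

(w * x).backward()
second = w.grad.item()     # the new gradient is added to the old one: 2.0

w.grad.zero_()             # reset in place before the next backward pass
cleared = w.grad.item()

print(first, second, cleared)
```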



PyTorch: Tensors and autograd
-------------------------------

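One of PyTorch's most important features is autograd: once you have defined the forward pass and computed the loss, PyTorch can automatically compute the gradients of the loss with respect to all model parameters.

A PyTorch Tensor represents a node in the computation graph. If ``x`` is a Tensor with ``x.requires_grad=True``, then ``x.grad`` is another Tensor holding the gradient of some scalar (usually the loss) with respect to ``x``.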
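A simple way to build trust in autograd is to compare its result with a numerical estimate. The sketch below does this for a toy function `f`, which is an assumption made for this example, using central finite differences in double precision.

```python
import torch


def f(x):
    # Toy scalar function; analytically, df/dx = 3 * x ** 2.
    return (x ** 3).sum()


x = torch.tensor([1.0, 2.0, 3.0], dtype=torch.double, requires_grad=True)
f(x).backward()
autograd_grad = x.grad.clone()

# Central finite differences, one coordinate at a time.
eps = 1e-6
numeric = torch.zeros_like(autograd_grad)
with torch.no_grad():
    for i in range(x.numel()):
        step = torch.zeros_like(x)
        step[i] = eps
        numeric[i] = (f(x + step) - f(x - step)) / (2 * eps)

print(autograd_grad)
print(numeric)
```

The two results should agree up to a small numerical error, which is the usual sanity check for a hand-written or automatic gradient.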


In [23]:
import torch

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs.
# requires_grad defaults to False, so no gradients are computed for them during the backward pass.
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

# Create random Tensors for the weights.
# Setting requires_grad=True means we want gradients with respect to these Tensors during the backward pass.
w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)

learning_rate = 1e-6
for t in range(500):
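    # Forward pass: compute predicted y using operations on Tensors. This is the same
    # forward pass as before, but we no longer need to keep the intermediate
    # values because we are not implementing the backward pass by hand.
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    # Compute the loss from the forward pass.
    # loss is a 0-dimensional (scalar) Tensor;
    # loss.item() returns its value as a Python number.
    loss = (y_pred - y).pow(2).sum()
    print(t, loss.item())

    # Use autograd to compute the backward pass. This call computes the gradient of
    # the loss with respect to every Tensor with requires_grad=True. Afterwards,
    # w1.grad and w2.grad hold the gradients of the loss with respect to w1 and w2.
    loss.backward()

    # Update the weights manually with gradient descent (an automatic way is
    # introduced later). The updates are wrapped in torch.no_grad() because w1 and w2
    # have requires_grad=True, but the update itself should not be tracked by autograd.
    # An alternative is to operate on weight.data and weight.grad.data:
    # tensor.data gives a tensor that shares storage with the original
    # but does not record computation-graph history.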
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad

        # Manually zero the gradients after updating weights
        w1.grad.zero_()
        w2.grad.zero_()

0 31590738.0
1 34389704.0
2 44504280.0
3 52598508.0
4 46752264.0
5 27227634.0
6 10779343.0
7 3889138.75
8 1856397.875
9 1232127.25
10 967278.5
11 806383.9375
12 687169.25
13 591936.25
14 513579.40625
15 448339.5
16 393390.71875
17 346772.71875
18 306952.625
19 272743.90625
20 243250.578125
21 217760.4375
22 195513.75
23 176012.4375
24 158848.59375
25 143694.4375
26 130272.53125
27 118357.1328125
28 107732.5625
29 98245.9296875
30 89754.4375
31 82145.9765625
32 75299.703125
33 69130.7265625
34 63549.09375
35 58498.18359375
36 53914.7421875
37 49751.984375
38 45963.8515625
39 42512.19140625
40 39364.1484375
41 36486.7421875
42 33852.94921875
43 31441.951171875
44 29230.11328125
45 27200.080078125
46 25335.595703125
47 23618.97265625
48 22036.193359375
49 20575.412109375
50 19227.5078125
51 17980.865234375
52 16826.919921875
53 15756.392578125
54 14762.513671875
55 13839.58203125
56 12981.9228515625
57 12184.3896484375
58 11442.140625
59 10750.8681640625
60 10106.751953125
61 9505.8720703

394 0.005430158693343401
395 0.005243257619440556
396 0.005058295093476772
397 0.0048800683580338955
398 0.004707938991487026
399 0.004541801754385233
400 0.004385354463011026
401 0.0042332010343670845
402 0.0040851193480193615
403 0.003942274488508701
404 0.003809330752119422
405 0.0036788880825042725
406 0.0035530496388673782
407 0.0034328829497098923
408 0.003316469956189394
409 0.0032058244105428457
410 0.003095718566328287
411 0.002996482653543353
412 0.002896404592320323
413 0.002801347989588976
414 0.0027062646113336086
415 0.0026161009445786476
416 0.002530781552195549
417 0.002449025632813573
418 0.002370838774368167
419 0.002294242149218917
420 0.002220114693045616
421 0.002151642693206668
422 0.0020829373970627785
423 0.0020190104842185974
424 0.0019563380628824234
425 0.0018947365460917354
426 0.0018343634437769651
427 0.0017779992194846272
428 0.0017241643508896232
429 0.001670036930590868
430 0.0016198739176616073
431 0.0015696510672569275
432 0.0015243508387356997
433 0.


PyTorch: nn
-----------


This time we use PyTorch's nn package to build the network.
The computation graph is still built by PyTorch autograd,
which computes the gradients for us automatically.
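Before building the full model, it can help to look at what an `nn.Linear` layer actually stores. The sketch below inspects the parameter shapes, checks the layer against a manual matrix product, and counts the parameters of a model with the same layer sizes as the one used in this section.

```python
import torch

layer = torch.nn.Linear(1000, 100)
weight_shape = tuple(layer.weight.shape)   # stored as (out_features, in_features)
bias_shape = tuple(layer.bias.shape)

# The layer computes x @ W^T + b.
x = torch.randn(4, 1000)
manual = x @ layer.weight.t() + layer.bias
same_output = torch.allclose(layer(x), manual, atol=1e-5)

model = torch.nn.Sequential(
    torch.nn.Linear(1000, 100),
    torch.nn.ReLU(),
    torch.nn.Linear(100, 10),
)
# 1000*100 weights + 100 biases + 100*10 weights + 10 biases
n_params = sum(p.numel() for p in model.parameters())

print(weight_shape, bias_shape, same_output, n_params)
```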




In [24]:
import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-4
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(x)

    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the
    # loss.
    loss = loss_fn(y_pred, y)
    print(t, loss.item())

    # Zero the gradients before running the backward pass.
    model.zero_grad()

    # Backward pass: compute gradient of the loss with respect to all the learnable
    # parameters of the model. Internally, the parameters of each Module are stored
    # in Tensors with requires_grad=True, so this call will compute gradients for
    # all learnable parameters in the model.
    loss.backward()

    # Update the weights using gradient descent. Each parameter is a Tensor, so
    # we can access its gradients like we did before.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

0 616.8349609375
1 570.4186401367188
2 530.6421508789062
3 495.7164001464844
4 464.91497802734375
5 437.1092834472656
6 411.87066650390625
7 388.5781555175781
8 367.105224609375
9 347.06768798828125
10 328.3486328125
11 310.6429748535156
12 294.08880615234375
13 278.54046630859375
14 263.8558044433594
15 249.83802795410156
16 236.52313232421875
17 223.8170166015625
18 211.7015380859375
19 200.13755798339844
20 189.1465301513672
21 178.6802520751953
22 168.74122619628906
23 159.31674194335938
24 150.35125732421875
25 141.79025268554688
26 133.63401794433594
27 125.89380645751953
28 118.53340148925781
29 111.54275512695312
30 104.91582489013672
31 98.65790557861328
32 92.7421646118164
33 87.18020629882812
34 81.94192504882812
35 77.01036834716797
36 72.3639144897461
37 67.99095916748047
38 63.88977813720703
39 60.036468505859375
40 56.426231384277344
41 53.05012512207031
42 49.88925552368164
43 46.92338943481445
44 44.14652633666992
45 41.54481887817383
46 39.10710144042969
47 36.8310813

364 0.00028583104722201824
365 0.000278460793197155
366 0.00027128090732730925
367 0.00026430474827066064
368 0.0002575131948105991
369 0.0002509095938876271
370 0.00024448230396956205
371 0.00023822381626814604
372 0.0002321432693861425
373 0.00022622810502070934
374 0.0002204657648690045
375 0.0002148632047465071
376 0.00020940121612511575
377 0.00020409838180057704
378 0.0001989272132050246
379 0.00019389843509998173
380 0.00018900231225416064
381 0.0001842392230173573
382 0.00017960301192943007
383 0.00017508988094050437
384 0.00017069902969524264
385 0.00016641429101582617
386 0.00016225305444095284
387 0.000158198265125975
388 0.00015424926823470742
389 0.000150404914165847
390 0.0001466635148972273
391 0.00014301779447123408
392 0.00013946628314442933
393 0.00013601550017483532
394 0.00013265143206808716
395 0.00012937198334839195
396 0.00012617645552381873
397 0.0001230676716659218
398 0.0001200390252051875
399 0.00011708753299899399
400 0.00011421682575019076
401 0.00011141804


PyTorch: optim
--------------

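This time, instead of updating the model's weights by hand, we use the optim package to update the parameters for us.
The optim package provides a variety of optimization algorithms, including SGD+momentum, RMSProp, Adam, and others.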
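To connect the optimizer to the manual updates used earlier, the sketch below checks that one step of plain `torch.optim.SGD` (no momentum, no weight decay) matches `w - lr * w.grad`. The toy loss and the variable names are assumptions made for this illustration.

```python
import torch

torch.manual_seed(0)
w = torch.randn(3, requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()
loss.backward()                                  # w.grad = 2 * w

expected = (w - 0.1 * w.grad).detach().clone()   # the manual update rule
optimizer.step()                                 # the optimizer applies its update in place
matches = torch.allclose(w.detach(), expected)

optimizer.zero_grad()                            # clear gradients before the next iteration
print(matches)
```

Optimizers such as Adam keep extra per-parameter state, so their steps do not reduce to this simple rule.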


In [25]:
import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model and loss function.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
loss_fn = torch.nn.MSELoss(reduction='sum')

# Use the optim package to define an Optimizer that will update the weights of
# the model for us. Here we will use Adam; the optim package contains many other
# optimization algorithms. The first argument to the Adam constructor tells the
# optimizer which Tensors it should update.
learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model.
    y_pred = model(x)

    # Compute and print loss.
    loss = loss_fn(y_pred, y)
    print(t, loss.item())

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable
    # weights of the model). This is because by default, gradients are
    # accumulated in buffers (i.e., not overwritten) whenever .backward()
    # is called. Check out the docs of torch.autograd.backward for more details.
    optimizer.zero_grad()

    # Backward pass: compute gradient of the loss with respect to model
    # parameters
    loss.backward()

    # Calling the step function on an Optimizer makes an update to its
    # parameters
    optimizer.step()

0 791.6784057617188
1 772.9479370117188
2 754.7570190429688
3 737.1196899414062
4 719.9917602539062
5 703.4118041992188
6 687.2720947265625
7 671.5335083007812
8 656.197021484375
9 641.322265625
10 626.919189453125
11 612.9500732421875
12 599.3975219726562
13 586.327392578125
14 573.5608520507812
15 561.1778564453125
16 549.1342163085938
17 537.3661499023438
18 525.8930053710938
19 514.7115478515625
20 503.76434326171875
21 493.110107421875
22 482.747802734375
23 472.5677490234375
24 462.6549072265625
25 452.9748840332031
26 443.5308532714844
27 434.2815856933594
28 425.22064208984375
29 416.3988342285156
30 407.7718505859375
31 399.32879638671875
32 391.1062927246094
33 383.07781982421875
34 375.2408752441406
35 367.54071044921875
36 359.98126220703125
37 352.5944519042969
38 345.3690185546875
39 338.3002624511719
40 331.3758544921875
41 324.56463623046875
42 317.8782958984375
43 311.31622314453125
44 304.8724670410156
45 298.5438537597656
46 292.3463439941406
47 286.2505187988281
48 

429 4.1297284042229876e-05
430 3.9249673136509955e-05
431 3.7301993870642036e-05
432 3.545052823028527e-05
433 3.3691019780235365e-05
434 3.2015552278608084e-05
435 3.0420967959798872e-05
436 2.890893301810138e-05
437 2.74712383543374e-05
438 2.6098401576746255e-05
439 2.4797802325338125e-05
440 2.3561869966215454e-05
441 2.2385444026440382e-05
442 2.1267582269501872e-05
443 2.0203089661663398e-05
444 1.9192844774806872e-05
445 1.8230841305921786e-05
446 1.7318037862423807e-05
447 1.6449723261757754e-05
448 1.562457691761665e-05
449 1.4840144103800412e-05
450 1.409533797414042e-05
451 1.3387114449869841e-05
452 1.2712825991911814e-05
453 1.2073536709067412e-05
454 1.1465210263850167e-05
455 1.0887116332014557e-05
456 1.0337735147913918e-05
457 9.81609719019616e-06
458 9.320682693214621e-06
459 8.849255209497642e-06
460 8.402344064961653e-06
461 7.977385394042358e-06
462 7.57272437112988e-06
463 7.188868949015159e-06
464 6.824670890637208e-06
465 6.479081548604881e-06
466 6.150191438791


PyTorch: Custom nn Modules
--------------------------

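We can define a model as a class that inherits from nn.Module. Whenever you need a model more complex than a simple sequence of existing Modules (nn.Sequential), define your own nn.Module subclass.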
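What makes subclassing convenient is that any `nn.Module` assigned as an attribute is registered automatically, so its parameters show up in `parameters()`. The sketch below uses `TinyNet`, a hypothetical small model written for this example, to show that.

```python
import torch


class TinyNet(torch.nn.Module):
    """Hypothetical two-layer model used only for this illustration."""

    def __init__(self):
        super().__init__()
        # Assigning Modules as attributes registers their parameters.
        self.linear1 = torch.nn.Linear(4, 3)
        self.linear2 = torch.nn.Linear(3, 2)

    def forward(self, x):
        return self.linear2(torch.relu(self.linear1(x)))


net = TinyNet()
names = [name for name, _ in net.named_parameters()]
out = net(torch.zeros(5, 4))    # calling the module runs forward()

print(names, out.shape)
```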



In [26]:
import torch


class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and assign them as
        member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must return
        a Tensor of output data. We can use Modules defined in the constructor as
        well as arbitrary operators on Tensors.
        """
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred


# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Construct our model by instantiating the class defined above
model = TwoLayerNet(D_in, H, D_out)

# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
for t in range(500):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)

    # Compute and print loss
    loss = criterion(y_pred, y)
    print(t, loss.item())

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

0 656.6958618164062
1 608.1090087890625
2 566.172607421875
3 529.2335815429688
4 496.7382507324219
5 467.453125
6 440.5755310058594
7 416.12872314453125
8 393.6068420410156
9 372.708251953125
10 353.00006103515625
11 334.477783203125
12 316.97283935546875
13 300.36737060546875
14 284.6544189453125
15 269.65936279296875
16 255.33456420898438
17 241.66688537597656
18 228.60800170898438
19 216.09536743164062
20 204.13780212402344
21 192.75645446777344
22 181.89234924316406
23 171.58370971679688
24 161.7939453125
25 152.4780731201172
26 143.59371948242188
27 135.14727783203125
28 127.13992309570312
29 119.55585479736328
30 112.37797546386719
31 105.62073516845703
32 99.24383544921875
33 93.24134826660156
34 87.58341979980469
35 82.25212860107422
36 77.24210357666016
37 72.55087280273438
38 68.1427230834961
39 64.00277709960938
40 60.1308479309082
41 56.49887466430664
42 53.0952033996582
43 49.906524658203125
44 46.91959762573242
45 44.11970520019531
46 41.50297164916992
47 39.0628700256347

375 0.0006505012279376388
376 0.0006343786371871829
377 0.0006186614627949893
378 0.0006033276440575719
379 0.0005883832345716655
380 0.0005738206673413515
381 0.0005596213741227984
382 0.0005457888473756611
383 0.0005322962533682585
384 0.0005191444652155042
385 0.000506328884512186
386 0.0004938290221616626
387 0.00048163760220631957
388 0.0004697689728345722
389 0.0004582055553328246
390 0.00044691533548757434
391 0.00043590739369392395
392 0.0004251690406817943
393 0.0004147063591517508
394 0.00040450665983371437
395 0.0003945553908124566
396 0.0003848606429528445
397 0.00037539892946369946
398 0.0003661849768832326
399 0.00035720854066312313
400 0.000348439411027357
401 0.0003398970584385097
402 0.00033156739664264023
403 0.0003234421892557293
404 0.0003155224258080125
405 0.00030779733788222075
406 0.0003002593875862658
407 0.00029291390092112124
408 0.00028574312455020845
409 0.0002787590492516756
410 0.000271946337306872
411 0.00026530082686804235
412 0.00025882109184749424
413

# FizzBuzz

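FizzBuzz is a simple counting game. The rules are: count upward from 1; for a multiple of 3 say fizz, for a multiple of 5 say buzz, for a multiple of 15 say fizzbuzz, and otherwise say the number itself.

We can write a small program that decides whether to return the number, fizz, buzz, or fizzbuzz.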

In [1]:
# Encode the desired output as a class index: 0 = number, 1 = "fizz", 2 = "buzz", 3 = "fizzbuzz"
def fizz_buzz_encode(i):
    if   i % 15 == 0: return 3
    elif i % 5  == 0: return 2
    elif i % 3  == 0: return 1
    else:             return 0
    
def fizz_buzz_decode(i, prediction):
    return [str(i), "fizz", "buzz", "fizzbuzz"][prediction]

print(fizz_buzz_decode(1, fizz_buzz_encode(1)))
print(fizz_buzz_decode(2, fizz_buzz_encode(2)))
print(fizz_buzz_decode(5, fizz_buzz_encode(5)))
print(fizz_buzz_decode(12, fizz_buzz_encode(12)))
print(fizz_buzz_decode(15, fizz_buzz_encode(15)))

1
2
buzz
fizz
fizzbuzz


First we define the model's inputs and outputs (the training data)

In [2]:
import numpy as np
import torch

NUM_DIGITS = 10

# Represent each input by an array of its binary digits.
def binary_encode(i, num_digits):
    return np.array([i >> d & 1 for d in range(num_digits)])

trX = torch.Tensor([binary_encode(i, NUM_DIGITS) for i in range(101, 2 ** NUM_DIGITS)])
trY = torch.LongTensor([fizz_buzz_encode(i) for i in range(101, 2 ** NUM_DIGITS)])

In [3]:
trX.size()

torch.Size([923, 10])

In [4]:
trY.size()

torch.Size([923])
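To see what the network receives as input, here is a pure-Python variant of `binary_encode` that returns a list instead of a numpy array. `binary_decode` is a hypothetical inverse added only to check the encoding.

```python
def binary_encode(i, num_digits):
    """Return the bits of i as a list, least-significant bit first."""
    return [i >> d & 1 for d in range(num_digits)]


def binary_decode(bits):
    """Hypothetical inverse of binary_encode, used here as a check."""
    return sum(bit << d for d, bit in enumerate(bits))


encoded = binary_encode(101, 10)          # 101 = 1 + 4 + 32 + 64
n_train = len(range(101, 2 ** 10))        # numbers 101..1023 form the training set

print(encoded, binary_decode(encoded), n_train)
```

The training set starts at 101, which leaves the numbers 1 to 100 unseen during training.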

Then we define the model with PyTorch

In [5]:
# Define the model
NUM_HIDDEN = 100
model = torch.nn.Sequential(
    torch.nn.Linear(NUM_DIGITS, NUM_HIDDEN),
    torch.nn.ReLU(),
    torch.nn.Linear(NUM_HIDDEN, 4)
)

In [6]:
print(model)

Sequential(
  (0): Linear(in_features=10, out_features=100, bias=True)
  (1): ReLU()
  (2): Linear(in_features=100, out_features=4, bias=True)
)


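- To teach our model the FizzBuzz game, we need to define a loss function and an optimization algorithm.
- The optimization algorithm repeatedly adjusts the model parameters to reduce the loss, so that the model reaches as low a loss as possible on this task.
- A low loss usually means the model performs well; a high loss means it performs poorly.
- Because FizzBuzz is essentially a classification problem, we use the cross-entropy loss.
- For the optimizer we use Stochastic Gradient Descent.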
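To give a feel for the numbers the cross-entropy loss produces, the sketch below computes it by hand for a single example with four classes, in plain Python. The function `cross_entropy` is written for this illustration; `torch.nn.CrossEntropyLoss` computes the same per-example quantity from raw logits and, by default, averages it over the batch.

```python
import math


def cross_entropy(logits, target):
    """Negative log of the softmax probability assigned to the target class."""
    m = max(logits)   # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


uniform = cross_entropy([0.0, 0.0, 0.0, 0.0], 0)     # no preference among 4 classes
confident = cross_entropy([10.0, 0.0, 0.0, 0.0], 0)  # confident and correct
wrong = cross_entropy([10.0, 0.0, 0.0, 0.0], 1)      # confident and wrong

print(uniform, confident, wrong)
```

A model that spreads probability evenly over the four classes scores log(4), about 1.386, which is a useful baseline when reading a training log.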

In [7]:
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr = 0.05)

The training code for the model is as follows

In [11]:
# Start training it
BATCH_SIZE = 128
for epoch in range(10000):
    for start in range(0, len(trX), BATCH_SIZE):
        end = start + BATCH_SIZE
        batchX = trX[start:end]
        batchY = trY[start:end]

        y_pred = model(batchX)
        loss = loss_fn(y_pred, batchY)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Find loss on training data
    loss = loss_fn(model(trX), trY).item()
    print('Epoch:', epoch, 'Loss:', loss)

Epoch: 0 Loss: 1.1398005485534668
Epoch: 1 Loss: 1.1396173238754272
Epoch: 2 Loss: 1.1394379138946533
Epoch: 3 Loss: 1.1392650604248047
Epoch: 4 Loss: 1.1390981674194336
Epoch: 5 Loss: 1.1389374732971191
Epoch: 6 Loss: 1.1387834548950195
Epoch: 7 Loss: 1.1386330127716064
Epoch: 8 Loss: 1.138487696647644
Epoch: 9 Loss: 1.1383460760116577
Epoch: 10 Loss: 1.1382099390029907
Epoch: 11 Loss: 1.138074517250061
Epoch: 12 Loss: 1.1379458904266357
Epoch: 13 Loss: 1.137818455696106
Epoch: 14 Loss: 1.1376941204071045
Epoch: 15 Loss: 1.137571930885315
Epoch: 16 Loss: 1.137453556060791
Epoch: 17 Loss: 1.1373379230499268
Epoch: 18 Loss: 1.1372203826904297
Epoch: 19 Loss: 1.1371078491210938
Epoch: 20 Loss: 1.1369930505752563
Epoch: 21 Loss: 1.1368814706802368
Epoch: 22 Loss: 1.1367700099945068
Epoch: 23 Loss: 1.1366610527038574
Epoch: 24 Loss: 1.13655424118042
Epoch: 25 Loss: 1.1364490985870361
Epoch: 26 Loss: 1.1363450288772583
Epoch: 27 Loss: 1.1362453699111938
Epoch: 28 Loss: 1.1361465454101562
Ep

Epoch: 233 Loss: 1.1057689189910889
Epoch: 234 Loss: 1.1055166721343994
Epoch: 235 Loss: 1.1051126718521118
Epoch: 236 Loss: 1.1048898696899414
Epoch: 237 Loss: 1.1042641401290894
Epoch: 238 Loss: 1.1041080951690674
Epoch: 239 Loss: 1.1034795045852661
Epoch: 240 Loss: 1.1034988164901733
Epoch: 241 Loss: 1.1025210618972778
Epoch: 242 Loss: 1.102926254272461
Epoch: 243 Loss: 1.1018842458724976
Epoch: 244 Loss: 1.101987600326538
Epoch: 245 Loss: 1.1013853549957275
Epoch: 246 Loss: 1.100823163986206
Epoch: 247 Loss: 1.1010090112686157
Epoch: 248 Loss: 1.1000183820724487
Epoch: 249 Loss: 1.1000453233718872
Epoch: 250 Loss: 1.099259614944458
Epoch: 251 Loss: 1.0992088317871094
Epoch: 252 Loss: 1.098765254020691
Epoch: 253 Loss: 1.0984464883804321
Epoch: 254 Loss: 1.0976760387420654
Epoch: 255 Loss: 1.0971912145614624
Epoch: 256 Loss: 1.0975146293640137
Epoch: 257 Loss: 1.0965760946273804
Epoch: 258 Loss: 1.096675157546997
Epoch: 259 Loss: 1.0955779552459717
Epoch: 260 Loss: 1.095758318901062
...

Epoch: 466 Loss: 0.9633293151855469
Epoch: 467 Loss: 0.9627517461776733
Epoch: 468 Loss: 0.9618582725524902
Epoch: 469 Loss: 0.9609419703483582
Epoch: 470 Loss: 0.9599421620368958
Epoch: 471 Loss: 0.9588116407394409
Epoch: 472 Loss: 0.9583871960639954
Epoch: 473 Loss: 0.9575807452201843
Epoch: 474 Loss: 0.9562839269638062
Epoch: 475 Loss: 0.9562606811523438
Epoch: 476 Loss: 0.9534665942192078
Epoch: 477 Loss: 0.9535610675811768
Epoch: 478 Loss: 0.9530202746391296
Epoch: 479 Loss: 0.9517608284950256
Epoch: 480 Loss: 0.9508361220359802
Epoch: 481 Loss: 0.9502776265144348
Epoch: 482 Loss: 0.9487178325653076
Epoch: 483 Loss: 0.9483950138092041
Epoch: 484 Loss: 0.9465591311454773
Epoch: 485 Loss: 0.946814239025116
Epoch: 486 Loss: 0.9457123279571533
Epoch: 487 Loss: 0.9437385201454163
Epoch: 488 Loss: 0.9440769553184509
Epoch: 489 Loss: 0.9418604969978333
Epoch: 490 Loss: 0.9427571296691895
Epoch: 491 Loss: 0.9406864643096924
Epoch: 492 Loss: 0.939991295337677
Epoch: 493 Loss: 0.93926328420
...

Epoch: 704 Loss: 0.7014567255973816
Epoch: 705 Loss: 0.7009668350219727
Epoch: 706 Loss: 0.6995615363121033
Epoch: 707 Loss: 0.6980858445167542
Epoch: 708 Loss: 0.6968650221824646
Epoch: 709 Loss: 0.6961283087730408
Epoch: 710 Loss: 0.6937457919120789
Epoch: 711 Loss: 0.6940642595291138
Epoch: 712 Loss: 0.6930643320083618
Epoch: 713 Loss: 0.6914438009262085
Epoch: 714 Loss: 0.6890421509742737
Epoch: 715 Loss: 0.6890953779220581
Epoch: 716 Loss: 0.6874043941497803
Epoch: 717 Loss: 0.6861819624900818
Epoch: 718 Loss: 0.6850875020027161
Epoch: 719 Loss: 0.6837383508682251
Epoch: 720 Loss: 0.682225227355957
Epoch: 721 Loss: 0.6815162301063538
Epoch: 722 Loss: 0.6799144744873047
Epoch: 723 Loss: 0.6789056062698364
Epoch: 724 Loss: 0.6788721084594727
Epoch: 725 Loss: 0.6751896739006042
Epoch: 726 Loss: 0.675994336605072
Epoch: 727 Loss: 0.6743634343147278
Epoch: 728 Loss: 0.67205810546875
Epoch: 729 Loss: 0.6730316281318665
Epoch: 730 Loss: 0.6705272197723389
Epoch: 731 Loss: 0.6688125729560
...

Epoch: 938 Loss: 0.46011894941329956
Epoch: 939 Loss: 0.4596499502658844
Epoch: 940 Loss: 0.4588138461112976
Epoch: 941 Loss: 0.4581281542778015
Epoch: 942 Loss: 0.4587392210960388
Epoch: 943 Loss: 0.4565322697162628
Epoch: 944 Loss: 0.45571208000183105
Epoch: 945 Loss: 0.4552833139896393
Epoch: 946 Loss: 0.45417988300323486
Epoch: 947 Loss: 0.4535030424594879
Epoch: 948 Loss: 0.45269304513931274
Epoch: 949 Loss: 0.4521746337413788
Epoch: 950 Loss: 0.45096078515052795
Epoch: 951 Loss: 0.45036786794662476
Epoch: 952 Loss: 0.44997817277908325
Epoch: 953 Loss: 0.4490334689617157
Epoch: 954 Loss: 0.44815754890441895
Epoch: 955 Loss: 0.4477037191390991
Epoch: 956 Loss: 0.44682297110557556
Epoch: 957 Loss: 0.4456159472465515
Epoch: 958 Loss: 0.44554945826530457
Epoch: 959 Loss: 0.44421783089637756
Epoch: 960 Loss: 0.44332921504974365
Epoch: 961 Loss: 0.4426501989364624
Epoch: 962 Loss: 0.44251060485839844
Epoch: 963 Loss: 0.4412216246128082
Epoch: 964 Loss: 0.440498948097229
...

Epoch: 1178 Loss: 0.32014063000679016
Epoch: 1179 Loss: 0.3194880187511444
Epoch: 1180 Loss: 0.31905874609947205
Epoch: 1181 Loss: 0.3189382255077362
Epoch: 1182 Loss: 0.31837284564971924
Epoch: 1183 Loss: 0.3178442418575287
Epoch: 1184 Loss: 0.3176492154598236
Epoch: 1185 Loss: 0.3170930743217468
Epoch: 1186 Loss: 0.3168293833732605
Epoch: 1187 Loss: 0.31628701090812683
Epoch: 1188 Loss: 0.3159978687763214
Epoch: 1189 Loss: 0.3155488073825836
Epoch: 1190 Loss: 0.31514135003089905
Epoch: 1191 Loss: 0.3146306574344635
Epoch: 1192 Loss: 0.3142470419406891
Epoch: 1193 Loss: 0.31391674280166626
Epoch: 1194 Loss: 0.31356126070022583
Epoch: 1195 Loss: 0.31310969591140747
Epoch: 1196 Loss: 0.31259235739707947
Epoch: 1197 Loss: 0.312304824590683
Epoch: 1198 Loss: 0.3119160234928131
Epoch: 1199 Loss: 0.31149038672447205
Epoch: 1200 Loss: 0.3111397624015808
Epoch: 1201 Loss: 0.31058239936828613
Epoch: 1202 Loss: 0.310198575258255
Epoch: 1203 Loss: 0.3099671006202698
Epoch: 1204 Loss: 0.309349387
...

Epoch: 1404 Loss: 0.244589164853096
Epoch: 1405 Loss: 0.24434548616409302
Epoch: 1406 Loss: 0.24411630630493164
Epoch: 1407 Loss: 0.24412666261196136
Epoch: 1408 Loss: 0.24362525343894958
Epoch: 1409 Loss: 0.24335569143295288
Epoch: 1410 Loss: 0.2430194914340973
Epoch: 1411 Loss: 0.24279698729515076
Epoch: 1412 Loss: 0.24251756072044373
Epoch: 1413 Loss: 0.24222467839717865
Epoch: 1414 Loss: 0.24200689792633057
Epoch: 1415 Loss: 0.24164602160453796
Epoch: 1416 Loss: 0.24144819378852844
Epoch: 1417 Loss: 0.24137814342975616
Epoch: 1418 Loss: 0.24087746441364288
Epoch: 1419 Loss: 0.24078147113323212
Epoch: 1420 Loss: 0.24036598205566406
Epoch: 1421 Loss: 0.2401871234178543
Epoch: 1422 Loss: 0.239908829331398
Epoch: 1423 Loss: 0.2398034632205963
Epoch: 1424 Loss: 0.2394174486398697
Epoch: 1425 Loss: 0.23904035985469818
Epoch: 1426 Loss: 0.23891063034534454
Epoch: 1427 Loss: 0.23850639164447784
Epoch: 1428 Loss: 0.238266259431839
Epoch: 1429 Loss: 0.2381998747587204
...

Epoch: 1623 Loss: 0.19515305757522583
Epoch: 1624 Loss: 0.19501429796218872
Epoch: 1625 Loss: 0.1947644203901291
Epoch: 1626 Loss: 0.19466176629066467
Epoch: 1627 Loss: 0.19457656145095825
Epoch: 1628 Loss: 0.19425824284553528
Epoch: 1629 Loss: 0.19400806725025177
Epoch: 1630 Loss: 0.1938435584306717
Epoch: 1631 Loss: 0.19372941553592682
Epoch: 1632 Loss: 0.19344022870063782
Epoch: 1633 Loss: 0.1932556927204132
Epoch: 1634 Loss: 0.19299177825450897
Epoch: 1635 Loss: 0.19290053844451904
Epoch: 1636 Loss: 0.19270068407058716
Epoch: 1637 Loss: 0.1924726516008377
Epoch: 1638 Loss: 0.19234180450439453
Epoch: 1639 Loss: 0.19209067523479462
Epoch: 1640 Loss: 0.19191229343414307
Epoch: 1641 Loss: 0.1916447877883911
Epoch: 1642 Loss: 0.19158302247524261
Epoch: 1643 Loss: 0.19124764204025269
Epoch: 1644 Loss: 0.19111153483390808
Epoch: 1645 Loss: 0.1909576654434204
Epoch: 1646 Loss: 0.19079534709453583
Epoch: 1647 Loss: 0.19050449132919312
Epoch: 1648 Loss: 0.19035327434539795
...

Epoch: 1858 Loss: 0.156164288520813
Epoch: 1859 Loss: 0.15593528747558594
Epoch: 1860 Loss: 0.15579679608345032
Epoch: 1861 Loss: 0.15568189322948456
Epoch: 1862 Loss: 0.15551228821277618
Epoch: 1863 Loss: 0.1553613543510437
Epoch: 1864 Loss: 0.155221626162529
Epoch: 1865 Loss: 0.15504679083824158
Epoch: 1866 Loss: 0.15500706434249878
Epoch: 1867 Loss: 0.15482093393802643
Epoch: 1868 Loss: 0.1546490490436554
Epoch: 1869 Loss: 0.15451423823833466
Epoch: 1870 Loss: 0.15435710549354553
Epoch: 1871 Loss: 0.1542942076921463
Epoch: 1872 Loss: 0.15419985353946686
Epoch: 1873 Loss: 0.1539996862411499
Epoch: 1874 Loss: 0.15380088984966278
Epoch: 1875 Loss: 0.15367846190929413
Epoch: 1876 Loss: 0.15357261896133423
Epoch: 1877 Loss: 0.15337646007537842
Epoch: 1878 Loss: 0.15324053168296814
Epoch: 1879 Loss: 0.15314042568206787
Epoch: 1880 Loss: 0.1530272215604782
Epoch: 1881 Loss: 0.15283553302288055
Epoch: 1882 Loss: 0.15269064903259277
Epoch: 1883 Loss: 0.15256065130233765
...

Epoch: 2081 Loss: 0.12825733423233032
Epoch: 2082 Loss: 0.12818846106529236
Epoch: 2083 Loss: 0.12801417708396912
Epoch: 2084 Loss: 0.12797139585018158
Epoch: 2085 Loss: 0.12779051065444946
Epoch: 2086 Loss: 0.12770052254199982
Epoch: 2087 Loss: 0.1275712102651596
Epoch: 2088 Loss: 0.12746454775333405
Epoch: 2089 Loss: 0.12737856805324554
Epoch: 2090 Loss: 0.12730726599693298
Epoch: 2091 Loss: 0.12720070779323578
Epoch: 2092 Loss: 0.12705501914024353
Epoch: 2093 Loss: 0.12695029377937317
Epoch: 2094 Loss: 0.1268094927072525
Epoch: 2095 Loss: 0.12675033509731293
Epoch: 2096 Loss: 0.12661710381507874
Epoch: 2097 Loss: 0.12652838230133057
Epoch: 2098 Loss: 0.12640342116355896
Epoch: 2099 Loss: 0.1263323873281479
Epoch: 2100 Loss: 0.1261814832687378
Epoch: 2101 Loss: 0.1261027753353119
Epoch: 2102 Loss: 0.12600398063659668
Epoch: 2103 Loss: 0.1259142905473709
Epoch: 2104 Loss: 0.1257588267326355
Epoch: 2105 Loss: 0.12571358680725098
Epoch: 2106 Loss: 0.12556888163089752
...

Epoch: 2309 Loss: 0.10635431855916977
Epoch: 2310 Loss: 0.10627789795398712
Epoch: 2311 Loss: 0.1062200739979744
Epoch: 2312 Loss: 0.10611367970705032
Epoch: 2313 Loss: 0.10605818033218384
Epoch: 2314 Loss: 0.10596007853746414
Epoch: 2315 Loss: 0.10584590584039688
Epoch: 2316 Loss: 0.10575413703918457
Epoch: 2317 Loss: 0.10568515956401825
Epoch: 2318 Loss: 0.10564416646957397
Epoch: 2319 Loss: 0.10549834370613098
Epoch: 2320 Loss: 0.10542198270559311
Epoch: 2321 Loss: 0.10533090680837631
Epoch: 2322 Loss: 0.10526780784130096
Epoch: 2323 Loss: 0.1051696315407753
Epoch: 2324 Loss: 0.10512673109769821
Epoch: 2325 Loss: 0.1050051599740982
Epoch: 2326 Loss: 0.10491001605987549
Epoch: 2327 Loss: 0.10482712090015411
Epoch: 2328 Loss: 0.1048087403178215
Epoch: 2329 Loss: 0.10466397553682327
Epoch: 2330 Loss: 0.10459871590137482
Epoch: 2331 Loss: 0.10452592372894287
Epoch: 2332 Loss: 0.10442373156547546
Epoch: 2333 Loss: 0.10431022942066193
Epoch: 2334 Loss: 0.10425245016813278
...

Epoch: 2532 Loss: 0.08965776860713959
Epoch: 2533 Loss: 0.08958488702774048
Epoch: 2534 Loss: 0.08950892090797424
Epoch: 2535 Loss: 0.08944398164749146
Epoch: 2536 Loss: 0.08940748125314713
Epoch: 2537 Loss: 0.08932312577962875
Epoch: 2538 Loss: 0.08927125483751297
Epoch: 2539 Loss: 0.08920497447252274
Epoch: 2540 Loss: 0.0891222283244133
Epoch: 2541 Loss: 0.08907011151313782
Epoch: 2542 Loss: 0.08897975832223892
Epoch: 2543 Loss: 0.08895270526409149
Epoch: 2544 Loss: 0.08888114243745804
Epoch: 2545 Loss: 0.08879073709249496
Epoch: 2546 Loss: 0.08875123411417007
Epoch: 2547 Loss: 0.08866669982671738
Epoch: 2548 Loss: 0.08859194815158844
Epoch: 2549 Loss: 0.08854521065950394
Epoch: 2550 Loss: 0.0884760394692421
Epoch: 2551 Loss: 0.08841866999864578
Epoch: 2552 Loss: 0.08838026970624924
Epoch: 2553 Loss: 0.08829633891582489
Epoch: 2554 Loss: 0.08821402490139008
Epoch: 2555 Loss: 0.0881546214222908
Epoch: 2556 Loss: 0.08810310810804367
Epoch: 2557 Loss: 0.08803004771471024
...

Epoch: 2756 Loss: 0.07641199231147766
Epoch: 2757 Loss: 0.07635236531496048
Epoch: 2758 Loss: 0.07631434500217438
Epoch: 2759 Loss: 0.07624653726816177
Epoch: 2760 Loss: 0.07621654123067856
Epoch: 2761 Loss: 0.07614972442388535
Epoch: 2762 Loss: 0.0760897770524025
Epoch: 2763 Loss: 0.07604309171438217
Epoch: 2764 Loss: 0.07597586512565613
Epoch: 2765 Loss: 0.07591654360294342
Epoch: 2766 Loss: 0.0758838951587677
Epoch: 2767 Loss: 0.07583601027727127
Epoch: 2768 Loss: 0.07578863948583603
Epoch: 2769 Loss: 0.0757209062576294
Epoch: 2770 Loss: 0.07567284256219864
Epoch: 2771 Loss: 0.07564065605401993
Epoch: 2772 Loss: 0.07555995136499405
Epoch: 2773 Loss: 0.075534887611866
Epoch: 2774 Loss: 0.07548024505376816
Epoch: 2775 Loss: 0.07541678100824356
Epoch: 2776 Loss: 0.07537656277418137
Epoch: 2777 Loss: 0.07531898468732834
Epoch: 2778 Loss: 0.0752682015299797
Epoch: 2779 Loss: 0.07521532475948334
Epoch: 2780 Loss: 0.07514309138059616
Epoch: 2781 Loss: 0.07512927055358887
...

Epoch: 2973 Loss: 0.0661151260137558
Epoch: 2974 Loss: 0.06605637818574905
Epoch: 2975 Loss: 0.06603222340345383
Epoch: 2976 Loss: 0.06598292291164398
Epoch: 2977 Loss: 0.06592470407485962
Epoch: 2978 Loss: 0.065880186855793
Epoch: 2979 Loss: 0.06584863364696503
Epoch: 2980 Loss: 0.06578622758388519
Epoch: 2981 Loss: 0.06577885150909424
Epoch: 2982 Loss: 0.06572796404361725
Epoch: 2983 Loss: 0.06568175554275513
Epoch: 2984 Loss: 0.06564335525035858
Epoch: 2985 Loss: 0.0656001940369606
Epoch: 2986 Loss: 0.06555265933275223
Epoch: 2987 Loss: 0.06549403071403503
Epoch: 2988 Loss: 0.06546322256326675
Epoch: 2989 Loss: 0.06543108075857162
Epoch: 2990 Loss: 0.06537466496229172
Epoch: 2991 Loss: 0.06533551216125488
Epoch: 2992 Loss: 0.06528854370117188
Epoch: 2993 Loss: 0.06525414437055588
Epoch: 2994 Loss: 0.0652119442820549
Epoch: 2995 Loss: 0.06515561044216156
Epoch: 2996 Loss: 0.06514392793178558
Epoch: 2997 Loss: 0.06507296860218048
Epoch: 2998 Loss: 0.06502778828144073
...

Epoch: 3204 Loss: 0.057099513709545135
Epoch: 3205 Loss: 0.05705663934350014
Epoch: 3206 Loss: 0.05702454224228859
Epoch: 3207 Loss: 0.0569855272769928
Epoch: 3208 Loss: 0.05694843456149101
Epoch: 3209 Loss: 0.056921299546957016
Epoch: 3210 Loss: 0.05686867609620094
Epoch: 3211 Loss: 0.056851211935281754
Epoch: 3212 Loss: 0.05682993307709694
Epoch: 3213 Loss: 0.0567803718149662
Epoch: 3214 Loss: 0.056746453046798706
Epoch: 3215 Loss: 0.056712135672569275
Epoch: 3216 Loss: 0.05667305737733841
Epoch: 3217 Loss: 0.05665391683578491
Epoch: 3218 Loss: 0.05660943686962128
Epoch: 3219 Loss: 0.05656648427248001
Epoch: 3220 Loss: 0.05653801187872887
Epoch: 3221 Loss: 0.056514766067266464
Epoch: 3222 Loss: 0.05647660419344902
Epoch: 3223 Loss: 0.056447602808475494
Epoch: 3224 Loss: 0.056408192962408066
Epoch: 3225 Loss: 0.05637143552303314
Epoch: 3226 Loss: 0.056324366480112076
Epoch: 3227 Loss: 0.05629121884703636
Epoch: 3228 Loss: 0.056273914873600006
Epoch: 3229 Loss: 0.056226570159196854
...

Epoch: 3438 Loss: 0.04972902685403824
Epoch: 3439 Loss: 0.04968401789665222
Epoch: 3440 Loss: 0.04966447129845619
Epoch: 3441 Loss: 0.04962913691997528
Epoch: 3442 Loss: 0.04960682615637779
Epoch: 3443 Loss: 0.04958174005150795
Epoch: 3444 Loss: 0.049542076885700226
Epoch: 3445 Loss: 0.0495280958712101
Epoch: 3446 Loss: 0.04948640614748001
Epoch: 3447 Loss: 0.04945332929491997
Epoch: 3448 Loss: 0.049429308623075485
Epoch: 3449 Loss: 0.04941916465759277
Epoch: 3450 Loss: 0.049364786595106125
Epoch: 3451 Loss: 0.04935835301876068
Epoch: 3452 Loss: 0.04931316524744034
Epoch: 3453 Loss: 0.0493045374751091
Epoch: 3454 Loss: 0.04925943911075592
Epoch: 3455 Loss: 0.049227602779865265
Epoch: 3456 Loss: 0.04921126738190651
Epoch: 3457 Loss: 0.049165982753038406
Epoch: 3458 Loss: 0.04913931339979172
Epoch: 3459 Loss: 0.04913387447595596
Epoch: 3460 Loss: 0.04909393936395645
Epoch: 3461 Loss: 0.04905848950147629
Epoch: 3462 Loss: 0.04904885217547417
Epoch: 3463 Loss: 0.04900979995727539
...

Epoch: 3666 Loss: 0.04376773163676262
Epoch: 3667 Loss: 0.043767962604761124
Epoch: 3668 Loss: 0.043732307851314545
Epoch: 3669 Loss: 0.043704573065042496
Epoch: 3670 Loss: 0.043701864778995514
Epoch: 3671 Loss: 0.043664149940013885
Epoch: 3672 Loss: 0.043649401515722275
Epoch: 3673 Loss: 0.04361627250909805
Epoch: 3674 Loss: 0.04360126703977585
Epoch: 3675 Loss: 0.04358243942260742
Epoch: 3676 Loss: 0.04354444518685341
Epoch: 3677 Loss: 0.04351876303553581
Epoch: 3678 Loss: 0.04350987449288368
Epoch: 3679 Loss: 0.043476492166519165
Epoch: 3680 Loss: 0.043454769998788834
Epoch: 3681 Loss: 0.0434298999607563
Epoch: 3682 Loss: 0.043408408761024475
Epoch: 3683 Loss: 0.043381329625844955
Epoch: 3684 Loss: 0.043363310396671295
Epoch: 3685 Loss: 0.043330591171979904
Epoch: 3686 Loss: 0.043312061578035355
Epoch: 3687 Loss: 0.043288927525281906
Epoch: 3688 Loss: 0.04326922073960304
Epoch: 3689 Loss: 0.04324107989668846
Epoch: 3690 Loss: 0.04321448504924774
Epoch: 3691 Loss: 0.04319071397185325
...

Epoch: 3904 Loss: 0.038713887333869934
Epoch: 3905 Loss: 0.03869679570198059
Epoch: 3906 Loss: 0.0386839359998703
Epoch: 3907 Loss: 0.03865590691566467
Epoch: 3908 Loss: 0.03864665329456329
Epoch: 3909 Loss: 0.03862075135111809
Epoch: 3910 Loss: 0.03860606625676155
Epoch: 3911 Loss: 0.03858053684234619
Epoch: 3912 Loss: 0.03856012597680092
Epoch: 3913 Loss: 0.03854262828826904
Epoch: 3914 Loss: 0.038519348949193954
Epoch: 3915 Loss: 0.038514748215675354
Epoch: 3916 Loss: 0.038490179926157
Epoch: 3917 Loss: 0.03847292810678482
Epoch: 3918 Loss: 0.03844809904694557
Epoch: 3919 Loss: 0.03841843083500862
Epoch: 3920 Loss: 0.03840336576104164
Epoch: 3921 Loss: 0.03839576616883278
Epoch: 3922 Loss: 0.03838198259472847
Epoch: 3923 Loss: 0.038353074342012405
Epoch: 3924 Loss: 0.038341592997312546
Epoch: 3925 Loss: 0.03830612450838089
Epoch: 3926 Loss: 0.038292162120342255
Epoch: 3927 Loss: 0.038285087794065475
Epoch: 3928 Loss: 0.03825238719582558
Epoch: 3929 Loss: 0.0382419154047966
...

Epoch: 4129 Loss: 0.03473583981394768
Epoch: 4130 Loss: 0.034722357988357544
Epoch: 4131 Loss: 0.034700121730566025
Epoch: 4132 Loss: 0.03468519076704979
Epoch: 4133 Loss: 0.03466663509607315
Epoch: 4134 Loss: 0.034650951623916626
Epoch: 4135 Loss: 0.03463497757911682
Epoch: 4136 Loss: 0.03462105616927147
Epoch: 4137 Loss: 0.03460134565830231
Epoch: 4138 Loss: 0.03458566591143608
Epoch: 4139 Loss: 0.03457009047269821
Epoch: 4140 Loss: 0.03454797342419624
Epoch: 4141 Loss: 0.034547679126262665
Epoch: 4142 Loss: 0.034518588334321976
Epoch: 4143 Loss: 0.03450397402048111
Epoch: 4144 Loss: 0.03449957072734833
Epoch: 4145 Loss: 0.034480880945920944
Epoch: 4146 Loss: 0.034459829330444336
Epoch: 4147 Loss: 0.03444383293390274
Epoch: 4148 Loss: 0.03442073613405228
Epoch: 4149 Loss: 0.0344105139374733
Epoch: 4150 Loss: 0.03439570218324661
Epoch: 4151 Loss: 0.03437759727239609
Epoch: 4152 Loss: 0.03435990586876869
Epoch: 4153 Loss: 0.03435637801885605
Epoch: 4154 Loss: 0.03432369977235794
...

Epoch: 4352 Loss: 0.031363826245069504
Epoch: 4353 Loss: 0.03134820610284805
Epoch: 4354 Loss: 0.03133142739534378
Epoch: 4355 Loss: 0.03133188933134079
Epoch: 4356 Loss: 0.031316835433244705
Epoch: 4357 Loss: 0.03129920735955238
Epoch: 4358 Loss: 0.03127496317028999
Epoch: 4359 Loss: 0.031259685754776
Epoch: 4360 Loss: 0.031252339482307434
Epoch: 4361 Loss: 0.031237900257110596
Epoch: 4362 Loss: 0.03122837468981743
Epoch: 4363 Loss: 0.031208574771881104
Epoch: 4364 Loss: 0.031187286600470543
Epoch: 4365 Loss: 0.031178778037428856
Epoch: 4366 Loss: 0.03117072768509388
Epoch: 4367 Loss: 0.031152039766311646
Epoch: 4368 Loss: 0.031138218939304352
Epoch: 4369 Loss: 0.031134096905589104
Epoch: 4370 Loss: 0.031111259013414383
Epoch: 4371 Loss: 0.031098784878849983
Epoch: 4372 Loss: 0.0310868788510561
Epoch: 4373 Loss: 0.031066812574863434
Epoch: 4374 Loss: 0.031056486070156097
Epoch: 4375 Loss: 0.031046053394675255
Epoch: 4376 Loss: 0.03102598339319229
Epoch: 4377 Loss: 0.031015977263450623
...

Epoch: 4577 Loss: 0.028529681265354156
Epoch: 4578 Loss: 0.028514700010418892
Epoch: 4579 Loss: 0.028507934883236885
Epoch: 4580 Loss: 0.02849937230348587
Epoch: 4581 Loss: 0.02848679944872856
Epoch: 4582 Loss: 0.028472110629081726
Epoch: 4583 Loss: 0.028465550392866135
Epoch: 4584 Loss: 0.02844778448343277
Epoch: 4585 Loss: 0.028438014909625053
Epoch: 4586 Loss: 0.028423886746168137
Epoch: 4587 Loss: 0.02841753140091896
Epoch: 4588 Loss: 0.028403889387845993
Epoch: 4589 Loss: 0.02838982455432415
Epoch: 4590 Loss: 0.028382347896695137
Epoch: 4591 Loss: 0.028370928019285202
Epoch: 4592 Loss: 0.02835916355252266
Epoch: 4593 Loss: 0.028345324099063873
Epoch: 4594 Loss: 0.028337283059954643
Epoch: 4595 Loss: 0.02832568995654583
Epoch: 4596 Loss: 0.028311364352703094
Epoch: 4597 Loss: 0.02829914353787899
Epoch: 4598 Loss: 0.0282914862036705
Epoch: 4599 Loss: 0.02827683836221695
Epoch: 4600 Loss: 0.0282681193202734
Epoch: 4601 Loss: 0.02825533226132393
Epoch: 4602 Loss: 0.028246304020285606
...

Epoch: 4808 Loss: 0.02607090212404728
Epoch: 4809 Loss: 0.026051495224237442
Epoch: 4810 Loss: 0.02605137974023819
Epoch: 4811 Loss: 0.026030579581856728
Epoch: 4812 Loss: 0.026031645014882088
Epoch: 4813 Loss: 0.026021406054496765
Epoch: 4814 Loss: 0.026010332629084587
Epoch: 4815 Loss: 0.025994736701250076
Epoch: 4816 Loss: 0.025987571105360985
Epoch: 4817 Loss: 0.025974757969379425
Epoch: 4818 Loss: 0.02596883848309517
Epoch: 4819 Loss: 0.025952691212296486
Epoch: 4820 Loss: 0.0259531382471323
Epoch: 4821 Loss: 0.025933805853128433
Epoch: 4822 Loss: 0.025934631004929543
Epoch: 4823 Loss: 0.025914089754223824
Epoch: 4824 Loss: 0.025909457355737686
Epoch: 4825 Loss: 0.02589590661227703
Epoch: 4826 Loss: 0.025888586416840553
Epoch: 4827 Loss: 0.025887634605169296
Epoch: 4828 Loss: 0.025861568748950958
Epoch: 4829 Loss: 0.025859002023935318
Epoch: 4830 Loss: 0.025849809870123863
Epoch: 4831 Loss: 0.025840910151600838
Epoch: 4832 Loss: 0.025829870253801346
Epoch: 4833 Loss: 0.02582388930
...

Epoch: 5034 Loss: 0.02397695928812027
Epoch: 5035 Loss: 0.02396923489868641
Epoch: 5036 Loss: 0.023949315771460533
Epoch: 5037 Loss: 0.023949217051267624
Epoch: 5038 Loss: 0.02393459901213646
Epoch: 5039 Loss: 0.023932959884405136
Epoch: 5040 Loss: 0.023921245709061623
Epoch: 5041 Loss: 0.023907942697405815
Epoch: 5042 Loss: 0.023897932842373848
Epoch: 5043 Loss: 0.023899398744106293
Epoch: 5044 Loss: 0.02388417162001133
Epoch: 5045 Loss: 0.02387632429599762
Epoch: 5046 Loss: 0.02386884018778801
Epoch: 5047 Loss: 0.023861486464738846
Epoch: 5048 Loss: 0.023848358541727066
Epoch: 5049 Loss: 0.023842254653573036
Epoch: 5050 Loss: 0.023834964260458946
Epoch: 5051 Loss: 0.023824218660593033
Epoch: 5052 Loss: 0.023818112909793854
Epoch: 5053 Loss: 0.023808782920241356
Epoch: 5054 Loss: 0.02379736490547657
Epoch: 5055 Loss: 0.023795384913682938
Epoch: 5056 Loss: 0.023776737973093987
Epoch: 5057 Loss: 0.023774776607751846
Epoch: 5058 Loss: 0.023766379803419113
Epoch: 5059 Loss: 0.023759689182
...

Epoch: 5255 Loss: 0.022148430347442627
Epoch: 5256 Loss: 0.02214151993393898
Epoch: 5257 Loss: 0.022132398560643196
Epoch: 5258 Loss: 0.02212686836719513
Epoch: 5259 Loss: 0.02211741916835308
Epoch: 5260 Loss: 0.022106045857071877
Epoch: 5261 Loss: 0.022103583440184593
Epoch: 5262 Loss: 0.02209632657468319
Epoch: 5263 Loss: 0.022085193544626236
Epoch: 5264 Loss: 0.022080112248659134
Epoch: 5265 Loss: 0.022072449326515198
Epoch: 5266 Loss: 0.022066233679652214
Epoch: 5267 Loss: 0.022051241248846054
Epoch: 5268 Loss: 0.0220442283898592
Epoch: 5269 Loss: 0.022041432559490204
Epoch: 5270 Loss: 0.022031212225556374
Epoch: 5271 Loss: 0.022021254524588585
Epoch: 5272 Loss: 0.022020600736141205
Epoch: 5273 Loss: 0.022006554529070854
Epoch: 5274 Loss: 0.021998336538672447
Epoch: 5275 Loss: 0.02199378050863743
Epoch: 5276 Loss: 0.0219804048538208
Epoch: 5277 Loss: 0.021977029740810394
Epoch: 5278 Loss: 0.021973593160510063
Epoch: 5279 Loss: 0.021962404251098633
Epoch: 5280 Loss: 0.02194722183048
...

Epoch: 5483 Loss: 0.02048133686184883
Epoch: 5484 Loss: 0.020479848608374596
Epoch: 5485 Loss: 0.020469829440116882
Epoch: 5486 Loss: 0.020463094115257263
Epoch: 5487 Loss: 0.020460182800889015
Epoch: 5488 Loss: 0.02045336365699768
Epoch: 5489 Loss: 0.020442111417651176
Epoch: 5490 Loss: 0.020433766767382622
Epoch: 5491 Loss: 0.020427323877811432
Epoch: 5492 Loss: 0.02042117714881897
Epoch: 5493 Loss: 0.020420633256435394
Epoch: 5494 Loss: 0.020416755229234695
Epoch: 5495 Loss: 0.020401567220687866
Epoch: 5496 Loss: 0.020393652841448784
Epoch: 5497 Loss: 0.020389240235090256
Epoch: 5498 Loss: 0.020380692556500435
Epoch: 5499 Loss: 0.020377619192004204
Epoch: 5500 Loss: 0.02036837674677372
Epoch: 5501 Loss: 0.02035798877477646
Epoch: 5502 Loss: 0.02035754919052124
Epoch: 5503 Loss: 0.020345745608210564
Epoch: 5504 Loss: 0.020342860370874405
Epoch: 5505 Loss: 0.02033642865717411
Epoch: 5506 Loss: 0.020327650010585785
Epoch: 5507 Loss: 0.02031884528696537
Epoch: 5508 Loss: 0.0203160401433
...

Epoch: 5717 Loss: 0.018956413492560387
Epoch: 5718 Loss: 0.018955925479531288
Epoch: 5719 Loss: 0.018942464143037796
Epoch: 5720 Loss: 0.018935270607471466
Epoch: 5721 Loss: 0.018928833305835724
Epoch: 5722 Loss: 0.01892746239900589
Epoch: 5723 Loss: 0.018919816240668297
Epoch: 5724 Loss: 0.018914679065346718
Epoch: 5725 Loss: 0.01890747807919979
Epoch: 5726 Loss: 0.018900062888860703
Epoch: 5727 Loss: 0.01889416202902794
Epoch: 5728 Loss: 0.01888985000550747
Epoch: 5729 Loss: 0.018881354480981827
Epoch: 5730 Loss: 0.018877429887652397
Epoch: 5731 Loss: 0.018869245424866676
Epoch: 5732 Loss: 0.01886783167719841
Epoch: 5733 Loss: 0.018858108669519424
Epoch: 5734 Loss: 0.0188504196703434
Epoch: 5735 Loss: 0.01884775608778
Epoch: 5736 Loss: 0.01883798837661743
Epoch: 5737 Loss: 0.018836477771401405
Epoch: 5738 Loss: 0.018830038607120514
Epoch: 5739 Loss: 0.018821539357304573
Epoch: 5740 Loss: 0.018818527460098267
Epoch: 5741 Loss: 0.018811484798789024
Epoch: 5742 Loss: 0.01879990473389625
...

Epoch: 5950 Loss: 0.017593665048480034
Epoch: 5951 Loss: 0.01758550852537155
Epoch: 5952 Loss: 0.017576847225427628
Epoch: 5953 Loss: 0.0175747312605381
Epoch: 5954 Loss: 0.01757187582552433
Epoch: 5955 Loss: 0.017561348155140877
Epoch: 5956 Loss: 0.01755770854651928
Epoch: 5957 Loss: 0.017556486651301384
Epoch: 5958 Loss: 0.017544711008667946
Epoch: 5959 Loss: 0.01754188723862171
Epoch: 5960 Loss: 0.0175387691706419
Epoch: 5961 Loss: 0.017528461292386055
Epoch: 5962 Loss: 0.017524216324090958
Epoch: 5963 Loss: 0.01752258650958538
Epoch: 5964 Loss: 0.0175174567848444
Epoch: 5965 Loss: 0.017511051148176193
Epoch: 5966 Loss: 0.017500391229987144
Epoch: 5967 Loss: 0.0174979567527771
Epoch: 5968 Loss: 0.01749085634946823
Epoch: 5969 Loss: 0.017486685886979103
Epoch: 5970 Loss: 0.017482252791523933
Epoch: 5971 Loss: 0.017472999170422554
Epoch: 5972 Loss: 0.017471566796302795
Epoch: 5973 Loss: 0.01746593974530697
Epoch: 5974 Loss: 0.01745922863483429
Epoch: 5975 Loss: 0.017457175999879837
...

Epoch: 6179 Loss: 0.01642250083386898
Epoch: 6180 Loss: 0.016412528231739998
Epoch: 6181 Loss: 0.016408603638410568
Epoch: 6182 Loss: 0.01640520617365837
Epoch: 6183 Loss: 0.016399944201111794
Epoch: 6184 Loss: 0.01639249548316002
Epoch: 6185 Loss: 0.01638704538345337
Epoch: 6186 Loss: 0.016383223235607147
Epoch: 6187 Loss: 0.016383487731218338
Epoch: 6188 Loss: 0.01637176051735878
Epoch: 6189 Loss: 0.016372932121157646
Epoch: 6190 Loss: 0.01636667177081108
Epoch: 6191 Loss: 0.01636076718568802
Epoch: 6192 Loss: 0.016355814412236214
Epoch: 6193 Loss: 0.016352394595742226
Epoch: 6194 Loss: 0.01634649932384491
Epoch: 6195 Loss: 0.016342099756002426
Epoch: 6196 Loss: 0.016337210312485695
Epoch: 6197 Loss: 0.01633545197546482
Epoch: 6198 Loss: 0.016331151127815247
Epoch: 6199 Loss: 0.016321023926138878
Epoch: 6200 Loss: 0.01631770096719265
Epoch: 6201 Loss: 0.016314757987856865
Epoch: 6202 Loss: 0.016309179365634918
Epoch: 6203 Loss: 0.016303323209285736
Epoch: 6204 Loss: 0.016300600022077
...

Epoch: 6400 Loss: 0.015415714122354984
Epoch: 6401 Loss: 0.015412470325827599
Epoch: 6402 Loss: 0.015405122190713882
Epoch: 6403 Loss: 0.015403133817017078
Epoch: 6404 Loss: 0.015398751012980938
Epoch: 6405 Loss: 0.015394831076264381
Epoch: 6406 Loss: 0.01538842637091875
Epoch: 6407 Loss: 0.01538430992513895
Epoch: 6408 Loss: 0.015380188822746277
Epoch: 6409 Loss: 0.01537459995597601
Epoch: 6410 Loss: 0.015375196002423763
Epoch: 6411 Loss: 0.015370507724583149
Epoch: 6412 Loss: 0.015362346544861794
Epoch: 6413 Loss: 0.01535678468644619
Epoch: 6414 Loss: 0.015357297845184803
Epoch: 6415 Loss: 0.015347093343734741
Epoch: 6416 Loss: 0.015346686355769634
Epoch: 6417 Loss: 0.015339935198426247
Epoch: 6418 Loss: 0.015340182930231094
Epoch: 6419 Loss: 0.015332185663282871
Epoch: 6420 Loss: 0.01532994955778122
Epoch: 6421 Loss: 0.01532320398837328
Epoch: 6422 Loss: 0.015321504324674606
Epoch: 6423 Loss: 0.015316342934966087
Epoch: 6424 Loss: 0.01531212218105793
Epoch: 6425 Loss: 0.015306487679
...

Epoch: 6629 Loss: 0.014487648382782936
Epoch: 6630 Loss: 0.014485967345535755
Epoch: 6631 Loss: 0.014481908641755581
Epoch: 6632 Loss: 0.014473742805421352
Epoch: 6633 Loss: 0.014475394040346146
Epoch: 6634 Loss: 0.014466485008597374
Epoch: 6635 Loss: 0.014464492909610271
Epoch: 6636 Loss: 0.014461004175245762
Epoch: 6637 Loss: 0.014458070509135723
Epoch: 6638 Loss: 0.014454691670835018
Epoch: 6639 Loss: 0.014447701163589954
Epoch: 6640 Loss: 0.014444991946220398
Epoch: 6641 Loss: 0.014442035928368568
Epoch: 6642 Loss: 0.014439305290579796
Epoch: 6643 Loss: 0.014435014687478542
Epoch: 6644 Loss: 0.014429586008191109
Epoch: 6645 Loss: 0.014426223933696747
Epoch: 6646 Loss: 0.01442280225455761
Epoch: 6647 Loss: 0.014420021325349808
Epoch: 6648 Loss: 0.01441620010882616
Epoch: 6649 Loss: 0.014412019401788712
Epoch: 6650 Loss: 0.014409671537578106
Epoch: 6651 Loss: 0.014404091984033585
Epoch: 6652 Loss: 0.01440172828733921
Epoch: 6653 Loss: 0.01439814455807209
Epoch: 6654 Loss: 0.014393288
...

Epoch: 6845 Loss: 0.01369981374591589
Epoch: 6846 Loss: 0.013696126639842987
Epoch: 6847 Loss: 0.013692905195057392
Epoch: 6848 Loss: 0.013689360581338406
Epoch: 6849 Loss: 0.013685118407011032
Epoch: 6850 Loss: 0.013681337237358093
Epoch: 6851 Loss: 0.013677307404577732
Epoch: 6852 Loss: 0.013673852197825909
Epoch: 6853 Loss: 0.013672346249222755
Epoch: 6854 Loss: 0.013665148988366127
Epoch: 6855 Loss: 0.013663733378052711
Epoch: 6856 Loss: 0.013660939410328865
Epoch: 6857 Loss: 0.013656257651746273
Epoch: 6858 Loss: 0.01365577895194292
Epoch: 6859 Loss: 0.013650212436914444
Epoch: 6860 Loss: 0.013645926490426064
Epoch: 6861 Loss: 0.013642681762576103
Epoch: 6862 Loss: 0.013640155084431171
Epoch: 6863 Loss: 0.013638319447636604
Epoch: 6864 Loss: 0.013630284927785397
Epoch: 6865 Loss: 0.013629467226564884
Epoch: 6866 Loss: 0.013624457642436028
Epoch: 6867 Loss: 0.01362399198114872
Epoch: 6868 Loss: 0.013618320226669312
Epoch: 6869 Loss: 0.01361860241740942
Epoch: 6870 Loss: 0.013610274
...

Epoch: 7072 Loss: 0.01294751837849617
Epoch: 7073 Loss: 0.012944255955517292
Epoch: 7074 Loss: 0.012940595857799053
Epoch: 7075 Loss: 0.01293620839715004
Epoch: 7076 Loss: 0.012933547608554363
Epoch: 7077 Loss: 0.012932094745337963
Epoch: 7078 Loss: 0.012929502874612808
Epoch: 7079 Loss: 0.012925457209348679
Epoch: 7080 Loss: 0.012920877896249294
Epoch: 7081 Loss: 0.012919608503580093
Epoch: 7082 Loss: 0.012917116284370422
Epoch: 7083 Loss: 0.012912253849208355
Epoch: 7084 Loss: 0.012908185832202435
Epoch: 7085 Loss: 0.012908181175589561
Epoch: 7086 Loss: 0.012902643531560898
Epoch: 7087 Loss: 0.012898859567940235
Epoch: 7088 Loss: 0.012898066081106663
Epoch: 7089 Loss: 0.012896218337118626
Epoch: 7090 Loss: 0.012890823185443878
Epoch: 7091 Loss: 0.012888597324490547
Epoch: 7092 Loss: 0.012882929295301437
Epoch: 7093 Loss: 0.012880479916930199
Epoch: 7094 Loss: 0.012877698987722397
Epoch: 7095 Loss: 0.01287378091365099
Epoch: 7096 Loss: 0.012872216291725636
Epoch: 7097 Loss: 0.01286774
...

Epoch: 7310 Loss: 0.012231146916747093
Epoch: 7311 Loss: 0.012230314314365387
Epoch: 7312 Loss: 0.012225964106619358
Epoch: 7313 Loss: 0.012221310287714005
Epoch: 7314 Loss: 0.012221097014844418
Epoch: 7315 Loss: 0.012216274626553059
Epoch: 7316 Loss: 0.012213509529829025
Epoch: 7317 Loss: 0.012212340719997883
Epoch: 7318 Loss: 0.012207996100187302
Epoch: 7319 Loss: 0.012205880135297775
Epoch: 7320 Loss: 0.012204179540276527
Epoch: 7321 Loss: 0.01219930499792099
Epoch: 7322 Loss: 0.012195720337331295
Epoch: 7323 Loss: 0.012197108939290047
Epoch: 7324 Loss: 0.012191565707325935
Epoch: 7325 Loss: 0.01219109259545803
Epoch: 7326 Loss: 0.012184930965304375
Epoch: 7327 Loss: 0.012182972393929958
Epoch: 7328 Loss: 0.012181253172457218
Epoch: 7329 Loss: 0.01217818446457386
Epoch: 7330 Loss: 0.012174992822110653
Epoch: 7331 Loss: 0.01217144075781107
Epoch: 7332 Loss: 0.012170219793915749
Epoch: 7333 Loss: 0.012165592052042484
Epoch: 7334 Loss: 0.012162290513515472
Epoch: 7335 Loss: 0.012159391
...

Epoch: 7541 Loss: 0.011595540679991245
Epoch: 7542 Loss: 0.011592417024075985
Epoch: 7543 Loss: 0.011590666137635708
Epoch: 7544 Loss: 0.011588482186198235
Epoch: 7545 Loss: 0.011585376225411892
Epoch: 7546 Loss: 0.011584101244807243
Epoch: 7547 Loss: 0.011579516343772411
Epoch: 7548 Loss: 0.011577914468944073
Epoch: 7549 Loss: 0.011574173346161842
Epoch: 7550 Loss: 0.01157414447516203
Epoch: 7551 Loss: 0.011568684130907059
Epoch: 7552 Loss: 0.011565963737666607
Epoch: 7553 Loss: 0.011564604006707668
Epoch: 7554 Loss: 0.011562415398657322
Epoch: 7555 Loss: 0.011559687554836273
Epoch: 7556 Loss: 0.011556931771337986
Epoch: 7557 Loss: 0.011552597396075726
Epoch: 7558 Loss: 0.011551612056791782
Epoch: 7559 Loss: 0.01155004370957613
Epoch: 7560 Loss: 0.011547400616109371
Epoch: 7561 Loss: 0.011543311178684235
Epoch: 7562 Loss: 0.011540478095412254
Epoch: 7563 Loss: 0.011539435014128685
Epoch: 7564 Loss: 0.011536804959177971
Epoch: 7565 Loss: 0.011534519493579865
Epoch: 7566 Loss: 0.0115313

Epoch: 7771 Loss: 0.011019651778042316
Epoch: 7772 Loss: 0.011016818694770336
Epoch: 7773 Loss: 0.011015494354069233
Epoch: 7774 Loss: 0.011012539267539978
Epoch: 7775 Loss: 0.011010119691491127
Epoch: 7776 Loss: 0.011006680317223072
Epoch: 7777 Loss: 0.01100494246929884
Epoch: 7778 Loss: 0.011003921739757061
Epoch: 7779 Loss: 0.011000782251358032
Epoch: 7780 Loss: 0.010997533798217773
Epoch: 7781 Loss: 0.0109955919906497
Epoch: 7782 Loss: 0.01099249068647623
Epoch: 7783 Loss: 0.01099163293838501
Epoch: 7784 Loss: 0.01098641287535429
Epoch: 7785 Loss: 0.010986034758388996
Epoch: 7786 Loss: 0.010983796790242195
Epoch: 7787 Loss: 0.010981258936226368
Epoch: 7788 Loss: 0.010978445410728455
Epoch: 7789 Loss: 0.010977884754538536
Epoch: 7790 Loss: 0.010974627919495106
Epoch: 7791 Loss: 0.010972419753670692
Epoch: 7792 Loss: 0.010969694703817368
Epoch: 7793 Loss: 0.010966474190354347
Epoch: 7794 Loss: 0.010964677669107914
Epoch: 7795 Loss: 0.010960484854876995
Epoch: 7796 Loss: 0.01096033863

Epoch: 7993 Loss: 0.010509785264730453
Epoch: 7994 Loss: 0.010506180115044117
Epoch: 7995 Loss: 0.010503776371479034
Epoch: 7996 Loss: 0.010500836186110973
Epoch: 7997 Loss: 0.010498663410544395
Epoch: 7998 Loss: 0.010499415919184685
Epoch: 7999 Loss: 0.01049495767802
Epoch: 8000 Loss: 0.010494168847799301
Epoch: 8001 Loss: 0.01049016509205103
Epoch: 8002 Loss: 0.010490002110600471
Epoch: 8003 Loss: 0.010486410930752754
Epoch: 8004 Loss: 0.01048491895198822
Epoch: 8005 Loss: 0.01048252359032631
Epoch: 8006 Loss: 0.010479881428182125
Epoch: 8007 Loss: 0.010476871393620968
Epoch: 8008 Loss: 0.0104751568287611
Epoch: 8009 Loss: 0.010473290458321571
Epoch: 8010 Loss: 0.010471167974174023
Epoch: 8011 Loss: 0.010468568652868271
Epoch: 8012 Loss: 0.010467051528394222
Epoch: 8013 Loss: 0.010464155115187168
Epoch: 8014 Loss: 0.010463197715580463
Epoch: 8015 Loss: 0.010462307371199131
Epoch: 8016 Loss: 0.010459371842443943
Epoch: 8017 Loss: 0.010456395335495472
Epoch: 8018 Loss: 0.01045229006558

Epoch: 8205 Loss: 0.010057949461042881
Epoch: 8206 Loss: 0.01005544513463974
Epoch: 8207 Loss: 0.010052613914012909
Epoch: 8208 Loss: 0.01005151029676199
Epoch: 8209 Loss: 0.010048555210232735
Epoch: 8210 Loss: 0.010048131458461285
Epoch: 8211 Loss: 0.010046523995697498
Epoch: 8212 Loss: 0.010045765899121761
Epoch: 8213 Loss: 0.010041316039860249
Epoch: 8214 Loss: 0.010042140260338783
Epoch: 8215 Loss: 0.010036311112344265
Epoch: 8216 Loss: 0.010036317631602287
Epoch: 8217 Loss: 0.010034495033323765
Epoch: 8218 Loss: 0.010031054727733135
Epoch: 8219 Loss: 0.01002955250442028
Epoch: 8220 Loss: 0.010027904994785786
Epoch: 8221 Loss: 0.010024845600128174
Epoch: 8222 Loss: 0.01002268586307764
Epoch: 8223 Loss: 0.010020916350185871
Epoch: 8224 Loss: 0.01001945324242115
Epoch: 8225 Loss: 0.010016652755439281
Epoch: 8226 Loss: 0.010015548206865788
Epoch: 8227 Loss: 0.010011912323534489
Epoch: 8228 Loss: 0.01001105085015297
Epoch: 8229 Loss: 0.010009578429162502
Epoch: 8230 Loss: 0.01000553555

Epoch: 8429 Loss: 0.009617012925446033
Epoch: 8430 Loss: 0.009613950736820698
Epoch: 8431 Loss: 0.009613989852368832
Epoch: 8432 Loss: 0.009611603803932667
Epoch: 8433 Loss: 0.009610146284103394
Epoch: 8434 Loss: 0.00960659421980381
Epoch: 8435 Loss: 0.009605593979358673
Epoch: 8436 Loss: 0.009603729471564293
Epoch: 8437 Loss: 0.009601613506674767
Epoch: 8438 Loss: 0.009598791599273682
Epoch: 8439 Loss: 0.009598573669791222
Epoch: 8440 Loss: 0.009595540352165699
Epoch: 8441 Loss: 0.009594567120075226
Epoch: 8442 Loss: 0.009592490270733833
Epoch: 8443 Loss: 0.009589998051524162
Epoch: 8444 Loss: 0.009588584303855896
Epoch: 8445 Loss: 0.009585108608007431
Epoch: 8446 Loss: 0.009585058316588402
Epoch: 8447 Loss: 0.009583408944308758
Epoch: 8448 Loss: 0.009579849429428577
Epoch: 8449 Loss: 0.009580076672136784
Epoch: 8450 Loss: 0.009576307609677315
Epoch: 8451 Loss: 0.009575498290359974
Epoch: 8452 Loss: 0.009574197232723236
Epoch: 8453 Loss: 0.009571926668286324
Epoch: 8454 Loss: 0.009569

Epoch: 8653 Loss: 0.009207912720739841
Epoch: 8654 Loss: 0.00920654833316803
Epoch: 8655 Loss: 0.009204856120049953
Epoch: 8656 Loss: 0.009202128276228905
Epoch: 8657 Loss: 0.009202505461871624
Epoch: 8658 Loss: 0.009200015105307102
Epoch: 8659 Loss: 0.009197774343192577
Epoch: 8660 Loss: 0.009195718914270401
Epoch: 8661 Loss: 0.009194464422762394
Epoch: 8662 Loss: 0.009193490259349346
Epoch: 8663 Loss: 0.009191332384943962
Epoch: 8664 Loss: 0.009188036434352398
Epoch: 8665 Loss: 0.009186794981360435
Epoch: 8666 Loss: 0.009186714887619019
Epoch: 8667 Loss: 0.00918347667902708
Epoch: 8668 Loss: 0.00918248388916254
Epoch: 8669 Loss: 0.009180149994790554
Epoch: 8670 Loss: 0.009177717380225658
Epoch: 8671 Loss: 0.009176522493362427
Epoch: 8672 Loss: 0.009175295010209084
Epoch: 8673 Loss: 0.009172996506094933
Epoch: 8674 Loss: 0.00917139183729887
Epoch: 8675 Loss: 0.009169839322566986
Epoch: 8676 Loss: 0.00916797574609518
Epoch: 8677 Loss: 0.009166885167360306
Epoch: 8678 Loss: 0.0091645158

Epoch: 8880 Loss: 0.008823661133646965
Epoch: 8881 Loss: 0.008822948671877384
Epoch: 8882 Loss: 0.008820057846605778
Epoch: 8883 Loss: 0.008818847127258778
Epoch: 8884 Loss: 0.008817457593977451
Epoch: 8885 Loss: 0.008815127424895763
Epoch: 8886 Loss: 0.008814982138574123
Epoch: 8887 Loss: 0.008811675943434238
Epoch: 8888 Loss: 0.00881033856421709
Epoch: 8889 Loss: 0.008808571845293045
Epoch: 8890 Loss: 0.008807912468910217
Epoch: 8891 Loss: 0.008805408142507076
Epoch: 8892 Loss: 0.008803509175777435
Epoch: 8893 Loss: 0.00880244467407465
Epoch: 8894 Loss: 0.00880084652453661
Epoch: 8895 Loss: 0.008800500072538853
Epoch: 8896 Loss: 0.008798705413937569
Epoch: 8897 Loss: 0.00879677664488554
Epoch: 8898 Loss: 0.008794331923127174
Epoch: 8899 Loss: 0.008792503736913204
Epoch: 8900 Loss: 0.008790712803602219
Epoch: 8901 Loss: 0.008789313957095146
Epoch: 8902 Loss: 0.00878887064754963
Epoch: 8903 Loss: 0.008785739541053772
Epoch: 8904 Loss: 0.008784706704318523
Epoch: 8905 Loss: 0.0087839374

Epoch: 9093 Loss: 0.00848702434450388
Epoch: 9094 Loss: 0.008486669510602951
Epoch: 9095 Loss: 0.008484172634780407
Epoch: 9096 Loss: 0.008483313024044037
Epoch: 9097 Loss: 0.008482730947434902
Epoch: 9098 Loss: 0.008481350727379322
Epoch: 9099 Loss: 0.008478512056171894
Epoch: 9100 Loss: 0.008477499708533287
Epoch: 9101 Loss: 0.008475862443447113
Epoch: 9102 Loss: 0.008474222384393215
Epoch: 9103 Loss: 0.008472074754536152
Epoch: 9104 Loss: 0.008471062406897545
Epoch: 9105 Loss: 0.008469773456454277
Epoch: 9106 Loss: 0.008466850966215134
Epoch: 9107 Loss: 0.008466525003314018
Epoch: 9108 Loss: 0.008466459810733795
Epoch: 9109 Loss: 0.008463341742753983
Epoch: 9110 Loss: 0.008462006226181984
Epoch: 9111 Loss: 0.008462242782115936
Epoch: 9112 Loss: 0.008459272794425488
Epoch: 9113 Loss: 0.0084573058411479
Epoch: 9114 Loss: 0.008455447852611542
Epoch: 9115 Loss: 0.008453991264104843
Epoch: 9116 Loss: 0.008452270179986954
Epoch: 9117 Loss: 0.008450270630419254
Epoch: 9118 Loss: 0.00845035

Epoch: 9304 Loss: 0.008176534436643124
Epoch: 9305 Loss: 0.008176300674676895
Epoch: 9306 Loss: 0.008173990994691849
Epoch: 9307 Loss: 0.008172533474862576
Epoch: 9308 Loss: 0.008170880377292633
Epoch: 9309 Loss: 0.00817013904452324
Epoch: 9310 Loss: 0.00816833321005106
Epoch: 9311 Loss: 0.008166199550032616
Epoch: 9312 Loss: 0.008166146464645863
Epoch: 9313 Loss: 0.00816378090530634
Epoch: 9314 Loss: 0.00816189032047987
Epoch: 9315 Loss: 0.008160516619682312
Epoch: 9316 Loss: 0.008160472847521305
Epoch: 9317 Loss: 0.008157572709023952
Epoch: 9318 Loss: 0.008156215772032738
Epoch: 9319 Loss: 0.008156092837452888
Epoch: 9320 Loss: 0.008153878152370453
Epoch: 9321 Loss: 0.008152352645993233
Epoch: 9322 Loss: 0.008151381276547909
Epoch: 9323 Loss: 0.008150099776685238
Epoch: 9324 Loss: 0.008148536086082458
Epoch: 9325 Loss: 0.008147010579705238
Epoch: 9326 Loss: 0.008146329782903194
Epoch: 9327 Loss: 0.00814515259116888
Epoch: 9328 Loss: 0.008143303915858269
Epoch: 9329 Loss: 0.0081411506

Epoch: 9527 Loss: 0.007869619876146317
Epoch: 9528 Loss: 0.007867781445384026
Epoch: 9529 Loss: 0.00786785501986742
Epoch: 9530 Loss: 0.007865536026656628
Epoch: 9531 Loss: 0.00786540936678648
Epoch: 9532 Loss: 0.007863355800509453
Epoch: 9533 Loss: 0.00786165613681078
Epoch: 9534 Loss: 0.007861173711717129
Epoch: 9535 Loss: 0.007859009318053722
Epoch: 9536 Loss: 0.007857841439545155
Epoch: 9537 Loss: 0.007856098935008049
Epoch: 9538 Loss: 0.007856042124330997
Epoch: 9539 Loss: 0.007854072377085686
Epoch: 9540 Loss: 0.007852635346353054
Epoch: 9541 Loss: 0.007851967588067055
Epoch: 9542 Loss: 0.007850348018109798
Epoch: 9543 Loss: 0.007848074659705162
Epoch: 9544 Loss: 0.007846854627132416
Epoch: 9545 Loss: 0.007846461609005928
Epoch: 9546 Loss: 0.007844416424632072
Epoch: 9547 Loss: 0.007843165658414364
Epoch: 9548 Loss: 0.007841125130653381
Epoch: 9549 Loss: 0.007840597070753574
Epoch: 9550 Loss: 0.007840714417397976
Epoch: 9551 Loss: 0.007837834767997265
Epoch: 9552 Loss: 0.00783655

Epoch: 9749 Loss: 0.007584202103316784
Epoch: 9750 Loss: 0.007582094520330429
Epoch: 9751 Loss: 0.007581427693367004
Epoch: 9752 Loss: 0.007580408360809088
Epoch: 9753 Loss: 0.00757937878370285
Epoch: 9754 Loss: 0.007577748969197273
Epoch: 9755 Loss: 0.007576829753816128
Epoch: 9756 Loss: 0.007575210649520159
Epoch: 9757 Loss: 0.007573509588837624
Epoch: 9758 Loss: 0.007572703063488007
Epoch: 9759 Loss: 0.007571564055979252
Epoch: 9760 Loss: 0.007570106070488691
Epoch: 9761 Loss: 0.007570016663521528
Epoch: 9762 Loss: 0.007567136082798243
Epoch: 9763 Loss: 0.007566515356302261
Epoch: 9764 Loss: 0.00756452651694417
Epoch: 9765 Loss: 0.007564060389995575
Epoch: 9766 Loss: 0.007562751416116953
Epoch: 9767 Loss: 0.007562143728137016
Epoch: 9768 Loss: 0.00756066245958209
Epoch: 9769 Loss: 0.007558992598205805
Epoch: 9770 Loss: 0.007557968143373728
Epoch: 9771 Loss: 0.007556608412414789
Epoch: 9772 Loss: 0.007555744145065546
Epoch: 9773 Loss: 0.007554474752396345
Epoch: 9774 Loss: 0.00755203

Epoch: 9969 Loss: 0.00731903500854969
Epoch: 9970 Loss: 0.007316868286579847
Epoch: 9971 Loss: 0.007316505070775747
Epoch: 9972 Loss: 0.007315082009881735
Epoch: 9973 Loss: 0.007313643582165241
Epoch: 9974 Loss: 0.007312146481126547
Epoch: 9975 Loss: 0.007311670575290918
Epoch: 9976 Loss: 0.007309374865144491
Epoch: 9977 Loss: 0.007309676613658667
Epoch: 9978 Loss: 0.007307738531380892
Epoch: 9979 Loss: 0.007306906394660473
Epoch: 9980 Loss: 0.007305680774152279
Epoch: 9981 Loss: 0.007303754333406687
Epoch: 9982 Loss: 0.007303663529455662
Epoch: 9983 Loss: 0.00730257760733366
Epoch: 9984 Loss: 0.007300902158021927
Epoch: 9985 Loss: 0.007300013676285744
Epoch: 9986 Loss: 0.007299230899661779
Epoch: 9987 Loss: 0.007296876981854439
Epoch: 9988 Loss: 0.007296059746295214
Epoch: 9989 Loss: 0.007294806651771069
Epoch: 9990 Loss: 0.0072942771948874
Epoch: 9991 Loss: 0.007291807793080807
Epoch: 9992 Loss: 0.0072916289791464806
Epoch: 9993 Loss: 0.007289546076208353
Epoch: 9994 Loss: 0.00729009

Finally, we use the trained model to play FizzBuzz on the numbers 1 through 100.

In [None]:
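# Encode 1..100 as binary feature vectors and run the trained model on them
testX = torch.Tensor([binary_encode(i, NUM_DIGITS) for i in range(1, 101)])
with torch.no_grad():  # inference only, so no gradient tracking is needed
    testY = model(testX)
# max(1)[1] is the argmax over the class dimension: one predicted label per number
predictions = zip(range(1, 101), testY.max(1)[1].tolist())

print([fizz_buzz_decode(i, x) for (i, x) in predictions])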

In [None]:
# Count how many of the 100 predictions match the ground-truth labels
expected = np.array([fizz_buzz_encode(i) for i in range(1, 101)])
print(np.sum(testY.max(1)[1].numpy() == expected))
testY.max(1)[1].numpy() == expected  # element-wise mask of correct predictions
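
The two cells above decode the model output by taking the highest-scoring class for each number and then counting matches against the true labels. The sketch below walks through the same steps in plain Python on a handful of made-up scores, so it can be run without the trained model. Note that `fizz_buzz_encode` and `fizz_buzz_decode` here are hypothetical stand-ins for the helpers defined earlier in the notebook; the class order (0 = number, 1 = fizz, 2 = buzz, 3 = fizzbuzz) and the score values are assumptions made for illustration only.

```python
# Hypothetical stand-ins for the notebook's helpers (assumed class order:
# 0 = plain number, 1 = fizz, 2 = buzz, 3 = fizzbuzz).
def fizz_buzz_encode(i):
    """Return the ground-truth class index for integer i."""
    if i % 15 == 0:
        return 3
    if i % 5 == 0:
        return 2
    if i % 3 == 0:
        return 1
    return 0


def fizz_buzz_decode(i, label):
    """Turn a class index back into the FizzBuzz string for i."""
    return [str(i), "fizz", "buzz", "fizzbuzz"][label]


def argmax(scores):
    """Index of the largest score, like testY.max(1)[1] for a single row."""
    return max(range(len(scores)), key=lambda k: scores[k])


# Made-up per-class scores for the numbers 1..5 (not real model output).
fake_scores = [
    [2.1, 0.3, -1.0, -2.0],  # 1: class 0 scores highest
    [1.7, 0.1, -0.5, -1.5],  # 2: class 0 scores highest
    [0.2, 1.9, -0.7, -1.1],  # 3: class 1 (fizz) scores highest
    [1.2, 0.4, 0.1, -0.9],   # 4: class 0 scores highest
    [0.9, -0.2, 0.5, -1.3],  # 5: class 0 scores highest, a deliberate mistake
]
numbers = range(1, 6)

# Step 1: pick the predicted class for each number.
predicted = [argmax(row) for row in fake_scores]

# Step 2: decode class indices into FizzBuzz strings.
decoded = [fizz_buzz_decode(i, p) for i, p in zip(numbers, predicted)]

# Step 3: count predictions that agree with the ground truth.
correct = sum(p == fizz_buzz_encode(i) for i, p in zip(numbers, predicted))

print(decoded)
print(correct, "of", len(predicted), "correct")
```

In this toy input the score row for 5 is deliberately wrong, so the count comes out one short of the total, which is the kind of mismatch the boolean mask in the last cell makes visible.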