## PyTorch

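PyTorch builds its computation graph dynamically. Unlike graph-mode TensorFlow (the 1.x style), where you first define the graph and then execute it repeatedly through feed and run, PyTorch constructs the graph as the code runs. This makes PyTorch comparatively more flexible.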
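To make "dynamic" concrete, here is a minimal sketch in which ordinary Python control flow decides which operations are recorded for differentiation. The function name `forward` and the two branches are purely illustrative and not part of any PyTorch API.

```python
import torch

def forward(x, w):
    # Plain Python control flow picks the operations that get recorded,
    # so the graph can differ from one call to the next.
    if x.sum() > 0:
        return (w * x).sum()
    return (w * x * x).sum()

w = torch.tensor([1.0, 2.0], requires_grad=True)

y_pos = forward(torch.tensor([1.0, 1.0]), w)    # takes the first branch
y_pos.backward()
grad_pos = w.grad.clone()                       # d/dw sum(w * x) = x

w.grad.zero_()                                  # clear the accumulated gradient
y_neg = forward(torch.tensor([-1.0, -2.0]), w)  # takes the second branch
y_neg.backward()
grad_neg = w.grad.clone()                       # d/dw sum(w * x * x) = x * x

print(grad_pos, grad_neg)
```

No separate compile or session step is involved: each call to `forward` records a fresh graph for whichever branch actually ran.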


### How parameters are stored

The most important data type in PyTorch is the Tensor.

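**Tensor**: similar to NumPy's ndarray. A 1-D Tensor is called a vector, a 2-D Tensor is called a matrix, and anything with three or more dimensions is simply called a tensor.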

The main differences between **Tensor** and **ndarray**: a Tensor can run its computations on a GPU, and it supports automatic differentiation.
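The two differences can be sketched as follows. This is only an illustration: the variable names are made up, and the device selection falls back to the CPU so the snippet also runs on a machine without CUDA.

```python
import numpy as np
import torch

arr = np.ones((2, 3), dtype=np.float32)
t = torch.from_numpy(arr)      # CPU tensor that shares memory with arr

# Use the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t_dev = t.to(device)

# Unlike an ndarray, a floating-point Tensor can track gradients.
p = torch.ones(2, 3, requires_grad=True)
(p * 3).sum().backward()
print(p.grad)                  # every entry is d(3 * p)/dp = 3
```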

In [11]:
import torch

In [12]:
x = torch.Tensor(2, 3, 4)  # torch.Tensor(*sizes) creates an uninitialized Tensor; the values shown are whatever was already in memory
x

tensor([[[ 1.3592e-19,  7.5553e+28,  5.2839e-11,  1.8888e+31],
         [ 9.8049e-09,  1.6037e-07,  1.3817e-19,  4.4721e+21],
         [ 6.2625e+22,  4.7428e+30,  2.6800e+20,  1.4348e-19]],

        [[-9.1812e+22, -3.4566e-34, -1.6980e-38, -4.0311e+23],
         [-1.3823e-33, -6.4672e-32, -1.4219e+24,  6.2724e+22],
         [ 4.7428e+30,  3.3669e-18,  1.8590e+34,  7.7767e+31]]])

In [13]:
x.size()
# x.shape

torch.Size([2, 3, 4])
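When the initial contents matter, the factory functions state the intent more explicitly than `torch.Tensor(2, 3, 4)`. A small sketch, assuming only the shape used above:

```python
import torch

e = torch.empty(2, 3, 4)   # uninitialized memory, like torch.Tensor(2, 3, 4)
z = torch.zeros(2, 3, 4)   # every element is 0
r = torch.rand(2, 3, 4)    # uniform samples from [0, 1)

print(e.shape, z.shape, r.shape)
```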

In [14]:
a = torch.rand(2,3,4)
b = torch.rand(2,3,4)

x = torch.add(a, b)  # torch.add returns the sum as a new Tensor and x is rebound to it; the uninitialized Tensor created earlier is not reused
print(x)

tensor([[[0.9333, 0.9679, 1.0367, 0.9379],
         [0.6609, 0.3134, 1.5635, 1.3560],
         [1.1026, 0.5994, 0.9452, 0.5979]],

        [[1.2640, 0.0791, 1.0301, 0.9093],
         [0.8819, 0.3511, 1.0857, 0.6958],
         [0.7245, 0.4092, 0.8632, 1.3441]]])
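If the goal is to have a preallocated Tensor receive the result, `torch.add` accepts an `out=` argument. The sketch below checks that the storage address is unchanged; the names `ptr_before` and `same_storage` are illustrative only.

```python
import torch

a = torch.rand(2, 3, 4)
b = torch.rand(2, 3, 4)

x = torch.empty(2, 3, 4)      # preallocated buffer with the matching shape
ptr_before = x.data_ptr()

torch.add(a, b, out=x)        # writes a + b into x's existing storage

same_storage = x.data_ptr() == ptr_before
print(same_storage)
```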


In [15]:
a + b

tensor([[[0.9333, 0.9679, 1.0367, 0.9379],
         [0.6609, 0.3134, 1.5635, 1.3560],
         [1.1026, 0.5994, 0.9452, 0.5979]],

        [[1.2640, 0.0791, 1.0301, 0.9093],
         [0.8819, 0.3511, 1.0857, 0.6958],
         [0.7245, 0.4092, 0.8632, 1.3441]]])

In [6]:
a

tensor([[[0.5463, 0.0488, 0.4967, 0.7042],
         [0.4067, 0.8350, 0.9757, 0.1101],
         [0.9653, 0.6296, 0.8876, 0.4811]],

        [[0.1561, 0.8859, 0.9208, 0.5756],
         [0.8219, 0.4491, 0.5427, 0.2016],
         [0.0949, 0.4948, 0.8772, 0.1067]]])

In [7]:
a.add_(b)  # every operation with a trailing underscore modifies the calling object in place; here it acts like a += b

tensor([[[1.2818, 1.3501, 1.7361, 0.8297],
         [1.5176, 0.2186, 0.6820, 0.6451],
         [1.6785, 0.4612, 1.1057, 1.7063]],

        [[1.6335, 1.0912, 0.5792, 0.8746],
         [0.8269, 1.1090, 0.9925, 0.8852],
         [1.6566, 0.5550, 0.7133, 0.9053]]])

In [8]:
a

tensor([[[1.2818, 1.3501, 1.7361, 0.8297],
         [1.5176, 0.2186, 0.6820, 0.6451],
         [1.6785, 0.4612, 1.1057, 1.7063]],

        [[1.6335, 1.0912, 0.5792, 0.8746],
         [0.8269, 1.1090, 0.9925, 0.8852],
         [1.6566, 0.5550, 0.7133, 0.9053]]])
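The difference between the in-place `add_` and the out-of-place `+` shows up when a second name refers to the same Tensor. A minimal sketch with small hypothetical values:

```python
import torch

a = torch.ones(3)
b = torch.full((3,), 2.0)
alias = a                      # second name for the same Tensor object

a.add_(b)                      # in place: the change is visible through alias
in_place_visible = torch.equal(alias, torch.full((3,), 3.0))

a = a + b                      # out of place: a is rebound to a new Tensor
alias_unchanged = torch.equal(alias, torch.full((3,), 3.0))

print(in_place_visible, alias_unchanged)
```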

## Automatic differentiation

PyTorch's automatic differentiation package is torch.autograd.

In [19]:
x = torch.tensor([2.0], requires_grad = True)
a = torch.tensor([4.0], requires_grad = True)
y = x * a

y.backward()    

print(x.grad)   # dy/dx = a = 4
print(a.grad)   # dy/da = x = 2

tensor([4.])
tensor([2.])
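One detail worth seeing in isolation is that `.grad` accumulates across backward passes instead of being overwritten. A short sketch reusing the same `x` and `a`; the names `first` and `second` are illustrative:

```python
import torch

x = torch.tensor([2.0], requires_grad=True)
a = torch.tensor([4.0], requires_grad=True)

(x * a).backward()
first = x.grad.clone()     # dy/dx = a = 4

(x * a).backward()         # a second backward pass adds to .grad
second = x.grad.clone()    # 4 + 4 = 8

x.grad.zero_()             # reset before the next independent pass
print(first, second, x.grad)
```

This accumulation is the reason a training loop has to zero the gradients after every update step.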


In [20]:
x_train = torch.rand(100)
y_train = x_train * 2 + 3 # w = 2, b = 3, y = 2 * x + 3

In [21]:
w = torch.tensor([0.0],requires_grad = True)
b = torch.tensor([0.0],requires_grad = True)

print(type(w))

<class 'torch.Tensor'>


In [22]:
y_pre = x_train * w + b
y_pre

tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0.], grad_fn=<AddBackward0>)
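For a mean-squared-error loss on this linear model, the gradients that `backward()` computes can also be written by hand and compared with autograd. The sketch below is self-contained and uses a fixed seed for repeatability; `residual`, `manual_grad_w` and `manual_grad_b` are names introduced here for illustration.

```python
import torch

torch.manual_seed(0)
x_train = torch.rand(100)
y_train = x_train * 2 + 3

w = torch.tensor([0.0], requires_grad=True)
b = torch.tensor([0.0], requires_grad=True)

y_pre = x_train * w + b
loss = torch.nn.functional.mse_loss(y_pre, y_train)
loss.backward()

# loss = mean((y_pre - y)^2), so
#   d loss / dw = mean(2 * (y_pre - y) * x)
#   d loss / db = mean(2 * (y_pre - y))
residual = (y_pre - y_train).detach()
manual_grad_w = (2 * residual * x_train).mean()
manual_grad_b = (2 * residual).mean()

print(torch.allclose(w.grad, manual_grad_w))
print(torch.allclose(b.grad, manual_grad_b))
```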

In [23]:
lr = 0.015
loss_func = torch.nn.MSELoss()
for i in range(200):
    y_pre = x_train * w + b

    loss = loss_func(y_pre, y_train)  # MSELoss takes (input, target); the value is the same either way because MSE is symmetric
    if i % 10 == 0:
        print("Iter: %d, w: %.4f, b: %.4f, training loss: %.4f" % (i, w.item(), b.item(), loss.item()))
    loss.backward()
    
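    # Update the parameters without recording the update in the autograd graph
    with torch.no_grad():
        w -= w.grad * lr
        b -= b.grad * lr

    # Gradients accumulate across backward passes, so reset them each iteration
    w.grad.zero_()
    b.grad.zero_()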


Iter: 0, w: 0.0000, b: 0.0000, training loss: 16.3612
Iter: 10, w: 0.5530, b: 1.0139, training loss: 7.5307
Iter: 20, w: 0.9299, b: 1.7006, training loss: 3.4682
Iter: 30, w: 1.1873, b: 2.1654, training loss: 1.5991
Iter: 40, w: 1.3637, b: 2.4797, training loss: 0.7392
Iter: 50, w: 1.4850, b: 2.6919, training loss: 0.3434
Iter: 60, w: 1.5690, b: 2.8349, training loss: 0.1612
Iter: 70, w: 1.6277, b: 2.9311, training loss: 0.0773
Iter: 80, w: 1.6691, b: 2.9954, training loss: 0.0385
Iter: 90, w: 1.6987, b: 3.0382, training loss: 0.0206
Iter: 100, w: 1.7204, b: 3.0663, training loss: 0.0122
Iter: 110, w: 1.7367, b: 3.0846, training loss: 0.0083
Iter: 120, w: 1.7492, b: 3.0962, training loss: 0.0063
Iter: 130, w: 1.7591, b: 3.1033, training loss: 0.0053
Iter: 140, w: 1.7673, b: 3.1073, training loss: 0.0048
Iter: 150, w: 1.7743, b: 3.1093, training loss: 0.0044
Iter: 160, w: 1.7804, b: 3.1099, training loss: 0.0042
Iter: 170, w: 1.7859, b: 3.1096, training loss: 0.0040
Iter: 180, w: 1.7909