## PyTorch

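PyTorch builds its computation graph dynamically, as the code runs. TensorFlow in its original graph mode, by contrast, has you define the graph first and then execute that fixed graph repeatedly through feed and run. This makes PyTorch comparatively more flexible.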


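### How parameters are stored

The most important data type in PyTorch is the tensor.

**Tensor**: similar to an ndarray. A one-dimensional tensor is called a vector, a two-dimensional tensor is called a matrix, and anything with three or more dimensions is simply called a tensor.

The main differences between a **Tensor** and an **ndarray**: a Tensor can run computations on the GPU, and it supports automatic differentiation.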
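As a small illustration of these two differences, the sketch below wraps an ndarray as a tensor, moves it to a GPU only if one is available, and then asks autograd for a gradient. The variable names are illustrative, and the snippet assumes NumPy is installed alongside PyTorch; it falls back to the CPU when CUDA is not present.

```python
import numpy as np
import torch

# Start from a NumPy array and wrap it as a tensor (shares memory on the CPU).
arr = np.array([1.0, 2.0, 3.0], dtype=np.float32)
t = torch.from_numpy(arr)

# Pick a device: use the GPU when available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t_dev = t.to(device)

# Unlike an ndarray, a tensor can track gradients.
v = t_dev.clone().requires_grad_(True)
s = (v ** 2).sum()      # s = v0^2 + v1^2 + v2^2
s.backward()            # ds/dv = 2 * v
print(v.grad.cpu())     # tensor([2., 4., 6.])
```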

In [1]:
import torch

In [2]:
x = torch.Tensor(2, 3, 4)  # torch.Tensor(*shape) creates an uninitialized tensor; torch.empty(2, 3, 4) is the more explicit spelling
x

tensor([[[0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00],
         [0.0000e+00, 8.4490e-39, 7.7052e+31, 7.2148e+22],
         [2.5226e-18, 2.5930e-09, 1.0299e-11, 7.7196e-10]],

        [[2.6409e-06, 4.1429e-11, 4.2485e-05, 2.9573e-18],
         [6.7333e+22, 1.7591e+22, 1.7184e+25, 4.3222e+27],
         [6.1972e-04, 7.2443e+22, 1.7728e+28, 7.0367e+22]]])

In [3]:
x.size()
# x.shape

torch.Size([2, 3, 4])

In [4]:
a = torch.rand(2,3,4)
b = torch.rand(2,3,4)

x = torch.add(a, b)  # torch.add returns a new tensor and x is rebound to it; to write into an existing tensor instead, pass out=x
print(x)

tensor([[[1.3839, 0.6207, 1.7443, 0.6530],
         [0.4751, 0.6660, 0.9988, 0.2420],
         [1.0407, 0.9756, 1.6424, 1.2415]],

        [[0.4869, 1.0725, 0.6916, 1.0074],
         [0.6294, 0.7366, 1.1275, 1.3891],
         [0.9212, 1.5383, 1.4855, 0.8844]]])


In [5]:
a + b

tensor([[[1.3839, 0.6207, 1.7443, 0.6530],
         [0.4751, 0.6660, 0.9988, 0.2420],
         [1.0407, 0.9756, 1.6424, 1.2415]],

        [[0.4869, 1.0725, 0.6916, 1.0074],
         [0.6294, 0.7366, 1.1275, 1.3891],
         [0.9212, 1.5383, 1.4855, 0.8844]]])

In [6]:
a

tensor([[[0.6759, 0.4646, 0.8829, 0.0632],
         [0.2150, 0.3049, 0.5503, 0.1571],
         [0.4024, 0.5474, 0.9509, 0.5834]],

        [[0.2586, 0.7124, 0.1468, 0.0646],
         [0.6201, 0.3861, 0.5532, 0.8023],
         [0.5772, 0.9280, 0.6851, 0.4593]]])

In [7]:
a.add_(b)  # every operation with a trailing _ modifies the calling object in place, i.e. a += b

tensor([[[1.3839, 0.6207, 1.7443, 0.6530],
         [0.4751, 0.6660, 0.9988, 0.2420],
         [1.0407, 0.9756, 1.6424, 1.2415]],

        [[0.4869, 1.0725, 0.6916, 1.0074],
         [0.6294, 0.7366, 1.1275, 1.3891],
         [0.9212, 1.5383, 1.4855, 0.8844]]])

In [8]:
a

tensor([[[1.3839, 0.6207, 1.7443, 0.6530],
         [0.4751, 0.6660, 0.9988, 0.2420],
         [1.0407, 0.9756, 1.6424, 1.2415]],

        [[0.4869, 1.0725, 0.6916, 1.0074],
         [0.6294, 0.7366, 1.1275, 1.3891],
         [0.9212, 1.5383, 1.4855, 0.8844]]])
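To make the distinction between the out-of-place and in-place variants concrete, here is a minimal sketch using small constant tensors instead of random ones. The names `c`, `ptr_before`, and the flags are illustrative only; `data_ptr()` is used here simply to show that the in-place call reuses the same storage.

```python
import torch

a = torch.ones(2, 3)
b = torch.full((2, 3), 2.0)

# Out-of-place: returns a new tensor, `a` is unchanged.
c = a.add(b)
a_unchanged = torch.equal(a, torch.ones(2, 3))

# In-place: the trailing underscore writes the result into `a`'s own storage.
ptr_before = a.data_ptr()
a.add_(b)
same_storage = a.data_ptr() == ptr_before

# `out=` writes into a preallocated tensor instead of allocating a new one.
x = torch.empty(2, 3)
torch.add(a, b, out=x)   # x now holds a + b, computed after a was updated

print(c, a, x, sep="\n")
```

In-place operations save an allocation, but they overwrite values that autograd may still need, so they are best used on tensors that are not part of a gradient computation.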

## Automatic differentiation

PyTorch's automatic differentiation tools live in the torch.autograd package.

In [9]:
x = torch.tensor([2.0], requires_grad = True)
a = torch.tensor([4.0], requires_grad = True)
y = x * a

y.backward()

print(x.grad)   # dy/dx = a = 4
print(a.grad)   # dy/da = x = 2

tensor([4.])
tensor([2.])
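One detail worth seeing in isolation is that `.grad` accumulates: each call to `backward()` adds to whatever gradient is already stored. The sketch below reuses the same `x` and `a` values to show this; the names `first`, `accumulated`, and `after_reset` are illustrative.

```python
import torch

x = torch.tensor([2.0], requires_grad=True)
a = torch.tensor([4.0], requires_grad=True)

# First backward pass: dy/dx = a = 4.
(x * a).backward()
first = x.grad.clone()

# Second backward pass without zeroing: the new gradient is added to the old one.
(x * a).backward()
accumulated = x.grad.clone()

# Reset before the next pass, as a training loop would.
x.grad.zero_()
(x * a).backward()
after_reset = x.grad.clone()

print(first, accumulated, after_reset)  # tensor([4.]) tensor([8.]) tensor([4.])
```

This is why a hand-written training loop has to zero the gradients after every update.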


In [10]:
x_train = torch.rand(100)
y_train = x_train * 2 + 3 # w = 2, b = 3, y = 2 * x + 3

In [11]:
w = torch.tensor([0.0],requires_grad = True)
b = torch.tensor([0.0],requires_grad = True)

print(type(w))

<class 'torch.Tensor'>


In [12]:
y_pre = x_train * w + b
y_pre

tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0.], grad_fn=<AddBackward0>)

In [15]:
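lr = 0.015
loss_func = torch.nn.MSELoss()
for i in range(5000):
    y_pre = x_train * w + b

    loss = loss_func(y_pre, y_train)  # MSELoss expects (input, target)
    if i % 10 == 0:
        print("Iter: %d, w: %.4f, b: %.4f, training loss: %.4f" % (i, w.item(), b.item(), loss.item()))
    loss.backward()

    # Update the parameters without recording the update in the autograd graph
    with torch.no_grad():
        w -= w.grad * lr
        b -= b.grad * lr

    # Gradients accumulate across backward() calls, so reset them every iteration
    w.grad.zero_()
    b.grad.zero_()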


Iter: 0, w: 1.9950, b: 3.0028, training loss: 0.0000
Iter: 10, w: 1.9951, b: 3.0027, training loss: 0.0000
Iter: 20, w: 1.9952, b: 3.0027, training loss: 0.0000
Iter: 30, w: 1.9952, b: 3.0026, training loss: 0.0000
Iter: 40, w: 1.9953, b: 3.0026, training loss: 0.0000
Iter: 50, w: 1.9954, b: 3.0025, training loss: 0.0000
Iter: 60, w: 1.9955, b: 3.0025, training loss: 0.0000
Iter: 70, w: 1.9956, b: 3.0025, training loss: 0.0000
Iter: 80, w: 1.9956, b: 3.0024, training loss: 0.0000
Iter: 90, w: 1.9957, b: 3.0024, training loss: 0.0000
Iter: 100, w: 1.9958, b: 3.0023, training loss: 0.0000
Iter: 110, w: 1.9959, b: 3.0023, training loss: 0.0000
Iter: 120, w: 1.9959, b: 3.0022, training loss: 0.0000
Iter: 130, w: 1.9960, b: 3.0022, training loss: 0.0000
Iter: 140, w: 1.9961, b: 3.0022, training loss: 0.0000
Iter: 150, w: 1.9962, b: 3.0021, training loss: 0.0000
Iter: 160, w: 1.9962, b: 3.0021, training loss: 0.0000
Iter: 170, w: 1.9963, b: 3.0021, training loss: 0.0000
Iter: 180, w: 1.9964,

Iter: 1760, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1770, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1780, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1790, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1800, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1810, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1820, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1830, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1840, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1850, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1860, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1870, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1880, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1890, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1900, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1910, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1920, w: 1.9998, b: 3.0001, training loss: 0.0000
Iter: 1930, w: 1.9998, b: 3.0001, training loss:

Iter: 3580, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3590, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3600, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3610, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3620, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3630, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3640, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3650, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3660, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3670, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3680, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3690, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3700, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3710, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3720, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3730, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3740, w: 1.9999, b: 3.0000, training loss: 0.0000
Iter: 3750, w: 1.9999, b: 3.0000, training loss:
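The log above already starts near w = 2 and b = 3 at iteration 0, presumably because the cell was re-run after the parameters had been trained. For comparison, the same fit can be sketched with `torch.optim.SGD` doing the update and the gradient reset. This is only one possible variant, not the notebook's own code: the fixed seed and the fresh zero initialization are assumptions added so the run is repeatable.

```python
import torch

torch.manual_seed(0)                     # assumption: fixed seed for repeatability
x_train = torch.rand(100)
y_train = x_train * 2 + 3                # ground truth: w = 2, b = 3

# Fresh parameters, both starting at zero.
w = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

optimizer = torch.optim.SGD([w, b], lr=0.015)
loss_func = torch.nn.MSELoss()

for i in range(5000):
    optimizer.zero_grad()                # clear gradients from the previous step
    loss = loss_func(x_train * w + b, y_train)
    loss.backward()                      # compute d(loss)/dw and d(loss)/db
    optimizer.step()                     # w -= lr * w.grad, b -= lr * b.grad

print("w: %.4f, b: %.4f, loss: %.6f" % (w.item(), b.item(), loss.item()))
```

With plain SGD and no momentum, `optimizer.step()` performs the same update as the manual loop, so the two versions should converge to roughly the same values.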