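Advantages of PyTorch:

Simple and easy to understand: PyTorch's API is concise and consistent. It is essentially three levels of abstraction (tensor, autograd, and nn), which makes it easy to learn.

The core low-level concepts in PyTorch are tensors, the dynamic computational graph, and automatic differentiation.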
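
As a rough illustration of those three levels, the sketch below touches each one in turn. The tensor values and the choice of `nn.Linear` are assumptions made for the example, not something taken from these notes.

```python
import torch
from torch import nn

# 1) tensor: plain n-dimensional arrays
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# 2) autograd: records operations on tensors that require gradients
w = torch.tensor([[0.5], [-1.0]], requires_grad=True)
loss = (x @ w).sum()
loss.backward()
print(w.grad)  # d(loss)/dw equals the column sums of x

# 3) nn: modules that bundle parameters with a forward computation
layer = nn.Linear(in_features=2, out_features=1)
out = layer(x)
print(out.shape)
```

Each level builds on the one before it: `nn` modules hold tensors as parameters, and autograd tracks the operations applied to them.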



## 1. Tensors
- dtype
- shape / size()
- dim(): number of dimensions, e.g. 0, 1, 2, 3

In [6]:
import numpy as np
import torch 

# The data type is inferred automatically

i = torch.tensor(1);print(i,i.dtype)
x = torch.tensor(2.0);print(x,x.dtype)
b = torch.tensor(True);print(b,b.dtype)

tensor(1) torch.int64
tensor(2.) torch.float32
tensor(True) torch.bool


In [52]:
scalar = torch.tensor(True)
print(scalar);print(scalar.dim());print(scalar.dtype)
a = torch.tensor(2.0)
print(a.dtype)
print('SIZE: ', scalar.size())
vector = torch.tensor([1.0,2.0,3.0,4.0])
print('SIZE: ', vector.size())
print(vector.dim())

tensor(True)
0
torch.bool
torch.float32
SIZE:  torch.Size([])
SIZE:  torch.Size([4])
1


In [27]:
tensor3 = torch.tensor([[[1.0,2.0,1,1],[1,3.0,4.0,1]],
                        [[2,5.0,6.0,1],[0, 7.0,8.0,1]],
                        [[1,5.0,6.0,1],[0, 7.0,8.0,1]]])  # 3-D tensor
print(tensor3)
print(tensor3.dim())
tensor3.shape, tensor3.size()

tensor([[[1., 2., 1., 1.],
         [1., 3., 4., 1.]],

        [[2., 5., 6., 1.],
         [0., 7., 8., 1.]],

        [[1., 5., 6., 1.],
         [0., 7., 8., 1.]]])
3


(torch.Size([3, 2, 4]), torch.Size([3, 2, 4]))

In [41]:
# Tensor slicing
import numpy as np

a = torch.rand(2, 3)
print(a)
b = torch.tensor(np.array([1,0,2,0]), dtype=torch.bool); print(b,b.dtype)  # nonzero values become True
a.dim(), a[1,:], a[1,1], a[:,0], a[:,2]

tensor([[0.0578, 0.4039, 0.7349],
        [0.8830, 0.5480, 0.0786]])
tensor([ True, False,  True, False]) torch.bool


(2,
 tensor([0.8830, 0.5480, 0.0786]),
 tensor(0.5480),
 tensor([0.0578, 0.8830]),
 tensor([0.7349, 0.0786]))
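
Besides integer and slice indexing, a boolean tensor can also be used as an index. The following is a minimal sketch with made-up values; the `0.45` threshold and the variable names are only illustrative.

```python
import torch

a = torch.tensor([[0.1, 0.4, 0.7],
                  [0.9, 0.5, 0.0]])

mask = a > 0.45                          # boolean tensor with the same shape as a
selected = a[mask]                       # 1-D tensor of the elements where mask is True
row_mask = torch.tensor([False, True])   # one flag per row
second_row = a[row_mask]                 # keeps whole rows; result has shape (1, 3)

print(mask.dtype)
print(selected)
print(second_row.shape)
```

A mask with the full shape of `a` flattens the result, while a mask over the first dimension only keeps the remaining dimensions intact.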

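## 2. Dynamic computational graph

PyTorch uses a dynamic computational graph.
> PyTorch follows what is called the imperative / eager paradigm: each line of code builds a piece of the graph, defining one part of the complete computational graph. Even before the complete graph has been built, these small component graphs can be executed independently. This kind of dynamic computational graph is known as the "define-by-run" approach.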

![image.png](https://image.jiqizhixin.com/uploads/editor/b6aea2ac-a4b4-4314-8b18-c1b4a1a79ca4/640.gif)
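
One way to see define-by-run in practice is to let ordinary Python control flow decide which operations are recorded. The sketch below is an assumption-laden toy (the input value `3.0` and the threshold `5` are arbitrary), not part of the original notebook.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)

# The graph is recorded as each line runs, so a node exists
# immediately after the operation that created it.
y = x * 2
print(y.grad_fn)

# A runtime value chooses which branch becomes part of the graph.
if y.item() > 5:
    z = y ** 2      # taken here, since y == 6
else:
    z = y + 1

z.backward()
print(x.grad)       # derivative of (2x)**2 is 8x
```

Because the graph is rebuilt on every run, a different input could take the other branch and produce a different graph without any extra setup.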

### Visualizing the computational graph in TensorBoard

![image.png](../pic/pt-board.png)

In [95]:
# Visualize the computational graph
from torch import nn 
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.w = nn.Parameter(torch.randn(2,1))
        self.b = nn.Parameter(torch.zeros(1,1))

    def forward(self, x):
        y = x@self.w + self.b
        return y

net = Net()

from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter('../../model/tensorboard')
writer.add_graph(net,input_to_model = torch.rand(10,2))
writer.close()

In [99]:
%load_ext tensorboard
#%tensorboard --logdir ../../model/tensorboard

from tensorboard import notebook
notebook.list() 

# View the model in TensorBoard
notebook.start("--logdir ../../model/tensorboard")

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard
Known TensorBoard instances:
  - port 6006: logdir ../../model/tensorboard (started 0:27:36 ago; pid 17349)


Reusing TensorBoard on port 6006 (pid 17349), started 0:27:36 ago. (Use '!kill 17349' to kill it.)

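## 3. Automatic differentiation

### 3.1 Computing derivatives with the backward() method
backward() is usually called on a scalar tensor. The gradients it computes are stored in the grad attribute of the corresponding independent-variable tensors.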

In [57]:
# Backpropagation from a scalar
import numpy as np 
import torch 

# Derivative of f(x) = a*x**2 + b*x + c
# f'(x) = 2*a*x + b

x = torch.tensor(0.0,requires_grad = True) # gradients are tracked with respect to x
a = torch.tensor(1.0)
b = torch.tensor(-2.0)
c = torch.tensor(1.0)
y = a*torch.pow(x,2) + b*x + c 

y.backward()
dy_dx = x.grad
print(dy_dx)

tensor(-2.)
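
As a sanity check, the value returned by autograd can be compared with the analytic derivative 2ax + b at a few points. This is a small sketch; the helper name `autograd_derivative` and the sample points are hypothetical choices for illustration.

```python
import torch

a, b, c = 1.0, -2.0, 1.0

def autograd_derivative(x0):
    """Return df/dx at x0 for f(x) = a*x**2 + b*x + c, computed by autograd."""
    x = torch.tensor(x0, requires_grad=True)
    y = a * x**2 + b * x + c
    y.backward()
    return x.grad.item()

for x0 in (0.0, 1.0, 2.5):
    # second value: autograd result, third value: analytic 2*a*x0 + b
    print(x0, autograd_derivative(x0), 2 * a * x0 + b)
```

A fresh tensor is created on every call, so no gradient is carried over from one evaluation to the next.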


In [64]:
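# Backpropagation from a non-scalar
# If backward() is called on a non-scalar tensor, a gradient argument with the same shape as that tensor must be passed in.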
import numpy as np 
import torch 

# f(x) = a*x**2 + b*x + c

x = torch.tensor([[0.0,0.0],[1.0,2.0]],requires_grad = True) # gradients are tracked with respect to x
a = torch.tensor(1.0)
b = torch.tensor(-2.0)
c = torch.tensor(1.0)
y = a*torch.pow(x,2) + b*x + c 

gradient = torch.tensor([[2.0,1.0],[1.0,1.0]])  # passed to backward(); each entry weights the corresponding element of y

print("x:\n",x)
print("y:\n",y)
y.backward(gradient = gradient)
x_grad = x.grad
print("x_grad:\n",x_grad)

x:
 tensor([[0., 0.],
        [1., 2.]], requires_grad=True)
y:
 tensor([[1., 1.],
        [0., 1.]], grad_fn=<AddBackward0>)
x_grad:
 tensor([[-4., -2.],
        [ 0.,  2.]])
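
One way to read the `gradient` argument: calling `y.backward(gradient=g)` gives the same result as reducing `y` to the scalar `(y * g).sum()` and calling `backward()` on that. The sketch below checks this with the same numbers as the cell above; the helper `f` and the names `x1`, `x2` are introduced only for this illustration.

```python
import torch

def f(x):
    # same quadratic as above with a = 1, b = -2, c = 1
    return x**2 - 2.0 * x + 1.0

gradient = torch.tensor([[2.0, 1.0], [1.0, 1.0]])

# Route 1: non-scalar backward with an explicit gradient argument
x1 = torch.tensor([[0.0, 0.0], [1.0, 2.0]], requires_grad=True)
f(x1).backward(gradient=gradient)

# Route 2: weight the output, reduce to a scalar, then call backward()
x2 = torch.tensor([[0.0, 0.0], [1.0, 2.0]], requires_grad=True)
(f(x2) * gradient).sum().backward()

print(x1.grad)
print(torch.equal(x1.grad, x2.grad))
```

Both routes yield `gradient * (2*x - 2)` element-wise, which is why the weights 2.0 and 1.0 in the first row give -4 and -2 at x = 0.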


In [87]:
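"""
Find the minimum using automatic differentiation and an optimizer.

For comparison: if optimizer.zero_grad() (which resets the gradients) is not called,
the printed gradients are those in the left column; they are clearly accumulating.
The right column matches the first gradients printed by the loop below, which does call zero_grad().

without zero_grad()     with zero_grad()
tensor(-2.)             tensor(-2.)
tensor(-3.9600)         tensor(-1.9600)
tensor(-5.8408)         tensor(-1.9208)
tensor(-7.6048)         tensor(-1.8824)
tensor(-9.2167)         tensor(-1.8447)
tensor(-10.6442)
"""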

import numpy as np 
import torch 

# Minimum of f(x) = a*x**2 + b*x + c

x = torch.tensor(0.0,requires_grad = True) # gradients are tracked with respect to x
a = torch.tensor(1.0)
b = torch.tensor(-2.0)
c = torch.tensor(1.0)

optimizer = torch.optim.SGD(params=[x],lr = 0.01)


def f(x):
    result = a*torch.pow(x,2) + b*x + c 
    return result

for i in range(500):
    optimizer.zero_grad()  # reset the gradients to zero
    y = f(x)
    y.backward()
    print(x.grad)
    optimizer.step()


print("y=",f(x).data,";","x=",x.data)

tensor(-2.)
tensor(-1.9600)
tensor(-1.9208)
tensor(-1.8824)
tensor(-1.8447)
tensor(-1.8078)
tensor(-1.7717)
tensor(-1.7363)
tensor(-1.7015)
tensor(-1.6675)
tensor(-1.6341)
tensor(-1.6015)
tensor(-1.5694)
tensor(-1.5380)
tensor(-1.5073)
tensor(-1.4771)
tensor(-1.4476)
tensor(-1.4186)
tensor(-1.3903)
tensor(-1.3625)
tensor(-1.3352)
tensor(-1.3085)
tensor(-1.2823)
tensor(-1.2567)
tensor(-1.2316)
tensor(-1.2069)
tensor(-1.1828)
tensor(-1.1591)
tensor(-1.1360)
tensor(-1.1132)
tensor(-1.0910)
tensor(-1.0691)
tensor(-1.0478)
tensor(-1.0268)
tensor(-1.0063)
tensor(-0.9861)
tensor(-0.9664)
tensor(-0.9471)
tensor(-0.9282)
tensor(-0.9096)
tensor(-0.8914)
tensor(-0.8736)
tensor(-0.8561)
tensor(-0.8390)
tensor(-0.8222)
tensor(-0.8058)
tensor(-0.7896)
tensor(-0.7738)
tensor(-0.7584)
tensor(-0.7432)
tensor(-0.7283)
tensor(-0.7138)
tensor(-0.6995)
tensor(-0.6855)
tensor(-0.6718)
tensor(-0.6584)
tensor(-0.6452)
tensor(-0.6323)
tensor(-0.6196)
tensor(-0.6073)
tensor(-0.5951)
tensor(-0.5832)
tensor(-0.57