## PyTorch Usage Notes

`PyTorch` offers elegant solutions to machine learning problems, and thanks to its large community, cutting-edge models are easy to use and deploy.

Main references:
- [Getting-Things-Done-with-Pytorch](https://github.com/curiousily/Getting-Things-Done-with-Pytorch)
- [PyTorch Basics](http://www.feiguyunai.com/index.php/2019/09/11/pytorch-char02/#24_NumpyTensor)
- [d2l: Preliminaries, Data Manipulation](https://zh-v2.d2l.ai/chapter_preliminaries/ndarray.html)

In [1]:
!pip install -q -U torch watermark torchvision tensorboardX



In [1]:
%load_ext watermark
%watermark -v -p numpy,torch

Python implementation: CPython
Python version       : 3.7.12
IPython version      : 7.27.0

numpy: 1.21.2
torch: 1.9.0



In [3]:
import os

import numpy as np
import pandas as pd
import torch

### Introduction

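Start from NumPy (get familiar with its basic operations first): PyTorch supports analogous operations. NumPy arrays and CPU tensors can be converted to each other, and `torch.from_numpy` and `Tensor.numpy` share the underlying memory, so the conversion adds no copy overhead.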

For [TENSOR](https://pytorch.org/docs/stable/tensors.html), see the following introductory tutorial:

- [A Simple Introduction to Tensors in PyTorch](https://zhuanlan.zhihu.com/p/69499953)
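
The memory sharing mentioned above can be checked directly. The following is a minimal sketch (the variable names are illustrative): `torch.from_numpy` reuses the NumPy buffer on CPU, while `torch.tensor` makes a copy.

```python
import numpy as np
import torch

arr = np.array([1.0, 2.0, 3.0])
shared = torch.from_numpy(arr)   # shares the NumPy buffer (CPU only)
copied = torch.tensor(arr)       # always copies the data

arr[0] = 100.0                   # mutate the NumPy array in place
back = shared.numpy()            # zero-copy view back to NumPy

print(shared)   # first element reflects the change
print(copied)   # first element is still 1.0
```

Because the buffer is shared, an in-place change on either side is visible on the other; use `torch.tensor` or `.clone()` when an independent copy is needed.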

In [4]:
a = np.array([1, 2])
b = np.array([8, 9])

c = a + b
c

array([ 9, 11])

In [5]:
a = torch.tensor([1, 2])
b = torch.tensor([8, 9])

c = a + b
c

tensor([ 9, 11])

In [12]:
# Reshape
# Initialize a Tensor with random values drawn uniformly from [0, 1)
x = torch.rand([3, 4])
x

tensor([[0.5462, 0.4484, 0.1724, 0.4579],
        [0.1716, 0.8310, 0.2609, 0.2923],
        [0.9541, 0.5630, 0.7253, 0.8349]])

In [13]:
x_view = x.view([4, -1])
x_view

tensor([[0.5462, 0.4484, 0.1724],
        [0.4579, 0.1716, 0.8310],
        [0.2609, 0.2923, 0.9541],
        [0.5630, 0.7253, 0.8349]])
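
A small sketch of what `view` does in the cell above, using deterministic values instead of random ones: the `-1` dimension is inferred from the element count, and the result shares storage with the original tensor.

```python
import torch

x = torch.arange(12.0).view(3, 4)
x_view = x.view(4, -1)      # -1 is inferred as 12 / 4 = 3
x_view[0, 0] = 42.0         # write through the view

# Both tensors point at the same underlying storage
same_storage = x.data_ptr() == x_view.data_ptr()
print(x_view.shape, x[0, 0], same_storage)
```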

### Creating Tensors

Common ways to construct a tensor:
- Tensor
- eye
- linspace
- logspace
- rand
- ones
- zeros
- ones_like
- zeros_like
- arange
- from_numpy

In [14]:
# Create a tensor from list data
torch.Tensor([1, 2, 3, 4, 5, 6])

tensor([1., 2., 3., 4., 5., 6.])

In [15]:
# Create a tensor of the given shape (the values are uninitialized memory)
torch.Tensor(2, 3)

tensor([[8.8766e-29, 4.5726e-41, 3.8228e+03],
        [3.0721e-41, 4.4842e-44, 0.0000e+00]])

In [18]:
# Create a 2x3 tensor from a nested list
t = torch.Tensor([[1, 2, 3], [4, 5, 6]])
# Inspect the tensor's shape
t.size(), t.shape

(torch.Size([2, 3]), torch.Size([2, 3]))

In [19]:
# Create a tensor from an existing shape (the values are uninitialized)
torch.Tensor(t.size())

tensor([[8.8766e-29, 4.5726e-41, 8.8766e-29],
        [4.5726e-41, 5.0000e+00, 6.0000e+00]])

In [20]:
# Create an identity matrix
torch.eye(2, 2)

tensor([[1., 0.],
        [0., 1.]])

In [23]:
# Create a matrix filled with zeros
torch.zeros(2, 3)

tensor([[0., 0., 0.],
        [0., 0., 0.]])

In [24]:
# Create 4 evenly spaced values from 1 to 10 (both ends included)
torch.linspace(1, 10, 4)

tensor([ 1.,  4.,  7., 10.])

In [25]:
# Draw random numbers from the uniform distribution on [0, 1)
torch.rand(2, 3)

tensor([[0.0879, 0.0195, 0.1620],
        [0.2268, 0.7040, 0.7683]])

In [26]:
# Draw random numbers from the standard normal distribution
torch.randn(2, 3)

tensor([[-0.9170, -0.2158, -0.1832],
        [ 0.8921,  0.4182, -2.2238]])

In [27]:
# Return a tensor of zeros with the same shape as the given tensor
torch.zeros_like(torch.rand(2, 3))

tensor([[0., 0., 0.],
        [0., 0., 0.]])
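
Several constructors from the list above are not exercised in the cells. The sketch below covers them, and also contrasts `torch.tensor` (dtype inferred from the data) with `torch.Tensor` (the default dtype, assumed here to be the stock `float32`). `torch.empty` is shown as the explicit way to request uninitialized memory.

```python
import numpy as np
import torch

ints = torch.tensor([1, 2, 3])        # dtype inferred from data: int64
floats = torch.Tensor([1, 2, 3])      # default dtype: float32

steps = torch.arange(0, 10, 2)        # 0, 2, 4, 6, 8 (end is exclusive)
powers = torch.logspace(0, 2, 3)      # 10**0, 10**1, 10**2
ones = torch.ones(2, 3)
like = torch.ones_like(ints)          # same shape and dtype as `ints`
from_np = torch.from_numpy(np.zeros((2, 2)))  # float64, shares memory with the array
uninit = torch.empty(2, 3)            # uninitialized, like torch.Tensor(2, 3)

print(ints.dtype, floats.dtype, steps, powers)
```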

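### Modifying Tensors

When processing data or building network layers, you often need to inspect or change a Tensor's shape. The relevant functions are:
- size: same as shape
- numel: number of elements in the tensor
- view: similar to reshape, but always returns a view that shares data with the original and therefore requires compatible strides
- resize_: similar to view, but in place; it reallocates memory when the new size exceeds the current storage
- item: returns the value of a one-element tensor as a Python number
- unsqueeze: inserts a dimension of size `1` at the given position
- squeeze: removes a dimension of size `1` at the given position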

In [28]:
# Create a matrix of shape 2x3
x = torch.randn(2, 3)

In [29]:
x.size(), x.shape

(torch.Size([2, 3]), torch.Size([2, 3]))

In [30]:
# Number of dimensions of x
x.dim()

2

In [31]:
# view returns a new 3x2 tensor; it is not assigned here, so x itself stays 2x3
x.view(3, 2)
x

tensor([[ 0.1062, -2.6887,  0.2993],
        [ 0.1608, -0.6922,  0.6646]])

In [32]:
# Flatten x into a 1-D vector
y = x.view(-1)
y, y.shape

(tensor([ 0.1062, -2.6887,  0.2993,  0.1608, -0.6922,  0.6646]),
 torch.Size([6]))

In [35]:
# Add a dimension at position 0
z = torch.unsqueeze(y, 0)
z, z.dim()

(tensor([[ 0.1062, -2.6887,  0.2993,  0.1608, -0.6922,  0.6646]]), 2)

In [36]:
# Count the elements of z
z.numel()

6
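
The difference between `view` and `reshape`, along with `squeeze` and `item` from the list above, can be illustrated as follows. This is a sketch with illustrative names; it relies on the fact that a transposed tensor is non-contiguous, so `view` refuses to flatten it while `reshape` falls back to a copy.

```python
import torch

x = torch.arange(6.0).view(2, 3)
xt = x.t()                       # transpose: shape (3, 2), non-contiguous
try:
    xt.view(-1)                  # view needs compatible strides
    view_failed = False
except RuntimeError:
    view_failed = True
flat = xt.reshape(-1)            # reshape copies when it has to

row = torch.unsqueeze(flat, 0)   # shape (1, 6)
back = row.squeeze(0)            # shape (6,)
total = x.sum().item()           # one-element tensor -> Python float

print(view_failed, flat, row.shape, back.shape, total)
```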

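### Indexing

The relevant functions are:
- index_select: selects rows or columns along the specified dimension
- nonzero: returns the indices of the nonzero elements
- masked_select: selects elements using a boolean mask
- gather: picks values along the specified dimension; the output has the same shape as index
- scatter_: the inverse of gather; writes values into the positions given by index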

In [38]:
# Set a random seed
torch.manual_seed(100)
# Create a matrix of shape 2x3
x = torch.randn(2, 3)
x

tensor([[ 0.3607, -0.2859, -0.3938],
        [ 0.2429, -1.3833, -2.3134]])

In [39]:
# Index the first row, all columns
x[0, :]

tensor([ 0.3607, -0.2859, -0.3938])

In [40]:
# Get the last column
x[:, -1]

tensor([-0.3938, -2.3134])

In [41]:
mask = x > 0
# Select the values greater than 0
torch.masked_select(x, mask)

tensor([0.3607, 0.2429])

In [42]:
# Get the (row, column) indices of the nonzero (True) entries
torch.nonzero(mask)

tensor([[0, 0],
        [1, 0]])

In [51]:
index = torch.LongTensor([[0, 1, 1]])
index

tensor([[0, 1, 1]])

In [52]:
torch.gather(x, 0, index)

tensor([[ 0.3607, -1.3833, -2.3134]])

In [53]:
index = torch.LongTensor([[0, 1, 1], [1, 1, 1]])
a = torch.gather(x, 1, index)
a

tensor([[ 0.3607, -0.2859, -0.2859],
        [-1.3833, -1.3833, -1.3833]])

In [54]:
z = torch.zeros(2, 3)
z.scatter_(1, index, a)

tensor([[ 0.3607, -0.2859,  0.0000],
        [ 0.0000, -1.3833,  0.0000]])
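
The semantics of `gather` and `scatter_` along `dim=1` are easier to see with explicit loops. The sketch below uses fixed values rather than the random `x` above: `gather` reads `x[i, index[i, j]]` into position `[i, j]`, and `scatter_` writes in the opposite direction.

```python
import torch

x = torch.tensor([[10., 11., 12.],
                  [20., 21., 22.]])
index = torch.tensor([[0, 1, 1],
                      [1, 1, 1]])

gathered = torch.gather(x, 1, index)

# The same result with explicit loops: out[i, j] = x[i, index[i, j]]
manual = torch.empty_like(gathered)
for i in range(index.size(0)):
    for j in range(index.size(1)):
        manual[i, j] = x[i, index[i, j]]

# scatter_ goes the other way: z[i, index[i, j]] = gathered[i, j]
z = torch.zeros(2, 3)
z.scatter_(1, index, gathered)

print(gathered)
print(z)
```

Positions of `z` that no index points to keep their initial value of zero, which is why the scatter result is not a full reconstruction of `x`.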

In [2]:
x = torch.arange(12)
x

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [3]:
x.shape, torch.Size([12])

(torch.Size([12]), torch.Size([12]))
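
`index_select` from the list at the top of this section is not demonstrated in the cells. A minimal sketch, reusing `torch.arange(12)` reshaped to 3x4:

```python
import torch

x = torch.arange(12).view(3, 4)
rows = torch.index_select(x, 0, torch.tensor([0, 2]))   # rows 0 and 2
cols = torch.index_select(x, 1, torch.tensor([1, 3]))   # columns 1 and 3

# Indexing with a list of row numbers gives the same rows
same = torch.equal(rows, x[[0, 2]])
print(rows)
print(cols)
```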

### Broadcasting

In [56]:
A = np.arange(0, 40, 10).reshape(4, 1)
B = np.arange(0, 3)

In [57]:
A, B

(array([[ 0],
        [10],
        [20],
        [30]]),
 array([0, 1, 2]))

In [58]:
# Convert the ndarrays to Tensors
A1 = torch.from_numpy(A)  # shape 4x1
B1 = torch.from_numpy(B)  # shape 3

In [59]:
A1, B1

(tensor([[ 0],
         [10],
         [20],
         [30]]),
 tensor([0, 1, 2]))

In [60]:
A1 + B1

tensor([[ 0,  1,  2],
        [10, 11, 12],
        [20, 21, 22],
        [30, 31, 32]])

In [62]:
B2 = B1.unsqueeze(0)

In [64]:
B1, B2

(tensor([0, 1, 2]), tensor([[0, 1, 2]]))

In [65]:
A1.expand(4, 3)

tensor([[ 0,  0,  0],
        [10, 10, 10],
        [20, 20, 20],
        [30, 30, 30]])

In [66]:
A2 = A1.expand(4, 3)
B3 = B2.expand(4, 3)

In [70]:
A2 + B3

tensor([[ 0,  1,  2],
        [10, 11, 12],
        [20, 21, 22],
        [30, 31, 32]])
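
The manual `unsqueeze` plus `expand` steps above are what broadcasting does implicitly. As a sketch, `torch.broadcast_tensors` exposes the broadcast result, and the stride of an expanded tensor shows that no data is copied.

```python
import torch

A = torch.arange(0, 40, 10).view(4, 1)   # shape (4, 1)
B = torch.arange(3)                      # shape (3,)

# Shapes are aligned from the trailing dimension; size-1 dims are stretched
A_b, B_b = torch.broadcast_tensors(A, B)

expanded = A.expand(4, 3)
# expand does not copy data: the stretched dimension has stride 0
stride = expanded.stride()

print(A_b.shape, B_b.shape, stride)
```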

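### Element-wise Operations

Most mathematical operations are element-wise: the output has the same shape as the input. Common element-wise operations:
- abs/add
- addcmul
- ceil/floor
- clamp
- exp/log/pow
- mul/neg
- sigmoid/tanh/softmax
- sign/sqrt

These operations all create a new tensor. For an in-place operation, use the trailing-underscore variant of the method, for example `abs_`.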

In [71]:
t = torch.randn(1, 3)
t1 = torch.randn(3, 1)
t2 = torch.randn(1, 3)
t

tensor([[-0.3172, -0.8660,  1.7482]])

In [82]:
# Compute the sigmoid
torch.sigmoid(t)

tensor([[0.4214, 0.2961, 0.8517]])

In [72]:
t1

tensor([[-0.2759],
        [-0.9755],
        [ 0.4790]])

In [73]:
t2

tensor([[-2.3652, -0.8047,  0.6587]])

In [81]:
0.1 * (t1 / t2)

tensor([[ 0.0117,  0.0343, -0.0419],
        [ 0.0412,  0.1212, -0.1481],
        [-0.0203, -0.0595,  0.0727]])

In [79]:
-0.2759 / -2.3652, -0.2759 / -0.8047, -0.2759 / 0.6587

(0.11664975477760864, 0.3428606934261215, -0.41885532108698953)

In [80]:
# t + 0.1 * (t1 / t2); value is passed as a keyword because the positional form is deprecated
torch.addcdiv(t, t1, t2, value=0.1)

tensor([[-0.3055, -0.8318,  1.7063],
        [-0.2760, -0.7448,  1.6001],
        [-0.3374, -0.9256,  1.8209]])

In [83]:
# Clamp t to the range [0, 1]
torch.clamp(t, 0, 1)

tensor([[0., 0., 1.]])

In [84]:
# Add 2 to t in place
t.add_(2)

tensor([[1.6828, 1.1340, 3.7482]])
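
The difference between an out-of-place method and its underscore variant, plus `addcmul` from the list above, can be sketched with fixed values:

```python
import torch

t = torch.tensor([-1.0, 2.0, -3.0])
before = t.clone()
out = t.abs()            # new tensor, t is unchanged
unchanged = torch.equal(t, before)
t.abs_()                 # in place, t is overwritten

base = torch.zeros(3)
a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0, 6.0])
fused = torch.addcmul(base, a, b, value=0.5)   # base + 0.5 * a * b

print(out, t, fused)
```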

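### Reduction Operations

As the name suggests, reduction operations aggregate the input, for example by summing it. The output shape usually differs from the input shape and is typically smaller. A reduction can apply to the whole tensor or along a given dimension.

Common reduction operations:
- cumprod: cumulative product of t along the given dimension
- cumsum: cumulative sum of t along the given dimension
- dist: p-norm of the difference between a and b
- mean/median: mean / median
- std/var: standard deviation / variance
- norm: p-norm of t
- prod/sum: product / sum of all elements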

In [86]:
a = torch.linspace(0, 10, 6)
a

tensor([ 0.,  2.,  4.,  6.,  8., 10.])

In [87]:
# Use view to turn a into a 2x3 matrix
a = a.view((2, 3))
a

tensor([[ 0.,  2.,  4.],
        [ 6.,  8., 10.]])

In [88]:
# Sum along dim=0 (down the rows), giving one value per column
b = a.sum(dim=0)  # b has shape [3]
b

tensor([ 6., 10., 14.])

In [89]:
# Sum along dim=0 and keep the reduced dimension with size 1
b = a.sum(dim=0, keepdim=True)  # b has shape [1, 3]
b

tensor([[ 6., 10., 14.]])
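
The cells above only show `sum`. The sketch below runs a few of the other listed functions on a small fixed tensor. Note that `cumsum` and `cumprod` keep the input shape, unlike the other reductions.

```python
import torch

a = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])
running_sum = a.cumsum(dim=1)     # [[1, 3, 6], [4, 9, 15]]
running_prod = a.cumprod(dim=1)   # [[1, 2, 6], [4, 20, 120]]
col_mean = a.mean(dim=0)          # [2.5, 3.5, 4.5]

l2 = torch.norm(a[0] - a[1], p=2) # 2-norm of the difference
d = torch.dist(a[0], a[1], p=2)   # same quantity computed by dist

print(running_sum, running_prod, col_mean, l2, d)
```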

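### Comparison Operations

Comparison operations are usually element-wise; some compare along a given dimension. Commonly used comparison functions:
- eq: element-wise equality
- equal: whether two tensors have the same shape and values
- ge/le/gt/lt: greater than or equal / less than or equal / greater than / less than
- max/min
- topk: the k largest values along the given dimension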

In [90]:
import torch

x = torch.linspace(0, 10, 6).view(2, 3)

x

tensor([[ 0.,  2.,  4.],
        [ 6.,  8., 10.]])

In [91]:
# Maximum over all elements
torch.max(x)  # the result is 10

tensor(10.)

In [92]:
# Maximum along dim=0 (over the rows)
torch.max(x, dim=0)  # values are [6, 8, 10]

torch.return_types.max(
values=tensor([ 6.,  8., 10.]),
indices=tensor([1, 1, 1]))

In [95]:
# Top-1 value along dim=0 (k=1)
torch.topk(x, 1, dim=0)  # values are [[6, 8, 10]], indices are [[1, 1, 1]]

torch.return_types.topk(
values=tensor([[ 6.,  8., 10.]]),
indices=tensor([[1, 1, 1]]))
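
A short sketch of the remaining functions in the list: `eq` returns a boolean tensor while `equal` returns a single Python bool, and `topk` with `k=2` returns the two largest values per row together with their indices.

```python
import torch

x = torch.tensor([[0., 2., 4.],
                  [6., 8., 10.]])
y = x.clone()

elementwise = torch.eq(x, y)      # bool tensor, same shape as x
whole = torch.equal(x, y)         # single Python bool
mask = torch.ge(x, 4)             # x >= 4
values, indices = torch.topk(x, 2, dim=1)   # two largest per row

print(elementwise, whole, mask, values, indices)
```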

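### Matrix Operations

There are two main kinds of multiplication: element-wise multiplication and dot-product (matrix) multiplication. Commonly used matrix functions:
- dot(t1, t2): inner (dot) product of two 1-D tensors
- mm: matrix multiplication
- bmm: batched matrix multiplication
- mv: matrix-vector multiplication
- t: transpose
- svd: computes the SVD of t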

In [96]:
a = torch.tensor([2, 3])
b = torch.tensor([3, 4])

torch.dot(a, b)

tensor(18)

In [98]:
x = torch.randint(10, (2, 3))
y = torch.randint(6, (3, 4))
x, y

(tensor([[4, 0, 5],
         [7, 5, 9]]),
 tensor([[3, 3, 1, 3],
         [0, 1, 3, 3],
         [0, 2, 5, 5]]))

In [99]:
torch.mm(x, y)

tensor([[12, 22, 29, 37],
        [21, 44, 67, 81]])

In [100]:
x = torch.randint(10, (2, 2, 3))
y = torch.randint(6, (2, 3, 4))
x, y

(tensor([[[8, 3, 6],
          [1, 5, 5]],
 
         [[0, 4, 3],
          [8, 8, 3]]]),
 tensor([[[1, 3, 0, 2],
          [0, 0, 0, 4],
          [3, 0, 5, 2]],
 
         [[5, 2, 0, 5],
          [3, 0, 0, 0],
          [3, 5, 5, 3]]]))

In [101]:
x.shape, y.shape

(torch.Size([2, 2, 3]), torch.Size([2, 3, 4]))

In [102]:
torch.bmm(x, y)

tensor([[[26, 24, 30, 40],
         [16,  3, 25, 32]],

        [[21, 15, 15,  9],
         [73, 31, 15, 49]]])
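
The cells above cover `dot`, `mm` and `bmm`. The sketch below adds matrix-vector multiplication, transpose, element-wise multiplication, and an SVD round trip. It uses `torch.svd`, the function named in this section; newer PyTorch releases recommend `torch.linalg.svd` instead.

```python
import torch

m = torch.tensor([[1., 2.],
                  [3., 4.],
                  [5., 6.]])
v = torch.tensor([1., 1.])

mv = torch.mv(m, v)              # matrix-vector product: [3., 7., 11.]
mt = m.t()                       # transpose, shape (2, 3)
hadamard = m * m                 # element-wise product

U, S, V = torch.svd(m)           # reduced SVD: m = U @ diag(S) @ V.T
rebuilt = U @ torch.diag(S) @ V.t()

print(mv, mt.shape, torch.allclose(rebuilt, m, atol=1e-5))
```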

### CUDA Computation

In [104]:
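# x is (2, 2, 3) and y is (2, 3, 4): these shapes cannot broadcast, so x + y would raise.
# Batch-multiply on the GPU instead; cast to float because integer matmul may be unsupported on CUDA.
if torch.cuda.is_available():
    x = x.cuda()
    y = y.cuda()
    z = torch.bmm(x.float(), y.float())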
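
A common device-agnostic pattern, sketched here under the assumption that the code should also run on machines without a GPU: pick the device once, create or move tensors onto it, and bring the result back to the CPU when needed.

```python
import torch

# Fall back to the CPU when no CUDA device is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.rand(2, 3, device=device)   # created directly on the device
b = torch.rand(2, 3).to(device)       # created on CPU, then moved
c = (a + b).cpu()                     # compute on the device, copy back

print(device, c.shape)
```

Tensors must live on the same device before they can be combined, which is why both `a` and `b` are placed on `device` first.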

### Data Operations

In [4]:
os.makedirs(os.path.join("./", "data"), exist_ok=True)
data_file = os.path.join("./", "data", "house_tiny.csv")
with open(data_file, "w") as f:
    f.write("NumRooms,Alley,Price\n")  # column names
    f.write("NA,Pave,127500\n")  # each row is one data sample
    f.write("2,NA,106000\n")
    f.write("4,NA,178100\n")
    f.write("NA,NA,140000\n")

In [5]:
data = pd.read_csv(data_file)
data

   NumRooms Alley   Price
0       NaN  Pave  127500
1       2.0   NaN  106000
2       4.0   NaN  178100
3       NaN   NaN  140000


In [6]:
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, -1]

In [7]:
inputs

   NumRooms Alley
0       NaN  Pave
1       2.0   NaN
2       4.0   NaN
3       NaN   NaN


In [8]:
outputs

0    127500
1    106000
2    178100
3    140000
Name: Price, dtype: int64

In [9]:
# Fill missing numeric values with the column mean; numeric_only=True skips the string column Alley
inputs = inputs.fillna(inputs.mean(numeric_only=True))
inputs


   NumRooms Alley
0       3.0  Pave
1       2.0   NaN
2       4.0   NaN
3       3.0   NaN


In [26]:
inputs = pd.get_dummies(inputs, dummy_na=True)
inputs

   NumRooms  Alley_Pave  Alley_nan
0       3.0           1          0
1       2.0           0          1
2       4.0           0          1
3       3.0           0          1


In [27]:
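# Convert to float explicitly so the conversion also works when get_dummies returns boolean columns
X, y = torch.tensor(inputs.to_numpy(dtype=float)), torch.tensor(outputs.to_numpy())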
X, y

(tensor([[3., 1., 0.],
         [2., 0., 1.],
         [4., 0., 1.],
         [3., 0., 1.]], dtype=torch.float64),
 tensor([127500, 106000, 178100, 140000]))
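
The steps in this section can be condensed into one self-contained sketch that reads the same CSV content from memory instead of from disk. Two details here are assumptions rather than part of the cells above: the use of `io.StringIO`, and the final cast to `float32`, which is the default dtype of most PyTorch layers.

```python
import io

import pandas as pd
import torch

csv_text = (
    "NumRooms,Alley,Price\n"
    "NA,Pave,127500\n"
    "2,NA,106000\n"
    "4,NA,178100\n"
    "NA,NA,140000\n"
)
data = pd.read_csv(io.StringIO(csv_text))
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]

# Mean-impute the numeric column, then one-hot encode the categorical one
inputs = inputs.fillna(inputs.mean(numeric_only=True))
inputs = pd.get_dummies(inputs, dummy_na=True)

X = torch.tensor(inputs.to_numpy(dtype=float), dtype=torch.float32)
y = torch.tensor(outputs.to_numpy(), dtype=torch.float32)
print(X, y)
```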