<a href="https://colab.research.google.com/github/Auzzer/QuickPytorchTutor/blob/main/QuickPytorchTutor.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

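# Creating Tensors and Basic Operations
A tensor is PyTorch's basic data type. It resembles a NumPy array, but it can also be placed on a GPU, where parallel execution speeds up the computations that deep learning relies on.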

## Creation

In [2]:
import torch

In [3]:
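# A tensor usually has to be initialized when it is created.
# torch.empty allocates memory without initializing it, so its values are arbitrary.
x1 = torch.empty(5, 3)  # a 5x3 tensor
print(x1)

# Randomly initialized tensor (uniform on [0, 1))
x2 = torch.rand(5, 3)
print(x2)

# Sometimes the data type has to be specified at creation time:
x3 = torch.zeros(5, 3, dtype=torch.long)  # zero-filled matrix of dtype long
print(x3)

# Convert existing data (here a nested list) into a tensor
x4 = torch.tensor([[1, 2, 3], [6, 7, 8]])
print(x4)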

tensor([[5.1744e+08, 3.0829e-41, 3.3631e-44],
        [0.0000e+00,        nan, 3.0829e-41],
        [1.1578e+27, 1.1362e+30, 7.1547e+22],
        [4.5828e+30, 1.2121e+04, 7.1846e+22],
        [9.2198e-39, 7.0374e+22, 1.6660e+08]])
tensor([[0.7525, 0.1940, 0.1311],
        [0.9557, 0.5275, 0.4971],
        [0.4305, 0.8307, 0.8196],
        [0.3001, 0.2640, 0.5383],
        [0.1618, 0.0167, 0.1394]])
tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])
tensor([[1, 2, 3],
        [6, 7, 8]])


In [4]:
# Get the size
print(x4.size())  # two rows, three columns

torch.Size([2, 3])


## Basic Operations


### Addition

In [5]:
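# The simplest addition
x = torch.rand(5, 3)
y = torch.rand(5, 3)
print(x + y)
# or equivalently
print(torch.add(x, y))

# torch.add() can write its result into a preallocated tensor through the out argument
result = torch.empty(5, 3)
torch.add(x, y, out=result)
print("result: \n", result)

# Note: Tensor.add() is out-of-place. It returns a new tensor and leaves the
# original unchanged, so the loop below discards every sum.
y1 = torch.zeros(5, 3)
y2 = torch.empty(5, 3)
for i in range(0, 5):
  xi = torch.rand(5, 3)
  y1.add(xi)
  y2.add(xi)

print("y1\n", y1)
print("y2\n", y2)
# y1 is still all zeros, and y2 still holds whatever happened to be in its
# uninitialized memory. To accumulate, use the in-place add_() and start from
# torch.zeros, never from torch.empty.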

tensor([[0.7991, 0.6582, 0.1950],
        [1.1997, 1.6182, 1.2461],
        [0.7578, 0.9874, 1.3339],
        [0.8957, 0.8560, 0.4414],
        [0.2321, 0.4849, 1.0495]])
tensor([[0.7991, 0.6582, 0.1950],
        [1.1997, 1.6182, 1.2461],
        [0.7578, 0.9874, 1.3339],
        [0.8957, 0.8560, 0.4414],
        [0.2321, 0.4849, 1.0495]])
result: 
 tensor([[0.7991, 0.6582, 0.1950],
        [1.1997, 1.6182, 1.2461],
        [0.7578, 0.9874, 1.3339],
        [0.8957, 0.8560, 0.4414],
        [0.2321, 0.4849, 1.0495]])
y1
 tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
y2
 tensor([[5.1746e+08, 3.0829e-41, 1.9498e-01],
        [1.1997e+00, 1.6182e+00, 1.2461e+00],
        [7.5777e-01, 9.8741e-01, 1.3339e+00],
        [8.9569e-01, 8.5599e-01, 4.4136e-01],
        [2.3212e-01, 4.8486e-01, 1.0495e+00]])
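
Summing more than two tensors can be sketched in two ways: accumulating in place with `add_()`, or stacking the tensors and reducing over the new dimension. The names `terms`, `total`, and `stacked_sum` are illustrative and not part of the original notebook.

```python
import torch

torch.manual_seed(0)  # fixed seed so the comparison is reproducible
terms = [torch.rand(5, 3) for _ in range(5)]

# In-place accumulation: add_() modifies `total` itself.
total = torch.zeros(5, 3)
for t in terms:
    total.add_(t)  # equivalent to: total += t

# Alternative: stack along a new leading dimension and sum over it.
stacked_sum = torch.stack(terms).sum(dim=0)

print(torch.allclose(total, stacked_sum))
```

Both approaches give the same result; the stacked version avoids the Python loop at the cost of one extra temporary tensor.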


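### Multiplication
Matrices have two kinds of product, the element-wise (Hadamard) product and the matrix product, and tensors provide a function for each.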

In [6]:
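# Element-wise product: both tensors must have the same (or broadcastable) shape
x1 = torch.rand(5, 3)
x2 = torch.rand(5, 3)
y = torch.mul(x1, x2)  # as with add, the method form x1.mul(x2) also works
print("x1:\n", x1, "\n x2:\n", x2, "\n y:\n",y)

# Matrix product: (2, 3) times (3, 4) gives (2, 4)
x1 = torch.rand(2, 3)
x2 = torch.rand(3, 4)
y = torch.mm(x1, x2)  # as with add, the method form x1.mm(x2) also works
print("x1:\n", x1, "\nx2:\n", x2, "\ny:\n",y)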

x1:
 tensor([[0.9683, 0.8925, 0.7538],
        [0.3389, 0.7155, 0.7408],
        [0.8868, 0.7700, 0.4897],
        [0.5311, 0.6848, 0.7789],
        [0.7912, 0.3158, 0.3047]]) 
 x2:
 tensor([[0.2336, 0.0887, 0.3054],
        [0.1792, 0.6720, 0.3686],
        [0.7706, 0.4090, 0.2499],
        [0.8003, 0.9167, 0.6720],
        [0.1173, 0.1767, 0.6372]]) 
 y:
 tensor([[0.2262, 0.0792, 0.2302],
        [0.0607, 0.4809, 0.2731],
        [0.6834, 0.3150, 0.1224],
        [0.4250, 0.6278, 0.5234],
        [0.0928, 0.0558, 0.1941]])
x1:
 tensor([[0.6251, 0.1818, 0.3952],
        [0.5657, 0.6150, 0.8426]]) 
x2:
 tensor([[0.4008, 0.3040, 0.7904, 0.2116],
        [0.8709, 0.7667, 0.6683, 0.0707],
        [0.2428, 0.4073, 0.1466, 0.1071]]) 
y:
 tensor([[0.5048, 0.4904, 0.6735, 0.1874],
        [0.9669, 0.9867, 0.9816, 0.2534]])
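
The same two products are also available as operators. The following small sketch uses hand-picked values (not from the notebook) so the difference is easy to verify by hand.

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.tensor([[10., 20.], [30., 40.]])

elementwise = a * b  # same as torch.mul(a, b)
matrix = a @ b       # same as torch.mm(a, b) for 2-D tensors

print(elementwise)  # [[10, 40], [90, 160]]
print(matrix)       # [[70, 100], [150, 220]]
```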


### NumPy Counterparts and Conversion

#### Indexing
torch supports NumPy-style indexing on tensors.


In [7]:
x = torch.tensor([[1,2,3],[3,4,5]])
print(x[:,0])
print(x[1,:])

tensor([1, 3])
tensor([3, 4, 5])


#### The counterpart of NumPy's reshape

In [8]:
x = torch.randn(5, 3)
y = x.view(3, 5)
z = x.view(-1, 5) # -1 tells view to infer that dimension from the other one
print(x)
print(y)
print(z)

tensor([[-1.2559,  0.6462,  0.2519],
        [ 0.6353,  1.2795, -0.3136],
        [ 0.0864, -0.9788, -0.5080],
        [ 0.8784, -0.2250,  2.0877],
        [-2.0187,  1.3974, -0.0423]])
tensor([[-1.2559,  0.6462,  0.2519,  0.6353,  1.2795],
        [-0.3136,  0.0864, -0.9788, -0.5080,  0.8784],
        [-0.2250,  2.0877, -2.0187,  1.3974, -0.0423]])
tensor([[-1.2559,  0.6462,  0.2519,  0.6353,  1.2795],
        [-0.3136,  0.0864, -0.9788, -0.5080,  0.8784],
        [-0.2250,  2.0877, -2.0187,  1.3974, -0.0423]])
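
Converting between NumPy arrays and tensors can be sketched as follows. The variable names are illustrative. The point to watch is that `torch.from_numpy` and `Tensor.numpy()` share memory with the source on the CPU, while `torch.tensor` makes a copy.

```python
import numpy as np
import torch

arr = np.arange(6, dtype=np.float32).reshape(2, 3)

t = torch.from_numpy(arr)  # shares memory with arr (CPU only)
t.mul_(2)                  # the in-place change is visible through arr
print(arr)

back = t.numpy()           # also shares memory with t
copy = torch.tensor(arr)   # torch.tensor always copies the data

arr[0, 0] = 100.0          # changes t and back, but not copy
print(t[0, 0].item(), back[0, 0], copy[0, 0].item())
```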


### In-Place Operations
Any operation whose name ends in "_" modifies the tensor in place.

In [9]:
x1 = torch.rand(5, 3)
print("x1 before the in-place op:\n", x1)
x2 = torch.rand(5, 3)
y = x1.add_(x2)
print("x1 after the in-place op:\n", x1)
print(y)

x1 before the in-place op:
 tensor([[0.4366, 0.8698, 0.8413],
        [0.3547, 0.5832, 0.8633],
        [0.8586, 0.1226, 0.3494],
        [0.3807, 0.6102, 0.3383],
        [0.8489, 0.5306, 0.7302]])
x1 after the in-place op:
 tensor([[0.8861, 0.9280, 1.5115],
        [1.1984, 0.6268, 1.8247],
        [1.4088, 0.5498, 0.7248],
        [1.2823, 1.2932, 0.8999],
        [1.6692, 1.0823, 0.8780]])
tensor([[0.8861, 0.9280, 1.5115],
        [1.1984, 0.6268, 1.8247],
        [1.4088, 0.5498, 0.7248],
        [1.2823, 1.2932, 0.8999],
        [1.6692, 1.0823, 0.8780]])


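The value of x1 has changed and now equals y, because add_() returns the modified tensor itself.<br>
For more tensor operations, see the official documentation: https://pytorch.org/docs/stable/torch.html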


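## CUDA Acceleration
One concept first:
to speed up computation, data can be moved onto a CUDA device (a GPU).<br>
Here we move a tensor onto the GPU. The first step is to check whether a CUDA device is available.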

In [10]:
import torch
x = torch.rand(5, 3)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move an existing tensor to the device, then create another tensor directly on it
x = x.to(device)
y = torch.ones_like(x, device=device)
z = x + y
print(z)
print(z.to("cpu", torch.double))


tensor([[1.6361, 1.8581, 1.3978],
        [1.5559, 1.8863, 1.5595],
        [1.6689, 1.3454, 1.1014],
        [1.1323, 1.4744, 1.4460],
        [1.8029, 1.6676, 1.2836]])
tensor([[1.6361, 1.8581, 1.3978],
        [1.5559, 1.8863, 1.5595],
        [1.6689, 1.3454, 1.1014],
        [1.1323, 1.4744, 1.4460],
        [1.8029, 1.6676, 1.2836]], dtype=torch.float64)


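# Training a Neural Network
A typical neural network workflow has the following parts:

1.   Data processing
2.   Network definition
3.   Training strategy
4.   Visualization
5.   Pre-train & Fine-Tune (optional)

We cover each of them in turn.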



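## Data Processing


### Loading Data

Comparative experiments usually rely on public datasets. torchvision ships loaders for several common ones, including:

* MNIST
* ImageNet
* COCO
* CIFAR

and others.<br>
They are available through torchvision.datasets. torchvision can be installed with: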


```
pip install torchvision
```
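torchvision.datasets can be thought of as a set of Dataset classes predefined by the PyTorch team. They handle downloading and parsing for many image datasets, so we can use them directly, as follows: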





In [11]:
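import torchvision.datasets as datasets

trainset = datasets.MNIST(root="./data",
              train=True,      # True selects the training split, False the test split
              download=True,   # download the data if it is not already under root
              transform=None)  # optional transform (e.g. augmentation) applied to each image
# Use the test split for validation so that it does not overlap with trainset
validset = datasets.MNIST(root="./data", train=False, download=False, transform=None)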

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz




Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz




Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz




Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz




Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw

Processing...
Done!




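Often we also need a custom dataset for our own task. The following shows how to build one.

Larger projects are usually split across several Python scripts, so PyTorch code typically wraps data loading in classes that are easy to import and reuse. This takes two steps: 1. define the dataset, 2. create the data loader.<br>
Note: for convenience, the example uses the [Dogs vs. Cats dataset](https://www.kaggle.com/c/dogs-vs-cats) from Kaggle.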

In [11]:
import glob
import os
import zipfile
from PIL import Image
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split  # used for the train/validation split below
from torch.utils.data import DataLoader, Dataset
from torchvision import datasets, transforms
import random
import torch
import numpy as np

os.makedirs('data', exist_ok=True)

train_dir = 'data/train'
test_dir = 'data/test'

"""
Unzip the dataset (run once):
with zipfile.ZipFile('train.zip') as train_zip:
    train_zip.extractall('data')

with zipfile.ZipFile('test.zip') as test_zip:
    test_zip.extractall('data')
"""


train_list = glob.glob(os.path.join(train_dir,'*.jpg'))
test_list = glob.glob(os.path.join(test_dir, '*.jpg'))


print(f"Train Data: {len(train_list)}")
print(f"Test Data: {len(test_list)}")


# Files are named like "cat.0.jpg"; the part before the first dot is the label
labels = [os.path.basename(path).split('.')[0] for path in train_list]


# Show nine randomly chosen training images
random_idx = np.random.randint(0, len(train_list), size=9)
fig, axes = plt.subplots(3, 3, figsize=(16, 12))

for idx, ax in zip(random_idx, axes.ravel()):
    img = Image.open(train_list[idx])
    ax.set_title(labels[idx])
    ax.imshow(img)

## Split

seed = 1010  # fixed seed so the random split is reproducible
def seed_everything(seed):
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True

seed_everything(seed)
train_list, valid_list = train_test_split(train_list,test_size=0.2,stratify=labels,random_state=seed)


print(f"Train Data: {len(train_list)}")
print(f"Validation Data: {len(valid_list)}")
print(f"Test Data: {len(test_list)}")



## Image Augmentation



train_transforms = transforms.Compose(
    [
        transforms.Resize((224, 224)),
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
    ]
)

# Validation and test data should be processed deterministically,
# so no random augmentation is applied here.
val_transforms = transforms.Compose(
    [
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ]
)


test_transforms = transforms.Compose(
    [
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ]
)



## Load Datasets



class CatsDogsDataset(Dataset):
    def __init__(self, file_list, transform=None):
        self.file_list = file_list
        self.transform = transform

    def __len__(self):
        return len(self.file_list)

    def __getitem__(self, idx):
        img_path = self.file_list[idx]
        img = Image.open(img_path).convert("RGB")  # force three channels
        if self.transform is not None:
            img = self.transform(img)

        # File names look like "dog.123.jpg": dog -> 1, anything else -> 0
        label = os.path.basename(img_path).split(".")[0]
        label = 1 if label == "dog" else 0

        return img, label




train_data = CatsDogsDataset(train_list, transform=train_transforms)
valid_data = CatsDogsDataset(valid_list, transform=val_transforms)
test_data = CatsDogsDataset(test_list, transform=test_transforms)



batch_size = 64  # example value; the original cell left batch_size undefined

# Only the training data needs to be shuffled
train_loader = DataLoader(dataset=train_data, batch_size=batch_size, shuffle=True)
valid_loader = DataLoader(dataset=valid_data, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(dataset=test_data, batch_size=batch_size, shuffle=False)



print(len(train_data), len(train_loader))



print(len(valid_data), len(valid_loader))
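
The Dataset and DataLoader pattern above can be tried without downloading any images. The sketch below uses a hypothetical `ToyDataset` filled with random tensors (an assumption made for illustration, not part of the notebook) to show what `len(dataset)`, `len(loader)`, and a single batch look like.

```python
import math

import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical dataset of random 3x8x8 "images" with binary labels."""

    def __init__(self, n_samples):
        gen = torch.Generator().manual_seed(0)
        self.images = torch.rand(n_samples, 3, 8, 8, generator=gen)
        self.labels = torch.randint(0, 2, (n_samples,), generator=gen)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.images[idx], self.labels[idx]


data = ToyDataset(100)
loader = DataLoader(data, batch_size=32, shuffle=True)

# len(data) counts samples, len(loader) counts batches: ceil(100 / 32) = 4
print(len(data), len(loader))

images, labels = next(iter(loader))
print(images.shape, labels.shape)
```

The last batch holds the remaining 4 samples; passing `drop_last=True` to `DataLoader` would discard it instead.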


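### Data Preprocessing
Data augmentation is generally applied even to large datasets such as ImageNet. The transforms module in torchvision has many built-in operations. Only a few are shown here; the rest are described in the torchvision.transforms documentation.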


In [12]:
from torchvision import transforms
transform = transforms.Compose([
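    transforms.RandomCrop(32, padding=4),  # pad 4 zero pixels on every side, then randomly crop to 32x32
    transforms.RandomHorizontalFlip(),  # flip horizontally with probability 0.5
    transforms.RandomRotation((-45,45)), # rotate by a random angle in [-45, 45] degrees
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)), # per-channel mean and standard deviation (MNIST has a single channel)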
])



Next, only the data loading step needs a small change:

In [13]:
trainset = datasets.MNIST(root="./data", train=True, download=False, transform= transform)  

Because torchvision already includes MNIST, the dataset can also be constructed inline and passed straight to a DataLoader.


In [14]:

train_loader = torch.utils.data.DataLoader(
        datasets.MNIST('./data', train=True, download=True, 
                       transform=transforms.Compose([
                           transforms.ToTensor(),# convert the image to a tensor
                           transforms.Normalize((0.1307,), (0.3081,))# mean and std computed on MNIST; other datasets need their own values
                       ])),

        batch_size=512, # batch_size is discussed in the training section below
        shuffle=True)

In [15]:
test_loader = torch.utils.data.DataLoader(
        datasets.MNIST('data', train=False, transform=transforms.Compose([
                           transforms.ToTensor(), 
                           transforms.Normalize((0.1307,), (0.3081,))])), 
        batch_size=512, shuffle=True)

In [16]:
import torchvision
import torch
ds_train = torchvision.datasets.MNIST(root="./data/minist/",train=True,download=True,transform=transform)
ds_valid = torchvision.datasets.MNIST(root="./data/minist/",train=False,download=True,transform=transform)

dl_train =  torch.utils.data.DataLoader(ds_train, batch_size=128, shuffle=True, num_workers=4)
dl_valid =  torch.utils.data.DataLoader(ds_valid, batch_size=128, shuffle=False, num_workers=4)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to ./data/minist/MNIST/raw/train-images-idx3-ubyte.gz




Extracting ./data/minist/MNIST/raw/train-images-idx3-ubyte.gz to ./data/minist/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to ./data/minist/MNIST/raw/train-labels-idx1-ubyte.gz




Extracting ./data/minist/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/minist/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to ./data/minist/MNIST/raw/t10k-images-idx3-ubyte.gz




Extracting ./data/minist/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/minist/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to ./data/minist/MNIST/raw/t10k-labels-idx1-ubyte.gz




Extracting ./data/minist/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/minist/MNIST/raw

Processing...
Done!




## Defining the Network

As with the data-loading code above, it is recommended to define the network as a class:



In [17]:
import torch
import torch.nn as nn # layers and modules used to define the network
import torch.nn.functional as F # functional forms of operations such as activations, pooling and padding

In [18]:
class ConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Input: batch*1*28*28 (batch samples per step, 1 input channel for grayscale, 28x28 resolution)
        # Conv2d arguments: input channels, output channels, kernel size
        self.conv1 = nn.Conv2d(1, 10, 5) # 1 input channel, 10 output channels, kernel size 5
        self.conv2 = nn.Conv2d(10, 20, 3) # 10 input channels, 20 output channels, kernel size 3
        # Linear arguments: input features, output features
        self.fc1 = nn.Linear(20*10*10, 500) # 2000 input features, 500 output features
        self.fc2 = nn.Linear(500, 10) # 500 input features, 10 output features (10 classes)
    def forward(self,x):
        in_size = x.size(0) # the batch size, 512 in this example, so x is a 512*1*28*28 tensor
        x = self.conv1(x) # batch*1*28*28 -> batch*10*24*24 (a 5x5 convolution without padding shrinks 28x28 to 24x24)
        x = F.relu(x) # batch*10*24*24 (ReLU does not change the shape)
        x = F.max_pool2d(x, 2, 2) # batch*10*24*24 -> batch*10*12*12 (2x2 pooling halves height and width)
        x = self.conv2(x) # batch*10*12*12 -> batch*20*10*10 (second convolution, kernel size 3)
        x = F.relu(x) # batch*20*10*10
        x = x.view(in_size, -1) # batch*20*10*10 -> batch*2000 (-1 lets view infer the second dimension, here 20*10*10)
        x = self.fc1(x) # batch*2000 -> batch*500
        x = F.relu(x) # batch*500
        x = self.fc2(x) # batch*500 -> batch*10
        x = F.log_softmax(x, dim=1) # compute log(softmax(x))
        return x
        return x
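
The shapes in the comments above follow from the standard output-size formula for convolution and pooling windows. The helper `conv_out` below is a hypothetical name introduced for this sketch; it reproduces the 28 -> 24 -> 12 -> 10 chain and the 2000 input features of `fc1`.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling window along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


h = 28
h = conv_out(h, kernel=5)            # conv1: 28 -> 24
h = conv_out(h, kernel=2, stride=2)  # max_pool2d(x, 2, 2): 24 -> 12
h = conv_out(h, kernel=3)            # conv2: 12 -> 10
flat = 20 * h * h                    # 20 channels after conv2

print(h, flat)  # 10 2000
```

If the input resolution or any kernel size changes, the first argument of `fc1` has to change with it.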

As noted above, moving tensors to CUDA speeds up computation. The same applies to models.

In [19]:
# Check for a CUDA device in the same way as above
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = ConvNet().to(device)
print(model)

ConvNet(
  (conv1): Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(10, 20, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=2000, out_features=500, bias=True)
  (fc2): Linear(in_features=500, out_features=10, bias=True)
)


To analyze a model's complexity we often look at its number of parameters. It is also useful to inspect the learnable parameters and their names.

In [20]:
for parameters in ConvNet().parameters(): 
# Note: parameters() must be called on an instance of the network, not on the class itself.
# ConvNet.parameters() fails because ConvNet is the class, while ConvNet() creates an instance.
# If the network is stored in a variable with net = ConvNet(), use net.parameters().
    print(parameters)

Parameter containing:
tensor([[[[-0.1209, -0.1328, -0.0506,  0.1218, -0.1203],
          [-0.0267, -0.1180, -0.1032,  0.0192, -0.1371],
          [-0.1545, -0.1602,  0.1825,  0.0311, -0.0654],
          [ 0.0973,  0.1876,  0.1746,  0.0858,  0.1358],
          [ 0.0268,  0.0340, -0.1317, -0.1366,  0.0736]]],


        [[[-0.1135,  0.1438,  0.0590, -0.0651, -0.1935],
          [ 0.1754,  0.1595,  0.1836, -0.0381,  0.0600],
          [-0.0885, -0.0765, -0.1800, -0.0833, -0.0269],
          [ 0.1486, -0.0127, -0.0748, -0.0047, -0.1790],
          [-0.1981,  0.0062,  0.1957,  0.0651, -0.1886]]],


        [[[ 0.0527, -0.0344, -0.0158,  0.0737,  0.1428],
          [-0.1406,  0.0017,  0.0115, -0.1936,  0.1217],
          [-0.0810,  0.1252,  0.0523, -0.0841, -0.0584],
          [ 0.1368,  0.1758, -0.1153, -0.0903, -0.0096],
          [-0.0268,  0.0452,  0.0971, -0.1717,  0.0281]]],


        [[[-0.0059,  0.0359, -0.0243, -0.0165, -0.0108],
          [-0.0368, -0.0658, -0.1697,  0.1558, -0.0101

In [21]:

for name,parameters in ConvNet().named_parameters():
    print(name,':',parameters.size())


conv1.weight : torch.Size([10, 1, 5, 5])
conv1.bias : torch.Size([10])
conv2.weight : torch.Size([20, 10, 3, 3])
conv2.bias : torch.Size([20])
fc1.weight : torch.Size([500, 2000])
fc1.bias : torch.Size([500])
fc2.weight : torch.Size([10, 500])
fc2.bias : torch.Size([10])
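
The total parameter count follows directly from these shapes. The sketch below copies the shapes from the output above into a plain dictionary (the name `shapes` is illustrative) and multiplies them out. With a live model, the same number should come from `sum(p.numel() for p in model.parameters())`.

```python
import math

# Shapes copied from the named_parameters() output above.
shapes = {
    "conv1.weight": (10, 1, 5, 5),
    "conv1.bias": (10,),
    "conv2.weight": (20, 10, 3, 3),
    "conv2.bias": (20,),
    "fc1.weight": (500, 2000),
    "fc1.bias": (500,),
    "fc2.weight": (10, 500),
    "fc2.bias": (10,),
}

# Number of elements per tensor is the product of its dimensions.
counts = {name: math.prod(shape) for name, shape in shapes.items()}
total = sum(counts.values())

print(f"{total:,}")  # 1,007,590
```

Almost all of the parameters sit in `fc1`, which is typical for small convolutional networks with a large flattened feature map.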


Alternatively, summary() from the third-party torchsummary package lists every layer together with its parameter count.

In [22]:

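from torchsummary import summary  # third-party package: pip install torchsummary
# Keep the model and the summary input on the same device
summary(ConvNet().to(device), input_size=(1, 28, 28), device=device.type)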

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 10, 24, 24]             260
            Conv2d-2           [-1, 20, 10, 10]           1,820
            Linear-3                  [-1, 500]       1,000,500
            Linear-4                   [-1, 10]           5,010
Total params: 1,007,590
Trainable params: 1,007,590
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.06
Params size (MB): 3.84
Estimated Total Size (MB): 3.91
----------------------------------------------------------------


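## Model Training

A training strategy generally consists of the batch size, the number of epochs, the differentiation mechanism, the optimizer, the learning rate, and so on.<br>
First, some terms:
* batch_size: the number of samples used in each iteration within an epoch. Common values are 32, 64, and 128.

* epoch: one full pass over the training set, fed to the model batch_size samples at a time.

* iteration: the parameters are updated after every batch, and each such update is one iteration.

**The number of iterations therefore follows from the dataset size, batch_size, and the number of epochs. For example,
with 10000 images, a batch_size of 100, and 20 epochs, iterations = (10000/100) $\times$ 20 = 2000.**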
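
The iteration count can be checked with a few lines of arithmetic. The function `total_iterations` is a hypothetical helper written for this sketch. It rounds up when the dataset size is not a multiple of the batch size, which matches the default DataLoader behavior of keeping the final, smaller batch.

```python
import math


def total_iterations(n_samples, batch_size, epochs, drop_last=False):
    """Number of parameter updates over the whole training run."""
    if drop_last:
        per_epoch = n_samples // batch_size
    else:
        per_epoch = math.ceil(n_samples / batch_size)
    return per_epoch * epochs


print(total_iterations(10000, 100, 20))  # the example from the text: 2000
print(total_iterations(60000, 512, 20))  # MNIST training set with batch_size=512: 2360
```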


### Automatic Differentiation
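
PyTorch computes gradients through its autograd engine: operations on tensors created with `requires_grad=True` are recorded, and `backward()` propagates gradients through that record. A minimal sketch, with values chosen so the result can be checked by hand:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()  # y = x1^2 + x2^2 + x3^2
y.backward()        # fills x.grad with dy/dx = 2x
print(x.grad)       # tensor([2., 4., 6.])

# Inside torch.no_grad() no graph is recorded, which is why evaluation
# code is usually wrapped in it.
with torch.no_grad():
    z = x * 2
print(z.requires_grad)  # False
```

Gradients accumulate across `backward()` calls, which is why training loops call `optimizer.zero_grad()` before each step.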

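### Learning Rate Settings

The two most commonly used optimizers are Adam and SGD. SGD applies one global learning rate to every parameter, while Adam adapts the step size of each parameter using running estimates of its gradient moments.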
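
Both optimizers are constructed the same way, with the learning rate passed as `lr`. The sketch below uses a tiny linear model and random data as stand-ins (assumptions made for illustration only), and the learning rates shown are example values rather than recommendations.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model = nn.Linear(4, 1)
x, y = torch.randn(16, 4), torch.randn(16, 1)

sgd = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
adam = optim.Adam(model.parameters(), lr=1e-3)  # 1e-3 is Adam's default


def one_step(optimizer):
    """Run one update and return the loss measured before the update."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()


before = one_step(sgd)
after = one_step(sgd)  # loss measured after the first update
print(before, after)
print(adam.param_groups[0]["lr"])
```

In a real training script only one optimizer would be attached to the model; two are created here only to show both constructors.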

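### Choosing the Batch Size
Leaving Batch Normalization aside, the batch size affects two things during training: the time needed to complete each epoch, and how smooth the gradient is from one iteration to the next.<br>
Mainstream deep learning frameworks by default average the loss over the instances in a mini-batch before backpropagating, so the gradient used in each update is the mean of the per-instance gradients. The batch size b therefore determines how much the gradient varies between consecutive iterations. If b is too small, consecutive mini-batches differ a lot, the gradient oscillates heavily from step to step, and convergence suffers. A larger b makes consecutive mini-batches more alike, which reduces oscillation and, up to a point, helps convergence. If b is extremely large, however, consecutive mini-batches barely differ, their gradients are almost identical, training keeps descending in one direction, and it can easily get stuck in a local minimum.<br>

**In summary: a batch size that is too small costs more time per epoch and causes strong gradient oscillation, which hurts convergence; a batch size that is too large leaves almost no variation between batch gradients, which makes it easy to get stuck in a local minimum.**<br>

With GPU parallelism in particular, small batches take longer per epoch, because each step leaves much of the hardware idle.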
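
The statement that the mini-batch gradient is the mean of the per-instance gradients can be verified numerically. This sketch uses a linear model with squared error on random data; the model, data, and variable names are assumptions made for illustration.

```python
import torch

torch.manual_seed(0)
w = torch.randn(3, requires_grad=True)
x = torch.randn(8, 3)  # a mini-batch of 8 instances
y = torch.randn(8)

# Gradient of the batch-averaged loss
loss = ((x @ w - y) ** 2).mean()
(batch_grad,) = torch.autograd.grad(loss, w)

# Average of the gradients of the individual instance losses
per_instance = [
    torch.autograd.grad((x[i] @ w - y[i]) ** 2, w)[0] for i in range(len(x))
]
mean_grad = torch.stack(per_instance).mean(dim=0)

print(torch.allclose(batch_grad, mean_grad, atol=1e-5))
```

Because the update direction is an average, a larger batch averages over more instances and so varies less from step to step.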



In [23]:
import torch.optim as optim
model = ConvNet().to(device)
optimizer = optim.Adam(model.parameters())

In [24]:
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if(batch_idx+1)%30 == 0: 
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))


def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item() # sum the loss over the batch
            pred = output.max(1, keepdim=True)[1] # index of the highest log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))

In [25]:
for epoch in range(0, 20):
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)


Test set: Average loss: 0.0379, Accuracy: 9896/10000 (99%)


Test set: Average loss: 0.0448, Accuracy: 9873/10000 (99%)


Test set: Average loss: 0.0561, Accuracy: 9868/10000 (99%)


Test set: Average loss: 0.0382, Accuracy: 9910/10000 (99%)


Test set: Average loss: 0.0397, Accuracy: 9906/10000 (99%)



## Visualization

### Saving Locally
Only small changes to train and test are needed:

In [26]:
Loss_list = []
Loss_val_list = []

def train(model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if(batch_idx+1)%30 == 0: 
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
            Loss_list.append(loss.item())


def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item() # sum the loss over the batch
            pred = output.max(1, keepdim=True)[1] # index of the highest log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)
    Loss_val_list.append(test_loss)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))

In [27]:
epoches = 2
for epoch in range(epoches):
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)

print(Loss_list)
print(Loss_val_list)


Test set: Average loss: 0.0414, Accuracy: 9906/10000 (99%)


Test set: Average loss: 0.0388, Accuracy: 9915/10000 (99%)

[0.0003735315112862736, 8.212582906708121e-05, 0.0001833964925026521, 0.0001292115921387449, 0.0003035182598978281, 0.00021089684742037207]
[0.041391693830490114, 0.03876026911735535]


Then save the lists to local files:

In [28]:
## Loss
with open('Loss.txt', 'w') as file:
    for loss in Loss_list:
        file.write(f"{loss}\n")  # one value per line
print("File saved")

## Loss_val
with open('Loss_val.txt', 'w') as file:
    for loss in Loss_val_list:
        file.write(f"{loss}\n")  # one value per line
print("File saved")

File saved
File saved
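
Once the losses are on disk they can be read back and plotted. The sketch below assumes matplotlib is installed and writes a small file of placeholder values first so that it runs on its own. The numbers in `sample_losses` are made up for illustration and are not results from this tutorial.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt

# Placeholder values for illustration only.
sample_losses = [0.9, 0.5, 0.3, 0.2, 0.15]

workdir = tempfile.mkdtemp()
loss_path = os.path.join(workdir, "Loss.txt")
with open(loss_path, "w") as f:
    f.writelines(f"{v}\n" for v in sample_losses)

# Read the file back: one float per line
with open(loss_path) as f:
    losses = [float(line) for line in f if line.strip()]

fig, ax = plt.subplots()
ax.plot(range(1, len(losses) + 1), losses, marker="o")
ax.set_xlabel("logged step")
ax.set_ylabel("training loss")
fig_path = os.path.join(workdir, "loss.png")
fig.savefig(fig_path)
plt.close(fig)
```

To plot the real curves, point `loss_path` at the `Loss.txt` or `Loss_val.txt` file written above and skip the step that writes placeholder values.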


## Pre-Train & Fine-Tune
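Training a deep model from scratch can be very time-consuming, so sometimes only the last few layers (the classifier) are trained. Fine-tuning adapts an existing model to a new dataset, for which the original classifier may not be adequate. In most cases the original model was trained on ImageNet, a 1000-class problem, whereas the actual application may be only a binary classification problem, so the pretrained classifier cannot be used as is.
For details, see: https://github.com/zergtant/pytorch-handbook/blob/master/chapter4/4.1-fine-tuning.ipynb
<br>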
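
The usual fine-tuning recipe is to freeze the pretrained layers and replace the classifier head. The sketch below uses a tiny randomly initialized network as a stand-in for a real pretrained model (an assumption that keeps it self-contained; in practice the backbone would come from a model zoo such as torchvision.models).

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pretrained network: feature extractor plus classifier head.
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
model = nn.Sequential(backbone, nn.Linear(32, 1000))  # 1000 classes, as in ImageNet

# Freeze every existing weight
for p in model.parameters():
    p.requires_grad = False

# Replace the head with a new 2-class layer; new modules are trainable by default
model[1] = nn.Linear(32, 2)

# Hand only the trainable parameters to the optimizer
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable)

out = model(torch.randn(4, 16))
print(out.shape, len(trainable))  # torch.Size([4, 2]) 2
```

Only the new head's weight and bias receive gradient updates, so each training step is much cheaper than training the full network.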
In another common situation, the parameters need to be saved after training, which can be done with:

In [29]:
torch.save(model.state_dict(), 'example.pt')
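
Loading is the mirror image: build a model with the same architecture, then call `load_state_dict`. A minimal round-trip sketch follows, using a small `nn.Linear` and a temporary path as stand-ins for the real model and file.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
path = os.path.join(tempfile.mkdtemp(), "example.pt")
torch.save(model.state_dict(), path)

# A fresh instance has different random weights until the state dict is loaded
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path, map_location="cpu"))
restored.eval()  # switch to inference mode before evaluating

x = torch.randn(3, 4)
print(torch.equal(model(x), restored(x)))
```

`map_location="cpu"` lets a checkpoint saved on a GPU machine be loaded on a machine without one.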
