# Handwritten Digit (MNIST) Recognition with a CNN in PyTorch
Directory structure:
```
├─pytorch-mnist (the folder name is arbitrary)
│  ├─data              MNIST dataset
│  ├─checkpoint
│  │  ├─mnist_model.pth saved model
│  ├─mnist.ipynb	   Jupyter Notebook version
│  ├─mnist.py		  plain Python version
│  ├─mnist.jpg		 screenshot of mnist.py running
```

## 1. Import the required packages

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

+ Parameter configuration

In [2]:
BATCH_SIZE = 512 # needs roughly 2 GB of GPU memory
EPOCHS = 20 # total number of training epochs
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu") # a GPU is faster

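## 2. Prepare the data
+ The MNIST dataset contains 60,000 training samples and 10,000 test samples. Each sample is an image plus a label: the image is a **28×28 single-channel grayscale image**, and the label is one of the 10 digits 0~9
+ MNIST is loaded with `torchvision`
+ `download` is set to `False` because the data was downloaded beforehand (set it to `True` on a first run); see [a workaround for slow downloads](https://blog.csdn.net/qq_43280818/article/details/104241326)
+ Each sample has the form `[data, label]`: the first element holds the image data, the second the label
+ Passing `num_workers` to the `DataLoader` loads data in several subprocesses, which can speed up loading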

In [3]:
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('data', train = True, download = False,
        transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.1307,), (0.3081,)) # mean and std of the MNIST training set
        ])),
    batch_size = BATCH_SIZE, shuffle = True)

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('data', train = False,
        transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.1307,), (0.3081,)) # same statistics as for training
        ])),
    batch_size = BATCH_SIZE, shuffle = True)
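
`transforms.Normalize(mean, std)` rescales every pixel as `(x - mean) / std`, after `ToTensor` has already mapped the pixel values to [0, 1]. The arithmetic can be sketched in plain Python; the helper `normalize` is hypothetical, and the statistics used (mean 0.1307, std 0.3081) are the values commonly quoted for MNIST, taken here as an assumption:

```python
def normalize(pixel, mean, std):
    """Per-pixel arithmetic of a normalization step: (x - mean) / std."""
    return (pixel - mean) / std

# Assumed statistics: the values commonly quoted for the MNIST training set.
MEAN, STD = 0.1307, 0.3081

darkest = normalize(0.0, MEAN, STD)    # a black pixel after ToTensor
brightest = normalize(1.0, MEAN, STD)  # a white pixel after ToTensor

print(round(darkest, 4), round(brightest, 4))
```

After normalization the inputs are roughly centered on zero with unit spread, which tends to make optimization better behaved.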

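With `batch_size=512`, the basic unit of the loader is one batch: each iteration over a `DataLoader` yields one batch of data, so `len(loader)` is the number of batches, not the number of samples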

In [4]:
print('train_loader len:',len(train_loader)) # ceil(60000 / 512)
print('test_loader len:',len(test_loader)) # ceil(10000 / 512)

train_loader len: 118
test_loader len: 20
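
The two lengths above follow from rounding up, because a `DataLoader` keeps the final, smaller batch by default (`drop_last=False`). A small plain-Python sketch of that arithmetic; the helper names are hypothetical:

```python
import math

def num_batches(num_samples, batch_size):
    """Number of batches when the final partial batch is kept."""
    return math.ceil(num_samples / batch_size)

def last_batch_size(num_samples, batch_size):
    """Size of the final batch (a full batch if the division is exact)."""
    remainder = num_samples % batch_size
    return remainder if remainder else batch_size

train_batches = num_batches(60000, 512)
test_batches = num_batches(10000, 512)
print(train_batches, last_batch_size(60000, 512))
print(test_batches, last_batch_size(10000, 512))
```

So the last training batch holds only 96 samples and the last test batch 272, which matters whenever a per-batch quantity is averaged.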


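## 3. Build the model

1. Define the CNN
  + `Conv2d` parameters

    in_channels (int) – required. Number of channels in the input; the images are single-channel grayscale, so the first layer uses 1

    out_channels (int) – required. Number of channels produced by the convolution

    kernel_size (int or tuple) – required. Size of the convolution kernel

    stride (int or tuple, optional) – optional. Stride of the convolution, default 1

    padding (int or tuple, optional) – optional. Width of the zero border added on every side of the input, i.e. how many rings of zeros surround the feature map; default 0. For example, padding a 3x3 input with one ring of zeros turns it into 5x5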
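
These parameters determine the spatial size of each layer's output through the standard formula `floor((size + 2 * padding - kernel) / stride) + 1`. The sketch below applies it to the layer sizes used by the network in this section (a 5x5 convolution, a 2x2 max pool with stride 2, then a 3x3 convolution); the helper `conv_out` is a hypothetical name, not a PyTorch function:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial dimension for a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

after_conv1 = conv_out(28, 5)                    # 5x5 convolution on a 28x28 input
after_pool = conv_out(after_conv1, 2, stride=2)  # 2x2 max pooling with stride 2
after_conv2 = conv_out(after_pool, 3)            # 3x3 convolution
flat_features = 20 * after_conv2 * after_conv2   # 20 channels, flattened per sample

print(after_conv1, after_pool, after_conv2, flat_features)

# A 3x3 kernel with padding=1 and stride 1 preserves the spatial size.
same_size = conv_out(28, 3, padding=1)
```

The flattened size is what fixes the input width of the first fully connected layer, `nn.Linear(20 * 10 * 10, 500)`.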


In [5]:
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        # input shape: N x 1 x 28 x 28 (N = batch size)
        self.conv1 = nn.Conv2d(1, 10, 5) # 5x5 kernel, 1 input channel, 10 output channels
        self.conv2 = nn.Conv2d(10, 20, 3) # 3x3 kernel, 10 input channels, 20 output channels
        self.fc1 = nn.Linear(20 * 10 * 10, 500) # fully connected, 2000 -> 500
        self.fc2 = nn.Linear(500, 10) # 10 classes (digits 0~9), 500 -> 10
        
    def forward(self, x):
        in_size = x.size(0) # batch size N
        out = self.conv1(x) # first convolution: N x 10 x 24 x 24 (28 - 5 + 1 = 24)
        out = F.relu(out)
        out = F.max_pool2d(out, 2, 2) # max pooling: N x 10 x 12 x 12 (24 / 2 = 12)
        out = self.conv2(out) # second convolution: N x 20 x 10 x 10 (12 - 3 + 1 = 10)
        out = F.relu(out)
        out = out.view(in_size, -1) # flatten each sample into one row: N x 2000; -1 infers the column count
        out = self.fc1(out) # N x 500
        out = F.relu(out)
        out = self.fc2(out) # N x 10
        out = F.log_softmax(out, dim = 1) # log-probabilities per class (values <= 0); shape unchanged
        return out

2. Loss and optimizer

In [25]:
from torchviz import make_dot # optional, used only to draw the computation graph
model = CNN().to(DEVICE) # instantiate the model
optimizer = optim.Adam(model.parameters())
# optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9) 
print(model)
p = make_dot(model(torch.rand(10, 1, 28, 28).to(DEVICE))) # dummy batch on the same device as the model
p.view()

CNN(
  (conv1): Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(10, 20, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=2000, out_features=500, bias=True)
  (fc2): Linear(in_features=500, out_features=10, bias=True)
)


'Digraph.gv.pdf'
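
The printed layer shapes are enough to count the trainable parameters by hand. A plain-Python sketch, assuming every layer has a bias term (the `Conv2d` default, and shown explicitly for the `Linear` layers above); the helper names are hypothetical:

```python
def conv2d_params(in_channels, out_channels, kernel):
    """Weights (out * in * k * k) plus one bias per output channel."""
    return out_channels * in_channels * kernel * kernel + out_channels

def linear_params(in_features, out_features):
    """Weights (in * out) plus one bias per output feature."""
    return in_features * out_features + out_features

counts = {
    "conv1": conv2d_params(1, 10, 5),
    "conv2": conv2d_params(10, 20, 3),
    "fc1": linear_params(2000, 500),
    "fc2": linear_params(500, 10),
}
total = sum(counts.values())

for name, n in counts.items():
    print(name, n)
print("total", total)
```

Almost all of the roughly one million parameters sit in `fc1`, so the flattened feature size dominates the model's memory footprint.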

## 4. Train the model

In [7]:
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)

        # Clear the gradients left over from the previous batch; otherwise the new gradients are added to them
        optimizer.zero_grad()

        # forward + backward + optimize
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward() # backpropagate to compute the gradients
        optimizer.step() # update the parameters

        if (batch_idx + 1) % 30 == 0: # report every 30 batches
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, (batch_idx + 1) * len(data), len(train_loader.dataset),
                100. * (batch_idx + 1) / len(train_loader), loss.item()))
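
The training loss pairs `F.log_softmax` in the model with `F.nll_loss` here; together they amount to the usual cross-entropy on raw scores. The sketch below reproduces the per-sample arithmetic in plain Python to show what the two steps compute. The helper names and the example logits are made up for illustration:

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum_exp for v in logits]

def nll(log_probs, target):
    """Negative log-likelihood of the target class for one sample."""
    return -log_probs[target]

logits = [2.0, 0.5, -1.0]   # made-up scores for a 3-class example
log_probs = log_softmax(logits)
probs = [math.exp(v) for v in log_probs]

loss_if_class0 = nll(log_probs, 0)  # target is the highest-scoring class
loss_if_class2 = nll(log_probs, 2)  # target is the lowest-scoring class

print(log_probs, sum(probs), loss_if_class0, loss_if_class2)
```

Every log-probability is at most 0, their exponentials sum to 1, and the loss grows as the model assigns less probability to the true class.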

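+ Save the model (run this cell after the training loop has finished; otherwise the untrained weights are saved)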

In [9]:
import os
if not os.path.isdir('checkpoint'):
    os.mkdir('checkpoint')
torch.save(model.state_dict(), 'checkpoint/mnist_model.pth')
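
To reuse the saved weights later, the usual pattern is to rebuild the model and call `load_state_dict` on it. The following is a minimal round-trip sketch under stated assumptions: `TinyNet` is a hypothetical stand-in for the `CNN` class, and a temporary file replaces `checkpoint/mnist_model.pth` so that the example runs on its own:

```python
import os
import tempfile

import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical stand-in for the CNN class, kept tiny for the sketch."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

trained = TinyNet()
path = os.path.join(tempfile.mkdtemp(), "tiny_model.pth")
torch.save(trained.state_dict(), path)  # only the parameters are stored

restored = TinyNet()  # the architecture must be rebuilt in code
restored.load_state_dict(torch.load(path, map_location="cpu"))
restored.eval()  # switch to inference mode before predicting

x = torch.rand(3, 4)
with torch.no_grad():
    same = torch.equal(trained(x), restored(x))
print(same)
```

`map_location="cpu"` lets weights saved on a GPU be loaded on a machine without one; the model can then be moved with `.to(DEVICE)` as elsewhere in this notebook.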

## 5. Evaluate the model

In [11]:
def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad(): # no gradients are needed during evaluation
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction = 'sum').item() # sum the loss over the batch
            pred = output.max(1, keepdim = True)[1] # index of the highest log-probability
            correct += pred.eq(target.view_as(pred)).sum().item() # number of correct predictions
    
    test_loss /= len(test_loader.dataset)
    print("\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%) \n".format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)
            ))
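
Summing the loss per batch (`reduction = 'sum'`) and dividing once by the dataset size gives the exact per-sample average even when the last batch is smaller than the others. Averaging the batch means instead would over-weight that last batch. A plain-Python sketch with made-up loss values:

```python
# Made-up per-sample losses: one full batch of 4 and a final batch of 1.
batch_losses = [[1.0, 1.0, 1.0, 1.0], [3.0]]

# What the evaluation loop does: sum everything, divide by the sample count.
total = sum(sum(batch) for batch in batch_losses)
num_samples = sum(len(batch) for batch in batch_losses)
per_sample_mean = total / num_samples

# The alternative: average the per-batch means, ignoring batch sizes.
batch_means = [sum(batch) / len(batch) for batch in batch_losses]
mean_of_batch_means = sum(batch_means) / len(batch_means)

print(per_sample_mean, mean_of_batch_means)
```

The two results differ (1.4 versus 2.0 here) because the single-sample batch counts as much as the four-sample batch in the second calculation.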

In [12]:
# Train and evaluate on the full dataset
for epoch in range(1, EPOCHS + 1):
    train(model,  DEVICE, train_loader, optimizer, epoch)
    test(model, DEVICE, test_loader)


Test set: Average loss: 0.0947, Accuracy: 9704/10000 (97%) 


Test set: Average loss: 0.0560, Accuracy: 9828/10000 (98%) 


Test set: Average loss: 0.0481, Accuracy: 9839/10000 (98%) 


Test set: Average loss: 0.0471, Accuracy: 9840/10000 (98%) 


Test set: Average loss: 0.0332, Accuracy: 9881/10000 (99%) 


Test set: Average loss: 0.0375, Accuracy: 9875/10000 (99%) 


Test set: Average loss: 0.0354, Accuracy: 9881/10000 (99%) 


Test set: Average loss: 0.0290, Accuracy: 9901/10000 (99%) 


Test set: Average loss: 0.0296, Accuracy: 9894/10000 (99%) 


Test set: Average loss: 0.0347, Accuracy: 9893/10000 (99%) 


Test set: Average loss: 0.0324, Accuracy: 9903/10000 (99%) 


Test set: Average loss: 0.0340, Accuracy: 9899/10000 (99%) 


Test set: Average loss: 0.0383, Accuracy: 9880/10000 (99%) 


Test set: Average loss: 0.0364, Accuracy: 9899/10000 (99%) 


Test set: Average loss: 0.0378, Accuracy: 9898/10000 (99%) 


Test set: Average loss: 0.0307, Accuracy: 9916/10000 (99%) 


Test se