## First Neural Network: Image Classification

Objectives:
- Train a minimal image classifier on [MNIST](https://paperswithcode.com/dataset/mnist) using PyTorch
- Use PyTorch and torchvision

In [1]:
# The usual imports

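# torch is a tensor library for operating on multi-dimensional arrays, widely used in
# machine learning and other math-heavy applications. It provides tensor operations,
# computation-graph construction, and automatic differentiation (autograd), which makes
# it straightforward to build and train neural network models.
import torch
import torch.nn as nn   # neural network building blocks, referenced as nn
# torchvision has three main parts:
# models:      architectures and pretrained weights for classic deep learning networks, including AlexNet and the VGG, ResNet, and Inception families;
# datasets:    loaders for common datasets, all subclassing torch.utils.data.Dataset, including MNIST, CIFAR10/100, ImageNet, and COCO;
# transforms:  common preprocessing operations on Tensor and PIL Image objects;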
import torchvision
import torchvision.transforms as transforms

In [2]:
# load the data

class ReshapeTransform:
    def __init__(self, new_size):
        self.new_size = new_size    # new_size: target shape for each sample

    def __call__(self, img):
        return torch.reshape(img, self.new_size)    # reshape the tensor to new_size

transformations = transforms.Compose([
                                transforms.ToTensor(),
                                transforms.ConvertImageDtype(torch.float32),
                                ReshapeTransform((-1,))
                                ])

trainset = torchvision.datasets.MNIST(root='./data', train=True,
                                        download=True, transform=transformations)

testset = torchvision.datasets.MNIST(root='./data', train=False,
                                       download=True, transform=transformations)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to ./data\MNIST\raw\train-images-idx3-ubyte.gz


100%|██████████| 9912422/9912422 [16:45<00:00, 9855.28it/s]  


Extracting ./data\MNIST\raw\train-images-idx3-ubyte.gz to ./data\MNIST\raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to ./data\MNIST\raw\train-labels-idx1-ubyte.gz


100%|██████████| 28881/28881 [00:00<00:00, 112687.72it/s]


Extracting ./data\MNIST\raw\train-labels-idx1-ubyte.gz to ./data\MNIST\raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to ./data\MNIST\raw\t10k-images-idx3-ubyte.gz


100%|██████████| 1648877/1648877 [00:01<00:00, 982709.86it/s] 


Extracting ./data\MNIST\raw\t10k-images-idx3-ubyte.gz to ./data\MNIST\raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 403: Forbidden

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to ./data\MNIST\raw\t10k-labels-idx1-ubyte.gz


100%|██████████| 4542/4542 [00:00<?, ?it/s]

Extracting ./data\MNIST\raw\t10k-labels-idx1-ubyte.gz to ./data\MNIST\raw






In [None]:
# check shape of data

trainset.data.shape, testset.data.shape # verify the shapes of the raw data

(torch.Size([60000, 28, 28]), torch.Size([10000, 28, 28]))
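
The shapes above are those of the raw `uint8` images. The transform pipeline is applied per sample when an item is fetched, and turns each image into a flat 784-element float vector. A minimal sketch of that per-sample effect, assuming a random tensor (`fake_image`, a hypothetical stand-in) in place of a real MNIST digit and plain tensor operations in place of the torchvision transforms:

```python
import torch

# Hypothetical stand-in for one raw MNIST image: 28x28 pixels, uint8 in [0, 255].
fake_image = torch.randint(0, 256, (28, 28), dtype=torch.uint8)

# Mimic ToTensor(): add a channel dimension and scale to floats in [0, 1].
scaled = fake_image.unsqueeze(0).to(torch.float32) / 255.0

# Mimic ReshapeTransform((-1,)): flatten to one vector of 1*28*28 = 784 values.
flat = torch.reshape(scaled, (-1,))

print(scaled.shape, flat.shape, flat.dtype)
```

The flattened length of 784 is what the first `nn.Linear` layer of the model expects as its input size.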

In [None]:
# data loader

BATCH_SIZE = 128
train_dataloader = torch.utils.data.DataLoader(trainset,                # create a DataLoader over the training set
                                               batch_size=BATCH_SIZE,   # each batch holds BATCH_SIZE (128) samples
                                               shuffle=True,            # reshuffle the data every epoch
                                               num_workers=0)

test_dataloader = torch.utils.data.DataLoader(testset,                  # create a DataLoader over the test set
                                              batch_size=BATCH_SIZE,    # each batch holds BATCH_SIZE (128) samples
                                              shuffle=False,            # keep the data in its original order
                                              num_workers=0)
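
To see what a `DataLoader` yields, here is a small sketch that uses a synthetic `TensorDataset` (an assumption made so the example runs without downloading MNIST; `toy_dataset` and `toy_loader` are hypothetical names). It shows that the final batch can be smaller than `batch_size` when the dataset size is not a multiple of it.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for the flattened dataset: 300 samples of 784 features, labels 0-9.
features = torch.rand(300, 784)
labels = torch.randint(0, 10, (300,))
toy_dataset = TensorDataset(features, labels)

toy_loader = DataLoader(toy_dataset, batch_size=128, shuffle=False, num_workers=0)

batch_sizes = []
for X, y in toy_loader:
    # X has shape (batch, 784) and y has shape (batch,)
    batch_sizes.append(X.shape[0])

print(batch_sizes)  # the last batch holds the 300 - 2*128 = 44 leftover samples
```

The same applies to the real training set: 60000 samples with a batch size of 128 give 468 full batches plus a final batch of 96.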

In [None]:
# model
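# Sequential is an ordered container: the modules passed to its constructor are added to the computation graph and run in that order.
# torch.nn.Linear(input_size, hidden_size) applies the linear transformation from the input layer to the hidden layer;
# torch.nn.ReLU() is the activation function;
# torch.nn.Linear(hidden_size, output_size) applies the linear transformation from the hidden layer to the output layer.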
model = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 10))
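
As a quick sanity check of this architecture, the sketch below rebuilds the same three layers under a hypothetical name (`model_sketch`), pushes a random batch through them, and counts the trainable parameters. The random input is an assumption standing in for real flattened images.

```python
import torch
import torch.nn as nn

model_sketch = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 10))

# Random stand-in for a batch of 4 flattened images.
dummy_batch = torch.rand(4, 784)
logits = model_sketch(dummy_batch)   # one row of 10 raw class scores per image

# Weights plus biases: (784*512 + 512) + (512*10 + 10)
n_params = sum(p.numel() for p in model_sketch.parameters())

print(logits.shape, n_params)
```

The outputs are unnormalized scores (logits), one per digit class; there is no softmax layer in the model itself.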

In [None]:
# training preparation
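# This class implements the RMSprop optimization method (proposed by Hinton); RMS stands for root mean square.
# RMSprop uses a root mean square of the gradients as the denominator, which mitigates Adagrad's rapidly decaying learning rate and reduces oscillation.
trainer = torch.optim.RMSprop(model.parameters())
# cross-entropy loss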
loss = nn.CrossEntropyLoss()
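
`nn.CrossEntropyLoss` expects raw logits and integer class labels, and applies log-softmax internally. The sketch below, using made-up logits and labels (assumptions for illustration only), checks the built-in loss against a hand-written version.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

criterion = nn.CrossEntropyLoss()

# Hypothetical raw scores for 3 samples over 10 classes, and their true labels.
torch.manual_seed(0)
logits = torch.randn(3, 10)
targets = torch.tensor([2, 7, 1])

builtin = criterion(logits, targets)

# Same quantity by hand: mean negative log-probability of the true class.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(3), targets].mean()

print(builtin.item(), manual.item())
```

The two printed values should agree up to floating-point error.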

In [None]:
def get_accuracy(output, target, batch_size):
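    # Obtain accuracy for training round
    # torch.max(output, 1)[1]: output is the model's output. torch.max returns the maximum values
    # and their indices; [1] selects the indices, i.e. the predicted classes.
    # input: a tensor (first argument)
    # dim: an integer (second argument); dim=0 takes the maximum of each column, dim=1 the maximum of each row
    # -----------
    # view() operates on PyTorch Tensors; for other types, convert first with data = torch.tensor(data)
    # (1) Purpose: returns a Tensor with the same data but a different shape. In other words, it
    #     changes the dimensions of the array, similar to reshape() in NumPy or TensorFlow.
    # (2) Arguments: view(*shape)
    corrects = (torch.max(output, 1)[1].view(target.size()).data == target.data).sum()  # count predictions that match the labels
    accuracy = 100.0 * corrects/batch_size  # accuracy in percent; a smaller final batch is still divided by batch_size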
    return accuracy.item()
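
A small worked example makes the computation concrete. The sketch below uses a hypothetical variant, `batch_accuracy`, that divides by the actual number of samples in the batch instead of a fixed `batch_size`; the scores and labels are made up for illustration.

```python
import torch

def batch_accuracy(output, target):
    # Predicted class = index of the largest score in each row.
    predictions = output.argmax(dim=1)
    corrects = (predictions == target).sum()
    # Divide by the actual number of samples in this batch.
    return (100.0 * corrects / target.size(0)).item()

# Hand-made scores for 4 samples over 3 classes; predictions are [0, 2, 1, 1].
scores = torch.tensor([[0.9, 0.1, 0.0],
                       [0.2, 0.3, 0.5],
                       [0.1, 0.8, 0.1],
                       [0.3, 0.4, 0.3]])
labels = torch.tensor([0, 2, 1, 0])

print(batch_accuracy(scores, labels))  # 3 of 4 correct
```

Dividing by `target.size(0)` avoids slightly underestimating accuracy on a final batch that is smaller than the nominal batch size.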

In [None]:
# train

for ITER in range(5):   # loop over epochs
    train_acc = 0.0     # running training accuracy
    train_running_loss = 0.0    # running loss

    model.train()   # put the model in training mode
    for i, (X, y) in enumerate(train_dataloader):   # iterate over training batches
        output = model(X)       # forward pass: feed the feature vectors through the model
        l = loss(output, y)     # compute the loss with the cross-entropy loss function

        # update the parameters
        l.backward()            # backpropagation: compute the gradient of the loss with respect to every parameter
        trainer.step()          # perform one optimization step, updating the parameters from their gradients
        trainer.zero_grad()     # reset the gradients of all parameters so the previous iteration's gradients are cleared

        # gather metrics
        train_acc += get_accuracy(output, y, BATCH_SIZE)    # accumulate batch accuracy
        train_running_loss += l.detach().item()             # accumulate batch loss

    # print the epoch number, mean loss, and mean training accuracy
    print('Epoch: %d | Train loss: %.4f | Train Accuracy: %.4f' \
          %(ITER+1, train_running_loss / (i+1),train_acc/(i+1)))

Epoch: 1 | Train loss: 1.0415 | Train Accuracy: 91.9010
Epoch: 2 | Train loss: 0.1291 | Train Accuracy: 96.0871
Epoch: 3 | Train loss: 0.0997 | Train Accuracy: 97.0399
Epoch: 4 | Train loss: 0.0865 | Train Accuracy: 97.4913
Epoch: 5 | Train loss: 0.0740 | Train Accuracy: 97.8611
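
The training loop clears gradients on every iteration because `backward()` adds to any gradient already stored on a parameter. A minimal sketch of that behavior, assuming a single scalar parameter `w` and a plain SGD optimizer (both hypothetical, chosen only to keep the arithmetic visible):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

# Two backward passes without clearing: d(3w)/dw = 3 is added twice.
(3 * w).backward()
(3 * w).backward()
accumulated = w.grad.item()

# Clear, then one backward pass: the gradient is 3 again.
optimizer.zero_grad()
(3 * w).backward()
fresh = w.grad.item()

print(accumulated, fresh)
```

Calling `zero_grad()` after `step()`, as the loop does, or before `backward()` has the same effect, as long as it happens once per iteration.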


### Other things to try

- Evaluate on test set
- Plot loss curve
- Add more layers to the model
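
As a starting point for the first item, here is one possible sketch of a test-set evaluation helper. The function name `evaluate` and the toy data are assumptions; the toy check uses one-hot inputs with `nn.Identity()` standing in for a trained model, so the example runs without MNIST.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, dataloader):
    """Return accuracy in percent over every sample in dataloader."""
    model.eval()                      # switch layers such as dropout to inference mode
    correct, total = 0, 0
    with torch.no_grad():             # no gradients are needed for evaluation
        for X, y in dataloader:
            predictions = model(X).argmax(dim=1)
            correct += (predictions == y).sum().item()
            total += y.size(0)
    return 100.0 * correct / total

# Toy check: one-hot inputs and an identity "model", so every prediction is correct.
toy_labels = torch.arange(10).repeat(5)                   # 50 labels, 0-9 repeated
toy_inputs = torch.eye(10)[toy_labels]                    # matching one-hot rows
toy_loader = DataLoader(TensorDataset(toy_inputs, toy_labels), batch_size=16)

print(evaluate(nn.Identity(), toy_loader))
```

With the objects defined in this notebook, the call would presumably be `evaluate(model, test_dataloader)`.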