# MNIST: From 0 to 1%

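When learning deep learning, we usually start with the simplest dataset, MNIST. Here I use the digit-recognizer competition on Kaggle as a walkthrough for beginners.
Kaggle link: [digit-recognizer](https://www.kaggle.com/c/digit-recognizer)
The goal is not to reach the top 1% on Kaggle. I hope this article gives you the most basic picture of training a CV model and of what taking part in a Kaggle competition involves.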

In [1]:
import torch
# This article uses PyTorch 1.0.0

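First, unzip and prepare the data downloaded from Kaggle. I have already downloaded it, which gives three files: test.csv, train.csv, and sample_submission.csv.
Next, we use PyTorch's Dataset and DataLoader to process the data.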

In [2]:
from torch.utils.data import DataLoader,Dataset
from torchvision import transforms
import pandas as pd
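import numpy as np # NumPy is an extremely important library; reportedly 75% of machine learning projects use it
# PyTorch is a library where the code is the documentation; if you want to understand PyTorch, I suggest reading the source directly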

In [3]:
trans_img = transforms.Compose([
    transforms.ToTensor()
])
# Converts a NumPy array of shape (H, W, C) into a (C, H, W) tensor for PyTorch training

If you open the train.csv downloaded from Kaggle, you can see that each row is simply a label followed by the flattened image.

In [4]:
train_path = "train.csv"
try_data = pd.read_csv(train_path,skiprows = 0)
print(try_data.shape)

(42000, 785)
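
The shape `(42000, 785)` is consistent with that layout: one label column plus 28 × 28 = 784 pixel columns per row. As a minimal sketch, the snippet below builds one synthetic row (made-up values, not real Kaggle data) and splits it back into a label and a 28×28 image. The names `row`, `label`, and `image` are illustrative.

```python
import numpy as np

# One synthetic row in the train.csv layout: [label, pixel0, ..., pixel783].
# The values are made up for illustration; they are not real Kaggle data.
row = np.concatenate(([7], np.arange(784) % 256))

label = int(row[0])              # first column is the digit label
image = row[1:].reshape(28, 28)  # remaining 784 values form the 28x28 image

print(row.shape, label, image.shape)
```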


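Then we subclass PyTorch's Dataset to create our MNISTDataset, turning each training item from the flattened 1-D image read from the file into a 3-D one. Each image ends up with shape [1,28,28], mainly to match the MNIST dataset that ships with PyTorch; more on that later.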

In [5]:
class MNISTDataset(Dataset):
    def __init__(self,csv_file,transform=None):
        data = pd.read_csv(csv_file,skiprows=0)
        # Columns 1.. hold the 784 pixels; reshape to (N, H, W, C) and scale to [0, 1]
        self.X = np.array(data.iloc[:,1:]).reshape(-1,28,28,1).astype('float32')
        self.X /= 255
        # Column 0 holds the digit label
        self.y = np.array(data.iloc[:,0])
        del data
        self.transform = transform

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        item = self.X[idx]
        label = torch.from_numpy(np.array(self.y[idx]))
        if self.transform:
            item = self.transform(item)
        return (item,label)
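
The reshape to `(-1, 28, 28, 1)` stores each image as height × width × channel. For a NumPy input, `transforms.ToTensor()` moves the channel axis to the front, which gives the `[1, 28, 28]` shape. For a `float32` array it does not rescale the values, which is why the class divides by 255 itself. The sketch below mirrors only that layout change with plain NumPy on random stand-in data; it is an illustration, not torchvision's actual code.

```python
import numpy as np

# Stand-in for three flattened images with pixel values in 0..255 (random, not real data).
flat = np.random.randint(0, 256, size=(3, 784))

# Same preprocessing as MNISTDataset.__init__: (N, H, W, C) layout, scaled to [0, 1].
X = flat.reshape(-1, 28, 28, 1).astype('float32') / 255

# Channel-last (H, W, C) -> channel-first (C, H, W), the layout change ToTensor
# applies to a NumPy array.
item = np.transpose(X[0], (2, 0, 1))

print(X.shape, item.shape)
```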

In [6]:
batch_size = 50 # batch_size is the number of images read per training step
kaggle_trainset = MNISTDataset(csv_file='train.csv',transform=trans_img)
print(kaggle_trainset[0][0].shape)
kaggle_trainloader = DataLoader(dataset = kaggle_trainset,batch_size=batch_size,shuffle=True)
# shuffle controls whether the training set is read in random order

torch.Size([1, 28, 28])
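
To make the role of `batch_size` and `shuffle` concrete, here is a rough pure-Python sketch of the index batching a loader performs each epoch. `iter_batches` is a hypothetical helper written for illustration; it is not how `DataLoader` is implemented internally.

```python
import math
import random

def iter_batches(num_samples, batch_size, shuffle=True, seed=None):
    """Yield lists of sample indices, one list per batch (hypothetical helper)."""
    indices = list(range(num_samples))
    if shuffle:
        # Visit the samples in a random order
        random.Random(seed).shuffle(indices)
    for start in range(0, num_samples, batch_size):
        yield indices[start:start + batch_size]

batches = list(iter_batches(num_samples=42000, batch_size=50, seed=0))
print(len(batches), math.ceil(42000 / 50))  # both are 840 batches per epoch
```

With 42000 training rows and a batch size of 50, one epoch therefore corresponds to 840 optimizer steps.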


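With the DataLoader we now have a training set ready for training, so we can build the model. Here we use the well-known LeNet-5 model. It is very simple, and you can look it up online: roughly two convolutions, two pooling layers, and three fully connected layers at the end. For the exact model, see the paper.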

In [7]:
import torch.nn as nn
import torch.nn.functional as F
# PyTorch's network layers all live in nn; I suggest reading the PyTorch source directly
# Also, PyTorch's activation functions are in nn.functional
# The simplest LeNet-5
class Lenet1(nn.Module):
    def __init__(self):
        super(Lenet1, self).__init__()
        self.conv1 = nn.Conv2d(1,6,5,stride=1,padding=2)
        self.conv2 = nn.Conv2d(6,16,5, stride= 1, padding=0)
        self.fc1 = nn.Linear(400,120)
        self.fc2 = nn.Linear(120,84)
        self.fc3 = nn.Linear(84,10)
    def forward(self, x):
        out = F.max_pool2d(self.conv1(x),(2,2))
        out = F.max_pool2d(self.conv2(out),(2,2))
        out = out.view(out.size(0),-1)
        out = self.fc1(out)
        out = self.fc2(out)
        out = self.fc3(out)
        return out
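
The `400` input features of `fc1` follow from the layer shapes. The sketch below applies the standard output-size formula for convolution and pooling, floor((n + 2·padding − kernel) / stride) + 1, to a 28×28 input. `out_size` is a helper name introduced here for illustration.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (illustrative helper)."""
    return (n + 2 * padding - kernel) // stride + 1

n = 28
n = out_size(n, kernel=5, stride=1, padding=2)  # conv1 -> 28
n = out_size(n, kernel=2, stride=2)             # max_pool2d (2,2) -> 14
n = out_size(n, kernel=5, stride=1, padding=0)  # conv2 -> 10
n = out_size(n, kernel=2, stride=2)             # max_pool2d (2,2) -> 5

flat_features = 16 * n * n  # 16 channels from conv2
print(n, flat_features)     # 5 400
```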

Next, we naturally can't wait to try out the model. How do we check how well it does? We can use the MNIST dataset that ships with PyTorch for validation.

In [8]:
from torchvision.datasets import MNIST
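MNIST_vaildset = MNIST('./data', train=False,transform=trans_img,download=True) # download=True fetches the data if it is not already in ./data
MNIST_vaildloader = DataLoader(MNIST_vaildset,batch_size=batch_size,shuffle = True)
# Use the test split bundled with PyTorch as the validation set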

In [11]:
from torch import optim

def train(model,trainloader,optimizer,epoch,epoches):
    # reduction='sum' replaces the deprecated size_average=False: the loss is summed over the batch
    criterian = nn.CrossEntropyLoss(reduction='sum')
    device = next(model.parameters()).device  # run on whichever device the model lives on
    model.train()
    running_loss = 0.
    running_acc = 0.
    for (img, label) in trainloader:
        img, label = img.to(device), label.to(device)

        optimizer.zero_grad()
        output = model(img)
        loss = criterian(output, label)
        # backward
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        _, predict = torch.max(output, 1)
        correct_num = (predict == label).sum()
        running_acc += correct_num.item()
    # Divide the summed loss and correct count by the number of samples
    running_loss /= len(trainloader.dataset)
    running_acc /= len(trainloader.dataset)
    print("[%d/%d] Loss: %f, Acc: %f" % (epoch + 1, epoches, running_loss, 100 * running_acc))

def vaild(model,vaildloader):
    criterian = nn.CrossEntropyLoss(reduction='sum')
    device = next(model.parameters()).device
    model.eval()
    vaildloss = 0.
    vaildacc = 0.
    with torch.no_grad():  # no gradients are needed for evaluation
        for (img, label) in vaildloader:  # iterate the loader passed in, not the global one
            img, label = img.to(device), label.to(device)
            output = model(img)
            loss = criterian(output, label)
            vaildloss += loss.item()
            _, predict = torch.max(output, 1)
            num_correct = (predict == label).sum()
            vaildacc += num_correct.item()
    vaildloss /= len(vaildloader.dataset)
    vaildacc /= len(vaildloader.dataset)
    print("Test: Loss: %.5f, Acc: %.2f %%" % (vaildloss, 100 * vaildacc))
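
Both functions sum the loss over each batch and divide by the dataset size at the end, which gives the exact per-sample mean even when the last batch is smaller. The pure-Python sketch below illustrates that bookkeeping with made-up logits. `cross_entropy` is a hand-written stand-in for what `nn.CrossEntropyLoss` computes for one sample (log-softmax followed by negative log-likelihood), not the library code.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy of one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# Made-up (logits, label) pairs for a 3-class problem, split into uneven batches.
samples = [([2.0, 0.5, 0.1], 0), ([0.2, 1.5, 0.3], 1), ([0.1, 0.2, 3.0], 2),
           ([1.0, 1.0, 1.0], 0), ([0.3, 2.0, 0.1], 2)]
batches = [samples[:2], samples[2:4], samples[4:]]

running_loss, running_correct = 0.0, 0
for batch in batches:
    # Summed (not averaged) loss per batch
    running_loss += sum(cross_entropy(logits, y) for logits, y in batch)
    # The prediction is the index of the largest logit, like torch.max(output, 1)
    running_correct += sum(logits.index(max(logits)) == y for logits, y in batch)

mean_loss = running_loss / len(samples)
accuracy = running_correct / len(samples)
print(mean_loss, accuracy)
```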

In [12]:
learning_rate = 1e-3
epoches = 20
lenet1 = Lenet1()
if torch.cuda.is_available():
    lenet1.cuda()
optimizer1 = optim.SGD(lenet1.parameters(), lr=learning_rate)
for epoch in range(epoches):
    train(model=lenet1,trainloader=kaggle_trainloader,optimizer=optimizer1,epoch=epoch,epoches=epoches)
    if (epoch+1)%10 == 0:
        vaild(model = lenet1,vaildloader=MNIST_vaildloader)

[1/20] Loss: 0.522027, Acc: 84.059524
[2/20] Loss: 0.126681, Acc: 96.097619
[3/20] Loss: 0.092063, Acc: 97.161905
[4/20] Loss: 0.077288, Acc: 97.611905
[5/20] Loss: 0.064520, Acc: 97.947619
[6/20] Loss: 0.057005, Acc: 98.226190
[7/20] Loss: 0.051071, Acc: 98.416667
[8/20] Loss: 0.047151, Acc: 98.495238
[9/20] Loss: 0.044016, Acc: 98.645238
[10/20] Loss: 0.040574, Acc: 98.750000
Test: Loss: 0.03127, Acc: 99.01 %
[11/20] Loss: 0.037369, Acc: 98.833333
[12/20] Loss: 0.035347, Acc: 98.852381
[13/20] Loss: 0.032064, Acc: 98.971429
[14/20] Loss: 0.032032, Acc: 99.000000
[15/20] Loss: 0.029067, Acc: 99.026190
[16/20] Loss: 0.027322, Acc: 99.114286
[17/20] Loss: 0.027320, Acc: 99.069048
[18/20] Loss: 0.024955, Acc: 99.183333
[19/20] Loss: 0.024194, Acc: 99.214286
[20/20] Loss: 0.023087, Acc: 99.211905
Test: Loss: 0.03010, Acc: 99.10 %


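As the output shows, after 20 epochs the model reaches about 99% accuracy (99.10% on the validation set), but our goal is to become ~~the King of the Pirates~~ 100%, so it needs some improvement.
Looking for material online, I came across activation functions. It is said that much of the credit for deep learning's success in recent years goes to their steady development.
So let's try adding some activation functions to our plain LeNet model.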
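
One way to see why activations matter: without a nonlinearity between them, stacked `nn.Linear` layers collapse into a single affine map, so the three fully connected layers in `Lenet1` are no more expressive than one. The NumPy sketch below checks this on two tiny layers with hand-picked weights (the values are arbitrary, chosen for illustration) and shows that putting ReLU in between changes the result.

```python
import numpy as np

# Two small "fully connected" layers with hand-picked weights (illustrative values).
W1, b1 = np.array([[1.0, -1.0], [2.0, 0.0]]), np.array([0.0, -1.0])
W2, b2 = np.array([[1.0, 1.0]]), np.array([0.5])
x = np.array([1.0, 2.0])

# Without an activation, the two layers equal one affine map W x + b.
stacked = W2 @ (W1 @ x + b1) + b2
W, b = W2 @ W1, W2 @ b1 + b2
collapsed = W @ x + b

# With ReLU in between, the result is no longer that affine map.
with_relu = W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

print(stacked, collapsed, with_relu)  # [0.5] [0.5] [1.5]
```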

In [13]:
learning_rate = 1e-3
epoches = 20
class Lenet2(nn.Module):
    def __init__(self):
        super(Lenet2, self).__init__()
        self.conv1 = nn.Conv2d(1,6,5,stride=1,padding=2)
        self.conv2 = nn.Conv2d(6,16,5, stride= 1, padding=0)
        self.fc1 = nn.Linear(400,120)
        self.fc2 = nn.Linear(120,84)
        self.fc3 = nn.Linear(84,10)
    def forward(self, x):
        out = F.max_pool2d(F.relu(self.conv1(x)),(2,2))
        out = F.max_pool2d(F.relu(self.conv2(out)),(2,2))
        out = out.view(out.size(0),-1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        out = self.fc3(out)
        return out
lenet2 = Lenet2()
if torch.cuda.is_available():
    lenet2.cuda()
optimizer2 = optim.SGD(lenet2.parameters(), lr=learning_rate)
for epoch in range(epoches):
    train(model=lenet2,trainloader=kaggle_trainloader,optimizer=optimizer2,epoch=epoch,epoches=epoches)
    if (epoch+1)%10 == 0:
        vaild(model = lenet2,vaildloader=MNIST_vaildloader)

[1/20] Loss: 0.993573, Acc: 66.359524
[2/20] Loss: 0.126021, Acc: 95.980952
[3/20] Loss: 0.081728, Acc: 97.371429
[4/20] Loss: 0.065245, Acc: 98.011905
[5/20] Loss: 0.053575, Acc: 98.271429
[6/20] Loss: 0.045406, Acc: 98.523810
[7/20] Loss: 0.039177, Acc: 98.759524
[8/20] Loss: 0.034390, Acc: 98.933333
[9/20] Loss: 0.029394, Acc: 99.047619
[10/20] Loss: 0.026382, Acc: 99.140476
Test: Loss: 0.02346, Acc: 99.31 %
[11/20] Loss: 0.022648, Acc: 99.233333
[12/20] Loss: 0.020471, Acc: 99.328571
[13/20] Loss: 0.017709, Acc: 99.457143
[14/20] Loss: 0.015819, Acc: 99.485714
[15/20] Loss: 0.015143, Acc: 99.521429
[16/20] Loss: 0.012266, Acc: 99.640476
[17/20] Loss: 0.011666, Acc: 99.621429
[18/20] Loss: 0.010683, Acc: 99.669048
[19/20] Loss: 0.007827, Acc: 99.759524
[20/20] Loss: 0.008825, Acc: 99.714286
Test: Loss: 0.02524, Acc: 99.37 %


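Judging from the results, it is slightly better (99.37% vs. 99.10% validation accuracy), but we are not satisfied. We must keep working hard and become ~~balder~~ stronger!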