# Fashion-MNIST Classification Task

# Fashion-MNIST

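The [classic MNIST dataset](http://yann.lecun.com/exdb/mnist/) contains a large number of handwritten digits. For more than a decade, researchers in machine learning, computer vision, artificial intelligence, and deep learning have used it as one of the benchmarks for evaluating algorithms. It appears in many conference and journal papers, and it has effectively become one of the datasets every algorithm author is expected to test on. As the joke goes: *"If an algorithm doesn't work on MNIST, it won't work at all; and if it does work on MNIST, it may still fail on other data!"*

`Fashion-MNIST` is intended as a drop-in replacement for MNIST. Algorithm authors can use it without changing any code: its image size, number of training and test samples, and number of classes are **exactly the same** as those of the classic MNIST.

The dataset looks roughly like this (each class takes three rows):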

![](https://github.com/zalandoresearch/fashion-mnist/raw/master/doc/img/fashion-mnist-sprite.png)


## Class Labels

In the Fashion-MNIST dataset, every training sample is labeled with one of the following classes:

| Label | Description |
| --- | --- |
| 0 | T-shirt/top |
| 1 | Trouser |
| 2 | Pullover |
| 3 | Dress |
| 4 | Coat |
| 5 | Sandal |
| 6 | Shirt |
| 7 | Sneaker |
| 8 | Bag |
| 9 | Ankle boot |
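
When inspecting predictions it can help to turn integer labels back into class names. The snippet below is a minimal sketch of such a lookup; the dictionary contents come from the table, while the names `FASHION_MNIST_LABELS` and `label_name` are hypothetical helpers introduced only for illustration.

```python
# Label ids and class names, copied from the table of classes.
FASHION_MNIST_LABELS = {
    0: "T-shirt/top",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle boot",
}


def label_name(label_id):
    """Return the class name for an integer label id (hypothetical helper)."""
    return FASHION_MNIST_LABELS[int(label_id)]
```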


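## Task Description


`Fashion-MNIST` is an image dataset that serves as a replacement for the [MNIST handwritten digit set](http://yann.lecun.com/exdb/mnist/). It is provided by the [research department](https://research.zalando.com/) of Zalando, a German fashion technology company. It contains front-view images of 70,000 different products from 10 categories. Its size, format, and training/test split match the original MNIST exactly: 60,000 training and 10,000 test samples, each a 28x28 grayscale image. You can use it directly to test the performance of your machine learning and deep learning algorithms, with **no** code changes required.


The task is to design, build, and train a machine learning model on `Fashion-MNIST` that predicts the labels of the test data as accurately as possible.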

## File Description


The dataset files are split into a training set and a test set:

- Training images: `train-images-idx3-ubyte.gz`
- Training labels: `train-labels-idx1-ubyte.gz`
- Test images: `t10k-images-idx3-ubyte.gz`


## References

[1] Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms. Han Xiao, Kashif Rasul, Roland Vollgraf. arXiv:1708.07747

[2] https://github.com/zalandoresearch/fashion-mnist/

# Evaluation

## Evaluation Metric

This task uses [accuracy (ACC)](https://baike.baidu.com/item/%E5%87%86%E7%A1%AE%E7%8E%87) as the evaluation metric.
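
Accuracy is the fraction of samples whose predicted class equals the true class. The function below is a minimal sketch of that definition in plain Python; it is not the official scoring script, and the name `accuracy` and its error handling are assumptions made for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy of an empty set")
    # Compare as integers so that e.g. 3 and 3.0 count as the same class.
    correct = sum(int(t) == int(p) for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

For example, three correct predictions out of four give `accuracy([0, 1, 2, 3], [0, 1, 2, 0]) == 0.75`.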

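## Online Evaluation

The evaluation function first checks that the submitted prediction file meets the requirements. It verifies:

1. Whether the submitted prediction file contains duplicate IDs
2. Whether the IDs in the submitted prediction file fail to match the IDs of the test set file

Files that pass validation are then scored by a function that uses ACC as the metric.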
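
Running the same two checks locally before submitting can save a failed upload. The sketch below mirrors them in plain Python; it is an assumption about how such a check could look, not the organizers' evaluation code, and `validate_ids` is a hypothetical helper name.

```python
def validate_ids(submitted_ids, test_ids):
    """Return a list of problems found; an empty list means both checks pass."""
    problems = []

    # Check 1: duplicate IDs in the submission.
    seen = set()
    duplicates = set()
    for sample_id in submitted_ids:
        if sample_id in seen:
            duplicates.add(sample_id)
        seen.add(sample_id)
    if duplicates:
        problems.append("duplicate IDs: %s" % sorted(duplicates))

    # Check 2: submitted IDs must be exactly the test set IDs.
    expected = set(test_ids)
    missing = expected - seen
    unexpected = seen - expected
    if missing or unexpected:
        problems.append(
            "ID mismatch: %d missing, %d unexpected" % (len(missing), len(unexpected))
        )
    return problems
```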


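## File Format

Because the scoring script is standardized, the submitted `prediction file` must use the prescribed field names and field formats so that the script runs correctly. The requirements are:

| No. | Field name | Data type | Description |
| -------- | -------- | -------- | -------- |
| 1    | ID     | int    | ID sequence     |
| 2    | Prediction   | int     | Predicted result (class value)   |

An example of a correctly formatted submission file: `submission_random.csv`.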
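
A file with these two columns can be produced with the standard `csv` module. The sketch below assumes that IDs are consecutive integers starting at `first_id` (0 by default); the source does not state the ID scheme, so it should be checked against `submission_random.csv`. The function name `write_submission` is hypothetical.

```python
import csv


def write_submission(path, predictions, first_id=0):
    """Write predictions to a CSV file with the columns ID and Prediction.

    Assumption: IDs are consecutive integers starting at `first_id`.
    """
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["ID", "Prediction"])
        for offset, pred in enumerate(predictions):
            # Cast to int so numpy or tensor scalars are written as plain integers.
            writer.writerow([first_id + offset, int(pred)])
```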

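## Baselines

Different baseline algorithms achieve the following ACC on this task:
- Random baseline ACC: 0.09440
- Weak baseline ACC: 0.90452

During evaluation, the ACC of the weak baseline serves as the passing threshold.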

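## Final Review

For the final review, the students ranked in the top 10 on the evaluation metric will be asked to write a project report describing their model, algorithm, experiments, and results. The formatting requirements for the report will be announced at that time.

In addition, to keep the competition fair, students who enter the final review must submit their project code so that the teaching assistants can verify that the model is valid.

If the reproduced results differ substantially from the submitted ones, or the model cannot be reproduced, the organizing committee will revoke the participant's completion status for the 14-day "Dive into Deep Learning" (《动手学深度学习》) challenge camp and announce this publicly.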

# Libraries & data loader

In [61]:
import os
import time
import gzip

import numpy as np
import pandas as pd

import torch
from torch.nn import Linear,Conv2d,BatchNorm2d,AvgPool2d,LeakyReLU,Softmax,Unfold
from torch.nn.init import kaiming_normal
from torch.autograd import Variable
from torch.utils.data import DataLoader,TensorDataset

import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
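def load_mnist(path, kind='train'):
    """Load Fashion-MNIST images (and labels, if available) from `path`.

    `kind` is the file prefix: 'train' or 't10k'. No label file is read
    for 't10k', so an empty list is returned in its place.
    """
    images_path = os.path.join(path, '%s-images-idx3-ubyte.gz' % kind)

    with gzip.open(images_path, 'rb') as imgpath:
        # Skip the 16-byte IDX header; each row is one flattened 28x28 image.
        images = np.frombuffer(imgpath.read(), dtype=np.uint8,
                               offset=16).reshape(-1, 784)

    if kind != 't10k':
        labels_path = os.path.join(path, '%s-labels-idx1-ubyte.gz' % kind)

        with gzip.open(labels_path, 'rb') as lbpath:
            # Skip the 8-byte IDX header; one uint8 label per image.
            labels = np.frombuffer(lbpath.read(), dtype=np.uint8,
                                   offset=8)
    else:
        labels = []
    return images, labels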
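
The `offset=16` and `offset=8` arguments skip the IDX file headers. In the IDX format the file starts with a 4-byte big-endian magic number whose last byte is the number of dimensions, followed by one 4-byte big-endian size per dimension. An image file therefore has 4 + 3 x 4 = 16 header bytes and a label file has 4 + 1 x 4 = 8. The sketch below reads that header so the shapes can be checked before loading; `read_idx_header` is a hypothetical helper, not part of the original notebook.

```python
import gzip
import struct


def read_idx_header(path):
    """Return (magic, dims) from the header of a gzip-compressed IDX file."""
    with gzip.open(path, "rb") as f:
        # Magic number: two zero bytes, a data-type byte, and a dimension count.
        magic = struct.unpack(">I", f.read(4))[0]
        ndim = magic & 0xFF
        # One big-endian unsigned 32-bit size per dimension.
        dims = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
    return magic, dims
```

For the training images one would expect a magic number of 2051 and dimensions `(60000, 28, 28)`; for the training labels, 2049 and `(60000,)`.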

In [58]:
path = '../input/FashionMNIST1045'
X_train, y_train = load_mnist(path, kind='train')
X_test, _ = load_mnist(path, kind='t10k')
# Reshape flat 784-pixel rows to (N, channels, height, width) for Conv2d.
# reshape is used instead of in-place resize because np.frombuffer returns a read-only view.
X_train = X_train.reshape(-1, 1, 28, 28).astype(np.float32)
X_test = X_test.reshape(-1, 1, 28, 28).astype(np.float32)
# CrossEntropyLoss expects integer class indices.
y_train = y_train.astype(np.int64)

# Model Dev.

In [7]:
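class genericCNN(torch.nn.Module):
    """Two conv blocks (conv, LeakyReLU, BatchNorm twice, then 2x2 average pooling) plus a linear classifier."""

    def __init__(self):
        super().__init__()

        # block 1: 1x28x28 -> 16x26x26 -> 32x24x24 -> pool -> 32x12x12
        self.conv1 = Conv2d(1, 16, 3, 1)
        torch.nn.init.kaiming_normal_(self.conv1.weight)  # in-place init; kaiming_normal is deprecated
        self.lrelu1 = LeakyReLU()
        self.bn1 = BatchNorm2d(16)
        self.conv2 = Conv2d(16, 32, 3, 1)
        torch.nn.init.kaiming_normal_(self.conv2.weight)
        self.lrelu2 = LeakyReLU()
        self.bn2 = BatchNorm2d(32)
        self.pool1 = AvgPool2d(2, 2)

        # block 2: 32x12x12 -> 32x10x10 -> 64x8x8 -> pool -> 64x4x4
        self.conv3 = Conv2d(32, 32, 3, 1)
        torch.nn.init.kaiming_normal_(self.conv3.weight)
        self.lrelu3 = LeakyReLU()
        self.bn3 = BatchNorm2d(32)
        self.conv4 = Conv2d(32, 64, 3, 1)
        torch.nn.init.kaiming_normal_(self.conv4.weight)
        self.lrelu4 = LeakyReLU()
        self.bn4 = BatchNorm2d(64)
        self.pool2 = AvgPool2d(2, 2)

        # output: 64 * 4 * 4 = 1024 flattened features -> 10 class logits
        self.output = Linear(1024, 10)

    def num_flat_features(self, x):
        """Number of features per sample once all non-batch dimensions are flattened."""
        size = x.size()[1:]
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

    def forward(self, x):
        h = self.bn1(self.lrelu1(self.conv1(x)))
        h = self.bn2(self.lrelu2(self.conv2(h)))
        h = self.pool1(h)

        h = self.bn3(self.lrelu3(self.conv3(h)))
        h = self.bn4(self.lrelu4(self.conv4(h)))
        h = self.pool2(h)

        # Flatten to (batch, 1024) and return raw logits; CrossEntropyLoss applies log-softmax itself.
        h = h.view(-1, self.num_flat_features(h))
        h = self.output(h)

        return h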
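
The input size of the final `Linear(1024, 10)` layer follows from the spatial size after each layer. With no padding, a convolution or pooling layer with kernel size k and stride s maps a side length n to floor((n - k) / s) + 1. The short sketch below traces a 28x28 input through the layers of `genericCNN`; the helper names `conv_out` and `flattened_features` are hypothetical and only illustrate the arithmetic.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a conv or pooling layer for a square input."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(size=28, out_channels=64):
    """Trace the spatial size through genericCNN and return the flattened width."""
    size = conv_out(size, 3)             # conv1: 28 -> 26
    size = conv_out(size, 3)             # conv2: 26 -> 24
    size = conv_out(size, 2, stride=2)   # pool1: 24 -> 12
    size = conv_out(size, 3)             # conv3: 12 -> 10
    size = conv_out(size, 3)             # conv4: 10 -> 8
    size = conv_out(size, 2, stride=2)   # pool2: 8 -> 4
    return out_channels * size * size    # 64 channels * 4 * 4


print(flattened_features())  # 1024
```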

In [59]:
# Use the GPU when available and fall back to the CPU otherwise.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = genericCNN().to(device)
# Convert the NumPy arrays to tensors in both cases; Variable wrappers are no longer needed.
X_train = torch.from_numpy(X_train).to(device)
y_train = torch.from_numpy(y_train).to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
loss_fn = torch.nn.CrossEntropyLoss()



In [60]:
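EPOCH_NUM = 20
BATCH_SIZE = 256
# The first 55,000 samples train the model; the last 5,000 are held out for validation.
DATASET = TensorDataset(X_train[:55000], y_train[:55000])
X_val, y_val = X_train[55000:], y_train[55000:].long()

LOSS = []  # training loss of every batch
ACC = []   # validation accuracy after every batch

sp = time.time()
l = 0  # running sum of batch losses since the last report
train_loader = DataLoader(dataset=DATASET, batch_size=BATCH_SIZE, shuffle=True)
for epoch in range(EPOCH_NUM):
    for i, (images, labels) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model(images)
        loss = loss_fn(outputs, labels.long())
        l += loss.item()
        loss.backward()
        optimizer.step()

        # Validate in eval mode so BatchNorm uses its running statistics,
        # and without gradients so no graph is built for the 5,000 images.
        model.eval()
        with torch.no_grad():
            y_pred = model(X_val)
        model.train()
        acc = (torch.argmax(y_pred, 1) == y_val).sum().item() / len(y_val)

        LOSS.append(loss.item())  # store a float, not the graph-holding tensor
        ACC.append(acc)

        if i % 50 == 49:
            # NOTE: `l` sums 50 batches but is divided by 100, so the printed value is
            # half the mean batch loss. The divisor is kept to match the logged output.
            print('duration: %.1f sec, loss: %.3f' % (time.time() - sp, l / 100))
            print('Epoch: %d, Accuracy: %.2f%%\n' % (epoch + 1, 100 * acc))

            l = 0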


duration: 6.3 sec, loss: 0.755
Epoch: 1, Accuracy: 72.74%

duration: 12.7 sec, loss: 0.415
Epoch: 1, Accuracy: 77.00%

duration: 19.0 sec, loss: 0.326
Epoch: 1, Accuracy: 79.72%

duration: 25.3 sec, loss: 0.281
Epoch: 1, Accuracy: 82.06%

duration: 33.6 sec, loss: 0.319
Epoch: 2, Accuracy: 84.06%

duration: 39.9 sec, loss: 0.220
Epoch: 2, Accuracy: 85.58%

duration: 46.2 sec, loss: 0.207
Epoch: 2, Accuracy: 86.06%

duration: 52.6 sec, loss: 0.197
Epoch: 2, Accuracy: 87.38%

duration: 60.8 sec, loss: 0.241
Epoch: 3, Accuracy: 88.06%

duration: 67.1 sec, loss: 0.174
Epoch: 3, Accuracy: 88.36%

duration: 73.5 sec, loss: 0.170
Epoch: 3, Accuracy: 88.54%

duration: 79.8 sec, loss: 0.163
Epoch: 3, Accuracy: 88.72%

duration: 88.1 sec, loss: 0.208
Epoch: 4, Accuracy: 89.32%

duration: 94.5 sec, loss: 0.153
Epoch: 4, Accuracy: 89.50%

duration: 100.8 sec, loss: 0.149
Epoch: 4, Accuracy: 89.32%

duration: 107.2 sec, loss: 0.151
Epoch: 4, Accuracy: 89.94%

duration: 115.5 sec, loss: 0.189
Epoch: