# Object Recognition with CNN Models

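## 1. Review the lecture content and reproduce the course code

In this part, review the lecture content and the course code, then reproduce the course code on your own.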

## 2. Answer the following theory questions

### 2.1. Suppose your input is a 100 by 100 gray image, and you use a convolutional layer with 50 filters that are each 5x5. How many parameters does this hidden layer have (including the bias parameters)? 

#### $\color{blue}{\text{Answer: } (5 \times 5 + 1) \times 50 = 1300}$
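
The count follows from each filter holding 5×5 weights for the single gray channel plus one bias; the 100×100 input size does not enter the formula. A minimal sketch that checks the arithmetic, where `conv_param_count` is a hypothetical helper name and not part of any library:

```python
def conv_param_count(in_channels, out_channels, kernel_h, kernel_w, bias=True):
    """Number of learnable parameters in a standard 2D convolutional layer."""
    # Each filter spans every input channel.
    weights_per_filter = in_channels * kernel_h * kernel_w
    bias_per_filter = 1 if bias else 0
    return (weights_per_filter + bias_per_filter) * out_channels


# Gray image: one input channel, 50 filters of size 5x5.
gray_params = conv_param_count(in_channels=1, out_channels=50, kernel_h=5, kernel_w=5)
# For comparison, the same layer on an RGB image (three input channels).
rgb_params = conv_param_count(in_channels=3, out_channels=50, kernel_h=5, kernel_w=5)

print(gray_params)  # 1300
print(rgb_params)   # 3800
```

The RGB case is included only for contrast: the number of input channels changes the count, the spatial size of the image does not.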

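### 2.2. What are "local invariance" and "parameter sharing"?
#### $\color{blue}{\text{Parameter sharing (weight sharing)}}$
Information learned from one local region is applied to the rest of the image. In other words, the same convolution kernel is convolved over the whole image, which amounts to filtering the entire image. If the feature a kernel responds to is, say, an edge, then filtering the whole image with that kernel extracts the edges at every position (this helps achieve invariance). Different features are captured by multiple different kernels.
#### $\color{blue}{\text{Local invariance (locally invariant features)}}$
The local statistics of an image repeat across the whole image, that is, they are independent of position. If an image contains some basic pattern, that pattern may appear anywhere, so sharing the same weights across positions makes it possible to detect the same pattern at different positions in the data. For example, if convolving the first window yields an edge feature, then the kernel encodes how to extract edges, and the same kernel can be used to extract edges from other regions.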
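
To make the idea concrete, here is a small pure-Python sketch of sliding one kernel over two images. The kernel, the images, and the helper name `conv2d_valid` are all made up for illustration. As in most deep learning libraries, the operation is technically cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over the image (no padding, stride 1), reusing the same weights everywhere."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical 1x2 kernel that responds to a dark-to-bright vertical edge.
edge_kernel = [[-1, 1]]

# The bright region starts at column 1 in the first image and at column 3 in the second.
image_a = [[0, 1, 1, 1, 1],
           [0, 1, 1, 1, 1]]
image_b = [[0, 0, 0, 1, 1],
           [0, 0, 0, 1, 1]]

response_a = conv2d_valid(image_a, edge_kernel)
response_b = conv2d_valid(image_b, edge_kernel)
print(response_a)  # [[1, 0, 0, 0], [1, 0, 0, 0]]
print(response_b)  # [[0, 0, 1, 0], [0, 0, 1, 0]]
```

The same two weights detect the edge wherever it occurs: the response simply moves with the edge, and no position-specific parameters are needed.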

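### 2.3. Why do we use batch normalization?
Why normalize at all? Without normalization, different features take values of different orders of magnitude, and the features with larger magnitude dominate. This shows up in the loss function: before feature scaling the contours of the loss are elongated ellipses, and after scaling they are close to circles, so wherever the optimizer starts it converges to the optimum more easily and avoids many detours. In a neural network in particular, the features are linearly combined and then passed through an activation function. If one feature has a very large magnitude, a saturating activation such as the sigmoid enters its saturation region early: however much the value grows, the activation stays near 1 and barely changes, so the activation becomes insensitive to that feature. When the network is optimized with SGD or similar algorithms, data on different scales unbalance the network and make training unstable.

In a neural network this problem arises not only at the input layer but also in the hidden layers, because the output of one layer is the input of the next and also passes through an activation function. This is why batch normalization is needed: the linear output z of the previous layer is normalized using the mean and standard deviation of the current batch, then passed through the activation function and on to the next layer.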
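
The normalization step can be sketched in a few lines of plain Python. This is a simplified illustration for a single feature and is not how any particular library implements it. The names `batch_norm_1d`, `gamma`, `beta`, and `eps` are assumptions of this sketch; the learnable scale and shift follow the usual formulation of batch normalization, and running statistics for inference are omitted.

```python
import math


def batch_norm_1d(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a batch, then apply scale (gamma) and shift (beta)."""
    n = len(batch)
    mean = sum(batch) / n
    # Biased (divide-by-n) variance of the current batch.
    var = sum((z - mean) ** 2 for z in batch) / n
    return [gamma * (z - mean) / math.sqrt(var + eps) + beta for z in batch]


# Two features with the same shape but very different orders of magnitude.
small_scale = [0.1, 0.2, 0.3, 0.4]
large_scale = [100.0, 200.0, 300.0, 400.0]

norm_small = batch_norm_1d(small_scale)
norm_large = batch_norm_1d(large_scale)
print([round(v, 3) for v in norm_small])
print([round(v, 3) for v in norm_large])
```

After normalization both features have roughly zero mean and unit variance, so neither dominates the next layer purely because of its scale.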

### 2.4. What problem does dropout try to solve?
##### $\color{blue}{\text{Preventing overfitting}}$
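
As a rough illustration of the mechanism, the sketch below implements "inverted" dropout in plain Python: during training each activation is zeroed with probability `p` and the survivors are rescaled by `1/(1-p)`, while at evaluation time the input passes through unchanged. The function name and its parameters are assumptions of this sketch, not a library API.

```python
import random


def dropout(values, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each value with probability p, rescale survivors by 1/(1-p)."""
    if not training or p == 0.0:
        return list(values)
    rng = rng or random.Random()
    keep_prob = 1.0 - p
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


activations = [1.0] * 10000
rng = random.Random(0)  # fixed seed so the sketch is repeatable
train_out = dropout(activations, p=0.5, training=True, rng=rng)
eval_out = dropout(activations, p=0.5, training=False)

dropped_fraction = sum(1 for v in train_out if v == 0.0) / len(train_out)
print(dropped_fraction)                  # close to 0.5
print(sum(train_out) / len(train_out))   # mean stays close to 1.0
print(eval_out == activations)           # True
```

Because a different random subset of units is silenced at every step, the network cannot rely on any single unit, which is the intuition behind its regularizing effect.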

## 3. Practical exercise

### 3.1 In the first session of the practical part, you will implement an image classification model using any deep learning library you are familiar with; that is, besides tensorflow and keras, you can also use pytorch, caffe, and so on. The dataset used in this session is cifar10, whose training set contains 50000 color (RGB) images, each of size 32x32x3, with a further 10000 images in the test set. All images are classified into ten categories.

##### It is your turn to build your model. Try your best to build a model with good performance on the test set.

In [None]:
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.models as models
import torchvision.transforms as transforms
import torchvision.datasets as datasets
from torch.utils.data import DataLoader

In [5]:
# Define hyperparameters
EPOCH = 2
BATCH_SIZE = 64
lr = 0.001

In [6]:
# Load the data (download=False assumes CIFAR-10 is already present under ./data)
train_data = datasets.CIFAR10(root='./data',train=True,
                             transform=transforms.ToTensor(),download=False)
test_data = datasets.CIFAR10(root='./data',train=False,
                            transform=transforms.ToTensor(),download=False)

In [10]:
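# Split the data into batches
train_loader = DataLoader(dataset=train_data,batch_size=BATCH_SIZE,shuffle=True)
# Shuffling is unnecessary for evaluation, so the test loader keeps a fixed order
test_loader = DataLoader(dataset=test_data,batch_size=BATCH_SIZE,shuffle=False)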
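
A quick back-of-the-envelope check of how many batches each loader yields, assuming the `DataLoader` default of keeping the final partial batch. `batches_per_epoch` is a hypothetical helper written for this sketch:

```python
import math


def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches one pass over the data produces."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


train_batches = batches_per_epoch(50000, 64)
test_batches = batches_per_epoch(10000, 64)
# Size of the final, partial training batch.
last_train_batch = 50000 - (train_batches - 1) * 64

print(train_batches)     # 782
print(test_batches)      # 157
print(last_train_batch)  # 16
```

With 782 training batches per epoch and a log line every 30 batches, the last log line of an epoch is at step 780.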

In [15]:
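# Use the DenseNet model built into torchvision; num_classes=10 matches CIFAR-10 (the default head has 1000 outputs)
model = models.densenet121(pretrained=False, num_classes=10)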

In [16]:
# Set up the loss function and the optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(),lr=lr)
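
To help interpret the loss values logged during training, the sketch below computes the cross-entropy of a single sample by hand (log-softmax followed by the negative log-likelihood, which is what `nn.CrossEntropyLoss` combines). The helper `cross_entropy` and the example logits are made up for illustration.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]


# A model that predicts all K classes with equal probability scores ln(K).
uniform_10 = cross_entropy([0.0] * 10, target=3)
uniform_1000 = cross_entropy([0.0] * 1000, target=3)
# A confident, correct prediction gives a loss near zero.
confident = cross_entropy([0.0, 0.0, 10.0, 0.0], target=2)

print(round(uniform_10, 4))    # 2.3026
print(round(uniform_1000, 4))  # 6.9078
print(confident < 0.001)       # True
```

A useful rule of thumb: with ten classes, a loss well below about 2.30 indicates the model is doing better than uniform guessing.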

In [17]:
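# Training
model.train()  # enable training-mode behavior of batch norm and dropout layers
for epoch in range(EPOCH):
    for i,data in enumerate(train_loader):
        inputs,labels = data
        # Forward pass
        outputs = model(inputs)
        # Compute the loss
        loss = criterion(outputs,labels)
        # Clear the gradients left over from the previous step
        optimizer.zero_grad()
        # Backward pass
        loss.backward()
        # Update the parameters
        optimizer.step()
        # Log the loss every 30 batches
        if i%30==29:
            print("i:{},epoch:{},loss:{:.4f}".format(i+1,epoch,loss.item()))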

i:30,epoch:0,loss:1.7478
i:60,epoch:0,loss:1.7618
i:90,epoch:0,loss:1.9229
i:120,epoch:0,loss:1.5330
i:150,epoch:0,loss:1.5660
i:180,epoch:0,loss:1.4621
i:210,epoch:0,loss:1.4829
i:240,epoch:0,loss:1.3796
i:270,epoch:0,loss:1.3532
i:300,epoch:0,loss:1.6536
i:330,epoch:0,loss:1.3436
i:360,epoch:0,loss:1.3088
i:390,epoch:0,loss:1.3261
i:420,epoch:0,loss:1.4839
i:450,epoch:0,loss:1.1727
i:480,epoch:0,loss:1.1294
i:510,epoch:0,loss:1.6539
i:540,epoch:0,loss:1.1720
i:570,epoch:0,loss:1.3422
i:600,epoch:0,loss:1.1104
i:630,epoch:0,loss:1.0515
i:660,epoch:0,loss:1.1519
i:690,epoch:0,loss:1.0400
i:720,epoch:0,loss:1.1184
i:750,epoch:0,loss:1.2436
i:780,epoch:0,loss:0.9997
i:30,epoch:1,loss:0.8343
i:60,epoch:1,loss:0.9428
i:90,epoch:1,loss:0.9946
i:120,epoch:1,loss:1.1810
i:150,epoch:1,loss:1.3908
i:180,epoch:1,loss:0.8470
i:210,epoch:1,loss:0.8614
i:240,epoch:1,loss:1.1940
i:270,epoch:1,loss:1.1707
i:300,epoch:1,loss:0.9650
i:330,epoch:1,loss:1.0422
i:360,epoch:1,loss:1.0225
i:390,epoch:1,loss

In [19]:
# Save the model (this pickles the whole model object, not just its weights)
torch.save(model,"cifar10_densenet121.pt")
print("cifar10_densenet121 saved")

cifar10_densenet121 saved


In [25]:
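# Testing
# Note: recent PyTorch releases may require torch.load(..., weights_only=False) to load a fully pickled model
model = torch.load("cifar10_densenet121.pt")
model.eval()  # use running batch-norm statistics and disable dropout
correct,total = 0,0
with torch.no_grad():  # gradients are not needed for evaluation
    for i,data in enumerate(test_loader):
        inputs,labels = data
        # Forward pass
        outputs = model(inputs)
        # The predicted class is the index of the largest logit
        _,predicted = torch.max(outputs,1)
        total+=labels.size(0)
        correct+=(predicted==labels).sum().item()
print("Accuracy on the 10000 test images: {:.4f}%".format(100.0*correct/total))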

tensor([[  2.1603,  -1.4474,   5.1433,  ...,  -6.8338,  -6.4777,  -5.7651],
        [  1.2352,   2.9728,   7.0455,  ..., -13.0346, -13.1349, -12.1516],
        [  0.5995,   3.6988,   1.8783,  ...,  -7.5147,  -7.0946,  -6.6891],
        ...,
        [  4.4432,   6.0482,   1.4342,  ...,  -8.9117,  -7.8043,  -7.8251],
        [  2.1964,  -1.9868,   5.5197,  ...,  -7.2866,  -7.0002,  -6.5239],
        [  2.3356,  -1.3644,   4.3710,  ...,  -9.0622,  -8.6308,  -7.7677]],
       grad_fn=<AddmmBackward>)
tensor([[  2.7750,   2.2089,   2.6837,  ...,  -7.8354,  -7.2944,  -6.9703],
        [  1.4201,  -0.5061,   6.0062,  ...,  -8.0113,  -7.4115,  -7.0700],
        [  4.6798,  10.3462,   2.6835,  ..., -13.1156, -12.2545, -11.3196],
        ...,
        [ 10.3972,   2.1791,   5.3806,  ...,  -9.2449,  -8.7958,  -8.4116],
        [ 13.7159,  10.9192,   9.8328,  ..., -23.0810, -22.0185, -20.6065],
        [  4.5631,  -2.0598,   7.5964,  ...,  -8.0467,  -7.8516,  -7.2575]],
       grad_fn=<AddmmBackwar