# Solving Almost Any Image Classification Problem with PyTorch
![avater](https://cdn-images-1.medium.com/max/800/1*jcZLpgh3gppeFFgcpFSP0w.jpeg)
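This is an experimental setup for building a code base in PyTorch. The main goal is to apply pretrained models quickly for transfer learning. In this post we use the plant seedlings classification dataset, which comes from a Kaggle [competition](https://www.kaggle.com/c/plant-seedlings-classification).

The following pretrained models are available in PyTorch:

- resnet18, resnet34, resnet50, resnet101, resnet152
- squeezenet1_0, squeezenet1_1
- alexnet
- inception_v3
- densenet121, densenet169, densenet201
- vgg11, vgg13, vgg16, vgg19, vgg11_bn, vgg13_bn, vgg16_bn, vgg19_bn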

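### Three transfer learning scenarios and how to handle them in PyTorch
I have already discussed the intuition behind transfer learning in a previous [post](https://medium.com/@14prakash/transfer-learning-using-keras-d804b2e04ef8). Briefly, there are three options:

1. Freeze all layers except the last one.
2. Freeze the first few layers.
3. Fine-tune the entire network.

Transfer learning in PyTorch is straightforward once you know the model's structure. The models listed above are all implemented differently: some wrap many layers in sequential containers, while others expose the layers directly. You therefore need to read each model's definition in PyTorch carefully.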
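The three scenarios can be sketched on a tiny stand-in network. This is a minimal illustration rather than the code used for the experiments: the `nn.Sequential` model and the helpers `set_trainable` and `count_trainable` are hypothetical names introduced here, and a real run would apply the same idea to a torchvision model.

```python
import torch.nn as nn

# Tiny stand-in network; a real run would use a torchvision model instead.
model = nn.Sequential(
    nn.Linear(8, 16),   # "early" layer
    nn.ReLU(),
    nn.Linear(16, 16),  # "middle" layer
    nn.ReLU(),
    nn.Linear(16, 4),   # final classification layer
)

def set_trainable(module, flag):
    """Enable or disable gradient updates for every parameter in module."""
    for param in module.parameters():
        param.requires_grad = flag

def count_trainable(module):
    """Number of scalar parameters that will receive gradients."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

# Scenario 1: freeze everything except the last layer.
set_trainable(model, False)
set_trainable(model[-1], True)
only_head = count_trainable(model)

# Scenario 2: freeze only the first few layers (here: the first Linear).
set_trainable(model, True)
set_trainable(model[0], False)
partial = count_trainable(model)

# Scenario 3: fine-tune the whole network.
set_trainable(model, True)
full = count_trainable(model)

print(only_head, partial, full)  # 68 340 484
```

The number of trainable parameters grows from scenario 1 to scenario 3, which is the trade-off between training cost and flexibility that the three options represent.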

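### ResNet and Inception_v3
Several ResNet variants are available, and we can pick whichever suits our needs. Because ImageNet has 1000 classes, we need to change the output of the last layer to match our own number of classes. We also need to freeze every layer that should not be trained and pass only the trainable parameters to the optimizer.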
```python
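import torch.nn as nn
import torchvision

n_class = 12       # number of classes in the target dataset (assumed: 12 seedling species)
use_resnet = True  # set to False to build Inception_v3 instead

if use_resnet:
    model_conv = torchvision.models.resnet50()
else:
    model_conv = torchvision.models.inception_v3()

## Change the last layer
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, n_class)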
```
model_conv is a container whose children each have their own children (layers). Here is what that looks like for resnet50.
```python
for name, child in model_conv.named_children():
    for name2, params in child.named_parameters():
        print(name, name2)

## A long list of parameters is printed; some of them are shown below:
# conv1 weight
# bn1 weight
# bn1 bias
# ....
# fc weight
# fc bias
```
To freeze some layers before training, we can simply do the following:
```python
## Freezing all layers
for params in model_conv.parameters():
    params.requires_grad = False

## Freezing the first few layers. Here I am freezing the first 7 children
## (top-level modules) of the model
for ct, (name, child) in enumerate(model_conv.named_children()):
    if ct < 7:
        for name2, params in child.named_parameters():
            params.requires_grad = False
```
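Freezing only has the intended effect if the optimizer leaves the frozen weights alone. The sketch below checks this on a small stand-in model (the `backbone` and `head` layers are hypothetical, not part of any torchvision network): only parameters with `requires_grad=True` are handed to the optimizer, and after one step the frozen weights are unchanged.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Stand-in model: a frozen "backbone" layer followed by a trainable head.
backbone = nn.Linear(8, 8)
head = nn.Linear(8, 3)
model = nn.Sequential(backbone, nn.ReLU(), head)

for param in backbone.parameters():
    param.requires_grad = False

# Hand only the trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable, lr=0.1, momentum=0.9)

backbone_before = backbone.weight.clone()
head_before = head.weight.clone()

# One training step on random data.
inputs = torch.randn(16, 8)
targets = torch.randint(0, 3, (16,))
loss = nn.CrossEntropyLoss()(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

backbone_unchanged = torch.equal(backbone.weight, backbone_before)
head_changed = not torch.equal(head.weight, head_before)
print(backbone_unchanged, head_changed)
```

Filtering the parameter list also keeps the optimizer from allocating momentum buffers for weights that never change.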
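Changing the last layer to fit new data takes some care, and we need to inspect the layers closely. We have covered ResNet and Inception_v3; now let's look at the other networks.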

### Squeeze-Net
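PyTorch provides two SqueezeNet versions, and either can be used. Unlike ResNet, which ends in a single fc layer, SqueezeNet ends in a layer wrapped in a (sequential) container. We therefore list all of the container's children, replace the ones that need to change, and put them back into a container, as in the following code.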
```python
model_conv = torchvision.models.squeezenet1_1()
for name, child in model_conv.named_children():
    print(name)

# Prints:
# features
# classifier

## How many in_channels are there for the conv layer
in_ftrs = model_conv.classifier[1].in_channels

## How many out_channels are there for the conv layer
out_ftrs = model_conv.classifier[1].out_channels

## Converting a sequential layer to list of layers
features = list(model_conv.classifier.children())

## Changing the conv layer to the required dimension (a 1x1 convolution)
features[1] = nn.Conv2d(in_ftrs, n_class, kernel_size=1)

## Changing the pooling layer as per the architecture output.
## The kernel size (12) must equal the spatial size of the final feature map
## for your input resolution; nn.AdaptiveAvgPool2d((1, 1)) avoids hard-coding it.
features[3] = nn.AvgPool2d(12, stride=1)

## Making a container to list all the layers
model_conv.classifier = nn.Sequential(*features)

## Mentioning the number of output classes
model_conv.num_classes = n_class
```
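With a convolutional head, the number of classes is the number of output channels of the final 1x1 convolution, and the pooling layer collapses the spatial dimensions. The sketch below uses a stand-in classifier rather than the real torchvision model; the layer order and the 512 input channels are assumptions chosen to resemble SqueezeNet's head, and `nn.AdaptiveAvgPool2d` is swapped in for the fixed-size pooling so that the input resolution does not matter.

```python
import torch
import torch.nn as nn

n_class = 12  # assumed number of target classes

# Stand-in for a SqueezeNet-style classifier: dropout, 1x1 conv, ReLU, pooling.
classifier = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Conv2d(512, 1000, kernel_size=1),
    nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d((1, 1)),
)

# Swap the 1x1 convolution for one with n_class output channels.
layers = list(classifier.children())
layers[1] = nn.Conv2d(layers[1].in_channels, n_class, kernel_size=1)
classifier = nn.Sequential(*layers)

# Feature maps of two different spatial sizes both reduce to (batch, n_class).
classifier.eval()
with torch.no_grad():
    out_small = torch.flatten(classifier(torch.randn(2, 512, 13, 13)), 1)
    out_large = torch.flatten(classifier(torch.randn(2, 512, 18, 18)), 1)

print(out_small.shape, out_large.shape)
```

Both outputs have shape `(2, 12)`, which is what a cross-entropy loss over 12 classes expects.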

### Dense-Net
Its structure is similar to ResNet, but the last layer is named classifier. The code is as follows:
```python
model_conv = torchvision.models.densenet121(pretrained=True)  # loads ImageNet weights
num_ftrs = model_conv.classifier.in_features
model_conv.classifier = nn.Linear(num_ftrs, n_class)
```

### VGG and Alex-Net
These are similar to SqueezeNet: the final fc layers are wrapped in a container, so we need to read the container and then replace its last fc layer.
```python
model_conv = torchvision.models.vgg19(pretrained=True)  # loads ImageNet weights
# Number of filters in the bottleneck layer
num_ftrs = model_conv.classifier[6].in_features

# convert all the layers to list and remove the last one
features = list(model_conv.classifier.children())[:-1]

## Add the last layer based on the num of classes in our dataset
features.extend([nn.Linear(num_ftrs, n_class)])

## convert it into container and add it to our model class
model_conv.classifier = nn.Sequential(*features)
```
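The same pattern works for any head that is an `nn.Sequential` ending in a linear layer. As a rough sketch, it can be wrapped in a small helper; `replace_last_linear` is a hypothetical name introduced here, and the stand-in head below is much smaller than the real VGG or AlexNet classifier.

```python
import torch
import torch.nn as nn

def replace_last_linear(classifier, n_class):
    """Return a new Sequential that reuses every layer except the last,
    which is replaced by a Linear layer with n_class outputs.

    Hypothetical helper: assumes the last child of `classifier` is nn.Linear.
    """
    layers = list(classifier.children())
    last = layers[-1]
    if not isinstance(last, nn.Linear):
        raise TypeError("expected the last layer to be nn.Linear")
    layers[-1] = nn.Linear(last.in_features, n_class)
    return nn.Sequential(*layers)

# Small stand-in for a VGG/AlexNet-style classifier head.
head = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(inplace=True), nn.Dropout(),
    nn.Linear(64, 64), nn.ReLU(inplace=True), nn.Dropout(),
    nn.Linear(64, 1000),
)

new_head = replace_last_linear(head, n_class=12)
scores = new_head(torch.randn(4, 32))
print(scores.shape)
```

Reading `in_features` from the existing layer avoids hard-coding the index or the feature size, which differ between architectures.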
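We have now seen how to freeze the layers we want and how to change the last layer of different networks. Next, let's train one of these networks. This is only a brief overview; see my GitHub for the details.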

### Base code
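As with any deep learning model, we first

- define a network
- load the available pretrained weights
- freeze the layers that should not be trained (the frozen layers then act as a feature extractor)
- add a loss
- choose an optimizer
- train the network until it reaches the target performance

Let's walk through this with inception_v3 as an example. We freeze the first few layers and then train the network using SGD with momentum and a cross-entropy loss.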

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torchvision
from torchvision import datasets, transforms
import numpy as np
import time
import os
import argparse

n_class = 12          # number of classes in the target dataset (assumed: 12 seedling species)
freeze_layers = True  # freeze the early layers before training

## Load the model (pretrained=True loads ImageNet weights; newer torchvision uses weights=...)
model_conv = torchvision.models.inception_v3(pretrained=True)

## Let's freeze the first few layers. This is done in two stages
# Stage 1: freeze all the layers
if freeze_layers:
    for name, param in model_conv.named_parameters():
        param.requires_grad = False

# Since ImageNet has 1000 classes, we need to change the last layer according to the number of classes we have
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, n_class)

# Stage 2: unfreeze every child that comes after "Conv2d_4a_3x3",
# so that only the layers up to and including it stay frozen
ct = []
for name, child in model_conv.named_children():
    if "Conv2d_4a_3x3" in ct:
        for params in child.parameters():
            params.requires_grad = True
    ct.append(name)

# To view which layers are frozen and which are not:
for name, child in model_conv.named_children():
    for name_2, params in child.named_parameters():
        print(name, name_2, params.requires_grad)
```
    
```python
## Loading the dataloaders -- make sure that the data is saved in the following way
"""
data/
  - train/
      - class_1 folder/
          - img1.png
          - img2.png
      - class_2 folder/
      .....
      - class_n folder/
  - val/
      - class_1 folder/
      - class_2 folder/
      ......
      - class_n folder/
"""

data_dir = "data/"
input_shape = 299
batch_size = 32
mean = [0.5, 0.5, 0.5]
std = [0.5, 0.5, 0.5]
scale = 360
use_parallel = False
use_gpu = True
epochs = 100

data_transforms = {
    'train': transforms.Compose([
        transforms.Resize(scale),
        transforms.RandomResizedCrop(input_shape),
        transforms.RandomHorizontalFlip(),
        transforms.RandomVerticalFlip(),
        transforms.RandomRotation(degrees=90),
        transforms.ToTensor(),
        transforms.Normalize(mean, std)]),
    'val': transforms.Compose([
        transforms.Resize(scale),
        transforms.CenterCrop(input_shape),
        transforms.ToTensor(),
        transforms.Normalize(mean, std)]),
}

image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x]) for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size,
                                              shuffle=True, num_workers=4) for x in ['train', 'val']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
class_names = image_datasets['train'].classes
```

```python
if use_parallel:
    print("[Using all the available GPUs]")
    model_conv = nn.DataParallel(model_conv, device_ids=[0, 1])

print("[Using CrossEntropyLoss...]")
criterion = nn.CrossEntropyLoss()

print("[Using small learning rate with momentum...]")
optimizer_conv = optim.SGD(list(filter(lambda p: p.requires_grad, model_conv.parameters())), lr=0.001, momentum=0.9)

print("[Creating Learning rate scheduler...]")
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)

print("[Training the model begun ....]")
# train_model function is here: https://github.com/Prakashvanapalli/pytorch_classifiers/blob/master/tars/tars_training.py
model_ft = train_model(model_conv, dataloaders, dataset_sizes, criterion, optimizer_conv, exp_lr_scheduler, use_gpu,
                       num_epochs=epochs)
```
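To get a feel for the scheduler settings used above, the step decay can be worked out by hand: the learning rate is multiplied by `gamma` once every `step_size` epochs. The function `step_lr` below is a hypothetical pure-Python restatement of that rule, not the PyTorch scheduler itself.

```python
def step_lr(base_lr, epoch, step_size=7, gamma=0.1):
    """Learning rate after `epoch` completed epochs under a step decay schedule."""
    return base_lr * gamma ** (epoch // step_size)

# Learning rate at a few epochs for lr=0.001, step_size=7, gamma=0.1.
for epoch in (0, 6, 7, 14, 21):
    print(epoch, f"{step_lr(0.001, epoch):.0e}")
```

With these values the learning rate is already around 1e-06 by epoch 21, so most of the updates in a 100-epoch run are very small. A larger `step_size` or `gamma` may be worth trying if training stalls early.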

### Dataset
https://www.kaggle.com/c/plant-seedlings-classification/data
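The data loading code expects separate `train/` and `val/` folders with one sub-folder per class. Assuming the downloaded data only provides labelled images in a single training folder (check the competition page), one way to create the validation folder is to move a fraction of each class out of `train/`. The helper `split_train_val` and the 20% hold-out fraction are illustrative choices, not part of the original code.

```python
import os
import random
import shutil
import tempfile

def split_train_val(train_dir, val_dir, val_fraction=0.2, seed=0):
    """Move a random fraction of each class folder from train_dir to val_dir."""
    rng = random.Random(seed)
    for class_name in sorted(os.listdir(train_dir)):
        class_dir = os.path.join(train_dir, class_name)
        if not os.path.isdir(class_dir):
            continue
        files = sorted(os.listdir(class_dir))
        rng.shuffle(files)
        n_val = int(len(files) * val_fraction)
        os.makedirs(os.path.join(val_dir, class_name), exist_ok=True)
        for file_name in files[:n_val]:
            shutil.move(os.path.join(class_dir, file_name),
                        os.path.join(val_dir, class_name, file_name))

# Demonstration on a throwaway directory with two fake classes of 10 files each.
root = tempfile.mkdtemp()
for class_name in ("class_1", "class_2"):
    os.makedirs(os.path.join(root, "train", class_name))
    for i in range(10):
        open(os.path.join(root, "train", class_name, f"img{i}.png"), "w").close()

split_train_val(os.path.join(root, "train"), os.path.join(root, "val"))
n_train = len(os.listdir(os.path.join(root, "train", "class_1")))
n_val = len(os.listdir(os.path.join(root, "val", "class_1")))
print(n_train, n_val)  # 8 2
shutil.rmtree(root)
```

Splitting per class keeps the class proportions roughly equal in both folders, and the fixed seed makes the split reproducible.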

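### Metrics
I went further and studied how these models perform under different settings. The idea is to train every network separately and then apply several ensemble methods at the end to improve accuracy. I am also interested in how much the models differ from one another, so I compare their metrics on the training and validation sets.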
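One simple ensemble method is to average the class probabilities predicted by the individual models. The sketch below shows only that variant and is not necessarily the method used for the final submission; `average_ensemble` is a hypothetical helper and the logits are made-up numbers.

```python
import torch

def average_ensemble(logits_per_model):
    """Average the softmax probabilities of several models and pick the top class.

    logits_per_model: list of tensors, each of shape (batch, n_class).
    """
    probs = torch.stack([torch.softmax(logits, dim=1) for logits in logits_per_model])
    mean_probs = probs.mean(dim=0)
    return mean_probs, mean_probs.argmax(dim=1)

# Made-up logits from three hypothetical models for two images and three classes.
logits_a = torch.tensor([[2.0, 0.5, 0.1], [0.2, 0.1, 3.0]])
logits_b = torch.tensor([[1.5, 1.4, 0.0], [0.0, 2.5, 2.0]])
logits_c = torch.tensor([[0.1, 3.0, 0.3], [0.1, 0.3, 2.2]])

mean_probs, predictions = average_ensemble([logits_a, logits_b, logits_c])
print(predictions.tolist())  # [1, 2]
```

For the first image, two of the three models rank class 0 first, yet the averaged probabilities favour class 1 because the third model is very confident. This is where probability averaging differs from majority voting.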
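### Update-1
Cadene has trained several models that are not available in the official PyTorch packages, and I used part of his code to train the following models:

- resnext101_64x4d
- resnext101_32x4d
- nasnetalarge
- inceptionresnetv2
- inceptionv4

I ran into some problems with bn_inception and vggm and will post an update later.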
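### TODO:
1. Add a mixup strategy to all networks
2. Ensemble the model outputs
3. Model stacking
4. Extract bottleneck features and train on them
5. t-SNE visualization
6. Fix the bn_inception problem (the model does not train)
7. Train vggm
8. Implement and train SE-Net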
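### Final submission
At submission time this ranked 29th on the leaderboard.
### Github Repo
- Code: https://github.com/Prakashvanapalli/pytorch_classifiers
- I can share some pretrained weights and prediction files for building different ensembles. Just email me at **vanapaliprakash@gmail.com**.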