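# PaddlePaddle Regular Competition: PALM Pathological Myopia Prediction - First-Place Solution for May
This was my first regular competition and I made it into the top three, so I am a little excited. There are no special tricks here, I mostly just kept training, but I am writing it down as a record in case it gives someone a useful idea.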

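## 0. Competition overview
The regular competition "PALM Pathological Myopia Prediction" reproduces a task from the ISBI 2019 PALM ophthalmology challenge. The pathological myopia prediction task is to assess a fundus image and output the probability that the eye has pathological myopia.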

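The dataset is provided by the Zhongshan Ophthalmic Center of Sun Yat-sen University: 800 color fundus photographs with pathological-myopia classification labels for training, plus another 400 labeled images that the platform uses to test the model. Image resolution is either 1444×1444 or 2124×2056.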

The evaluation metric is AUC (Area Under Curve), the area under the ROC (Receiver Operating Characteristic) curve.
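
As a rough illustration of the metric, AUC can also be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it that way in plain Python. The function name `auc_score` and the toy labels and scores are made up for illustration and are not part of the competition code.

```python
def auc_score(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is O(n_pos * n_neg), which is
    fine for a few hundred images.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 2 negatives, 2 positives.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc_score(toy_labels, toy_scores))
```

Because only the ranking of the scores matters, AUC depends on each probability being paired with the right image, not on a decision threshold.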

![](https://ai-studio-static-online.cdn.bcebos.com/64afb1df4595408088e547b616423f447e5761ae13224ca4be559b0c598d598b)

Competition link: [Regular Competition: PALM Pathological Myopia Prediction](https://aistudio.baidu.com/aistudio/competition/detail/85)

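## 1. Package setup
Since this is a classification task, PaddleClas was the first thing that came to mind. Then I remembered ppim. I think medical images have specific regions that deserve attention, unlike remote-sensing images, so attention-based models should do well. I had heard that ppim reimplements very recent attention networks and that the results compare well with the original code, so I decided to give it a try.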

In [1]:
! pip -q install ppim -i https://pypi.python.org/pypi
# ! git -q clone https://github.com/AgentMaker/Paddle-Image-Models.git

## 2. Data preparation
### 2.1 Unpack the dataset
Nothing much to write here.

In [1]:
# ! unzip -oq /home/aistudio/data/data85133/常规赛：PALM病理性近视预测.zip
# ! rm -rf __MACOSX
# ! mv 常规赛：PALM病理性近视预测 PLAM

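### 2.2 Configure the dataset
- The data already ships with image names and labels, so there is no need to generate a data list. I subclass `Dataset` from `paddle.io` to read the data. As noted at the start, the images come in two resolutions and are very large, but I did not want to shrink them too far and lose detail, so everything is resized to 1120×1120.
- The train/validation split ratio is 0.9, and augmentation is limited to simple color jitter and horizontal flipping.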

In [3]:
import os
import pandas as pd
import numpy as np
import paddle
import paddle.vision.transforms as T
from paddle.io import Dataset
from PIL import Image

class PLAMDatas(Dataset):
    def __init__(self, data_path, class_xls, mode='train', transforms=None):
        super(PLAMDatas, self).__init__()
        self.data_path = data_path
        self.name_label = (pd.read_excel(class_xls)).values
        lens = len(self.name_label)
        if mode == 'train':
            self.name_label = self.name_label[:int(0.9*lens)]
        else:
            self.name_label = self.name_label[int(0.9*lens):]
        self.transforms = transforms
        
    def __getitem__(self, index):
        name, label = self.name_label[index]
        data_path = os.path.join(self.data_path, name)
        data = np.asarray(Image.open(data_path).convert('RGB'))
        if self.transforms is not None:
            data = self.transforms(data)
        data = data.astype('float32')
        label = np.array(int(label)).astype('int64')
        return data, label
        
    def __len__(self):
        return len(self.name_label)

# Configure data augmentation
train_transforms = T.Compose([
    T.Resize((1120, 1120), interpolation='bicubic'),
    T.ColorJitter(0.1, 0.1, 0.1, 0.1),
    T.RandomHorizontalFlip(),
    T.ToTensor()
])

val_transforms = T.Compose([
    T.Resize((1120, 1120), interpolation='bicubic'),
    T.ToTensor()
])

# Configure the datasets
train_dataset = PLAMDatas(data_path='PLAM/Train/fundus_image', class_xls='PLAM/Train/Classification.xlsx', mode='train', transforms=train_transforms)
val_dataset = PLAMDatas(data_path='PLAM/Train/fundus_image', class_xls='PLAM/Train/Classification.xlsx', mode='test', transforms=val_transforms)

A quick test print here as well, to check that the data loads correctly. It saves hunting for the cause when a pile of errors shows up later.

In [4]:
print(len(train_dataset), len(val_dataset))
for img, lab in train_dataset:
    print(img.shape, lab)
    break
for img, lab in val_dataset:
    print(img.shape, lab)
    break

720 80
[3, 1120, 1120] 0
[3, 1120, 1120] 0
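
The 0.9 split above takes the first 90% of the spreadsheet rows for training and the rest for validation, in file order. If the rows happened to be grouped by class, the two subsets could end up with different class balances. As a possible alternative (not what was used in this solution), the sketch below shuffles row indices with a fixed seed before cutting. The helper name `split_indices` and the seed value are hypothetical.

```python
import random


def split_indices(n, train_ratio=0.9, seed=0):
    """Return (train_idx, val_idx) after a reproducible shuffle of range(n)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split repeatable
    cut = int(train_ratio * n)
    return indices[:cut], indices[cut:]


train_idx, val_idx = split_indices(800)
print(len(train_idx), len(val_idx))
```

The index lists could then be used to select rows from `self.name_label` inside `PLAMDatas.__init__` instead of the plain slices.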


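## 3. Model training
### 3.1 Model setup
At first I used ready-made models such as deit_b_distilled_384 directly. The problem was that I had not noticed these come with a fixed input size. After several failed attempts I nearly gave up and switched to a PaddleClas model, until it finally clicked. 384 or 224 is too small for this task, so I had to use the base `DistilledVisionTransformer` and set the image size to 1120 myself. That means no pretrained weights, so the model has to be trained from scratch. I also changed `patch_size`: a small patch size takes more memory, runs slower, and did not work well. A look at `summary` confirmed that everything finally fit together.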

In [5]:
import paddle
import paddle.nn as nn
from ppim import DistilledVisionTransformer

# Model definition
model = DistilledVisionTransformer(
    img_size=1120,
    patch_size=64,
    class_dim=2)
params = paddle.load('save_models/last.pdparams')
model.set_state_dict(params)
paddle.summary(model, (1, 3, 1120, 1120))
model = paddle.Model(model)

---------------------------------------------------------------------------
 Layer (type)       Input Shape          Output Shape         Param #    
   Conv2D-1     [[1, 3, 1120, 1120]]   [1, 768, 17, 17]      9,437,952   
 PatchEmbed-1   [[1, 3, 1120, 1120]]    [1, 289, 768]            0       
   Dropout-1      [[1, 291, 768]]       [1, 291, 768]            0       
  LayerNorm-1     [[1, 291, 768]]       [1, 291, 768]          1,536     
   Linear-1       [[1, 291, 768]]       [1, 291, 2304]       1,769,472   
   Dropout-2    [[1, 12, 291, 291]]   [1, 12, 291, 291]          0       
   Linear-2       [[1, 291, 768]]       [1, 291, 768]         590,592    
   Dropout-3      [[1, 291, 768]]       [1, 291, 768]            0       
  Attention-1     [[1, 291, 768]]       [1, 291, 768]            0       
  Identity-1      [[1, 291, 768]]       [1, 291, 768]            0       
  LayerNorm-2     [[1, 291, 768]]       [1, 291, 768]          1,536     
   Linear-3       [[1, 291, 768]]   
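
The numbers in the summary can be checked by hand. The sketch below assumes the patch embedding is a convolution with kernel and stride both equal to `patch_size` and no padding, and that two extra tokens (class and distillation) are added to the patch sequence. Both assumptions are inferred from the shapes printed above (17×17 = 289 patches, then 291 tokens), not from reading the ppim source. The helper name `vit_tokens` is made up.

```python
def vit_tokens(img_size, patch_size, extra_tokens=2):
    """Patches per side, total patches, and sequence length for a square input.

    Assumes an unpadded conv patch embedding with kernel == stride == patch_size.
    """
    per_side = (img_size - patch_size) // patch_size + 1
    patches = per_side * per_side
    return per_side, patches, patches + extra_tokens


for patch in (16, 32, 64):
    side, patches, tokens = vit_tokens(1120, patch)
    # The attention matrix of each head has tokens * tokens entries.
    print(patch, side, patches, tokens, tokens * tokens)

# Parameters of the patch-embedding conv: 3 input channels, 768 filters, plus bias.
conv_params = 3 * 64 * 64 * 768 + 768
print(conv_params)
```

With `patch_size=16` the sequence would be 4902 tokens long, so each attention matrix would be roughly 280 times larger than with `patch_size=64`, which matches the memory and speed concern described above. Note that 1120 is not a multiple of 64, so under the no-padding assumption the last 32 pixels along each axis fall outside every patch.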

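### 3.2 Training
Training was done in two runs. The first run of 100 epochs already scored high enough to take first place. After JavaRoom overtook me, I switched to the second setup and trained for another 50 epochs. val_loss was still decreasing at the very end, so training even longer might improve the score further, though I cannot say for sure.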
1. CosineAnnealingDecay + Adam + bs64
2. PolynomialDecay + SGD + ClipGradByGlobalNorm + bs8

In [6]:
# Training setup
# lr = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=3e-6, T_max=int(2*(800*0.9)), verbose=False)
lr = paddle.optimizer.lr.PolynomialDecay(learning_rate=3e-7, decay_steps=1000)
# opt = paddle.optimizer.Adam(learning_rate=lr, parameters=model.parameters(), weight_decay=paddle.regularizer.L2Decay(1e-7))
opt = paddle.optimizer.SGD(learning_rate=lr, parameters=model.parameters(), \
                           weight_decay=paddle.regularizer.L2Decay(1e-9), grad_clip=paddle.nn.ClipGradByGlobalNorm(clip_norm=1.0))
# Resume from the last saved optimizer state
opt_params = paddle.load('save_models/last.pdopt')
opt.set_state_dict(opt_params)
loss = nn.CrossEntropyLoss()
metric = paddle.metric.Accuracy()
model.prepare(optimizer=opt, loss=loss, metrics=metric)
visualdl=paddle.callbacks.VisualDL(log_dir='visual_log')

# Fine-tune the model
model.fit(
    train_data=train_dataset, 
    eval_data=val_dataset, 
    batch_size=8, #  64, 
    epochs=50,  # 100, 
    eval_freq=4,  # 10, 
    log_freq=1, 
    save_dir='save_models', 
    save_freq=4,  # 10, 
    verbose=1, 
    drop_last=True,  # False, 
    shuffle=True,
    num_workers=0,
    callbacks=[visualdl]
)

The loss value printed in the log is the current step, and the metric is the average value of previous step.
Epoch 1/20


step  1/90 [..............................] - loss: 3.1512e-04 - acc: 1.0000 - ETA: 2:33 - 2s/step

save checkpoint at /home/aistudio/save_models/0
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
Eval samples: 80
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
save checkpoint at /home/aistudio/save_models/4
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
Eval samples: 80
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
save checkpoint at /home/aistudio/save_models/8
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
Eval samples: 80
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
save checkpoint at /home/aistudio/save_models/12
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
Eval samples: 80
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
save checkpoint at /home/aistudio/save_models/16
Eval begin

The validation acc no longer tells us anything useful; the only thing left to watch is whether the loss is still decreasing.
```
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 10/10 [==============================] - loss: 0.0102 - acc: 1.0000 - 884ms/step           
Eval samples: 80
```
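
To get a feel for the two learning-rate schedules named in this section, the sketch below evaluates their usual closed-form expressions in plain Python. It illustrates the formulas only and does not call Paddle, so the exact per-step values produced by `paddle.optimizer.lr` may differ. `end_lr` is written out explicitly because the source call does not set it; the library default applies there, and it is worth checking that default against the 3e-7 starting rate, since an end value above the start would make the schedule rise instead of fall.

```python
import math


def cosine_annealing(base_lr, step, t_max, eta_min=0.0):
    """Cosine annealing: base_lr at step 0, eta_min at step t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * step / t_max))


def polynomial_decay(base_lr, step, decay_steps, end_lr, power=1.0):
    """Polynomial decay from base_lr to end_lr, held at end_lr afterwards."""
    step = min(step, decay_steps)
    return (base_lr - end_lr) * (1 - step / decay_steps) ** power + end_lr


# Values taken from the training cell: 3e-6 with T_max = int(2 * (800 * 0.9)),
# and 3e-7 with decay_steps = 1000. end_lr = 0.0 is an assumption for this sketch.
t_max = int(2 * (800 * 0.9))
for step in (0, 500, 1000):
    print(step,
          cosine_annealing(3e-6, step, t_max),
          polynomial_decay(3e-7, step, 1000, end_lr=0.0))
```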

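## 4. Prediction
- The thing to watch here is that the image sizes differ: unlike the 1444×1444 images, the 2124×2056 ones are not even square. At first I resized only one side to 1120 and the results kept coming out wrong; it took me a while to realize why.
- The biggest trap is that the results must be sorted in ascending order (thanks to 吖吖查 for the reminder), otherwise the score is only around 0.5.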

In [7]:
import os
import numpy as np
import pandas as pd
from PIL import Image
import paddle.vision.transforms as T
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from ppim import DistilledVisionTransformer

save_path = 'Classification_Results.csv'
file_path = 'PLAM/PALM-Testing400-Images'
imgs_name = os.listdir(file_path)

model = DistilledVisionTransformer(
    img_size=1120,
    patch_size=64,
    class_dim=2)
params = paddle.load('save_models/last.pdparams')
model.set_state_dict(params)
model.eval()

inf_transforms = T.Compose([
    T.Resize((1120, 1120), interpolation='bicubic'),  # 1120X1120
    T.ToTensor()
])

pre_data = []
for img_name in imgs_name:
    data_path = os.path.join(file_path, img_name)
    data = np.asarray(Image.open(data_path).convert('RGB'))
    data = inf_transforms(data)
    data = data.astype('float32').reshape([1, 3, 1120, 1120])
    pre = model(data)
    pre = F.softmax(pre)
    print([img_name, pre.numpy()[0][1]])
    pre_data.append([img_name, pre.numpy()[0][1]])

pre_data.sort()  # Sort in place, ascending by file name (added when writing this up, so the output shown is unsorted)

df = pd.DataFrame(pre_data, columns=['FileName', 'PM Risk'])
df.to_csv(save_path, index=None)

['T0314.jpg', 0.999534]
['T0102.jpg', 0.99997854]
['T0209.jpg', 0.9999782]
['T0150.jpg', 0.99997663]
['T0134.jpg', 0.99996257]
['T0145.jpg', 0.99964404]
['T0376.jpg', 0.00026754302]
['T0084.jpg', 0.00033837082]
['T0259.jpg', 8.749866e-05]
['T0298.jpg', 0.0005694403]
['T0239.jpg', 0.12634839]
['T0230.jpg', 0.0008878632]
['T0323.jpg', 0.00018469241]
['T0012.jpg', 0.999982]
['T0280.jpg', 0.0002507006]
['T0015.jpg', 0.9972313]
['T0141.jpg', 0.9694529]
['T0064.jpg', 0.99994683]
['T0233.jpg', 0.00037243208]
['T0357.jpg', 0.00026328227]
['T0124.jpg', 0.99963474]
['T0043.jpg', 0.9981799]
['T0214.jpg', 0.00022584207]
['T0056.jpg', 0.0014357708]
['T0126.jpg', 0.99997914]
['T0049.jpg', 0.00021056297]
['T0162.jpg', 0.00017513298]
['T0306.jpg', 0.9999876]
['T0385.jpg', 9.7773285e-05]
['T0023.jpg', 0.9998442]
['T0187.jpg', 0.00019756894]
['T0343.jpg', 0.0002712552]
['T0073.jpg', 0.9999851]
['T0290.jpg', 0.00018723296]
['T0289.jpg', 0.0051840045]
['T0144.jpg', 0.00093522447]
['T0155.jpg', 0.99938774]
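
The ascending-order requirement from the start of this section can be sketched with the standard library alone. Note that `sorted(...)` returns a new list and leaves its argument unchanged, so the result has to be used or assigned. The helper name `write_sorted_results` is hypothetical; the column names match the ones used in the prediction cell, and the three rows are copied from the output above.

```python
import csv
import io


def write_sorted_results(rows, out):
    """Write [file_name, probability] rows as CSV, sorted ascending by file name."""
    writer = csv.writer(out)
    writer.writerow(['FileName', 'PM Risk'])
    for name, prob in sorted(rows, key=lambda row: row[0]):
        writer.writerow([name, prob])


rows = [
    ['T0314.jpg', 0.999534],
    ['T0102.jpg', 0.99997854],
    ['T0012.jpg', 0.999982],
]
buffer = io.StringIO()  # stands in for an open file in this sketch
write_sorted_results(rows, buffer)
lines = buffer.getvalue().splitlines()
print(lines)
```

One plausible reason for the ~0.5 score on unsorted output is that the platform pairs predictions with labels by row position. That is a guess based on the symptom, not something the competition page is quoted on here.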

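## Takeaways
1. My impression is that VisionTransformer works remarkably well on medical images. Without knowing any tricks, it alone was enough to climb the leaderboard.
2. As expected, starting with Adam and finishing with SGD for the final adjustments worked well.
3. AI Studio is fiercely competitive: as the month drew to a close, everyone's scores started shooting up.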

## *References
1. [PPIM](https://github.com/AgentMaker/Paddle-Image-Models)