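# Computing Semantic Features for Test-Set Images

Extract the output of an intermediate layer of the fruit image classification model trained with MMClassification, and use it as the semantic feature of the input image.

Compute the semantic features of every image in the test set, then reduce them to two and three dimensions with t-SNE and UMAP for visualization.

The reduced features can then be used to analyze semantic distances between classes, anomalous data, fine-grained classification, and the structure of the high-dimensional data.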

同济子豪兄: https://space.bilibili.com/1900783

[Cloud GPU environment used to run the code](https://featurize.cn/?s=d7ce99f842414bfcaea5662a97581bd1): GPU RTX 3060, CUDA v11.2

## Enter the mmclassification directory

In [1]:
import os
os.chdir('mmclassification')

## Import packages

In [18]:
import pandas as pd
import numpy as np
from tqdm import tqdm

from mmcv import Config

from mmcls.datasets.pipelines import Compose

from mmcls.apis import init_model

import torch

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print('device', device)

device cuda:0


## Load the trained fruit image classification model

In [5]:
# The newly trained MobileNetV2 model
config_file = 'configs/mobilenet_v2/mobilenet_v2_1x_fruit30.py'
checkpoint_file = 'work_dirs/mobilenet_v2_1x_fruit30/latest.pth'
# checkpoint_file = 'https://zihao-openmmlab.obs.myhuaweicloud.com/20220716-mmclassification/checkpoints/fruit30_mmcls/latest.pth'

# Build the model from the config file and the checkpoint weights file
model = init_model(config_file, checkpoint_file, device=device)

cfg = model.cfg
test_pipeline = Compose(cfg.data.test.pipeline)

load checkpoint from local path: work_dirs/mobilenet_v2_1x_fruit30/latest.pth


## Compute the semantic feature of a single image

In [7]:
img_path = 'fruit30_split/val/菠萝/105.jpg'

In [8]:
data = {
    'img_info': {'filename':img_path},
    'img_prefix': None
}

data = test_pipeline(data)
img = data['img'].unsqueeze(0).to(device)

In [9]:
img.shape

torch.Size([1, 3, 224, 224])

In [10]:
features = model.extract_feat(img)

In [11]:
features[0].shape

torch.Size([1, 1280])

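By default, `extract_feat` returns the output of the `neck` layer, and this 1280-dimensional vector is used as the semantic feature of the image.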
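To make the `[1, 1280]` shape more concrete, here is a minimal NumPy sketch of what a global-average-pooling neck does. It assumes the neck of this model is global average pooling and that the backbone emits a 1280-channel, 7×7 feature map for a 224×224 input; the real layout depends on the config file. The function `global_average_pool` and the random `feature_map` are hypothetical stand-ins, not MMClassification APIs.

```python
import numpy as np


def global_average_pool(feature_map: np.ndarray) -> np.ndarray:
    """Average each channel over its spatial grid: (N, C, H, W) -> (N, C)."""
    return feature_map.mean(axis=(2, 3))


# Hypothetical backbone output: 1 image, 1280 channels, 7x7 spatial grid.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((1, 1280, 7, 7)).astype(np.float32)

pooled = global_average_pool(feature_map)
print(pooled.shape)  # (1, 1280)
```

Each of the 1280 values summarizes one channel over the whole image, which is why the vector carries no spatial layout and can be compared directly between images.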

## Load the test-set classification results

In [15]:
df = pd.read_csv('work_dirs/mobilenet_v2_1x_fruit30/测试集预测结果.csv')

In [16]:
df.head()

| Unnamed: 0 | 图像路径 | 标注类别名称 | 标注类别ID | top-1-预测ID | top-1-预测名称 | top-2-预测ID | top-2-预测名称 | top-3-预测ID | top-3-预测名称 | top-n预测正确 | ... | 草莓-预测置信度 | 荔枝-预测置信度 | 菠萝-预测置信度 | 葡萄-白-预测置信度 | 葡萄-红-预测置信度 | 西瓜-预测置信度 | 西红柿-预测置信度 | 车厘子-预测置信度 | 香蕉-预测置信度 | 黄瓜-预测置信度 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | fruit30_split/val/苦瓜/161.jpg | 苦瓜 | 17 | 17.0 | 苦瓜 | 14.0 | 胡萝卜 | 23.0 | 葡萄-白 | 1.0 | ... | 0.001597 | 1.300516e-06 | 4.565059e-07 | 0.001999 | 0.000494452 | 4.040559e-05 | 0.0001985327 | 1.293132e-07 | 4.450464e-07 | 8.378662e-05 |
| 1 | fruit30_split/val/苦瓜/158.jpg | 苦瓜 | 17 | 17.0 | 苦瓜 | 23.0 | 葡萄-白 | 29.0 | 黄瓜 | 1.0 | ... | 2e-05 | 1.148556e-07 | 1.931659e-07 | 0.010118 | 1.288236e-05 | 3.079307e-06 | 3.649963e-06 | 1.672791e-08 | 7.884714e-07 | 0.0002289558 |
| 2 | fruit30_split/val/苦瓜/148.jpg | 苦瓜 | 17 | 17.0 | 苦瓜 | 14.0 | 胡萝卜 | 29.0 | 黄瓜 | 1.0 | ... | 2e-05 | 9.969936e-09 | 1.933638e-07 | 1.1e-05 | 2.079129e-07 | 3.513297e-05 | 2.743521e-07 | 5.043417e-10 | 4.636924e-07 | 0.0002881152 |
| 3 | fruit30_split/val/苦瓜/183.jpg | 苦瓜 | 17 | 17.0 | 苦瓜 | 23.0 | 葡萄-白 | 14.0 | 胡萝卜 | 1.0 | ... | 9.4e-05 | 1.243638e-07 | 3.491051e-07 | 0.001021 | 1.807617e-05 | 3.682075e-06 | 5.000793e-06 | 1.761318e-08 | 3.402481e-06 | 2.371633e-05 |
| 4 | fruit30_split/val/苦瓜/41.jpeg | 苦瓜 | 17 | 17.0 | 苦瓜 | 23.0 | 葡萄-白 | 20.0 | 草莓 | 1.0 | ... | 4e-06 | 1.345433e-10 | 2.634503e-08 | 0.000514 | 7.998624e-08 | 1.178391e-09 | 1.978439e-08 | 6.912543e-13 | 2.308019e-08 | 5.676298e-07 |


## Compute the semantic feature of every test-set image

In [19]:
encoding_array = []
img_path_list = []

for img_path in tqdm(df['图像路径']):
    img_path_list.append(img_path)

    # Preprocess the image with the test pipeline
    data = {
        'img_info': {'filename': img_path},
        'img_prefix': None
    }

    data = test_pipeline(data)
    img = data['img'].unsqueeze(0).to(device)

    # Compute the semantic feature; no gradients are needed for inference
    with torch.no_grad():
        feature = model.extract_feat(img)[0].squeeze().detach().cpu().numpy()

    encoding_array.append(feature)
encoding_array = np.array(encoding_array)

100%|██████████| 1078/1078 [00:14<00:00, 72.96it/s]


In [20]:
encoding_array.shape

(1078, 1280)
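With one feature vector per image, a simple way to look at semantic distance between classes is to average the vectors of each class and compare the resulting centroids by cosine distance. The sketch below is only an illustration on synthetic data: the helpers `class_centroids` and `cosine_distance_matrix`, the 8-dimensional features, and the class names `a`, `b`, `c` are all hypothetical. With the real data, `features` would be `encoding_array` and `labels` would come from the `标注类别名称` column.

```python
import numpy as np


def class_centroids(features, labels):
    """Return the sorted class names and the mean feature vector of each class."""
    labels = np.asarray(labels)
    names = sorted(set(labels.tolist()))
    centroids = np.stack([features[labels == name].mean(axis=0) for name in names])
    return names, centroids


def cosine_distance_matrix(vectors):
    """Pairwise cosine distance (1 - cosine similarity) between the rows."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return 1.0 - unit @ unit.T


# Synthetic stand-in for encoding_array: 12 images, 8-dimensional features.
rng = np.random.default_rng(0)
features = rng.standard_normal((12, 8))
labels = ['a'] * 4 + ['b'] * 4 + ['c'] * 4  # hypothetical class names

names, centroids = class_centroids(features, labels)
distances = cosine_distance_matrix(centroids)
print(names)
print(distances.shape)  # (3, 3)
```

A small entry in `distances` means the two classes sit close together in feature space, which is where confusions between similar classes are most likely to appear.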

## Save as a local .npy file

In [22]:
# Save the features as a local .npy file
np.save('work_dirs/mobilenet_v2_1x_fruit30/测试集语义特征.npy', encoding_array)
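A later step can read the file back with `np.load`. The sketch below shows the round trip on a small synthetic array written to a temporary directory; the file name `features.npy` and the 4-row array are placeholders rather than the actual saved file.

```python
import os
import tempfile

import numpy as np

# Synthetic stand-in for encoding_array: 4 images, 1280-dimensional features.
encoding_array = np.random.default_rng(0).standard_normal((4, 1280)).astype(np.float32)

with tempfile.TemporaryDirectory() as tmp_dir:
    npy_path = os.path.join(tmp_dir, 'features.npy')  # placeholder file name
    np.save(npy_path, encoding_array)
    loaded = np.load(npy_path)  # read fully into memory before the directory is removed

print(loaded.shape, loaded.dtype)  # (4, 1280) float32
```

The `.npy` format stores shape and dtype alongside the data, so the loaded array matches the saved one exactly. The row order is the only link back to the image paths, so it is worth keeping the prediction CSV (or `img_path_list`) alongside the feature file.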