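## Deep Learning Features

Extract deep learning features from imaging data such as CT, MRI, endoscopy, and X-ray.

### Onekey steps

1. Convert the data to be processed to jpg. The Onekey tools OKT-convert2jpg and OKT-crop_max_roi can be used for this.
2. Collect all image data in the specified directory.
3. Choose which model's deep learning features to extract. Onekey currently supports the mainstream deep learning models. (Transfer learning with Onekey is also an option.)
4. Extract the features and save them to a feature file.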

In [1]:
from onekey_algo.custom.Manager import onekey_show
onekey_show('深度学习特征提取')

[2025-04-22 15:34:45 - <frozen onekey_algo.custom.Manager>: 138]	INFO	播放视频功能已经设置成：Disable！


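## Getting the Files to Extract Features From

Two batch-processing modes are available:
1. Directory mode: extract features from every jpg file in the specified directory.
2. File mode: the samples to process are listed in a file, one sample per line.

Alternatively, a few specific files can be listed by hand.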
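The two batch modes can be sketched with the standard library alone. `collect_samples` and its `suffix` parameter are hypothetical names used for illustration, not part of Onekey. The sketch assumes that in file mode each non-blank line holds one sample path.

```python
import os
import tempfile


def collect_samples(source, suffix='.jpg'):
    """Return sample paths from a directory (directory mode) or a list file (file mode)."""
    if os.path.isdir(source):
        # Directory mode: every file in the directory with the given suffix, sorted.
        return sorted(
            os.path.join(source, name)
            for name in os.listdir(source)
            if name.lower().endswith(suffix)
        )
    # File mode: one sample path per line; blank lines are skipped.
    with open(source, encoding='utf-8') as handle:
        return [line.strip() for line in handle if line.strip()]


# Small demo on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('b.jpg', 'a.jpg', 'notes.txt'):
        open(os.path.join(tmp, name), 'w').close()
    dir_samples = collect_samples(tmp)

    list_file = os.path.join(tmp, 'samples.txt')
    with open(list_file, 'w', encoding='utf-8') as handle:
        handle.write('\n'.join(dir_samples) + '\n\n')
    file_samples = collect_samples(list_file)

    print([os.path.basename(p) for p in dir_samples])
```

Passing `suffix='.npy'` would select NumPy arrays instead of jpg files.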

In [2]:
#### Get data
from onekey_algo.custom.Manager import onekey_show
onekey_show('深度学习特征提取|获取数据')

[2025-04-22 15:34:45 - <frozen onekey_algo.custom.Manager>: 138]	INFO	播放视频功能已经设置成：Disable！


In [3]:
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

import monai
from glob import glob
import matplotlib.pyplot as plt
from onekey_algo import get_param_in_cwd

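os.makedirs('features', exist_ok=True)  # output directory for the feature files
# Directory containing the samples; this run reads .npy arrays rather than jpg files.
mydir = r'E:\111thymus\thymus_habitat\sol8. 深度（迁移）学习-单（多）中心-多通道-万能图像融合-临床\调试\2_73\merges'
# sorted() keeps the sample order reproducible regardless of how the OS lists files.
samples = sorted(os.path.join(mydir, f) for f in os.listdir(mydir) if f.endswith('.npy'))
samples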

['E:\\111thymus\\thymus_habitat\\sol8. 深度（迁移）学习-单（多）中心-多通道-万能图像融合-临床\\调试\\2_73\\merges\\1.nii.npy',
 'E:\\111thymus\\thymus_habitat\\sol8. 深度（迁移）学习-单（多）中心-多通道-万能图像融合-临床\\调试\\2_73\\merges\\1.nii_+02.npy',
 'E:\\111thymus\\thymus_habitat\\sol8. 深度（迁移）学习-单（多）中心-多通道-万能图像融合-临床\\调试\\2_73\\merges\\1.nii_-02.npy',
 'E:\\111thymus\\thymus_habitat\\sol8. 深度（迁移）学习-单（多）中心-多通道-万能图像融合-临床\\调试\\2_73\\merges\\10.nii.npy',
 'E:\\111thymus\\thymus_habitat\\sol8. 深度（迁移）学习-单（多）中心-多通道-万能图像融合-临床\\调试\\2_73\\merges\\100.nii.npy',
 'E:\\111thymus\\thymus_habitat\\sol8. 深度（迁移）学习-单（多）中心-多通道-万能图像融合-临床\\调试\\2_73\\merges\\100.nii_+02.npy',
 'E:\\111thymus\\thymus_habitat\\sol8. 深度（迁移）学习-单（多）中心-多通道-万能图像融合-临床\\调试\\2_73\\merges\\100.nii_-02.npy',
 'E:\\111thymus\\thymus_habitat\\sol8. 深度（迁移）学习-单（多）中心-多通道-万能图像融合-临床\\调试\\2_73\\merges\\101.nii.npy',
 'E:\\111thymus\\thymus_habitat\\sol8. 深度（迁移）学习-单（多）中心-多通道-万能图像融合-临床\\调试\\2_73\\merges\\101.nii_+02.npy',
 'E:\\111thymus\\thymus_habitat\\sol8. 深度（迁移）学习-单（多）中心-多通道-万能图像融合-临床\

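## Choosing the Features to Extract

Use a keyword to select which layer's features to extract.

### Supported model names

Set the `model_name` variable in the code to one of the model names below.

| **Model family** | **Model name**                                           |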
| ------------ | ------------------------------------------------------------ |
| AlexNet      | alexnet                                                      |
| VGG          | vgg11, vgg11_bn, vgg13, vgg13_bn, vgg16, vgg16_bn, vgg19_bn, vgg19 |
| ResNet       | resnet18, resnet34, resnet50, resnet101, resnet152, resnext50_32x4d, resnext101_32x8d, wide_resnet50_2, wide_resnet101_2 |
| DenseNet     | densenet121, densenet169, densenet201, densenet161           |
| Inception    | googlenet, inception_v3                                      |
| SqueezeNet   | squeezenet1_0, squeezenet1_1                                 |
| ShuffleNetV2 | shufflenet_v2_x2_0, shufflenet_v2_x0_5, shufflenet_v2_x1_0, shufflenet_v2_x1_5 |
| MobileNet    | mobilenet_v2, mobilenet_v3_large, mobilenet_v3_small         |
| MNASNet      | mnasnet0_5, mnasnet0_75, mnasnet1_0, mnasnet1_3              |
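A typo in `model_name` is easier to catch before any model is loaded. The following is a minimal sketch that checks a name against the table above. `SUPPORTED_MODELS` and `model_family` are hypothetical helpers written for this example and are not Onekey APIs.

```python
# Model names copied from the table above, grouped by family.
SUPPORTED_MODELS = {
    'AlexNet': ['alexnet'],
    'VGG': ['vgg11', 'vgg11_bn', 'vgg13', 'vgg13_bn',
            'vgg16', 'vgg16_bn', 'vgg19_bn', 'vgg19'],
    'ResNet': ['resnet18', 'resnet34', 'resnet50', 'resnet101', 'resnet152',
               'resnext50_32x4d', 'resnext101_32x8d',
               'wide_resnet50_2', 'wide_resnet101_2'],
    'DenseNet': ['densenet121', 'densenet169', 'densenet201', 'densenet161'],
    'Inception': ['googlenet', 'inception_v3'],
    'SqueezeNet': ['squeezenet1_0', 'squeezenet1_1'],
    'ShuffleNetV2': ['shufflenet_v2_x2_0', 'shufflenet_v2_x0_5',
                     'shufflenet_v2_x1_0', 'shufflenet_v2_x1_5'],
    'MobileNet': ['mobilenet_v2', 'mobilenet_v3_large', 'mobilenet_v3_small'],
    'MNASNet': ['mnasnet0_5', 'mnasnet0_75', 'mnasnet1_0', 'mnasnet1_3'],
}


def model_family(model_name):
    """Return the family of a supported model name, or raise ValueError."""
    for family, names in SUPPORTED_MODELS.items():
        if model_name in names:
            return family
    raise ValueError(f'Unsupported model name: {model_name!r}')


print(model_family('densenet121'))
```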

In [4]:
#### Select the model and features
from onekey_algo.custom.Manager import onekey_show
onekey_show('深度学习特征提取|确定模型和特征')

[2025-04-22 15:34:45 - <frozen onekey_algo.custom.Manager>: 138]	INFO	播放视频功能已经设置成：Disable！


In [5]:
from onekey_algo.custom.components.comp2 import extract, print_feature_hook, reg_hook_on_module, \
    init_from_model, init_from_onekey

model_root = r'E:\111thymus\thymus_habitat\sol8. 深度（迁移）学习-单（多）中心-多通道-万能图像融合-临床\dl_models\thy\densenet121\viz'
model, transformer, device = init_from_onekey(model_root)
for n, m in model.named_modules():
    print('Feature name:', n, "|| Module:", m)

[2025-04-22 15:34:45 - <frozen core.transformer_factory>:  45]	INFO	使用2通道，-([0.485, 0.456])/ ([0.229, 0.224])
[2025-04-22 15:34:45 - <frozen onekey_algo.custom.components.comp2>: 231]	INFO	模型参数：{'pretrained': False, 'model_name': 'densenet121', 'num_classes': 2, 'in_channels': 2}


Feature name:  || Module: DenseNet(
  (features): Sequential(
    (conv0): Conv2d(2, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu0): ReLU(inplace=True)
    (pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (denseblock1): _DenseBlock(
      (denselayer1): _DenseLayer(
        (norm1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU(inplace=True)
        (conv1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU(inplace=True)
        (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
      (denselayer2): _DenseLayer(
        (norm1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running

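## Extracting Features

The name printed after `Feature name:` identifies the layer whose output is extracted, for example `layer3.0.conv2`. Deep learning features are usually taken from the last layer, for example `avgpool`. Both example names follow ResNet-style naming; other architectures name their layers differently, so check the printed list for the model in use.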
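For readers unfamiliar with how a layer name turns into a feature vector, here is a minimal sketch of the general PyTorch forward-hook mechanism. It is not Onekey's implementation: `register_hook_by_name` and `save_output` are hypothetical helpers, and the toy network only stands in for a real backbone.

```python
import torch
from torch import nn


def register_hook_by_name(name, model, hook):
    """Attach `hook` to every submodule whose dotted name equals `name`."""
    found = 0
    for module_name, module in model.named_modules():
        if module_name == name:
            module.register_forward_hook(hook)
            found += 1
    return found  # number of modules that matched


# Toy two-channel network (hypothetical, for illustration only).
model = nn.Sequential()
model.add_module('conv', nn.Conv2d(2, 4, kernel_size=3, padding=1))
model.add_module('avgpool', nn.AdaptiveAvgPool2d(1))
model.add_module('flatten', nn.Flatten())
model.add_module('fc', nn.Linear(4, 2))
model.eval()

captured = []


def save_output(module, inputs, output):
    # Flatten (N, C, 1, 1) to (N, C) so that each row is one feature vector.
    captured.append(output.detach().flatten(start_dim=1))


found = register_hook_by_name('avgpool', model, save_output)
with torch.no_grad():
    model(torch.randn(3, 2, 16, 16))  # batch of 3 random two-channel images

print(found, tuple(captured[0].shape))
```

The hook fires during the ordinary forward pass, so the classifier head still runs; only the output of the named layer is kept as the feature.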

In [6]:
#### Extract features
from onekey_algo.custom.Manager import onekey_show
onekey_show('深度学习特征提取|提取特征')

[2025-04-22 15:34:47 - <frozen onekey_algo.custom.Manager>: 138]	INFO	播放视频功能已经设置成：Disable！


In [7]:
from functools import partial
from onekey_algo.custom.components.comp2 import feature_layer_mapping

# The model name is the parent directory of model_root, here 'densenet121'.
model_name = os.path.basename(os.path.dirname(model_root))
# Look up the feature layer registered for this model; fall back to 'avgpool'.
feature_name = feature_layer_mapping.get(f"{model_name}_2D", 'avgpool')
with open(f'features/{model_name}_features.csv', 'w') as outfile:
    hook = partial(print_feature_hook, fp=outfile)
    find_num = reg_hook_on_module(feature_name, model, hook)
    results = extract(samples, model, transformer, device, fp=outfile)

## Reading the Features

In [8]:
#### Read features
from onekey_algo.custom.Manager import onekey_show
onekey_show('深度学习特征提取|特征读取')

[2025-04-22 15:35:12 - <frozen onekey_algo.custom.Manager>: 138]	INFO	播放视频功能已经设置成：Disable！


In [9]:
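import pandas as pd
# The raw file has no header row: column 0 is the sample ID, the rest are feature values.
features = pd.read_csv(f'features/{model_name}_features.csv', header=None)
features.columns = ['ID'] + [f"DL_{i}" for i in range(features.shape[1] - 1)]
# Write the file back in place, now with a header row.
features.to_csv(f'features/{model_name}_features.csv', index=False)
features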

Unnamed: 0,ID,DL_0,DL_1,DL_2,DL_3,DL_4,DL_5,DL_6,DL_7,DL_8,...,DL_1014,DL_1015,DL_1016,DL_1017,DL_1018,DL_1019,DL_1020,DL_1021,DL_1022,DL_1023
0,1.nii.npy,0.012,-0.014,0.057,0.043,0.239,-0.256,-0.261,-0.190,0.369,...,0.032,-0.011,-0.347,-0.233,0.389,0.183,-0.033,0.275,0.362,0.094
1,1.nii_+02.npy,-0.096,0.051,0.006,0.107,-0.005,-0.371,-0.190,-0.291,0.260,...,0.117,-0.020,-0.318,-0.252,0.414,0.147,-0.096,0.115,0.253,0.158
2,1.nii_-02.npy,-0.120,0.010,0.100,0.137,0.363,-0.107,-0.116,-0.298,0.334,...,-0.055,-0.014,-0.279,-0.259,0.271,0.264,-0.002,0.272,0.354,0.286
3,10.nii.npy,-0.666,-0.300,1.162,0.772,0.813,-0.669,-0.105,-1.094,-1.627,...,0.849,-0.830,0.622,-0.114,0.595,-1.173,-1.137,0.920,-0.379,-0.891
4,100.nii.npy,-0.163,0.209,0.045,-0.060,0.496,0.543,-0.082,-0.129,0.106,...,-0.273,0.265,-0.253,0.218,-0.070,0.294,0.031,0.477,0.152,-0.118
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
621,98.nii_+02.npy,-0.586,0.400,-0.404,0.501,0.348,-0.242,-0.112,-0.145,0.068,...,-0.228,0.099,0.072,-0.286,0.467,-0.240,-0.111,0.681,0.046,-0.246
622,98.nii_-02.npy,-0.327,0.225,-0.638,0.256,-0.106,-0.147,-0.068,0.260,0.059,...,-0.448,0.238,-0.246,0.010,0.229,-0.006,0.144,0.526,0.232,-0.006
623,99.nii.npy,-0.330,0.040,0.242,0.583,0.325,-0.405,-0.141,-0.491,-0.233,...,0.258,0.058,0.280,-0.160,0.450,-0.362,-0.275,0.143,0.082,-0.257
624,99.nii_+02.npy,-0.393,-0.352,0.439,0.554,0.548,-0.185,0.125,-0.695,-0.295,...,0.408,0.087,0.304,-0.095,0.263,-0.453,-0.547,0.666,0.147,-0.343
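The reading cell above rewrites the feature file in place with a header, so running it a second time would read that header back as a data row. A possible guard, sketched with the standard `csv` module, adds the header only when it is missing. `add_header_once` is a hypothetical helper, and the two-feature file is toy data.

```python
import csv
import os
import tempfile


def add_header_once(path):
    """Prepend an `ID, DL_0, DL_1, ...` header unless one is already present."""
    with open(path, newline='', encoding='utf-8') as handle:
        rows = list(csv.reader(handle))
    if not rows or not rows[0] or rows[0][0] == 'ID':
        return False  # empty file, or the header is already there
    header = ['ID'] + [f'DL_{i}' for i in range(len(rows[0]) - 1)]
    with open(path, 'w', newline='', encoding='utf-8') as handle:
        csv.writer(handle).writerows([header] + rows)
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'toy_features.csv')
    with open(path, 'w', newline='', encoding='utf-8') as handle:
        handle.write('1.nii.npy,0.012,-0.014\n10.nii.npy,-0.666,-0.300\n')
    first = add_header_once(path)   # header added
    second = add_header_once(path)  # nothing to do
    with open(path, newline='', encoding='utf-8') as handle:
        final_rows = list(csv.reader(handle))

print(first, second, final_rows[0])
```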


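### Compressing deep features

Compress the deep learning features. Note that the target dimension must be smaller than the number of samples.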

```python
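from typing import List, Union

import pandas as pd


# Signature only; the implementation is imported from onekey_algo.custom.components.comp1.
def compress_df_feature(features: pd.DataFrame, dim: int, not_compress: Union[str, List[str]] = None,
                        prefix='') -> pd.DataFrame:
    """
    Compress deep learning features.
    Args:
        features: DataFrame of features.
        dim: Target dimension; must be smaller than the number of samples.
        not_compress: Column(s) excluded from compression.
        prefix: Prefix of all features.

    Returns:
        The compressed features as a DataFrame.
    """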
```
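The source does not show how `compress_df_feature` works internally. One common choice is PCA, and the sketch below assumes that technique purely for illustration. It also shows where a "dimension smaller than the number of samples" rule comes from: after centering, n samples span at most n - 1 directions, so no more components than that carry variance. `pca_compress` is a hypothetical helper, and the data is random.

```python
import numpy as np


def pca_compress(matrix, dim):
    """Project the rows of `matrix` onto their first `dim` principal components."""
    n_samples, n_features = matrix.shape
    if dim >= n_samples or dim > n_features:
        raise ValueError(
            f'dim={dim} must be smaller than the number of samples ({n_samples}) '
            f'and at most the number of features ({n_features})'
        )
    centered = matrix - matrix.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T


rng = np.random.default_rng(0)
toy = rng.normal(size=(20, 64))  # 20 samples, 64 features (toy data)
compressed = pca_compress(toy, dim=8)
print(compressed.shape)
```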

In [10]:
from onekey_algo.custom.components.comp1 import compress_df_feature

cm_features = compress_df_feature(features=features, dim=8, prefix='DL_', not_compress='ID')
cm_features.to_csv(f'features/{model_name}_compress_features.csv', header=True, index=False)

### Transfer learning

Use Onekey to extract features from a model trained with transfer learning.

In [11]:
#### Onekey transfer learning
from onekey_algo.custom.Manager import onekey_show
onekey_show('深度学习特征提取|Onekey迁移学习')

[2025-04-22 15:35:12 - <frozen onekey_algo.custom.Manager>: 138]	INFO	播放视频功能已经设置成：Disable！
