## Deep Learning Features

Extract deep learning features from imaging data such as CT, MRI, endoscopy, and X-ray.


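1. Convert the data to NIfTI format (`.nii` or `.nii.gz`).
2. Collect all image files in the specified directory.
3. Choose the model to extract deep learning features from. Only ResNet3d models are currently supported. (Transfer learning is an option.)
  > Only ResNet3d is supported because ResNet is currently the only architecture with pretrained models available.
4. Extract the features and save them to a feature file.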
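
Step 1 can be sanity-checked before running the rest of the notebook. The sketch below is not part of the pixelmed packages: `looks_like_nifti1` is a hypothetical helper that uses only the standard library. It relies on the NIfTI-1 layout, a 348-byte header whose first field holds the value 348 and whose last four bytes hold the magic string `n+1` or `ni1`. It does not recognise NIfTI-2 files.

```python
import gzip
import struct

NIFTI1_HEADER_SIZE = 348


def looks_like_nifti1(path):
    """Return True if the file starts with a plausible NIfTI-1 header."""
    # .nii.gz files are gzip-compressed, so read them through gzip.
    opener = gzip.open if str(path).endswith('.gz') else open
    with opener(path, 'rb') as f:
        header = f.read(NIFTI1_HEADER_SIZE)
    if len(header) < NIFTI1_HEADER_SIZE:
        return False
    # sizeof_hdr (the first 4 bytes) must equal 348 in one of the two byte orders.
    little = struct.unpack('<i', header[:4])[0]
    big = struct.unpack('>i', header[:4])[0]
    size_ok = NIFTI1_HEADER_SIZE in (little, big)
    # The magic string occupies the last 4 bytes of the header.
    magic_ok = header[344:348] in (b'n+1\x00', b'ni1\x00')
    return size_ok and magic_ok
```

Running this over the sample list before extraction would flag files that were renamed to `.nii` without actually being converted.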

### Get the NII data for feature extraction

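Two batch processing modes are provided:
1. Directory mode: extract features from every `.nii` / `.nii.gz` file in the specified directory.
2. File mode: the samples to process are listed in a text file, one path per line.

You can also specify a few files manually (the custom mode at the end of the cell).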

In [1]:
import os

# Directory mode
mydir = r'C:\Users\yxzdxzj\Desktop\function\pm_data\MR\images'
directory = os.path.expanduser(mydir)
test_samples = [os.path.join(directory, p)
                for p in os.listdir(directory) if p.endswith(('.nii', '.nii.gz'))]

# File mode
# test_file = ''
# with open(test_file) as f:
#     test_samples = [l.strip() for l in f if l.strip()]

# Custom mode
# test_samples = ['path2nii']
test_samples

['C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\0.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\1.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\10.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\100.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\101.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\102.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\103.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\104.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\105.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\106.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\107.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\108.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\\function\\pm_data\\MR\\images\\109.nii.gz',
 'C:\\Users\\yxzdxzj\\Desktop\
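
The listing above is in lexicographic order (`0`, `1`, `10`, `100`, ...), and the rows of the feature file later follow the same order. If numeric order is preferred, a helper along these lines could replace the list comprehension. `natural_key` and `collect_nii_files` are hypothetical names, not part of the pixelmed packages.

```python
import re
from pathlib import Path

NII_SUFFIXES = ('.nii', '.nii.gz')


def natural_key(path):
    """Split the file name into text and integer chunks so that 2 sorts before 10."""
    name = Path(path).name
    # re.split with a capturing group alternates text, digits, text, ...
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r'(\d+)', name)]


def collect_nii_files(directory):
    """Return all .nii / .nii.gz files in `directory`, sorted in natural order."""
    directory = Path(directory).expanduser()
    files = [str(p) for p in directory.iterdir()
             if p.is_file() and p.name.endswith(NII_SUFFIXES)]
    return sorted(files, key=natural_key)
```

With this, `test_samples = collect_nii_files(mydir)` would list `2.nii.gz` before `10.nii.gz`.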

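## Choose the features to extract

Use a keyword (the layer name) to select which layer's features to extract.

### Supported model names

Set the `model_name` variable in the code to one of the model names below.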

| **Model family** | **Model name**                                             |
| ------------ | ------------------------------------------------------------ |
| ResNet       | resnet10, resnet18, resnet34, resnet50, resnet101, resnet152, resnet200 |
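
A small guard can catch a mistyped model name before any weights are loaded. This is a sketch that assumes the table above is the complete list. `qualified_model_name` is a hypothetical helper; the `classification3d.` prefix is the one used in the cell that calls `init_from_model3d`.

```python
SUPPORTED_RESNETS = ('resnet10', 'resnet18', 'resnet34', 'resnet50',
                     'resnet101', 'resnet152', 'resnet200')


def qualified_model_name(model_name, family='classification3d'):
    """Validate the model name and return it with the family prefix."""
    if model_name not in SUPPORTED_RESNETS:
        raise ValueError(
            f'unsupported model {model_name!r}; choose one of {", ".join(SUPPORTED_RESNETS)}')
    return f'{family}.{model_name}'
```

For example, `qualified_model_name('resnet34')` returns `'classification3d.resnet34'`, while an unlisted name raises `ValueError`.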

In [4]:
from pixelmed_core.core import create_model
from pixelmed_calc.custom.components.comp2 import extract3d, init_from_model3d
# In the examples, resnet34 is a 3D model and resnet50 is a 2D model.

model_name = 'resnet34'
model_path = r'C:\Users\yxzdxzj\Desktop\function\note2-深度学习分类\resnet34\20230904\viz\BEST-training-params.pth'
# To use the built-in pretrained weights instead, set pretrained=True and remove num_classes=2.
model, transformer, device = init_from_model3d(model_name=f'classification3d.{model_name}', pretrained=model_path,
                                               in_channels=1, num_classes=2)
for n, m in model.named_modules():
    print('Feature name:', n, "|| Module:", m)

成功加载C:\Users\yxzdxzj\Desktop\function\note2-深度学习分类\resnet34\20230904\viz\BEST-training-params.pth模型参数。
成功加载C:\Users\yxzdxzj\Desktop\function\note2-深度学习分类\resnet34\20230904\viz\BEST-training-params.pth模型参数。
Feature name:  || Module: ResNet(
  (conv1): Conv3d(1, 64, kernel_size=(7, 7, 7), stride=(1, 2, 2), padding=(3, 3, 3), bias=False)
  (bn1): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool3d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): ResNetBlock(
      (conv1): Conv3d(64, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      (bn1): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv3d(64, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      (bn2): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_

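## Extract features

The name printed after `Feature name:` is the name of the feature to extract, for example `layer3.0.conv2`. Deep learning features are usually taken from the last layer, for example `avgpool`.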
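
The source does not show how `reg_hook_on_module` and `print_feature_hook` are implemented. In plain PyTorch, capturing the output of a named layer is typically done with a forward hook, roughly as in the sketch below. `TinyNet3d` is a toy stand-in for the real ResNet3d, and `extract_layer_output` is a hypothetical helper.

```python
import torch
import torch.nn as nn


class TinyNet3d(nn.Module):
    """Toy stand-in for a 3D ResNet: conv -> global average pool -> classifier."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv3d(1, 8, kernel_size=3, padding=1)
        self.avgpool = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Linear(8, 2)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = self.avgpool(x)
        return self.fc(torch.flatten(x, 1))


def extract_layer_output(model, layer_name, batch):
    """Run one forward pass and return the flattened output of `layer_name`."""
    modules = dict(model.named_modules())
    if layer_name not in modules:
        raise KeyError(f'no module named {layer_name!r}')
    captured = []

    def hook(module, inputs, output):
        # Keep the batch dimension, flatten the rest: (N, C, 1, 1, 1) -> (N, C).
        captured.append(output.detach().flatten(start_dim=1).cpu())

    handle = modules[layer_name].register_forward_hook(hook)
    try:
        model.eval()
        with torch.no_grad():
            model(batch)
    finally:
        handle.remove()  # always detach the hook, even if the forward pass fails
    return captured[0]


# Two single-channel volumes of size 8 x 16 x 16.
features = extract_layer_output(TinyNet3d(), 'avgpool', torch.randn(2, 1, 8, 16, 16))
print(features.shape)  # one 8-value feature vector per input volume
```

Hooking `avgpool` gives one fixed-length vector per volume regardless of input size, which is presumably why it is the usual choice here.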

In [5]:
import torch
from functools import partial
from pixelmed_calc.custom.components.comp2 import extract3d, print_feature_hook, reg_hook_on_module
from monai.data import ImageDataset
from torch.utils.data import DataLoader

feature_name = 'avgpool'  # module name taken from the `Feature name:` listing above
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
with open('feature.txt', 'w') as outfile:
    # Bind the output file to the hook so the hooked features are written to feature.txt
    hook = partial(print_feature_hook, fp=outfile)
    find_num = reg_hook_on_module(feature_name, model, hook)
    val_ds = ImageDataset(image_files=test_samples, transform=transformer)
    # Create a validation data loader (one volume per batch)
    val_loader = DataLoader(val_ds, batch_size=1, num_workers=0)

    results = extract3d(val_loader, test_samples, model, device, fp=outfile)

### Read the data

In [6]:
import pandas as pd
features = pd.read_csv('feature.txt', sep=',', header=None)
features.columns=['ID'] + list(features.columns[1:])
features.head()

|   | ID | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 503 | 504 | 505 | 506 | 507 | 508 | 509 | 510 | 511 | 512 |
| - | -- | - | - | - | - | - | - | - | - | - | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.nii.gz | 188.052 | 186.968 | 0.103 | 148.222 | 150.24 | 0.0 | 144.502 | 10.599 | 184.591 | ... | 9.039 | 0.0 | 140.444 | 0.0 | 130.847 | 0.0 | 107.976 | 123.329 | 63.021 | 91.879 |
| 1 | 1.nii.gz | 180.632 | 179.124 | 0.105 | 142.204 | 144.07 | 0.0 | 138.391 | 10.34 | 177.645 | ... | 8.725 | 0.0 | 134.624 | 0.0 | 125.661 | 0.0 | 103.555 | 118.138 | 60.476 | 88.167 |
| 2 | 10.nii.gz | 203.683 | 202.729 | 0.076 | 160.594 | 163.115 | 0.0 | 156.954 | 11.161 | 199.519 | ... | 9.872 | 0.0 | 152.601 | 0.0 | 142.344 | 0.0 | 117.426 | 133.885 | 68.625 | 99.974 |
| 3 | 100.nii.gz | 197.808 | 196.622 | 0.111 | 155.791 | 158.144 | 0.0 | 152.068 | 10.964 | 193.848 | ... | 9.535 | 0.0 | 147.863 | 0.0 | 137.94 | 0.0 | 113.704 | 129.701 | 66.444 | 96.836 |
| 4 | 101.nii.gz | 208.255 | 207.715 | 0.068 | 164.329 | 166.892 | 0.0 | 160.763 | 11.332 | 203.835 | ... | 10.034 | 0.0 | 156.108 | 0.0 | 145.38 | 0.0 | 120.193 | 137.179 | 70.168 | 102.188 |


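### Deep feature compression

Compress the deep learning features. Note that the target dimension must be smaller than the number of samples.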

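```python
from typing import List, Union

import pandas as pd


def compress_df_feature(features: pd.DataFrame, dim: int, not_compress: Union[str, List[str]] = None,
                        prefix='') -> pd.DataFrame:
    """
    Compress deep learning features.
    Args:
        features: DataFrame of features.
        dim: Target dimension; must be smaller than the number of samples.
        not_compress: Column(s) to leave uncompressed.
        prefix: Prefix applied to all feature names.

    Returns:
        The compressed feature DataFrame.

    """
```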
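
The source does not state which algorithm `compress_df_feature` uses. To illustrate why `dim` has to stay below the number of samples, here is a minimal sketch assuming plain PCA computed with an SVD. After centering, `n` samples span at most `n - 1` directions, so no more components than that can carry information. `pca_compress` is a hypothetical stand-in, not the library function.

```python
import numpy as np
import pandas as pd


def pca_compress(features, dim, id_col='ID', prefix='DL_'):
    """Project the numeric feature columns onto their first `dim` principal axes."""
    values = features.drop(columns=[id_col]).to_numpy(dtype=float)
    n_samples, n_features = values.shape
    max_dim = min(n_samples - 1, n_features)
    if not 0 < dim <= max_dim:
        raise ValueError(f'dim must be between 1 and {max_dim}, got {dim}')
    # Centre each feature so the SVD describes variance around the mean.
    centered = values - values.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:dim].T
    out = pd.DataFrame(projected, index=features.index,
                       columns=[f'{prefix}{i}' for i in range(dim)])
    out.insert(0, id_col, features[id_col])  # carry the ID column over unchanged
    return out
```

With the 512-column table read above, `pca_compress(features, dim=32)` would need more than 32 rows, the same constraint the docstring states for the real function.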

In [None]:
from pixelmed_calc.custom.components.comp1 import compress_df_feature

cm_features = compress_df_feature(features=features, dim=32, prefix='DL_', not_compress='ID')
cm_features.to_csv('compress_features.csv', header=True, index=False)