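# 2D Classification Task
The classification task supports two modes:
1. Folder mode: provide the directories for the `train` and `valid` datasets, plus `labels.txt`, which lists the labels to train on, one class per line.
2. List mode: provide the list files for the `train` and `valid` datasets, one sample per line. `labels.txt` is optional and lists one class per line. `data_pattern` is a common directory that is joined with the first column of the train and val files.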

### Labelme Data Annotation

Launch labelme from the command line.

#### Modify the launch options
   ```shell
  "C:\Users\yxzdxzj\Desktop\function\tools\labelme.exe" --flags label1,label2,labeln
   ```


In [None]:
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

from pixelmed_calc.scripts.core import clf_covert2rec

input_dir = r"E:\function\pm_data\skin4clf"
save_dir = r"E:\function\pm_data\skin4clf_out"
partition = [0.8, 0.2]

clf_covert2rec(input_dir, save_dir = save_dir, partition = partition)

### Supported Model Names

Set the `model_name` variable in the code to one of the model names below.

| **Model family** | **Model name**                                               |
| ------------ | ------------------------------------------------------------ |
| AlexNet      | alexnet                                                      |
| VGG          | vgg11, vgg11_bn, vgg13, vgg13_bn, vgg16, vgg16_bn, vgg19_bn, vgg19 |
| ResNet       | resnet18, resnet34, resnet50, resnet101, resnet152, resnext50_32x4d, resnext101_32x8d, wide_resnet50_2, wide_resnet101_2 |
| DenseNet     | densenet121, densenet169, densenet201, densenet161           |
| Inception    | googlenet, inception_v3                                      |
| SqueezeNet   | squeezenet1_0, squeezenet1_1                                 |
| ShuffleNetV2 | shufflenet_v2_x2_0, shufflenet_v2_x0_5, shufflenet_v2_x1_0, shufflenet_v2_x1_5 |
| MobileNet    | mobilenet_v2, mobilenet_v3_large, mobilenet_v3_small         |
| MNASNet      | mnasnet0_5, mnasnet0_75, mnasnet1_0, mnasnet1_3              |
| Transformer      | ViT, SimpleViT            |

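### List Mode

List mode typically works with the output of labelme annotation. To use List mode with your own data, preprocess the data to match this layout.

* `train.txt`: the training data list, with fields separated by `\t` (horizontal tab).
* `val.txt`: the validation data list, with fields separated by `\t` (horizontal tab).
* `labels.txt`: the set of labels, which determines how many labels the training data has.
* `data_pattern`: the common prefix of the directory where all data is stored. If `train.txt` and `val.txt` contain absolute paths, set `data_pattern` to None.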
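
As a rough illustration of the list layout described above, the sketch below reads a tab-separated list file and joins `data_pattern` with the first column. It is not the parser used by `pixelmed_calc`. The helper name `read_list_file` is hypothetical, and treating the second column as the label is an assumption, since the source only states that the first column is a path.

```python
import os
import tempfile


def read_list_file(list_path, data_pattern=None):
    """Read a tab-separated list file into (path, label) pairs.

    Assumption: column 1 is the image path and column 2 is the label.
    """
    samples = []
    with open(list_path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\r\n")
            if not line.strip():
                continue  # skip blank lines
            fields = line.split("\t")
            if len(fields) < 2:
                raise ValueError(f"line {line_no}: expected at least 2 tab-separated fields")
            path, label = fields[0], fields[1]
            # data_pattern=None means the list already holds usable paths
            if data_pattern is not None:
                path = os.path.join(data_pattern, path)
            samples.append((path, label))
    return samples


# Small demo with a temporary list file (file names are made up).
with tempfile.TemporaryDirectory() as tmp:
    demo_list = os.path.join(tmp, "train.txt")
    with open(demo_list, "w", encoding="utf-8") as f:
        f.write("a.jpg\tlabel1\nb.jpg\tlabel2\n")
    demo_samples = read_list_file(demo_list, data_pattern="images")
    print(demo_samples)
```

Passing `data_pattern=None` leaves the paths untouched, which corresponds to the absolute-path case in the list above.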

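When using a ViT model, you may see this error: `RuntimeError: The size of tensor a (49) must match the size of tensor b (64) at non-singleton dimension 1`.
It indicates a tensor size mismatch, most likely caused by an unsuitable `patch_size`. Try reducing `patch_size`, for example to 32.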
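
The numbers in the error message are both perfect squares (49 = 7 x 7, 64 = 8 x 8), which is consistent with two different patch grids. The sketch below shows the usual ViT patch-count arithmetic, assuming square images and non-overlapping square patches. The helper `num_patches` and the image sizes are hypothetical, and the input size actually used by `pixelmed_calc` is not given in the source.

```python
def num_patches(image_size, patch_size):
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


# Hypothetical image sizes, chosen only to illustrate the arithmetic.
for image_size, patch_size in [(224, 32), (512, 64), (256, 32)]:
    print(image_size, patch_size, num_patches(image_size, patch_size))
```

If the model builds its positional embedding for one grid while the input images produce another, the two tensors cannot be added, which is the kind of mismatch the error reports.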

In [1]:
import os
from pixelmed_calc.classification.run_classification import main as clf_main
from collections import namedtuple

# Set parameters
save_dir = r'E:\function\pm_data\skin4clf_out'
train_f = os.path.join(save_dir, 'train.txt')
val_f = os.path.join(save_dir, 'val.txt')
labels_f = os.path.join(save_dir, 'labels.txt')
data_pattern = os.path.join(save_dir, 'images')

params = dict(train=train_f,
              valid=val_f,
              labels_file=labels_f,
              data_pattern=data_pattern,
              j=0,
              max2use=None,
              val_max2use=None,
              batch_balance=False,
              normalize_method='imagenet',
              model_name='SimpleViT',
              vit_settings = {'patch_size': 64, 'dim': 1024, 'depth': 6, 'heads': 16, 'mlp_dim': 2048},
              gpus=[0],
              batch_size=32,
              epochs=5,
              init_lr=0.01,
              optimizer='sgd',
              retrain=None,
              model_root='.',
              add_date=False,
              iters_start=0,
              iters_verbose=1,
              save_per_epoch=False,
              pretrained=False)
# Train the model
Args = namedtuple("Args", params)
clf_main(Args(**params))

[2023-10-28 21:58:33 - ClassificationDataset.py: 637]	INFO	WE RECOMMEND YOU USE SPECIFY dataset_name LIKE list for ListDataset OR folder for FolderDataset.
[2023-10-28 21:58:33 - ClassificationDataset.py:  82]	INFO	Parsing record file E:\function\pm_data\skin4clf_out\train.txt
[2023-10-28 21:58:33 - ClassificationDataset.py:  85]	INFO		Checking file exists in E:\function\pm_data\skin4clf_out\train.txt
[2023-10-28 21:58:33 - ClassificationDataset.py: 648]	INFO	We infer your kwargs to be <class 'pixelmed_calc.datasets.ClassificationDataset.ListDataset'>.
[2023-10-28 21:58:33 - ClassificationDataset.py: 637]	INFO	WE RECOMMEND YOU USE SPECIFY dataset_name LIKE list for ListDataset OR folder for FolderDataset.
[2023-10-28 21:58:33 - ClassificationDataset.py:  82]	INFO	Parsing record file E:\function\pm_data\skin4clf_out\val.txt
[2023-10-28 21:58:33 - ClassificationDataset.py:  85]	INFO		Checking file exists in E:\function\pm_data\skin4clf_out\val.txt
[2023-10-28 21:58:33 - ClassificationDat

RuntimeError: The size of tensor a (49) must match the size of tensor b (64) at non-singleton dimension 1

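### Folder Mode

Folder mode typically works with data labeled by manually dragging files into folders.

* `train_dir`: the folder that holds the training data.
* `val_dir`: the folder that holds the validation data.
* `labels_file`: the set of labels, which determines how many labels the training data has.
    > Note: `train_dir` and `val_dir` must each contain a number of subfolders matching the number of labels.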
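
Before training, it can help to check the folder layout against `labels.txt`. The sketch below is one way to do that, not part of `pixelmed_calc`. The helper `missing_label_dirs` is hypothetical, and it assumes each subfolder is named after a label, which the source does not state explicitly.

```python
import os
import tempfile


def missing_label_dirs(data_dir, labels_file):
    """Return the labels that have no matching subfolder in data_dir.

    Assumption: each class lives in a subfolder named after its label.
    """
    with open(labels_file, encoding="utf-8") as f:
        labels = [line.strip() for line in f if line.strip()]
    return [label for label in labels
            if not os.path.isdir(os.path.join(data_dir, label))]


# Small demo: two labels, but only one subfolder exists.
with tempfile.TemporaryDirectory() as tmp:
    demo_labels = os.path.join(tmp, "labels.txt")
    with open(demo_labels, "w", encoding="utf-8") as f:
        f.write("label1\nlabel2\n")
    demo_train = os.path.join(tmp, "train")
    os.makedirs(os.path.join(demo_train, "label1"))
    demo_missing = missing_label_dirs(demo_train, demo_labels)
    print(demo_missing)
```

An empty result for both `train_dir` and `val_dir` means every label has a subfolder in each directory.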

In [None]:
import os
from pixelmed_calc.classification.run_classification import main as clf_main
from collections import namedtuple

# Set parameters
root_dir = r"E:\function\pm_data\skin4seg_out"

train_dir = os.path.join(root_dir, 'train')
val_dir = os.path.join(root_dir, 'val')
labels_file = os.path.join(root_dir, 'labels.txt')
params = dict(train=train_dir,
              valid=val_dir,
              labels_file=labels_file,
              data_pattern=None,
              j=0,
              max2use=None,
              val_max2use=None,
              batch_balance=False,
              normalize_method='imagenet',
              model_name='resnet50',
              gpus=[0],
              batch_size=8,
              epochs=5,
              init_lr=0.1,
              optimizer='sgd',
              retrain=None,
              model_root='.',
              add_date=False,
              iters_start=0,
              iters_verbose=1,
              save_per_epoch=False,
              pretrained=True)
# Train the model
Args = namedtuple("Args", params)
clf_main(Args(**params))