Reorganize the KolektorSDD dataset into the MVTecAD dataset format, and report the results of SOTA anomaly detection models on KolektorSDD.
- 1. Purpose
- 2. Usage
- 3. Illustrations of KolektorSDD Preprocessing
- 4. Experimental Results
- 5. Sample Visualizations
- 6. Change Log
- 7. License
This repository contains code I wrote during an internship. The work applies several SOTA industrial anomaly detection models, mainly taken from the MVTecAD leaderboard, to the KolektorSDD dataset. Due to various restrictions, I do NOT upload the modified training/testing code for those models; only the KolektorSDD preprocessing code and the results I reproduced are published. The preprocessing code converts KolektorSDD into the MVTecAD dataset format, so the open-source SOTA code or Anomalib can be applied with their MVTecAD configurations after only slight modifications. I believe a full reproduction should not be difficult, and I report my reproduced results for comparison.
- Download the KolektorSDD dataset with fine annotations from https://www.vicos.si/resources/kolektorsdd/
  - Please cite the dataset according to its authors' requirements
- Unzip the KolektorSDD file
- Git clone this repo
- Create a new conda virtual environment with anomalib_env.yaml:
    conda env create -f anomalib_env.yaml
- Modify the paths in KolektorSDD_Preprocess.py:
    # Args that you need to change:
    # @ read_base : Path to the downloaded KolektorSDD dataset
    # @ save_base : Path to the directory where the reorganized dataset is saved
    read_base = r'.\KolektorSDD'
    save_base = r'.\KolektorSDD1'
- Run the script:
    python KolektorSDD_Preprocess.py
- The final directory tree of the reorganized KolektorSDD dataset should look like this:
    save_base
    └── metal
        ├── ground_truth
        |   └── defect
        |       ├── 000_mask.png
        |       ├── 001_mask.png
        |       ├── ...
        ├── test
        |   ├── defect
        |   |   ├── 000.png
        |   |   ├── 001.png
        |   |   ├── ...
        |   └── good
        |       ├── 000.png
        |       ├── 001.png
        |       ├── ...
        └── train
            └── good
                ├── 000.png
                ├── 001.png
                ├── ...
- Modify the code configurations and run your training and testing scripts
  - See Section 4 for the open-source code of the SOTA models used in my experiments
  - With the preprocessed KolektorSDD dataset, only slight modifications to the SOTA code or Anomalib, using their MVTecAD configurations, should be needed to reproduce the results.
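Before pointing a training script at the output, it can help to confirm that the reorganized folder really has the layout shown in the directory tree above. The following is a minimal sketch, not part of this repository: the helper names `check_layout` and `masks_match` are hypothetical, and the only assumptions are the folder names and the `NNN.png` / `NNN_mask.png` naming taken from the tree.

```python
from pathlib import Path

# Sub-directories expected under <save_base>/metal, taken from the directory tree.
EXPECTED_DIRS = [
    "ground_truth/defect",
    "test/defect",
    "test/good",
    "train/good",
]


def check_layout(save_base, category="metal"):
    """Return a dict mapping each expected sub-directory to its PNG count.

    Raises FileNotFoundError if one of the sub-directories is missing.
    (Hypothetical helper, not part of KolektorSDD_Preprocess.py.)
    """
    root = Path(save_base) / category
    counts = {}
    for rel in EXPECTED_DIRS:
        folder = root / rel
        if not folder.is_dir():
            raise FileNotFoundError(f"missing directory: {folder}")
        counts[rel] = len(list(folder.glob("*.png")))
    return counts


def masks_match(save_base, category="metal"):
    """Check that every test/defect image NNN.png has a matching NNN_mask.png."""
    root = Path(save_base) / category
    images = {p.stem for p in (root / "test" / "defect").glob("*.png")}
    masks = {
        p.stem[: -len("_mask")]
        for p in (root / "ground_truth" / "defect").glob("*_mask.png")
    }
    return images == masks
```

Calling `check_layout(save_base)` with the same `save_base` as in the preprocessing script returns the number of PNG files per folder, which can be compared with the split sizes reported in Section 3.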
To use SOTA code or Anomalib with their MVTecAD configurations, the KolektorSDD dataset is reorganized into a layout close to the MVTecAD dataset format.
Steps:
- Convert JPG/BMP to PNG
- Resize to a uniform size (500x1240)
- Train-test split
  - Train
    - Flawless samples for training: 295
  - Test
    - Flawless samples for testing: 52
    - Anomalies for testing: 52
ATTENTION:
- I resize all images directly to the same size (500x1240), which may distort the masks of some small defects. If time permits, resizing could be replaced by cropping a square region around the defective area.
- Because of the small number of samples, I split the data directly into training and test sets and do NOT reserve a validation set. The code fixes the number of flawless test samples to equal the number of anomalies. Modify the code if you need a different split.
- The seed for random.shuffle() is NOT fixed, so each run of the preprocessing script assigns different samples to each split.
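If a reproducible split is needed, one option is to shuffle with a seeded random generator. The sketch below illustrates the idea under stated assumptions and does not reproduce the logic of KolektorSDD_Preprocess.py: the function `split_samples`, its `seed` parameter, and the integer sample IDs are hypothetical. It follows the split described above, where all anomalies go to the test set together with an equal number of flawless samples.

```python
import random


def split_samples(good_ids, defect_ids, seed=0):
    """Split sample IDs into train/test lists in a reproducible way.

    All defective samples go to the test set, an equal number of flawless
    samples is drawn for testing, and the remaining flawless samples form
    the training set. (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)   # private generator; global random state is untouched
    good = sorted(good_ids)     # sort first so the input order does not matter
    defects = sorted(defect_ids)
    rng.shuffle(good)
    n_test_good = len(defects)
    return {
        "train_good": good[n_test_good:],
        "test_good": good[:n_test_good],
        "test_defect": defects,
    }
```

With 347 flawless and 52 defective IDs, for example `split_samples(range(347), range(52))`, this yields 295 training samples and 52 + 52 test samples, and the same seed always gives the same assignment.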
SOTA models chosen for training/testing (also serving as acknowledgements):
- PatchCore (backbone: wide_resnet50_2)
- FastFlow (backbone: resnet18 / cait_m48_448)
- CFA (backbone: wrn50_2)
- Cflow-AD (backbone: wide_resnet50_2)
| Version | Method | Backbone | Avg DET AUC (image ROCAUC) | Avg SEG AUC (pixel ROCAUC) | Pixel PROAUC |
| --- | --- | --- | --- | --- | --- |
| Official Code | PatchCore | wide_resnet50_2 | 0.909 | 0.941 | / |
| Official Code | CFA | wrn50_2 | 0.939 | 0.939 | 0.823 |
| Official Code | Cflow-AD | wide_resnet50_2 | 0.801 | 0.891 | 0.497 |
| Unofficial Code | FastFlow | cait_m48_448 | 0.955 | 0.960 | / |
| Anomalib | PatchCore | wide_resnet50_2 | 0.863 | 0.840 | / |
| Anomalib | FastFlow | resnet18 | 0.807 | 0.883 | / |
| Anomalib | FastFlow | cait_m48_448 | 0.914 | 0.963 | / |
| Anomalib | Cflow-AD | wide_resnet50_2 | 0.850 | 0.847 | / |
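For readers unfamiliar with the ROCAUC columns, the sketch below shows one standard way to define the metric: the probability that a randomly chosen anomalous sample receives a higher anomaly score than a randomly chosen normal one, with ties counted as one half. It is an illustration only and not the evaluation code behind the numbers above; the function name `roc_auc` and the example scores are hypothetical. The image-level value uses one score per image, and a pixel-level value can be obtained the same way from flattened per-pixel scores and mask labels.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of anomalous (1) and normal (0) scores.

    Quadratic in the number of samples, so it is meant for small
    illustrations rather than full pixel-level evaluation.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # anomalous sample ranked above normal one
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical scores: three of the four anomalous/normal pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```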
- PatchCore (wide_resnet50_2)
- FastFlow (cait_m48_448)
- Cflow-AD (wide_resnet50_2)
- CFA (wrn50_2)
- [2022/12/29] Provide conda virtual environment dependency list in anomalib_env.yaml.
- [2022/12/28] Create repository and release preprocessing script.