This project compares existing UNet-like models on different open-source datasets. My implementation is mainly based on PyTorch. This repo is a work in progress, and I will keep updating it as new models appear in the future.
Furthermore, for more details on the UNet structure, please visit my blog.
WIP
- UNet : U-Net: Convolutional Networks for Biomedical Image Segmentation
- UNet++ : UNet++: A Nested U-Net Architecture for Medical Image Segmentation
- Att_UNet : Attention U-Net: Learning Where to Look for the Pancreas
- ResUNet : Deep Residual Learning for Image Recognition
- RexUNet : Aggregated Residual Transformations for Deep Neural Networks
- Adversarial Learning : Adversarial Learning for Semi-Supervised Semantic Segmentation
Datasets of this project:
- Retina Vessel: Link, keyword: jti3
This dataset consists of 60 retina images drawn from two subsets, DRIVE and STARE, with 40 and 20 images respectively. Please find more details about these two datasets on their official websites, or directly download my preprocessed version from the link above.
During data preprocessing, for DRIVE, we first center-crop the original 565x584 images to 528x576, then split each image into 132 small patches (without overlap) of size 48x48.
For STARE, the same pipeline is used: 700x605 is cropped to 672x576 and split into 168 patches.
You can see the data preprocessing pipeline step by step, and how the training and testing sets are split, in the following scripts:
- utils/
|--- RV_data_preprocess.py
|--- gen_txt_RV.py
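For reference, here is a minimal sketch of the crop-and-patch step described above (the actual pipeline lives in utils/RV_data_preprocess.py; the image file name below is only illustrative):

```python
import numpy as np
from PIL import Image

def center_crop(img: np.ndarray, target_h: int, target_w: int) -> np.ndarray:
    """Center-crop an (H, W[, C]) array to (target_h, target_w)."""
    h, w = img.shape[:2]
    top, left = (h - target_h) // 2, (w - target_w) // 2
    return img[top:top + target_h, left:left + target_w]

def to_patches(img: np.ndarray, patch: int = 48) -> list:
    """Split an image into non-overlapping patch x patch tiles."""
    h, w = img.shape[:2]
    return [img[i:i + patch, j:j + patch]
            for i in range(0, h - patch + 1, patch)
            for j in range(0, w - patch + 1, patch)]

# DRIVE: 565x584 -> center crop to 528x576 -> 11 x 12 = 132 patches of 48x48
img = np.array(Image.open("21_training.tif"))   # hypothetical DRIVE image name
cropped = center_crop(img, target_h=576, target_w=528)
patches = to_patches(cropped)                   # len(patches) == 132
```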
If you are interested in this project, you can also compare the performance of different models on the following datasets:
- Stanford Background Dataset
- Sift Flow Dataset
- Barcelona Dataset
- Microsoft COCO dataset
- MSRC Dataset
- LITS Liver Tumor Segmentation Dataset
- KITTI
- Pascal Context
- Data from Games dataset
- Human parsing dataset
- Mapillary Vistas Dataset
- Microsoft AirSim
- MIT Scene Parsing Benchmark
- COCO 2017 Stuff Segmentation Challenge
- ADE20K Dataset
- Daimler dataset
- ISBI Challenge: Segmentation of neuronal structures in EM stacks
- INRIA Annotations for Graz-02 (IG02)
- Pratheepan Dataset
- Clothing Co-Parsing (CCP) Dataset
- ApolloScape
- UrbanMapper3D
- RoadDetector
- Cityscapes
- CamVid
- Inria Aerial Image Labeling
In this project, we use k-fold cross-validation (CV) to train the models and estimate their performance.
# k-fold CV Algorithm
Step 1: split the whole dataset into K equal shares
Step 2: for i in K:
            take the #i share as the test set
            for j in K:
                if j != i:
                    add the #j share to the train set
            train, validate, test
Step 3: average the K test results as the final score
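A minimal, self-contained sketch of this procedure using scikit-learn's KFold (the actual split in this repo is generated by utils/gen_txt_RV.py; train_and_evaluate below is only a placeholder for the real per-fold training and DSC evaluation):

```python
import numpy as np
from sklearn.model_selection import KFold

def train_and_evaluate(train_idx, test_idx):
    """Placeholder: train on train_idx, return the test DSC for this fold."""
    return np.random.rand()

samples = np.arange(100)                          # indices of the whole dataset (dummy)
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

# each fold: the #i share is the test set, the remaining K-1 shares form the train set
scores = [train_and_evaluate(tr, te) for tr, te in kfold.split(samples)]
print(f"DSC: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")   # average of the K test results
```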
The learning rate adjustment strategy is as follows:
If the validation loss does not decrease within 5 epochs, the learning rate is divided by 2. This is implemented with PyTorch's ReduceLROnPlateau.
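In PyTorch this corresponds roughly to the snippet below (the model, optimizer, and learning rate are dummies; the exact values used in train_RV.py may differ):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 1, kernel_size=1)                      # stand-in for the real UNet
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # dummy optimizer / lr
# halve the lr once the validation loss has not improved for 5 epochs
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=5)

for epoch in range(30):
    val_loss = 0.5                                          # placeholder for the real validation loss
    scheduler.step(val_loss)
    print(epoch, optimizer.param_groups[0]["lr"])
```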
For the evaluation metric, we use the Dice Coefficient (DSC, %) to assess model performance. The definition of DSC is as below:
DSC(X, Y) = 2|X ∩ Y| / (|X| + |Y|), where X is the predicted mask and Y is the ground-truth mask.
To clarify, there is a slight difference between the DSC used in the testing part and the Soft Dice Coefficient Loss used in the training part:
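For illustration, here is a common way to write both quantities in PyTorch (a generic sketch, not necessarily the exact code of this repo): the soft Dice loss is computed on the raw predicted probabilities so it stays differentiable, while the test-time DSC is computed after binarizing the predictions with a threshold.

```python
import torch

def soft_dice_loss(probs: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Training loss: 1 - soft Dice, computed on raw probabilities (no thresholding)."""
    probs, target = probs.flatten(1), target.flatten(1)
    inter = (probs * target).sum(dim=1)
    dice = (2 * inter + eps) / (probs.sum(dim=1) + target.sum(dim=1) + eps)
    return 1 - dice.mean()

def dice_coefficient(probs: torch.Tensor, target: torch.Tensor, thr: float = 0.5) -> torch.Tensor:
    """Test-time DSC: predictions are first binarized with a threshold."""
    pred = (probs > thr).float().flatten(1)
    target = target.flatten(1)
    inter = (pred * target).sum(dim=1)
    return ((2 * inter + 1e-6) / (pred.sum(dim=1) + target.sum(dim=1) + 1e-6)).mean()
```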
RV (5-fold CV + lr scheduler, bs=128) | Retina Vessel | Inria Aerial | Nodule Xray |
---|---|---|---|
UNet (S) | 0.409 ± 0.033 | - | - |
Att_UNet (S) | 0.402 ± 0.020 | - | - |
UNet++ (S) | 0.378 ± 0.030 | - | - |
ResUNet-50 (S) | 0.379 ± 0.021 | - | - |
ResUNet-101 (S) | 0.379 ± 0.018 | - | - |
ResUNet-101 (P) | 0.281 ± 0.012 | - | - |
RexUNet-101 (P) | 0.280 ± 0.013 | - | - |
Adv-RexUNet-101 (P) | - | - | - |
PS: Att stands for Attention Gate, Adv for adversarial learning, ± for standard deviation, S for trained from scratch, and P for pretrained on ImageNet.
RV (5-fold CV + lr scheduler, bs=64) | Retina Vessel | Inria Aerial | Nodule Xray |
---|---|---|---|
UNet (S) | 0.358 ± 0.036 | - | - |
Att_UNet (S) | 0.349 ± 0.020 | - | - |
UNet++ (S) | 0.339 ± 0.029 | - | - |
ResUNet-50 (S) | 0.275 ± 0.015 | - | - |
ResUNet-101 (S) | 0.268 ± 0.015 | - | - |
ResUNet-101 (P) | 0.264 ± 0.011 | - | - |
RexUNet-101 (P) | 0.269 ± 0.014 | - | - |
Adv-RexUNet-101 (P) | - | - | - |
To install all needed dependencies, please run:
pip3 install -r requirements.txt
Please also install the NVIDIA apex module to speed up training and save GPU memory.
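For reference, a minimal sketch of how apex mixed-precision training is typically wired in (whether train_RV.py enables amp exactly like this is not shown here; the model, optimizer, and loss below are dummies):

```python
import torch
import torch.nn as nn
from apex import amp   # https://github.com/NVIDIA/apex

model = nn.Conv2d(3, 1, kernel_size=1).cuda()               # stand-in for the real UNet
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # dummy optimizer

# "O1" enables mixed precision, which is where apex saves memory and speeds up training
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

x = torch.randn(2, 3, 48, 48).cuda()
loss = model(x).mean()                                       # dummy loss
with amp.scale_loss(loss, optimizer) as scaled_loss:         # scaled backward pass
    scaled_loss.backward()
optimizer.step()
```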
Please download the data from the link above and put them in the database folder to construct the following folder structure:
- database/
|--- Retina_Vessel/
| |--- before_organized/
| | |--- STARE/
| | | |--- stare-images.tar
| | | |--- labels-vk.tar
| | | |--- labels-ah.tar
| | |
| | |--- DRIVE/
| | | |--- datasets.zip
| |
| |--- organized/
| | |--- 48x48/
| | | |--- whole/
| | | | |--- raw/
| | | | |--- mask/
| | | |
| | | |--- patch/
| | | | |--- raw/
| | | | |--- mask/
Please also download the pretrained model of the ResUNext101 encoder from my share (password: wp2n), then put it into the folder <./models>.
To train a model, run for example:

python3 train_RV.py "UNet" False

Available models: UNet, Att_UNet, UNet_PP, ResUNet, ResUNext

To test a trained model:

python3 test_RV.py "UNet" False
positional arguments:
  arch        model architecture: UNet | Att_UNet | UNet_PP | ResUNet50 | ResUNet101 | ResUNext101
  pretrained  if pretrained on ImageNet: True | False
WIP
The license is MIT.