
An Evaluation of Self-Supervised Pre-Training for Skin-Lesion Analysis

Hello! Here you will find the code to reproduce the results of the paper "An Evaluation of Self-Supervised Pre-Training for Skin-Lesion Analysis", accepted at ISICW @ ECCV 2022 (arXiv: https://arxiv.org/abs/2106.09229).

Evaluated Pipelines

We evaluate four training pipelines, detailed in our paper: the supervised ImageNet baseline (SUP -> FT), self-supervised pre-training followed by fine-tuning (SSL -> FT), self-supervised pre-training followed by an additional unsupervised contrastive pre-training (SSL -> UCL -> FT), and self-supervised pre-training followed by an additional supervised contrastive pre-training (SSL -> SCL -> FT).

Datasets

To download all the data needed to reproduce our work, please use the links below. In all experiments, we used subsets of the ISIC 2019 challenge training data for training, validation, and testing. More details about each dataset can be found in our paper.


Preparing Environment and Data

We used nvidia-docker for all experiments. The Dockerfile describing our development environment is available in the repository.
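As a reference, the image can be built and run roughly as follows (the image tag and mount path are placeholders, not names used by the repository):

  # build the image from the provided Dockerfile (the tag is arbitrary)
  docker build -t ssl-skin-lesions .
  # run with GPU support; with the older nvidia-docker wrapper, use `nvidia-docker run` instead
  docker run --gpus all -it -v /path/to/data:/data ssl-skin-lesions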

Now, download all the data linked in the previous section and adjust both the data folder and the label paths.

For ISIC 2019, we recommend setting the proper data path in finetuning_ssl.py, isic_contrastive_finetuning.py, and main_isic_supcon.py.

We use all other datasets only at the test stage. Set the correct image and label paths for each dataset here.
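Purely as an illustration, the downloaded datasets could be laid out as below; these folder names are placeholders mirroring the --dataset options used later, and the scripts accept whatever paths you configure:

  # placeholder layout; point the scripts to wherever you actually store the data
  mkdir -p data/{isic2019,isic20,atlas-dermato,atlas-clinical,pad-ufes-20}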


Self-supervised Checkpoints

For the fine-tuning experiments with self-supervised pre-trained models, we used each model's weights to initialize a ResNet-50 encoder. Below you will find the checkpoints to download for each model.

| Model | Checkpoint link | Notes |
| --- | --- | --- |
| SimCLR | https://github.com/google-research/simclr#pre-trained-models-for-simclrv1 | Weights converted from TensorFlow to PyTorch |
| SwAV | https://github.com/facebookresearch/swav#model-zoo | - |
| BYOL | https://github.com/deepmind/deepmind-research/tree/master/byol#pretraining | Weights converted from JAX to PyTorch |
| MoCo | https://github.com/facebookresearch/moco#models | MoCo v2 checkpoint trained for 800 epochs |
| InfoMIN | https://github.com/HobbitLong/PyContrast/blob/master/pycontrast/docs/MODEL_ZOO.md | - |

Once you have downloaded all the weights, set the correct path for each method here.
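For instance, you could keep one folder per method and point each path to the corresponding file; the layout and the MoCo filename below are only illustrative, not names required by the code:

  mkdir -p checkpoints/{simclr,swav,byol,moco,infomin}
  # e.g., after downloading the 800-epoch MoCo v2 checkpoint:
  mv moco_v2_800ep_pretrain.pth.tar checkpoints/moco/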


Running

Fine-tuning Only

To run the fine-tuning experiments for both the self-supervised and the supervised models, use finetuning_ssl.py.

Essentially, to run a standard fine-tuning procedure you only need to specify the method (--method parameter) among the available options {simclr, byol, swav, moco, infomin, baseline} and the folder containing the train and validation splits for the ISIC2019 dataset. The splits used in our paper are in the datasplits folder. The baseline option stands for supervised pre-training on ImageNet. A minimal example of how to run our code:

  python3 finetuning_ssl.py --method simclr --lr <lr> --batch_size <batch_size> --splits_folder /ssl-skin-lesions/datasplits/isic2019/splits/

To check all the available parameters, please take a look here.

Pre-training and Fine-tuning

In this pipeline, we perform an additional contrastive pre-training step, which can be either the supervised or the self-supervised version, before fine-tuning. We give more details on how to execute the contrastive pre-training in the SupContrast folder.

When the pre-training finishes, execute isic_contrastive_finetuning.py, passing the pre-trained model checkpoint through the --ckpt_path parameter. This file is based on finetuning_ssl.py with minor changes: we removed the --method parameter and fixed the SimCLR data augmentations during fine-tuning. Except for --method, all the remaining parameters are the same as explained in the Fine-tuning Only section.
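For instance (the checkpoint path below is only a placeholder):

  python3 isic_contrastive_finetuning.py --ckpt_path /path/to/contrastive_checkpoint.pth --lr <lr> --batch_size <batch_size> --splits_folder /ssl-skin-lesions/datasplits/isic2019/splits/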


Testing the models

You can use the file test_external_datasets.py to run the test step with a trained model. For example,

  python3 test_external_datasets.py --dataset ds --ckpt_path path

or

  python3 test_external_datasets.py --dataset ds --ckpt_path path --fromcl

if the evaluated checkpoint went through a contrastive pre-training, either supervised or self-supervised.

We use test-time augmentation and evaluate the AUC over 50 augmented copies of each image. The datasets available for the --dataset parameter are {atlas-dermato, atlas-clinical, isic20, pad-ufes-20}. As we evaluated 5 distinct test datasets, we created the bash script run_test_external.sh to ease the whole setup.
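A minimal sketch of that loop, assuming a placeholder checkpoint path (add --fromcl when the checkpoint comes from contrastive pre-training):

  for ds in atlas-dermato atlas-clinical isic20 pad-ufes-20; do
      python3 test_external_datasets.py --dataset "$ds" --ckpt_path /path/to/checkpoint.ckpt
  done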


Top-5 Best Experiments

As mentioned in our paper, we train the top-5 best models under both the full- and low-data regimes. Below, we describe the parameters of the top-5 best models for each evaluated pipeline.

Hyperoptimized supervised Baseline (SUP -> FT)

We also provide the script run_supervied_hypersearch_finetuning.sh to run the hyperparameter search for the supervised baseline, as mentioned in our paper. We describe the top-5 best hyperparameter combinations in the table below.

| Learning Rate (LR) | LR Scheduler | Batch Size | Balanced Batches? |
| --- | --- | --- | --- |
| 0.009 | plateau | 128 | Yes |
| 0.002 | plateau | 32 | Yes |
| 0.005 | plateau | 128 | Yes |
| 0.003 | plateau | 128 | Yes |
| 0.0001 | cosine | 32 | Yes |
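For reference, the best combination above would be launched roughly as follows; only the flags already shown in this README are assumed, and the scheduler and batch-balancing options are set through the script's remaining parameters:

  python3 finetuning_ssl.py --method baseline --lr 0.009 --batch_size 128 --splits_folder /ssl-skin-lesions/datasplits/isic2019/splits/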

Self-supervised (SSL -> FT)

| Method | Learning Rate (LR) |
| --- | --- |
| SimCLR | 0.01 |
| SwAV | 0.01 |
| BYOL | 0.01 |
| BYOL | 0.001 |
| InfoMIN | 0.001 |

Self-Supervised Pre-training (SSL -> UCL -> FT)

| Temperature | Pre-training batch size | Pre-training epochs | Balanced Batches? |
| --- | --- | --- | --- |
| 0.1 | 80 | 50 | No |
| 0.1 | 512 | 200 | Yes |
| 0.5 | 512 | 200 | No |
| 0.1 | 80 | 50 | Yes |
| 1.0 | 512 | 200 | No |

Supervised Pre-training (SSL -> SCL -> FT)

| Temperature | Pre-training batch size | Pre-training epochs | Balanced Batches? |
| --- | --- | --- | --- |
| 1.0 | 80 | 50 | Yes |
| 1.0 | 80 | 200 | Yes |
| 0.5 | 80 | 200 | Yes |
| 0.5 | 80 | 200 | No |
| 0.5 | 80 | 50 | No |

Acknowledgments

  • L. Chaves is partially funded by QuintoAndar, and CAPES.
  • A. Bissoto is partially funded by FAPESP 2019/19619-7.
  • E. Valle is funded by CNPq 315168/2020-0.
  • S. Avila is partially funded by CNPq PQ-2 315231/2020-3, and FAPESP 2013/08293-7.
  • A. Bissoto and S. Avila are also partially funded by Google LARA 2020.
  • The RECOD lab is funded by grants from FAPESP, CAPES, and CNPq.
