This repository contains the code for the experiments from *Simple Control Baselines for Evaluating Transfer Learning*.
- Abstract
- Datasets and Downstream Tasks
- Download all pretrained models
- Transfer a model
- Visualization
- Citing
Transfer learning has witnessed remarkable progress in recent years, for example, with the introduction of augmentation-based contrastive self-supervised learning methods. While a number of large-scale empirical studies on the transfer performance of such models have been conducted, there is not yet an agreed-upon set of control baselines, evaluation practices, and metrics to report, which often hinders a nuanced and calibrated understanding of the real efficacy of the methods. We share an evaluation standard that aims to quantify and communicate transfer learning performance in an informative and accessible setup. This is done by baking a number of simple yet critical control baselines in the evaluation method, particularly the blind-guess (quantifying the dataset bias), scratch-model (quantifying the architectural contribution), and maximal-supervision (quantifying the upper-bound). To demonstrate how the proposed evaluation standard can be employed, we provide an example empirical study investigating a few basic questions about self-supervised learning. For example, using this standard, the study shows the effectiveness of existing self-supervised pre-training methods is skewed towards image classification tasks versus dense pixel-wise predictions.
Fig: summary of the evaluation standard
- Classification: ImageNet, CIFAR-100, and EuroSAT.
- Pixel-wise regression: Taskonomy for depth and surface normals estimation.
All pre-trained encoders share the same ResNet-50 architecture. For transfers to downstream tasks, we use two types of decoders. For classification tasks, we use a single fully-connected layer that takes the output of the encoder's final layer and outputs the logits for each class. For pixel-wise regression tasks, we use a UNet-style decoder with six upsampling blocks and skip-connections from the encoder layers of the same spatial resolution.
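As a minimal sketch of the classification head described above (not the repository's actual implementation), a single fully-connected layer maps pooled encoder features to class logits. The shapes are illustrative assumptions: 2048-dimensional pooled ResNet-50 features and 100 classes, as for CIFAR-100.

```python
import numpy as np

def linear_classifier_head(features, weight, bias):
    """Single fully-connected layer: maps encoder features to class logits."""
    return features @ weight.T + bias

rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 2048))        # batch of 4 pooled encoder outputs
W = rng.standard_normal((100, 2048)) * 0.01   # one row of weights per class
b = np.zeros(100)

logits = linear_classifier_head(feats, W, b)
print(logits.shape)  # (4, 100): one logit per class for each image
```

The UNet-style regression decoder is structurally richer (six upsampling blocks with skip-connections), but the classifier head really is this small, which is what makes linear evaluation cheap.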
- A full list of the pre-trained models we used is in the table below:
| Method | URL |
| --- | --- |
| *Contrastive Self-Supervised Learning* | |
| SwAV | model |
| MoCov2 | model |
| SimCLR | model |
| SimSiam | model |
| Barlow Twins | model |
| PIRL | model |
| *Non-Contrastive Pretext Tasks* | |
| Colorization | model |
| Jigsaw | model |
If you haven't already, download the pretrained models and the datasets needed.
```
.
├── data/                     # Dataset modules
├── downstream_tasks/         # Downstream task modules
├── imagenet_cls/             # Code for ImageNet classification
├── link_modules/             # Link modules connecting encoder and decoder
├── models/                   # Network definitions
├── representation_tasks/     # Feature extractor modules
├── utils/                    # Utility functions
├── train_ssl2taskonomy.py    # Main training code for Taskonomy
├── run_cls.sh                # Training script for CIFAR100/EuroSAT
└── run_taskonomy.sh          # Training script for Taskonomy
```
Note that we provide three shell scripts for training on the different tasks.
- Transfer self-supervised methods to the Taskonomy dataset:

```shell
bash run_taskonomy.sh
```

Please change `--ssl_name` and `--taskonomy_domain` accordingly. The model names and Taskonomy domains we support are detailed in `train_ssl2taskonomy.py`.
- Transfer self-supervised methods to ImageNet classification:

```shell
cd imagenet_cls
bash train_cls.sh
```

Please change `--model_name` accordingly. The model names we support are detailed in `imagenet_ssl_epoch.py`.
- Transfer self-supervised methods to the CIFAR100/EuroSAT datasets:

```shell
bash run_cls.sh
```

Please change `--ssl_name` and `--taskonomy_domain` accordingly. The model names and taskonomy domains we support are detailed in `train_ssl2taskonomy.py`.
To freeze the feature extractor and update only the decoder or classifier parameters, enable `--freeze_representation` in the training scripts.
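A rough sketch of what such a flag typically does, shown with stand-in objects rather than the repository's actual PyTorch code (parameter names are illustrative): encoder parameters are marked non-trainable, so only decoder/classifier parameters remain for the optimizer to update.

```python
class Param:
    """Stand-in for a framework parameter with a trainability flag."""
    def __init__(self, name, requires_grad=True):
        self.name = name
        self.requires_grad = requires_grad

params = [
    Param("encoder.layer1.weight"),
    Param("encoder.layer4.weight"),
    Param("decoder.up1.weight"),
    Param("classifier.weight"),
]

# Freeze the representation: every parameter belonging to the encoder
# is excluded from gradient updates.
for p in params:
    if p.name.startswith("encoder."):
        p.requires_grad = False

trainable = [p.name for p in params if p.requires_grad]
print(trainable)  # only decoder/classifier parameters remain trainable
```

In PyTorch this is idiomatically done by setting `requires_grad = False` on encoder parameters and passing only still-trainable parameters to the optimizer.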
The losses and visualizations are logged in Weights & Biases.
You can find a plotting example in the `Vizualization-Example.ipynb` notebook. It assumes a `.csv` file with the following columns: `domain`, `method`, `train_size`, `test_loss`, `seed`. An example of the results file from our evaluation is `./assets/ssl-results.csv`.
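A small sketch of how a results file in this format can be aggregated, e.g. averaging `test_loss` over seeds for each (domain, method, train_size) cell. The rows below are made-up illustrations of the column format, not values from `./assets/ssl-results.csv`.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows in the expected column format (values are invented).
sample = """domain,method,train_size,test_loss,seed
cifar100,swav,1000,1.25,0
cifar100,swav,1000,1.31,1
cifar100,scratch,1000,2.10,0
"""

# Collect test losses per (domain, method, train_size) across seeds.
losses = defaultdict(list)
for row in csv.DictReader(io.StringIO(sample)):
    key = (row["domain"], row["method"], int(row["train_size"]))
    losses[key].append(float(row["test_loss"]))

# Mean test loss per cell, which is what transfer curves are plotted from.
means = {k: sum(v) / len(v) for k, v in losses.items()}
print(means[("cifar100", "swav", 1000)])  # mean over the two swav seeds
```

The notebook itself does the plotting; the point here is only the expected schema of the `.csv` file.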
If you find the paper or the code useful, please cite:
```bibtex
@article{atanov2022simple,
  title={Simple Control Baselines for Evaluating Transfer Learning},
  author={Atanov, Andrei and Xu, Shijian and Beker, Onur and Filatov, Andrei and Zamir, Amir},
  journal={arXiv preprint arXiv:2202.03365},
  year={2022},
}
```