This repository contains the code for the experiments from *Simple Control Baselines for Evaluating Transfer Learning*.
- Abstract
- Datasets and Downstream Tasks
- Download all pretrained models
- Transfer a model
- Visualization
- Citing
Transfer learning has witnessed remarkable progress in recent years, for example, with the introduction of augmentation-based contrastive self-supervised learning methods. While a number of large-scale empirical studies on the transfer performance of such models have been conducted, there is not yet an agreed-upon set of control baselines, evaluation practices, and metrics to report, which often hinders a nuanced and calibrated understanding of the real efficacy of the methods. We share an evaluation standard that aims to quantify and communicate transfer learning performance in an informative and accessible setup. This is done by baking a number of simple yet critical control baselines in the evaluation method, particularly the blind-guess (quantifying the dataset bias), scratch-model (quantifying the architectural contribution), and maximal-supervision (quantifying the upper-bound). To demonstrate how the proposed evaluation standard can be employed, we provide an example empirical study investigating a few basic questions about self-supervised learning. For example, using this standard, the study shows the effectiveness of existing self-supervised pre-training methods is skewed towards image classification tasks versus dense pixel-wise predictions.
Fig: summary of the evaluation standard
- Classification: ImageNet, CIFAR-100, and EuroSAT.
- Pixel-wise regression: Taskonomy for depth and surface normals estimation.
All pre-trained encoders share the same ResNet-50 architecture. For transfers to downstream tasks, we use two types of decoders. For classification tasks, we use a single fully-connected layer that takes the output of the encoder's final layer and outputs the logits for each class. For pixel-wise regression tasks, we use a UNet-style decoder with six upsampling blocks and skip-connections from the encoder layers of the same spatial resolution.
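As a minimal sketch of the classification head described above (not the repository's actual implementation), a single fully-connected layer maps pooled encoder features to class logits. The shapes are illustrative assumptions: 2048-dimensional pooled ResNet-50 features and 100 classes, as for CIFAR-100.

```python
import numpy as np

def linear_classifier_head(features, weight, bias):
    """Single fully-connected layer: maps encoder features to class logits."""
    return features @ weight.T + bias

rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 2048))        # batch of 4 pooled encoder outputs
W = rng.standard_normal((100, 2048)) * 0.01   # one row of weights per class
b = np.zeros(100)

logits = linear_classifier_head(feats, W, b)
print(logits.shape)  # (4, 100): one logit per class for each image
```

The UNet-style regression decoder is structurally richer (six upsampling blocks with skip-connections), but the classifier head really is this small, which is what makes linear evaluation cheap.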
- A full list of the pre-trained models we used is in the table below:
| Method | URL |
| --- | --- |
| *Contrastive Self-Supervised Learning* | |
| SwAV | model |
| MoCov2 | model |
| SimCLR | model |
| SimSiam | model |
| Barlow Twins | model |
| PIRL | model |
| *Non-Contrastive Pretext Tasks* | |
| Colorization | model |
| Jigsaw | model |
If you haven't already, download the pretrained models and the datasets needed.
```
.
├── data/                     # Dataset modules
├── downstream_tasks/         # Downstream task modules
├── imagenet_cls/             # Code for ImageNet classification
├── link_modules/             # Link modules connecting encoder and decoder
├── models/                   # Network definitions
├── representation_tasks/     # Feature extractor modules
├── utils/                    # Utility functions
├── train_ssl2taskonomy.py    # Main training code for Taskonomy
├── run_cls.sh                # Training script for CIFAR100/EuroSAT
└── run_taskonomy.sh          # Training script for Taskonomy
```
Note that we provide three shell scripts for training on the different tasks.
- Transfer self-supervised methods to the Taskonomy dataset:

```shell
bash run_taskonomy.sh
```

Please change `--ssl_name` and `--taskonomy_domain` accordingly. The model names and Taskonomy domains we support are detailed in `train_ssl2taskonomy.py`.
- Transfer self-supervised methods to ImageNet classification:

```shell
cd imagenet_cls
bash train_cls.sh
```

Please change `--model_name` accordingly. The model names we support are detailed in `imagenet_ssl_epoch.py`.
- Transfer self-supervised methods to the CIFAR100/EuroSAT datasets:

```shell
bash run_cls.sh
```

Please change `--ssl_name` and `--taskonomy_domain` accordingly. The model names and taskonomy domains we support are detailed in `train_ssl2taskonomy.py`.
To freeze the feature extractor and update only the decoder or classifier parameters, enable `--freeze_representation` in the training scripts.
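A rough sketch of what such a flag typically does, shown with stand-in objects rather than the repository's actual PyTorch code (parameter names are illustrative): encoder parameters are marked non-trainable, so only decoder/classifier parameters remain for the optimizer to update.

```python
class Param:
    """Stand-in for a framework parameter with a trainability flag."""
    def __init__(self, name, requires_grad=True):
        self.name = name
        self.requires_grad = requires_grad

params = [
    Param("encoder.layer1.weight"),
    Param("encoder.layer4.weight"),
    Param("decoder.up1.weight"),
    Param("classifier.weight"),
]

# Freeze the representation: every parameter belonging to the encoder
# is excluded from gradient updates.
for p in params:
    if p.name.startswith("encoder."):
        p.requires_grad = False

trainable = [p.name for p in params if p.requires_grad]
print(trainable)  # only decoder/classifier parameters remain trainable
```

In PyTorch this is idiomatically done by setting `requires_grad = False` on encoder parameters and passing only still-trainable parameters to the optimizer.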
The losses and visualizations are logged in Weights & Biases.
You can find a plotting example in the `Vizualization-Example.ipynb` notebook. It assumes a `.csv` file with the following columns: `domain`, `method`, `train_size`, `test_loss`, `seed`. An example of the results file from our evaluation is `./assets/ssl-results.csv`.
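A small sketch of how a results file in this format can be aggregated, e.g. averaging `test_loss` over seeds for each (domain, method, train_size) cell. The rows below are made-up illustrations of the column format, not values from `./assets/ssl-results.csv`.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows in the expected column format (values are invented).
sample = """domain,method,train_size,test_loss,seed
cifar100,swav,1000,1.25,0
cifar100,swav,1000,1.31,1
cifar100,scratch,1000,2.10,0
"""

# Collect test losses per (domain, method, train_size) across seeds.
losses = defaultdict(list)
for row in csv.DictReader(io.StringIO(sample)):
    key = (row["domain"], row["method"], int(row["train_size"]))
    losses[key].append(float(row["test_loss"]))

# Mean test loss per cell, which is what transfer curves are plotted from.
means = {k: sum(v) / len(v) for k, v in losses.items()}
print(means[("cifar100", "swav", 1000)])  # mean over the two swav seeds
```

The notebook itself does the plotting; the point here is only the expected schema of the `.csv` file.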
If you find the paper or the code useful, please cite:
```bibtex
@article{atanov2022simple,
  title={Simple Control Baselines for Evaluating Transfer Learning},
  author={Atanov, Andrei and Xu, Shijian and Beker, Onur and Filatov, Andrei and Zamir, Amir},
  journal={arXiv preprint arXiv:2202.03365},
  year={2022},
}
```