Simple Control Baselines for Evaluating Transfer Learning

This repository contains the code for experiments from Simple Control Baselines for Evaluating Transfer Learning.

Abstract

Transfer learning has witnessed remarkable progress in recent years, for example, with the introduction of augmentation-based contrastive self-supervised learning methods. While a number of large-scale empirical studies on the transfer performance of such models have been conducted, there is not yet an agreed-upon set of control baselines, evaluation practices, and metrics to report, which often hinders a nuanced and calibrated understanding of the real efficacy of the methods. We share an evaluation standard that aims to quantify and communicate transfer learning performance in an informative and accessible setup. This is done by baking a number of simple yet critical control baselines in the evaluation method, particularly the blind-guess (quantifying the dataset bias), scratch-model (quantifying the architectural contribution), and maximal-supervision (quantifying the upper-bound). To demonstrate how the proposed evaluation standard can be employed, we provide an example empirical study investigating a few basic questions about self-supervised learning. For example, using this standard, the study shows the effectiveness of existing self-supervised pre-training methods is skewed towards image classification tasks versus dense pixel-wise predictions.


Fig: summary of the evaluation standard

Datasets and downstream tasks

Classification Tasks

ImageNet, CIFAR-100, and EuroSAT.

Pixel-wise Tasks

Taskonomy for depth and surface normals estimation.

Pretrained models

Network architecture

All pre-trained encoders share the same ResNet-50 architecture. For transfers to downstream tasks, we use two types of decoders. For classification tasks, we use a single fully-connected layer that takes the output of the encoder's final layer and outputs the logits for each class. For pixel-wise regression tasks, we use a UNet-style decoder with six upsampling blocks and skip-connections from the encoder layers of the same spatial resolution.
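
As a rough sketch of this setup for the classification case (assuming standard torchvision components; names such as num_classes are illustrative and not the repository's actual API):

import torch
import torch.nn as nn
import torchvision.models as models

# Shared ResNet-50 encoder; the original classification head is removed so the
# encoder outputs 2048-d features.
encoder = models.resnet50(weights=None)
encoder.fc = nn.Identity()

# Classification transfer: a single fully-connected layer produces the logits.
num_classes = 100  # e.g. CIFAR-100
classifier_head = nn.Linear(2048, num_classes)

x = torch.randn(2, 3, 224, 224)
logits = classifier_head(encoder(x))  # shape: (2, num_classes)

# For pixel-wise tasks, this head is replaced by a UNet-style decoder with six
# upsampling blocks and skip-connections from encoder layers of matching resolution.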

Download the models

Transfer

Download the pretrained models

If you haven't already, download the pretrained models and the datasets you need.

Main code structure

.
├── data/                          # Dataset modules
├── downstream_tasks/              # Downstream task module
├── imagenet_cls/                  # Code for ImageNet classification
├── link_modules/                  # Link modules to connect encoder and decoder
├── models/                        # Network definition
├── representation_tasks/          # Different feature extractor modules
├── utils/                         # Utility functions
├── train_ssl2taskonomy.py         # Main code for training on Taskonomy
├── run_cls.sh                     # Training script for CIFAR100/EuroSAT
└── run_taskonomy.sh               # Training script for Taskonomy

Training scripts

We provide three shell scripts for training on the different tasks.

  • Transfer self-supervised methods to Taskonomy dataset:
bash run_taskonomy.sh

Please change --ssl_name and --taskonomy_domain accordingly. The model names and taskonomy domains we support are detailed in train_ssl2taskonomy.py.

  • Transfer self-supervised methods to ImageNet classification:
cd imagenet_cls
bash train_cls.sh

Please change --model_name accordingly. The model names we support are detailed in imagenet_ssl_epoch.py.

  • Transfer self-supervised methods to CIFAR100/EuroSAT datasets:
bash run_cls.sh

Please change --ssl_name and --taskonomy_domain accordingly. The model names and taskonomy domains we support are detailed in train_ssl2taskonomy.py.

Freeze and unfreeze the feature extractor

To freeze the feature extractor and only update the decoder parts or the classifier parts, please enable --freeze_representation in the training scripts.
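
Conceptually, this corresponds to disabling gradients for the encoder and optimizing only the decoder or classifier parameters. A generic PyTorch sketch of this idea (not the repository's exact code; variable names are illustrative):

import torch
import torch.nn as nn
import torchvision.models as models

encoder = models.resnet50(weights=None)
encoder.fc = nn.Identity()
head = nn.Linear(2048, 100)  # illustrative classifier head

# Freeze the feature extractor.
for p in encoder.parameters():
    p.requires_grad = False
encoder.eval()  # keep batch-norm statistics fixed as well

# Only the head's parameters are passed to the optimizer.
optimizer = torch.optim.SGD(head.parameters(), lr=0.01, momentum=0.9)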

Logging

The losses and visualizations are logged in Weights & Biases.
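
For reference, a minimal Weights & Biases logging pattern looks like the following; the project name, config keys, and metric names here are illustrative, not the ones used by the training scripts:

import wandb

run = wandb.init(project="transfer-controls", config={"ssl_name": "example", "lr": 0.01})
for step in range(3):
    wandb.log({"train_loss": 1.0 / (step + 1)}, step=step)
run.finish()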

Visualization

You can find the plotting example in the Vizualization-Example.ipynb notebook. It assumes a .csv file with the following columns: domain, method, train_size, test_loss, seed. An example of the results file from our evaluation is ./assets/ssl-results.csv.
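
For instance, a minimal pandas/matplotlib sketch of such a plot, assuming only the columns listed above, could look like this:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("./assets/ssl-results.csv")

# Pick one downstream domain, average test loss over seeds, and plot it
# against the training-set size for each method.
domain = df["domain"].unique()[0]
sub = df[df["domain"] == domain]
agg = sub.groupby(["method", "train_size"], as_index=False)["test_loss"].mean()

for method, grp in agg.groupby("method"):
    plt.plot(grp["train_size"], grp["test_loss"], marker="o", label=method)

plt.xscale("log")
plt.xlabel("train_size")
plt.ylabel("test_loss")
plt.title(f"domain: {domain}")
plt.legend()
plt.show()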

Citation

If you find this paper or code useful, please cite:

@article{atanov2022simple,
  title={Simple Control Baselines for Evaluating Transfer Learning},
  author={Atanov, Andrei and Xu, Shijian and Beker, Onur and Filatov, Andrei and Zamir, Amir},
  journal={arXiv preprint arXiv:2202.03365},
  year={2022},
}
