MT-SLVR: Multi-Task Self-Supervised Learning for Transformation In(Variant) Representations

A novel multi-task self-supervised learning approach, capable of learning both augmentation invariant and equivariant features in a parameter efficient manner.

The paper can be found either in the InterSpeech23 proceedings or on arXiv.

News & Updates

  • 23/8/23: Presented work in poster format at InterSpeech23. Poster released in this repo
  • 16/8/23: Paper formally released by InterSpeech23, see here
  • 8/8/23: Pre-trained weights for MT-SLVR models released
  • 1/6/23: Blog post with additional details and diagrams released: here
  • 29/5/23: Paper and code made public
  • 17/5/23: MT-SLVR accepted to InterSpeech23, to be presented in August

Citation

If you find this work useful or related to your own, please consider citing it:

@inproceedings{heggan23_interspeech,
  author={Calum Heggan and Tim Hospedales and Sam Budgett and Mehrdad Yaghoobi},
  title={{MT-SLVR: Multi-Task Self-Supervised Learning for Transformation In(Variant) Representations}},
  year=2023,
  booktitle={Proc. INTERSPEECH 2023},
  pages={4399--4403},
  doi={10.21437/Interspeech.2023-1064}
}

MT-SLVR

Simply put, the MT-SLVR algorithm utilises multi-task learning between contrastive and predictive self-supervised learning techniques. The features learnt by these two objectives are expected to conflict heavily (i.e. one tries to learn augmentation invariance while the other tries to learn augmentation equivariance). To allow both to co-exist and be readily available for downstream tasks, we fit adapters throughout the neural network, giving each task (contrastive/predictive) some of its own specific parameters to learn on. A minimal sketch of this idea is given below.
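
To make the adapter idea concrete, here is a minimal PyTorch sketch, not our exact implementation: the class names, layer sizes, and adapter design below are illustrative assumptions. It shows a shared backbone where each task additionally routes through its own small parallel adapter.

import torch
import torch.nn as nn

class ParallelAdapterBlock(nn.Module):
    """One shared layer plus a lightweight parallel adapter per task."""
    def __init__(self, dim, num_tasks=2, bottleneck=16):
        super().__init__()
        self.shared = nn.Linear(dim, dim)
        self.adapters = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, bottleneck), nn.ReLU(), nn.Linear(bottleneck, dim))
            for _ in range(num_tasks)
        )

    def forward(self, x, task_id):
        # The adapter runs in parallel with the shared layer; its output is added on.
        return torch.relu(self.shared(x) + self.adapters[task_id](x))

class TinyMTSLVR(nn.Module):
    def __init__(self, dim=128, depth=3):
        super().__init__()
        self.blocks = nn.ModuleList(ParallelAdapterBlock(dim) for _ in range(depth))

    def forward(self, x, task_id):
        for block in self.blocks:
            x = block(x, task_id)
        return x

model = TinyMTSLVR()
feats = torch.randn(8, 128)          # stand-in features for a batch of 8 clips
z_inv = model(feats, task_id=0)      # contrastive (invariance-oriented) branch
z_eqv = model(feats, task_id=1)      # predictive (equivariance-oriented) branch
# Training then combines both objectives, e.g. loss = cont_loss + pred_weight * pred_loss.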

Using the Repo(s)

Contents

This repo contains a few distinct parts which can be used both to reproduce the results from our work and to train new models for varying purposes. The individual sub-codebases are detailed in the sections below.

We note that although there are unique parts to each of the three major codebases, there is a significant amount of overlapping code, e.g. the dataset and augmentation classes. We left the overall codebase like this, instead of reformatting and removing repeated scripts, so that each section can be used independently, effectively increasing the immediate usability of the repo.

Evaluation Framework

We exclude our evaluation framework from this repo (due to its additional complexity and its potential usefulness as a standalone codebase) and instead host it here. Documentation for the evaluation repo is still under construction.

Environment

We use miniconda for our experimental setup. For the purposes of reproduction we include the environment file, which can be set up using the following command:

conda env create --file torch_gpu_env.txt

There are likely some redundant packages in this environment file; we will attempt to trim it down in future releases.

Datasets & Processing

For pre-training we use the balanced version of AudioSet. The decision to use this set was based on its ease of use and manageable size. Unfortunately, the set is not easily available for download; that being said, it can be reproduced using a YouTube scraping script. Details and references for this process can be found here.
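
As a rough illustration of the kind of pre-processing such clips typically need before pre-training, here is a hedged sketch using torchaudio; the sample rate, clip length, and file handling below are illustrative assumptions rather than this repo's exact settings.

import torch
import torchaudio

TARGET_SR = 16_000   # assumed target sample rate
CLIP_SECONDS = 5     # assumed fixed clip length

def load_clip(path: str) -> torch.Tensor:
    waveform, sr = torchaudio.load(path)           # (channels, samples)
    waveform = waveform.mean(dim=0, keepdim=True)  # mix down to mono
    if sr != TARGET_SR:
        waveform = torchaudio.transforms.Resample(sr, TARGET_SR)(waveform)
    target_len = TARGET_SR * CLIP_SECONDS
    if waveform.shape[1] < target_len:             # zero-pad short clips
        waveform = torch.nn.functional.pad(waveform, (0, target_len - waveform.shape[1]))
    return waveform[:, :target_len]                # crop long clips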

Pre-Trained Weights

We also release the weights of the models used in the original work. The download script, along with details on how to use it, can be found here. A rough sketch of loading such a checkpoint is shown below.
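
The filename and checkpoint layout in this sketch are assumptions; consult the linked script for the exact loading procedure.

import torch

checkpoint = torch.load("mt_slvr_weights.pt", map_location="cpu")
# Checkpoints are commonly either a raw state_dict or a dict wrapping one:
state_dict = checkpoint.get("state_dict", checkpoint)
print(f"{len(state_dict)} tensors, first key: {next(iter(state_dict))}")
# model.load_state_dict(state_dict)  # with the matching model class from this repo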

MT-SLVR Pre-Training

Additional details on how to run MT-SLVR pre-training can be found in its sub-codebase, but the main command takes the following format:

python NEW_RUN.py --cont_framework simclrv1 --pred_framework trans --pred_weight 1.0 --adapter parallel --num_splits 2 --batch_size 100 --lr 0.00005 --p 1.0 --data_name AS_BAL --dims 2 --in_channels 3 --model_fc_out 1000 --gpu 0

Hyperparameter descriptions can be found in NEW_RUN.py.

Baseline Codebases

Details on running baselines can be found in their respective sub-codebases.
