This repository contains the code for the Findings of EMNLP 2022 paper: Self-supervised Rewiring of Pre-trained Speech Encoders: Towards Faster Fine-tuning with Less Labels in Speech Processing
Our work builds on the SUPERB benchmark. Please follow its instructions to set up the environment, download the datasets, and preprocess the data.
cd s3prl/upstream/wav2vec2_hug
# Neutral Version
python n_mirror.py
# Twin Version
python sumavg-1ep.py
# Mixed Version
python n_mirror_mix.py
After training the models, load them in s3prl/upstream/wav2vec2_hug/expert.py.
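One way to wire a rewired checkpoint into the upstream expert is sketched below. This is a minimal sketch, not the repository's actual code: the checkpoint filename, the state-dict wrapping, and the helper name are all assumptions — adapt them to however your chosen rewiring script (n_mirror.py, sumavg-1ep.py, or n_mirror_mix.py) saves its output.

```python
# Sketch: overwrite the pre-trained wav2vec 2.0 weights inside expert.py with
# the rewired ones. "rewired_wav2vec2.pt" is a placeholder checkpoint name.
from pathlib import Path

REWIRED_CKPT = Path("rewired_wav2vec2.pt")  # placeholder; set to your checkpoint


def load_rewired_weights(model, ckpt_path=REWIRED_CKPT):
    """Load rewired encoder weights into the upstream model (hypothetical helper)."""
    import torch  # imported here so the module-level sketch stays dependency-free

    state = torch.load(ckpt_path, map_location="cpu")
    # Some training scripts wrap weights as {"state_dict": ...}; unwrap if present.
    if isinstance(state, dict) and "state_dict" in state:
        state = state["state_dict"]
    # strict=False tolerates keys that differ from the HuggingFace naming.
    missing, unexpected = model.load_state_dict(state, strict=False)
    return missing, unexpected
```

Calling load_rewired_weights(self.model) inside the expert's constructor, after the pre-trained model is built, would replace the original encoder weights with the rewired ones before any downstream fine-tuning starts.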
We follow the SUPERB setting; please select the downstream tasks according to its instructions.
python3 run_downstream.py -n [output_name] -m train -u wav2vec2_hug_large_ll60k -d [task]
@inproceedings{yang2022self,
title={Self-supervised Rewiring of Pre-trained Speech Encoders: Towards Faster Fine-tuning with Less Labels in Speech Processing},
author={Yang, Hao and Zhao, Jinming and Haffari, Gholamreza and Shareghi, Ehsan},
booktitle={Findings of the Association for Computational Linguistics: EMNLP 2022},
pages={1952--1959},
year={2022}
}