# Investigating Pre-trained Audio Encoders in the Low-Resource Condition

This repository contains the code for the Interspeech 2023 paper *Investigating Pre-trained Audio Encoders in the Low-Resource Condition*.

## Environment and Dataset

Our work builds on the SUPERB benchmark. Please follow their instructions to set up the environment, download the datasets, and preprocess the data.

## Fine-tune Models on Downstream Tasks

Select the model you want to fine-tune and the training-data proportion in `upstream/wav2vec2_hug/expert.py` and in `downstream/[task]/dataset.py` & `expert.py`.
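The data-proportion edit in `downstream/[task]/dataset.py` amounts to keeping only a fraction of the training items. As a minimal sketch (the function name `subsample` and its signature are illustrative, not part of this repo; the real change is task-specific), deterministic subsampling might look like:

```python
import random

def subsample(items, proportion, seed=0):
    """Keep a deterministic fraction of the training items.

    Hypothetical helper illustrating the low-resource data-proportion
    setup; not the repo's actual implementation.
    """
    if not 0.0 < proportion <= 1.0:
        raise ValueError("proportion must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so every run sees the same subset
    k = max(1, int(len(items) * proportion))
    return rng.sample(items, k)
```

Fixing the seed matters here: comparing encoders is only meaningful when each one is fine-tuned on the same low-resource subset.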

### W2V2 & WavLM

```shell
python3 run_downstream.py -n ExpName -m train -u wav2vec2_hug_large_ll60k -d [task]
```

### Whisper

```shell
python3 run_downstream.py -n ExpName -m train -u wav2vec2_hug_large_ll60k -d [task] -s last_hidden_state
```

## Citation

```bibtex
@inproceedings{yang23d_interspeech,
  author={Hao Yang and Jinming Zhao and Gholamreza Haffari and Ehsan Shareghi},
  title={{Investigating Pre-trained Audio Encoders in the Low-Resource Condition}},
  year={2023},
  booktitle={Proc. INTERSPEECH 2023},
  pages={1498--1502},
  doi={10.21437/Interspeech.2023-343}
}
```
