Ultra Fast Speech Separation Model with Teacher Student Learning

Introduction

Small Transformer-based speech separation models are preferred for on-device deployment because of their ultra fast inference speed.
In this work, we use Teacher Student learning to better train the ultra fast speech separation model: the small student model is trained to reproduce the separation results of a large pretrained teacher model.
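To make the idea of "reproducing the teacher's separation results" concrete, here is a minimal sketch of a Teacher Student objective in plain Python. It assumes each model emits one flattened output per speaker and that the distance is a mean squared error; both the distance and the helper names (`mse`, `teacher_student_loss`) are illustrative assumptions, and the objective actually used is described in the paper.

```python
from typing import List, Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def teacher_student_loss(student_out: List[List[float]],
                         teacher_out: List[List[float]]) -> float:
    """Average per-speaker MSE between student and teacher outputs.

    Each argument holds one flattened output (e.g. a mask) per speaker.
    MSE is an assumption made for this sketch, not the paper's objective.
    """
    if len(student_out) != len(teacher_out):
        raise ValueError("speaker counts differ")
    per_speaker = [mse(s, t) for s, t in zip(student_out, teacher_out)]
    return sum(per_speaker) / len(per_speaker)


# Toy outputs for two speakers (hypothetical values, not real model output).
teacher = [[0.9, 0.1, 0.8], [0.1, 0.9, 0.2]]
student = [[0.7, 0.1, 0.8], [0.1, 0.9, 0.6]]
loss = teacher_student_loss(student, teacher)
print(loss)
```

The loss is zero only when the student matches the teacher exactly, so minimizing it pushes the small model toward the large model's behavior without needing ground-truth targets for this term.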

For a detailed description and experimental results, please refer to our paper, Ultra Fast Speech Separation Model with Teacher Student Learning (accepted at INTERSPEECH 2021).

Environment

python 3.6.9, torch 1.7.1
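A quick way to compare a local setup against the versions listed above is sketched below. The helper names (`parse_version`, `describe`) are hypothetical and not part of this repository; the script only reports differences and does not claim that other versions fail.

```python
import re
import sys
from typing import Tuple

# Versions listed in this README.
TESTED = {"python": (3, 6, 9), "torch": (1, 7, 1)}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as '1.7.1+cu110' into (1, 7, 1)."""
    core = text.split("+")[0]  # drop a local build suffix such as '+cu110'
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def describe(name: str, found: Tuple[int, ...]) -> str:
    """Build a one-line report comparing a found version with the tested one."""
    relation = "matches" if found == TESTED[name] else "differs from"
    return "{} {} {} the tested version {}".format(
        name, ".".join(map(str, found)), relation,
        ".".join(map(str, TESTED[name])))


print(describe("python", tuple(sys.version_info[:3])))
try:
    import torch  # optional: only reported when installed
    print(describe("torch", parse_version(torch.__version__)))
except ImportError:
    print("torch is not installed")
```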

Get Started

1. Download the overlapped speech of the LibriCSS dataset.

wget "https://valle.blob.core.windows.net/share/CSS_with_TSTransformer/overlapped_speech.zip?sv=2020-08-04&st=2023-03-01T07%3A51%3A05Z&se=2033-03-02T07%3A51%3A00Z&sr=c&sp=rl&sig=QJXmSJG9DbMKf48UDIU1MfzIro8HQOf3sqlNXiflY1I%3D" -O overlapped_speech.zip && rm -rf /tmp/cookies.txt && unzip overlapped_speech.zip && rm overlapped_speech.zip

2. Download the TSTransformer separation models.

wget "https://valle.blob.core.windows.net/share/CSS_with_TSTransformer/checkpoints.zip?sv=2020-08-04&st=2023-03-01T07%3A51%3A05Z&se=2033-03-02T07%3A51%3A00Z&sr=c&sp=rl&sig=QJXmSJG9DbMKf48UDIU1MfzIro8HQOf3sqlNXiflY1I%3D" -O checkpoints.zip && rm -rf /tmp/cookies.txt && unzip checkpoints.zip && rm checkpoints.zip

3. Run the separation.

3.1 Single-channel separation

export MODEL_NAME=1ch_TSTransformer
python3 separate.py \
    --checkpoint checkpoints/$MODEL_NAME \
    --wav_list utils/overlapped_speech_1ch.scp \
    --sep_dir separated_speech/1ch/utterances_with_${MODEL_NAME} \
    --device-id 0 \
    --num_spks 2 \
    --mvdr false

The separated speech can be found in the directory 'separated_speech/1ch/utterances_with_${MODEL_NAME}'

3.2 Seven-channel separation

export MODEL_NAME=TSTransformer
python3 separate.py \
    --checkpoint checkpoints/$MODEL_NAME \
    --wav_list utils/overlapped_speech_7ch.scp \
    --sep_dir separated_speech/7ch/utterances_with_${MODEL_NAME} \
    --device-id 0 \
    --num_spks 2 \
    --mvdr true

The separated speech can be found in the directory 'separated_speech/7ch/utterances_with_${MODEL_NAME}'
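The two invocations above differ only in the model name, the channel count, and the MVDR switch. As a convenience, the sketch below assembles both command lines from Python. `build_separate_cmd` is a hypothetical helper, not part of this repository; it uses only the flags and paths shown in the commands above.

```python
import shlex
from typing import List


def build_separate_cmd(model_name: str, channels: int,
                       use_mvdr: bool) -> List[str]:
    """Assemble the separate.py argument list for one configuration."""
    tag = "{}ch".format(channels)  # "1ch" or "7ch", as in the paths above
    return [
        "python3", "separate.py",
        "--checkpoint", "checkpoints/{}".format(model_name),
        "--wav_list", "utils/overlapped_speech_{}.scp".format(tag),
        "--sep_dir",
        "separated_speech/{}/utterances_with_{}".format(tag, model_name),
        "--device-id", "0",
        "--num_spks", "2",
        "--mvdr", "true" if use_mvdr else "false",
    ]


single = build_separate_cmd("1ch_TSTransformer", channels=1, use_mvdr=False)
seven = build_separate_cmd("TSTransformer", channels=7, use_mvdr=True)

# Print shell-ready command lines; nothing is executed here.
for cmd in (single, seven):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually run a configuration from the repository root, the list can be passed to `subprocess.run(cmd, check=True)`.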

Citation

If you find our work useful, please cite our paper:

@inproceedings{CSS_with_TSTransformer,
  author={Sanyuan Chen and Yu Wu and Zhuo Chen and Jian Wu and Takuya Yoshioka and Shujie Liu and Jinyu Li and Xiangzhan Yu},
  title={{Ultra Fast Speech Separation Model with Teacher Student Learning}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={3026--3030},
  doi={10.21437/Interspeech.2021-142}
}
