This repository contains the code for BFRNet, introduced in the paper "Filter-Recovery Network for Multi-Speaker Audio-Visual Speech Separation" by Haoyue Cheng, Zhaoyang Liu, Wayne Wu and Limin Wang.
- Download the VoxCeleb2 test mixture lists from the following link:
https://pan.xunlei.com/s/VNXTbMyuZOijYSvNAFJPmVOvA1?pwd=wxtt#
- Create a directory named "voxceleb2" under the BFRNet root directory and move the mixture-list files into it.
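For reference, a minimal sketch of reading such a mixture list. This assumes one mixture per line with the component-utterance paths separated by whitespace; the actual field layout of the downloaded lists may differ, and the paths below are made up:

```python
import tempfile
from pathlib import Path

def read_mixture_list(path):
    """Parse a mixture-list file: one mixture per line, fields separated by
    whitespace. NOTE: the exact field layout is an assumption, not documented
    by this repository."""
    rows = []
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if fields:
            rows.append(fields)
    return rows

# Toy example with invented utterance paths:
demo = Path(tempfile.mkdtemp()) / "demo_2mix.txt"
demo.write_text("spkA/vid1/clip1.wav spkB/vid2/clip1.wav\n")
pairs = read_mixture_list(demo)
print(pairs)  # [['spkA/vid1/clip1.wav', 'spkB/vid2/clip1.wav']]
```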
# Directory structure of the VoxCeleb2 dataset:
# ├── VoxCeleb2
# │   ├── [mp4] (contains the face tracks)
# │   │   ├── [train]
# │   │   │   └── [spk_id]
# │   │   │       └── [video_id]
# │   │   │           └── [clip_id]
# │   │   │               └── .mp4 files
# │   │   └── [val]
# │   └── [mouth] (contains the audio files and mouth ROI files)
# │       ├── [train]
# │       │   └── [spk_id]
# │       │       └── [video_id]
# │       │           └── [clip_id]
# │       │               └── .h5 files, .wav files
# │       └── [val]
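Because the [mp4] and [mouth] subtrees mirror each other, a dataset index can pair each face-track clip with its audio and mouth-ROI files by path. A hedged sketch, assuming the .wav/.h5 files share the file stem of the corresponding .mp4 (the tree above does not pin down the file names):

```python
import tempfile
from pathlib import Path

def index_voxceleb2(root, split="train"):
    """Pair each face track under mp4/<split>/<spk_id>/<video_id>/<clip_id>/
    with the same-named .wav and .h5 under mouth/. File-name matching by stem
    is an assumption."""
    root = Path(root)
    samples = []
    for mp4_path in sorted((root / "mp4" / split).glob("*/*/*/*.mp4")):
        rel = mp4_path.relative_to(root / "mp4")
        wav = (root / "mouth" / rel).with_suffix(".wav")
        h5 = (root / "mouth" / rel).with_suffix(".h5")
        samples.append({"mp4": mp4_path, "wav": wav, "h5": h5,
                        "complete": wav.exists() and h5.exists()})
    return samples

# Build a tiny fake tree to demonstrate (speaker/video/clip ids are made up):
root = Path(tempfile.mkdtemp()) / "VoxCeleb2"
clip = Path("id00001") / "vid001" / "00001"
(root / "mp4" / "train" / clip).mkdir(parents=True)
(root / "mp4" / "train" / clip / "00001.mp4").touch()
(root / "mouth" / "train" / clip).mkdir(parents=True)
(root / "mouth" / "train" / clip / "00001.wav").touch()
(root / "mouth" / "train" / clip / "00001.h5").touch()
samples = index_voxceleb2(root)
print(len(samples), samples[0]["complete"])  # 1 True
```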
# Directory structure of the LRS2/LRS3 dataset:
# ├── lrs2/lrs3
# │   └── [main] (contains the face tracks, audio files, and mouth ROI files)
# │       └── [video_id]
# │           └── .wav files, .npz files, .mp4 files
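For LRS2/LRS3, the audio and mouth-ROI files for one video sit side by side under [main]. A minimal loading sketch, assuming 16-bit mono .wav audio and an .npz whose array is stored under the key "data" (both the key name and the array layout are assumptions):

```python
import os
import tempfile
import wave
import numpy as np

def load_lrs_sample(main_dir, video_id):
    """Load one LRS2/LRS3 item from [main]: the audio track (.wav) and the
    mouth-ROI array (.npz). The npz key 'data' is an assumption."""
    with wave.open(os.path.join(main_dir, video_id + ".wav"), "rb") as w:
        audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
    mouth = np.load(os.path.join(main_dir, video_id + ".npz"))["data"]
    return audio, mouth

# Create a dummy sample to demonstrate the expected pairing:
main = tempfile.mkdtemp()
with wave.open(os.path.join(main, "v0.wav"), "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)       # 16-bit samples
    w.setframerate(16000)   # 1 second of silence at 16 kHz
    w.writeframes(np.zeros(16000, dtype=np.int16).tobytes())
np.savez(os.path.join(main, "v0.npz"),
         data=np.zeros((25, 96, 96), dtype=np.uint8))  # 25 mouth-ROI frames
audio, mouth = load_lrs_sample(main, "v0")
print(audio.shape, mouth.shape)  # (16000,) (25, 96, 96)
```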
- Please contact chenghaoyue98@gmail.com to obtain the datasets.
- Train the model with Slurm:
GPUS=[GPUS] GPUS_PER_NODE=[GPUS_PER_NODE] bash train_slurm.sh [PARTITION] [JOB_NAME]
- Or train with torch.distributed:
NNODES=[NNODES] GPUS_PER_NODE=[GPUS_PER_NODE] bash train_dist.sh [JOB_NAME]
- Download the pre-trained networks from the following link:
https://drive.google.com/drive/folders/1J0qxFMb7NVbsXQwM4HiOJ1u7MI0pUquO
- Create a directory named "checkpoints" under the BFRNet root directory and move the pre-trained models into it.
- Evaluate the models on the VoxCeleb2 unseen_2mix test set:
mix_number=2 test_file="anno/unseen_2mix.txt" bash test.sh inference_unseen_2mix
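Separation quality on such test sets is commonly reported as SI-SDR (scale-invariant signal-to-distortion ratio); whether test.sh reports exactly this variant is not stated here. A minimal NumPy sketch of the metric:

```python
import numpy as np

def si_sdr(est, ref, eps=1e-8):
    """Scale-invariant SDR in dB between an estimated and a reference signal.
    Both signals are zero-meaned, and the estimate is projected onto the
    reference, so rescaling the estimate does not change the score."""
    ref = ref - ref.mean()
    est = est - est.mean()
    s_target = np.dot(est, ref) / (np.dot(ref, ref) + eps) * ref
    e_noise = est - s_target
    return 10 * np.log10((np.dot(s_target, s_target) + eps) /
                         (np.dot(e_noise, e_noise) + eps))

rng = np.random.default_rng(0)
ref = rng.standard_normal(16000)
perfect = si_sdr(2.0 * ref, ref)                              # very large dB:
noisy = si_sdr(ref + 0.1 * rng.standard_normal(16000), ref)   # roughly 20 dB
print(perfect > 100, 15 < noisy < 25)
```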