
Filter-Recovery Network for Multi-Speaker Audio-Visual Speech Separation

This repository contains the code for BFRNet, introduced in the paper:

Filter-Recovery Network for Multi-Speaker Audio-Visual Speech Separation
Haoyue Cheng, Zhaoyang Liu, Wayne Wu and Limin Wang

Dataset Preparation

  1. Download the VoxCeleb2 test mixture lists from the following link:
https://pan.xunlei.com/s/VNXTbMyuZOijYSvNAFJPmVOvA1?pwd=wxtt#
  2. Create a directory "voxceleb2" under the main BFRNet directory and move the mixture files into it.
# Directory structure of the VoxCeleb2 dataset:
#    VoxCeleb2
#    ├── mp4                 (contains the face tracks)
#    │   ├── train
#    │   │   └── [spk_id]/[video_id]/[clip_id]/*.mp4
#    │   └── val
#    └── mouth               (contains the audio files and mouth ROI files)
#        ├── train
#        │   └── [spk_id]/[video_id]/[clip_id]/*.h5, *.wav
#        └── val
# Directory structure of the LRS2/LRS3 datasets:
#    lrs2 / lrs3
#    └── main                (contains the face tracks, audio files, and mouth ROI files)
#        └── [video_id]/*.wav, *.npz, *.mp4
  3. Please contact chenghaoyue98@gmail.com to obtain the datasets. A quick check of the expected layout is sketched after this list.
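
Once the data is in place, a quick sanity check can help catch path mistakes. The snippet below is a minimal sketch, assuming the dataset sits under ./VoxCeleb2 exactly as in the tree above; the paths are illustrative and not part of the repository's scripts.

# Minimal layout check (illustrative; assumes the dataset is unpacked under ./VoxCeleb2
# as in the tree above and that standard POSIX find/wc are available).
for split in train val; do
    echo "face tracks ($split): $(find VoxCeleb2/mp4/$split   -name '*.mp4' | wc -l)"
    echo "audio clips ($split): $(find VoxCeleb2/mouth/$split -name '*.wav' | wc -l)"
    echo "mouth ROIs  ($split): $(find VoxCeleb2/mouth/$split -name '*.h5'  | wc -l)"
done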

Train the model

  1. Train the model with Slurm:
GPUS=[GPUS] GPUS_PER_NODE=[GPUS_PER_NODE] bash train_slurm.sh [PARTITION] [JOB_NAME]
  2. Train the model with torch.distributed (example invocations of both scripts are sketched below):
NNODES=[NNODES] GPUS_PER_NODE=[GPUS_PER_NODE] bash train_dist.sh [JOB_NAME]
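
The bracketed values above are placeholders. As a rough illustration (the partition name, job name, and GPU counts below are assumptions to adapt to your cluster, not values prescribed by the repository):

# Slurm: e.g. 8 GPUs on a single node, on a hypothetical partition named "gpu"
GPUS=8 GPUS_PER_NODE=8 bash train_slurm.sh gpu bfrnet_train

# torch.distributed: e.g. 2 nodes with 8 GPUs each
NNODES=2 GPUS_PER_NODE=8 bash train_dist.sh bfrnet_train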

Evaluate the model

  1. Download the pre-trained networks from the following link:
https://drive.google.com/drive/folders/1J0qxFMb7NVbsXQwM4HiOJ1u7MI0pUquO
  1. Create directory "checkpoints" in the main directory BFRNet, and move the models to directory "checkpoints".
  2. Evaluate the models on VoxCeleb2 unseen_2mix test set:
mix_number=2 test_file="anno/unseen_2mix.txt" bash test.sh inference_unseen_2mix
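
The test script appears to be parameterized by mix_number and test_file, so other mixture lists could presumably be evaluated analogously; the 3-speaker annotation filename below is an assumption, not confirmed by this README:

# Hypothetical 3-speaker evaluation; "anno/unseen_3mix.txt" and the job name are assumed
mix_number=3 test_file="anno/unseen_3mix.txt" bash test.sh inference_unseen_3mix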
