Self-supervised Spoofing Audio Detection Scheme

This repository contains the implementation of SSAD, a speech waveform encoder trained in a self-supervised manner with the so-called worker framework. An SSAD model can be used as a speech feature extractor or as a pre-trained encoder for the spoofing audio detection task.

Requirements

PyTorch 1.0 or higher
Torchvision 0.2 or higher
Install the requirements from requirements.txt: pip install -r requirements.txt
NOTE: Edit the cupy-cuda100 requirement in the file if needed, depending on your CUDA version. It currently defaults to CUDA 10.0.
Install the SSAD modules: python setup.py install

pip install -r requirements.txt  # install the requirements
export PYTHONPATH=.  # run this in every new shell to set the root path
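
The version minimums listed above can be checked before training with a short script. This is only a sketch and not part of the repository: the helper names `parse_version` and `check_requirements` are made up for illustration, and the only assumption about the environment is that the packages are installed under their usual PyPI names `torch` and `torchvision`.

```python
from importlib import metadata
from itertools import takewhile

# Minimum (major, minor) versions, taken from the requirements list above.
MINIMUMS = {"torch": (1, 0), "torchvision": (0, 2)}


def parse_version(text):
    """Turn a version string such as '1.13.1+cu117' into (major, minor)."""
    parts = []
    for piece in text.split("+")[0].split(".")[:2]:
        # Keep only the leading digits so that '0rc1' is read as 0.
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, parse_version(installed) >= minimum)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements().items():
        status = "OK" if ok else "too old or missing"
        print(f"{name}: {installed or 'not installed'} -> {status}")
```

The check only compares major and minor version numbers; it does not verify that the installed cupy build matches your CUDA version.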

Pre-trained Model

Figure: the valid_loss curve in TensorBoard.

The pre-trained SSAD encoder was trained for 99 epochs (in my experience, training for 200 epochs gives the best performance), and the classifier was only casually trained. The results and download link are as follows.

min-tDCF	EER (%)	Classifier
0.1643	6.3869	SENet12

URLs: https://pan.baidu.com/s/1fx-fk2rOPWMNgLRlwJ33JA passcode: 4j1b

Remember to adapt the paths and config settings in the bash scripts below to your environment.

Data preparation

Data preparation requires the following files:

train_scp: training file list, with one file name per line (without directory names), including the .wav/.mp3/etc. extension.
test_scp: test file list, with one file name per line (without directory names), including the .wav/.mp3/etc. extension.
dict.npy: dictionary mapping each file name to an integer speaker class (speaker id), using the same file names as in the train/test lists.
data.cfg: the dataset config file.
stats_pase+.pkl: the normalization statistics that the workers need to work properly.
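
The file lists and the dictionary described above could be produced along the following lines. This is a minimal sketch, not the repository's own preparation code: the helper names, the extension tuple, and the way speaker labels are supplied (a plain dict from file name to speaker label) are all assumptions made for illustration. It also assumes that dict.npy is a Python dict saved with `np.save`, which is a guess at what the loading code expects.

```python
from pathlib import Path


def write_scp(audio_dir, scp_path, extensions=(".wav", ".mp3")):
    """Write one bare file name per line (no directories, extension kept)."""
    names = sorted(
        p.name for p in Path(audio_dir).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
    Path(scp_path).write_text("".join(name + "\n" for name in names))
    return names


def build_speaker_dict(file_to_speaker):
    """Map each file name to an integer class, one class per speaker label."""
    speakers = sorted(set(file_to_speaker.values()))
    speaker_to_id = {spk: idx for idx, spk in enumerate(speakers)}
    return {name: speaker_to_id[spk] for name, spk in file_to_speaker.items()}


def save_speaker_dict(mapping, path="dict.npy"):
    """Save the mapping; read it back with np.load(path, allow_pickle=True).item()."""
    import numpy as np  # imported lazily so the helpers above work without NumPy

    np.save(path, mapping, allow_pickle=True)
```

Integer ids are assigned in sorted order of the speaker labels, so the mapping is reproducible across runs as long as the label set does not change.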

sbatch -A yzren -p gpu --gres=gpu:1 -c 16 preprocess/preprocess.sh

#time: about 50 minutes

Train SSAD encoder

sbatch -A yzren -p gpu --gres=gpu:1 -c 16 train.sh

#time: about 2 days for 100 epochs

Extract features for classifier

sbatch -A yzren -p gpu --gres=gpu:1 -c 16 feature.sh

Train classifier

Run all classifier operations from the ADV directory, and first modify the config file: ADV/_configs/config_LA_SENet12_LPSseg_uf_seg600.json

cd ADV
sbatch -A yzren -p gpu --gres=gpu:1 -c 16 run_train.sh

Evaluation

sbatch -A yzren -p gpu --gres=gpu:1 -c 16 run_eval.sh
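
For readers unfamiliar with the EER metric reported in the results, the sketch below shows one simple way to compute it from two lists of detection scores. It is not the code that run_eval.sh runs; the function name `compute_eer` is hypothetical, and it assumes that a higher score means the trial is more likely bona fide. min-tDCF is not covered here.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Return the equal error rate as a fraction in [0, 1].

    Assumes a higher score means "more likely bona fide".
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("need at least one score per class")
    # Every observed score is a candidate threshold.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    thresholds.append(float("inf"))  # a threshold that rejects everything
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Closest point to FAR == FRR so far; report their average.
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

The function returns a fraction, so an EER of 6.3869 % corresponds to a value of about 0.0639. The loop is quadratic in the number of trials, which is fine for illustration but slow for a full evaluation set.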

Zain-Jiang/SSAD
