This repository contains the code used in our paper:

Self-Supervised Metric Learning With Graph Clustering For Speaker Diarization, Prachi Singh, Sriram Ganapathy
The recipe consists of:
- extracting x-vectors using a pretrained model
- performing self-supervised clustering for diarization on the AMI and DIHARD sets
- evaluating results using the Diarization Error Rate (DER)
The following packages are required to run the baseline.
- Clone the repository:
```
$ git clone https://github.com/iiscleap/SelfSup_PLDA.git
```
- Install Kaldi. If you are a Kaldi novice, please consult the official Kaldi documentation (http://kaldi-asr.org/doc/) for installation instructions.
- Go to the cloned repository and set the Kaldi path in path.sh:
```
$ local_dir="Full_path_of_cloned_repository"
$ echo 'export KALDI_ROOT=/path_of_kaldi_directory/kaldi' >> $local_dir/path.sh
$ echo 'export KALDI_ROOT=/path_of_kaldi_directory/kaldi' >> $local_dir/tools_dir/path.sh
```
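To sanity-check the result, you can source path.sh and confirm that KALDI_ROOT resolves to an existing directory. The snippet below is a self-contained demo that uses a temporary directory in place of an actual Kaldi install:

```shell
# Demo: write a path.sh, source it, and verify KALDI_ROOT exists.
# The temp directory stands in for a real Kaldi installation.
demo=$(mktemp -d)
mkdir -p "$demo/kaldi"
echo "export KALDI_ROOT=$demo/kaldi" > "$demo/path.sh"
. "$demo/path.sh"
[ -d "$KALDI_ROOT" ] && echo "KALDI_ROOT OK"
```

Run the same check against the repository's own path.sh; if it prints nothing, the exported path is wrong.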
- Create softlinks to the necessary Kaldi directories:
```
$ local_dir="Full_path_of_cloned_repository"
$ cd $local_dir/tools_dir
$ . ./path.sh
$ ln -sf $KALDI_ROOT/egs/wsj/s5/utils .  # utils dir
$ ln -sf $KALDI_ROOT/egs/wsj/s5/steps .  # steps dir
```
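If KALDI_ROOT is not set correctly when the links are created, they will dangle silently. The self-contained sketch below (temp dirs standing in for the real Kaldi tree) shows how to verify them:

```shell
# Demo of the softlink step using temp dirs in place of the real Kaldi tree.
demo=$(mktemp -d)
mkdir -p "$demo/kaldi/egs/wsj/s5/utils" "$demo/kaldi/egs/wsj/s5/steps" "$demo/tools_dir"
cd "$demo/tools_dir"
ln -sf "$demo/kaldi/egs/wsj/s5/utils" .
ln -sf "$demo/kaldi/egs/wsj/s5/steps" .
# [ -d ... ] follows the symlink, so it fails if the link target does not exist
[ -d utils ] && [ -d steps ] && echo "links OK"
```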
- Input x-vector features are obtained using the Kaldi ETDNN x-vector model. The pre-trained x-vector model and the PLDA model, including the global mean and PCA transform needed for training, are provided in tools_diar/etdnn_fbank_xvector_models.
- tools_diar/etdnn_fbank_xvector_models/exp/final.raw is not uploaded because of space constraints; please contact us for access.
- Performance is evaluated using dscore. Install all of its required dependencies in the same Python environment.
- This step runs the Kaldi diarization pipeline up to x-vector extraction using the pre-trained model.
- Additionally, it converts the x-vectors from Kaldi ark format to numpy format for use in PyTorch, and converts the Kaldi PLDA model to pickle format.
- Replace "data_root" with the path of the AMI dataset in tools_diar/run_extract_xvectors_ami.sh.
- Run the following commands:
```
$ local_dir="Full_path_of_cloned_repository"
$ cd $local_dir/tools_diar
$ bash run_extract_xvectors_ami.sh
```
- Repeat the same for the DIHARD set using run_extract_xvectors_dihard.sh.
- xvec_SSC_train.py contains the DNN training code.
- run_xvec_ssc_asru.sh calls the DNN training script.
- Update the training parameters in run_xvec_ssc_asru.sh.
NOTE: by default, Kaldi scripts are configured for execution on a grid using a submission engine such as SGE or Slurm. If you are running the recipes on a single machine, make sure to edit cmd.sh and tools_dir/cmd.sh so that the line
```
export train_cmd="queue.pl"
```
reads
```
export train_cmd="run.pl"
```
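If you prefer not to edit by hand, a sed one-liner makes the same change. The demo below works on a temporary copy; point it at cmd.sh and tools_dir/cmd.sh in the real repository (note that BSD/macOS sed needs `-i ''` instead of `-i`):

```shell
# Demo: switch train_cmd from queue.pl to run.pl with sed (GNU sed shown).
demo=$(mktemp -d)
printf 'export train_cmd="queue.pl"\n' > "$demo/cmd.sh"
sed -i 's/queue\.pl/run.pl/g' "$demo/cmd.sh"
cat "$demo/cmd.sh"   # export train_cmd="run.pl"
```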
- Execute the following commands:
```
$ local_dir="Full_path_of_cloned_repository"
$ cd $local_dir
$ bash run_xvec_ssh_ami.sh $local_dir --TYPE parallel --nj <number of jobs> --which_python <python_env_with_all_installed_libraries> # for AMI
$ bash run_xvec_ssh_dihard.sh $local_dir --TYPE parallel --nj <number of jobs> --which_python <python_env_with_all_installed_libraries> # for DIHARD
```
Note: use --TYPE parallel when running multiple jobs simultaneously.
- The Diarization Error Rate (DER) is used as the performance metric.
- The scripts in dscore generate filewise DER.
- Go to the cloned repository and run the following commands for evaluation:
```
$ local_dir="Full_path_of_cloned_repository"
$ cd $local_dir/tools_diar
$ bash gen_rttm.sh --DATA <Ami/Dihard> --stage <1/2> --modelpath <path of model to evaluate> --which_python <python_env_with_all_installed_libraries>
```
Note: --stage 1 uses the ground-truth number of speakers; --stage 2 uses a threshold-based number of clusters.
- This generates der.scp in modelpath, which contains the filewise DER and other metrics such as JER.
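DER totals missed speech, false-alarm speech, and speaker-confusion time over the scored speech. For a quick summary of der.scp from the shell, the filewise scores can be averaged with awk; the two-column "<recording> <DER%>" layout below is an assumption for illustration, so adjust the field index to match the actual file:

```shell
# Toy der.scp with an assumed "<recording> <DER%>" layout.
demo=$(mktemp -d)
cat > "$demo/der_demo.scp" <<'EOF'
ES2011a 4.25
ES2011b 6.10
IS1008a 3.65
EOF
# Unweighted mean of the filewise DER values.
awk '{sum += $2; n++} END {printf "Average DER: %.2f%%\n", sum/n}' "$demo/der_demo.scp"
```

Note that this unweighted mean over files is not the same as a pooled overall DER, which weights each file by its amount of scored speech.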
If you have any comments or questions, please contact prachisingh@iisc.ac.in.
```
@article{singh2021self,
  title={Self-Supervised Metric Learning With Graph Clustering For Speaker Diarization},
  author={Singh, Prachi and Ganapathy, Sriram},
  journal={arXiv preprint arXiv:2109.06824},
  year={2021}
}
```