household_speaker_recognition

WORK IN PROGRESS

Please contact @sholokovalexey or @underdogliu if you have any questions.

This is the baseline system accompanying our Speaker Odyssey paper on household speaker recognition.

The code in this repo covers baseline experiments with a limited number of protocols. It will be refactored and updated incrementally as the research proceeds.

Usage

Pre-requisites

Python 3.8+. We tested the code on Python 3.8 and 3.9.

Run pip install -r requirements.txt to set up the environment. To use a Python virtual environment, see the instructions for Virtualenv or Conda.

Runner

Download the speaker embeddings from this link and store them in ${YOUR_PATH}/embeddings (we plan to move to Git LFS later).
Set the path of the embeddings in config.yaml to ${YOUR_PATH}/embeddings.
Run scripts/run_all.sh to run the experiments across all baseline configurations, covering both active and passive enrollment. Other scripts run individual experiments; see the scripts and the related config files in ./configs for more.

Features

Backend Algorithms

For both active and passive enrollment, we include the following recognition algorithms:

K-means clustering
Variational Bayesian (VB) clustering
Label propagation
Agglomerative hierarchical clustering (AHC)

For details about the backend algorithms we used, please read our paper.
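
As a rough illustration of one of the algorithms listed above, the following is a minimal sketch of average-linkage AHC over speaker embeddings. It is not the code from this repo: the function names (cosine, ahc), the use of cosine similarity, the stop_threshold value, and the toy 2-D embeddings are all assumptions made for the example, and the repo's own implementation may use a different similarity measure.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def ahc(embeddings, stop_threshold=0.5):
    """Average-linkage AHC; returns clusters as lists of utterance indices."""
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(c1, c2):
        # Average pairwise cosine similarity between two clusters.
        sims = [cosine(embeddings[i], embeddings[j]) for i in c1 for j in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        score, a, b = max(
            ((linkage(clusters[a], clusters[b]), a, b)
             for a in range(len(clusters))
             for b in range(a + 1, len(clusters))),
            key=lambda t: t[0],
        )
        # Stop merging once the closest pair is no longer similar enough.
        if score < stop_threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy 2-D "embeddings" (made-up values) from two speakers.
utterances = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
clusters = ahc(utterances, stop_threshold=0.5)
```

With these toy values, utterances 0 and 2 end up in one cluster and utterances 1 and 3 in the other. The stopping threshold controls how many clusters (household members) are found, so in practice it would need to be tuned on held-out data.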

Scoring

We perform centroid-based scoring with a fixed threshold.
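
The idea can be sketched as follows: each enrolled speaker is represented by the mean (centroid) of their embeddings, a test embedding is scored against every centroid, and the best match is accepted only if its score reaches the threshold. This is an illustrative sketch, not the repo's scoring code: the function names (centroid, cosine, identify), the use of cosine similarity as the score, the threshold of 0.5, and the toy data are assumptions.

```python
import math


def centroid(vectors):
    """Element-wise mean of a list of equal-length embeddings."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(test_embedding, enrolled, threshold=0.5):
    """Return (speaker, score); speaker is None if the best score < threshold."""
    centroids = {spk: centroid(embs) for spk, embs in enrolled.items()}
    best_spk, best_score = None, float("-inf")
    for spk, c in centroids.items():
        score = cosine(test_embedding, c)
        if score > best_score:
            best_spk, best_score = spk, score
    # Reject the trial (e.g. a guest speaker) when no centroid is close enough.
    if best_score < threshold:
        return None, best_score
    return best_spk, best_score


# Toy 2-D "embeddings" (made-up values) for a two-member household.
household = {
    "alice": [[1.0, 0.0], [0.9, 0.1]],
    "bob": [[0.0, 1.0], [0.1, 0.9]],
}
known = identify([0.8, 0.2], household)     # close to alice's centroid
guest = identify([-1.0, -1.0], household)   # far from every centroid
```

Here known resolves to "alice", while guest is rejected because its best score falls below the threshold.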

Dataset

We perform training and evaluation on two datasets:

ASVspoof 2019, physical access (PA)
VoxCeleb1

Extension

To extend the toolkit and test new algorithms, see:

models.py - speaker recognition and scoring backend
clustering*.py - the clustering algorithms

Citation

If you use this repo, please cite our work:

@article{alexeyhousehold2022,
  title={Baselines and Protocols for Household Speaker Recognition},
  author={Alexey Sholokhov and Xuechen Liu and Md Sahidullah and Tomi Kinnunen},
  journal={Proc. Speaker Odyssey},
  year={2022}
}
