Source Code for MMM 2021 Paper "Crossed-Time Delay Neural Network for Speaker Recognition"

The main model architecture, CTDNN, is defined in Models/model.py. CTDNN can easily replace the original TDNN in your own model. Note that the default parameters may differ from the model in the paper; you can change the number and dimension of the filters in each layer by following our code.
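When changing the number or size of the filters, it can help to check how many input frames each output frame covers and how many frames survive the stack. The sketch below is not taken from Models/model.py: `TDNNLayer`, `receptive_field`, `output_frames` and `weight_count` are hypothetical helpers, the layer sizes are an illustrative x-vector-style configuration, and the arithmetic assumes each layer behaves like a 1-D convolution over time with stride 1 and no padding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TDNNLayer:
    """Hypothetical description of one time-delay layer (not the repo's class)."""
    context: int   # number of frames the filter spans (kernel size)
    dilation: int  # spacing between those frames
    channels: int  # number of filters, i.e. the output dimension


def receptive_field(layers):
    """Number of input frames that influence a single output frame."""
    return 1 + sum(layer.dilation * (layer.context - 1) for layer in layers)


def output_frames(layers, input_frames):
    """Frames left after all layers, assuming stride 1 and no padding."""
    remaining = input_frames - (receptive_field(layers) - 1)
    if remaining < 1:
        raise ValueError("input is shorter than the receptive field")
    return remaining


def weight_count(layers, input_dim):
    """Weights plus biases, treating each layer as a 1-D convolution."""
    total, in_dim = 0, input_dim
    for layer in layers:
        total += layer.context * in_dim * layer.channels + layer.channels
        in_dim = layer.channels
    return total


# Illustrative configuration only; adjust to match the model you want to build.
example_layers = [
    TDNNLayer(context=5, dilation=1, channels=512),
    TDNNLayer(context=3, dilation=2, channels=512),
    TDNNLayer(context=3, dilation=3, channels=512),
    TDNNLayer(context=1, dilation=1, channels=512),
    TDNNLayer(context=1, dilation=1, channels=1500),
]

if __name__ == "__main__":
    print("receptive field:", receptive_field(example_layers))
    print("frames out of 300:", output_frames(example_layers, 300))
```

Widening a context or adding filters changes both the receptive field and the parameter count, so a quick check like this can catch inputs that are too short before training starts.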

To train the model, first download the VoxCeleb1 dataset or prepare your own speaker identification dataset. Then follow the code under the Preprocessing directory, or use your own methods, to extract MFCC features and generate training/dev examples. After that, you can train the model with the scripts in the Train directory.
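If you generate the training/dev examples with your own method, one simple option is to split the utterances within each speaker, so that every speaker still appears in the training set. The sketch below is an assumption, not the procedure used in the Preprocessing directory: `split_by_speaker`, the 10% dev fraction and the example file names are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_speaker(utterances, dev_fraction=0.1, seed=0):
    """Split (speaker_id, path) pairs into train and dev lists.

    The split is done inside each speaker, and every speaker keeps at
    least one training utterance. The fixed seed makes it reproducible.
    """
    by_speaker = defaultdict(list)
    for speaker, path in utterances:
        by_speaker[speaker].append(path)

    rng = random.Random(seed)
    train, dev = [], []
    for speaker in sorted(by_speaker):
        paths = sorted(by_speaker[speaker])
        rng.shuffle(paths)
        # Aim for dev_fraction of the utterances, but never take them all.
        n_dev = min(len(paths) - 1, max(1, round(len(paths) * dev_fraction)))
        dev.extend((speaker, p) for p in paths[:n_dev])
        train.extend((speaker, p) for p in paths[n_dev:])
    return train, dev


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    examples = [("spk_a", f"spk_a/utt_{i}.wav") for i in range(10)]
    examples.append(("spk_b", "spk_b/utt_0.wav"))
    train_set, dev_set = split_by_speaker(examples)
    print(len(train_set), "training utterances,", len(dev_set), "dev utterances")
```

A speaker with a single utterance contributes nothing to the dev set here, which keeps the classifier from being evaluated on a speaker it never saw during training.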

The dataset preprocessing and training scripts are in the corresponding directories; they should be easy to use after minor changes to the paths.

We recommend that you reuse only the model architecture and handle training with a library such as pytorch-lightning, since the preprocessing and training code was built on an older version of PyTorch and is not very tidy.

If you use this work or code, please kindly cite the following paper:

@inproceedings{chen_crossed-time_2021,
	location = {Cham},
	title = {Crossed-Time Delay Neural Network for Speaker Recognition},
	isbn = {978-3-030-67832-6},
	pages = {1--10},
	booktitle = {{MultiMedia} Modeling},
	publisher = {Springer International Publishing},
	author = {Chen, Liang and Liang, Yanchun and Shi, Xiaohu and Zhou, You and Wu, Chunguo},
	editor = {Lokoč, Jakub and Skopal, Tomáš and Schoeffmann, Klaus and Mezaris, Vasileios and Li, Xirong and Vrochidis, Stefanos and Patras, Ioannis},
	date = {2021}
}
