GitHub - ND15/SiameseSpeech: Tensorflow-2 Implementation of the paper Siamese Capsule Network for End-to-End Speaker Recognition in the Wild

Siamese Capsule Network for End-to-End Speaker Recognition in the Wild

This repository is an implementation of the paper Siamese Capsule Network for End-to-End Speaker Recognition in the Wild (Paper link). It contains only the front-end part of the model; the back-end part still needs to be implemented. I have changed some of the parameters from the paper, including the window length and hop size.

Comments

One drawback of the siamese network is that, for a dataset with N samples, the dataset preprocessor expands the dataset to N x N pairs, which requires more computational power and more training time. In addition, a larger window length increases the dimensions of the spectrograms, so they take up much more space on disk.
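To make the two costs above concrete, here is a minimal sketch in plain Python. It is not the repository's preprocessor: the `(speaker_id, item)` sample layout, the function names, and the window and hop values are all assumptions chosen for illustration, and the shape formula assumes an unpadded STFT whose FFT size equals the window length.

```python
from itertools import product


def make_pairs(samples):
    """Build every ordered (a, b) pair with a same-speaker label.

    `samples` is a list of (speaker_id, item) tuples. This layout is an
    assumption for illustration, not the repository's actual format.
    """
    pairs = []
    for (spk_a, item_a), (spk_b, item_b) in product(samples, repeat=2):
        label = 1 if spk_a == spk_b else 0  # 1 = same speaker, 0 = different
        pairs.append((item_a, item_b, label))
    return pairs


def spectrogram_shape(num_samples, win_length, hop_length):
    """Return (frames, frequency_bins) for an unpadded magnitude STFT,
    assuming the FFT size equals the window length."""
    bins = win_length // 2 + 1
    if num_samples < win_length:
        return (0, bins)
    frames = 1 + (num_samples - win_length) // hop_length
    return (frames, bins)


# Three samples give 3 x 3 = 9 pairs.
samples = [("spk1", "a.wav"), ("spk1", "b.wav"), ("spk2", "c.wav")]
pairs = make_pairs(samples)
print(len(pairs))

# 3 seconds of 16 kHz audio with two hypothetical window lengths.
short = spectrogram_shape(48000, win_length=400, hop_length=160)
long = spectrogram_shape(48000, win_length=1024, hop_length=160)
print(short, long)
```

With the hop size fixed, the longer window yields slightly fewer frames but many more frequency bins, so each stored spectrogram is larger, and the pairing step multiplies that cost across N x N pairs.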

In my implementation I used a customised version of the VoxCeleb dataset. It contains only recordings of Indian celebrities and, for ease of implementation, I took only 25-30 recordings per speaker.
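One way to cap the number of recordings per speaker is sketched below. How the subset was actually chosen is not stated here, so the random selection, the `(speaker_id, path)` layout, the file paths, and the `subsample_per_speaker` helper are all hypothetical.

```python
import random
from collections import defaultdict


def subsample_per_speaker(recordings, max_per_speaker=30, seed=0):
    """Keep at most `max_per_speaker` recordings for each speaker.

    `recordings` is a list of (speaker_id, path) tuples (assumed layout).
    Selection is random here, which is an assumption, not the author's method.
    """
    by_speaker = defaultdict(list)
    for speaker_id, path in recordings:
        by_speaker[speaker_id].append(path)

    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    subset = {}
    for speaker_id, paths in by_speaker.items():
        if len(paths) > max_per_speaker:
            paths = rng.sample(paths, max_per_speaker)
        subset[speaker_id] = sorted(paths)
    return subset


# Hypothetical file names: one speaker with 50 recordings, one with 10.
recordings = [("id1", f"id1/{i:03d}.wav") for i in range(50)]
recordings += [("id2", f"id2/{i:03d}.wav") for i in range(10)]

subset = subsample_per_speaker(recordings, max_per_speaker=30)
print({speaker: len(paths) for speaker, paths in subset.items()})
```

Capping the per-speaker count this way also bounds the N x N pair count, since N is at most the number of speakers times the cap.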

Updates

Speaker-Verification: a repo for DNN-based speaker recognition and verification.

Links

Dataset

Repository contents (16 commits)

models
testing
utils
.gitignore
README.md
