A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A PyTorch-based Speech Toolkit
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
🔈 Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
In defence of metric learning for speaker recognition
SincNet is a neural architecture for efficiently processing raw audio samples.
Speaker diarization with UIS-RNN and speaker embeddings from vgg-speaker-recognition
This project uses a variety of advanced voiceprint recognition models, such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when trained only on Vox2)
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Angular penalty loss functions in PyTorch (ArcFace, SphereFace, Additive Margin, CosFace)
Speech recognition based on MFCC and Gaussian mixture models (GMM)
Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)
Speaker Identification System (up to 100% accuracy); built using Python 2.7 and the python_speech_features library
Identifying people from small audio fragments
Voiceprint recognition implemented with TensorFlow
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
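Simple d-vector based Speaker Recognition (verification and identification) using PyTorch
This project uses a variety of advanced voiceprint recognition models, such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and also supports multiple data preprocessing methods, including MelSpectrogram, Spectrogram, MFCC, and Fbank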
[InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei Zha, Zhangyang Wang