Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
SincNet is a neural architecture for efficiently processing raw audio samples.
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
A Fully Native Solution with Swift and CoreML Models Offering Speaker Diarization, VAD, and Speech-to-Text.
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems, including speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Simple d-vector based Speaker Recognition (verification and identification) using PyTorch
Speaker Identification System (up to 100% accuracy); built using Python 2.7 and the python_speech_features library
Identifying people from small audio fragments
Deep Learning - one shot learning for speaker recognition using Filter Banks
Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning"
Lightweight neural speaker embedding extraction based on Kaldi and PyTorch.
[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
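Building the simplest collection of TTS front ends and the simplest audiobook production workflow. Novels are split into sentences with regex rules, and the speakers of dialogue lines are identified with RoBERTa, enabling one-click generation of multi-speaker audiobooks. Multi-speaker speech synthesis for high-quality audiobook production.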
This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1
Source code for paper "Who is real Bob? Adversarial Attacks on Speaker Recognition Systems" (IEEE S&P 2021)
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification"
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
PyTorch implementation of Generalized End-to-End Loss for speaker verification
A tool for summarizing dialogues from videos or audio