A PyTorch-based Speech Toolkit
Updated May 14, 2024 - Python
Foundation Architecture for (M)LLMs
WaveNet vocoder
PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
SincNet is a neural architecture for efficiently processing raw audio samples.
AI-powered speech denoising and enhancement
General Speech Restoration
A neural network for end-to-end speech denoising
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX, and real-time audio processing support.
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.
Problem Agnostic Speech Encoder
A Python wrapper for the Speech Signal Processing Toolkit (SPTK).
🔉 spafe: Simplified Python Audio Features Extraction
Novoic's audio feature extraction library
An implementation of "Neural Voice Cloning With Few Samples"
UniSpeech - Large Scale Self-Supervised Learning for Speech
Library to build speech synthesis systems designed for easy and fast prototyping.
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network