View fifilyu's full-sized avatar

End-to-End Speech Processing Toolkit

Python 8,932 2,238 Updated Mar 27, 2025

Instant voice cloning by MIT and MyShell. Audio foundation model.

Python 31,537 3,195 Updated Jan 7, 2025

Chinese speech pretrained models

Shell 1,092 90 Updated Aug 23, 2024

Faster Whisper transcription with CTranslate2

Python 15,090 1,269 Updated Mar 20, 2025

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 37,329 4,418 Updated Aug 19, 2024

Easily train a good VC model with voice data <= 10 mins!

Python 28,272 3,997 Updated Nov 24, 2024

🌐 The Internet OS! Free, Open-Source, and Self-Hostable.

JavaScript 29,686 2,197 Updated Mar 29, 2025

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 7,136 854 Updated Mar 28, 2025

kaldi-asr/kaldi is the official location of the Kaldi project.

Shell 14,706 5,345 Updated Jan 28, 2025

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 9,215 938 Updated Mar 28, 2025

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Python 3,806 318 Updated Jan 8, 2025

A PyTorch-based Speech Toolkit

Python 9,597 1,458 Updated Mar 29, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 13,487 2,760 Updated Mar 30, 2025

[WIP] Layer Diffusion for WebUI (via Forge)

Python 4,005 344 Updated Aug 30, 2024

WebUI extension for ControlNet

Python 17,478 2,003 Updated Aug 12, 2024

A machine learning image inpainting task that removes watermarks from images, producing output indistinguishable from the ground-truth image

Python 3,220 392 Updated Aug 16, 2024

GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code

Python 1,673 241 Updated Oct 18, 2024

MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

Python 3,814 479 Updated Mar 28, 2025

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 8,748 1,149 Updated Apr 24, 2024

Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python package, using a variety of amazing pre-trained models (primarily from UVR)

Python 693 113 Updated Mar 28, 2025

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs.

Python 11,635 2,448 Updated Feb 10, 2025

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 20,992 2,612 Updated Mar 4, 2025

We provide a PyTorch implementation of the paper "Voice Separation with an Unknown Number of Multiple Speakers", in which we present a new method for separating a mixed audio sequence, in which multi…

Python 1,279 186 Updated Nov 16, 2023

GUI for a Vocal Remover that uses Deep Neural Networks.

Python 20,021 1,469 Updated Mar 13, 2025

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 38,922 4,897 Updated Aug 16, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 43,347 4,821 Updated Mar 26, 2025

so-vits-svc fork with realtime support, improved interface and more features.

Python 8,948 1,190 Updated Mar 16, 2025

Robust Speech Recognition via Large-Scale Weak Supervision

Python 79,114 9,499 Updated Jan 4, 2025

High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model

C++ 9,103 772 Updated Aug 3, 2024

Stable Diffusion web UI

Python 150,205 27,988 Updated Mar 4, 2025