Stars
Workflow orchestration UI and node editor for your own Python codebase
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
C++ library for converting text to phonemes for Piper
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
I've been trying quite hard to use the IVONA Amy voice on Linux natively, from trying to reverse engineer the APKs and DLL files, to hacking Waydroid to be compatible, and port forwarding/SSH via …
All Algorithms implemented in Python
A Jupyter widgets-based interactive notebook for Google Colab to generate images using Stable Diffusion.
fMRI-to-image reconstruction on the NSD dataset.
A Colab notebook repo for using the Diffusers library (not a web UI)
Versatile audio super resolution (any -> 48kHz) with AudioSR.
Godot Engine – Multi-platform 2D and 3D game engine
VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs.
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
A curated list of open source projects used in nuclear science and engineering
Demo Programs for the "Talking Head(?) Anime from a Single Image 3: Now the Body Too" Project
An MIT-licensed, deployable starter kit for building and customizing your own version of AI town - a virtual town where AI characters live, chat and socialize.
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
A multi document reader and chatbot using LangChain and ChatGPT
Template for duplicating and executing Hugging Face Spaces either on SM Studio Lab, Google Colab, or locally.
📚 A collection of sketch-based application papers.
Panel: The powerful data exploration & web app framework for Python
A list of awesome beginner-friendly projects.
TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5, F5-TTS, ParlerTTS)
A timeline of the latest AI models for audio generation, starting in 2023!
Finetuning VITS Efficiently