wangtianrui

Organizations

@android-nuc

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Python 1,183 129 Updated Sep 23, 2025
Python 627 51 Updated Sep 20, 2025

Long-form streaming TTS system for multi-speaker dialogue generation

Python 660 72 Updated Sep 17, 2025

A ComfyUI custom node integration for multi-engine multi-language Text-to-Speech and Voice Conversion. Supports: RVC, IndexTTS-2, Chatterbox (classic and multilingual 23-lang), F5-TTS, Higgs Audio …

Python 263 20 Updated Sep 22, 2025

Frontier Open-Source Text-to-Speech

9,161 1,095 Updated Sep 5, 2025

A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.

Python 98 6 Updated Jun 4, 2025
Python 434 31 Updated Aug 28, 2025

Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

TypeScript 139,963 44,619 Updated Sep 23, 2025

A Unified Framework for Expressive Speech Synthesis with Voice Cloning

Python 372 31 Updated Aug 18, 2025

Official Repository of Paper: "Emilia-NV: A Non-Verbal Speech Dataset with Word-Level Annotation for Human-Like Speech Modeling"

42 1 Updated Sep 18, 2025

MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows

Python 92 6 Updated Sep 2, 2025

Ultra-low bitrate speech codec (0.27-1 kbps) with cross-modal alignment and real-time capabilities

Python 65 6 Updated Aug 27, 2025

Text-audio foundation model from Boson AI

Python 104 21 Updated Sep 4, 2025

This is the official implementation of ASDA: Audio Spectrogram Differential Attention Mechanism for Self-Supervised Representation Learning

Python 3 Updated Aug 7, 2025

Inworld TTS

Python 494 40 Updated Sep 19, 2025

Text-audio foundation model from Boson AI

Python 7,338 526 Updated Sep 15, 2025

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.

Python 1,099 76 Updated Sep 22, 2025

A Foundation Model for Industrial Signal Comprehensive Representation

Python 40 5 Updated Aug 8, 2025

A toolkit for processing speech data and creating speech datasets

Python 168 34 Updated Aug 22, 2025

PyTorch implementation of MeanFlow on ImageNet and CIFAR10

Python 251 13 Updated Aug 23, 2025

SoftVC VITS Singing Voice Conversion

Python 27,629 5,050 Updated Nov 11, 2023

PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models

741 56 Updated Sep 16, 2025

Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.

Python 2,382 230 Updated Sep 22, 2025

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation

Python 52 8 Updated Jul 23, 2025

[NeurIPS 2025] PyTorch implementation of [ThinkSound], a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.

Python 1,027 57 Updated Sep 19, 2025

This is the code for paper: XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs. Demos, technical insights and experimental results are presented on

Python 72 4 Updated Sep 19, 2025

A cross-platform bilibili toolbox that supports downloading videos, anime series, and other resources.

Rust 3,658 246 Updated Sep 23, 2025

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Python 636 80 Updated Sep 17, 2025

Open-source unified multimodal model

Python 5,060 447 Updated Aug 22, 2025

Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)"

Python 37 5 Updated Jun 28, 2025