wangtianrui

Tianrui Wang (王天锐) wangtianrui

242 followers · 299 following

https://scholar.google.com/citations?user=eGqo7CYAAAAJ&hl=en&oi=ao

Stars

OpenBMB / VoxCPM

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Python 1,183 129 Updated Sep 23, 2025

XiaomiMiMo / MiMo-Audio

Python 627 51 Updated Sep 20, 2025

FireRedTeam / FireRedTTS2

Long-form streaming TTS system for multi-speaker dialogue generation

Python 660 72 Updated Sep 17, 2025

diodiogod / TTS-Audio-Suite

A ComfyUI custom node integration for multi-engine multi-language Text-to-Speech and Voice Conversion. Supports: RVC, IndexTTS-2, Chatterbox (classic and multilingual 23-lang), F5-TTS, Higgs Audio …

Python 263 20 Updated Sep 22, 2025

microsoft / VibeVoice

Frontier Open-Source Text-to-Speech

9,161 1,095 Updated Sep 5, 2025

Ereboas / MagiCodec

A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.

Python 98 6 Updated Jun 4, 2025

mingyin0312 / RLFromScratch

Python 434 31 Updated Aug 28, 2025

n8n-io / n8n

Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

TypeScript 139,963 44,619 Updated Sep 23, 2025

AIDC-AI / Marco-Voice

A Unified Framework for Expressive Speech Synthesis with Voice Cloning

Python 372 31 Updated Aug 18, 2025

Hannieliao / Emilia-NV

Official Repository of Paper: "Emilia-NV: A Non-Verbal Speech Dataset with Word-Level Annotation for Human-Like Speech Modeling"

42 1 Updated Sep 18, 2025

xiquan-li / MeanAudio

MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows

Python 92 6 Updated Sep 2, 2025

QiangChunyu / SecoustiCodec

Ultra-low bitrate speech codec (0.27-1 kbps) with cross-modal alignment and real-time capabilities

Python 65 6 Updated Aug 27, 2025

JimmyMa99 / train-higgs-audio

Forked from boson-ai/higgs-audio

Text-audio foundation model from Boson AI

Python 104 21 Updated Sep 4, 2025

MyParadise21 / ASDA

This is the official implementation of ASDA: Audio Spectrogram Differential Attention Mechanism for Self-Supervised Representation Learning

Python 3 Updated Aug 7, 2025

inworld-ai / tts

Inworld TTS

Python 494 40 Updated Sep 19, 2025

boson-ai / higgs-audio

Text-audio foundation model from Boson AI

Python 7,338 526 Updated Sep 15, 2025

stepfun-ai / Step-Audio2

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.

Python 1,099 76 Updated Sep 22, 2025

jianganbai / FISHER

A Foundation Model for Industrial Signal Comprehensive Representation

Python 40 5 Updated Aug 8, 2025

NVIDIA / NeMo-speech-data-processor

A toolkit for processing speech data and creating speech datasets

Python 168 34 Updated Aug 22, 2025

zhuyu-cs / MeanFlow

Pytorch implementation of MeanFlow on ImageNet and CIFAR10

Python 251 13 Updated Aug 23, 2025

svc-develop-team / so-vits-svc

SoftVC VITS Singing Voice Conversion

Python 27,629 5,050 Updated Nov 11, 2023

NVIDIA / audio-flamingo

PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models

741 56 Updated Sep 16, 2025

kyutai-labs / delayed-streams-modeling

Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.

Python 2,382 230 Updated Sep 22, 2025

gwx314 / STARS

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation

Python 52 8 Updated Jul 23, 2025

FunAudioLLM / ThinkSound

[NeurIPS 2025] PyTorch implementation of [ThinkSound], a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.

Python 1,027 57 Updated Sep 19, 2025

gyt1145028706 / XY-Tokenizer

This is the code for paper: XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs. Demos, technical insights and experimental results are presented on

Python 72 4 Updated Sep 19, 2025

btjawa / BiliTools

A cross-platform bilibili toolbox that supports downloading videos, anime series, and many other kinds of resources.

Rust 3,658 246 Updated Sep 23, 2025

k2-fsa / ZipVoice

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Python 636 80 Updated Sep 17, 2025

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 5,060 447 Updated Aug 22, 2025

Shy-98 / MELLE

Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)"

Python 37 5 Updated Jun 28, 2025
