Starred repositories (ABC0408)

A Conversational Speech Generation Model

Python 11,853 990 Updated Mar 27, 2025

Web EPUB and PDF text-to-speech document reader. Read documents in real time with high-quality TTS, or extract audiobooks. Use your own Kokoro TTS API or OpenAI API endpoint.

TypeScript 106 15 Updated Mar 24, 2025

LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]

Python 313 38 Updated Apr 8, 2024

Run Orpheus 3B Locally With LM Studio

Python 299 53 Updated Mar 20, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 42,931 6,515 Updated Mar 28, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 2,507 260 Updated Mar 27, 2025

TTS Towards Human-Sounding Speech

Python 3,091 221 Updated Mar 27, 2025

The official Soundwave repository

Python 181 19 Updated Mar 16, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 45,521 5,564 Updated Mar 27, 2025

LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement

Python 65 14 Updated Mar 9, 2025

Sylber: Syllabic Embedding Representation of Speech from Raw Audio

Jupyter Notebook 48 2 Updated Mar 17, 2025

Transcribe audio and video files with speaker diarization and logically grouped timestamps

Svelte 10 1 Updated Mar 12, 2025

OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.

Python 347 21 Updated Mar 18, 2025

Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

Python 9,298 1,157 Updated Mar 24, 2025

An invisible desktop application to help you pass your technical interviews.

2,456 509 Updated Mar 20, 2025

💬 MaskLID: Code-Switching Language Identification through Iterative Masking -- ACL 2024

Python 8 1 Updated Jun 11, 2024

Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)

Python 361 48 Updated Nov 7, 2023

The first Large Audio Language Model that enables native in-depth thinking, which is trained on large-scale audio Chain-of-Thought data.

Python 202 19 Updated Mar 17, 2025

Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion

Python 1,331 126 Updated Mar 23, 2025

F5-TTS inference acceleration, roughly 4x faster!

Python 64 10 Updated Jan 6, 2025

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 384 22 Updated Mar 27, 2025

The python library for real-time communication

JavaScript 3,325 286 Updated Mar 27, 2025

Spark-TTS Inference Code

Python 6,852 710 Updated Mar 21, 2025

Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training".

Python 358 23 Updated Apr 20, 2024

[CVPR 2025] MatAnyone: Stable Video Matting with Consistent Memory Propagation

Python 888 44 Updated Mar 27, 2025
Python 209 17 Updated Mar 18, 2025

Using whisper and spleeter to sync lyrics with song timeline

Python 4 Updated Jul 25, 2024
Python 4,081 327 Updated Mar 12, 2025

A fast inference library for running LLMs locally on modern consumer-class GPUs

Python 4,077 305 Updated Mar 15, 2025

🧠+🎧 Build your music algorithms and AI models with the next-gen DAW 🔥

Python 824 95 Updated Jun 6, 2023