Stars
SGLang is a fast serving framework for large language models and vision language models.
Integrate the DeepSeek API into popular software
Fully open reproduction of DeepSeek-R1
A debugging and profiling tool that can trace and visualize Python code execution
LLMPerf is a library for validating and benchmarking LLMs
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.
Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯
Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC…
🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
Fast and memory-efficient exact attention
A modern GUI client based on Tauri, designed to run on Windows, macOS, and Linux for a tailored proxy experience
A collection of (mostly) technical things every software developer should know about
Code for explaining and evaluating late chunking (chunked pooling)
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
A simple, easy-to-hack GraphRAG implementation
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Official PyTorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.pdf and PV-Tuning: Beyond Straight-Through Estimation for Ext…
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Speech To Speech: an effort for an open-source and modular GPT-4o