AI
🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time
Just 1 minute of voice data can also be used to train a good TTS model! (few-shot voice cloning)
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
🚀🚀 "Large model": train a 26M-parameter GPT completely from scratch in just 2 hours! 🌏
Multilingual Voice Understanding Model
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Production-ready platform for agentic workflow development.
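A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are inexpensive to train, including base models, domain-specific fine-tuning and applications, datasets, and tutorials.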
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
ChatDev 2.0: Dev All through LLM-powered Multi-Agent Collaboration
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
A modular graph-based Retrieval-Augmented Generation (RAG) system
🚀 Truly open-source AI avatar (digital human) toolkit for offline video generation and digital human cloning.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Let's make video diffusion practical!
The ultimate LLM/AI application development framework in Go.
A Next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…
Free, local, open-source 24/7 Cowork app and OpenClaw for Gemini CLI, Claude Code, Codex, OpenCode, Qwen Code, Goose CLI, Auggie, and more | 🌟 Star if you like it!
