Youtu-RAG is a next-generation agentic retrieval-augmented generation system built on the "Local Deployment · Autonomous Decision · Memory-Driven" paradigm. It combines autonomous decision-making with memory-based learning and targets personal, locally hosted knowledge base management and Q&A.
Core Concepts:
- Local Deployment: All components can be deployed locally, so data stays within your own domain. MinIO object storage is integrated to manage large volumes of files locally.
- Autonomous Decision: Agents autonomously determine whether to retrieve, how to retrieve, and when to call memory, selecting optimal strategies based on question types and historical experience.
- Memory-Driven: Dual-layer memory mechanism (short-term conversational memory + long-term knowledge accumulation) enables continuous learning and self-evolution of Q&A experience.
Traditional RAG systems follow a fixed pipeline of "offline chunking - vector retrieval - concatenation generation" and face core bottlenecks such as privacy risks, memory loss, and rigid retrieval. Youtu-RAG aims to upgrade the system from a passive retrieval tool to an intelligent retrieval-augmented generation system with autonomous decision and memory learning capabilities.
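For contrast, the fixed pipeline described above can be sketched in a few lines of Python. This is a toy illustration, not code from Youtu-RAG: the function names (`chunk`, `embed`, `retrieve`, `build_prompt`) are hypothetical, and a word-count vector stands in for a real embedding model.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def chunk(text: str, size: int = 12) -> list[str]:
    """Offline chunking: split the text into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Vector retrieval: always return the top_k most similar chunks."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]


def build_prompt(question: str, context: list[str]) -> str:
    """Concatenation: paste the retrieved chunks in front of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {question}"


doc = (
    "MinIO stores uploaded files locally. The reranker improves retrieval "
    "accuracy. Memory can be enabled in the chat window."
)
question = "Where are uploaded files stored?"
print(build_prompt(question, retrieve(question, chunk(doc, size=6), top_k=1)))
```

Every question goes through the same three steps with the same `top_k`, regardless of its type. That rigidity is what an agentic system replaces with a per-question decision.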
- File-based knowledge organization, supporting multi-source heterogeneous data including PDF, Excel, images, and databases.
- Autonomous decision-making on the optimal retrieval strategy, supporting a variety of tool calls such as web search, vector retrieval, metadata filtering, database queries, and code execution.
- Short-term conversational memory plus long-term cross-session knowledge accumulation, enabling learning from Q&A experience.
- From simple conversations to complex orchestrations, covering a wide range of application-level scenarios. Supports over 8 AI agents, including Web Search, KB Search, Meta Retrieval, Excel Agent, and Text2SQL.
- Pure native HTML + CSS + JavaScript implementation with no framework. Supports file upload, knowledge base management, AI dialogue, and document preview.
- All related components support local deployment, so data stays within your own domain. MinIO object storage is integrated to manage large volumes of files locally.
- Access the frontend interface at http://localhost:8000
- Click "File Management" in the left sidebar
- Click "Upload File"
- Depending on the file type and the file management configuration, files are processed through different paths and generate previewable content
| Demo | Notes | Video |
|---|---|---|
| File Upload Example | Automatic metadata extraction and summary generation | 01_upload_file.mp4 |
| PDF File Post-Processing Preview | Requires OCR configuration support | 02_pdf_file_example.mp4 |
| PNG File Post-Processing Preview | Requires OCR configuration support | 03_png_file_example.mp4 |
| HiChunk Parsing Preview | Requires HiChunk configuration support | 04_hichunk_example.mp4 |
When the OCR and HiChunk configurations are enabled, the parsing phase of document upload takes additional time. Single-file import is recommended for such files, since batch import results in longer waiting times.
| Demo | Notes | Video |
|---|---|---|
| File Batch Deletion and Upload | It is recommended to batch import files of the same type at once | 06_batch_delete_upload.mp4 |
| File Metadata Batch Editing | Supports batch export, editing, and import | 05_metadata_export_import.mp4 |
| File Search | Supports filename, metadata, summary, etc. | 07_file_search.mp4 |
- Access the frontend interface at http://localhost:8000
- Click "Knowledge Base" in the left sidebar
- Click the "Create Knowledge Base" button
- Fill in the knowledge base name (e.g., Technical Documentation)
- Click confirm to create
| Demo | Notes | Video |
|---|---|---|
| Knowledge Base Creation and Deletion | Only supports single knowledge base operation | 08_kb_create_delete.mp4 |
| Knowledge Base Search | Supports search by knowledge base name and description | 09_kb_search.mp4 |
- File Association: Associate uploaded files to knowledge base
- Database Association: Associate local database to knowledge base
- Example Association: Associate example Q&A pairs to knowledge base (as experience information)
💡 Tip: After completing each association configuration, click the Save Association button to save it. Otherwise, your previous selections will be lost.
| Demo | Notes | Video |
|---|---|---|
| File Association | Multiple files can be selected for association at once | 10_kb_file_association.mp4 |
| Database Association | Supports SQLite and MySQL | 11_kb_db_association.mp4 |
| Example Association | Supports association of example Q&A pairs | 12_kb_qa_example.mp4 |
| Demo | Notes | Video |
|---|---|---|
| Knowledge Base Configuration View | View association configuration and construction configuration | 13_kb_config_show.mp4 |
| Knowledge Base Vectorization Construction | Unified construction of different types of associated content | 14_kb_build.mp4 |
| Knowledge Base Association Editing | Supports editing and updating of associated content | 15_kb_modify.mp4 |
- You can select a configured Agent for each task to conduct dialogues or Q&A:
  - Some agents can only be used after a knowledge base or file has been selected
  - A temporary file upload button is provided for Q&A on ad hoc files. Such a file is only associated automatically with the current knowledge base and does not undergo vector construction
- In the frontend dialogue interface, turn on the "Memory" switch in the lower right corner to enable the dual-layer memory mechanism. With memory enabled, the Agent has:
  - Short-Term Memory: Remembers the conversation context to avoid repeated questioning
  - Long-Term Memory: Accumulates successful experiences and prioritizes reusing them when similar questions come up again
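The two layers can be pictured with a small Python sketch. This is an illustration under stated assumptions, not Youtu-RAG's implementation: the class `DualLayerMemory`, its methods, and the string-similarity lookup (via `difflib`) are all hypothetical stand-ins for whatever the project actually uses.

```python
from __future__ import annotations

from collections import deque
from difflib import SequenceMatcher
from typing import Optional


class DualLayerMemory:
    """Hypothetical sketch: a short-term context window plus a long-term experience store."""

    def __init__(self, short_term_turns: int = 4, reuse_threshold: float = 0.8):
        # Short-term layer: only the most recent turns of the current conversation.
        self.short_term: deque = deque(maxlen=short_term_turns)
        # Long-term layer: successful (question, answer) pairs kept across sessions.
        self.long_term: list = []
        self.reuse_threshold = reuse_threshold

    def remember_turn(self, question: str, answer: str, success: bool = False) -> None:
        """Record a turn; promote it to long-term memory if it was marked successful."""
        self.short_term.append((question, answer))
        if success:
            self.long_term.append((question, answer))

    def recall_experience(self, question: str) -> Optional[str]:
        """Return the answer of the most similar past question, if it is similar enough."""
        best_answer, best_score = None, 0.0
        for past_question, past_answer in self.long_term:
            score = SequenceMatcher(None, question.lower(), past_question.lower()).ratio()
            if score > best_score:
                best_answer, best_score = past_answer, score
        return best_answer if best_score >= self.reuse_threshold else None

    def new_session(self) -> None:
        """Starting a new session clears short-term memory but keeps long-term memory."""
        self.short_term.clear()


memory = DualLayerMemory()
memory.remember_turn("How many orders were placed in 2023?", "SELECT COUNT(*) FROM orders WHERE year = 2023", success=True)
memory.new_session()
print(memory.recall_experience("How many orders were placed in 2024?"))
```

The point of the split is lifetime: the short-term layer is bounded and reset per conversation, while the long-term layer survives sessions and is consulted before answering a similar question again.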
| Agent / Feature | Video |
|---|---|
| 💬 Chat Agent | 16_chat_agent.mp4 |
| 🔍 Web Search Agent | 17_websearch_agent.mp4 |
| 📚 KB Search Agent | 18_kbsearch_agent.mp4 |
| 📚 Meta Retrieval Agent | 19_meta_retrieval_agent.mp4 |
| 📄 File QA Agent | 20_file_qa_agent.mp4 |
| 📊 Excel Agent | 22_excel_agent.mp4 |
| 💻 Text2SQL Agent | 21_text2sql_agent.mp4 |
| 🧠 Short and Long-Term Memory | 23_memory_chat_2.mp4 |
| 🧐 Text2SQL Agent with Memory | 25_text2sql_memory.mp4 |
| 🎯 QA Learning | 26_qa_learning.mp4 |
- Python: 3.12+
- Package Manager: Recommended to use uv
- Operating System: Linux Desktop / macOS / Windows
MinIO is a high-performance object storage service used to store uploaded document files (still locally managed).
For installation instructions, please refer to the official MinIO repository. Two installation methods are supported:
- Install from Source: Build and install MinIO from source code
- Build Docker Image: Deploy MinIO using Docker containers
| Model | HuggingFace | Deployment Method | Required |
|---|---|---|---|
| Youtu-Embedding | HuggingFace | Deployment Docs | ✅ Required, or other Embedding API services |
| Youtu-Parsing | HuggingFace | Deployment Docs | ⭕ Optional |
| Youtu-HiChunk | HuggingFace | Deployment Docs | ⭕ Optional |
```bash
git clone https://github.com/TencentCloudADP/youtu-rag.git
cd youtu-rag
uv sync
source .venv/bin/activate
cp .env.example .env
```

Edit the `.env` file and fill in the following core configurations:
```bash
# =============================================
# LLM Configuration (Required)
# =============================================
UTU_LLM_TYPE=chat.completions
UTU_LLM_MODEL=deepseek-chat
UTU_LLM_BASE_URL=https://api.deepseek.com/v1
UTU_LLM_API_KEY=your_deepseek_api_key  # Replace with your API Key

# =============================================
# Embedding Configuration (Required)
# =============================================
# Option 1: Local Service (Youtu-Embedding-2B)
UTU_EMBEDDING_URL=http://localhost:8081
UTU_EMBEDDING_MODEL=youtu-embedding-2B

# Option 2: Other Embedding API Services
# UTU_EMBEDDING_URL=https://api.your-embedding-service.com
# UTU_EMBEDDING_API_KEY=your_api_key
# UTU_EMBEDDING_MODEL=model_name

# =============================================
# Reranker Configuration (Optional, improves retrieval accuracy)
# =============================================
UTU_RERANKER_MODEL=jina-reranker-v3
UTU_RERANKER_URL=https://api.jina.ai/v1/rerank
UTU_RERANKER_API_KEY=your_jina_api_key

# =============================================
# OCR Configuration (Optional, locally deployable Youtu-Parsing)
# =============================================
UTU_OCR_BASE_URL=https://api.ocr.com/ocr
UTU_OCR_MODEL=youtu-ocr

# =============================================
# Chunk Configuration (Optional, locally deployable Youtu-HiChunk)
# =============================================
UTU_CHUNK_BASE_URL=https://api.hichunk.com/chunk
UTU_CHUNK_MODEL=hichunk

# =============================================
# Memory Function (Optional)
# =============================================
memoryEnabled=false  # Set to true to enable dual-layer memory mechanism
```

Note: If you don't need the OCR and Chunk features, you can disable them by setting `ocr enabled: false` and `chunk enabled: false` in `configs/rag/file_management.yaml`.
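Before starting the service, it can help to confirm that the settings marked as required above are actually filled in. The following is a minimal sketch, not part of Youtu-RAG: `parse_env`, `missing_keys`, and the `REQUIRED_KEYS` list are hypothetical helpers, the list assumes the local embedding option (Option 1), and the parser is a simplified reader that handles only plain `KEY=value` lines with optional trailing `# comments`.

```python
from __future__ import annotations

from pathlib import Path

# Assumption: these are the keys listed under the "(Required)" headings above,
# using the local embedding service (Option 1).
REQUIRED_KEYS = [
    "UTU_LLM_TYPE",
    "UTU_LLM_MODEL",
    "UTU_LLM_BASE_URL",
    "UTU_LLM_API_KEY",
    "UTU_EMBEDDING_URL",
    "UTU_EMBEDDING_MODEL",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and full-line comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "value  # note".
        values[key.strip()] = value.split(" #", 1)[0].strip()
    return values


def missing_keys(values: dict[str, str], required: list[str] = REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty, in declaration order."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        missing = missing_keys(parse_env(env_path.read_text(encoding="utf-8")))
        print("Missing required settings:", ", ".join(missing) if missing else "none")
```

Run it from the repository root after editing `.env`. It only checks presence, so a placeholder such as `your_deepseek_api_key` still passes.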
```bash
# Method 1: Using startup script (Recommended)
bash start.sh

# Method 2: Directly using uvicorn
uv run uvicorn utu.rag.api.main:app --reload --host 0.0.0.0 --port 8000
```

After successful startup, access the following addresses:
- 📱 Frontend Interface: http://localhost:8000
- 📊 Monitoring Dashboard: http://localhost:8000/monitor
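To confirm from a script that both addresses respond, a small standard-library check such as the one below may be enough. It is a sketch only: `is_reachable` is a hypothetical helper, and it treats any 2xx or 3xx response as "up" without inspecting the page content.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, and HTTP error statuses (4xx/5xx).
        return False


if __name__ == "__main__":
    for address in ("http://localhost:8000", "http://localhost:8000/monitor"):
        print(address, "reachable" if is_reachable(address) else "not reachable")
```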
Youtu-RAG provides a complete evaluation system, supporting multi-dimensional capability verification.
- Capability: Natural language to SQL, Schema understanding, SQL execution
- Dataset: Self-built Text2SQL dataset (Multi-table, Complex excel, Domain table)
- Metric: Accuracy (LLM Judge)
| Category | Dataset | Multi-table-mini | Complex Excel | Multi-table | Domain Table |
|---|---|---|---|---|---|
| Dataset Overview | Data Volume | 245 | 931 | 1,390 | 100 |
| | Type | Multi-table | Complex Questions | Multi-table Full | Domain Knowledge |
| Baseline | Vanna | 45.71% | 38.64% | 35.11% | 9.00% |
| 🎯 Youtu-RAG | Text2SQL Agent | 69.39% ↑ | 57.36% ↑ | 67.27% ↑ | 27.00% ↑ |
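The "schema understanding" and "SQL execution" capabilities listed above can be illustrated with SQLite from the Python standard library. This is a generic sketch, not the Text2SQL Agent's code: `describe_schema` and `run_readonly` are hypothetical helpers, and the generated SQL is hard-coded where a model's output would normally go.

```python
from __future__ import annotations

import sqlite3


def describe_schema(conn: sqlite3.Connection) -> str:
    """Collect the CREATE TABLE statements so they can be placed in an LLM prompt."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def run_readonly(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Execute generated SQL, refusing anything that is not a single SELECT."""
    statement = sql.strip().rstrip(";")
    if not statement.lower().startswith("select") or ";" in statement:
        raise ValueError("only single SELECT statements are allowed")
    return conn.execute(statement).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(10.0,), (15.5,)])

schema = describe_schema(conn)                     # context for the model
generated_sql = "SELECT SUM(amount) FROM orders;"  # stands in for model output
print(run_readonly(conn, generated_sql))           # [(25.5,)]
```

The SELECT-only guard is a deliberately crude safety check for the sketch. A real deployment would more likely rely on a read-only database user or connection.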
- Capability: Table understanding, data analysis, non-standard table parsing
- Dataset: Self-built Excel Q&A dataset (500 test questions)
- Metrics (LLM Judge):
  - Accuracy: Factual correctness of answers
  - Analysis Depth: Analysis quality and insight of answers
  - Feasibility: Whether generated code/solutions are executable
  - Aesthetics: Visual quality of visualization charts
| Category | Methods | Accuracy | Analysis Depth | Feasibility | Aesthetics |
|---|---|---|---|---|---|
| Baselines | TableGPT2-7B | 8.4 | 5.1 | 4.3 | 6.2 |
| | StructGPT | 6.22 | 3.84 | 3.12 | 4.5 |
| | TableLLM-7B | 4.1 | 2.1 | 1.8 | 2.3 |
| | ST-Raptor | 22.4 | 6.0 | 7.4 | 12.4 |
| | TreeThinker | 31.0 | 22.8 | 21.4 | 36.8 |
| | Code Loop | 27.5 | 9.5 | 14.9 | 20.4 |
| 🎯 Youtu-RAG | Excel Agent | 37.5 ↑ | 30.2 ↑ | 27.6 ↑ | 42.6 ↑ |
- FactGuard: Long document single-point fact checking, information extraction, reasoning verification
- Sequential-NIAH: Long document multi-point information extraction, sequential information extraction
| Category | Dataset | FactGuard | Sequential-NIAH |
|---|---|---|---|
| Dataset Overview | Data Volume | 700 | 2,000 |
| | Type | Long-text Q&A (Single-point) | Long-text Q&A (Multi-point) |
| Baselines | Naive Retrieval Top3 | 79.86% | 14.20% |
| | Naive Retrieval Top5 | 80.71% | 29.75% |
| | Naive Retrieval Top10 | 82.71% | 57.25% |
| | Naive Retrieval Top15 | 83.00% | 70.15% |
| 🎯 Youtu-RAG | KB Search Agent | 88.27% ↑ | 85.05% ↑ |
| | File QA Agent | 88.29% ↑ | 60.80% * |
Note: *Reading full documents in long context is a known weakness of LLMs, which aligns with the experimental findings in Sequential-NIAH.
- Capability: Question preference understanding, metadata filtering and reranking, vector retrieval
- Dataset: Self-built metadata retrieval dataset
- Metrics:
  - Weighted NDCG@5: Measures whether truly relevant documents are recalled in the correct order within the top 5 retrieval results
  - Recall@all: The share of all truly relevant documents that are recalled
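For readers unfamiliar with these two metrics, here is a minimal Python sketch of how they are commonly computed. It uses the standard graded-relevance NDCG formula. The weighting scheme behind "Weighted NDCG@5" in this benchmark is not specified here, so the `gains` dictionary and both function names are assumptions for illustration only.

```python
from __future__ import annotations

import math


def recall_at_all(retrieved: list[str], relevant: set[str]) -> float:
    """Fraction of all relevant documents that appear anywhere in the retrieved list."""
    if not relevant:
        return 0.0
    return len(relevant.intersection(retrieved)) / len(relevant)


def ndcg_at_k(retrieved: list[str], gains: dict[str, float], k: int = 5) -> float:
    """Standard NDCG@k: discounted gain of the ranking divided by the ideal ranking's."""
    dcg = sum(
        gains.get(doc, 0.0) / math.log2(rank + 2)
        for rank, doc in enumerate(retrieved[:k])
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(gain / math.log2(rank + 2) for rank, gain in enumerate(ideal))
    return dcg / idcg if idcg else 0.0


# Hypothetical relevance grades: a higher gain means a more relevant document.
gains = {"doc_a": 3.0, "doc_b": 2.0, "doc_c": 1.0}
print(recall_at_all(["doc_a", "doc_x"], set(gains)))
print(ndcg_at_k(["doc_c", "doc_b", "doc_a"], gains, k=5))
```

Recall ignores order and only counts how many relevant documents were found, while NDCG rewards placing the most relevant documents first. That is why the two metrics can move differently for the same retriever.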
| Dataset | Data Volume | Metric | Baseline (Naive Retrieval) | Youtu-RAG (Meta Retrieval Agent) | Improvement |
|---|---|---|---|---|---|
| Timeliness Preference | 183 | Recall@all | 34.52% | 41.92% | +7.40% ↑ |
| | | NDCG_w@5 | 29.91% | 43.57% | +13.66% ↑ |
| Popularity Preference | 301 | Recall@all | 26.19% | 47.20% | +21.01% ↑ |
| | | NDCG_w@5 | 29.86% | 54.31% | +24.45% ↑ |
| Average | 483 | Recall@all | 29.34% | 45.21% | +15.87% ↑ |
| | | NDCG_w@5 | 29.88% | 50.25% | +20.37% ↑ |
Memoria-Bench is the industry's first agent memory evaluation benchmark that distinguishes between semantic memory, episodic memory, and procedural memory, and is adapted to high information density scenarios such as in-depth research, table Q&A, and complex code analysis and completion.
Core Features:
- 📚 Semantic Memory Evaluation: Knowledge understanding and application
- 📖 Episodic Memory Evaluation: Historical dialogue retrospection
- 🔧 Procedural Memory Evaluation: learning and reuse
- 🎯 Scenario Coverage: Research report generation, data analysis, code completion
💡 Tip: The Memoria-Bench evaluation benchmark is under review. Stay tuned!
We welcome contributions of any kind, including but not limited to:
- 🐛 Report Bugs and Issues
- 💡 Propose New Feature Suggestions
- 📝 Improve Documentation
- 🔧 Submit Code Improvements
For the detailed development process and conventions, please refer to CONTRIBUTING.md.
This project is licensed under the MIT License.
Youtu-RAG builds upon the excellent work of several open-source projects:
- Youtu-Agent: Agent framework
- Youtu-LLM: LLM model
- Youtu-Embedding: Chinese vector encoder model
- Youtu-Parsing: Document parsing model
- Youtu-HiChunk: Hierarchical document chunking model
- FactGuard: Benchmark of long document single-point fact checking, information extraction, reasoning verification
- Sequential-NIAH: Benchmark of long document multi-point information extraction, sequential information extraction
Special thanks to all developers who contributed code, suggestions, and reported issues to this project!
If this project is helpful to your research or work, please cite:
```bibtex
@software{Youtu-RAG,
  author = {Tencent Youtu Lab},
  title = {Youtu-RAG: Next-Generation Agentic Intelligent Retrieval-Augmented Generation System},
  year = {2026},
  url = {https://github.com/TencentCloudADP/youtu-rag}
}
```

⭐ If this project is helpful to you, please give us a Star!
