Youtu-RAG: Next-Generation Agentic Intelligent Retrieval-Augmented Generation System

Youtu-RAG is an agentic retrieval-augmented generation (RAG) system built on the "Local Deployment · Autonomous Decision · Memory-Driven" paradigm. It combines autonomous decision-making with memory-based learning and targets personal, locally hosted knowledge base management and Q&A.

Core Concepts:

  • Local Deployment: All components can be deployed locally, so data stays within your own domain. MinIO object storage is integrated for managing large numbers of files locally.
  • Autonomous Decision: Agents decide on their own whether to retrieve, how to retrieve, and when to call memory, selecting a strategy based on the question type and historical experience.
  • Memory-Driven: A dual-layer memory mechanism (short-term conversational memory + long-term knowledge accumulation) lets the system keep learning from past Q&A experience.

Traditional RAG systems follow a fixed pipeline of "offline chunking - vector retrieval - concatenation generation" and suffer from privacy risks, memory loss, and rigid retrieval. Youtu-RAG aims to turn the system from a passive retrieval tool into a retrieval-augmented generation system that makes its own decisions and learns from memory.

✨ Key Features

📁 File-Centric Architecture

File-based knowledge organization, supporting multi-source heterogeneous data including PDF, Excel, Images, and Databases

Supported Formats: PDF/Word/MD, Excel, Image, Database, +12 formats

🎯 Adaptive Retrieval Engine

Autonomously selects the retrieval strategy for each question, with tool calls including web search, vector retrieval, metadata filtering, database queries, and code execution

Retrieval Modes: Autonomous Decision, Tool Call, Diversified Data Sources
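
As a rough illustration of what choosing a retrieval strategy per question can look like, the sketch below routes a question to one of the tool categories named above using simple keyword rules. This is a toy stand-in, not Youtu-RAG's implementation: in the real system the decision is made by an agent, and the names `Context` and `choose_strategy`, the strategy labels, and the keyword lists are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """What the user has attached to the conversation (hypothetical fields)."""
    has_knowledge_base: bool = False
    has_database: bool = False


def choose_strategy(question: str, ctx: Context) -> str:
    """Pick one retrieval strategy label for a question using toy keyword rules."""
    q = question.lower()
    # Aggregation-style questions go to the database when one is associated.
    if ctx.has_database and any(w in q for w in ("how many", "total", "average", "count")):
        return "database_query"
    # Questions about fresh information go to web search.
    if any(w in q for w in ("latest", "today", "news")):
        return "web_search"
    # Questions that constrain document attributes use metadata filtering.
    if ctx.has_knowledge_base and any(w in q for w in ("published in", "author", "newest", "most popular")):
        return "metadata_filtering"
    # Default for knowledge-base questions: plain vector retrieval.
    if ctx.has_knowledge_base:
        return "vector_retrieval"
    # Nothing to retrieve from: answer directly.
    return "no_retrieval"
```

For example, `choose_strategy("Summarize the design doc", Context(has_knowledge_base=True))` falls through to `"vector_retrieval"`. An agent-based router would replace the keyword rules with a model call, but the shape of the decision (question plus available sources in, one strategy out) stays the same.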

🧠 Dual-Layer Memory

Short-term conversational memory + long-term cross-session knowledge accumulation, achieving Q&A experience learning

Memory Types: Short-Term Memory, Long-Term Memory, Q&A Learning
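
The two layers can be pictured with a small data structure: a bounded buffer of recent turns for the current session, and a persistent store of past question/answer pairs that is consulted when a similar question arrives. The sketch below is only an illustration under stated assumptions: the class `DualLayerMemory`, its method names, and the use of `difflib` string similarity (instead of whatever matching Youtu-RAG actually uses) are hypothetical.

```python
from collections import deque
from difflib import SequenceMatcher


class DualLayerMemory:
    """Toy dual-layer memory: per-session turns plus cross-session experience."""

    def __init__(self, short_term_turns: int = 6, reuse_threshold: float = 0.8):
        # Short-term layer: only the most recent turns of the current session.
        self.short_term = deque(maxlen=short_term_turns)
        # Long-term layer: (question, answer) pairs kept across sessions.
        self.long_term = []
        self.reuse_threshold = reuse_threshold

    def remember_turn(self, role: str, text: str) -> None:
        self.short_term.append((role, text))

    def store_experience(self, question: str, answer: str) -> None:
        self.long_term.append((question, answer))

    def recall(self, question: str):
        """Return the stored answer of the most similar past question, if similar enough."""
        best_answer, best_score = None, 0.0
        for past_question, past_answer in self.long_term:
            score = SequenceMatcher(None, question.lower(), past_question.lower()).ratio()
            if score > best_score:
                best_answer, best_score = past_answer, score
        return best_answer if best_score >= self.reuse_threshold else None

    def new_session(self) -> None:
        # Starting a new session drops short-term memory but keeps long-term memory.
        self.short_term.clear()
```

Keeping the two layers separate means a new session starts with a clean conversational context while previously successful answers remain available for reuse.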

🤖 Ready-to-Use Agents

From simple conversations to complex orchestrations, covering a wide range of application-level scenarios. More than 8 agents are included, such as Web Search, KB Search, Meta Retrieval, Excel Agent, and Text2SQL

Application Scenarios: Ready-to-Use, Diverse Scenarios, Complex Task Collaboration

🎨 Lightweight WebUI

Implemented in plain HTML + CSS + JavaScript with no frontend framework. Supports file upload, knowledge base management, AI dialogue, and document preview

Technical Features: Zero Dependencies, Streaming Response, Easy Operation

🔐 Security & Control

All related components can be deployed locally, so data stays within your own domain. MinIO object storage is integrated for managing large numbers of files locally

Security Guarantees: Local Deployment, Data Isolation, MinIO Storage

[Figure: Youtu-RAG Architecture]

📖 Usage Examples

1️⃣ File Management

File Upload and Preview

  1. Open the frontend interface at http://localhost:8000
  2. Click "File Management" in the left sidebar
  3. Click "Upload File"
  4. Depending on the file type and the file management configuration, each file is processed through a different path and produces previewable content

Demos:

  • File Upload Example (01_upload_file.mp4): Automatic metadata extraction and summary generation
  • PDF File Post-Processing Preview (02_pdf_file_example.mp4): Requires OCR configuration support
  • PNG File Post-Processing Preview (03_png_file_example.mp4): Requires OCR configuration support
  • HiChunk Parsing Preview (04_hichunk_example.mp4): Requires HiChunk configuration support

File Batch Management

When OCR and HiChunk are enabled, the parsing phase adds time to every upload. For files that need this processing, single-file import is recommended, since batch import leads to longer waiting times.

Demos:

  • File Batch Deletion and Upload (06_batch_delete_upload.mp4): It is recommended to batch import files of the same type at once
  • File Metadata Batch Editing (05_metadata_export_import.mp4): Supports batch export, editing, and import
  • File Search (07_file_search.mp4): Supports search by filename, metadata, summary, etc.

2️⃣ Knowledge Base Management

Knowledge Base Creation and Deletion

  1. Open the frontend interface at http://localhost:8000
  2. Click "Knowledge Base" in the left sidebar
  3. Click the "Create Knowledge Base" button
  4. Fill in the knowledge base name (e.g., Technical Documentation)
  5. Click Confirm to create the knowledge base

Demos:

  • Knowledge Base Creation and Deletion (08_kb_create_delete.mp4): Only one knowledge base can be created or deleted at a time
  • Knowledge Base Search (09_kb_search.mp4): Supports search by knowledge base name and description

Knowledge Base Content Association and Vectorization Construction

  1. File Association: Associate uploaded files to knowledge base
  2. Database Association: Associate local database to knowledge base
  3. Example Association: Associate example Q&A pairs to knowledge base (as experience information)

💡 Tip: After each association step, click the Save Association button to save the configuration. Otherwise your previous selections are lost.

Demos:

  • File Association (10_kb_file_association.mp4): Multiple files can be selected for association at once
  • Database Association (11_kb_db_association.mp4): Supports SQLite and MySQL
  • Example Association (12_kb_qa_example.mp4): Supports association of example Q&A pairs
  • Knowledge Base Configuration View (13_kb_config_show.mp4): View the association configuration and the construction configuration
  • Knowledge Base Vectorization Construction (14_kb_build.mp4): Unified construction of the different types of associated content
  • Knowledge Base Association Editing (15_kb_modify.mp4): Supports editing and updating of associated content

3️⃣ Intelligent Dialogue

  1. Select a configured Agent suited to the task to start a dialogue or Q&A session:

    • Some agents can only be used after a knowledge base or file has been selected
    • A temporary file upload button is available for Q&A. The uploaded file is automatically associated with the current knowledge base but does not undergo vector construction
  2. In the frontend dialogue interface, turn on the "Memory" switch in the lower right corner to enable the dual-layer memory mechanism. With memory enabled, the Agent has:

    • Short-Term Memory: Remembers the conversation context to avoid repeated questions
    • Long-Term Memory: Accumulates successful experiences and prioritizes reusing them when similar questions come up

💬 Chat Agent (demo: 16_chat_agent.mp4)
  • It is recommended to enable "Memory" to support multi-turn conversations
🔍 Web Search Agent (demo: 17_websearch_agent.mp4)
  • Supports web search
  • Can follow links to read detailed content before answering
📚 KB Search Agent (demo: 18_kbsearch_agent.mp4)
  • Must select a knowledge base
  • Supports vector retrieval and reranking
📚 Meta Retrieval Agent (demo: 19_meta_retrieval_agent.mp4)
  • Must select a knowledge base
  • Supports vector retrieval and reranking
  • Supports question intent parsing and metadata filtering
📄 File QA Agent (demo: 20_file_qa_agent.mp4)
  • Must select a knowledge base and a file
  • Supports reading and processing file content with Python
  • Supports vector retrieval and reranking
📊 Excel Agent (demo: 22_excel_agent.mp4)
  • Must select a knowledge base and a file
  • Question decomposition and breakdown into data processing steps
  • Python code execution and reflection
💻 Text2SQL Agent (demo: 21_text2sql_agent.mp4)
  • Must select a knowledge base with an associated database
  • Question decomposition, SQL generation, and SQL execution
  • SQL query result display and reflection
🧠 Short and Long-Term Memory (demo: 23_memory_chat_2.mp4)
  • Short-term memory: Takes effect within a session and supports multi-turn conversations
  • Long-term memory: Persists over the long term and accumulates successful experiences
🧐 Text2SQL Agent with Memory (demo: 25_text2sql_memory.mp4)
  • Short-term memory takes effect within a session
  • Long-term memory can avoid additional token consumption for similar questions
🎯 QA Learning (demo: 26_qa_learning.mp4)
  • Records QA examples
  • Automatically learns Agent routing strategies
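
The "SQL generation, execution, and reflection" behavior listed for the Text2SQL Agent can be pictured as a retry loop: generate a query, run it, and on failure feed the error back into the next generation attempt. The sketch below shows that loop against SQLite. It is not the agent's actual code: `run_with_reflection` and the `generate_sql` callable (standing in for the LLM call) are hypothetical names.

```python
import sqlite3


def run_with_reflection(conn, question, generate_sql, max_attempts=3):
    """Generate SQL, execute it, and retry with the error message as feedback.

    `generate_sql(question, feedback)` is a stand-in for a model call; `feedback`
    is None on the first attempt and the previous error text afterwards.
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        sql = generate_sql(question, feedback)
        try:
            rows = conn.execute(sql).fetchall()
            return {"sql": sql, "rows": rows, "attempts": attempt}
        except sqlite3.Error as exc:
            # Reflection step: keep the failing query and the database error
            # so the next generation attempt can correct itself.
            feedback = f"{sql!r} failed: {exc}"
    raise RuntimeError(f"no valid SQL after {max_attempts} attempts; last error: {feedback}")
```

In practice, model-generated SQL should run with read-only credentials or on a read-only connection, since the loop executes whatever the generator returns.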

🚀 Quick Start

Environment Requirements

  • Python: 3.12+
  • Package Manager: uv (recommended)
  • Operating System: Linux Desktop / macOS / Windows

📦 Object Storage (MinIO) Configuration

MinIO is a high-performance object storage service. Youtu-RAG uses it to store uploaded document files, which remain under local management.

For installation instructions, please refer to the official MinIO repository. Two installation methods are supported:

  • Install from Source: Build and install MinIO from source code
  • Build Docker Image: Deploy MinIO using Docker containers

⚙️ Model Deployment

| Model | HuggingFace | Deployment Method | Required |
| --- | --- | --- | --- |
| Youtu-Embedding | HuggingFace | Deployment Docs | ✅ Required, or other Embedding API services |
| Youtu-Parsing | HuggingFace | Deployment Docs | ⭕ Optional |
| Youtu-HiChunk | HuggingFace | Deployment Docs | ⭕ Optional |

One-Click Installation of Youtu-RAG System

git clone https://github.com/TencentCloudADP/youtu-rag.git
cd youtu-rag
uv sync
source .venv/bin/activate
cp .env.example .env

Configure Necessary Environment Variables

Edit the .env file and fill in the following core configurations:

# =============================================
# LLM Configuration (Required)
# =============================================
UTU_LLM_TYPE=chat.completions
UTU_LLM_MODEL=deepseek-chat
UTU_LLM_BASE_URL=https://api.deepseek.com/v1
UTU_LLM_API_KEY=your_deepseek_api_key  # Replace with your API Key

# =============================================
# Embedding Configuration (Required)
# =============================================
# Option 1: Local Service (Youtu-Embedding-2B)
UTU_EMBEDDING_URL=http://localhost:8081
UTU_EMBEDDING_MODEL=youtu-embedding-2B

# Option 2: Other Embedding API Services
# UTU_EMBEDDING_URL=https://api.your-embedding-service.com
# UTU_EMBEDDING_API_KEY=your_api_key
# UTU_EMBEDDING_MODEL=model_name

# =============================================
# Reranker Configuration (Optional, improves retrieval accuracy)
# =============================================
UTU_RERANKER_MODEL=jina-reranker-v3
UTU_RERANKER_URL=https://api.jina.ai/v1/rerank
UTU_RERANKER_API_KEY=your_jina_api_key 

# =============================================
# OCR Configuration (Optional, locally deployable Youtu-Parsing)
# =============================================
UTU_OCR_BASE_URL=https://api.ocr.com/ocr
UTU_OCR_MODEL=youtu-ocr

# =============================================
# Chunk Configuration (Optional, locally deployable Youtu-HiChunk)
# =============================================
UTU_CHUNK_BASE_URL=https://api.hichunk.com/chunk
UTU_CHUNK_MODEL=hichunk

# =============================================
# Memory Function (Optional)
# =============================================
memoryEnabled=false  # Set to true to enable dual-layer memory mechanism

Note: If you don't need OCR and Chunk features, you can disable them by setting ocr enabled: false and chunk enabled: false in configs/rag/file_management.yaml.
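
Before starting the service, it can help to check that the required variables are filled in. The sketch below parses `.env`-style text and reports required keys that are missing or still hold the `your_...` placeholders from the template above. The key names come from the configuration shown; the helper names `parse_env` and `find_problems`, and the way inline `#` comments are stripped, are assumptions and may differ from how Youtu-RAG itself loads the file.

```python
# Keys marked "Required" in the template above (Option 1 for embeddings).
REQUIRED_KEYS = (
    "UTU_LLM_TYPE",
    "UTU_LLM_MODEL",
    "UTU_LLM_BASE_URL",
    "UTU_LLM_API_KEY",
    "UTU_EMBEDDING_URL",
    "UTU_EMBEDDING_MODEL",
)
PLACEHOLDER_PREFIX = "your_"


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and full-line comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Assumption: anything after " #" on the same line is a comment.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def find_problems(values: dict) -> list:
    """Return one message per required key that is missing or a placeholder."""
    problems = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is missing or empty")
        elif value.startswith(PLACEHOLDER_PREFIX):
            problems.append(f"{key} still holds a placeholder value")
    return problems
```

A typical use is `find_problems(parse_env(open(".env").read()))`, run from the repository root; an empty list means all required keys have non-placeholder values.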

Start Service

# Method 1: Using startup script (Recommended)
bash start.sh

# Method 2: Directly using uvicorn
uv run uvicorn utu.rag.api.main:app --reload --host 0.0.0.0 --port 8000

After successful startup, the frontend interface is available at http://localhost:8000.


📊 Benchmarks

Youtu-RAG is evaluated along several dimensions: structured retrieval, semi-structured retrieval, long-document reading comprehension, and metadata retrieval.

🗄️ Structured Retrieval (Text2SQL)

  • Capability: Natural language to SQL, Schema understanding, SQL execution
  • Dataset: Self-built Text2SQL dataset (Multi-table, Complex excel, Domain table)
  • Metric: Accuracy (LLM Judge)

| | Multi-table-mini | Complex Excel | Multi-table | Domain Table |
| --- | --- | --- | --- | --- |
| Data Volume | 245 | 931 | 1,390 | 100 |
| Type | Multi-table | Complex Questions | Multi-table Full | Domain Knowledge |
| Baseline: Vanna | 45.71% | 38.64% | 35.11% | 9.00% |
| 🎯 Youtu-RAG Text2SQL Agent | 69.39% | 57.36% | 67.27% | 27.00% |

📊 Semi-Structured Retrieval (Excel)

  • Capability: Table understanding, data analysis, non-standard table parsing
  • Dataset: Self-built Excel Q&A dataset (500 test questions)
  • Metrics: LLM Judge
    • Accuracy: Factual correctness of answers
    • Analysis Depth: Analysis quality and insight of answers
    • Feasibility: Whether generated code/solutions are executable
    • Aesthetics: Visual quality of visualization charts
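
| Category | Method | Accuracy | Analysis Depth | Feasibility | Aesthetics |
| --- | --- | --- | --- | --- | --- |
| Baselines | TableGPT2-7B | 8.4 | 5.1 | 4.3 | 6.2 |
| | StructGPT | 6.22 | 3.84 | 3.12 | 4.5 |
| | TableLLM-7B | 4.1 | 2.1 | 1.8 | 2.3 |
| | ST-Raptor | 22.4 | 6.0 | 7.4 | 12.4 |
| | TreeThinker | 31.0 | 22.8 | 21.4 | 36.8 |
| | Code Loop | 27.5 | 9.5 | 14.9 | 20.4 |
| 🎯 Youtu-RAG | Excel Agent | 37.5 | 30.2 | 27.6 | 42.6 |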

📖 Reading Comprehension (Long Documents)

  • FactGuard: Long document single-point fact checking, information extraction, reasoning verification
  • Sequential-NIAH: Long document multi-point information extraction, sequential information extraction

| | Method | FactGuard | Sequential-NIAH |
| --- | --- | --- | --- |
| Data Volume | | 700 | 2,000 |
| Type | | Long-text Q&A (Single-point) | Long-text Q&A (Multi-point) |
| Baselines | Naive Retrieval Top3 | 79.86% | 14.20% |
| | Naive Retrieval Top5 | 80.71% | 29.75% |
| | Naive Retrieval Top10 | 82.71% | 57.25% |
| | Naive Retrieval Top15 | 83.00% | 70.15% |
| 🎯 Youtu-RAG | KB Search Agent | 88.27% | 85.05% |
| | File QA Agent | 88.29% | 60.80% * |

Note: *Reading full documents in long context is a known weakness of LLMs, which aligns with the experimental findings in Sequential-NIAH.


🏷️ Metadata Retrieval

  • Capability: Question preference understanding, metadata filtering and reranking, vector retrieval
  • Dataset: Self-built metadata retrieval dataset
  • Metrics:
    • Weighted NDCG@5 (NDCG_w@5): Measures whether the truly relevant documents appear, in the correct order, within the top 5 retrieval results
    • Recall@all: The fraction of all truly relevant documents that are retrieved
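
To make the two metrics concrete, the sketch below computes Recall@all and a graded NDCG@k for a single query. The README does not spell out how the "weighted" variant assigns weights, so the per-document gains passed to `ndcg_at_k` are an assumption (one possible reading), and both function names are hypothetical.

```python
import math


def recall_at_all(retrieved, relevant):
    """Fraction of all relevant documents that appear anywhere in `retrieved`."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(relevant & set(retrieved)) / len(relevant)


def ndcg_at_k(retrieved, gains, k=5):
    """NDCG@k with graded gains: `gains` maps doc id -> relevance weight.

    Documents not present in `gains` count as irrelevant (gain 0).
    """
    def dcg(values):
        # Standard logarithmic discount: rank 0 is divided by log2(2) = 1.
        return sum(g / math.log2(rank + 2) for rank, g in enumerate(values))

    actual = dcg([gains.get(doc, 0.0) for doc in retrieved[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0
```

Recall@all ignores ranking entirely, while NDCG@k rewards placing the highest-gain documents first, which is why the two can move differently for the same retrieval run.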

| Dataset | Data Volume | Metric | Baseline (Naive Retrieval) | Youtu-RAG (Meta Retrieval Agent) | Improvement |
| --- | --- | --- | --- | --- | --- |
| Timeliness Preference | 183 | Recall@all | 34.52% | 41.92% | +7.40% ↑ |
| | | NDCG_w@5 | 29.91% | 43.57% | +13.66% ↑ |
| Popularity Preference | 301 | Recall@all | 26.19% | 47.20% | +21.01% ↑ |
| | | NDCG_w@5 | 29.86% | 54.31% | +24.45% ↑ |
| Average | 483 | Recall@all | 29.34% | 45.21% | +15.87% ↑ |
| | | NDCG_w@5 | 29.88% | 50.25% | +20.37% ↑ |

Memoria-Bench (Under Review, To Be Released)

Memoria-Bench is the industry's first agent memory evaluation benchmark that distinguishes between semantic memory, episodic memory, and procedural memory, and is adapted to high information density scenarios such as in-depth research, table Q&A, and complex code analysis and completion.

Core Features:

  • 📚 Semantic Memory Evaluation: Knowledge understanding and application
  • 📖 Episodic Memory Evaluation: Historical dialogue retrospection
  • 🔧 Procedural Memory Evaluation: learning and reuse
  • 🎯 Scenario Coverage: Research report generation, data analysis, code completion

💡 Tips: The Memoria-Bench evaluation benchmark is under review, stay tuned!

🤝 Contributing Guidelines

We welcome contributions of any kind, including but not limited to:

  • 🐛 Report Bugs and Issues
  • 💡 Propose New Feature Suggestions
  • 📝 Improve Documentation
  • 🔧 Submit Code Improvements

For detailed development process and specifications, please refer to CONTRIBUTING.md.

📄 License

This project is licensed under the MIT License.

🙏 Acknowledgments

Youtu-RAG builds upon the excellent work of several open-source projects:

Special thanks to all developers who contributed code, suggestions, and reported issues to this project!

📚 Citation

If this project is helpful to your research or work, please cite:

@software{Youtu-RAG,
  author = {Tencent Youtu Lab},
  title = {Youtu-RAG: Next-Generation Agentic Intelligent Retrieval-Augmented Generation System},
  year = {2026},
  url = {https://github.com/TencentCloudADP/youtu-rag}
}

⭐ If this project is helpful to you, please give us a Star!
