This project is a comprehensive Rust-based platform for building a self-improving knowledge base and interacting with your data—from data warehouses to live Google Sheets—using natural language.
- **Natural Language to Data**:
  - **Text-to-SQL**: Translates prompts into executable SQL queries for providers like Google BigQuery.
  - **Dynamic Google Sheet Querying**: Automatically ingests a Google Sheet from a URL within a prompt and answers questions about its content on the fly.
  - **Context-Aware**: Automatically injects the current date and time into the context, enabling time-sensitive questions.
- **Comprehensive Knowledge Base Pipeline**:
  - **Multi-Source Ingestion**: Builds a knowledge base from various sources: web URLs, PDFs, Google Sheets, RSS feeds, and raw text.
  - **AI-Powered Distillation**: Uses an LLM to automatically extract explicit Q&A pairs and generate new ones from unstructured text.
  - **Vector Embeddings**: Generates embeddings for semantic search.
- **Advanced Retrieval-Augmented Generation (RAG)**:
  - Provides an API to ask questions against the knowledge base.
  - Employs a multi-stage retrieval pipeline (described below) for highly relevant, context-aware answers.
  - **Temporal Reasoning**: Understands and correctly answers time-sensitive queries like "what is the newest..." by filtering results on date properties.
- **Self-Improvement Cycle**:
  - **Fine-Tuning Export**: Generates a dataset from the knowledge base in the correct format for fine-tuning your base LLM, which in turn improves future data extraction.
- **Flexible Identity & Ownership**:
  - **JWT & Guest Access**: Supports standard JWT-based authentication and falls back to a deterministic "Guest User," ensuring all ingested data has a clear owner without requiring a login (see the sketch after this list).
  - **Ownership-Aware Search**: Search results are automatically and securely filtered based on the user's identity.
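The deterministic guest fallback can be pictured as deriving the same owner ID on every anonymous request. The following is a minimal sketch, not the crate's actual code; the function name, the `uuid` dependency, and the namespace/seed values are all assumptions:

```rust
// Hypothetical sketch of a deterministic guest fallback; not anyrag's real API.
use uuid::Uuid;

/// Resolve the owner of a request: the JWT `sub` claim when present,
/// otherwise a guest ID that is identical across requests and restarts.
fn resolve_owner(jwt_subject: Option<&str>) -> String {
    match jwt_subject {
        Some(sub) => sub.to_owned(),
        // UUID v5 is a pure function of (namespace, name), so every
        // anonymous request maps to the same stable "Guest User".
        None => Uuid::new_v5(&Uuid::NAMESPACE_OID, b"guest-user").to_string(),
    }
}

fn main() {
    assert_eq!(resolve_owner(None), resolve_owner(None)); // deterministic
    println!("guest owner: {}", resolve_owner(None));
}
```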
The system uses a sophisticated, multi-stage process to deliver precise answers from your knowledge base:
1. **Query Analysis (LLM Call #1)**: The user's query is analyzed to extract key entities (e.g., "Tesla") and keyphrases (e.g., "campaign prize").
2. **Hybrid Candidate Retrieval**:
   - **Metadata Search**: A fast database query retrieves an initial set of parent documents based on the extracted entities and keyphrases.
   - **Vector Search**: In parallel, the user's query is converted into a vector (Embedding Model Call) to find semantically similar parent documents.
   - **Keyword Search**: A traditional keyword search provides a baseline set of results.
3. **Reciprocal Rank Fusion (RRF)**: The results from all retrieval methods are combined and re-ranked using the RRF algorithm to produce a single, relevance-ranked list of parent documents (see the sketch after this list).
4. **Contextual Chunking**: The system retrieves the full YAML content of the top-ranked parent documents. Instead of using the whole document, it parses the YAML and treats each `section` as a single, context-rich "chunk," ensuring that the context provided to the final LLM is highly relevant and structured (a chunking sketch also follows this list).
5. **Answer Synthesis (LLM Call #2)**: The final, semantically chunked context is passed to a powerful LLM, which generates a coherent, accurate answer based only on the provided information.
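Reciprocal Rank Fusion itself is compact enough to show in full. The following is a self-contained sketch of the algorithm, not the server's actual code; each retriever contributes `1 / (k + rank)` per document, and `k = 60` is the customary constant from the original RRF paper:

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of document IDs into one list ordered by
/// summed RRF score. `k` damps the weight of top ranks (60 is customary).
fn reciprocal_rank_fusion(ranked_lists: &[Vec<String>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in ranked_lists {
        for (i, doc_id) in list.iter().enumerate() {
            // Ranks are 1-based in the RRF formula: score += 1 / (k + rank).
            *scores.entry(doc_id.clone()).or_insert(0.0) += 1.0 / (k + (i + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.total_cmp(&a.1)); // highest fused score first
    fused
}
```

A document that appears near the top of several lists (metadata, vector, and keyword) accumulates score from each, so the fused ranking rewards cross-method agreement rather than any single retriever's opinion.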
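Section-level chunking can be sketched the same way, here with `serde_yaml` as an assumed dependency (the library's real parsing code may differ). Each top-level YAML key becomes one labeled chunk:

```rust
use serde_yaml::Value;

/// Split a YAML document into one chunk per top-level section, so each
/// chunk hands the final LLM a complete, labeled unit of context.
fn chunk_by_section(yaml: &str) -> Vec<String> {
    match serde_yaml::from_str::<Value>(yaml) {
        Ok(Value::Mapping(sections)) => sections
            .iter()
            .map(|(name, body)| {
                let name = name.as_str().unwrap_or("<unnamed section>");
                let body = serde_yaml::to_string(body).unwrap_or_default();
                format!("{name}:\n{body}")
            })
            .collect(),
        // Not a mapping (or not valid YAML): fall back to one whole-document chunk.
        _ => vec![yaml.to_owned()],
    }
}
```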
All JSON API responses follow a consistent `result` object structure. Appending `?debug=true` to any request URL adds a `debug` object to the response with contextual information.
- Standard Response (`/ingest/rss`):

  ```json
  { "result": { "message": "Ingestion successful", "ingested_articles": 2 } }
  ```
- Debug Response (`/ingest/rss?debug=true`):

  ```json
  {
    "debug": { "url": "http://example.com/rss" },
    "result": { "message": "Ingestion successful", "ingested_articles": 2 }
  }
  ```
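On the client side, this envelope maps naturally onto a generic struct. Here is a minimal sketch using `serde` and `serde_json`; the type names are illustrative, not part of the crate:

```rust
use serde::Deserialize;

/// Generic envelope: every endpoint returns `result`; `debug` appears
/// only when the request URL carries `?debug=true`.
#[derive(Debug, Deserialize)]
struct ApiResponse<T> {
    result: T,
    #[serde(default)]
    debug: Option<serde_json::Value>,
}

/// Payload of the `/ingest/rss` response shown above.
#[derive(Debug, Deserialize)]
struct IngestRss {
    message: String,
    ingested_articles: u32,
}

fn main() -> Result<(), serde_json::Error> {
    let body = r#"{ "result": { "message": "Ingestion successful", "ingested_articles": 2 } }"#;
    let parsed: ApiResponse<IngestRss> = serde_json::from_str(body)?;
    assert_eq!(parsed.result.ingested_articles, 2);
    assert!(parsed.debug.is_none()); // no ?debug=true on this request
    Ok(())
}
```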
The workspace is divided into two main crates. For detailed information, please refer to the `README.md` file within each crate's directory.

- `anyrag`: The core library containing all business logic.
- `anyrag-server`: A lightweight `axum` web server that exposes the library's functionality via a REST API.
For detailed `curl` examples for every API endpoint, please see the [API Usage Examples](EXAMPLES.md) document.
```text
anyrag/
├── Cargo.toml      # Workspace configuration
├── EXAMPLES.md     # Detailed API usage examples
├── crates/
│   ├── lib/        # The core logic library
│   │   ├── README.md   <-- Library details
│   │   └── src/
│   └── server/     # The axum web server
│       ├── README.md   <-- Server details
│       ├── Dockerfile
│       └── src/
└── README.md       # This file
```
This project includes a comprehensive script (`deploy.sh`) to automate deployment to Google Cloud Run.
Before running it, make sure that:

- The Google Cloud SDK is installed and initialized.
- You have a Google Cloud project with billing enabled.
- Your `crates/server/.env` file contains your `AI_API_KEY` and `BIGQUERY_PROJECT_ID`.
1. Make the script executable:

   ```sh
   chmod +x deploy.sh
   ```

2. Run the deployment script:

   ```sh
   ./deploy.sh your-gcp-project-id
   ```
The script will guide you through the process and output the URL for your deployed service.
You can run all tests for the entire workspace from the root directory:

```sh
cargo test --workspace
```
This project is licensed under the MIT License.