A fast, efficient, locally-run Retrieval-Augmented Generation (RAG) system for document querying and knowledge base management
Arke is a small personal project focused on building a local, high-performance RAG system by combining some of the most modern and efficient tools and libraries available.
Note: By design, chat threads are not persisted across backend restarts. Only document storage and cached embeddings, along with the document and query caches, are retained. This suits users who often open chats and forget about them: stale conversations are cleaned up automatically.
- Agent-Driven Architecture: Built around a LangChain agent enhanced with custom tools for intelligent document ingestion and querying
- High-Performance Storage & Retrieval: Qdrant-backed vector store optimized for fast, scalable semantic search
- Intelligent Caching Strategy: Dual-layer caching with local embedding persistence to minimize latency and cost (Redis + LangChain native)
- Ultra-Fast Multiformat Ingestion: Native support for 50+ document formats powered by the Rust-based Kreuzberg OCR engine
- Modern Web Interface: Next.js frontend with real-time streaming responses
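
The dual-layer caching idea above can be sketched as follows. This is an illustrative, hypothetical example (not Arke's actual code): a fast in-memory layer stands in for Redis, backed by a persistent on-disk layer for embeddings, so repeated ingestions of the same text skip recomputation. All names here are assumptions.

```python
# Illustrative sketch of dual-layer embedding caching (hypothetical names,
# not Arke's implementation): an in-memory hot layer (Redis stand-in)
# backed by a persistent on-disk cold layer.
import hashlib
import json
from pathlib import Path

class DualLayerEmbeddingCache:
    def __init__(self, cache_dir: str):
        self.hot = {}                      # layer 1: in-memory (Redis stand-in)
        self.cold = Path(cache_dir)        # layer 2: local persistence
        self.cold.mkdir(parents=True, exist_ok=True)

    def _key(self, text: str) -> str:
        return hashlib.sha256(text.encode()).hexdigest()

    def get_or_compute(self, text: str, embed_fn):
        key = self._key(text)
        if key in self.hot:                # hit in the hot layer
            return self.hot[key]
        path = self.cold / f"{key}.json"
        if path.exists():                  # hit in the cold layer
            vec = json.loads(path.read_text())
        else:                              # miss: compute once, then persist
            vec = embed_fn(text)
            path.write_text(json.dumps(vec))
        self.hot[key] = vec                # promote to the hot layer
        return vec
```

On a second request for the same text, the embedding function is never called again: the hot layer answers immediately, and after a restart the cold layer repopulates it.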
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd arke
  ```
- Set up environment variables: copy `.env.example` to `.env` and fill in your values:

  ```bash
  cp .env.example .env
  ```

  Required variables:
  - `OPENAI_API_KEY`: Your OpenAI API key
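
A backend typically validates required variables at startup and fails fast with a clear message. This is a minimal, hypothetical sketch of that pattern; the function name and Arke's actual startup checks are assumptions.

```python
# Hypothetical sketch of required-environment-variable validation;
# Arke's actual startup checks may differ.
import os

REQUIRED_VARS = ["OPENAI_API_KEY"]

def check_env(environ=os.environ):
    """Raise a descriptive error if any required variable is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
```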
- Start Docker: a `docker-compose.yml` file is provided to spin up both of the necessary instances from your terminal:

  ```bash
  docker compose up -d
  ```

  Note: The application may take up to ~30 seconds to connect on startup. Check the status bar in the bottom left.
The app will be available at http://localhost:3000 in your browser.
You can now interact with the agent:
- Store documents: specify local folder paths containing the documents to add (a sample is provided in `data/greece_dataset`, with Wikipedia pages about Ancient Greek history, art, and architecture)
- Query knowledge base: Ask questions about stored documents
- Manage documents: view or delete stored documents, or flush the database, through natural language queries
You can optionally customize system settings through `src/core/config.py`.
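
As a rough illustration of the kind of settings a config module like `src/core/config.py` might expose, here is a hypothetical sketch; every field name and default below is an assumption, not Arke's actual configuration.

```python
# Hypothetical illustration of a settings module; the actual fields and
# defaults in src/core/config.py may differ.
from dataclasses import dataclass

@dataclass
class Settings:
    qdrant_url: str = "http://localhost:6333"    # vector store endpoint
    redis_url: str = "redis://localhost:6379"    # cache layer endpoint
    embedding_model: str = "text-embedding-3-small"
    chunk_size: int = 1000     # characters per document chunk
    chunk_overlap: int = 200   # overlap between consecutive chunks

settings = Settings()
```

Using a dataclass keeps defaults in one place and lets individual values be overridden at construction time, e.g. `Settings(chunk_size=512)`.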
This project is licensed under the MIT License - see the LICENSE file for details.
