Dassoo/Arke


Arke

A fast, efficient, locally-run Retrieval-Augmented Generation (RAG) system for document querying and knowledge base management



Arke is a small personal project focused on building a local high-performance RAG system by combining some of the most modern and efficient tools and libraries available.

Note: By design, chat threads are not persisted across backend restarts. Only stored documents and cached embeddings, along with the document and query caches, are retained. This suits users who often open chats and then forget about them, since leftover threads are cleaned up automatically.


Features

  • Agent-Driven Architecture: Built around a LangChain agent enhanced with custom tools for intelligent document ingestion and querying
  • High-Performance Storage & Retrieval: Qdrant-backed vector store optimized for fast, scalable semantic search
  • Intelligent Caching Strategy: Dual-layer caching (Redis + LangChain native) with local embedding persistence to minimize latency and cost
  • Ultra-Fast Multiformat Ingestion: Native support for 50+ document formats powered by the Rust-based Kreuzberg OCR engine
  • Modern Web Interface: Next.js frontend with real-time streaming responses
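The dual-layer caching idea from the list above can be illustrated with a minimal, standard-library-only sketch. It is not Arke's implementation: the class `TwoLayerEmbeddingCache`, the `toy_embed` function, and the JSON-on-disk layout are all hypothetical, and a plain dictionary stands in for Redis. It only shows the lookup order (fast layer, then persistent layer, then the expensive embedding call).

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Dict, List


class TwoLayerEmbeddingCache:
    """Hypothetical sketch: an in-memory layer (stand-in for Redis) over an on-disk layer."""

    def __init__(self, embed: Callable[[str], List[float]], cache_dir: Path) -> None:
        self._embed = embed                        # the expensive call, e.g. a remote embedding API
        self._memory: Dict[str, List[float]] = {}  # fast layer, lost on restart
        self._dir = cache_dir                      # persistent layer, survives restarts
        self._dir.mkdir(parents=True, exist_ok=True)
        self.misses = 0                            # how often the embedder was actually called

    @staticmethod
    def _key(text: str) -> str:
        # Hash the text so that arbitrary content maps to a safe file name.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def get(self, text: str) -> List[float]:
        key = self._key(text)
        if key in self._memory:                    # layer 1: memory
            return self._memory[key]
        path = self._dir / f"{key}.json"
        if path.exists():                          # layer 2: disk
            vector = json.loads(path.read_text())
        else:                                      # miss on both layers: compute and persist
            self.misses += 1
            vector = self._embed(text)
            path.write_text(json.dumps(vector))
        self._memory[key] = vector
        return vector


def toy_embed(text: str) -> List[float]:
    """Deterministic placeholder embedder, not a real model."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]
```

With this layout, a second cache instance pointed at the same directory can serve a previously embedded text without calling the embedder again, which is the property that keeps latency and API cost down after a restart.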

Prerequisites

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd arke
  2. Set up environment variables: Copy .env.example to .env and fill in your values:

    cp .env.example .env

    Required variables:

    • OPENAI_API_KEY: Your OpenAI API key
  3. Start Docker: A docker-compose.yml file is provided to spin up both required instances from your terminal:

    docker compose up -d
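Before starting the containers, it may help to confirm that the .env file from step 2 actually defines the required variable. The following is a minimal sketch, not part of Arke: `parse_env` and `missing_variables` are hypothetical helpers, and the parser handles only plain `KEY=value` lines (no `export` prefix, no multi-line values).

```python
from typing import Dict, Iterable, List

# The only variable the installation steps list as required.
REQUIRED = ("OPENAI_API_KEY",)


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_variables(text: str, required: Iterable[str] = REQUIRED) -> List[str]:
    """Return the required names that are absent or empty in the given .env text."""
    values = parse_env(text)
    return [name for name in required if not values.get(name)]
```

Reading the file with `pathlib.Path(".env").read_text()` and passing the result to `missing_variables` returns an empty list when everything required is set.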

Usage

Note: The application may take up to ~30 seconds to connect on startup. Check the status bar on the bottom left.

The RAG will be available at http://localhost:3000 in your browser.
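Since startup can take a while, a script that depends on the interface can poll the address above until it answers. This is a minimal sketch under assumptions: `wait_for_http` is a hypothetical helper, and it only checks that something responds over HTTP at the URL. It does not verify that the backend has finished connecting, so the status bar remains the authoritative indicator.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll `url` until it answers or `timeout` seconds pass; return True if it answered."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, just with an error status: it is reachable.
            return True
        except OSError:
            # Connection refused or timed out: wait and try again.
            time.sleep(interval)
    return False


# Example call (assumes the default address from this README):
# ready = wait_for_http("http://localhost:3000", timeout=30.0)
```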

You can now interact with the agent:

  • Store documents: Specify local folder paths containing the documents to add (a sample dataset is provided in 'data/greece_dataset', built from Wikipedia pages on Ancient Greek history, art and architecture)
  • Query knowledge base: Ask questions about stored documents
  • Manage documents: View or delete stored documents, or flush the database, through natural language queries
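Before pointing the agent at a folder, it can be useful to see what the folder contains. The sketch below is an illustration only and is not part of Arke: `summarize_folder` is a hypothetical helper that counts files by extension, and it makes no claim about which formats the ingestion pipeline accepts.

```python
from collections import Counter
from pathlib import Path
from typing import Dict


def summarize_folder(folder: str) -> Dict[str, int]:
    """Count the files under `folder` (recursively), grouped by lowercase extension."""
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(folder)
    counts = Counter(
        path.suffix.lower() or "(no extension)"
        for path in root.rglob("*")
        if path.is_file()
    )
    return dict(counts)


# Example call (assumes the sample dataset path mentioned above):
# print(summarize_folder("data/greece_dataset"))
```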

Configuration

You can optionally customize the system settings in src/core/config.py.

License

This project is licensed under the MIT License - see the LICENSE file for details.

