StDensity/local-first-rag

Local RAG System

A complete, local-first Retrieval-Augmented Generation (RAG) system that lets you chat with your documents privately, using local LLMs (via Ollama) or optional cloud-hosted models (via Groq).

Demo

Features

  • Local & Private: Runs entirely on your machine by default, using Ollama for inference and local embeddings.
  • Modern UI: Responsive React frontend with streaming responses and citation support.
  • Easy Document Management: Upload, view, and delete knowledge sources (PDFs) via the UI.
  • Vector Search: Powered by LlamaIndex for efficient retrieval.

Architecture

Architecture Diagram

The project is divided into two main components:

  • Backend: FastAPI service using LlamaIndex to handle ingestion, indexing, and query processing.
  • Frontend: React + Vite application providing the user interface.
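The backend's core flow (ingest → index → retrieve) can be illustrated with a deliberately tiny, dependency-free sketch. Bag-of-words counts stand in for the real HuggingFace embeddings, and a plain list stands in for the LlamaIndex vector store — this is a conceptual toy, not the repository's implementation:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words counts (the real app uses HuggingFace models).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    # "Indexing": pre-embed every chunk once, so a query only embeds the question.
    return [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(query: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    # "Retrieval": rank chunks by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

In the real system, the retrieved chunks are passed to the Ollama model as context for generating the answer.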

Quick Start

Prerequisites

  • Python 3.13+ (managed easily with uv)
  • Node.js & npm/bun/yarn
  • Ollama running locally

1. Setup Backend

Navigate to the backend folder and start the server:

cd backend

# Install dependencies
uv sync

# Make sure the required Ollama model is pulled
ollama pull qwen2.5:7b

# Start the API server
uv run fastapi dev main.py

The backend API will be available at http://localhost:8000.
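Because the backend is FastAPI, its auto-generated interactive docs are served at http://localhost:8000/docs once the server is running. A small stdlib-only probe (illustrative, not part of the repo) can confirm the server is up:

```python
import urllib.request

def backend_ready(base: str = "http://localhost:8000") -> bool:
    """Return True if the FastAPI server answers on its /docs page."""
    try:
        with urllib.request.urlopen(f"{base}/docs", timeout=2) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, DNS failure, ...
        return False
```

Run this after `uv run fastapi dev main.py` to verify the API is reachable before starting the frontend.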

2. Setup Frontend

Open a new terminal, navigate to the frontend folder, and start the UI:

cd frontend

# Install dependencies
bun install  # or npm install

# Start the development server
bun run dev  # or npm run dev

The frontend will typically run at http://localhost:5173.

3. Usage

  1. Open your browser to the frontend URL (e.g., http://localhost:5173).
  2. Use the Knowledge Base section to upload a PDF document.
  3. Click "Rebuild Index" to process the documents.
  4. Start chatting! The system will use your documents to answer questions.
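The same flow can also be driven programmatically once the backend is up. Note that the route name (`/chat`) and JSON field (`message`) below are illustrative guesses, not the repository's actual schema — check the generated docs at http://localhost:8000/docs for the real endpoints:

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # backend address from the Quick Start

def build_chat_request(question: str) -> urllib.request.Request:
    # NOTE: "/chat" and the "message" field are hypothetical placeholders;
    # consult the FastAPI /docs page for the actual route and payload shape.
    payload = json.dumps({"message": question}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending it would be: urllib.request.urlopen(build_chat_request("..."))
```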

📁 Repository Structure

.
├── backend/            # Python FastAPI application
│   ├── main.py         # Entry point
│   ├── src/            # Application source code
│   └── pyproject.toml  # Python dependencies
├── frontend/           # React application
│   ├── src/            # Frontend source code
│   ├── components/     # UI Components
│   └── package.json    # Node dependencies
└── README.md           # This file

🛠 Tech Stack

  • Backend: Python, FastAPI, LlamaIndex, uv
  • Frontend: React 19, TypeScript, Vite, Tailwind CSS, Bun
  • AI/ML: Ollama (Local LLM), HuggingFace (Embeddings)
