A self-hosted, offline-first RAG (Retrieval-Augmented Generation) chatbot stack for Raspberry Pi. Clone, install, and query your own documents with local AI — no cloud required.
Guide: Building an Offline-First AI Stack on Raspberry Pi
- Ollama — Local LLM inference (qwen3, llama3, phi3)
- ChromaDB — Vector database for semantic search
- Flask — API server with health, ingest, and query endpoints
- nomic-embed-text — Local embedding model
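Under the hood, a query flows through these components in order: embed the question with nomic-embed-text via Ollama, retrieve nearby chunks from ChromaDB, and hand question plus context to the LLM. A minimal sketch of the first and last of those steps, assuming Ollama's default port 11434 (the helper names `build_prompt` and `embed` are illustrative, not the repo's actual code):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default listen address

def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a grounded prompt from retrieved chunks (illustrative format)."""
    context = "\n\n".join(chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Fetch an embedding vector from Ollama's /api/embeddings endpoint."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/embeddings",
        data=json.dumps({"model": model, "prompt": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]
```

The retrieval step in between is a ChromaDB `collection.query` on the embedded question; the Flask endpoints below wrap this whole flow.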
```bash
# Prerequisites: Ollama installed with nomic-embed-text pulled
ollama pull nomic-embed-text
ollama pull qwen3:1.7b

# Clone and install
git clone https://github.com/bitsandbots/rag-chatbot.git
cd rag-chatbot
make install
```
```bash
# Run
make run

# Health check
curl http://localhost:5000/health
```
```bash
# Ingest documents
curl -X POST http://localhost:5000/ingest \
  -H "Content-Type: application/json" \
  -d '{"texts": ["Your document text here"], "ids": ["doc_0"]}'
```
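Real documents are usually too long to ingest as a single string, so it helps to split them into overlapping chunks before posting, one id per chunk. A sketch of that from Python, assuming the same `/ingest` payload shape shown above (`chunk_text` and `ingest` are illustrative helpers, not part of the repo):

```python
import json
import urllib.request

def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (a simple toy chunker)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def ingest(texts: list[str], base_url: str = "http://localhost:5000") -> None:
    """POST chunks to the /ingest endpoint, assigning sequential ids."""
    payload = {"texts": texts, "ids": [f"doc_{i}" for i in range(len(texts))]}
    req = urllib.request.Request(
        f"{base_url}/ingest",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req).close()
```

The overlap keeps sentences that straddle a chunk boundary retrievable from both sides; tune `size` and `overlap` to your documents.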
```bash
# Query
curl -X POST http://localhost:5000/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What does the document say?"}'
```

```bash
make test    # Run tests
make lint    # Lint with ruff
make format  # Format with ruff
```

An optional upgrade script adds a FastAPI backend, streaming responses, PDF ingestion, chat history, and systemd services:

```bash
bash scripts/upgrade.sh
```

- Raspberry Pi 5 (8GB recommended) or Pi 4 (4GB minimum)
- 64GB+ microSD or USB SSD
- Active cooling
This starter stack is designed for learning and demos. When you're ready for a production deployment, check out CoreChat — the full-featured evolution of this project.
CoreChat adds:
- Polished Flask UI with dashboard, document management, and model selector
- Hybrid retrieval (BM25 + vector fusion) for better search quality
- Multi-language prompts with configurable system prompt templates
- Docker & systemd deployment with resource limits for Pi 5
- Pluggable backend — choose between full LlamaIndex engine or lightweight direct-Ollama mode
Your documents are compatible — same nomic-embed-text embeddings and ChromaDB storage. No re-ingestion needed when you migrate.
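Compatibility holds because similarity scores are only meaningful between vectors produced by the same embedding model: your stored nomic-embed-text vectors stay directly comparable to newly embedded queries, so nothing needs to be re-embedded. Vector stores rank results with a distance measure such as cosine similarity, which a toy example makes concrete (this helper is illustrative, not repo code):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Swapping embedding models would change the vector space itself, invalidating every stored score, which is why keeping nomic-embed-text across the migration matters.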
MIT