A local Retrieval-Augmented Generation (RAG) app with:
- CLI query (rag.py)
- Browser UI via FastAPI (web_app.py)
- Streamlit UI (streamlit_app.py)
It uses:
- ChromaDB for vector search
- SentenceTransformers embeddings
- Local Llama GGUF (llama-cpp-python) or OpenAI for generation
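
The components above imply a retrieve-then-generate flow: embed the question, fetch the nearest chunks from the vector index, and pass them to the model as context. The sketch below illustrates that flow with the standard library only. A bag-of-words cosine similarity stands in for SentenceTransformers embeddings and ChromaDB search, and the prompt is built but never sent to a model. All function names are hypothetical and are not taken from rag.py.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the prompt that would be handed to the local or OpenAI model."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"
```

In the real app the toy `embed` and `retrieve` steps would be replaced by the embedding model and a ChromaDB query; only the overall shape is meant to carry over.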
- resources/documents/ - your source files (.txt, .md, .pdf)
- resources/public_demo/ - safe sample docs used for the Streamlit public demo bootstrap
- resources/document_catalog.json - auto-generated catalog (ignored by git)
- resources/DOCUMENT_INDEX.md - auto-generated markdown index (ignored by git)
- chroma_db/ - vector index (ignored by git)
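
To make the layout concrete, here is a minimal sketch of how an indexer might discover source files in resources/documents/. It is an illustration, not the logic of ingest.py: the function name is hypothetical, and recursing into subdirectories and matching extensions case-insensitively are assumptions.

```python
from pathlib import Path

# Extensions this README lists as supported source files.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".pdf"}


def find_source_files(root: str = "resources/documents") -> list[Path]:
    """Return supported files under root, sorted for stable output."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(
        p for p in base.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Placeholder files such as .gitkeep have no suffix and are skipped by the extension check.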
cd RAG
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements_local.txt

Add documents to resources/documents/, then index:
python ingest.py

Ask from CLI:
python rag.py "What is time series forecasting?"

FastAPI web UI:
python web_app.py
Open http://127.0.0.1:8765.

Streamlit UI:
streamlit run streamlit_app.py

Use streamlit_app.py as the app entry point.
This repo includes:
- requirements.txt (cloud-safe, faster deploy)
- requirements_local.txt (full local stack: FastAPI + local Llama)
- runtime.txt pinned to Python 3.11 for dependency compatibility
Recommended public-demo environment/secrets:
- PUBLIC_DEMO_MODE=1
- AUTO_BOOTSTRAP_DEMO=1 (auto-ingest resources/public_demo/ when catalog is empty)
- SHOW_DOC_METADATA=0
- ALLOW_USER_UPLOAD=0
- ALLOW_USER_REINDEX=0
These settings keep document names and chunk text hidden and disable write operations (uploads and reindexing) in public demos.
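
As an illustration of how an app could consume these flags, the sketch below parses them from the environment. The variable names come from this README; the helper names, the accepted truthy spellings, and the fallback values used when a variable is unset are assumptions, not the behavior of streamlit_app.py.

```python
import os


def env_flag(name: str, default: bool, env=None) -> bool:
    """Read a 0/1 style flag; unset or empty values fall back to default."""
    source = os.environ if env is None else env
    raw = source.get(name, "").strip().lower()
    if not raw:
        return default
    return raw in {"1", "true", "yes", "on"}


def load_demo_settings(env=None) -> dict:
    """Collect the demo flags named in this README into one dict."""
    public = env_flag("PUBLIC_DEMO_MODE", False, env)
    return {
        "public_demo": public,
        "auto_bootstrap": env_flag("AUTO_BOOTSTRAP_DEMO", False, env),
        # Assumption: in public mode the restrictive choice is the fallback.
        "show_doc_metadata": env_flag("SHOW_DOC_METADATA", not public, env),
        "allow_upload": env_flag("ALLOW_USER_UPLOAD", not public, env),
        "allow_reindex": env_flag("ALLOW_USER_REINDEX", not public, env),
    }
```

Passing a plain dict as `env` makes the parsing easy to check without touching the real environment.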
If using OpenAI:
- Set OPENAI_API_KEY in Streamlit secrets.
- .gitignore excludes private docs and generated indexes:
  - resources/documents/* (except .gitkeep)
  - resources/document_catalog.json
  - resources/DOCUMENT_INDEX.md
  - .env
  - .streamlit/secrets.toml
- Keep private files only on your local machine or private storage.
Download default GGUF:
python download_llm.py

Then query with the local backend (default), or set LLM_BACKEND=openai to use a cloud model.
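
Backend selection by environment variable can be sketched as follows. LLM_BACKEND, the openai value, and OPENAI_API_KEY come from this README; the function name, the literal value "local" for the default backend, and the error handling are assumptions for illustration.

```python
import os


def resolve_backend(env=None) -> str:
    """Return 'local' or 'openai' based on LLM_BACKEND (local when unset)."""
    source = os.environ if env is None else env
    # Assumption: the default local backend is spelled "local".
    backend = source.get("LLM_BACKEND", "local").strip().lower() or "local"
    if backend not in {"local", "openai"}:
        raise ValueError(f"Unsupported LLM_BACKEND: {backend!r}")
    if backend == "openai" and not source.get("OPENAI_API_KEY"):
        raise RuntimeError("LLM_BACKEND=openai requires OPENAI_API_KEY")
    return backend
```

Failing early on a missing key gives a clearer error than a failed API call in the middle of a query.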