RAG Knowledge System

Production-oriented RAG stack: FastAPI, FAISS + SQLite, OpenAI or local embeddings (sentence-transformers), OpenAI or Ollama for the LLM, and a minimal Streamlit UI.

Requirements

Python 3.13+ (recommended baseline)

Setup

Create and activate a virtual environment first so dependencies stay isolated from the system Python.

Windows (PowerShell)

cd c:\Repos\rag
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -e .
# Optional: pip install -e ".[local]"   # sentence-transformers for EMBEDDING_PROVIDER=local
# Optional: pip install -e ".[ui]"      # Streamlit UI
# Contributors: pip install -e ".[dev]"

Runtime dependencies are in pyproject.toml. Extras: local (local embeddings), ui (Streamlit). dev is for tests and tooling (ruff, mypy, pytest, pytest-cov).

If PowerShell blocks script execution by policy, run this once for the current user:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Linux / macOS

cd /path/to/rag
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
# Optional: pip install -e ".[local,ui]"  # pick the extras you need
# Contributors: pip install -e ".[dev]"

Your prompt should show (.venv).

Configuration

copy .env.example .env   # Windows
# cp .env.example .env   # Unix

Edit .env: OpenAI key (if using EMBEDDING_PROVIDER=openai and/or LLM_PROVIDER=openai), models, relevance threshold, and DATA_DIR / STORAGE_DIR.
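
As a rough pre-flight check for the settings above, the sketch below parses simple KEY=VALUE lines and reports whether an OpenAI key looks necessary. It is not the application's own config loader (that lives in app/). `OPENAI_API_KEY` is an assumed variable name, so confirm it against .env.example. `parse_env` and `needs_openai_key` are hypothetical helpers.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes; inline comments are not handled.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def needs_openai_key(env: dict[str, str]) -> bool:
    """True when either provider is openai but no key is configured."""
    uses_openai = "openai" in (
        env.get("EMBEDDING_PROVIDER", "").lower(),
        env.get("LLM_PROVIDER", "").lower(),
    )
    # OPENAI_API_KEY is an assumed name; check .env.example for the real one.
    return uses_openai and not env.get("OPENAI_API_KEY")
```

To try it, read the file with `Path(".env").read_text(encoding="utf-8")` and pass the result through both functions.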

Run the API

With .venv activated:

uvicorn app.main:app --reload --host 127.0.0.1 --port 8000

Docs: http://127.0.0.1:8000/docs

Run the Streamlit UI

Install the UI extra (pip install -e ".[ui]" or pip install -e ".[dev,ui]" for contributors). In a second terminal, activate .venv, then:

streamlit run ui/streamlit_app.py

Set RAG_API_URL (default http://127.0.0.1:8000) to match the API. If the API enforces RAG_API_KEY, set the same key in the Streamlit process environment.
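
The sketch below shows one way a client process could pick up these two variables and build its request headers. It is an illustration, not the code in ui/streamlit_app.py. `client_settings` is a hypothetical helper. Generating a fresh `X-Request-ID` per call is optional and follows the logging note under Operations notes.

```python
import os
import uuid
from collections.abc import Mapping


def client_settings(env: Mapping[str, str] = os.environ) -> tuple[str, dict[str, str]]:
    """Return (base_url, headers) for calls to the API."""
    # Default mirrors the documented RAG_API_URL default.
    base_url = env.get("RAG_API_URL", "http://127.0.0.1:8000").rstrip("/")
    # A client-supplied request ID makes it easier to find the call in the logs.
    headers = {"X-Request-ID": uuid.uuid4().hex}
    api_key = env.get("RAG_API_KEY", "")
    if api_key:
        # The API also accepts "Authorization: Bearer <key>".
        headers["X-API-Key"] = api_key
    return base_url, headers
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test.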

Tests

Install dev tools: pip install -e ".[dev]".

pytest
ruff format --check app tests ui
ruff check app tests ui
mypy app

Local pytest does not enable coverage by default. CI runs pytest with --cov=app --cov-report=term-missing --cov-fail-under=65 (see .github/workflows/ci.yml). To match CI locally:

pytest --cov=app --cov-report=term-missing --cov-fail-under=65

Layout

pyproject.toml — dependencies, extras (local, ui, dev), ruff/mypy/pytest config
app/ — FastAPI app, ingestion / retrieval / generation services, config
data/ — uploaded files
storage/ — FAISS index and SQLite chunk metadata
embeddings/ — optional model cache (HF_HOME=embeddings in .env)
ui/streamlit_app.py — chat, upload, and document removal

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | `/ingest` | Upload `.txt` / `.pdf` (same filename replaces existing chunks) |
| POST | `/query` | Question → answer with sources and metrics |
| POST | `/query/stream` | SSE stream with sources, token events, and terminal done/error events |
| GET | `/documents` | List indexed documents |
| DELETE | `/documents/{filename}` | Remove document from index and `data/` (safe basename only) |
| GET | `/health` | Service health, vector count, and optional LLM probe |
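
For `/query/stream`, a client has to split the response body into server-sent events. The sketch below is a generic SSE line parser. It assumes the event names arrive in the `event:` field as `sources`, `token`, `done`, and `error`, and that each `token` event carries plain text in `data:`. The real payloads may be JSON, so check the API docs at /docs. `iter_sse_events` and `collect_answer` are hypothetical names.

```python
from collections.abc import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event, data) pairs from the lines of an SSE response body."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line dispatches the pending event, if it has data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def collect_answer(lines: Iterable[str]) -> str:
    """Join token events into one string; stop on done, raise on error."""
    tokens = []
    for event, data in iter_sse_events(lines):
        if event == "token":  # assumed event name
            tokens.append(data)
        elif event == "error":
            raise RuntimeError(data)
        elif event == "done":
            break
    return "".join(tokens)
```

Any iterable of text lines works as input, for example the decoded lines of a streaming HTTP response.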

The same methods are available under the /v1 prefix (for example POST /v1/query). Legacy paths without /v1 remain available and include deprecation headers.
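
A client that wants to move off the legacy paths could normalise its paths to the /v1 form and watch for the deprecation header on responses. This is a minimal sketch, and both helper names are hypothetical.

```python
from collections.abc import Mapping


def versioned_path(path: str, prefix: str = "/v1") -> str:
    """Return the /v1 form of a path, leaving already-versioned paths alone."""
    path = "/" + path.lstrip("/")
    if path == prefix or path.startswith(prefix + "/"):
        return path
    return prefix + path


def is_deprecated(headers: Mapping[str, str]) -> bool:
    """True if the response headers include Deprecation (case-insensitive)."""
    return any(name.lower() == "deprecation" for name in headers)
```

Logging a warning whenever `is_deprecated` returns true is a cheap way to find callers that still use the legacy routes.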

Operations notes

Logs: JSON via structlog; each request logs method, path, status_code, duration_ms; response includes X-Request-ID. Clients (including Streamlit) may send X-Request-ID to correlate with logs.
Optional auth: set RAG_API_KEY to require X-API-Key or Authorization: Bearer on API routes (/health stays open). Control docs/openapi exemption with API_KEY_EXEMPT_DOCS.
Ingest size cap: MAX_INGEST_BYTES (default 20MB); oversize uploads return HTTP 413.
Health LLM probe: set HEALTH_CHECK_LLM=true to include llm_ok / llm_error in GET /health (short outbound check).
CORS: CORS_ALLOW_ORIGINS (* or comma-separated origins) and CORS_ALLOW_CREDENTIALS (must be false when origins contain *).
LLM / OpenAI HTTP timeout: OPENAI_TIMEOUT_SECONDS applies to OpenAI embeddings and chat requests and to Ollama chat requests.
Rate limits: default 60/minute global; endpoint-specific limits include ingest 10/minute and query/query_stream 30/minute.
Legacy route policy: non-versioned routes remain available for compatibility and include Deprecation + Link headers pointing to /v1.
Streamlit exposure: do not expose Streamlit directly on the public internet; place it behind a reverse proxy protected by SSO, a VPN, or an IP allowlist.
Index files: written with faiss.serialize_index for Unicode paths on Windows; older write_index files are loaded via a temporary ASCII path.
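
The optional auth rule in the notes above can be sketched as a pure function. This is an illustration under stated assumptions, not the middleware in app/. `is_authorized` and `OPEN_PATHS` are hypothetical, and treating `/v1/health` as open alongside `/health` is an assumption. The API_KEY_EXEMPT_DOCS behaviour is left out.

```python
import hmac
from collections.abc import Mapping

# Assumption: both the legacy and the /v1 health routes stay open.
OPEN_PATHS = {"/health", "/v1/health"}


def is_authorized(path: str, headers: Mapping[str, str], expected_key: str) -> bool:
    """Accept X-API-Key or Authorization: Bearer when a key is configured."""
    if not expected_key or path in OPEN_PATHS:
        return True  # auth disabled, or the route is exempt
    lowered = {name.lower(): value for name, value in headers.items()}
    candidates = [lowered.get("x-api-key", "")]
    scheme, _, token = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer":
        candidates.append(token.strip())
    # compare_digest avoids leaking key contents through timing differences.
    return any(
        bool(c) and hmac.compare_digest(c.encode(), expected_key.encode())
        for c in candidates
    )
```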

See .env.example for all environment variables.
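
The CORS constraint from the notes above (credentials must be off when the origin list contains `*`) is easy to check at startup. The sketch below shows one way to do that. `parse_cors` is a hypothetical helper, and the application may validate this differently.

```python
def parse_cors(origins_value: str, allow_credentials: bool) -> list[str]:
    """Split CORS_ALLOW_ORIGINS and reject the wildcard-plus-credentials combo."""
    origins = [item.strip() for item in origins_value.split(",") if item.strip()]
    if "*" in origins and allow_credentials:
        raise ValueError(
            "CORS_ALLOW_CREDENTIALS must be false when CORS_ALLOW_ORIGINS contains *"
        )
    return origins
```

Failing fast here matters because browsers refuse credentialed requests when the server answers with a wildcard origin.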

Docker Compose

The api image installs the base package only (OpenAI embeddings by default). To run EMBEDDING_PROVIDER=local in containers, rebuild with the build arg INSTALL_EXTRAS=local (or local,ui if needed).
The ui image is built with INSTALL_EXTRAS=ui so Streamlit is included.

Agent / AI assistants

See AGENTS.md for architecture notes and editing rules for this repository.
