Log ingestion and LLM-based investigation engine. Ingests log files into PostgreSQL (pgvector), retrieves relevant log clusters via hybrid search (BM25 + dense vectors with RRF), and runs a ReAct loop where an LLM autonomously investigates root causes.
repi/
├── cli.py # Typer CLI — lifecycle commands: init, serve, ui, stop
├── worker.py # Background file watcher — polls watcher_configs, ingests new log bytes
├── api/ # FastAPI — ingest, investigate, watchers, config endpoints
├── core/ # Settings (pydantic-settings), DI container, Redis cache
├── ingestion/ # Log parsing → signature clustering → embedding → upsert
├── retrieval/ # pgvector HNSW + PostgreSQL FTS, RRF fusion, query expansion
├── investigation/ # ReAct loop, tools (search_logs, get_timeline, scan_window, get_service_summary), evidence store
├── llm/ # Provider protocol + adapters (OpenAI, Anthropic, Mistral, Gemini, Ollama)
├── intent/ # Natural-language query → service / time-window / log-level extraction
└── models/ # SQLModel tables: log_chunks, investigations, investigation_steps, investigation_chunks, watcher_configs, watcher_offsets
Multi-arch images (linux/amd64 + linux/arm64) are published to GitHub Container Registry on every push to main and every tagged release.
# Pull the latest image
docker pull ghcr.io/varungitgood/repi:latest
# One-shot help check
docker run --rm ghcr.io/varungitgood/repi:latest --help
# Full stack — Postgres + pgvector + Redis + repi API, all in one
OPENAI_API_KEY=sk-... docker compose --profile all-in-one upThe all-in-one profile lives in docker-compose.yml. The API binds to http://localhost:8000 and reads its provider key from your shell env (OPENAI_API_KEY, ANTHROPIC_API_KEY, MISTRAL_API_KEY, or GEMINI_API_KEY). Without --profile all-in-one, the compose file only brings up db + redis — the original behavior used by the local-dev flow below.
Override the image tag (e.g. to pin a release) via REPI_IMAGE=ghcr.io/varungitgood/repi:v0.1.0.
Flow for hacking on repi from a fresh clone. End users who pipx install repi (once D1 lands) call the same commands without the uv run prefix — see /repi/docs for the install-and-run path.
Prerequisites: Docker, Python 3.11+, uv, Node.js (for the web UI).
uv sync # resolve lockfile into .venv
uv run repi init --with-docker # db + redis up, prompts provider/key, writes .repi/config.json, applies schema
uv run repi serve # terminal 1: API on :8000
uv run repi ui # terminal 2: web UI on :3000
uv run repi stop # when done: tear down docker stackrepi has one source of truth: .repi/config.json. It's created by repi init and mutated by the web UI's Config page (PUT /config). The file is gitignored.
Three ways to edit it, in order of convenience:
- Web UI — visit the Config page; changes save immediately and the API hot-reloads them.
- Re-run
repi init --force— re-prompts for provider + key and rewrites the file. - Edit the JSON directly —
.repi/config.jsonis plain JSON; restart the API after editing.
Shell env vars (e.g. REPI_ENV=development uv run repi serve) override values from config.json at runtime — handy for one-off commands or CI.
REPI_ENV defaults to production. Flip to development in .repi/config.json (or via the UI) for verbose logs + --reload.
Coming from an older checkout? If you have a legacy
config.jsonat the repo root, move it:mkdir -p .repi && mv config.json .repi/config.json. The rootconfig.jsonis no longer read.
curl -X POST \
-F "service=my-service" \
-F "file=@/path/to/app.log" \
http://localhost:8000/ingestcurl -X POST http://localhost:8000/investigate \
-H "Content-Type: application/json" \
-d '{"query": "why did checkout fail last friday night"}'Stream the ReAct steps live:
curl -N http://localhost:8000/investigate/{id}/streamThe worker watches directories for new log bytes and ingests them automatically.
- Register a watcher via the API:
curl -X POST http://localhost:8000/watchers \
-H "Content-Type: application/json" \
-d '{"service_name": "auth-svc", "watch_path": "/var/log/auth"}'- Start the worker:
uv run python -m repi.workerThe worker polls watcher_configs every 30s (WATCHER_CONFIG_REFRESH_SECS) and uses watchfiles to detect new bytes, seeking forward from the last stored offset.
| Variable | Default | Description |
|---|---|---|
REPI_ENV |
production |
production | development. Production = quiet logs, no auto-reload. |
DATABASE_URL |
postgresql+asyncpg://lograg_user:password_here@localhost:5432/lograg |
PostgreSQL asyncpg URL |
LLM_PROVIDER |
openai |
openai | anthropic | mistral | gemini | ollama |
LLM_MODEL |
provider default | Override model name |
OPENAI_API_KEY / ANTHROPIC_API_KEY / … |
— | Provider API key |
REDIS_URL |
redis://localhost:6379 |
Redis for caching |
ENABLE_REDIS_CACHE |
true |
Set false to disable Redis |
TIME_WINDOW_INITIAL_MINUTES |
10 |
First search window for investigation |
TIME_WINDOW_EXPANSIONS |
60,360,1440 |
Progressive window expansion (minutes) |
UI_PORT |
3000 |
Port the web UI binds to (read by repi ui) |
WATCHER_CONFIG_REFRESH_SECS |
30 |
How often the worker polls for config changes |
OLLAMA_BASE_URL |
http://localhost:11434 |
Ollama endpoint |
# Run all tests
uv run pytest tests/ -v
# Run worker tests only
uv run pytest tests/worker/ -v
# Run investigation tests
uv run pytest tests/investigation/ -vMIT