Natural language search across tldr-pages CLI documentation.
- No hallucination on commands — every answer is grounded in tldr-pages documentation, so the commands are real and tested
- Structured, consistent output — always a command, an explanation, and a source; no conversational filler
- Ranked alternatives — 3–5 results ordered by commonality, from the standard answer to the edge cases
- No context needed — one input, one output
Example queries:
- "recursively find files modified in the last 24 hours"
- "watch a log file and filter for errors"
- "list open ports on this machine"
- "copy files over SSH"
- "show disk usage sorted by size"
Live at https://container-service-1.gqceswqwzkchr.us-west-2.cs.amazonlightsail.com
- Corpus: tldr-pages (~7,000 pages, MIT licensed)
- Embeddings: sentence-transformers
all-MiniLM-L6-v2(local, no API key required) - Vector store: Chroma
- RAG: LangChain + Gemini 2.5 Flash
- Backend: FastAPI with per-IP rate limiting
- Frontend: Vanilla HTML/JS
1. Ingest the corpus (one-time, re-run to update):
cd ingest
pip install -r requirements.txt
python ingest.py2. Start the backend:
cd backend
pip install -r requirements.txt
GOOGLE_API_KEY=your-key uvicorn main:app --reloadOpen http://localhost:8000.
MIT
