Chat with your local markdown, text, and PDF files using Claude + semantic search.
pip install ask-my-docs# Index a directory of docs
ask init ./docs
# Ask a question (default command)
ask "what is the authentication strategy?"
# Explicit query subcommand
ask query "how do I configure the database?"
# List all indexed files
ask list
# Clear the index
ask clearask-my-docs uses a simple RAG (Retrieval-Augmented Generation) pipeline:
- Index — Scans your docs directory, extracts text from
.md,.txt, and.pdffiles, splits into overlapping chunks, and generates semantic embeddings using a local sentence-transformer model. - Embed — Each chunk is embedded with
all-MiniLM-L6-v2and stored in a FAISS index on disk (~/.ask-my-docs/). - Retrieve — When you ask a question, it is embedded and compared against the FAISS index using inner-product similarity to find the most relevant chunks.
- Answer — The retrieved chunks are sent to Claude (
claude-opus-4-7) as context, which generates a grounded answer with source citations.
| Format | Extension | Notes |
|---|---|---|
| Markdown | .md |
Full text extraction |
| Plain text | .txt |
Full text extraction |
.pdf |
Text extracted via pdfplumber |
┌─────────────────────────────────────────────────────────────────┐
│ ask-my-docs RAG Pipeline │
└─────────────────────────────────────────────────────────────────┘
ask init ./docs
│
▼
┌─────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ File scan │────▶│ Text extraction │────▶│ Chunking │
│ .md .txt │ │ (pdfplumber for │ │ 512-token chunks│
│ .pdf │ │ PDFs) │ │ w/ 64-tok overlap│
└─────────────┘ └──────────────────┘ └────────┬─────────┘
│
▼
┌──────────────────┐
│ SentenceTransf. │
│ all-MiniLM-L6-v2│
│ embeddings │
└────────┬─────────┘
│
▼
┌──────────────────┐
│ FAISS IndexFlatIP│
│ ~/.ask-my-docs/ │
│ index.faiss │
└──────────────────┘
ask "your question"
│
▼
┌─────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ Query │────▶│ Embed query │────▶│ FAISS search │
│ (user text)│ │ (same model) │ │ top-K chunks │
└─────────────┘ └──────────────────┘ └────────┬─────────┘
│
▼
┌──────────────────┐
│ Claude API │
│ claude-opus-4-7 │
│ + context chunks│
└────────┬─────────┘
│
▼
┌──────────────────┐
│ Answer + Sources│
└──────────────────┘
Set your Anthropic API key before querying:
export ANTHROPIC_API_KEY="your-key-here"