A CLI tool and library for indexing and semantically searching document repositories. Converts documents of various formats into searchable vector embeddings using ChromaDB and Docling, enabling fast semantic search across personal knowledge bases, research collections, and document archives.
uv sync

# Add a repository
uv run researcher repo add my-docs ~/Documents --file-types md,txt,pdf
# Index the repository
uv run researcher index my-docs
# Search
uv run researcher search "machine learning concepts" --repo my-docs
# Check status
uv run researcher status
# Show configuration
uv run researcher config show

researcher repo add <name> <path> [--file-types md,txt,pdf] [--embedding-provider chromadb]
researcher repo remove <name>
researcher repo list

researcher index [<repo-name>] # Index one or all repositories
researcher remove <repo> <doc-path> # Remove a document from the index
researcher status [<repo-name>] # Show index statistics

researcher search <query> [--repo <name>] [--fragments 10] [--documents 5] [--mode documents]

researcher config show
researcher config set <key> <value>
researcher config path

researcher init # Install Claude Code skills into .claude/skills/
researcher init --force # Overwrite existing skill files

This copies the bundled researcher-admin and researcher-find skills into your project's .claude/skills/ directory so Claude Code can discover them automatically. After running init, configure the MCP server in .claude/settings.json:
{
"mcpServers": {
"researcher": {
"command": "researcher",
"args": ["serve"]
}
}
}

researcher serve # Start in STDIO mode (for Claude Code)
researcher serve --port 8392 # Start in HTTP mode

~/.researcher/
  config.yaml
  repositories/
    <repo-name>/
      chroma/ # ChromaDB vector store
      checksums.json # Incremental indexing cache
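
The checksums.json cache suggests that indexing skips files whose content has not changed since the last run. Below is a minimal sketch of that idea, assuming the cache is a JSON object mapping relative file paths to SHA-256 hex digests. The cache format and the function names (`file_digest`, `changed_files`, `update_cache`) are hypothetical and are not taken from researcher's source.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_cache(cache_file: Path) -> dict:
    """Load the digest cache, or return an empty one if it does not exist yet."""
    return json.loads(cache_file.read_text()) if cache_file.exists() else {}


def changed_files(repo_root: Path, cache_file: Path, extensions: set[str]) -> list[Path]:
    """List files under repo_root that are new or differ from the cached digest.

    Assumption: the cache looks like {"relative/path.md": "<sha256 hex>"}.
    """
    cache = load_cache(cache_file)
    changed = []
    for path in sorted(repo_root.rglob("*")):
        # Only consider regular files with a configured extension (e.g. md, txt, pdf).
        if not path.is_file() or path.suffix.lstrip(".") not in extensions:
            continue
        rel = path.relative_to(repo_root).as_posix()
        if cache.get(rel) != file_digest(path):
            changed.append(path)
    return changed


def update_cache(repo_root: Path, cache_file: Path, files: list[Path]) -> None:
    """Record fresh digests for files that were just (re)indexed."""
    cache = load_cache(cache_file)
    for path in files:
        cache[path.relative_to(repo_root).as_posix()] = file_digest(path)
    cache_file.write_text(json.dumps(cache, indent=2))
```

In this sketch, an indexer would call `changed_files`, embed only the returned paths, then call `update_cache` so that the next run skips them.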
| Provider | Description | Requirements |
|---|---|---|
| chromadb (default) | Built-in embeddings, zero config | None |
| ollama | Local Ollama instance | Ollama running locally |
| openai | OpenAI API | OPENAI_API_KEY env var |
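
The requirements in the table can be checked before indexing starts. The following is a rough sketch of such a pre-flight check, not part of researcher itself: `PROVIDER_ENV_VARS` and `missing_env_vars` are hypothetical names, and only the environment-variable requirement is verified (whether Ollama is actually running is out of scope here).

```python
import os

# Environment variables each provider needs, per the table above.
# ollama needs a running local instance rather than an env var, so its list is empty.
PROVIDER_ENV_VARS = {
    "chromadb": [],
    "ollama": [],
    "openai": ["OPENAI_API_KEY"],
}


def missing_env_vars(provider: str, env=None) -> list[str]:
    """Return the environment variables a provider needs that are unset or empty."""
    env = os.environ if env is None else env
    if provider not in PROVIDER_ENV_VARS:
        raise ValueError(f"unknown embedding provider: {provider!r}")
    return [name for name in PROVIDER_ENV_VARS[provider] if not env.get(name)]
```

Passing `env` explicitly keeps the function easy to test; by default it reads the real process environment.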
researcher-cli exposes an MCP server with the following tools:
- search_documents: semantic document search
- search_fragments: semantic fragment search
- add_to_index: index a specific file
- remove_from_index: remove a document
- list_repositories: list configured repos
- get_index_status: index statistics
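
MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds that message for `search_documents`. The argument names `query` and `repo` are assumptions mirroring the CLI flags, not the tool's published schema, and `tool_call_request` is a hypothetical helper. A real client would also complete the MCP `initialize` handshake before sending it.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Assumed argument names; check the server's tool listing for the real schema.
request = tool_call_request(
    1,
    "search_documents",
    {"query": "machine learning concepts", "repo": "my-docs"},
)
```

The actual input schema of each tool can be discovered at runtime with the MCP `tools/list` request.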