An on-device search engine for everything you need to remember. Index your Markdown notes, meeting transcripts, documentation, and knowledge bases. Search with keywords or natural language. Ideal for agentic workflows.
SNIP combines BM25 full-text search, deterministic offline vector search, and a hybrid reranking pipeline. Single binary, fully offline by default.
- ⚡ Fast local search with BM25 + vectors
- 🔒 Fully offline by default
- 🧠 Hybrid fusion with reranking
- 📦 Single binary (macOS/Linux/Windows)
Homebrew:
brew install yindia/homebrew-yindia/snip
Install script:
curl -fsSL https://raw.githubusercontent.com/yindia/snip/refs/heads/main/install.sh | sh
Manual build:
go build ./cmd/snip
# create collections for notes, meetings, and docs
snip collection add ~/notes --name notes
snip collection add ~/Documents/meetings --name meetings
snip collection add ~/work/docs --name docs
# add context to improve results
snip context add snip://notes "Personal notes and ideas"
snip context add snip://meetings "Meeting transcripts and notes"
snip context add snip://docs "Work documentation"
# index content
snip update
# generate embeddings for semantic search (offline)
snip embed
# search across everything
snip search "project timeline" # fast keyword search
snip vsearch "how to deploy" # semantic search
snip query "quarterly planning process" # hybrid + reranking (best quality)
# get a specific document
snip get "meetings/2024-01-15.md"
# get a document by docid (shown in search results)
snip get "#abc123def4567890"
# get multiple documents by glob pattern
snip get "journals/2025-05*.md"
# search within a specific collection
snip search "API" -c notes
# export all matches for an agent
snip search "API" --all --files --min-score 0.3
SNIP's --json and --files output formats are designed for agentic workflows:
# structured results
snip search "authentication" --json -n 10
# list all relevant files above a threshold
snip query "error handling" --all --files --min-score 0.4
# retrieve full document content
snip get "docs/api-reference.md"
Example pipeline with a local LLM (Ollama):
# search, then summarize with a local model
snip query "payment retries" --json -n 8 | \
ollama run mistral "Summarize key findings and cite docids."
SNIP can run an MCP server over STDIO for agent integrations:
snip mcp
By default, snip mcp uses the same index directory as your last CLI run.
Override with --index or SNIP_INDEX_DIR if you keep multiple indexes.
Tools exposed:
- snip_search - 🔍 Fast BM25 keyword search
- snip_vsearch - 🧠 Semantic vector search (requires embeddings)
- snip_query - 🧬 Hybrid search with fusion + reranking
- snip_get - 📄 Retrieve one or many documents (path/docid, glob, or list)
- snip_status - 📊 Index stats and collection info
Resources:
snip://{collection}/{relative/path} - returns document text with line numbers
Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json):
{
"mcpServers": {
"snip": {
"command": "snip",
"args": ["mcp"]
}
}
}
Claude Code configuration (~/.claude/settings.json):
{
"mcpServers": {
"snip": {
"command": "snip",
"args": ["mcp"]
}
}
}
┌─────────────────────────────┐
│ SNIP Hybrid Search Pipeline │
└─────────────────────────────┘

                ┌────────────┐
                │ User Query │
                └─────┬──────┘
                      │
           ┌──────────┴───────────┐
           ▼                      ▼
┌────────────────────┐  ┌─────────────────────┐
│  Query Expansion   │  │ Original Query (×2) │
│ (deterministic     │  └──────────┬──────────┘
│  variants)         │             │
└─────────┬──────────┘             │
          │ 1-2 alternatives       │
          └───────────┬────────────┘
                      ▼
            ┌──────────────────┐
            │ Query Set (N<=3) │
            └────────┬─────────┘
                     │
     ┌───────────────┼────────────────┐
     ▼               ▼                ▼
┌───────────┐  ┌───────────────┐  ┌───────────┐
│BM25 (FTS5)│  │ Vector Search │  │BM25 (FTS5)│
└─────┬─────┘  └───────┬───────┘  └─────┬─────┘
      └────────────────┼────────────────┘
                       ▼
        ┌─────────────────────────────┐
        │ RRF Fusion + Top-Rank Bonus │
        │ original query weighted ×2  │
        │ keep top 30 candidates      │
        └──────────────┬──────────────┘
                       ▼
           ┌──────────────────────┐
           │   Local Re-ranking   │
           │  (yzma or overlap)   │
           └──────────┬───────────┘
                      ▼
           ┌──────────────────────┐
           │ Position-Aware Blend │
           │  Top 1-3:  75% RRF   │
           │  Top 4-10: 60% RRF   │
           │  Top 11+:  40% RRF   │
           └──────────────────────┘
| Backend | Raw Score | Normalization | Range |
|---|---|---|---|
| FTS (BM25) | SQLite FTS5 BM25 | Min-max within result set | 0.0 to 1.0 |
| Vector | Cosine similarity | Min-max within result set | 0.0 to 1.0 |
| Reranker | Yzma-based LLM or lexical overlap | Min-max within result set | 0.0 to 1.0 |
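All three backends are reduced to the same 0.0-1.0 scale before fusion. A minimal Go sketch of min-max normalization within a single result set (the handling of an all-equal score list is an assumption for illustration, not documented SNIP behavior):

```go
package main

import "fmt"

// minMaxNormalize rescales raw backend scores into [0, 1] within one
// result set, as the table above describes for FTS, vector, and
// reranker scores.
func minMaxNormalize(scores []float64) []float64 {
	lo, hi := scores[0], scores[0]
	for _, s := range scores {
		if s < lo {
			lo = s
		}
		if s > hi {
			hi = s
		}
	}
	out := make([]float64, len(scores))
	if hi == lo {
		// Degenerate set: every score identical (assumed behavior).
		for i := range out {
			out[i] = 1.0
		}
		return out
	}
	for i, s := range scores {
		out[i] = (s - lo) / (hi - lo)
	}
	return out
}

func main() {
	// BM25-style raw scores: best raw score maps to 1.0, worst to 0.0.
	fmt.Println(minMaxNormalize([]float64{-3.1, -1.2, -0.4}))
}
```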
The query command uses Reciprocal Rank Fusion (RRF) with position-aware blending:
- Query Expansion: Original query (×2 weighting) + 1-2 deterministic variants
- Parallel Retrieval: Each query searches both FTS and vector indexes
- RRF Fusion: Combine all result lists using score = Σ(1/(k + rank + 1)) where k = 60
- Top-Rank Bonus: Documents ranking #1 in any list get +0.05, #2-3 get +0.02
- Top-K Selection: Take top 30 candidates for reranking
- Re-ranking: Yzma-based LLM reranker, fallback to lexical overlap
- Position-Aware Blending:
- RRF rank 1-3: 75% retrieval, 25% reranker
- RRF rank 4-10: 60% retrieval, 40% reranker
- RRF rank 11+: 40% retrieval, 60% reranker
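The fusion and blending math above can be sketched in Go; this is an illustrative reimplementation of the formulas, not SNIP's actual source:

```go
package main

import "fmt"

// rrfFuse combines ranked result lists with Reciprocal Rank Fusion:
// each document accumulates weight * 1/(k + rank + 1) per list, plus a
// flat top-rank bonus (+0.05 for rank #1, +0.02 for ranks #2-3).
// Lists retrieved for the original query are passed with weight 2.
func rrfFuse(lists [][]string, weights []float64, k int) map[string]float64 {
	scores := make(map[string]float64)
	for i, list := range lists {
		for rank, doc := range list {
			scores[doc] += weights[i] / float64(k+rank+1)
			switch {
			case rank == 0:
				scores[doc] += 0.05 // ranked #1 in this list
			case rank <= 2:
				scores[doc] += 0.02 // ranked #2-3 in this list
			}
		}
	}
	return scores
}

// blendWeight returns the retrieval (RRF) share of the final score for
// a 1-indexed RRF rank; the remainder comes from the reranker.
func blendWeight(rank int) float64 {
	switch {
	case rank <= 3:
		return 0.75
	case rank <= 10:
		return 0.60
	default:
		return 0.40
	}
}

func main() {
	fts := []string{"a", "b", "c"} // BM25 list for the original query
	vec := []string{"b", "d", "a"} // vector list for the original query
	scores := rrfFuse([][]string{fts, vec}, []float64{2, 2}, 60)
	fmt.Println(scores["b"] > scores["d"], blendWeight(1))
}
```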
- Go 1.24+ toolchain for building from source
- SQLite is embedded via Go (no external dependency)
- Optional: llama.cpp shared libraries for yzma-based models
# create a collection from current directory
snip collection add . --name myproject
# create a collection with explicit path and custom file extensions
# extensions match files recursively in any subdirectory
snip collection add ~/repo --name repo --extension go --extension py
# match all markdown files recursively
snip collection add ~/repo --name repo --extension md
# indexing respects .gitignore files under the collection root
# list all collections
snip collection list
# remove a collection
snip collection remove myproject
# rename a collection
snip collection rename myproject my-project
# list files in a collection
snip ls notes
snip ls notes/subfolder
# embed all indexed documents (800 tokens/chunk, 15% overlap)
snip embed
# force re-embed everything
snip embed -f
Context adds descriptive metadata to collections and paths, helping search understand your content.
# add context to a collection (using snip:// virtual paths)
snip context add snip://notes "Personal notes and ideas"
snip context add snip://docs/api "API documentation"
# add context using an absolute path
snip context add ~/notes/work "Work-related notes"
# list all contexts
snip context list
# remove context
snip context rm snip://notes/old
┌────────────────────────────────────────────────────────┐
│                      Search Modes                      │
├─────────┬──────────────────────────────────────────────┤
│ search  │ BM25 full-text search only                   │
│ vsearch │ Vector semantic search only                  │
│ query   │ Hybrid: FTS + Vector + Expansion + Reranking │
└─────────┴──────────────────────────────────────────────┘
# full-text search (fast, keyword-based)
snip search "authentication flow"
# vector search (semantic similarity)
snip vsearch "how to login"
# hybrid search with reranking (best quality)
snip query "user authentication"
# search options
-n <num> # number of results (default: 5)
-c, --collection # restrict search to a specific collection
--all # search all collections
--min-score <num> # minimum score threshold (0.0-1.0)
--full # include full document content
--line-numbers # add line numbers to output
# output formats
--files # output file paths only
--json # JSON output
--csv # CSV output
--md # Markdown output
--xml # XML output
Example output:
Software Craftsmanship (score: 0.9300)
docs/guide.md [a1b2c3]
context: Work documentation
This section covers the craftsmanship of building
quality software with attention to detail.
Index stored at: ~/.cache/snip/index.sqlite (respects XDG_CACHE_HOME).
Schema:
collections -- Indexed directories with name and extensions
path_contexts -- Context descriptions by virtual path (snip://...)
documents -- Markdown content with metadata and docid (16-char hash)
documents_fts -- FTS5 full-text index
content_vectors -- Embedding chunks (hash, seq, pos, 800 tokens each)

Collection ──► Extensions ──► Markdown Files ──► Parse Title ──► Hash Content
                                    │                                │
                                    ▼                                │
                             Generate docid                          │
                   (16-char hash of collection+path)                 │
                                    │                                │
                                    └──────────► Store in SQLite ◄───┘
                                                        │
                                                        ▼
                                                   FTS5 Index
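The docid step above can be sketched as follows; note that SHA-256 is an assumption made for illustration — the document only specifies a deterministic 16-character hash of collection+path, not the hash function:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// docid derives a stable 16-character identifier from a document's
// collection name and relative path. SHA-256 truncated to 16 hex
// characters is assumed here for illustration.
func docid(collection, relPath string) string {
	sum := sha256.Sum256([]byte(collection + "/" + relPath))
	return hex.EncodeToString(sum[:])[:16]
}

func main() {
	// The same collection+path always yields the same docid.
	fmt.Println(docid("meetings", "2024-01-15.md"))
}
```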
Documents are chunked into 800-token pieces with 15% overlap:
Document ──► Chunk (800 tokens) ──► Hash Embedder ──► Store Vectors
                   │
                   └──► Chunks stored with:
                        - hash: document hash
                        - seq: chunk sequence (0, 1, 2...)
                        - pos: token position in original
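With 800-token chunks and 15% overlap, each new chunk starts 680 tokens after the previous one. A small Go sketch of the chunk-boundary arithmetic (the rounding of the overlap and the shape of the final short chunk are assumptions, not documented behavior):

```go
package main

import "fmt"

// chunkBounds returns [start, end) token offsets for fixed-size chunks
// with fractional overlap, e.g. 800-token chunks with 15% overlap.
func chunkBounds(totalTokens, chunkSize int, overlap float64) [][2]int {
	stride := chunkSize - int(float64(chunkSize)*overlap) // 800 - 120 = 680
	var bounds [][2]int
	for start := 0; start < totalTokens; start += stride {
		end := start + chunkSize
		if end > totalTokens {
			end = totalTokens // final chunk may be shorter
		}
		bounds = append(bounds, [2]int{start, end})
		if end == totalTokens {
			break
		}
	}
	return bounds
}

func main() {
	// A 2000-token document yields chunks starting at pos 0, 680, 1360.
	for seq, b := range chunkBounds(2000, 800, 0.15) {
		fmt.Printf("seq=%d pos=%d end=%d\n", seq, b[0], b[1])
	}
}
```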
Query ──► Expansion ──► [Original, Variant 1, Variant 2]
                                    │
                         for each query, in parallel
                      ┌─────────────┴─────────────┐
                      ▼                           ▼
                 FTS (BM25)                 Vector Search
                      │                           │
                      ▼                           ▼
                 Ranked List                 Ranked List
                      │                           │
                      └─────────────┬─────────────┘
                                    ▼
                        RRF Fusion (k=60)
                        Original query ×2 weight
                        Top-rank bonus: +0.05 for #1, +0.02 for #2-3
                                    │
                                    ▼
                          Top 30 candidates
                                    │
                                    ▼
                 Local Re-ranking (yzma or overlap)
                                    │
                                    ▼
                       Position-Aware Blend
                       Rank 1-3:  75% RRF / 25% reranker
                       Rank 4-10: 60% RRF / 40% reranker
                       Rank 11+:  40% RRF / 60% reranker
                                    │
                                    ▼
                           Final Results
SNIP reads YAML or JSON config (if present) and environment variables. Flags always override config.
Default config path (if present):
~/.config/snip/config.yaml
Config keys:
- index_dir
- embed_model
- rerank_model
- expand_model
- model_cache_dir
- llama_lib_path
- model (legacy alias for embed_model)
- debug
- no_color
Env vars:
- SNIP_INDEX_DIR
- SNIP_EMBED_MODEL
- SNIP_RERANK_MODEL
- SNIP_EXPAND_MODEL
- SNIP_MODEL_CACHE_DIR
- SNIP_LLAMA_LIB
- SNIP_MODEL (legacy alias for SNIP_EMBED_MODEL)
- SNIP_DEBUG
SNIP supports local GGUF models via yzma (llama.cpp without CGO). The flow is:
- Build SNIP with yzma support.
- Install llama.cpp shared libraries.
- Download GGUF models into a cache directory.
- Point SNIP at the libraries + models using config or env vars.
Build with the yzma tag:
go build -tags yzma ./main.go
Install llama.cpp libs and download models:
yzma install --lib ~/.cache/snip/llama
yzma model get -u https://huggingface.co/ggml-org/embeddinggemma-300M-GGUF/resolve/main/embeddinggemma-300M-Q8_0.gguf -o ~/.cache/snip/models
yzma model get -u https://huggingface.co/ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/resolve/main/qwen3-reranker-0.6b-q8_0.gguf -o ~/.cache/snip/models
yzma model get -u https://huggingface.co/tobil/qmd-query-expansion-1.7B-gguf/resolve/main/qmd-query-expansion-1.7B-q4_k_m.gguf -o ~/.cache/snip/models
Configure SNIP (config file or env vars). Example ~/.config/snip/config.yaml:
llama_lib_path: ~/.cache/snip/llama
model_cache_dir: ~/.cache/snip/models
embed_model: hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf
rerank_model: hf:ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf
expand_model: hf:tobil/qmd-query-expansion-1.7B-gguf/qmd-query-expansion-1.7B-q4_k_m.gguf
Or set env vars instead:
export SNIP_LLAMA_LIB=~/.cache/snip/llama
export SNIP_MODEL_CACHE_DIR=~/.cache/snip/models
export SNIP_EMBED_MODEL=hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf
export SNIP_RERANK_MODEL=hf:ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf
export SNIP_EXPAND_MODEL=hf:tobil/qmd-query-expansion-1.7B-gguf/qmd-query-expansion-1.7B-q4_k_m.gguf
If models or the llama.cpp libraries are missing, SNIP falls back to hash embeddings and lexical reranking. To force pure-Go mode, set embed_model: hash (or SNIP_MODEL=hash).
SNIP is heavily inspired by https://github.com/tobi/qmd/tree/main.