Local semantic search for the command line. Store, search, and delete text using vector embeddings — no backend, no API keys, everything stays on your machine.
Semantic search tools today assume you have infrastructure: a vector database to run, an embedding API to call, credentials to manage. That's fine for production systems, but overkill when all you need is a lightweight way to index and recall text — especially for AI agents that benefit from long-term memory.
semlocal is a single binary you install with npm and run immediately. Embeddings are generated locally via ONNX Runtime, stored in a SQLite file, and searched with brute-force cosine similarity. No servers, no API keys, no Docker containers. Just a CLI that reads and writes to disk.
Use it to give agents persistent, searchable memory without any compute or infrastructure overhead.
Install with npm:

```
npm install -g semlocal
```

Write an entry:

```
semlocal write "Rust is a systems programming language focused on safety"
# prints: a1b2c3d4-e5f6-7890-abcd-ef1234567890
```

You can also pipe text from another command or a file:

```
cat README.md | semlocal write
echo "hello world" | semlocal write -
```

Search:

```
semlocal search "safe low-level language"
# [0.87] a1b2c3d4-... Rust is a systems programming language focused on safety
```

Return results as JSON:

```
semlocal search "safe low-level language" --json --top 3
```

Delete an entry by ID:

```
semlocal delete a1b2c3d4-e5f6-7890-abcd-ef1234567890
```

Entries are organized into collections. If not specified, all operations use the default collection. Use --collection to partition your data:
```
semlocal write "Rust is fast" --collection languages
semlocal search "performance" --collection languages
semlocal delete a1b2c3d4-... --collection languages
```

Collections are created implicitly on first write and removed automatically when their last entry is deleted.
By default the index is stored in .semlocal/ in the current working directory. Use --src to change this:

```
semlocal write "hello world" --src ~/my-index
semlocal search "greeting" --src ~/my-index
```

semlocal uses FastEmbed (ONNX Runtime) to generate 384-dimensional vector embeddings with the all-MiniLM-L6-v2 model. Embeddings are stored in a local SQLite database. Search is performed via brute-force cosine similarity over all stored entries.
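The brute-force search described above can be sketched in a few lines. The following Python snippet is an illustration of the technique, not semlocal's actual implementation; the `entries` data and the toy 3-dimensional vectors are hypothetical stand-ins for the real 384-dimensional embeddings loaded from SQLite:

```python
import math

def cosine(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def search(query_vec, entries, top=5):
    # entries: list of (id, vector) pairs, e.g. loaded from the index.
    # Score every stored vector against the query, keep the best `top`.
    scored = [(cosine(query_vec, vec), entry_id) for entry_id, vec in entries]
    scored.sort(reverse=True)
    return scored[:top]

# Hypothetical toy data standing in for stored embeddings.
entries = [
    ("rust-entry", [0.9, 0.1, 0.0]),
    ("cooking-entry", [0.0, 0.2, 0.9]),
]
print(search([1.0, 0.0, 0.0], entries, top=1))
# The "rust-entry" vector scores highest for this query.
```

Because every entry is scored on every query, search cost grows linearly with the number of stored entries, which is a reasonable trade-off for a local, infrastructure-free index.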
The embedding model (~25 MB) is downloaded automatically on first use and cached in ~/.semlocal/models/. Subsequent runs start instantly.
Pre-built binaries are provided for:
| OS | x64 | arm64 |
|---|---|---|
| Linux | ✓ | ✓ |
| macOS | ✓ | |
| Windows | ✓ | |
MIT