Local-first wiki CLI for AI agents. Persistent, searchable knowledge bases built from Markdown files with hybrid semantic + keyword search.
Inspired by Karpathy's LLM-wiki pattern — the LLM incrementally builds and maintains a structured wiki that compounds over time.
- Stores knowledge as interlinked Markdown files (source of truth)
- SQLite for metadata indexing, FTS5 for keyword search, sqlite-vec for semantic search
- Local embeddings via bge-base-en-v1.5 (768 dims, no cloud, no API keys, runs on Apple Silicon); switchable to gte-base-en-v1.5 for 8K-token long context
- Hybrid search combines BM25 + cosine similarity via Reciprocal Rank Fusion
- CLI for agents + HTTP transport for remote KB access
- Multiple isolated wikis per user (work, personal, research, etc.)
- Agent skill system with workflows for ingestion, search, updates, and maintenance
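For intuition, here is a minimal sketch of how Reciprocal Rank Fusion merges a BM25 ranking with a cosine-similarity ranking. The function, the doc ids, and the k = 60 smoothing constant are illustrative assumptions, not the project's internals:

```typescript
// Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per doc,
// so a doc ranked well by BOTH keyword and vector search rises to the top.
// k = 60 is the conventional default (an assumption here).
function rrfFuse(ftsRanking: string[], vecRanking: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of [ftsRanking, vecRanking]) {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// "docker-basics" tops both lists, so it wins the fused ranking:
const fused = rrfFuse(
  ["docker-basics", "kubernetes", "compose"],
  ["docker-basics", "kubernetes", "helm"],
);
```

Note that RRF only needs ranks, not raw scores, which is why it can fuse BM25 and cosine similarity without normalizing their incompatible scales.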
npm install -g kb-wiki

First run downloads the embedding model (~200 MB for the default bge-base-en-v1.5, cached in ~/.kb/.models/). See Embedding models below if you want to swap to the long-context variant.
Prebuilt binaries exist for all three native dependencies (better-sqlite3, sqlite-vec, onnxruntime-node) on five OS/arch combinations:
| Platform | Status |
|---|---|
| macOS arm64 (Apple Silicon) | confirmed |
| macOS x64 (Intel) | should work, untested |
| Linux x64 | should work, untested |
| Linux arm64 | should work, untested |
| Windows x64 | should work, untested |
| Windows arm64 | not supported (no prebuilds) |
Untested = the architecture is portable and all three native deps publish working prebuilt binaries for the platform, but the project has only been exercised end-to-end on macOS arm64 so far. If you run into a platform-specific issue, please file it on the issue tracker.
pnpm refuses to run the install scripts of transitive native dependencies (better-sqlite3, onnxruntime-node) unless you've approved them. Without that step, kb will fail at startup with "Could not locate the bindings file". Two ways to handle it:
# Recommended: approve native builds once, then install.
pnpm approve-builds -g # interactive — say yes to better-sqlite3 + onnxruntime-node
pnpm add -g kb-wiki
# If you already installed and hit the bindings error, force a rebuild:
pnpm rebuild -g

kb wiki create my-wiki # create a wiki
kb wiki use my-wiki # set as default
# --dry-run lints the content first without writing — recommended for agents.
kb add --title "Docker Basics" --category concepts --tags "docker,containers" \
--content "Docker packages applications into containers..." --dry-run
kb add --title "Docker Basics" --category concepts --tags "docker,containers" \
--content "Docker packages applications into containers..."
kb search "container orchestration"
kb read docker-basics.md
kb related docker-basics
kb lint

To bind a wiki to a specific project directory:
kb setup --agents claude # Claude Code: skill + slash commands + CLAUDE.md memo
kb setup --agents cursor # Cursor: .mdc rules + AGENTS.md memo
kb setup --all # all agents (project-local)
kb setup --agents claude --global # user-scope: ~/.claude/CLAUDE.md
kb setup --agents codex --global # user-scope: ~/.codex/AGENTS.md

Setup writes a short "kb exists, run kb skill" memo into CLAUDE.md (for Claude) or AGENTS.md (for other agents), wrapped in <!-- kb-cli:start --> ... <!-- kb-cli:end --> markers. Re-running kb setup rewrites the block in place; delete the block manually to uninstall.
Project-local also creates kb.config.json:
{
"wiki": "my-project-wiki"
}

Any kb command run within this directory (or subdirectories) will use that wiki by default.
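The "this directory or subdirectories" behavior implies a walk up the directory tree to the nearest kb.config.json. A hypothetical sketch of that lookup (the real CLI's logic may differ; the exists predicate is injected so the sketch stays testable without touching the filesystem):

```typescript
import { posix as path } from "node:path";

// Walk from a starting directory up to the filesystem root, returning the
// first kb.config.json found, or null if none exists on the way up.
function findProjectConfig(
  start: string,
  exists: (p: string) => boolean,
): string | null {
  let dir = start;
  while (true) {
    const candidate = path.join(dir, "kb.config.json");
    if (exists(candidate)) return candidate;
    const parent = path.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}
```

Nearest-config-wins is the usual design choice here (as with .git or package.json): a nested project can override its parent's wiki binding.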
| Agent | Project install | Global install |
|---|---|---|
| Claude Code | .claude/skills/ + .claude/commands/ + CLAUDE.md memo | ~/.claude/skills/ + ~/.claude/commands/ + ~/.claude/CLAUDE.md memo |
| Cursor | .cursor/rules/kb.mdc + AGENTS.md memo | n/a (workspace-only) |
| Codex CLI | AGENTS.md memo | ~/.codex/AGENTS.md memo |
| Cline | AGENTS.md memo | n/a (workspace-only) |
| Windsurf | AGENTS.md memo | n/a (workspace-only) |
| Continue.dev | .continue/rules/kb.md + AGENTS.md memo | n/a (workspace-only) |
kb search <query> Hybrid search (--mode hybrid|fts|vec, --limit, --format json)
kb read <file> Read document (--lines, --meta, --links, --follow); alias: kb get
kb resolve <arg> Resolve any handle (id, .md, ./path, full path) → canonical id + suggestions
kb add Add document (--content/--file/--stdin, --dry-run, --format json)
kb update <id> Update document (--content/--file/--stdin/--append, --dry-run, --format json)
kb delete <id> Delete document
kb rename <old> <new> Rename with automatic link updates
kb list List documents (--category, --tag, --format json)
kb categories List categories in use
kb related <id> Find semantically similar documents
kb lint [--fix] Check integrity + retrievability (--format json)
kb reindex [<id>] Rebuild index — whole wiki, or a single doc by id
kb toc Table of contents
kb schema [update] Show / regenerate wiki schema
kb log [add] Recent activity log; `add` records an agent session entry
kb migrate Upgrade local schema/embeddings (--dry-run, --yes, --wiki)
kb status Show local environment status (server, config, wikis)
kb wiki create/list/use/delete/info Manage wikis
kb config get/set/list Configuration
kb skill [workflow] Show agent instructions (ingest/search/update/lint)
kb setup Install agent integrations
kb serve [--port 4141 --secret <s> --detached --log <path> --stop] HTTP API server
kb remote add/remove/list/connect Manage remote KBs
kb remote attach/detach/wikis Manage remote wiki access
Run any command with --help to see full option details.
Every command that takes a doc handle (read, update, delete, rename, related, reindex, resolve) accepts any of: bare id (foo), filename form (foo.md), markdown-link form (./foo.md), full path (/Users/me/.kb/wiki/docs/foo.md), or any case (FOO.md). All forms normalize to the canonical lowercase id internally — same effect, no duplicate index rows. Use kb resolve <arg> when an id doesn't match: it returns the canonical form, file existence, and fuzzy suggestions.
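The normalization described above amounts to roughly the following (an illustrative sketch, not the project's actual resolver, which also does fuzzy matching):

```typescript
// Reduce any accepted doc handle to its canonical lowercase id:
// strip any path prefix, drop the .md extension, lowercase the rest.
function canonicalId(handle: string): string {
  const base = handle.split("/").pop() ?? handle; // last path segment
  return base.replace(/\.md$/i, "").toLowerCase();
}
```

All four handle forms from the paragraph above collapse to the same id, which is what prevents duplicate index rows.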
Docs are chunked by heading at index time and each chunk is searched independently, so structure matters. The following checks help you keep docs retrievable; all are surfaced by kb lint and by kb add / kb update (on every write, with --dry-run to preview without writing or indexing):
| Warning | Threshold | Meaning |
|---|---|---|
| chunk-merge | section body <160 chars, or >50% link syntax | Section will auto-merge into the previous chunk |
| long-paragraph | paragraph >1500 chars | Can't be subdivided, risks embedding truncation |
| doc-too-short | <200 words | Centroid embedding is noisy |
| doc-too-long | >1500 words | Split into linked sub-docs |
Three frontmatter fields provide per-document opt-outs; all are matched case-insensitively:
important_sections: # prevent auto-merge of these short-but-critical sections
- TL;DR
- Status
suppress_merge_warn: # let the merge happen, just stop warning about it
- See Also
suppress_lint: # silence doc-level soft warnings
- doc-too-short # e.g. an intentionally short index page

Structural errors (broken links, missing frontmatter) are never suppressible.
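The chunk-merge heuristic can be sketched from the thresholds in the table above. The function name is illustrative, and the way link syntax is measured is an assumption; only the 160-char and 50% thresholds come from the docs:

```typescript
// A section merges into the previous chunk when its body is too short
// to embed meaningfully, or when it is mostly link syntax (e.g. a
// "See Also" list) rather than prose.
function shouldMergeSection(body: string): boolean {
  if (body.trim().length < 160) return true;
  // Count characters inside standard markdown links [text](target).
  const linkChars = [...body.matchAll(/\[[^\]]*\]\([^)]*\)/g)]
    .reduce((sum, m) => sum + m[0].length, 0);
  return linkChars / body.length > 0.5;
}
```

This is why important_sections exists: a deliberately short TL;DR section would otherwise trip the first condition and merge away.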
Every kb invocation that touches the index loads the embedding model into memory (~200 MB for bge-base-en-v1.5, ~2-3 seconds on Apple Silicon). Across many commands per session, that adds up.
If you start kb serve in the background, every subsequent kb command on the same machine auto-detects the running server and routes through it — the model stays warm in the server process. Typical search latency drops from ~2-3s to ~50-150ms.
# Start the server in the background
kb serve --detached
# Optional: capture the server's stdout/stderr to a file
kb serve --detached --log /tmp/kb-server.log
# Inspect what's running and which model the server has loaded
kb status
# When you're done, stop it
kb serve --stop

No flags are needed on the routed commands themselves — kb search, kb read, kb add, kb update, kb reindex, etc. all detect the server transparently. kb status shows whether routing is active and surfaces any mismatch between the server's loaded model and the current config.json (a restart picks up config changes).
Notes:
- The server binds to 127.0.0.1; no auth is required by default for purely local use. If you pass --secret <token>, every routed call sends it as a Bearer token automatically.
- A coordination file at ~/.kb/.serve.json records port / pid / model. It's removed cleanly on kb serve --stop and on SIGINT/SIGTERM.
- kb migrate refuses to run while the server is up (concurrent schema mutation would be unsafe) — stop the server first.
- Only one local server at a time. A second kb serve exits with a clear error pointing at the running one.
Connect to remote kb instances (servers running kb serve) to access shared team knowledge.
Access control uses a shared secret — both server and client must know the same string. The secret is sent as a Bearer token on every request. This is minimal access control; granular permissions and proper token management are planned for a future release.
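The auth scheme is plain Bearer-token HTTP. A client-side sketch of what a routed request carries (the /search path and q parameter are assumptions for illustration; only the Bearer-token scheme comes from the docs):

```typescript
// Build an authenticated request against a remote kb server. Every
// routed call attaches the shared secret as a Bearer token.
function buildSearchRequest(baseUrl: string, secret: string, query: string): Request {
  const url = new URL("/search", baseUrl);
  url.searchParams.set("q", query);
  return new Request(url, {
    headers: { Authorization: `Bearer ${secret}` },
  });
}
```

Anyone holding the secret has full access, which is why the docs call this minimal access control.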
# On the server machine
kb serve --port 4141 --secret my-shared-secret
# On your machine
kb remote add team --url http://server:4141 --secret my-shared-secret
kb remote connect team # verify connection
kb remote wikis team # list available wikis
kb remote attach team docs # attach "docs" wiki locally
kb remote attach team notes --alias tnotes # attach with alias (avoids name conflicts)
# Now use it like any local wiki
kb search "query" --wiki docs
kb add --title "..." --wiki docs --content "..."
kb wiki list # shows local + remote wikis
# Disconnect
kb remote detach docs
kb remote remove team # unregisters (remote data preserved)

Remote wikis are transparent — all commands work the same whether the target wiki is local or remote. Use --wiki <name> to target a specific one, or set it as default with kb wiki use <name>.
kb remote create-wiki team new-wiki # create wiki on remote
kb remote delete-wiki team old-wiki --force # delete on remote (destructive!)

~/.kb/
├── config.json global config
├── remotes.json remote KB registrations
├── .models/ cached embedding model
├── my-wiki/
│ ├── docs/ markdown files (flat, no subdirs)
│ │ ├── docker-basics.md
│ │ └── kubernetes.md
│ ├── index.db SQLite (metadata + FTS + vectors + links)
│ └── schema.md wiki structure & conventions
└── another-wiki/
├── docs/
├── index.db
└── schema.md
- Markdown files are the source of truth — human-readable, git-friendly, Obsidian-compatible
- SQLite is a derived index, rebuildable via kb reindex
- Links use standard Markdown format: [text](./filename.md) — works in Obsidian graph view, VS Code, GitHub
- Categories are free-form strings in frontmatter (not directories)
- Embeddings computed locally via ONNX (see Embedding models)
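Because links are plain [text](./filename.md) markdown, the link graph can be rebuilt from the files alone. A sketch of collecting a doc's outbound wiki links (the function name and exact regex are illustrative, not the project's implementation):

```typescript
// Collect targets of standard markdown links pointing at sibling docs,
// i.e. the [text](./filename.md) form the design notes describe.
function extractWikiLinks(markdown: string): string[] {
  const links: string[] = [];
  for (const m of markdown.matchAll(/\[[^\]]*\]\(\.\/([\w-]+\.md)\)/g)) {
    links.push(m[1]);
  }
  return links;
}
```

Since this needs only the markdown files, the links table in index.db stays a derived artifact, fully recoverable by kb reindex.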
kb-wiki ships with two supported local embedding models. Both produce 768-dim vectors stored in the per-wiki vec0 index.
| Model | Default? | Dims | Context | Strength |
|---|---|---|---|---|
| Xenova/bge-base-en-v1.5 | yes | 768 | 512 tokens | Best general-purpose quality for short-to-medium docs (MTEB ~64) |
| Alibaba-NLP/gte-base-en-v1.5 | no | 768 | 8192 tokens | Same quality tier, much longer context; pick this if your docs are long (1000+ words) |
Both run fully local via ONNX through @huggingface/transformers. No API keys, no network after first download.
kb config list # see current setting
kb config get embeddingModel # current model name
kb config set embeddingModel Alibaba-NLP/gte-base-en-v1.5

The setting is global (lives in ~/.kb/config.json), not per-wiki. Only the two model names above are accepted — any other value is rejected on the next embed call.
The vector index for each wiki was built with the previous model's embeddings. Vectors from different models live in different vector spaces and can't be compared — semantic search will degrade or return nonsense until you re-embed:
kb reindex --wiki my-wiki # re-embed one wiki
# repeat for each wiki you want to update

kb reindex drops the FTS index and the vec0 shadow table, then re-walks all markdown files. The first invocation after a switch triggers a fresh download of the new model (~200 MB → ~/.kb/.models/).
Both supported models are 768-dim, so the vec0 schema doesn't change and you can switch back and forth at will (just remember to reindex each time). Wikis still indexed with the previous model won't error — they'll just return poor results until reindexed.
bge-base-en-v1.5 is among the strongest base-size general-purpose embedders available as an ONNX port in the Hugging Face ecosystem — strong on MTEB benchmarks and well-suited to the kind of mixed prose/identifier content most personal wikis contain. gte-base-en-v1.5 matches it on quality but extends the context window to 8K tokens, which matters if you write long-form docs that exceed the 512-token cap (otherwise content past the cap gets truncated before embedding).
If you want to add another model to the allowlist, see src/services/embedding.service.ts — ALLOWED_EMBEDDING_MODELS is a single tuple. Adding a model with a different dim count would also require updating the @db.search.vector annotation in src/models/document.as and migrating existing wikis.
- Node.js 22+, TypeScript, ESM
- moostjs CLI + HTTP framework
- atscript-db with SQLite adapter
- @huggingface/transformers for local embeddings
- sqlite-vec for vector search
- rolldown bundler
MIT — Artem Maltsev