v0.9.2 — Local-only embeddings; BGE-large default (paired with agentic-harness v2.4.1)
Patch — embedding-mode collapse + default model upgrade. Drops the Voyage/Anthropic API embedding mode entirely; local sentence-transformers is now the only production mode. Default model upgraded from all-MiniLM-L6-v2 (384-d, MTEB English 56.3) to BAAI/bge-large-en-v1.5 (1024-d, MTEB English 64.2). EMBEDDING_DIM bumped 384 → 1024. Triggered by ROADMAP item #18 (inserted mid-flight of plan #7a part 5 / seed-pass on 2026-05-20). Implemented as plan #18 (7 tasks across 8 toolkit commits). Paired with agentic-harness v2.4.1 (doc-only on the harness side).
Why this shape: the primary operator is a Claude Ultra subscriber without a separate Anthropic / Voyage API key — the API path was unreachable for the toolkit's actual user. Dual-mode added surface area (mode resolution, env-var contract, dim-truncation, two test paths) without value for the personal-dev-env use case. Modern small-to-mid local models (BGE-large family, mxbai, nomic-embed) deliver near-SOTA MTEB results on desktop-class hardware (M-series + 64GB RAM) — the quality gap that motivated dual-mode is no longer load-bearing. Plan #18 was inserted mid-flight of plan #7a part 5 (seed-pass) because task 6 (validate via sample recalls) needs a worthwhile embedding model for validation signal to be meaningful; seed-pass resumes at task 6 with the new model after this release pair ships.
Decision rationale + 4 load-bearing assumptions with re-audit triggers in ADR 0001's 2026-05-20 amendment (operator decision: amend rather than write new ADR 0007). The parent MemoryVault design doc body was rewritten in-place across 12 substantive references to match the v0.9.2 state; Document History row 10 captures the rewrite scope.
Added
- `AGENT_TOOLKIT_EMBEDDING_MODEL` env var escape hatch in `skills/memory/scripts/embed.py`: operators on low-spec hosts swap the BGE-large default for a smaller local model (e.g. `all-MiniLM-L6-v2`) without code changes. Still local-only; no API option ever.
- `rebuild` subcommand in `skills/memory/scripts/vec_index.py`: drops `entries` virtual table + `entry_meta` table + recreates at current `EMBEDDING_DIM`. Preserves the embedding queue file. Returns stats dict (`old_dim`, `new_dim`, `entries_dropped`, `queue_preserved`) or `{skipped: true, ...}` for graceful-skip. Exit 0 on success / exit 2 on graceful-skip (matches `size` pattern).
- Dim-mismatch detection in `vec_index.py`'s `_open_index()`: introspects existing virtual-table schema via `sqlite_master` + the new `_DIM_REGEX`; on mismatch prints `[vec_index] dim mismatch ... rebuild required: python3 vec_index.py rebuild --vault-path <path>` to stderr + closes conn + returns None (graceful-skip; never blocks the prompt). Same path fires from `drain_queue` so dim-mismatch surfaces there too.
- `requirements.txt` at repo root with the canonical Python dep list: `pyyaml>=6.0`, `sqlite-vec>=0.1.0`, `sentence-transformers>=2.0`. Comments document manual install + PEP 668 escape (`--break-system-packages`) + virtualenv pattern.
- `--no-python-deps` / `-NoPythonDeps` flag in `install.sh` + `install.ps1`: escape hatch for operators who manage Python deps via virtualenv / conda / system packages, or for CI to avoid the ~1.3GB sentence-transformers download per workflow run.
- `install_python_deps()` function in `install.sh` (+ `Install-PythonDeps` in `install.ps1`): best-effort pip-install of `requirements.txt` after the customization install loop. Idempotent quick-path checks importability before attempting install. Non-fatal failure with operator-facing hint for PEP 668 systems.
- Local-mode integration test in `scripts/smoke-install-{bash.sh,pwsh.ps1}` guarded by `SKIP_LOCAL_MODE_INTEGRATION` env var (set by all 3 OS CI workflows to skip the BGE-large download). Operators with sentence-transformers installed run the test locally: it invokes `embed.py --mode local`, asserts 1024-d JSON list, asserts all numeric values.
- ADR 0001's 2026-05-20 amendment (43 new lines): the v0.9.2 amendment block following the existing 2026-05-17 amendment shape. WHY narrowing + WHY NOT 4 alternatives + 5 operational changes + 4 load-bearing assumptions with re-audit triggers.
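
The dim-mismatch detection listed above could look roughly like the following. This is a minimal sketch, not the shipped `_open_index()`: it assumes the vector column is declared in sqlite-vec's `float[N]` form, and the regex body, the helper names (`parse_dim`, `stored_dim`, `dims_match`), and the stderr wording are all hypothetical. Only `EMBEDDING_DIM = 1024`, the `entries` table name, and the use of `sqlite_master` come from the notes.

```python
import re
import sqlite3
import sys
from typing import Optional

EMBEDDING_DIM = 1024  # value stated in the release notes

# Assumption: the stored CREATE statement declares the column as `float[N]`.
# The real _DIM_REGEX is not shown in the notes; this pattern is a guess.
_DIM_REGEX = re.compile(r"float\[(\d+)\]", re.IGNORECASE)


def parse_dim(create_sql: Optional[str]) -> Optional[int]:
    """Extract N from a `float[N]` declaration, or None if absent."""
    match = _DIM_REGEX.search(create_sql or "")
    return int(match.group(1)) if match else None


def stored_dim(conn: sqlite3.Connection, table: str = "entries") -> Optional[int]:
    """Read the table's original CREATE statement and parse its dimension."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return parse_dim(row[0]) if row else None


def dims_match(conn: sqlite3.Connection, expected: int = EMBEDDING_DIM) -> bool:
    """Return False (and warn on stderr) when the index was built at another dim.

    A missing table counts as a match: there is nothing to conflict with yet.
    """
    found = stored_dim(conn)
    if found is not None and found != expected:
        # Illustrative wording only; the real message is elided in the notes.
        print(f"dim mismatch: index={found}, expected={expected}", file=sys.stderr)
        return False
    return True
```

A caller following the graceful-skip pattern would close the connection and return `None` when `dims_match` is false, rather than raising.
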
Changed
- `skills/memory/scripts/embed.py` rewrite (109 ins / 115 del; 216 lines net). Removed: `_embed_api()` function + `_VOYAGE_ENDPOINT` / `_VOYAGE_MODEL` constants + dim-truncation logic + `_resolve_mode()`'s API branch + `MEMORY_USE_API_EMBEDDINGS` / `VOYAGE_API_KEY` / `ANTHROPIC_API_KEY` env var reads. Added: `_DEFAULT_LOCAL_MODEL = "BAAI/bge-large-en-v1.5"`, `EMBEDDING_DIM = 1024`, `_resolve_model()` with `AGENT_TOOLKIT_EMBEDDING_MODEL` env var override, informative `ValueError` for `"api"` invocations pointing at v0.9.2 + ADR amendment. CLI `--mode` choices reduced from `["api","local","stub"]` to `["local","stub"]`.
- `skills/memory/scripts/vec_index.py` (209 ins / 18 del). Schema dim 384 → 1024. `_open_index()` extended with dim-mismatch detection. `drain_queue()` switched to use `_open_index()` as gating probe so dim-mismatch surfaces from drain. New `rebuild_index()` function + `rebuild` CLI subcommand.
- `skills/memory/scripts/recall.py` + `vec_index.py` CLI `--mode` choices reduced from `["api","local","stub"]` to `["local","stub"]` (alignment with `embed.py`).
- `install.sh` + `install.ps1` install Python deps from `requirements.txt` by default (was: not installed at all; operators followed wiki docs to install manually). Same default-on-with-opt-out pattern as `--no-pre-push-hook`.
- MemoryVault design doc body rewritten in-place across 12 substantive references (overview, infrastructure, recall engine, dependencies, tech debt #2 + #9, security network surface, reliability, privacy opt-out, latency budgets, project management § DD #7, operations monitoring). Each rewritten section cross-links to ADR 0001's amendment. Document History row 10 captures the rewrite scope. Old dual-mode narrative preserved only in the pre-existing 2026-05-15 / 2026-05-16 / 2026-05-17 Document History rows as historical record.
- `wiki/how-to/Use-The-Memory-Skill.md` updates: Prereqs callout updated; § Embedding mode (was "Embedding modes" plural) rewritten with BGE-large + model swap escape hatch; troubleshooting entries for "embedding skipped" + "embedding unavailable" updated for local-only state; offline-capable recall paragraph updated.
- Design doc parts files (`write-primitives.md`, `recall-loop.md`) updated to match v0.9.2 state: references to `memory.use_api_embeddings` flag + Anthropic API replaced with single-mode local sentence-transformers narrative + ADR amendment cross-refs.
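
To make the `embed.py` changes above concrete, here is a hedged sketch of the model override and the rejected `"api"` mode. The constant value and the env var name are taken from the notes; the function names `resolve_model` and `check_mode`, their signatures, and the error wording are assumptions for illustration and may differ from the real `_resolve_model()`.

```python
import os
from typing import Mapping, Optional

_DEFAULT_LOCAL_MODEL = "BAAI/bge-large-en-v1.5"   # from the release notes
_MODEL_ENV_VAR = "AGENT_TOOLKIT_EMBEDDING_MODEL"  # from the release notes
_VALID_MODES = ("local", "stub")                  # CLI choices after v0.9.2


def resolve_model(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the override from the environment, else the BGE-large default.

    `env` is injectable so the function can be tested without touching
    os.environ (an assumption of this sketch, not a documented feature).
    """
    env = os.environ if env is None else env
    override = env.get(_MODEL_ENV_VAR, "").strip()
    return override or _DEFAULT_LOCAL_MODEL


def check_mode(mode: str) -> str:
    """Accept 'local' or 'stub'; fail loudly on the removed 'api' mode."""
    if mode == "api":
        # Hypothetical wording; the notes only say the error points at
        # v0.9.2 and the ADR amendment.
        raise ValueError(
            "embedding mode 'api' was removed in v0.9.2; "
            "see the 2026-05-20 amendment to ADR 0001"
        )
    if mode not in _VALID_MODES:
        raise ValueError(f"unknown embedding mode: {mode!r}")
    return mode
```

Treating an empty or whitespace-only override as unset is a choice made in this sketch; it keeps an accidentally exported blank variable from selecting a nonexistent model.
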
Internal
- Smoke install bash + pwsh tests updated: all 5 `install.sh` + 5 `install.ps1` invocations pass `--no-python-deps` / `-NoPythonDeps` so CI doesn't pay the ~1.3GB sentence-transformers download × 5 install scenarios × 3 OS per workflow run. Default-mode-resolution test changed from `'api'` expectation to `'local'`; new tests for v0.9.2 `--mode api` `ValueError`, `EMBEDDING_DIM=1024`, `AGENT_TOOLKIT_EMBEDDING_MODEL` escape hatch, stub-mode 1024-d output, `rebuild` subcommand happy + graceful-skip paths, `_DIM_REGEX` parse correctness.
- 8 commits across plan #18: `222fea6` (embed.py refactor) + `6f0383b` + `ce5b110` (task-1 CI fixups) + `18941ae` + `fb83437` (task-2 vec_index.py + CI fixup) + `4a9c74a` (task-3 local-mode integration test) + `6633943` (task-4 install scripts) + `1b956f2` (task-5 ADR amendment) + this v0.9.2 release commit.
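
The output checks mentioned for the stub-mode and local-mode tests (a 1024-d JSON list with all-numeric values) can be expressed as a small validator. This is an illustrative sketch only: the real assertions live in the smoke scripts, and the helper name `check_embedding_json` is hypothetical.

```python
import json
from typing import List, Union

Number = Union[int, float]


def check_embedding_json(text: str, expected_dim: int = 1024) -> List[Number]:
    """Parse embedder output and verify it is a numeric list of the right size."""
    vec = json.loads(text)
    if not isinstance(vec, list):
        raise AssertionError(f"expected a JSON list, got {type(vec).__name__}")
    if len(vec) != expected_dim:
        raise AssertionError(f"expected {expected_dim} values, got {len(vec)}")
    for i, value in enumerate(vec):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise AssertionError(f"value at index {i} is not numeric: {value!r}")
    return vec
```

In a test, the input would be the stdout captured from an `embed.py --mode stub` or `--mode local` run; how that output is captured is left out here because the notes do not describe it.
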