v1.10.0

@anton-core-plugin-publisher anton-core-plugin-publisher released this 18 Jun 22:41

Added

  • CLAUDE.md routing-fragment auto-apply + sole applier. A new internal/fragment package and core fragment apply / core fragment status verbs are now the single writer/reader of the user's global ${CLAUDE_CONFIG_DIR:-$HOME/.claude}/CLAUDE.md routing fragment — sentinel-bounded splice (<!-- anton-core:start/end -->), atomic write, fragment.version read-back verify, --dry-run preview, and a FRAGMENT_APPLIED audit-events row. The SessionStart hook auto-applies a newer shipped fragment, bumps the pin, and announces via the envelope's top-level systemMessage ("Routing fragment updated X → Y."); on failure it degrades to a visible warning and never blocks (exit 0). The setup skill (Stage-2/Update/Repair) now calls the verbs instead of doing its own Read/Edit, eliminating the dual-applier drift risk and the hardcoded ~/.claude path. ADR-0041 records the trust-boundary decision.
  • Session-start context now reaches the model. The SessionStart hook delivers its surfaced tasks / improvements / recall reminder through hookSpecificOutput.additionalContext (the documented model-context channel) — previously the JSON envelope had no recognized hook field and was inert.
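
The sentinel-bounded splice described above can be sketched roughly as follows. This is a minimal illustration, not the internal/fragment implementation: the function name spliceFragment is hypothetical, the two marker strings are assumed expansions of the abbreviated "<!-- anton-core:start/end -->", and the atomic write and fragment.version read-back are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed expansions of the abbreviated "<!-- anton-core:start/end -->" sentinel.
const (
	startMarker = "<!-- anton-core:start -->"
	endMarker   = "<!-- anton-core:end -->"
)

// spliceFragment (hypothetical name) replaces the sentinel-bounded region of
// doc with fragment, or appends a new bounded region when doc has none.
// Text outside the sentinels is never modified.
func spliceFragment(doc, fragment string) (string, error) {
	block := startMarker + "\n" + strings.TrimRight(fragment, "\n") + "\n" + endMarker

	start := strings.Index(doc, startMarker)
	if start < 0 {
		if strings.Contains(doc, endMarker) {
			return "", fmt.Errorf("end marker without start marker")
		}
		if doc == "" {
			return block + "\n", nil
		}
		if !strings.HasSuffix(doc, "\n") {
			doc += "\n"
		}
		return doc + "\n" + block + "\n", nil
	}

	rel := strings.Index(doc[start:], endMarker)
	if rel < 0 {
		return "", fmt.Errorf("start marker without end marker")
	}
	end := start + rel + len(endMarker)
	return doc[:start] + block + doc[end:], nil
}

func main() {
	doc := "# My notes\n\n" + startMarker + "\nold routing\n" + endMarker + "\n\nfooter\n"
	out, err := spliceFragment(doc, "new routing\n")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Refusing to write when only one marker is present keeps a hand-edited file from being silently truncated. Applying the same fragment twice yields the same file, which is the property a single-applier design relies on.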

Changed

  • The local embedder is now ON by default (embedder.enabled seeds true; migration 0018_core_embedder_enabled_default_on.sql flips pre-existing installs from the old false). Recall gains its vector-similarity arm out of the box instead of staying FTS-only until manually enabled. The model is loaded lazily: the adapter is bound at startup, but the model is fetched and loaded only on the first command that actually embeds (recall, item save, maintenance reindex). Commands that never embed (tasks add, report health) pay nothing, and a fresh install does not download the ~127 MB model until a vector is genuinely needed. A load/acquire failure degrades that embed to FTS (the stub sentinel) rather than failing the command, and is logged once per process as embedder.load.failed (Warn) so the silent degrade stays diagnosable; a deliberately disabled embedder logs nothing. core health reports the vec-arm state ("staged" / "enabled — run core system warm --target embedder" / "stub") from a cheap on-disk stat without loading the model. To light up an existing corpus after upgrading, run maintenance reindex --target knowledge (and --target code); new writes embed automatically. Set embedder.enabled=false to opt out. A new CORE_EMBEDDER_OFFLINE=1 environment variable forbids the network model download (degrading to FTS when the model is absent). The test Make targets set it so the suite never pulls the model, and it is also usable on air-gapped hosts.
  • Fragment-version drift no longer emits a stderr banner (confirmed dead at SessionStart exit 0). internal/hooks.CheckFragmentDrift (banner-returning) is replaced by structured EvaluateFragmentDrift; the dotted-numeric version compare consolidates into internal/fragment.CompareVersions; "older than pinned" warns without writing (no downgrade). Spec §04-paths-and-config, acceptance A-plugin-8/9/10 + A-setup-1/2/3, and the 08-hooks side-effects updated accordingly.
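
As an illustration of the dotted-numeric compare and the "no downgrade" rule above, here is a minimal sketch. It is not the actual internal/fragment.CompareVersions: the names compareVersions and decide, the action strings, and the choice to treat missing trailing components as 0 are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// component parses the i-th dotted component; a missing component counts as 0
// (an assumption, so that "1.2" equals "1.2.0").
func component(parts []string, i int) (uint64, error) {
	if i >= len(parts) {
		return 0, nil
	}
	v, err := strconv.ParseUint(parts[i], 10, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid version component %q", parts[i])
	}
	return v, nil
}

// compareVersions (hypothetical) returns -1, 0 or 1 as a is older than,
// equal to, or newer than b. Components compare numerically, so 1.10.0 > 1.9.3.
func compareVersions(a, b string) (int, error) {
	as, bs := strings.Split(a, "."), strings.Split(b, ".")
	n := len(as)
	if len(bs) > n {
		n = len(bs)
	}
	for i := 0; i < n; i++ {
		x, err := component(as, i)
		if err != nil {
			return 0, err
		}
		y, err := component(bs, i)
		if err != nil {
			return 0, err
		}
		if x < y {
			return -1, nil
		}
		if x > y {
			return 1, nil
		}
	}
	return 0, nil
}

// decide (hypothetical) maps the comparison to an action: a newer shipped
// fragment is applied, an older one only warns and never writes.
func decide(shipped, pinned string) (string, error) {
	c, err := compareVersions(shipped, pinned)
	if err != nil {
		return "", err
	}
	switch c {
	case 1:
		return "apply", nil
	case -1:
		return "warn", nil
	}
	return "none", nil
}

func main() {
	action, err := decide("1.10.0", "1.9.3")
	if err != nil {
		panic(err)
	}
	fmt.Println(action)
}
```

Comparing components as integers rather than strings is the point of the exercise: a lexical compare would rank "1.9.3" above "1.10.0".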

Fixed

  • Embedder batch reindex no longer exhausts the inference graph cache. The internal/embed forward pass now pads input_ids/attention_mask/token_type_ids up to bucketed sequence lengths ({64,128,256,512}) before inference, so every forward pass shares one of ≤4 JIT-compiled SimpleGo graphs instead of recompiling per unique token length. A long-lived maintenance reindex --target {knowledge,code} previously accumulated >32 distinct sequence lengths and died at the 33rd with "maximum cache size of 32 reached"; per-save (fresh process, ~1 shape) was unaffected. Pad positions carry the tokenizer's [PAD] id and attention-mask 0 so the softmax ignores them, and CLS pooling reads position 0, whose representation is invariant to masked trailing pads, so embeddings are unchanged from the prior unpadded output (cosine ≈ 1.0, gated by a new embedmodel-tagged padding-identity test). On the batch path, maintenance reindex now pre-compiles the embedder's bucket graphs before its item loop, so no early item pays a mid-loop JIT stall (interactive commands are unaffected: WarmUp stays load-only).
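
The bucketing idea can be sketched as below. This is an illustration under stated assumptions, not the internal/embed code: bucketLen and padToBucket are hypothetical names, the pad id is passed in by the caller rather than read from a tokenizer, and token_type_ids (which would be padded the same way) are omitted for brevity.

```go
package main

import "fmt"

// Bucketed sequence lengths taken from the release notes.
var buckets = []int{64, 128, 256, 512}

// bucketLen returns the smallest bucket that fits n tokens, or false when
// n exceeds the largest bucket.
func bucketLen(n int) (int, bool) {
	for _, b := range buckets {
		if n <= b {
			return b, true
		}
	}
	return 0, false
}

// padToBucket (hypothetical) pads ids up to its bucket length with padID and
// builds the attention mask: 1 for real tokens, 0 for pad positions.
func padToBucket(ids []int64, padID int64) (padded, mask []int64, err error) {
	size, ok := bucketLen(len(ids))
	if !ok {
		return nil, nil, fmt.Errorf("sequence length %d exceeds largest bucket %d",
			len(ids), buckets[len(buckets)-1])
	}
	padded = make([]int64, size)
	mask = make([]int64, size)
	copy(padded, ids)
	for i := range padded {
		if i < len(ids) {
			mask[i] = 1
		} else {
			padded[i] = padID
		}
	}
	return padded, mask, nil
}

func main() {
	// Many distinct input lengths collapse onto at most len(buckets) shapes.
	shapes := map[int]bool{}
	for n := 1; n <= 512; n += 7 {
		padded, _, err := padToBucket(make([]int64, n), 0)
		if err != nil {
			panic(err)
		}
		shapes[len(padded)] = true
	}
	fmt.Println("distinct padded lengths:", len(shapes))
}
```

Because a shape-specialized graph cache is keyed by input shape, bounding the set of padded lengths bounds the number of compiled graphs, whatever mix of token counts a long reindex encounters.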