Replaced misleading memory recall labeling with LanceDB-backed vector recall
evidence. Butler now separates semantic_similarity, lexical_match, contextual_match, graph, task, and explicit-memory evidence in recall
diagnostics.
Added production hot-cache vector backfill through butler cognition memory maintain --hot-cache-backfill-only --json, so existing hot-cache memory can
be indexed through the same CLI path used by the real runtime.
Reworked recall ranking away from anonymous fixed weighted sums and added a
real-data A/B gate against the legacy keyword-weighted baseline. The latest
gate showed 4 legacy-baseline lifts, 0 legacy regressions, 0 top decoys, 0
positive misses, and 0 negative false positives on the Butler data snapshot.
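As a rough illustration of how such a gate can count lifts and regressions, here is a minimal sketch. The `AbCase` shape, the rank-based definition of a "lift", and the pass rule are assumptions made for this example, not Butler's actual gate logic.

```typescript
// Hypothetical case shape: rank of the expected memory under each ranker
// (1 = top result, null = not retrieved at all).
type AbCase = { query: string; legacyRank: number | null; candidateRank: number | null };

type AbSummary = { lifts: number; regressions: number; passed: boolean };

// Treat "not retrieved" as worse than any real rank.
const rankOrInfinity = (rank: number | null): number => rank ?? Number.POSITIVE_INFINITY;

function summarizeAbGate(cases: AbCase[]): AbSummary {
  let lifts = 0;
  let regressions = 0;
  for (const c of cases) {
    const legacy = rankOrInfinity(c.legacyRank);
    const candidate = rankOrInfinity(c.candidateRank);
    if (candidate < legacy) lifts += 1; // candidate ranks the target higher
    else if (candidate > legacy) regressions += 1; // candidate ranks it lower
  }
  // Assumed pass rule: any regression against the legacy baseline fails the gate.
  return { lifts, regressions, passed: regressions === 0 };
}
```

A real gate would also track decoys, positive misses, and negative false positives; those are left out here because their exact definitions are not given in these notes.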
Hardened work-state persistence with locked atomic file-state writes for
planned task state, with concurrent TaskStore and PlannedTaskStore E2E
contention evidence.
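The general pattern behind a locked atomic write can be sketched as follows. This is an illustrative example only: `writeStateAtomically`, the `.lock` and `.tmp` naming, and the JSON encoding are assumptions, and Butler's actual stores may work differently.

```typescript
import {
  closeSync, existsSync, fsyncSync, mkdtempSync, openSync,
  readFileSync, renameSync, unlinkSync, writeSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: take an exclusive lock file, write to a temp file,
// then rename the temp file over the target in one step.
function writeStateAtomically(path: string, state: unknown): void {
  const lockPath = `${path}.lock`;
  const tempPath = `${path}.${Math.random().toString(36).slice(2)}.tmp`;
  // "wx" throws EEXIST if another writer holds the lock; a caller would retry.
  const lockFd = openSync(lockPath, "wx");
  try {
    const fd = openSync(tempPath, "w");
    try {
      writeSync(fd, JSON.stringify(state));
      fsyncSync(fd); // flush the data to disk before it becomes visible
    } finally {
      closeSync(fd);
    }
    // A rename within one directory replaces the target atomically on POSIX,
    // so readers see either the old state or the new one, never a partial file.
    renameSync(tempPath, path);
  } finally {
    closeSync(lockFd);
    unlinkSync(lockPath);
  }
}

// Usage: write twice to a scratch directory and read the last state back.
const scratchDir = mkdtempSync(join(tmpdir(), "state-demo-"));
const statePath = join(scratchDir, "planned-task.json");
writeStateAtomically(statePath, { status: "planned" });
writeStateAtomically(statePath, { status: "running" });
const finalState = JSON.parse(readFileSync(statePath, "utf8"));
const lockLeftBehind = existsSync(`${statePath}.lock`);
```

The sketch does not handle stale locks left by a crashed writer or clean up a temp file after a failed write; a production version would need both.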
Improved the agent loop so repeated identical failed tool calls can stop
early without discarding a successful alternate tool result from the same
safe parallel batch.
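One way to picture this behavior is the sketch below. The types, the `reviewBatch` helper, and the repeat threshold are hypothetical and only illustrate the idea of stopping on a repeated failure while keeping successful results from the same batch.

```typescript
type ToolCall = { tool: string; args: Record<string, unknown> };
type ToolResult = { call: ToolCall; ok: boolean; output: string };

// Two calls count as identical when the tool and serialized args match.
// Note: JSON.stringify is key-order sensitive, which is acceptable for a sketch.
function callKey(call: ToolCall): string {
  return `${call.tool}:${JSON.stringify(call.args)}`;
}

function reviewBatch(
  results: ToolResult[],
  failureCounts: Map<string, number>,
  maxRepeats = 2, // assumed threshold, not a documented Butler setting
): { kept: ToolResult[]; stopEarly: boolean } {
  let stopEarly = false;
  for (const result of results) {
    if (result.ok) continue;
    const key = callKey(result.call);
    const count = (failureCounts.get(key) ?? 0) + 1;
    failureCounts.set(key, count);
    if (count >= maxRepeats) stopEarly = true;
  }
  // Successful results are always kept, even when the loop stops early.
  return { kept: results.filter((result) => result.ok), stopEarly };
}
```

Keeping the failure counts in a map that outlives a single batch is what lets the loop recognize the same failing call across iterations.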
Fixed packaged Butler App startup so the desktop client attaches to an
already healthy local app gateway, avoids Electron-owned gateway pid-file
contention, and resolves the managed Bun runtime under Finder-like macOS
launch environments.
Added dedicated Butler App release artifacts. Version-tag releases now build
a signed butler-app-0.0.3-darwin-arm64.zip, a butler-app-0.0.3-linux-x64.tar.gz, SHA256 files, app-release-manifest.json,
and app-update-manifest.json alongside the service artifact.
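For readers who want to check a downloaded artifact against its SHA256 file, a minimal sketch follows. It assumes the checksum file uses the common `sha256sum` layout (hex digest, whitespace, file name); the actual layout of the Butler SHA256 files is not stated in these notes, and both helper names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of the given bytes or string.
function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Assumes "<hex digest>  <file name>" as written by sha256sum.
function matchesChecksumFile(data: string | Uint8Array, checksumFile: string): boolean {
  const expected = checksumFile.trim().split(/\s+/)[0]?.toLowerCase();
  return expected !== undefined && expected === sha256Hex(data);
}
```

In practice the `data` argument would be the artifact contents read from disk, for example with `readFileSync`.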
Strengthened real-data E2E reporting. The evidence gate now records prompt
input, internal recall/tool-loop diagnostics, target-vector-backed evidence,
file-state contention evidence, safe-tool parallel execution, unsafe-tool
serialization, repeated-failure early stop, and alternate-path recovery.
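The difference between safe-tool parallel execution and unsafe-tool serialization can be illustrated with a small scheduling sketch. The `PlannedCall` type, the `safe` flag, and `planBatches` are assumptions for this example; how Butler classifies and schedules tools is not described here.

```typescript
// Hypothetical call description: `safe` marks tools assumed to be free of side effects.
type PlannedCall = { tool: string; safe: boolean };

// Group consecutive safe calls into one batch (eligible to run in parallel)
// and give every unsafe call a batch of its own (run strictly one at a time).
function planBatches(calls: PlannedCall[]): PlannedCall[][] {
  const batches: PlannedCall[][] = [];
  let currentSafe: PlannedCall[] = [];
  for (const call of calls) {
    if (call.safe) {
      currentSafe.push(call);
      continue;
    }
    if (currentSafe.length > 0) {
      batches.push(currentSafe);
      currentSafe = [];
    }
    batches.push([call]);
  }
  if (currentSafe.length > 0) batches.push(currentSafe);
  return batches;
}
```

An executor could then await each batch in order, running the calls inside a batch concurrently, which preserves the original ordering around every unsafe call.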
Validation
bun run e2e:memory-quality-ab passed on run 2026-06-02T06-24-01-303Z-05c78913.
bun run e2e:evidence-gate passed 17/17 on run 2026-06-02T05-54-51-706Z-fa411ca4.
bun run release:app:package --out dist/release/app --artifact-base-url https://github.com/Hexpy-Games/butler/releases/download/v0.0.3 generated
signed app release artifacts, which passed macOS codesign --verify --deep --strict and a Finder-like launch smoke test.
bun run release:service:package --out dist/release/service --artifact-base-url https://github.com/Hexpy-Games/butler/releases/download/v0.0.3 regenerated
the service tarball, SHA256 file, service release manifest, and update
manifest.
bun run check passed after release preparation.
project-ledger check --project $BUTLER_HOME --silent passed with
no stale views or next actions.