feat(plugin): plugin-suite improvements — grep, hints, skills, steering, extractor, markdown, auto-sync#280
Root-cause fix for plugin under-adoption: the installer never wrote permission allowlist entries for the plugin MCP namespace (mcp__plugin_tracedecay_tracedecay__*), so every plugin tool call prompted interactively and hard-failed headless/subagent contexts. update_plugin now writes and migrates the allowlist (idempotent: twins derive from the union of existing and caller-supplied legacy entries) and refreshes the managed CLAUDE.md steering block so updated steering propagates to existing installs and their subagents. Rescued from an ended Codex session's working tree, then fixed: idempotency fixed-point bug and stale coercion-test expectations. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
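The idempotency property described in this commit message (twins derived from the union of existing and caller-supplied legacy entries) can be sketched roughly as follows. This is a minimal illustration, not the actual update_plugin code: the function name `migrate_allowlist` and the legacy prefix `mcp__tracedecay__` are assumptions made for the example; only the plugin namespace prefix comes from the text above.

```rust
use std::collections::BTreeSet;

// Plugin namespace prefix as quoted in the commit message.
const PLUGIN_PREFIX: &str = "mcp__plugin_tracedecay_tracedecay__";
// ASSUMPTION: hypothetical legacy prefix, used only for illustration.
const LEGACY_PREFIX: &str = "mcp__tracedecay__";

/// Returns every existing entry plus a plugin-namespace "twin" for each
/// legacy entry found in either `existing` or `caller_legacy`.
/// A set is used so that re-running on the output adds nothing new
/// (note: this sketch also sorts the entries as a side effect).
fn migrate_allowlist(existing: &[String], caller_legacy: &[String]) -> Vec<String> {
    let mut out: BTreeSet<String> = existing.iter().cloned().collect();
    // Derive twins from the union of both sources, not from one alone.
    for entry in existing.iter().chain(caller_legacy.iter()) {
        if let Some(tool) = entry.strip_prefix(LEGACY_PREFIX) {
            out.insert(format!("{}{}", PLUGIN_PREFIX, tool));
        }
    }
    out.into_iter().collect()
}
```

Because the twins depend only on the set of legacy entries, feeding the result back in as `existing` reaches a fixed point after one pass, which is the property an installer needs if it may run on every update.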
Codex session was stopped mid-rename; only 2 dirty test files salvaged. The triple-plugin workflow task G owns the rename + allowlist migration. Keep until #280's rename lands for salvage comparison, then delete. Recovery: orchestrating Claude session 2c51d204-3565-4a10-833d-d8fbd51620c3 (project /home/zack/projects/tracedecay); replay: tracedecay tool lcm_load_session --provider claude --session-id 2c51d204-3565-4a10-833d-d8fbd51620c3 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Intent (markdown render backlog) fully covered by PR #280's T7 task (render fixes from the 74-tool audit, fact 24). Branch is base-only. Recovery: orchestrating Claude session 2c51d204-3565-4a10-833d-d8fbd51620c3 (project /home/zack/projects/tracedecay); replay: tracedecay tool lcm_load_session --provider claude --session-id 2c51d204-3565-4a10-833d-d8fbd51620c3 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Intent covered by the auto-sync workflow (design fact 22) implementing D1-D7 in feat/plugin-suite-improvements (PR #280). Branch is base-only. Recovery: orchestrating Claude session 2c51d204-3565-4a10-833d-d8fbd51620c3 (project /home/zack/projects/tracedecay); replay: tracedecay tool lcm_load_session --provider claude --session-id 2c51d204-3565-4a10-833d-d8fbd51620c3 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Eval scorecard doc (ab49fad). Canonical data lives in facts 28 (main corpus: codex 90% vs sonnet 50%) and 33 (obscure evals). Land as docs or prune once #280's eval re-run supersedes the baseline. Recovery: orchestrating Claude session 2c51d204-3565-4a10-833d-d8fbd51620c3 (project /home/zack/projects/tracedecay); replay: tracedecay tool lcm_load_session --provider claude --session-id 2c51d204-3565-4a10-833d-d8fbd51620c3 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Full recovery manifest (all session + workflow ids)
Main orchestrating session:
Workflow runs (transcripts: session dir →
Research agents (task transcripts in the session's
Durable decisions: facts 17–39 in this project's fact store (
Status update (15:28 UTC)
Done since the description was written:
Known flakes (pre-existing / test-only, not blocking):
Remaining before ready-for-review: Fable integration review of the assembled diff → rebase onto master + #277 + #278 (dedupe) → sccache/mold enablement during the rebuild window → hermetic clean-build re-smoke → eval corpus re-run (before/after vs sonnet-50%/codex-90% baseline) → reorganize wip checkpoints into logical commits.
Squash-assembly of feat/plugin-suite-improvements onto the update-plugin permissions branch + master: tracedecay_grep, hint telemetry + budget + Claude hint surfaces + SubagentStart, skills consolidation (30->13) with pattern-researched gateway, installer permissions + CLAUDE.md steering, markdown render pass, TS extractor test-attribution fixes, auto-sync D1-D7 (watcher, serve-stale, branch lifecycle, session-start sync). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Force-pushed: 217d31a to 5556dff
Force-pushed: 5556dff to 0c8239c
CI correction
CI surfaced 7 failures I missed locally; my local gate was
These are in
Before/after proof: adoption fix works on the assembled binary
Ran the 6 baseline-failure scenarios (the ones where Sonnet showed zero tracedecay usage) against this branch's binary in a hermetic isolated env (sonnet, tracedecay repo). 6/6 pass, and every silent-bypass scenario now fires the right tool:
The five symbol-metadata tools that never triggered on tailor-made prompts in the baseline (fact 33) all trigger now. This validates the full stack end-to-end on the real binary: installer plugin-namespace permissions (calls flow instead of being denied) +
(The 7 CI failures noted above are all in the bundled auto-sync tail and are being fixed separately; they don't touch this adoption path.)
Consolidated plugin-suite improvements for Claude/Codex/Cursor, from tonight's audits, evals, and pattern research. Contains PR #278's permissions work (built on that branch) — merge this alone for everything, or merge #277/#278 first as smaller chunks and this rebases clean.
What's in it (911/911 tests green under nextest, the CI harness)
Adoption root-cause fixes
- mcp__plugin_tracedecay_tracedecay__*) + migrates legacy entries: the fix that turns denied/stalled tool calls into working ones (evals: sonnet went 2/4-stalled → opus 4/4-substantive once permissions flowed)
- update_plugin refreshes the managed CLAUDE.md steering block (reaches subagents); rewritten from anti-Explore polemic to moment-triggers
New capability closing the #1 native-tool leak
- tracedecay_grep: gitignore-aware literal/regex content search, graph-enriched (enclosing symbol + node id per hit). rg was the top shell command (683 calls); nothing in the graph answered content search before
Steering that actually triggers
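For readers unfamiliar with how a managed steering block can be refreshed in place, as update_plugin is described as doing for CLAUDE.md, here is a rough sketch. The marker strings and the function name `refresh_managed_block` are hypothetical; the PR does not show the real markers or implementation.

```rust
// ASSUMPTION: hypothetical marker strings delimiting the managed region.
const BEGIN: &str = "<!-- tracedecay:begin -->";
const END: &str = "<!-- tracedecay:end -->";

/// Replaces the managed block in `doc` with `steering`, leaving all
/// user-written text outside the markers untouched. If no complete
/// block exists, one is appended at the end of the document.
fn refresh_managed_block(doc: &str, steering: &str) -> String {
    let block = format!("{}\n{}\n{}", BEGIN, steering.trim_end(), END);
    if let Some(start) = doc.find(BEGIN) {
        if let Some(rel_end) = doc[start..].find(END) {
            let end = start + rel_end + END.len();
            // Splice: text before the block + fresh block + text after it.
            return format!("{}{}{}", &doc[..start], block, &doc[end..]);
        }
    }
    // No complete managed block found (a dangling BEGIN is left as-is).
    if doc.is_empty() {
        format!("{}\n", block)
    } else {
        format!("{}\n\n{}\n", doc.trim_end(), block)
    }
}
```

Replacing only the text between markers is what lets updated steering reach existing installs without clobbering anything the user added to the same file, and running it twice with the same steering text leaves the file unchanged.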
Correctness / quality
- diagnostics_prewarm config knob (env-wins precedence) for the cold-start blocker
Index freshness (auto-sync D1–D7)
- [sync] config table, warning UX ("refresh in progress" vs "run tracedecay sync"), the !Send fix that unblocked the whole library
Known test-harness note
14 tests fail under in-process cargo test (shared env-lock poison cascade under load) but all pass in isolation and 911/911 pass under nextest (process-per-test = what CI runs). Two pre-existing flakes worth standalone fixes: install-family tempdir rename race; git_watch debounce timing sensitivity.
Recovery context
Session 2c51d204-3565-4a10-833d-d8fbd51620c3 · workflows wf_785c5851-aee / wf_7f23bed7-803 (plugin), wf_d002080a-6e2 (auto-sync) · facts 17–39 (tracedecay tool fact_store --action get --fact-id 39 = full manifest)
🤖 Generated with Claude Code
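On the test-harness note: the "env-lock poison cascade" under in-process cargo test is consistent with standard std::sync::Mutex poisoning, sketched below. The lock name `ENV_LOCK` and the helper functions are hypothetical stand-ins, not code from this repository; the sketch only shows why one panicking test can fail every later test that shares the lock in the same process, and one common way to tolerate it.

```rust
use std::sync::{Mutex, MutexGuard, PoisonError};
use std::thread;

// ASSUMPTION: a hypothetical process-wide lock that tests take before
// mutating environment variables.
static ENV_LOCK: Mutex<()> = Mutex::new(());

/// Strict acquisition, as `lock().unwrap()` would do: reports failure
/// once any earlier holder has panicked while holding the lock.
fn strict_lock_fails() -> bool {
    ENV_LOCK.lock().is_err()
}

/// Tolerant acquisition: recovers the guard from a poisoned lock so one
/// failed test does not cascade into unrelated ones.
fn lock_tolerant() -> MutexGuard<'static, ()> {
    ENV_LOCK.lock().unwrap_or_else(PoisonError::into_inner)
}

/// Simulates one test panicking while it holds the lock.
fn poison_lock() {
    let _ = thread::spawn(|| {
        let _guard = ENV_LOCK.lock().unwrap_or_else(PoisonError::into_inner);
        panic!("simulated test failure while holding the env lock");
    })
    .join();
}
```

With process-per-test execution, as nextest uses, each test gets its own copy of the static, so a poisoned lock cannot outlive the test that poisoned it. That matches the observation that the same tests pass in isolation and under nextest.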