feat(v0.21.x): cookbook v1 — BM25 retrieval over 12 curated kCAD patterns + lookup_cookbook MCP tool + eval --cookbook flag (#22)#59

Merged
w1ne merged 26 commits into develop from feat/cookbook-v1 on May 3, 2026

w1ne commented on May 3, 2026

Summary

Workstream #22 from the v0.2-to-v1.0 gap-closure roadmap (§I4). Adds a self-growing pattern library that the agent can search at runtime.

  • 12 curated .kcad.ts snippets under cookbook/snippets/ (edge features, booleans, holes, sketches, symmetry, parameters). Markdown + YAML frontmatter + fenced TS body. Tag whitelist at cookbook/tags.json.
  • Pure BM25 retrieval in src/cookbook/search(query, snippets, k=3) over title + tags + keywords + when_to_use. Score floor 0.5, k clamped to [1, 5]. ~60 LoC, no external deps. Snapshot test locks ranking on 5 queries.
  • MCP tool lookup_cookbook(query, k?) registered alongside the existing 14 tools. A query with no hits returns an empty result, which is a valid success rather than an error.
  • SKILL.md cookbook index — build-generated between <!-- COOKBOOK:START/END --> markers. CI gate via diff-check.
  • Eval --cookbook flag — pre-injects top-3 hits into a separate cache_control block on the system prompt. A/B golden test (eval/cookbook.test.ts) locks deterministic ranking.
  • npm run eval:ab — runs the suite twice (off/on) and prints score / token delta.
  • CI gates wired into npm run qc: cookbook:validate + cookbook:evaluate + cookbook:build + SKILL.md diff-check.
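
For readers unfamiliar with the retrieval step, here is a minimal sketch of BM25 ranking with the score floor and k clamp described above. It is not the code in src/cookbook/: the `Snippet` shape, the tokenizer, the tie-break by id, and the `K1`/`B` constants are assumptions chosen for illustration.

```typescript
// Hypothetical snippet shape; field names follow the frontmatter keys in this PR.
interface Snippet {
  id: string;
  title: string;
  tags: string[];
  keywords: string[];
  when_to_use: string;
}

interface Hit {
  id: string;
  score: number;
}

const K1 = 1.2; // assumed term-frequency saturation (common BM25 default)
const B = 0.75; // assumed length normalization (common BM25 default)
const SCORE_FLOOR = 0.5; // floor stated in the PR description

// Assumed tokenizer: lowercase runs of ASCII letters and digits.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Index the four fields named in the PR as one bag of tokens.
function docTokens(s: Snippet): string[] {
  return tokenize([s.title, ...s.tags, ...s.keywords, s.when_to_use].join(" "));
}

function search(query: string, snippets: Snippet[], k = 3): Hit[] {
  // Clamp k to [1, 5]; fall back to 3 if k is not a finite number.
  const limit = Math.min(5, Math.max(1, Number.isFinite(k) ? Math.floor(k) : 3));
  const docs = snippets.map(docTokens);
  const n = docs.length;
  if (n === 0) return [];
  const avgLen = docs.reduce((sum, d) => sum + d.length, 0) / n || 1;

  // Document frequency: number of snippets containing each term.
  const df = new Map<string, number>();
  for (const d of docs) {
    for (const t of new Set(d)) df.set(t, (df.get(t) ?? 0) + 1);
  }

  const terms = [...new Set(tokenize(query))];
  const hits: Hit[] = snippets.map((s, i) => {
    const d = docs[i];
    let score = 0;
    for (const t of terms) {
      const tf = d.filter((x) => x === t).length;
      if (tf === 0) continue;
      const nt = df.get(t) ?? 0;
      const idf = Math.log(1 + (n - nt + 0.5) / (nt + 0.5));
      score += (idf * tf * (K1 + 1)) / (tf + K1 * (1 - B + (B * d.length) / avgLen));
    }
    return { id: s.id, score };
  });

  // Drop weak matches, rank by score, break ties by id for determinism.
  return hits
    .filter((h) => h.score >= SCORE_FLOOR)
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id))
    .slice(0, limit);
}
```

Because the floor is applied before the slice, a query that matches nothing above 0.5 yields an empty array, which matches the "no hits is a valid success" behaviour of lookup_cookbook.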

Test Plan

w1ne and others added 26 commits May 3, 2026 16:55
Curated library of canonical .kcad.ts pattern snippets, indexed for
in-prompt retrieval. Distinct from corpus expansion (cookbook is for
agents to reference, corpus is for evaluation). Continuous.

v1 scope:
- 12 starter pattern snippets seeded from eval expert solutions and the
  documented kernel surface
- Markdown frontmatter file format (id, title, tags, keywords,
  when_to_use, fenced TS body)
- BM25 retrieval over title/tags/keywords/when_to_use; score floor 0.5;
  ~40 LoC pure TS, no external deps
- Hybrid agent surface: build-generated SKILL.md cookbook index +
  MCP tool lookup_cookbook(query, k)
- Eval --cookbook flag with pre-injection (separate cache_control
  block); A/B golden test on bracket-holes
- CI gates: validate frontmatter, evaluate every body clean, diff-check
  generated SKILL.md section
- Continuous growth contract: same-PR additions, eval-driven additions,
  snapshot-test gate on ranking shifts
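
The file format and the frontmatter validation gate can be sketched roughly as follows. This assumes conventional `---` frontmatter delimiters and that every listed key is required; `splitSnippetFile`, `validateFrontmatter`, and the error strings are hypothetical names, and YAML parsing itself is left to the yaml dependency.

```typescript
// Hypothetical helpers; the real loader and its messages are not shown here.
interface SnippetParts {
  frontmatter: string; // raw YAML text between the --- delimiters
  body: string; // contents of the first fenced ts block
}

const FENCE = "`".repeat(3); // built at runtime so this example can sit inside a fence

function splitSnippetFile(source: string): SnippetParts | null {
  const fm = /^---\r?\n([\s\S]*?)\r?\n---\r?\n/.exec(source);
  if (!fm) return null;
  const rest = source.slice(fm[0].length);
  const codeRe = new RegExp(FENCE + "(?:ts|typescript)\\r?\\n([\\s\\S]*?)\\r?\\n" + FENCE);
  const code = codeRe.exec(rest);
  if (!code) return null;
  return { frontmatter: fm[1], body: code[1] };
}

const REQUIRED_STRINGS = ["id", "title", "when_to_use"] as const;
const REQUIRED_LISTS = ["tags", "keywords"] as const;

// Takes already-parsed frontmatter and returns a list of problems (empty means valid).
function validateFrontmatter(
  data: Record<string, unknown>,
  allowedTags: ReadonlySet<string>,
): string[] {
  const errors: string[] = [];
  for (const key of REQUIRED_STRINGS) {
    const v = data[key];
    if (typeof v !== "string" || v.trim() === "") {
      errors.push(key + ": expected non-empty string");
    }
  }
  for (const key of REQUIRED_LISTS) {
    const v = data[key];
    if (!Array.isArray(v) || !v.every((x) => typeof x === "string")) {
      errors.push(key + ": expected list of strings");
    }
  }
  if (Array.isArray(data.tags)) {
    for (const tag of data.tags) {
      if (typeof tag === "string" && !allowedTags.has(tag)) {
        errors.push("tags: " + tag + " is not in the whitelist");
      }
    }
  }
  return errors;
}
```

Returning a list of errors instead of throwing on the first one lets a CI gate report every problem in a snippet in a single run.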

Native-framed per the no-competitor-refs rule. Lineage captured in
~/.claude/projects/-home-andrii/memory/kernelcad_design_lineage.md
under '#22 cookbook with retrieval (2026-05-03) — design-time lineage'.

19 tasks (102 bite-sized steps) covering:
- yaml dep + tags whitelist + 12 starter snippets
- Pure BM25 retrieval module (~60 LoC, no deps)
- Snippet loader with frontmatter + tag-whitelist validation
- search() composition with score floor + k clamping
- 3 CI gates (validate, evaluate, build) wired into qc
- SKILL.md cookbook index generator + diff-check
- MCP tool lookup_cookbook
- TranscriptEvent kind cookbook_inject + renderer
- AgentClient systemAddendum (separate cache_control block)
- Eval --cookbook flag + per-task pre-injection
- A/B golden test on bracket-holes
- npm run eval:ab convenience script
- CHANGELOG entry
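
As an illustration of the SKILL.md index generator and why a diff-check works as a gate, here is a minimal sketch of marker-based regeneration. The exact marker strings, the `IndexEntry` shape, and the one-line-per-snippet layout are assumptions, not the generator's actual output.

```typescript
// Assumed marker strings, expanded from the START/END shorthand in the PR text.
const START = "<!-- COOKBOOK:START -->";
const END = "<!-- COOKBOOK:END -->";

// Hypothetical index entry; the real index may carry more fields.
interface IndexEntry {
  id: string;
  title: string;
}

// Sort by id so the output does not depend on input order.
function renderIndex(entries: IndexEntry[]): string {
  return [...entries]
    .sort((a, b) => a.id.localeCompare(b.id))
    .map((e) => "- " + e.id + ": " + e.title)
    .join("\n");
}

// Replace everything between the markers, keeping the markers themselves.
function replaceCookbookSection(skillMd: string, entries: IndexEntry[]): string {
  const start = skillMd.indexOf(START);
  const end = skillMd.indexOf(END);
  if (start === -1 || end === -1 || end < start) {
    throw new Error("COOKBOOK markers missing or out of order");
  }
  const before = skillMd.slice(0, start + START.length);
  const after = skillMd.slice(end);
  return before + "\n" + renderIndex(entries) + "\n" + after;
}
```

The function is idempotent: regenerating an up-to-date file returns it unchanged, so any output from `git diff --exit-code` after a build means the committed SKILL.md has drifted from the snippets.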

TDD throughout (test before implementation). Frequent commits (one per
task). Ready for subagent-driven-development execution.

Spec lineage: docs/superpowers/specs/2026-05-03-cookbook-with-retrieval-design.md (2ab8190).
…nto qc

- Add cookbook:validate, cookbook:evaluate, and cookbook:build to qc script
- Add git diff --exit-code check on src/skill/SKILL.md to catch drift
- Fix lint error: remove unused import of loadSnippets (re-exported via export statement)
w1ne merged commit bbc4a9c into develop on May 3, 2026
2 checks passed