bkit v2.1.25 — Claude 5 Model Alignment + Issue Response (GitHub Release notes DRAFT)

Status: FINAL DRAFT — version confirmed v2.1.25 (2026-07-02).
Basis: docs/02-design/features/claude-model-alignment.design.en.md (Option C,
user-approved) + empirical reproductions R1–R4 (.bkit/research/v2125-reproduction-log.md)
+ issue responses #128 / #129 / #130.

Highlights

- 4-tier role-based model matrix: 9 fable (verification & orchestration core)
  / 7 opus (deep reasoning & security) / 16 sonnet (implementers) / 2 haiku
  (monitors), for a final tree of 34 agents. 16 model pins changed; every
  assignment is argued per agent (No Guessing).
- ~4.5–5.3K tokens saved on every session wakeup (#129, @NEXCODE-MK): agent
  descriptions compacted 30,065B → 16,919B (−44%) with a compact 8-language
  trigger encoding. Full EN + KO keyword lists stay, other languages keep one
  anchor keyword each, and "Do NOT use for" guidance moved into agent bodies
  (loaded only on invocation). bkit's own 8-language routing is untouched.
- Deprecated agents off your prompt surface (#128, @NEXCODE-MK): the 6
  pdca-eval-* tombstone stubs are gone from agents/ (−1,387B more, and no
  accidentally spawnable entries). Deprecation governance now lives in a
  machine-readable registry (test/contract/deprecation-registry.json) under
  ADR 0014; contract L4/L5 gates are fully preserved.
- bkit's verification core now runs on Claude Fable 5: gap-detector,
  design-validator, pdca-iterator, and all long-horizon leads (cto-lead,
  sprint-orchestrator, sprint-master-planner, pm-lead, qa-lead, sprint-qa-flow)
  are pinned to fable, the model class built for long-horizon orchestration
  and honest self-verification.
- Dual-floor compatibility, zero hard breakage: the install floor stays at
  Claude Code v2.1.143. Below the new model floor (v2.1.170), bkit shows an
  actionable SessionStart advisory (ENH-368) with a one-line workaround instead
  of mystery spawn errors.
- Cost accuracy: token-cost dashboards previously overstated opus spend 3x
  (stale $15/$75 pricing). Pricing is now synced to published Claude API list
  prices: fable $10/$50, opus $5/$25, sonnet $3/$15, haiku $1/$5 per MTok
  (input/output).
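
The 3x figure follows from the per-MTok arithmetic. Below is a minimal sketch of that calculation, assuming prices are quoted as input/output USD per million tokens and that a model class can be recovered from a substring of the model ID. `PRICING`, `modelClassSketch`, `costUsd`, and `estimateCostUsd` are hypothetical names for illustration; they are not the actual contents of lib/pdca/token-report.js.

```javascript
// List prices in USD per MTok, taken from the release notes (assumed input/output).
const PRICING = {
  fable: { input: 10, output: 50 },
  opus: { input: 5, output: 25 },
  sonnet: { input: 3, output: 15 },
  haiku: { input: 1, output: 5 },
};

// The stale opus entry that caused the overstatement.
const STALE_OPUS_PRICING = { input: 15, output: 75 };

// Hypothetical classifier: the real _modelClass() may use different rules.
function modelClassSketch(modelId) {
  const id = String(modelId).toLowerCase();
  for (const cls of Object.keys(PRICING)) {
    if (id.includes(cls)) return cls;
  }
  return null; // unknown model: no price available
}

// Cost of one usage record under a given price entry.
function costUsd(price, inputTokens, outputTokens) {
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// Cost under the current table, or null when the model class is unknown.
function estimateCostUsd(modelId, inputTokens, outputTokens) {
  const cls = modelClassSketch(modelId);
  return cls === null ? null : costUsd(PRICING[cls], inputTokens, outputTokens);
}

// Same usage, priced with the current and the stale opus entries.
const current = estimateCostUsd("opus", 2_000_000, 400_000);
const stale = costUsd(STALE_OPUS_PRICING, 2_000_000, 400_000);
console.log(current, stale, stale / current);
```

Because both the input and output prices were stale by the same factor, the overstatement is 3x regardless of the input/output token mix.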

What changes for you

- Higher verification quality: gap analysis (match-rate SSoT), design
  validation, and the Evaluator-Optimizer iteration loop now run on Fable 5.
  The checks that decide whether your implementation matches your design get
  bkit's strongest model.
- Better orchestration: /pdca team, /sprint, PM and QA team workflows are
  led by Fable-pinned leads; synthesis quality compounds across every
  downstream phase.
- Accurate cost reporting: /pdca-watch and token reports reflect real
  Claude 5 list prices (opus costs were overstated 3x before).
- Leaner sessions: every Claude Code session with bkit carries ~44% less
  always-resident agent-description text (#129) and zero deprecated tombstone
  entries (#128), which is reflected in cache-read billing on every API call.
- Security-sensitive work stays on Opus 4.8: security-architect,
  code-analyzer, and self-healing intentionally remain opus. Opus 4.8 is
  strongest on cybersecurity, and Fable's safety classifier can reroute/refuse
  security-adjacent or headless work.
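
The description savings are measured in bytes, not characters, which matters for multi-byte trigger keywords such as the Korean ones. The sketch below shows one way such a budget could be checked, assuming the ≤700B per-agent limit stated in the #129 response; `MAX_DESCRIPTION_BYTES`, `descriptionBytes`, `overBudget`, and `reductionPercent` are hypothetical names, not bkit's actual regression test.

```javascript
// Per-agent budget from the #129 regression lock described in these notes.
const MAX_DESCRIPTION_BYTES = 700;

// UTF-8 byte length; a Hangul syllable costs 3 bytes, an ASCII letter 1.
function descriptionBytes(text) {
  return Buffer.byteLength(text, "utf8");
}

// Names of agents whose description exceeds the budget.
// `agents` is an assumed shape: [{ name, description }].
function overBudget(agents, limit = MAX_DESCRIPTION_BYTES) {
  return agents
    .filter((agent) => descriptionBytes(agent.description) > limit)
    .map((agent) => agent.name);
}

// Percentage reduction between two byte totals, rounded to a whole percent.
function reductionPercent(beforeBytes, afterBytes) {
  return Math.round((1 - afterBytes / beforeBytes) * 100);
}

console.log(reductionPercent(30065, 16919)); // totals quoted in the Highlights
```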

Model matrix

| Tier | Count | Agents |
| --- | --- | --- |
| fable (verification & orchestration core) | 9 | cto-lead, sprint-orchestrator, sprint-master-planner, pm-lead, qa-lead, gap-detector, design-validator, pdca-iterator, sprint-qa-flow |
| opus (deep reasoning & security) | 7 | security-architect, code-analyzer, self-healing, infra-architect, enterprise-expert, bkit-impact-analyst, cc-version-researcher |
| sonnet (implementers) | 16 | bkend-expert, frontend-architect, pipeline-guide, pm-discovery, pm-lead-skill-patch, pm-prd, pm-research, pm-strategy, product-manager, qa-debug-analyst, qa-strategist, qa-test-generator, qa-test-planner, skill-needs-extractor, sprint-report-writer, starter-guide |
| haiku (monitors) | 2 | qa-monitor, report-generator |

Reassignments: 9 opus→fable, 1 opus→sonnet (sprint-report-writer). 7 opus agents
preserved deliberately. The 6 deprecated pdca-eval-* stubs were removed from
agents/ per #128 / ADR 0014 (governance in the deprecation registry).
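
The matrix counts can be cross-checked mechanically. The following is an illustrative consistency check over the agent names listed above, assuming each agent is pinned to exactly one tier; `MODEL_MATRIX`, `EXPECTED_COUNTS`, and `checkMatrix` are hypothetical names and do not describe bkit's contract tests.

```javascript
// Agent names copied from the model matrix in these notes.
const MODEL_MATRIX = {
  fable: [
    "cto-lead", "sprint-orchestrator", "sprint-master-planner", "pm-lead",
    "qa-lead", "gap-detector", "design-validator", "pdca-iterator",
    "sprint-qa-flow",
  ],
  opus: [
    "security-architect", "code-analyzer", "self-healing", "infra-architect",
    "enterprise-expert", "bkit-impact-analyst", "cc-version-researcher",
  ],
  sonnet: [
    "bkend-expert", "frontend-architect", "pipeline-guide", "pm-discovery",
    "pm-lead-skill-patch", "pm-prd", "pm-research", "pm-strategy",
    "product-manager", "qa-debug-analyst", "qa-strategist",
    "qa-test-generator", "qa-test-planner", "skill-needs-extractor",
    "sprint-report-writer", "starter-guide",
  ],
  haiku: ["qa-monitor", "report-generator"],
};

const EXPECTED_COUNTS = { fable: 9, opus: 7, sonnet: 16, haiku: 2 };

// Returns the distinct-agent total plus a list of problems (empty when consistent).
function checkMatrix(matrix, expected) {
  const problems = [];
  const seen = new Map(); // agent name -> tier it was first seen in
  for (const [tier, agents] of Object.entries(matrix)) {
    if (agents.length !== expected[tier]) {
      problems.push(`${tier}: expected ${expected[tier]}, found ${agents.length}`);
    }
    for (const name of agents) {
      if (seen.has(name)) {
        problems.push(`${name} appears in both ${seen.get(name)} and ${tier}`);
      } else {
        seen.set(name, tier);
      }
    }
  }
  return { total: seen.size, problems };
}

console.log(checkMatrix(MODEL_MATRIX, EXPECTED_COUNTS));
```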

Compatibility & floors

- Recommended Claude Code: v2.1.198. The sonnet alias resolves to Sonnet 5
  only on CC ≥ v2.1.197.
- Model floor: v2.1.170. The fable alias exists only from this version. On
  CC 2.1.143–2.1.169 the 9 fable-pinned agents fail to spawn (empirically
  reproduced, R2); bkit detects this at SessionStart and shows an advisory
  with the workaround listed below.
- Install minimum: v2.1.143 (unchanged; plugin-manifest displayName).
- Runtime minimum: v2.1.78 (unchanged).
- Workaround below the model floor: upgrade with
  `npm install -g @anthropic-ai/claude-code@latest`, or temporarily
  `export CLAUDE_CODE_SUBAGENT_MODEL=sonnet` (forces ALL subagents to sonnet
  until you unset it).
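
The floors above amount to a numeric comparison of dotted version strings. Here is a rough sketch of that classification, assuming plain `major.minor.patch` versions; the constant values come from these notes, while `compareVersions` and `classifyVersion` are hypothetical helpers and not bkit's actual SessionStart check.

```javascript
// Floors stated in these release notes.
const INSTALL_MINIMUM = "2.1.143";
const FABLE_MODEL_FLOOR = "2.1.170";
const RECOMMENDED_VERSION = "2.1.198";

// Numeric, segment-by-segment comparison. A plain string comparison would
// order "2.1.78" after "2.1.143", which is wrong.
function compareVersions(a, b) {
  const pa = a.replace(/^v/, "").split(".").map(Number);
  const pb = b.replace(/^v/, "").split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return Math.sign(diff);
  }
  return 0;
}

// Maps a Claude Code version to the compatibility band it falls in.
function classifyVersion(ccVersion) {
  if (compareVersions(ccVersion, INSTALL_MINIMUM) < 0) return "below-install-minimum";
  if (compareVersions(ccVersion, FABLE_MODEL_FLOOR) < 0) return "below-model-floor";
  if (compareVersions(ccVersion, RECOMMENDED_VERSION) < 0) return "supported";
  return "recommended";
}

for (const v of ["2.1.78", "2.1.169", "2.1.170", "2.1.198"]) {
  console.log(v, classifyVersion(v));
}
```

Only the "below-model-floor" band (2.1.143 through 2.1.169) corresponds to the case where the fable-pinned agents fail to spawn and the advisory applies.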

Provider alias table (R1)

Alias resolution depends on your provider path — bkit makes no universal
"Sonnet 5" promise:

| Provider path | `fable` | `opus` | `sonnet` |
| --- | --- | --- | --- |
| Anthropic API (CC ≥ 2.1.197) | Fable 5 (CC ≥ 2.1.170) | Opus 4.8 | Sonnet 5 |
| Claude Platform on AWS | provider-specific full ID required | Opus 4.7 | Sonnet 4.6 |
| Bedrock / Vertex / Foundry | provider-specific full ID required | Opus 4.6 | Sonnet 4.5 |

Footguns & caveats

- `CLAUDE_CODE_SUBAGENT_MODEL` overrides ALL frontmatter model pins: while
  set, every subagent (including the 9 fable pins) runs on that model. It is
  the documented below-floor workaround; remember to unset it after upgrading.
- Enterprise availableModels exclusions fall back silently: an excluded
  model does not error; the agent inherits the main conversation model instead.
- Fable safety-classifier headless refusals: Fable may reroute or refuse
  security-adjacent or non-interactive (`claude -p`) requests. This is why the
  security/headless agents stay on Opus 4.8.
- Provider aliases resolve to older models on AWS/Bedrock/Vertex (see table
  above); fable there needs a provider-specific full model ID.
- evals/*/eval.yaml model_baseline values are historical capture records and
  are intentionally unchanged.
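
The first two caveats describe a precedence order for which model a subagent ends up on. The sketch below models that order under stated assumptions: `availableModels` is treated as an allowlist array, and the environment override is assumed to be checked first. `resolveSubagentModel` is a hypothetical function for reasoning about the behavior, not Claude Code's implementation.

```javascript
// Illustrative model of the precedence described in the caveats above.
// Assumptions: availableModels is an allowlist (null means unrestricted),
// and the env override wins over everything else.
function resolveSubagentModel({ pinnedModel, mainModel, env = {}, availableModels = null }) {
  const override = env.CLAUDE_CODE_SUBAGENT_MODEL;
  if (override) {
    // While set, the frontmatter pin is ignored for every subagent.
    return { model: override, source: "env-override" };
  }
  if (Array.isArray(availableModels) && !availableModels.includes(pinnedModel)) {
    // Silent fallback: no error, the agent inherits the main conversation model.
    return { model: mainModel, source: "inherited-from-main" };
  }
  return { model: pinnedModel, source: "frontmatter-pin" };
}

console.log(resolveSubagentModel({
  pinnedModel: "fable",
  mainModel: "opus",
  env: { CLAUDE_CODE_SUBAGENT_MODEL: "sonnet" },
}));
```

Returning the `source` alongside the model makes the silent cases visible, which is the practical risk both caveats point at.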

Issue response (community-driven)

- #128 (@NEXCODE-MK): deprecated pdca-eval-* stubs removed from the
  prompt surface; deprecation registry + ADR 0014 supersede the v2.1.22
  permanent-retention decision without touching contract baselines. Bonus: a
  pre-existing exit-2 crash in the L4 missing-stub path and 6 pre-existing
  agents-effort test failures were fixed along the way.
- #129 (@NEXCODE-MK): token diet via compact 8-language trigger encoding
  (−44% agent-description surface; regression-locked at ≤700B per agent).
  Locale-scoped generation (issue proposal 1) is deferred with rationale: CC
  plugins are read-only marketplace checkouts, so no install-time generation
  hook exists.
- #130 (@s99606931): learning-stop.js piped-stdin `isTTY === false` dead
  gate fixed with the shared `readStdinSync()` helper (same precedent as
  #125/#126); 9-TC regression test added; repo-wide sweep confirms zero
  remaining `isTTY === false` code gates.
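
For context on #130: in Node.js, `process.stdin.isTTY` is `true` on an interactive terminal and is typically `undefined` (not `false`) when stdin is a pipe or file, so a strict `=== false` comparison never fires. The sketch below illustrates the difference; `deadGate` and `readStdinSketch` are hypothetical stand-ins, and bkit's real `readStdinSync()` helper may be implemented differently.

```javascript
const fs = require("node:fs");

// The dead gate: meant to detect piped input, but isTTY is undefined for
// pipes, so the strict comparison with false is never true.
function deadGate(stdin) {
  return stdin.isTTY === false;
}

// Hypothetical replacement: read unless stdin is an interactive terminal.
// `readFd` is injectable so the function can be exercised without real stdin.
function readStdinSketch(
  stdin = process.stdin,
  readFd = (fd) => fs.readFileSync(fd, "utf8"),
) {
  if (stdin.isTTY === true) return ""; // interactive: nothing was piped in
  try {
    return readFd(0); // file descriptor 0 is stdin
  } catch {
    return ""; // closed or unreadable stdin is treated as empty input
  }
}

// Simulated piped stdin: an object without an isTTY property.
const piped = {};
console.log(deadGate(piped)); // the old gate does not recognise the pipe
console.log(readStdinSketch(piped, () => '{"event":"stop"}'));
```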

Fixes

- Opus pricing 3x overstatement in lib/pdca/token-report.js (15/75 → 5/25);
  haiku synced to 1/5; fable 10/50 added with a `_modelClass()` fable branch.
- 3 pre-existing doc-drift bugs: commands/bkit.md claimed "36 total /
  13 opus / 21 sonnet / 2 haiku" (actual: 40 files); pm-lead was listed as
  sonnet (it was opus, now fable); test-checklist PM-T10 claimed all 5 PM
  agents use sonnet.
- Lockstep updates: VALID_MODELS + runtime whitelist gained fable,
  28 contract baseline JSONs regenerated (model field only), security
  assertions SEC-AF-030/037/038 updated, team default ctoAgent opus → fable,
  RECOMMENDED_VERSION 2.1.150 → 2.1.198, FABLE_MODEL_FLOOR 2.1.170 added.
