chore(pipelines): raise judgment-step model from cheapest to balanced #1609
Merged
Cheapest (Haiku) tier produced shallow / lazy outputs on judgment-shaped work, most visibly the test deletion in #1582 and forbidigo-panic patterns in #1585. Real-world signal from Epic #1565 Phase 1 dispatch: Haiku is fine for summary/distill/format steps but not for any persona that scans, judges, plans, implements, or reviews.

Promoted to balanced on these steps:
- impl-issue: fetch-assess + agent_review (create-pr commenter stays cheap)
- impl-issue-core: fetch-assess
- audit-tests: scan + agent_review (summarizer report stays cheap)
- audit-architecture: scan + agent_review (summarizer report stays cheap)
- audit-security: agent_review (summarizer report stays cheap)
- ops-bootstrap: commit (craftsman writes scaffolding)
- ops-pr-review: diff-analysis + security llm_judge + quality agent_review
- plan-research: analyze-topics + research-topics (fetch/post-comment stay cheap)

Kept cheapest where the persona is summary-shaped:
- summarizer
- forge.type-commenter (PR/comment formatting)
- forge.type-analyst on fetch-only steps

Also adds golangci-lint to flake.nix devShell so the same lint that gates CI runs locally without manual install (CI uses v2.10, nixpkgs ships v2.8; minor skew, acceptable for now).
Summary
Real-world signal from Epic #1565 Phase 1 dispatch: cheapest tier (Haiku) produced shallow outputs on judgment-shaped work. Most visible cases:
- test deletion in #1582
- `panic()` calls in init paths, failing forbidigo (#1585)

Both are rooted in the cheapest-tier model lacking the nuance to refactor / scope / reason. Pipeline yamls had `model: cheapest` baked into many judgment steps as a cost optimization that no longer holds when output quality breaks downstream.

Change
Promoted to balanced on judgment steps; kept cheapest on summary/distill/format steps.
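
For illustration, promoting a step amounts to a one-line tier change in its pipeline yaml. The fragment below is a sketch only: the `model:` values and the step names come from this PR, but the `steps`, `name`, and `persona` keys and the overall layout are assumptions, not the repository's actual schema.

```yaml
# Hypothetical fragment of the impl-issue pipeline yaml.
# Only the `model:` values and step names are taken from this PR;
# the surrounding keys are assumed for illustration.
steps:
  - name: fetch-assess
    model: balanced     # was: cheapest (judgment-shaped: assesses the issue)
  - name: agent_review
    model: balanced     # was: cheapest (judgment-shaped: reviews the change)
  - name: create-pr
    persona: commenter  # assumed key; summary/format-shaped work
    model: cheapest     # unchanged
```

The same pattern would apply to the other promoted pipelines: only steps that scan, judge, plan, implement, or review move to balanced, while summarizer and commenter steps keep the cheaper tier.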
Other change
Added `golangci-lint` to `flake.nix` devShell so the same forbidigo gate that broke CI on PR #1585 runs locally without manual install. CI ships v2.10; nixpkgs-unstable ships v2.8.0. The skew is minor and acceptable.

Test plan
- `go build ./...` clean
- `go test ./internal/defaults/...` passes (embedded-fs parity tests)
- `nix develop -c which golangci-lint` resolves

Related