Vegan dog food second pass: fully logged, non-homepage, hard claim-check + repo fixes it surfaced #2
Closed
hnshah wants to merge 2 commits into
First fully logged run on a non-homepage first page. The run exercises the first-page-decision, page-argument-shape and claim-check steps on a trust-heavy object (fictional plant-based dog food brand, Kind Bowl) where the correct first page turns out to be a nutritional-adequacy page, not a homepage.

Run contents:
- goal, models, working-log
- 6 source briefs (product, wedge, mechanism, proof, comparison, plus source-capture note)
- per-step prompts and raw outputs for all 7 steps
- final artifacts: signal doc, message spine, first-page decision, page argument shape, proof map, first-page draft, corrected draft after hard-severity claim check, claim-check report
- evaluation and adversarial evaluator-pass

The claim check at hard severity flagged 9 lines on a draft that read clean. Two recurring patterns surfaced (editorial voice and unsourced quantitative modifiers) that the current claim-check framework does not explicitly name. Those gaps, plus a few others surfaced by the evaluator pass, will be addressed in a follow-up commit on this branch.
The fully logged second-pass run exposed specific, tractable gaps in the method scaffolding. This commit applies those fixes so the run actually strengthens the repo, rather than just sitting inside it.

Changes:
- frameworks/claim-checking.md: add three flag types the run surfaced: editorial voice (brand narrating its own restraint), unsourced quantitative modifier, clinical/regulatory drift. Add an explicit severity calibration section (light / normal / hard) and a note on when patterns should be absorbed upstream into the argument shape instead of being re-caught by claim check on every run.
- templates/claim-check-template.md: reflect the new flag types. Add a severity field and a recurring-patterns summary section.
- frameworks/page-argument-shape.md: add length-and-density as an explicit shape-level concern. A page that is honest but unread does not earn trust. Name the drafting-constraints layer as the place where recurring claim-check patterns should be absorbed.
- templates/page-argument-shape-template.md: strengthen the standing drafting-constraints list with the patterns the run found the default shape did not prevent (unsourced quantitative modifiers, editorial voice, external-credential description bloat, clinical drift). Add a length/density check.
- frameworks/first-page-decision.md: name the role of mechanism, proof and comparison briefs in sharpening the decision. Add a candidates-considered section as a requirement for strong decisions on trust-heavy objects.
- frameworks/run-logging.md: evaluator pass is now explicitly required for the fully-logged tier. Sources listed explicitly as a required component.
- README.md: update logged-runs section to reflect the new second-pass run and mark the first-pass run as superseded summary.
- examples/README.md + examples/vegan-dog-food-second-pass-summary.md: add a summary for the second-pass run and note what it proves and what it deliberately does not prove yet.
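As a rough illustration of the template changes above, a claim-check report with a severity field, typed flags and a recurring-patterns summary could be modeled as follows. This is only a sketch: the class and function names (`Severity`, `FlagType`, `Flag`, `recurring_patterns`) and the `min_count` threshold are assumptions made for this example, not something the repo defines, and the real templates are Markdown files rather than code.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # The three calibration levels named in the commit message.
    LIGHT = "light"
    NORMAL = "normal"
    HARD = "hard"


class FlagType(Enum):
    # The three flag types the run surfaced.
    EDITORIAL_VOICE = "editorial voice"
    UNSOURCED_QUANTITATIVE_MODIFIER = "unsourced quantitative modifier"
    CLINICAL_REGULATORY_DRIFT = "clinical/regulatory drift"


@dataclass(frozen=True)
class Flag:
    line_no: int          # line of the draft that was flagged
    flag_type: FlagType
    note: str = ""        # free-text reason (hypothetical field)


def recurring_patterns(flags, min_count=2):
    """Return (flag_type, count) pairs seen at least min_count times,
    most frequent first. The threshold of 2 is an assumption."""
    counts = Counter(flag.flag_type for flag in flags)
    return [(ftype, n) for ftype, n in counts.most_common() if n >= min_count]
```

With made-up flags (not taken from the run), two editorial-voice flags and one unsourced-modifier flag would yield a single recurring pattern, which is the kind of entry that might then be absorbed upstream into the drafting constraints.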
Closing this PR per user direction: the improvements here (claim-check severity, editorial-voice flag type, evaluator-pass required at fully-logged tier, first-page-decision framework expansions) are being redone on a new branch ( The second-pass run itself (

Generated by Claude Code
hnshah pushed a commit that referenced this pull request Apr 14, 2026
…skill+agent, new-source-brief.sh, PUBLISHABLE tier, pre-commit

Ships 11 of the 13 Tier 1+2 items from the roadmap. Two items (#2 and #9, real-object runs) need objects from the user and are flagged for a follow-up.

Tier 1 (high leverage, small effort):
- .github/workflows/check.yml: runs doctor.sh and slop-check.sh on every push and PR.
- .github/PULL_REQUEST_TEMPLATE.md: matches the repo's actual PR rhythm.
- .github/ISSUE_TEMPLATE/{method-feedback,bug,method-proposal}.md
- CONTRIBUTING.md: four shapes of contribution, required checks, what bad contribution looks like.
- README.md: sharper public hero. Points at runs/vegan-dog-food-verdel/ as the canonical worked example. Leads with the core principle instead of a feature list.
- (v0.1.0 tag will be pushed after this PR merges.)

Tier 2 (high leverage, medium effort):
- .claude/agents/pagekit-evaluator-pass.md: read-only subagent for the adversarial evaluator pass. Produces runs/<name>/evaluator-pass.md.
- .claude/skills/pagekit-evaluator-pass/SKILL.md: skill wrapper that delegates to the subagent.
- scripts/new-source-brief.sh: scaffold a single source brief (wedge / mechanism / proof / comparison). Auto-numbered; reads the canonical brief templates; matches the Verdel sources/ pattern.
- scripts/run-check.sh: new PUBLISHABLE tier above FULLY LOGGED. Requires claim-check.md filled in (not placeholder), first-page-draft.md filled in (not placeholder), and slop-check clean on the draft(s). Verdel now classifies as PUBLISHABLE; empty scaffolds classify as FULLY LOGGED with a clear punch list.
- .pre-commit-config.yaml: local enforcement of slop-check + doctor on every commit. Optional (contributors opt in with pre-commit install).
- CHANGELOG.md: seeded with v0.1.0 (the current merged state) and [Unreleased] for this PR's additions.

Related:
- scripts/doctor.sh updated to include the new subagent, the new skill, and scripts/new-source-brief.sh in its manifest checks.

Deferred (need user input):
- Tier 1 #2 and Tier 2 #9: real-object runs (non-fictional). Requires an object from the user. Flagged in the handoff.

Verified:
- scripts/doctor.sh PASS
- scripts/slop-check.sh exit 0 clean
- scripts/run-check.sh runs/vegan-dog-food-verdel -> PUBLISHABLE
- scripts/run-check.sh runs/<fresh-scaffold> -> FULLY LOGGED (not PUBLISHABLE)
- scripts/new-source-brief.sh smoke test created mechanism brief
- scripts/new-run.sh still scaffolds correctly with doctor.sh post-change
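The PUBLISHABLE rule described for scripts/run-check.sh can be sketched in a few lines. This is not the script itself (which is a shell script whose internals are not shown here). It is a minimal Python illustration that assumes the run already meets the FULLY LOGGED tier, that a placeholder is recognizable by a marker string (`PLACEHOLDER_MARKER` is hypothetical), and that the slop-check result is passed in as a boolean.

```python
from pathlib import Path

# Hypothetical marker: how the real script detects a placeholder is not stated.
PLACEHOLDER_MARKER = "TODO"

# The two files the commit message says must be filled in.
REQUIRED_FILLED = ("claim-check.md", "first-page-draft.md")


def is_filled_in(path: Path) -> bool:
    """True if the file exists, is non-empty, and has no placeholder marker."""
    if not path.is_file():
        return False
    text = path.read_text(encoding="utf-8").strip()
    return bool(text) and PLACEHOLDER_MARKER not in text


def classify(run_dir, slop_clean: bool):
    """Return (tier, punch_list) for a run assumed to be at least FULLY LOGGED."""
    run_dir = Path(run_dir)
    punch_list = []
    for name in REQUIRED_FILLED:
        if not is_filled_in(run_dir / name):
            punch_list.append(f"{name} is missing or still a placeholder")
    if not slop_clean:
        punch_list.append("slop-check is not clean on the draft(s)")
    tier = "FULLY LOGGED" if punch_list else "PUBLISHABLE"
    return tier, punch_list
```

Returning the punch list alongside the tier mirrors the behavior described above, where an empty scaffold is classified as FULLY LOGGED together with a list of what is still missing.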
Summary
Two commits on this branch:

1. Fully logged second-pass run on the vegan-dog-food object (fictional Kind Bowl). The first-page-decision step produced a non-homepage first page (a nutritional-adequacy page). The claim-check step ran at hard severity and flagged 9 lines on a draft that initially read clean. All seven steps have per-step prompts, raw outputs and final artifacts. Evaluation + adversarial evaluator-pass included.
2. Repo improvements the run exposed, applied in a separate commit so the run-to-repo loop is auditable:
   - `frameworks/claim-checking.md`: three new flag types (editorial voice, unsourced quantitative modifier, clinical/regulatory drift), explicit severity calibration (light/normal/hard), note on absorbing recurring patterns upstream.
   - `templates/claim-check-template.md`: reflects new flag types, adds severity field and recurring-patterns section.
   - `frameworks/page-argument-shape.md` + template: add length/density as a shape-level concern, strengthen default drafting constraints.
   - `frameworks/first-page-decision.md`: role of mechanism/proof/comparison briefs in sharpening the decision, candidates-considered requirement for trust-heavy objects.
   - `frameworks/run-logging.md`: evaluator-pass now explicitly required at the fully-logged tier.
   - `README.md` + `examples/README.md` + new `examples/vegan-dog-food-second-pass-summary.md`.

What this run does not prove

Both gaps are named honestly in `evaluation.md` and `evaluator-pass.md` and surfaced in `examples/README.md`.

Test plan

- `runs/vegan-dog-food-second-pass/goal.md`, then `evaluation.md` and `evaluator-pass.md`.
- `runs/vegan-dog-food-second-pass/prompts/` and `outputs/` in order 01-07 to confirm every step has both a prompt file and a raw output file.
- `first-page-draft.md` and `first-page-draft-corrected.md` differ in the 9 flagged lines.
- `README.md` logged-runs section now lists three distinct tiers (fully logged, fully logged, summary logged) and names the second-pass run as the first non-homepage validation.
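Two of the test-plan checks above lend themselves to a quick script. The sketch below is hedged: it assumes the files under `prompts/` and `outputs/` start with a two-digit step prefix (`01` through `07`), which the PR implies but does not spell out, and both helper names are invented for this example. The line count is also only a proxy, since a single flagged line could be corrected into several lines or removed outright.

```python
import difflib
from pathlib import Path

# Step prefixes "01" .. "07" (assumed file-naming convention).
STEPS = [f"{n:02d}" for n in range(1, 8)]


def missing_step_files(run_dir):
    """List 'prompts/NN' or 'outputs/NN' entries with no file for that step."""
    run_dir = Path(run_dir)
    missing = []
    for sub in ("prompts", "outputs"):
        folder = run_dir / sub
        names = [p.name for p in folder.iterdir()] if folder.is_dir() else []
        for step in STEPS:
            if not any(name.startswith(step) for name in names):
                missing.append(f"{sub}/{step}")
    return missing


def changed_line_count(original: str, corrected: str) -> int:
    """Count lines of the original draft that were replaced or deleted."""
    matcher = difflib.SequenceMatcher(
        a=original.splitlines(), b=corrected.splitlines(), autojunk=False
    )
    return sum(
        i2 - i1
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
        if tag in ("replace", "delete")
    )
```

For the run in this PR, one would expect `missing_step_files("runs/vegan-dog-food-second-pass")` to return an empty list, and `changed_line_count` on the two draft files to land at or near 9 if each flagged line was corrected in place.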