Vegan dog food second pass: fully logged, non-homepage, hard claim-check + repo fixes it surfaced by hnshah · Pull Request #2 · hnshah/pagekit

hnshah · 2026-04-14T03:47:18Z

Summary

Two commits on this branch:

Fully logged second-pass run on the vegan-dog-food object (fictional Kind Bowl). The first-page-decision step produced a non-homepage first page (a nutritional-adequacy page). The claim-check step ran at hard severity and flagged 9 lines on a draft that initially read clean. All seven steps have per-step prompts, raw outputs and final artifacts. Evaluation + adversarial evaluator-pass included.
Repo improvements the run exposed, applied in a separate commit so the run-to-repo loop is auditable:
- frameworks/claim-checking.md — three new flag types (editorial voice, unsourced quantitative modifier, clinical/regulatory drift), explicit severity calibration (light/normal/hard), note on absorbing recurring patterns upstream.
- templates/claim-check-template.md — reflects new flag types, adds severity field and recurring-patterns section.
- frameworks/page-argument-shape.md + template — add length/density as a shape-level concern, strengthen default drafting constraints.
- frameworks/first-page-decision.md — role of mechanism/proof/comparison briefs in sharpening the decision, candidates-considered requirement for trust-heavy objects.
- frameworks/run-logging.md — evaluator-pass now explicitly required at the fully-logged tier.
- README.md + examples/README.md + new examples/vegan-dog-food-second-pass-summary.md.

What this run does not prove

The architecture is not yet validated beyond this one non-homepage object.
The claim-check step has not yet been tested on a real (non-fictional) brand where proof-map gaps would be more common.

Both gaps are named honestly in evaluation.md and evaluator-pass.md and surfaced in examples/README.md.

Test plan

Read runs/vegan-dog-food-second-pass/goal.md, then evaluation.md and evaluator-pass.md.
Walk runs/vegan-dog-food-second-pass/prompts/ and outputs/ in order 01-07 to confirm every step has both a prompt file and a raw output file.
Confirm first-page-draft.md and first-page-draft-corrected.md differ in the 9 flagged lines.
Confirm the repo-improvement commit touches only scaffolding, not the run itself.
Confirm README.md logged-runs section now lists three runs with their tiers (fully logged, fully logged, summary logged) and names the second-pass run as the first non-homepage validation.
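The second and third checks in the test plan can be scripted. The following is a minimal sketch, not part of the repo: it assumes every prompt and output file name starts with its two-digit step number (01 to 07) and that both drafts sit at the top of the run directory. The helper names `missing_steps` and `changed_line_count` are hypothetical.

```python
import difflib
from pathlib import Path

# Step prefixes "01" .. "07"; the two-digit naming is an assumption.
STEPS = [f"{n:02d}" for n in range(1, 8)]


def missing_steps(run_dir: Path) -> dict[str, list[str]]:
    """Return, per folder, the step prefixes that have no matching file."""
    missing: dict[str, list[str]] = {}
    for sub in ("prompts", "outputs"):
        folder = run_dir / sub
        names = [p.name for p in folder.iterdir()] if folder.is_dir() else []
        missing[sub] = [s for s in STEPS if not any(n.startswith(s) for n in names)]
    return missing


def changed_line_count(original: str, corrected: str) -> int:
    """Count lines of the original draft that were replaced or deleted."""
    a = original.splitlines()
    b = corrected.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return sum(
        i2 - i1
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
        if tag in ("replace", "delete")
    )
```

Pointing `missing_steps` at `runs/vegan-dog-food-second-pass` should return two empty lists if every step is paired. `changed_line_count` on the text of `first-page-draft.md` and `first-page-draft-corrected.md` gives a line count to compare against the 9 flagged lines. A correction that splits or merges lines will shift that count, so treat it as a prompt for a manual read, not a pass/fail gate.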

First fully logged run on a non-homepage first page. The run exercises the first-page-decision, page-argument-shape and claim-check steps on a trust-heavy object (fictional plant-based dog food brand, Kind Bowl) where the correct first page turns out to be a nutritional-adequacy page, not a homepage.

Run contents:
- goal, models, working-log
- 6 source briefs (product, wedge, mechanism, proof, comparison, plus source-capture note)
- per-step prompts and raw outputs for all 7 steps
- final artifacts: signal doc, message spine, first-page decision, page argument shape, proof map, first-page draft, corrected draft after hard-severity claim check, claim-check report
- evaluation and adversarial evaluator-pass

The claim check at hard severity flagged 9 lines on a draft that read clean. Two recurring patterns surfaced (editorial voice and unsourced quantitative modifiers) that the current claim-check framework does not explicitly name. Those gaps, plus a few others surfaced by the evaluator pass, will be addressed in a follow-up commit on this branch.

The fully logged second-pass run exposed specific, tractable gaps in the method scaffolding. This commit applies those fixes so the run actually strengthens the repo, rather than just sitting inside it.

Changes:
- frameworks/claim-checking.md: add three flag types the run surfaced: editorial voice (brand narrating its own restraint), unsourced quantitative modifier, clinical/regulatory drift. Add an explicit severity calibration section (light / normal / hard) and a note on when patterns should be absorbed upstream into the argument shape instead of being re-caught by claim check on every run.
- templates/claim-check-template.md: reflect the new flag types. Add a severity field and a recurring-patterns summary section.
- frameworks/page-argument-shape.md: add length-and-density as an explicit shape-level concern. A page that is honest but unread does not earn trust. Name the drafting-constraints layer as the place where recurring claim-check patterns should be absorbed.
- templates/page-argument-shape-template.md: strengthen the standing drafting-constraints list with the patterns the run found the default shape did not prevent (unsourced quantitative modifiers, editorial voice, external-credential description bloat, clinical drift). Add a length/density check.
- frameworks/first-page-decision.md: name the role of mechanism, proof and comparison briefs in sharpening the decision. Add a candidates-considered section as a requirement for strong decisions on trust-heavy objects.
- frameworks/run-logging.md: evaluator pass is now explicitly required for the fully-logged tier. Sources listed explicitly as a required component.
- README.md: update logged-runs section to reflect the new second-pass run and mark the first-pass run as superseded summary.
- examples/README.md + examples/vegan-dog-food-second-pass-summary.md: add a summary for the second-pass run and note what it proves and what it deliberately does not prove yet.
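The claim-check additions above (three flag types, a severity field, a recurring-patterns summary) can be pictured as a small data model. This is an illustrative sketch only, not the structure the templates actually use. The class names, the `min_count` threshold of 2 for calling a pattern "recurring", and the per-flag fields are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    LIGHT = "light"
    NORMAL = "normal"
    HARD = "hard"


class FlagType(Enum):
    EDITORIAL_VOICE = "editorial voice"
    UNSOURCED_QUANTITATIVE_MODIFIER = "unsourced quantitative modifier"
    CLINICAL_REGULATORY_DRIFT = "clinical/regulatory drift"


@dataclass(frozen=True)
class Flag:
    line: int            # line number in the draft (hypothetical field)
    flag_type: FlagType
    note: str            # reviewer's short explanation (hypothetical field)


@dataclass
class ClaimCheckReport:
    severity: Severity
    flags: list[Flag] = field(default_factory=list)

    def recurring_patterns(self, min_count: int = 2) -> list[FlagType]:
        """Flag types seen at least min_count times, most frequent first."""
        counts = Counter(f.flag_type for f in self.flags)
        return [t for t, n in counts.most_common() if n >= min_count]
```

The types returned by `recurring_patterns` are the candidates for absorbing upstream into the drafting constraints, so claim check does not have to re-catch them on every run.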

hnshah · 2026-04-14T04:29:42Z

Closing this PR per user direction — the improvements here (claim-check severity, editorial-voice flag type, evaluator-pass required at fully-logged tier, first-page-decision framework expansions) are being redone on a new branch (claude/kill-the-ai-slop) alongside the anti-slop work so the repo lands in one coherent state.

The second-pass run itself (runs/vegan-dog-food-second-pass/) will not follow — it stays a branch-only artifact. The method improvements it surfaced are what get carried forward.

Generated by Claude Code

…skill+agent, new-source-brief.sh, PUBLISHABLE tier, pre-commit

Ships 11 of the 13 Tier 1+2 items from the roadmap. Two items (#2 and #9, real-object runs) need objects from the user and are flagged for a follow-up.

Tier 1 (high leverage, small effort):
- .github/workflows/check.yml: runs doctor.sh and slop-check.sh on every push and PR.
- .github/PULL_REQUEST_TEMPLATE.md: matches the repo's actual PR rhythm.
- .github/ISSUE_TEMPLATE/{method-feedback,bug,method-proposal}.md
- CONTRIBUTING.md: four shapes of contribution, required checks, what bad contribution looks like.
- README.md: sharper public hero. Points at runs/vegan-dog-food-verdel/ as the canonical worked example. Leads with the core principle instead of a feature list.
- (v0.1.0 tag will be pushed after this PR merges.)

Tier 2 (high leverage, medium effort):
- .claude/agents/pagekit-evaluator-pass.md: read-only subagent for the adversarial evaluator pass. Produces runs/<name>/evaluator-pass.md.
- .claude/skills/pagekit-evaluator-pass/SKILL.md: skill wrapper that delegates to the subagent.
- scripts/new-source-brief.sh: scaffold a single source brief (wedge / mechanism / proof / comparison). Auto-numbered; reads the canonical brief templates; matches the Verdel sources/ pattern.
- scripts/run-check.sh: new PUBLISHABLE tier above FULLY LOGGED. Requires claim-check.md filled in (not placeholder), first-page-draft.md filled in (not placeholder), and slop-check clean on the draft(s). Verdel now classifies as PUBLISHABLE; empty scaffolds classify as FULLY LOGGED with a clear punch list.
- .pre-commit-config.yaml: local enforcement of slop-check + doctor on every commit. Optional (contributors opt in with pre-commit install).
- CHANGELOG.md: seeded with v0.1.0 (the current merged state) and [Unreleased] for this PR's additions.

Related:
- scripts/doctor.sh updated to include the new subagent, the new skill, and scripts/new-source-brief.sh in its manifest checks.

Deferred (need user input):
- Tier 1 #2 and Tier 2 #9: real-object runs (non-fictional). Requires an object from the user. Flagged in the handoff.

Verified:
- scripts/doctor.sh PASS
- scripts/slop-check.sh exit 0 clean
- scripts/run-check.sh runs/vegan-dog-food-verdel -> PUBLISHABLE
- scripts/run-check.sh runs/<fresh-scaffold> -> FULLY LOGGED (not PUBLISHABLE)
- scripts/new-source-brief.sh smoke test created mechanism brief
- scripts/new-run.sh still scaffolds correctly with doctor.sh post-change
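The PUBLISHABLE rule described for scripts/run-check.sh can be restated as a short decision function. This is a hedged sketch in Python, not the script itself. It assumes the run already qualifies as FULLY LOGGED, treats a file as a placeholder when it is empty or contains a hypothetical `TODO` marker, and takes the slop-check result as a boolean instead of invoking slop-check.sh.

```python
from pathlib import Path

# Hypothetical marker; how the real script detects placeholders is not
# stated in this PR.
PLACEHOLDER_MARKER = "TODO"

REQUIRED_FILES = ("claim-check.md", "first-page-draft.md")


def is_filled_in(path: Path) -> bool:
    """True if the file exists, is non-empty and has no placeholder marker."""
    if not path.is_file():
        return False
    text = path.read_text(encoding="utf-8").strip()
    return bool(text) and PLACEHOLDER_MARKER not in text


def classify_run(run_dir: Path, slop_check_clean: bool) -> tuple[str, list[str]]:
    """Return (tier, punch_list) for a run assumed to be fully logged."""
    punch_list = []
    for name in REQUIRED_FILES:
        if not is_filled_in(run_dir / name):
            punch_list.append(f"{name} is missing or still a placeholder")
    if not slop_check_clean:
        punch_list.append("slop-check is not clean on the draft(s)")
    tier = "PUBLISHABLE" if not punch_list else "FULLY LOGGED"
    return tier, punch_list
```

Returning the punch list alongside the tier mirrors the behavior described above, where an empty scaffold classifies as FULLY LOGGED and is told what is still missing.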

claude added 2 commits April 14, 2026 03:44

hnshah closed this Apr 14, 2026

hnshah mentioned this pull request Apr 14, 2026

Kill the AI slop + ship all five tiers from the three-run analysis #3

Merged

5 tasks

hnshah deleted the claude/vegan-dog-food-second-pass branch April 14, 2026 16:03

hnshah mentioned this pull request Apr 14, 2026

Roadmap Tier 1 + 2: CI, templates, CONTRIBUTING, CHANGELOG, evaluator-pass skill, new-source-brief.sh, PUBLISHABLE tier, pre-commit #9

Merged

5 tasks

hnshah wants to merge 2 commits into main from claude/vegan-dog-food-second-pass
