Vegan dog food second pass: fully logged, non-homepage, hard claim-check + repo fixes it surfaced #2

Closed
hnshah wants to merge 2 commits into main from claude/vegan-dog-food-second-pass

hnshah commented Apr 14, 2026

Summary

Two commits on this branch:

  1. Fully logged second-pass run on the vegan-dog-food object (fictional Kind Bowl). The first-page-decision step produced a non-homepage first page (a nutritional-adequacy page). The claim-check step ran at hard severity and flagged 9 lines on a draft that initially read clean. All seven steps have per-step prompts, raw outputs and final artifacts. Evaluation + adversarial evaluator-pass included.

  2. Repo improvements the run exposed, applied in a separate commit so the run-to-repo loop is auditable:

    • frameworks/claim-checking.md — three new flag types (editorial voice, unsourced quantitative modifier, clinical/regulatory drift), explicit severity calibration (light/normal/hard), note on absorbing recurring patterns upstream.
    • templates/claim-check-template.md — reflects new flag types, adds severity field and recurring-patterns section.
    • frameworks/page-argument-shape.md + template — add length/density as a shape-level concern, strengthen default drafting constraints.
    • frameworks/first-page-decision.md — role of mechanism/proof/comparison briefs in sharpening the decision, candidates-considered requirement for trust-heavy objects.
    • frameworks/run-logging.md — evaluator-pass now explicitly required at the fully-logged tier.
    • README.md + examples/README.md + new examples/vegan-dog-food-second-pass-summary.md.
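The claim-check changes above can be pictured as a small data model. This is a hypothetical Python sketch, not the contents of frameworks/claim-checking.md or the template: the class names, the `line_no` field and the "seen at least twice" threshold for a recurring pattern are all assumptions made for illustration. Only the three flag types and the three severity levels come from this PR.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    # Severity levels named in the PR; what each level checks is not modeled here.
    LIGHT = "light"
    NORMAL = "normal"
    HARD = "hard"


class FlagType(Enum):
    # Only the three flag types this PR adds; pre-existing types are omitted.
    EDITORIAL_VOICE = "editorial voice"
    UNSOURCED_QUANTITATIVE_MODIFIER = "unsourced quantitative modifier"
    CLINICAL_REGULATORY_DRIFT = "clinical/regulatory drift"


@dataclass
class Flag:
    line_no: int  # hypothetical: line of the draft that was flagged
    flag_type: FlagType
    note: str = ""


@dataclass
class ClaimCheckReport:
    severity: Severity
    flags: list[Flag] = field(default_factory=list)

    def recurring_patterns(self, min_count: int = 2) -> dict[FlagType, int]:
        """Flag types seen at least min_count times (the threshold is an assumption)."""
        counts = Counter(flag.flag_type for flag in self.flags)
        return {flag_type: n for flag_type, n in counts.items() if n >= min_count}
```

A report built this way would carry the severity field the template now asks for, and `recurring_patterns()` would give the raw material for the recurring-patterns section, which is where candidates for absorbing upstream into the argument shape would show up.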

What this run does not prove

  • The architecture is not yet validated beyond this one non-homepage object.
  • The claim-check step has not yet been tested on a real (non-fictional) brand where proof-map gaps would be more common.

Both gaps are named honestly in evaluation.md and evaluator-pass.md and surfaced in examples/README.md.

Test plan

  • Read runs/vegan-dog-food-second-pass/goal.md, then evaluation.md and evaluator-pass.md.
  • Walk runs/vegan-dog-food-second-pass/prompts/ and outputs/ in order 01-07 to confirm every step has both a prompt file and a raw output file.
  • Confirm first-page-draft.md and first-page-draft-corrected.md differ in the 9 flagged lines.
  • Confirm the repo-improvement commit touches only scaffolding, not the run itself.
  • Confirm README.md logged-runs section now lists three runs with their tiers (fully logged, fully logged, summary logged) and names the second-pass run as the first non-homepage validation.
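The two mechanical checks in the test plan could be scripted roughly as follows. This is a minimal sketch under stated assumptions: it assumes files in prompts/ and outputs/ begin with a two-digit step number (the PR only says "01-07"), and the helper names `steps_missing_files` and `changed_line_count` are hypothetical, not part of the repo.

```python
import difflib
import re
from pathlib import Path


def steps_missing_files(run_dir: Path, steps: int = 7) -> list[str]:
    """Return one message per step that lacks a prompt or a raw output file.

    Assumes file names start with a two-digit step number such as "01";
    the real naming in the run directory may differ.
    """
    problems = []
    for folder in ("prompts", "outputs"):
        directory = run_dir / folder
        present = set()
        if directory.is_dir():
            for path in directory.iterdir():
                match = re.match(r"(\d{2})", path.name)
                if match:
                    present.add(int(match.group(1)))
        for step in range(1, steps + 1):
            if step not in present:
                problems.append(f"{folder}/: no file for step {step:02d}")
    return problems


def changed_line_count(draft: str, corrected: str) -> int:
    """Count draft lines that were replaced or deleted in the corrected draft."""
    matcher = difflib.SequenceMatcher(
        None, draft.splitlines(), corrected.splitlines(), autojunk=False
    )
    return sum(
        i2 - i1
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
        if tag in ("replace", "delete")
    )
```

Run against runs/vegan-dog-food-second-pass/, an empty list from `steps_missing_files` would cover the prompts/outputs check, and `changed_line_count` on the text of first-page-draft.md and first-page-draft-corrected.md should come out at 9 if each flagged line was edited in place. A correction that splits or merges lines would shift that count, so a mismatch calls for reading the diff rather than failing the review outright.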

claude added 2 commits April 14, 2026 03:44
First fully logged run on a non-homepage first page. The run exercises
the first-page-decision, page-argument-shape and claim-check steps on a
trust-heavy object (fictional plant-based dog food brand, Kind Bowl)
where the correct first page turns out to be a nutritional-adequacy
page, not a homepage.

Run contents:
- goal, models, working-log
- 6 source files: 5 briefs (product, wedge, mechanism, proof,
  comparison) plus a source-capture note
- per-step prompts and raw outputs for all 7 steps
- final artifacts: signal doc, message spine, first-page decision,
  page argument shape, proof map, first-page draft, corrected draft
  after hard-severity claim check, claim-check report
- evaluation and adversarial evaluator-pass

The claim check at hard severity flagged 9 lines on a draft that
read clean. Two recurring patterns surfaced (editorial voice and
unsourced quantitative modifiers) that the current claim-check
framework does not explicitly name. Those gaps, plus a few others
surfaced by the evaluator pass, will be addressed in a follow-up
commit on this branch.

The fully logged second-pass run exposed specific, tractable gaps in
the method scaffolding. This commit applies those fixes so the run
actually strengthens the repo, rather than just sitting inside it.

Changes:

- frameworks/claim-checking.md — add three flag types the run surfaced:
  editorial voice (brand narrating its own restraint), unsourced
  quantitative modifier, clinical/regulatory drift. Add an explicit
  severity calibration section (light / normal / hard) and a note on
  when patterns should be absorbed upstream into the argument shape
  instead of being re-caught by claim check on every run.

- templates/claim-check-template.md — reflect the new flag types.
  Add a severity field and a recurring-patterns summary section.

- frameworks/page-argument-shape.md — add length-and-density as an
  explicit shape-level concern. A page that is honest but unread does
  not earn trust. Name the drafting-constraints layer as the place
  where recurring claim-check patterns should be absorbed.

- templates/page-argument-shape-template.md — strengthen the standing
  drafting-constraints list with the patterns the run found the
  default shape did not prevent (unsourced quantitative modifiers,
  editorial voice, external-credential description bloat, clinical
  drift). Add a length/density check.

- frameworks/first-page-decision.md — name the role of mechanism,
  proof and comparison briefs in sharpening the decision. Add a
  candidates-considered section as a requirement for strong decisions
  on trust-heavy objects.

- frameworks/run-logging.md — evaluator pass is now explicitly
  required for the fully-logged tier. Sources listed explicitly as a
  required component.

- README.md — update logged-runs section to reflect the new
  second-pass run and mark the first-pass run as superseded summary.

- examples/README.md + examples/vegan-dog-food-second-pass-summary.md
  — add a summary for the second-pass run and note what it proves
  and what it deliberately does not prove yet.

hnshah commented Apr 14, 2026

Closing this PR per user direction — the improvements here (claim-check severity, editorial-voice flag type, evaluator-pass required at fully-logged tier, first-page-decision framework expansions) are being redone on a new branch (claude/kill-the-ai-slop) alongside the anti-slop work so the repo lands in one coherent state.

The second-pass run itself (runs/vegan-dog-food-second-pass/) will not follow — it stays a branch-only artifact. The method improvements it surfaced are what get carried forward.


Generated by Claude Code

hnshah closed this Apr 14, 2026
hnshah deleted the claude/vegan-dog-food-second-pass branch April 14, 2026 16:03
hnshah pushed a commit that referenced this pull request Apr 14, 2026
…skill+agent, new-source-brief.sh, PUBLISHABLE tier, pre-commit

Ships 11 of the 13 Tier 1+2 items from the roadmap. Two items
(#2 and #9, real-object runs) need objects from the user and are
flagged for a follow-up.

Tier 1 (high leverage, small effort):
- .github/workflows/check.yml: runs doctor.sh and slop-check.sh on
  every push and PR.
- .github/PULL_REQUEST_TEMPLATE.md: matches the repo's actual PR rhythm.
- .github/ISSUE_TEMPLATE/{method-feedback,bug,method-proposal}.md
- CONTRIBUTING.md: four shapes of contribution, required checks,
  what bad contribution looks like.
- README.md: sharper public hero. Points at runs/vegan-dog-food-verdel/
  as the canonical worked example. Leads with the core principle
  instead of a feature list.
- (v0.1.0 tag will be pushed after this PR merges.)

Tier 2 (high leverage, medium effort):
- .claude/agents/pagekit-evaluator-pass.md: read-only subagent for
  the adversarial evaluator pass. Produces runs/<name>/evaluator-pass.md.
- .claude/skills/pagekit-evaluator-pass/SKILL.md: skill wrapper that
  delegates to the subagent.
- scripts/new-source-brief.sh: scaffold a single source brief (wedge /
  mechanism / proof / comparison). Auto-numbered; reads the canonical
  brief templates; matches the Verdel sources/ pattern.
- scripts/run-check.sh: new PUBLISHABLE tier above FULLY LOGGED.
  Requires claim-check.md filled in (not placeholder), first-page-draft.md
  filled in (not placeholder), and slop-check clean on the draft(s).
  Verdel now classifies as PUBLISHABLE; empty scaffolds classify as
  FULLY LOGGED with a clear punch list.
- .pre-commit-config.yaml: local enforcement of slop-check + doctor
  on every commit. Optional (contributors opt in with pre-commit install).
- CHANGELOG.md: seeded with v0.1.0 (the current merged state) and
  [Unreleased] for this PR's additions.
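
The PUBLISHABLE rule described for scripts/run-check.sh can be illustrated with a short Python sketch. It is not the script's actual logic (the script is shell and its contents are not shown here): the placeholder markers, the `is_filled_in` and `classify` helpers, and the `slop_check_clean` callable standing in for scripts/slop-check.sh are all assumptions.

```python
from pathlib import Path
from typing import Callable

# Hypothetical markers; how the real script detects a placeholder is not stated.
PLACEHOLDER_MARKERS = ("TODO", "<placeholder>")


def is_filled_in(path: Path) -> bool:
    """True if the file exists, is non-empty, and contains no placeholder marker."""
    if not path.is_file():
        return False
    text = path.read_text(encoding="utf-8").strip()
    return bool(text) and not any(marker in text for marker in PLACEHOLDER_MARKERS)


def classify(run_dir: Path, slop_check_clean: Callable[[Path], bool]) -> str:
    """Classify a run that already meets the FULLY LOGGED requirements.

    slop_check_clean stands in for running slop-check on the draft.
    """
    draft = run_dir / "first-page-draft.md"
    claim_check = run_dir / "claim-check.md"
    if is_filled_in(claim_check) and is_filled_in(draft) and slop_check_clean(draft):
        return "PUBLISHABLE"
    return "FULLY LOGGED"
```

Under these assumptions a fresh scaffold stays at FULLY LOGGED because its claim-check.md and draft are still placeholders, matching the behavior listed under Verified.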

Related:
- scripts/doctor.sh updated to include the new subagent, the new
  skill, and scripts/new-source-brief.sh in its manifest checks.

Deferred (need user input):
- Tier 1 #2 and Tier 2 #9: real-object runs (non-fictional). Requires
  an object from the user. Flagged in the handoff.

Verified:
- scripts/doctor.sh PASS
- scripts/slop-check.sh exit 0 clean
- scripts/run-check.sh runs/vegan-dog-food-verdel -> PUBLISHABLE
- scripts/run-check.sh runs/<fresh-scaffold> -> FULLY LOGGED (not PUBLISHABLE)
- scripts/new-source-brief.sh smoke test created mechanism brief
- scripts/new-run.sh still scaffolds correctly with doctor.sh post-change