Skip to content

docs(workflow): STACKED_PRS.md + session cost log + 2026-04-18 post-mortem#112

Merged
github-actions[bot] merged 1 commit intomainfrom
docs/stacked-prs-graphite
Apr 18, 2026
Merged

docs(workflow): STACKED_PRS.md + session cost log + 2026-04-18 post-mortem#112
github-actions[bot] merged 1 commit intomainfrom
docs/stacked-prs-graphite

Conversation

@mojwang
Copy link
Copy Markdown
Owner

@mojwang mojwang commented Apr 18, 2026

Tier 3.1 + 3.3 from the next-session plan. Flows today's workflow learnings back into the canonical agentic infra.

Summary

`docs/STACKED_PRS.md` (new)

`docs/CLAUDE_AGENTS.md` — adds session-cost-log format

  • `scripts/.session-cost.log` append-only: `date | topic | dispatches | models | outcome`
  • Documents the weekly-review pattern-detection use case
  • Cross-references STACKED_PRS.md for the "manual stacks indicate tooling gap" pattern

Test plan

  • `docs/STACKED_PRS.md` renders cleanly on GitHub
  • Cross-reference links resolve in both directions
  • No conflicts with existing Graphite docs — the new file is a strategic-view companion, not a replacement

Pre-existing findings (no changes this PR)

  • Graphite is already in `homebrew/Brewfile` (`withgraphite/tap/graphite`)
  • GIT-WORKFLOW.md already has a thorough Graphite tutorial section
  • So Tier 3.1's "install gt" action item is already satisfied — just the strategic doc was missing

Tier 3.1 + 3.3 alignment from the 2026-04-18 autonomous session.

docs/STACKED_PRS.md (new)
- Strategic-view companion to the existing GIT-WORKFLOW.md Graphite
  tutorial. Covers WHEN to stack, manual retarget fallback when gt
  isn't available, and the concrete 2026-04-18 incident post-mortem
  (PRs #64/#65 on mojwang.tech merged into feature branches instead
  of main because auto-delete-head-branches was off)
- Documents repo hygiene defaults:
  - GitHub "Automatically delete head branches" ON
  - Vercel-Neon "Delete Neon database branch when Vercel preview is
    deleted" ON
- Cross-referenced from GIT-WORKFLOW.md's Graphite section

docs/CLAUDE_AGENTS.md — Session cost log format
- Adds the `scripts/.session-cost.log` append-only format to the
  Cost Awareness section. Fields: date | topic | dispatches |
  models | outcome
- Captures directional cost awareness for weekly-review pattern
  detection (research-heavy vs implementation, stacks-needing-
  Graphite-threshold, etc.)
- Cross-references STACKED_PRS.md for the "manual stacking
  indicates missing tooling" pattern
@github-actions github-actions Bot merged commit 1c007c0 into main Apr 18, 2026
8 checks passed
@github-actions github-actions Bot deleted the docs/stacked-prs-graphite branch April 18, 2026 22:44
@mojwang mojwang requested a review from Copilot April 19, 2026 00:41
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds strategic documentation to complement the existing Graphite tutorial, capturing stacked-PR workflow guidance and recent incident learnings, and introduces a lightweight session cost log format for agentic workflow review.

Changes:

  • Add docs/STACKED_PRS.md covering when/why to stack PRs, manual fallback retargeting, repo hygiene defaults, and a 2026-04-18 incident post‑mortem.
  • Cross-reference the new stacked-PR doc from docs/GIT-WORKFLOW.md.
  • Document an append-only session cost log line format in docs/CLAUDE_AGENTS.md.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
docs/STACKED_PRS.md New strategic doc for stacked PRs, manual fallback guidance, and incident post‑mortem.
docs/GIT-WORKFLOW.md Adds a “See also” link pointing readers to the new strategic stacked-PR doc.
docs/CLAUDE_AGENTS.md Defines a session cost log line format and links it to stacked-PR/tooling gap patterns.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread docs/STACKED_PRS.md
gt auth
```

Bootstrap script (`setup-macos.sh`) adds this alongside `gh`, `delta`, `fzf`, etc.
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setup-macos.sh doesn’t appear to be responsible for installing Homebrew packages (including Graphite); Graphite is installed from homebrew/Brewfile via scripts/install-packages.sh. Consider updating this line to point at the Brewfile/install-packages step (or setup.sh) so the install source is accurate.

Suggested change
Bootstrap script (`setup-macos.sh`) adds this alongside `gh`, `delta`, `fzf`, etc.
In this repo, Graphite is installed alongside `gh`, `delta`, `fzf`, and other Homebrew packages from `homebrew/Brewfile` via `scripts/install-packages.sh`.

Copilot uses AI. Check for mistakes.
Comment thread docs/STACKED_PRS.md
Comment on lines +63 to +67
If you're mid-stack without `gt`, **always retarget each downstream PR's base to `main` before the parent merges**. GitHub doesn't do this automatically unless "Auto-delete head branches" is on AND the parent's head gets deleted via the merge.

```bash
gh pr edit <child-pr-number> --base main
```
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The manual retarget guidance here is likely backwards: retargeting the child PR to main before the parent merges removes the stacked-diff benefit (the child diff will include the parent). The usual mitigation for the failure mode described is to retarget (or delete the parent branch to trigger GitHub retargeting) immediately after the parent merges, and before merging the child PR.

Copilot uses AI. Check for mistakes.
Comment thread docs/STACKED_PRS.md
Comment on lines +36 to +37
# Submit the whole stack as coordinated PRs
gt stack submit
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doc uses gt branch create / gt stack submit, but docs/GIT-WORKFLOW.md’s Graphite section uses gt create and gt submit --stack (and calls out gt ss). To avoid confusing readers, it’d help to align on one command style or explicitly mention the equivalent commands/aliases.

Copilot uses AI. Check for mistakes.
Comment thread docs/CLAUDE_AGENTS.md
Comment on lines +179 to +183
At session end, append a single line to `scripts/.session-cost.log` (append-only, gitignored, reviewed at weekly review):

```
YYYY-MM-DD | <session_topic> | <dispatches> | <models_used> | <outcome>
```
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This says scripts/.session-cost.log is gitignored, but there’s no ignore entry for it in the repo’s .gitignore right now. Either add scripts/.session-cost.log to .gitignore (preferred if you want it append-only/local), or adjust the wording so it doesn’t imply it’s already ignored.

Copilot uses AI. Check for mistakes.
mojwang added a commit that referenced this pull request Apr 19, 2026
Roadmap P0.2 from the 2026-04-19 overnight research. Unblocks P0.3
(/grade-session retroactive grading) and P5.1 (meta-agent) — both need
a real log with real rows.

PR #112 already documented a 5-field cost-log format but never shipped
a writer. This PR ships the writer and extends the format to 6 fields:
adds `agent_sha` so later analysis can correlate outcomes with
specific agent-prompt versions (P5.1 requirement).

Format (pipe-separated):
  YYYY-MM-DDTHH:MM:SS | topic | dispatches | models | outcome | agent_sha

Outcome enum: shipped | partial | reverted | blocked | plan-only | (empty)
Empty = grade later via /grade-session (P0.3).

Design decisions:
- Slash command + shell script, NOT a Stop hook. Claude Code's Stop
  fires every turn, not session end, so a hook-driven autolog would
  misfire on partial turns. One explicit /log-session per session is
  cleaner and gives the user control over the topic/dispatches/models
  summary (fields no hook could auto-capture).
- Timestamp instead of just date — multiple sessions per day are
  common.
- Auto-captured fields: timestamp (`date`), agent_sha (`git log` on
  .claude/agents/).
- --outcome sentinel: `--outcome ""` explicitly marks ungraded,
  skipping the prompt; missing --outcome prompts interactively.
- Pipe chars in user fields get sanitized to `;` so a malformed
  topic doesn't break column parsing.
- Log file is gitignored (personal state, not shared artifact).

Files:
- scripts/log-session.sh — new, executable, shellcheck clean
- .claude/commands/log-session.md — new slash command wrapper
- docs/CLAUDE_AGENTS.md — updated format doc (6 fields, broader enum)
- .gitignore — adds scripts/.session-cost.log

Verified:
- shellcheck clean
- Happy path: `--topic … --outcome shipped` writes a row
- Empty outcome: `--outcome ""` writes a row with blank 5th field
- Invalid outcome: `--outcome bogus` exits 2 with clear error
- Pipe sanitization: topic containing `|` rewrites as `;`
- Gitignored: log file doesn't appear in git status
github-actions Bot pushed a commit that referenced this pull request Apr 19, 2026
## Context

Roadmap **P0.2** from the [2026-04-19 overnight research implementation
roadmap](https://github.com/mojwang/mojwang.tech/blob/main/../../../../../ai/workspace/claude/resources/agentic-research/2026-04-19-implementation-roadmap.md).
Unblocks **P0.3** (\`/grade-session\` retroactive grading) and **P5.1**
(meta-agent) — both need a real log with real rows.

PR #112 shipped documentation for a 5-field cost-log format but never
shipped a writer. This PR ships the writer and extends the format to 6
fields: adds \`agent_sha\` so later analysis can correlate outcomes with
specific agent-prompt versions (P5.1 requirement).

## Format (pipe-separated)

\`\`\`
YYYY-MM-DDTHH:MM:SS | topic | dispatches | models | outcome | agent_sha
\`\`\`

Outcome enum: \`shipped | partial | reverted | blocked | plan-only |
(empty)\`. Empty = "grade later," filled retroactively by
\`/grade-session\` (forthcoming in P0.3).

## Design decisions

- **Slash command + shell script, NOT a Stop hook.** Claude Code's
\`Stop\` hook fires every turn, not session end, so a hook-driven
autolog would misfire on partial turns. One explicit \`/log-session\`
per session is cleaner and gives the user control over
topic/dispatches/models (fields no hook could auto-capture).
- **Timestamp instead of just date** — multiple sessions per day are
common.
- **Auto-captured fields:** timestamp (\`date\`), agent_sha (\`git log\`
on \`.claude/agents/\`).
- **\`--outcome\` sentinel:** \`--outcome \"\"\` explicitly marks
ungraded, skipping the prompt; missing \`--outcome\` prompts
interactively.
- **Pipe-char sanitization:** user fields replace \`|\` with \`;\` so
malformed input doesn't break column parsing.
- **Log file gitignored** — personal state, not a shared artifact.

## Files

- \`scripts/log-session.sh\` — new, executable, \`shellcheck\` clean,
follows \`#!/usr/bin/env bash\` + \`set -euo pipefail\` convention
- \`.claude/commands/log-session.md\` — new slash-command wrapper
- \`docs/CLAUDE_AGENTS.md\` — updated format doc (6 fields, broader
outcome enum, usage example)
- \`.gitignore\` — adds \`scripts/.session-cost.log\`

## Usage

\`\`\`
/log-session --topic \"P0.2 session-logger infra\" --dispatches \"—\"
--models \"sonnet\" --outcome \"shipped\"
\`\`\`

Missing flags prompt interactively.

## Verification

- [x] \`shellcheck scripts/log-session.sh\` — clean
- [x] Happy path: \`--topic … --outcome shipped\` writes a row
- [x] Empty outcome: \`--outcome \"\"\` writes a row with blank 5th
field
- [x] Invalid outcome: \`--outcome bogus\` exits 2 with clear error
- [x] Pipe sanitization: topic containing \`|\` is rewritten as \`;\`
- [x] Gitignored: log file doesn't appear in \`git status\`

## Out of scope (deferred to P0.3)

- \`/grade-session\` retroactive grading (walks ungraded rows)
- \`weekly-review\` command update to surface ungraded sessions
- SessionStart hook nudge (\"N ungraded sessions\")
- Auto-capture of dispatches + models via transcript parsing (Claude
Code doesn't expose transcripts to hooks)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants