Skip to content

feat: compaction-recovery hooks + subagent watchdog & quality-rotation patterns#23

Merged
intel352 merged 2 commits into
GoCodeAlone:mainfrom
intel352:feat/compaction-recovery-and-watchdog
Apr 29, 2026
Merged

feat: compaction-recovery hooks + subagent watchdog & quality-rotation patterns#23
intel352 merged 2 commits into
GoCodeAlone:mainfrom
intel352:feat/compaction-recovery-and-watchdog

Conversation

@intel352
Copy link
Copy Markdown
Contributor

@intel352 intel352 commented Apr 29, 2026

Summary

Three resilience layers so long autonomous superpowers runs don't silently degrade after a compaction event or a stuck subagent.

What's new

1. Compaction-recovery hooks (Claude Code, Cursor)

Long pipeline runs are vulnerable to context compaction wiping pipeline state and subagent task assignments. This PR adds two hooks:

Event Matcher Action
PostToolUse Skill / Agent / Task Append entry to .claude/superpowers-state/in-progress.jsonl (capped at 200 lines)
SessionStart startup / clear Wipe state file (existing using-superpowers injection unchanged)
SessionStart compact / resume Inject using-superpowers PLUS a <superpowers-resumption-context> block with the first user message from the transcript (the original task) and the last 30 activity entries

SessionStart compact was chosen over PostCompact because the additionalContext injection mechanism is documented for SessionStart. Both lead and subagent sessions get this hook firing inside their own contexts on their own transcripts, so a compacted subagent recovers its original assignment automatically. The hook also fires for Cursor since .cursor-plugin/plugin.json already wires hooks at the same path. Both hook scripts no-op gracefully when jq is unavailable.

The state file lives in .claude/superpowers-state/ (project-local, JSONL). Cleanup is automatic at session boundaries.

2. Watchdog cadence (every 5–10 minutes)

skills/subagent-driven-development/SKILL.md gains a Resilience section prescribing a 5–10 minute health-check cadence on background subagents:

  • Still running? Producing output? API errors / rate limits / context-length blowups? Off-task?
  • If stuck: send corrective input. If unrecoverable: terminate and re-dispatch a fresh subagent with a one-line note about what the prior attempt got wrong.

Cross-host: tool-specific guidance is wrapped in <host: claude-code> (TaskList / TaskOutput / SendMessage / TaskStop / ScheduleWakeup), <host: codex> (scratch-context tracking + thread status pings), and <host: opencode> (@mention peer pings). The host-neutral body is the pattern itself.

3. Quality-based rotation

Same skill section also prescribes replacing — not re-prompting — a subagent_type that is consistently low-quality. Rotation triggers:

  • 2 consecutive code-review rejections on the same task → switch subagent_type or escalate model tier (see agents/model-tiers.md)
  • 3 cumulative quality issues across tasks in one session → stop dispatching that subagent_type for the session
  • 2 instances of ignored explicit guidance → stop using; the issue is the agent profile, not the prompt

Rotation must be stated visibly in user-facing text so the user can redirect.

Why

A subagent that compacts mid-flight, hits a transient API error, or quietly drifts off-task can burn 30+ minutes of autonomous run time before anyone notices. The hook automation handles compaction recovery deterministically on hosts that support hooks; the watchdog and rotation patterns rely on the orchestrator applying them consistently.

Files changed

  • hooks/session-start — extended to read PostToolUse JSON stdin, parse source, and on compact|resume build the <superpowers-resumption-context> block from the transcript + state file. Existing using-superpowers injection unchanged. JSON-escaping helper preserved.
  • hooks/record-activity (new) — PostToolUse hook script. Reads JSON stdin, writes {ts, tool, detail} JSONL, caps at 200 lines.
  • hooks/hooks.json — adds the PostToolUse entry. SessionStart entry unchanged.
  • skills/subagent-driven-development/SKILL.md — appended Resilience section with three subsections, host-conditional blocks for claude-code / codex / opencode.
  • RELEASE-NOTES.md — v5.3.0 entry.
  • .claude-plugin/plugin.json, .claude-plugin/marketplace.json — version 5.2.0 → 5.3.0.
  • .cursor-plugin/plugin.json — version 5.1.0 → 5.3.0 (was lagging the other two; synced).

Smoke test

End-to-end test of the hooks against a synthesized transcript:

cat > /tmp/transcript.jsonl <<EOF
{"type":"user","message":{"content":"Run subagent-driven-development on the auth refactor plan."}}
EOF
echo '{"tool_name":"Skill","tool_input":{"skill":"brainstorming"},"cwd":"/tmp"}' | hooks/record-activity
echo '{"tool_name":"Agent","tool_input":{"subagent_type":"superpowers:code-reviewer","description":"review task 3","run_in_background":true},"cwd":"/tmp"}' | hooks/record-activity
echo '{"source":"compact","cwd":"/tmp","transcript_path":"/tmp/transcript.jsonl"}' | hooks/session-start | jq -r '.hookSpecificOutput.additionalContext'

Renders both the original-task block and the recent-activity replay correctly.

Notes

  • I deliberately did not invent host-specific hook mechanisms for Codex/OpenCode — they don't have a documented hooks system, so on those hosts the recovery pattern is described as a manual discipline. PRs welcome to add equivalent automation if/when those hosts gain hook events.

Adds three resilience layers so long autonomous superpowers runs do not
silently degrade after a compaction or a stuck subagent.

1. Compaction-recovery hooks (Claude Code, Cursor)

   PostToolUse hook on Skill|Agent|Task records each invocation to
   .claude/superpowers-state/in-progress.jsonl (capped at 200 lines, wiped
   on startup|clear). SessionStart matcher compact|resume re-injects a
   <superpowers-resumption-context> block containing:

     - the original task (extracted from the first user message in the
       transcript) — re-anchors a compacted SUBAGENT to its assignment
     - the last 30 activity entries — re-anchors the LEAD orchestrator
       to its place in the pipeline

   Both hook scripts no-op gracefully when jq is unavailable. The hook
   fires inside each session against its own transcript, so subagents
   recover automatically.

2. Watchdog cadence (every 5–10 min)

   subagent-driven-development now prescribes a 5–10 min health-check
   cadence on background subagents (still active? output flowing? API
   errors? off-task?) with corrective playbooks for stuck agents.
   Tool-specific guidance is wrapped in <host: claude-code> blocks;
   <host: codex> and <host: opencode> get host-conditional equivalents
   (scratch-context tracking, status pings via thread / @mention).

3. Quality-based rotation

   subagent-driven-development now prescribes replacing — not re-prompting —
   a subagent_type that is consistently low-quality. Rotation triggers:
   2 consecutive review rejections on a task, 3 cumulative session quality
   issues, or 2 instances of ignored guidance. Rotation should be stated
   visibly in user-facing text so the user can redirect.

Why these patterns: A subagent that compacts mid-flight, hits a transient
API error, or quietly drifts off-task can burn 30+ minutes of autonomous
run time before anyone notices. The hook automation handles compaction
recovery deterministically on hosts that support hooks; the watchdog and
rotation patterns rely on the orchestrator applying them consistently.

Bumps plugin version 5.2.0 -> 5.3.0 across .claude-plugin/plugin.json,
.claude-plugin/marketplace.json, and .cursor-plugin/plugin.json (the
last had been at 5.1.0 — synced to match the others). RELEASE-NOTES.md
gets a v5.3.0 entry.

Cross-host conditionals follow the existing convention (<host: claude-code>,
<host: codex>, <host: opencode, cursor>); the host-neutral body is the
pattern itself, host blocks carry the specific tools.
Copilot AI review requested due to automatic review settings April 29, 2026 18:29
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds resilience mechanisms to the superpowers plugin so long-running orchestrations can recover after context compaction, detect stuck subagents, and rotate away from consistently low-quality subagent types.

Changes:

  • Add a PostToolUse hook to record recent Skill/Agent/(Task) activity to a capped project-local JSONL state file and replay that context on SessionStart for compact|resume.
  • Extend subagent-driven-development guidance with compaction recovery, watchdog cadence, and quality-rotation patterns (with host-conditional sections).
  • Bump plugin versions and document the new behavior in release notes.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
skills/subagent-driven-development/SKILL.md Adds “Resilience” section covering compaction recovery, watchdog checks, and quality-based rotation.
hooks/session-start Injects resumption context on `compact
hooks/record-activity New PostToolUse hook that appends activity events to .claude/superpowers-state/in-progress.jsonl and caps file size.
hooks/hooks.json Registers the new PostToolUse hook wiring to record-activity.
RELEASE-NOTES.md Documents v5.3.0 resilience features.
.cursor-plugin/plugin.json Version bump to 5.3.0.
.claude-plugin/plugin.json Version bump to 5.3.0.
.claude-plugin/marketplace.json Version bump to 5.3.0.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread hooks/record-activity Outdated
Comment thread skills/subagent-driven-development/SKILL.md Outdated
Comment thread RELEASE-NOTES.md Outdated
Comment thread hooks/session-start
Comment thread hooks/session-start Outdated
Comment thread hooks/record-activity
Comment thread hooks/record-activity
- record-activity: broaden Task* matcher to capture TaskCreate/TaskList/
  TaskGet/TaskUpdate (header comment + new case arm with generic
  best-effort detail), fix the docstring that previously implied only
  Skill/Agent were recorded.
- record-activity: serialize the cap/rotate step to prevent concurrent
  writers from corrupting the JSONL during truncation. Uses flock when
  available, falls back to an atomic-mkdir mutex on platforms without
  util-linux flock (notably macOS). PID-tagged tmp file prevents tmp
  collisions.
- hooks.json: matcher updated to Skill|Agent|Task.* to actually capture
  Task* family tools.
- session-start: fail closed when the session source can't be determined
  (e.g. jq unavailable) so stale activity from a prior session can't
  leak into the resumption context.
- session-start: fence-wrap the original_task content so tag-like text
  (including a literal </superpowers-resumption-context>) in the first
  user message can't terminate or nest inside the resumption block. Fence
  length escalates if backticks already appear in the content.
- SKILL.md: link to agents/model-tiers.md (the actual file) instead of
  a non-existent superpowers:model-tiers agent for the tier-escalation
  reference.
- SKILL.md + RELEASE-NOTES.md: update matcher docs to reflect Task.* and
  the fail-closed wipe.
@intel352 intel352 merged commit 8beba50 into GoCodeAlone:main Apr 29, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants