Skip to content

bigbabyjack/agent-harness

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 

Repository files navigation

harness

Lean rails for agentic work. A trigger-based rewrite of the superpowers plugin, grounded in the agent-harness research at bigbabyjack/agentic-research.

Why this exists

superpowers is built around the implicit thesis that ceremony makes agents reliable — hard gates, "1% chance" rules, brainstorm-then-design-doc-then-plan-then-execute-then-review for every task.

The agent-harness research argues for a different thesis: verification loops + hook-enforced discipline make agents reliable. Skills exist to encode habits the agent should reach for, not to gate every action behind ritual. The plugin's job is to make the agent's work checkable, not ceremonious.

This plugin is the second thesis in concrete form.

What changed from superpowers

Kept (lightly edited):

  • verification-before-completion — the load-bearing rule
  • test-driven-development — red → green → refactor → commit
  • systematic-debugging — hypothesis-driven, not patch-and-pray

Rewritten:

  • brainstormingdesign-before-architecture-changes (gated on >2 files / public interface / new module, not "every task")
  • writing-plans + executing-plans → folded into the design skill; long-lived plan docs replaced with in-conversation sketches
  • requesting-code-review + receiving-code-reviewarchitecture-review (structural concerns, not line-level nits)
  • using-superpowersusing-harness (triggers, not gates)

Demoted (default-off):

  • using-git-worktreesworkspace-isolation (in-place is the default; isolation requires a named reason)
  • subagent-driven-development + dispatching-parallel-agentsparallelism (single-agent until single-agent is the bottleneck)

Added (gaps the harness research identified):

  • atomic-commits — explicit file lists, Conventional Commits, never bypass hooks, prefer the project's committer wrapper
  • no-suppression — no # noqa / # type: ignore / // @ts-ignore
  • hook-discipline — fix the cause, never --no-verify
  • continuous-refactor — standing-practice slop reduction
  • evolutionary-instructionsAGENTS.md grows by accretion
  • two-tier-stateWORKING_NOTES.md (replaceable) + DECISIONS.md (append-only ADR-lite)
  • trigger-phrases — vocabulary to steer agents on hard problems

Skill index

Skill Trigger
using-harness session start
verification-before-completion before any "done"/"fixed"/"passing" claim
atomic-commits before any git commit
no-suppression before adding a suppression marker
hook-discipline when a hook fails
test-driven-development before writing implementation code
systematic-debugging when behavior is unexpected
design-before-architecture-changes task touches >2 files, a public interface, or a new module
continuous-refactor after shipping a feature commit-set
evolutionary-instructions non-obvious pattern formalizes mid-session
two-tier-state context-risk ops; non-obvious decisions
architecture-review end of feature commit-set
trigger-phrases task is hard, agent is rushing
parallelism only when single-agent is the bottleneck
workspace-isolation only when in-place work is unsafe
writing-skills creating or editing a skill

Design rules

  • ~80 lines per skill, hard cap ~100. Long skills get skimmed once and ignored thereafter.
  • Trigger in the description field. Specific enough that the agent can decide invoke-or-not in one read.
  • Rule, then reason. Rules with reasons survive edge cases; rules without reasons get rationalized away.
  • No shouting. No <EXTREMELY-IMPORTANT> tags, no all-caps imperatives, no "you cannot rationalize your way out of this." The harness's discipline comes from hooks and verification loops, not from emphasis.
  • No mandatory announce-on-invoke. No mandatory TodoWrite per checklist item. Use those tools when they help, not because a skill demanded it.

Templates (the other half of the thesis)

The thesis is "discipline comes from hooks," so the plugin ships the hooks too. None of these are auto-installed — you wire them into the project once.

Entry point

A SessionStart hook in hooks/ injects the using-harness index as additionalContext so the agent discovers the skill set without the user having to invoke it first. The hook is deliberately light — no <EXTREMELY_IMPORTANT> wrapping, no mandatory announce-on-invoke, just the index as a reference.

If your platform doesn't run plugin hooks, paste templates/CLAUDE.md.snippet into the project's AGENTS.md instead.

TODO (v0.2)

  • Slash commands. /harness:verify-and-commit, /harness:design, /harness:review as friction reducers — invoke a skill without remembering its name.
  • Marketplace name. harness-dev is fine for development. Pick a publish name before pushing to a public marketplace.
  • More language templates. Rust pre-commit config, Go equivalent.
  • CI variants. GitHub Actions / GitLab CI snippets that mirror the pre-commit gates.

Status

Initial draft. Skills will evolve as patterns formalize — see evolutionary-instructions.

About

Lean rails for agentic work: trigger-based Claude Code plugin with verification loops, atomic commits, hook discipline, evolutionary docs.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors