addeelnayyer/forge

Forge

Evidence-first coding quality for Claude Code.

Forge classifies risk, captures a verification baseline, runs a multi-tier verification cascade, and spawns adversarial reviewer sub-agents before presenting code to you.


Installation

/install forge

Slash Commands

/forge <task>

Runs the full evidence-first Forge Loop for a given task.

Phases:

  1. Size — classifies the task as Small, Medium, or Large
  2. Classify — rates each file 🟢/🟡/🔴; any 🔴 file escalates to Large
  3. Baseline — runs the verification cascade before touching code; records results
  4. Implement — edits files; skips to Verify if git diff already shows changes
  5. Verify cascade — Tier 1 (diagnostics/syntax) → Tier 2 (build/types/lint/tests) → Tier 3 (smoke script)
  6. Adversarial review — 1 reviewer (Medium) or 3 parallel reviewers (Large / 🔴); fixes findings; max 2 rounds
  7. Evidence Bundle — structured markdown output with before/after results and confidence level
  8. Commit — auto-commit with structured message; captures pre-commit SHA for rollback
  9. Session file — saves implementations/<task-id>.md to repo root
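The escalation and reviewer rules in phases 2 and 6 can be sketched as below. This is only an illustration of the rules as stated above, not Forge's actual implementation: the type names and function names are hypothetical, and the reviewer count of 0 for Small tasks is an assumption (the phase list only specifies Medium and Large).

```typescript
type Risk = "green" | "yellow" | "red";
type Size = "Small" | "Medium" | "Large";

// From phase 6: review-and-fix is capped at two rounds.
const MAX_REVIEW_ROUNDS = 2;

// Phase 2: any red file escalates the task to Large, whatever its initial size.
function effectiveSize(initial: Size, fileRisks: Risk[]): Size {
  return fileRisks.some((r) => r === "red") ? "Large" : initial;
}

// Phase 6: Medium gets 1 reviewer, Large gets 3 parallel reviewers.
// Assumption: Small tasks get no reviewer (not stated in the README).
function reviewerCount(size: Size): number {
  switch (size) {
    case "Large":
      return 3;
    case "Medium":
      return 1;
    default:
      return 0;
  }
}
```

For instance, a task sized Small that touches one red file would be treated as Large and reviewed by three reviewers.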

/forge-verify

Runs the verification cascade on currently changed files, with no implementation phase. Useful for verifying changes made manually or by another tool.

Steps: detect changed files → classify risk → run Tiers 1–3 → output Evidence Bundle.
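A minimal sketch of how a tiered cascade like this might be driven. The `Tier`, `CheckResult`, and `Runner` names are hypothetical, and the command runner is injected so the sketch stays self-contained. Stopping after the first failing tier is an assumption; the README does not say whether later tiers still run.

```typescript
interface CheckSpec {
  check: string;
  command: string;
}

interface Tier {
  name: string;
  checks: CheckSpec[];
}

interface CheckResult {
  check: string;
  command: string;
  passed: boolean;
}

// Hypothetical runner: executes a command and returns its exit code.
type Runner = (command: string) => number;

function runCascade(tiers: Tier[], run: Runner): CheckResult[] {
  const results: CheckResult[] = [];
  for (const tier of tiers) {
    let tierPassed = true;
    for (const spec of tier.checks) {
      // A check passes when its command exits with code 0.
      const passed = run(spec.command) === 0;
      results.push({ check: spec.check, command: spec.command, passed: passed });
      if (!passed) tierPassed = false;
    }
    // Assumption: later (more expensive) tiers are skipped once a tier fails.
    if (!tierPassed) break;
  }
  return results;
}
```

Every check in a tier is still recorded even when one fails, so the resulting list can feed a before/after table directly.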


Auto-Activated Skills

These skills activate automatically without you invoking them:

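| Skill               | When it fires                                                                        |
|---------------------|--------------------------------------------------------------------------------------|
| forge:risk-classify | Before Claude edits, creates, or deletes files; rates each file 🟢/🟡/🔴              |
| forge:evidence-gate | Before Claude writes a "build/tests passed" claim; requires real tool-call evidence  |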
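The idea behind the evidence gate can be illustrated roughly as follows: a "passed" claim is only allowed when the log of real tool calls contains a matching, successful command. This is a sketch under assumptions, not the skill's actual logic. The `ToolCall` shape, the `REQUIRED_COMMAND` mapping, and the rule that the most recent call wins are all hypothetical.

```typescript
interface ToolCall {
  command: string;
  exitCode: number;
}

// Hypothetical mapping from a claim to the command that would prove it.
const REQUIRED_COMMAND: { [claim: string]: string } = {
  build: "npm run build",
  tests: "npm test",
};

function claimAllowed(claim: string, log: ToolCall[]): boolean {
  const required = REQUIRED_COMMAND[claim];
  if (required === undefined) return false; // unknown claim: nothing can prove it
  // Assumption: the most recent run of the command decides the outcome.
  for (let i = log.length - 1; i >= 0; i--) {
    if (log[i].command === required) return log[i].exitCode === 0;
  }
  return false; // the command was never run, so the claim has no evidence
}
```

With an empty log, `claimAllowed("tests", [])` is `false`: no tool call, no claim.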

Evidence Bundle

At the end of every /forge or /forge-verify run, you receive a structured report:

```markdown
## 🔨 Forge Evidence Bundle

**Task**: add-payment-webhook | **Size**: L | **Risk**: 🔴

### Baseline (before changes)
| Check        | Result        | Command              |
|--------------|---------------|----------------------|
| diagnostics  | ✅ 0 errors   | ide-get_diagnostics  |
| build        | ✅ exit 0     | npm run build        |
| types        | ✅ exit 0     | npm run type-check   |
| tests        | ✅ 47 passed  | npm test             |

### After Changes
| Check        | Result        | Command              |
|--------------|---------------|----------------------|
| diagnostics  | ✅ 0 errors   | ide-get_diagnostics  |
| build        | ✅ exit 0     | npm run build        |
| types        | ✅ exit 0     | npm run type-check   |
| tests        | ✅ 48 passed  | npm test             |

### Adversarial Review
| Reviewer | Findings                                          |
|----------|---------------------------------------------------|
| sonnet   | Missing HMAC validation line 34 (95) — **fixed**  |
| haiku    | Idempotency gap line 67 (78) — **fixed**          |
| opus     | No issues                                         |

**Confidence**: High — all tiers passed, all reviewer findings fixed
**Rollback**: `git revert HEAD` or `git checkout <pre-sha> -- <file>`
```
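The point of recording a baseline is that the "after" results can be compared against it. A minimal sketch of that comparison, with hypothetical names: treating a check that disappears after the change as a regression is an assumption, not something the README specifies.

```typescript
interface Check {
  check: string;
  passed: boolean;
}

// Returns the names of checks that passed in the baseline but not afterwards.
function findRegressions(baseline: Check[], after: Check[]): string[] {
  const regressions: string[] = [];
  for (const before of baseline) {
    // A check that was already failing before the change is not a regression.
    if (!before.passed) continue;
    const match = after.filter((c) => c.check === before.check)[0];
    // Assumption: a check missing from the "after" run counts as a regression.
    if (match === undefined || !match.passed) regressions.push(before.check);
  }
  return regressions;
}
```

In the sample bundle above every check passes both before and after, so a comparison like this would return an empty list.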

Confidence Levels

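| Level  | Meaning                                                                                                                                              |
|--------|------------------------------------------------------------------------------------------------------------------------------------------------------|
| High   | All tiers passed, no regressions, and reviewers found either zero issues or only issues that were fixed                                              |
| Medium | Most checks passed, but a reviewer concern was addressed without certainty, or test coverage for the changed path is missing                         |
| Low    | A check failed and could not be fixed, or a reviewer raised something that could not be disproved; the bundle states what would raise confidence     |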
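One way to read the confidence levels as a decision rule is sketched below. The `RunSummary` fields are hypothetical names for the conditions listed above, and the ordering (Low conditions first, then Medium) is an assumption about how the levels combine.

```typescript
type Confidence = "High" | "Medium" | "Low";

interface RunSummary {
  unfixedFailures: number;     // checks that failed and could not be fixed
  undisprovenFindings: number; // reviewer findings that could not be disproved
  uncertainFixes: number;      // reviewer concerns addressed without certainty
  changedPathCovered: boolean; // whether tests exercise the changed path
}

function confidenceLevel(s: RunSummary): Confidence {
  // Low: something is still failing or still open.
  if (s.unfixedFailures > 0 || s.undisprovenFindings > 0) return "Low";
  // Medium: everything passed, but with a caveat.
  if (s.uncertainFixes > 0 || !s.changedPathCovered) return "Medium";
  // High: all tiers passed and every finding was fixed.
  return "High";
}
```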

Worked Example

/forge add HMAC signature validation to webhook handler

→ Size: Large (webhook + signature keywords)
→ Classify: payments.ts 🔴, order.service.ts 🟡, payments.test.ts 🟢
→ Baseline: build ✅  types ✅  tests 47 ✅
→ Implement changes
→ Verify: build ✅  types ✅  tests 48 ✅
→ 3× forge-reviewer (parallel)
    sonnet: Missing timingSafeEqual (95) — fixed
    haiku:  Idempotency gap (78) — fixed
    opus:   No issues
→ Re-verify after fixes: all ✅
→ Second review round: no new findings
→ Evidence Bundle: Confidence High
→ Commit created
→ implementations/add-hmac-validation.md saved

Risk Classification

| Color | Meaning                                                  | Examples                                                                               |
|-------|----------------------------------------------------------|----------------------------------------------------------------------------------------|
| 🔴    | Critical: escalates task to Large, triggers 3 reviewers  | Auth, payments, crypto, schema migrations, data deletion, public API surface, webhooks |
| 🟡    | Significant: existing logic being modified               | Business logic, service files, DB queries, UI state, controllers                       |
| 🟢    | Additive: low blast radius                               | New test files, documentation, config, new files from scratch, CSS                     |
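A rough sketch of a path-based classifier in the spirit of this table. How Forge actually decides a file's color is not documented here, so the keyword patterns, the function name, and the rule order below are all assumptions, chosen to reproduce the three files in the worked example.

```typescript
type Risk = "red" | "yellow" | "green";

// Hypothetical patterns, loosely derived from the Examples column.
const ADDITIVE_PATTERN = /\.test\.|\.spec\.|\.md$|\.css$/i;
const CRITICAL_PATTERN = /auth|payment|crypto|migration|webhook/i;

function classifyFile(path: string, isNewFile: boolean): Risk {
  // Tests, docs, and CSS are treated as additive even if the name contains a critical keyword.
  if (ADDITIVE_PATTERN.test(path)) return "green";
  // Critical areas stay red whether the file is new or existing.
  if (CRITICAL_PATTERN.test(path)) return "red";
  // Other brand-new files have a low blast radius.
  if (isNewFile) return "green";
  // Anything else is existing logic being modified.
  return "yellow";
}
```

Under these assumed rules, `payments.ts` is red, `order.service.ts` is yellow, and `payments.test.ts` is green, matching the worked example.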
