# @rezzedai/specfirst

Confidence-scored specifications for Claude Code. Plan before you build. Know what you don't know.


## Why

Most plan-before-build tools enforce a binary gate: spec exists or doesn't. But not all specs are created equal. A 10-step plan where step 7 is "figure out the database schema" isn't much better than no plan at all.

specfirst adds a gradient. Every step in your spec carries a 4-dimensional confidence score. Steps below threshold get flagged before you write a line of code. You see exactly where the uncertainty lives — unclear requirements, unfamiliar implementation, unknown risks, or missing dependencies.

The result: fewer rewrites, better estimates, and honest uncertainty instead of false confidence.

## Quick Start

### Installation

```bash
npm install -g @rezzedai/specfirst
```

Or use without installing:

```bash
npx @rezzedai/specfirst init
```

### Setup

1. Initialize in your project:

   ```bash
   cd your-project/
   specfirst init
   ```

   This creates `.specfirst/` with the default configuration.

2. Add to Claude Code. Copy the skill definition to your Claude Code skills directory:

   ```bash
   cp node_modules/@rezzedai/specfirst/SKILL.md .claude/skills/spec.md
   ```

   Or if installed via npx:

   ```bash
   mkdir -p .claude/skills && curl -o .claude/skills/spec.md https://raw.githubusercontent.com/rezzedai/specfirst/main/SKILL.md
   ```

3. Create specs in Claude Code:

   ```
   /spec Add a /health endpoint that returns JSON with status and uptime
   ```

Note: `/spec` is a Claude Code skill, not a CLI command. It runs inside Claude Code sessions. The CLI commands (`init`, `list`, `review`, `status`) manage specs after they're created.

Claude will analyze your codebase, decompose the task, score each step, and write a structured spec to `.specfirst/specs/`.

### Review Your Specs

```bash
specfirst list
specfirst review .specfirst/specs/2026-02-15T14-30-00Z-add-health-endpoint.md
```

## How It Works

specfirst follows a 7-step pipeline:

  1. ANALYZE — Read your codebase to understand structure, patterns, and dependencies
  2. CLARIFY — Ask questions if the task is ambiguous (or skip if clear)
  3. DECOMPOSE — Break the task into ordered, testable implementation steps
  4. SCORE — Rate each step on 4 dimensions (0.0 to 1.0 scale)
  5. AUTO-SCOPE — Trivial tasks (≤2 steps, all scores ≥0.9) get lightweight specs
  6. VERIFY — Identify failure modes and adjust the plan (full specs only)
  7. OUTPUT — Write a structured spec file with scoring summary

## Confidence Scoring

Each step is scored on 4 dimensions:

| Dimension | What It Measures |
|-----------|------------------|
| Requirement Clarity (RC) | How well-defined is what needs to be built? |
| Implementation Certainty (IC) | How confident are you in the how? |
| Risk Awareness (RA) | How well do you understand what could go wrong? |
| Dependency Clarity (DC) | Do you know what this step depends on? |

Composite score = weighted average of the 4 dimensions (default: equal weights, 0.25 each).

Overall plan confidence = geometric mean of all step scores. This penalizes catastrophic uncertainty — one step with 0.2 confidence drags the whole plan into review, as it should.
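
As a rough sketch (illustrative TypeScript, not the specfirst source; `compositeScore` and `planConfidence` are hypothetical names), the two formulas combine like this:

```typescript
// Illustrative sketch, not the specfirst implementation. Shows the
// default scoring math: weighted average per step, geometric mean
// across steps.

interface StepScores {
  rc: number; // Requirement Clarity
  ic: number; // Implementation Certainty
  ra: number; // Risk Awareness
  dc: number; // Dependency Clarity
}

// Default weights (equal, 0.25 each; see .specfirst/config.yaml).
const W = { rc: 0.25, ic: 0.25, ra: 0.25, dc: 0.25 };

// Composite step score: weighted average of the four dimensions.
function compositeScore(s: StepScores): number {
  return s.rc * W.rc + s.ic * W.ic + s.ra * W.ra + s.dc * W.dc;
}

// Overall plan confidence: geometric mean of all step composites.
function planConfidence(steps: StepScores[]): number {
  const product = steps.map(compositeScore).reduce((a, b) => a * b, 1);
  return Math.pow(product, 1 / steps.length);
}
```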

### Calibrated Anchors

Scoring prompts include examples to prevent overconfidence:

- RC 1.0: "Add /health endpoint returning {status: 'ok'} with HTTP 200." (specific, testable)
- RC 0.5: "Make the API more observable." (general direction, details unclear)
- IC 1.0: Standard pattern seen many times (e.g., adding a REST endpoint with Express)
- IC 0.5: General approach known, implementation details need research (e.g., WebSocket subscriptions)

See `SKILL.md` for the full calibration table.

### Thresholds

Each score maps to an action:

| Score | Status | Meaning |
|-------|--------|---------|
| ≥0.8 | ✅ Green | Proceed — safe to implement autonomously |
| 0.5–0.79 | ⚠️ Yellow | Review — flagged for human review before implementation |
| <0.5 | 🛑 Red | Block — requires human decision before proceeding |

Thresholds are configurable via `.specfirst/config.yaml`.
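
In code terms, the mapping is a two-threshold check (a sketch using the default cutoffs; `classify` is a hypothetical helper, not part of the CLI):

```typescript
// Sketch of the threshold mapping with the default cutoffs
// from .specfirst/config.yaml (proceed: 0.8, review: 0.5).
type Status = "green" | "yellow" | "red";

function classify(score: number, proceed = 0.8, review = 0.5): Status {
  if (score >= proceed) return "green"; // proceed: implement autonomously
  if (score >= review) return "yellow"; // review: flag for a human
  return "red"; // block: requires a human decision
}

classify(0.92); // "green"
classify(0.65); // "yellow"
classify(0.4);  // "red"
```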

## Output Format

A spec file looks like this:

```markdown
# Spec: Add Health Endpoint

**Created:** 2026-02-15T14:30:00Z
**Task:** Add a /health endpoint that returns JSON with status and uptime
**Status:** draft
**Overall Confidence:** 0.92 ✅

---

## Context

The API currently has no health checking. This adds a lightweight endpoint
for monitoring tools to verify the service is running and track uptime.

## Steps

### Step 1: Add route handler

- **What:** Create GET /health endpoint in routes/health.js
- **Files:** src/routes/health.js, src/app.js
- **Dependencies:** Express (already installed)
- **Confidence:** 0.95 (✅)
  - Requirement Clarity: 1.0
  - Implementation Certainty: 0.9
  - Risk Awareness: 1.0
  - Dependency Clarity: 0.9
- **Risks:** None — standard Express route pattern
- **Verification:** curl localhost:3000/health returns 200 with JSON

### Step 2: Add uptime tracking

- **What:** Track process start time, calculate uptime in seconds
- **Files:** src/utils/uptime.js, src/routes/health.js
- **Dependencies:** Node.js process.uptime() (built-in)
- **Confidence:** 0.90 (✅)
  - Requirement Clarity: 0.9
  - Implementation Certainty: 0.9
  - Risk Awareness: 0.9
  - Dependency Clarity: 0.9
- **Risks:** Uptime resets on process restart (expected behavior)
- **Verification:** Verify uptime increments on subsequent requests

---

## Scoring Summary

| Step | Score | RC  | IC  | RA  | DC  | Status |
|------|-------|-----|-----|-----|-----|--------|
| 1    | 0.95  | 1.0 | 0.9 | 1.0 | 0.9 | ✅     |
| 2    | 0.90  | 0.9 | 0.9 | 0.9 | 0.9 | ✅     |

**Overall Confidence:** 0.92 ✅

**Steps requiring review:** 0 of 2
**Flagged dimensions:** None
```

For trivial tasks (≤2 steps, all scores ≥0.9), you get a lightweight version without the Chain of Verification section.
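
The auto-scope decision reduces to a simple predicate (a sketch of the rule under the default config values, not the actual implementation):

```typescript
// Sketch: a plan qualifies for a lightweight spec only if it is short
// and every step is high-confidence (defaults from config.yaml).
function isLightweight(
  stepScores: number[],
  maxSteps = 2,
  minScore = 0.9
): boolean {
  return stepScores.length <= maxSteps && stepScores.every((s) => s >= minScore);
}

isLightweight([0.95, 0.9]);       // true: lightweight spec
isLightweight([0.95, 0.9, 0.92]); // false: three steps, full spec
isLightweight([0.95, 0.85]);      // false: one step below 0.9, full spec
```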

## CLI Reference

### `specfirst init`

Initialize specfirst in the current project.

```bash
specfirst init
```

Creates the `.specfirst/` directory with a default `config.yaml`.

### `specfirst list`

List all specs in the project.

```bash
specfirst list

# Output:
# .specfirst/specs/
#   2026-02-15T14-30-00Z-add-health-endpoint.md (0.92 ✅)
#   2026-02-14T09-15-00Z-database-migration.md (0.65 ⚠️)
```

### `specfirst review <file>`

Display a formatted summary of a spec.

```bash
specfirst review .specfirst/specs/2026-02-15T14-30-00Z-add-health-endpoint.md
```

Outputs: task, overall confidence, step count, flagged steps, and next actions.

### `specfirst status <file> <status>`

Update a spec's status.

```bash
specfirst status .specfirst/specs/2026-02-15T14-30-00Z-add-health-endpoint.md implementing
```

Valid statuses: `draft`, `reviewed`, `approved`, `implementing`, `complete`
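
Expressed as a TypeScript union (the type name is illustrative), the lifecycle is a closed set:

```typescript
// The five statuses accepted by `specfirst status`, in rough lifecycle order.
type SpecStatus = "draft" | "reviewed" | "approved" | "implementing" | "complete";
```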

## Configuration

Edit `.specfirst/config.yaml` to customize:

```yaml
# Confidence thresholds
thresholds:
  proceed: 0.8       # Green — implement autonomously
  review: 0.5        # Yellow — flagged for human review (below = red/blocked)

# Scoring weights (must sum to 1.0)
weights:
  requirement_clarity: 0.25
  implementation_certainty: 0.25
  risk_awareness: 0.25
  dependency_clarity: 0.25

# Auto-scope: lightweight spec for trivial tasks
auto_scope:
  enabled: true
  max_steps_for_lightweight: 2
  min_score_for_lightweight: 0.9

# Output
output:
  directory: ".specfirst/specs"
  format: "markdown"
```

Examples:

- Raise the bar: `thresholds.proceed: 0.9` (only super-confident plans go green)
- Weight risk higher: `weights.risk_awareness: 0.4` (reduce others to sum to 1.0)
- Disable auto-scoping: `auto_scope.enabled: false` (always generate full specs)

## Comparison

### vs. superpowers

| Feature | superpowers | specfirst |
|---------|-------------|-----------|
| Focus | Methodology + templates | Measurement + confidence |
| Size | 22K tokens (skill definition) | ~2K tokens |
| Enforcement | Binary (spec exists or doesn't) | Graduated (green/yellow/red) |
| Scoring | None | 4-dimensional per step |
| Auto-scope | No | Yes (trivial tasks get lightweight specs) |
| Dependencies | Multiple npm packages | Zero (Node built-ins only) |
| Philosophy | "Follow this process" | "Measure your certainty" |

Both are valuable. superpowers is a comprehensive framework for AI-assisted development. specfirst is a focused tool for confidence-aware planning. Use them together if you want both process and measurement.

## Why Geometric Mean?

The overall plan confidence uses geometric mean, not arithmetic mean. Here's why:

Arithmetic mean treats all steps equally:

- Plan: [0.9, 0.9, 0.9, 0.2] → arithmetic mean ≈ 0.73 (looks okay)

Geometric mean penalizes catastrophic uncertainty:

- Plan: [0.9, 0.9, 0.9, 0.2] → geometric mean ≈ 0.62 (correctly flagged)

One step with 0.2 confidence should drag the entire plan into review. That's the step where the plan will blow up. Geometric mean surfaces that signal.
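
You can check the arithmetic directly (a standalone sketch, unrelated to the specfirst internals):

```typescript
const scores = [0.9, 0.9, 0.9, 0.2];

// Arithmetic mean: sum divided by count.
const arithmetic = scores.reduce((a, b) => a + b, 0) / scores.length; // ≈ 0.73

// Geometric mean: nth root of the product.
const geometric = Math.pow(
  scores.reduce((a, b) => a * b, 1),
  1 / scores.length
); // ≈ 0.62

console.log({ arithmetic, geometric });
```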

## Contributing

Contributions welcome! This is an open-source project by Rezzed AI.

To contribute:

1. Fork the repo
2. Create a feature branch: `git checkout -b feature/your-feature`
3. Make your changes (add tests if applicable)
4. Run tests: `npm test` (once the test suite exists)
5. Commit with clear messages
6. Open a PR with a description of what changed and why

Ideas for contributions:

- Additional output formats (JSON, YAML)
- IDE integrations (VS Code extension)
- Spec diff tool (compare two versions of a spec)
- Calibration dataset (real specs with outcomes for scoring calibration)
- Language-specific analyzers (better context extraction for Python, Go, Rust, etc.)

See GitHub Issues for open tasks.

## What's Next?

More tools coming from the @rezzedai toolkit. See rezzed.ai for updates.

## License

MIT License. See LICENSE for details.


Built by Rezzed — the AI product studio.
