Warning
Work in Progress — Expect Daily Breaking Changes
This plugin is under active development and is not stable. The workflow, commands, skills, and configuration are subject to change at any time — often daily. If you are using this, expect things to break. There is no guarantee of backwards compatibility between any two versions. Proceed accordingly.
An intelligent, phased workflow plugin for Claude Code that takes software projects from initial idea through complete, validated implementation. JSF enforces test-driven development discipline, security review gates, and structured human validation checkpoints — so you ship code that works and is safe.
- Simple over clever: JSF builds only what you ask for, with no speculative features or premature abstractions
- Safety by default: hooks block dangerous shell, git, and SQL commands before they execute
- Validated progress: each phase must pass automated tests and (when required) manual review before proceeding
- Persistent state: work survives interruptions via a file-based memory system; resume where you left off
- Open Claude Code and navigate to Settings → Plugins → Marketplace
- Search for "jsf" or "John's Software Factory"
- Click Install
That's it. No additional setup is required for basic use.
JSF emits OpenTelemetry spans for every Claude session and tool call, giving you APM-style visibility into what Claude is doing and how long each step takes. Traces appear in Jaeger (or any OTLP-compatible backend) with no extra configuration beyond enabling Claude Code's standard telemetry.
Enable tracing by setting these env vars before running `claude`:

```bash
# Required — opt in to Claude Code telemetry (enables JSF tracing hooks too)
export CLAUDE_CODE_ENABLE_TELEMETRY=1

# Point at your OTLP collector (gRPC)
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
```

Once set, every Claude session automatically produces:
| Span | When emitted |
|---|---|
| `claude.session` | Root span — opened on session start, closed on stop |
| `claude.tool_call` | One child span per tool invocation (Bash, Read, Edit, etc.) |
| `claude.session.stop` | Session ended normally |
| `claude.session.subagent_stop` | Sub-agent completed |
| `claude.session.compact` | Context compaction triggered |
| `claude.session.notification` | Notification event fired |
| `factory.task` | Full JSF workflow run (linked to `claude.session`) |
| `factory.phase` | Each implementation phase (linked to `claude.session`) |
All spans are sent to your OTLP collector under `service.name=jsf`. View them in the Jaeger UI at http://localhost:16686 — select service `jsf` and you'll see the full trace tree for your session.
See the Claude Code monitoring documentation for the full list of supported environment variables, multi-team tagging, and backend configuration options.
The observability/ directory contains a ready-to-run Docker Compose stack that
receives, stores, and visualises all three OTel signal types — metrics, traces,
and logs — using entirely open-source components:
| Component | Image | Purpose |
|---|---|---|
| otel-collector | `otel/opentelemetry-collector-contrib:0.92.0` | Receives OTLP and fans out to backends |
| Prometheus | `prom/prometheus:v2.48.0` | Metric storage and query |
| Jaeger | `jaegertracing/all-in-one:1.53` | Trace storage and UI |
| Loki | `grafana/loki:2.9.3` | Log storage |
| Grafana | `grafana/grafana:10.2.3` | Unified dashboard (datasources pre-wired) |
Start the stack:
```bash
cd observability/
docker compose up -d
```

Service endpoints:
| Service | URL |
|---|---|
| Grafana | http://localhost:3000 (admin / admin) |
| Prometheus | http://localhost:9090 |
| Jaeger UI | http://localhost:16686 |
| Loki | http://localhost:3100/ready |
| OTLP gRPC | localhost:4317 |
| OTLP HTTP | localhost:4318 |
Stop and remove volumes:
```bash
docker compose down -v
```

Claude Code can export metrics and events to the local collector via the standard OpenTelemetry environment variables. Set these before running `claude`:
```bash
# Required — opt in to telemetry
export CLAUDE_CODE_ENABLE_TELEMETRY=1

# Send both metrics and logs/events over OTLP
export OTEL_METRICS_EXPORTER=otlp
export OTEL_LOGS_EXPORTER=otlp

# Point at the local collector's gRPC endpoint
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317

# Optional: faster export intervals for local development
export OTEL_METRIC_EXPORT_INTERVAL=10000  # 10s (default: 60s)
export OTEL_LOGS_EXPORT_INTERVAL=5000     # 5s (default: 5s)

claude
```

What Claude Code exports:
Metrics (visible in Prometheus / Grafana):
| Metric | Description |
|---|---|
| `claude_code.session.count` | CLI sessions started |
| `claude_code.cost.usage` | Session cost in USD |
| `claude_code.token.usage` | Tokens used (input/output/cache) |
| `claude_code.lines_of_code.count` | Lines added/removed |
| `claude_code.active_time.total` | Time spent (user vs. CLI) |
| `claude_code.commit.count` | Git commits created |
| `claude_code.pull_request.count` | PRs created |
| `claude_code.code_edit_tool.decision` | Tool permission accept/reject decisions |
Events (visible in Loki / Grafana):
| Event | Description |
|---|---|
| `claude_code.user_prompt` | Prompt submitted (length only by default) |
| `claude_code.api_request` | Each API call — model, cost, tokens, latency |
| `claude_code.api_error` | Failed API requests |
| `claude_code.tool_result` | Tool execution outcome, duration, decision |
| `claude_code.tool_decision` | Permission decision for a tool call |
All signals include `session.id`, `user.account_uuid`, `organization.id`, `service.name=claude-code`, and OS/arch attributes.
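As a quick sanity check once data is flowing, you can query token usage in Prometheus or Grafana. Note that the collector's Prometheus exporter typically rewrites OTel metric names — dots become underscores and a unit or `_total` suffix may be appended — so the exact name below is an assumption; use the Prometheus metric browser to confirm what your collector actually emits.

```
# Hypothetical exported name for claude_code.token.usage — verify in the
# Prometheus UI before relying on it
sum by (type) (rate(claude_code_token_usage_tokens_total[5m]))
```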
Optional: enable richer logging
```bash
# Log user prompt content (disabled by default for privacy)
export OTEL_LOG_USER_PROMPTS=1

# Log MCP server/tool names and skill names in tool_result events
export OTEL_LOG_TOOL_DETAILS=1
```

For more detail on all variables and multi-team tagging, see the Claude Code monitoring documentation.
```bash
pip install -r path/to/jsf-plugin/scripts/requirements.txt
```

Once installed, start a new project session with:
/jsf:start I want to build a REST API for managing book reviews
JSF will guide you through a structured clarification dialogue, generate a technical spec and phased implementation plan, implement each phase with TDD discipline, and validate each phase before moving on. You can check progress, pause, and resume at any time.
Commands are invoked with the /jsf: prefix in Claude Code.
Starts a new software factory workflow. Provide an optional description of what you want to build; if omitted, JSF will ask.
What it does:
- Writes an initial memory checkpoint
- Launches the clarification dialogue (scope, success criteria, tech stack, constraints)
- Confirms your answers before proceeding to planning
Example:
/jsf:start Add webhook support to the existing notifications service
Displays the current state of the active factory workflow.
Shows:
- Project summary and clarification status
- All planned phases with completion indicators
- Any pending validations waiting on manual review
- Memory key count (for debugging multi-agent state)
Example:
/jsf:status
Sample output:
```
Project: Book Review REST API
Clarification: confirmed
Plan: 5 phases

[✓] Phase 1: Database schema + migrations
[✓] Phase 2: Core CRUD endpoints
[→] Phase 3: Authentication middleware   ← current
[ ] Phase 4: Rate limiting
[ ] Phase 5: Integration tests + docs

Pending validations: none
```
Runs the validation gate for the current phase. Executes automated tests and, when required, coordinates manual review.
Manual review is triggered when a phase touches the UI, changes the API surface, adds an external integration, or matches any project-specific rules defined during clarification.
Example:
/jsf:validate
A phase only advances when both automated tests pass and all required manual confirmations are given.
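That gate rule can be sketched as a small predicate. This is an illustrative helper, not the plugin's actual implementation — the function name and parameters are hypothetical:

```python
def phase_may_advance(tests_passed: bool,
                      required_confirmations: list[str],
                      given_confirmations: set[str]) -> bool:
    """A phase advances only when the automated suite passes AND every
    required manual confirmation has been given (hypothetical sketch)."""
    return tests_passed and all(
        c in given_confirmations for c in required_confirmations
    )

# Tests pass, but the API-surface review is still pending:
print(phase_may_advance(True, ["api_surface_review"], set()))  # False
# Both conditions met:
print(phase_may_advance(True, ["api_surface_review"], {"api_surface_review"}))  # True
```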
Resumes a factory workflow from the last memory checkpoint. Use this when returning to a project after a break or after a Claude Code session ends.
What it does:
- Reads the persisted memory state
- Shows which phases are complete and which is next
- Picks up implementation from where it left off
Example:
/jsf:resume
Skills are the internal building blocks JSF uses. You don't invoke these directly in normal use — commands orchestrate them automatically. Understanding them helps you know what's happening under the hood and how to customize or extend the workflow.
The master orchestrator. Controls the full lifecycle: intake → clarification → spec+plan → phased TDD implementation → validation → completion. All other skills are invoked through this one.
Runs a structured Q&A dialogue covering:
- Scope and what is explicitly out of scope
- Success criteria (how you'll know it's done)
- Tech stack and any hard constraints
- CI/CD assumptions already in place
- Whether any UI, API, or external integration changes require manual validation
- Constraints around existing code to preserve
Produces a clarification_summary stored in memory that all downstream agents read.
Converts the confirmed clarification summary into:
- A technical specification (problem statement, architecture, data model, API surface, security considerations)
- An ordered list of implementation phases, each with: test files to write first, source files to create/modify, and whether phases can run in parallel
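As an illustration, one phase entry in such a plan might look like the following. The field names here are hypothetical, not the plugin's actual schema:

```json
{
  "phase": "Authentication middleware",
  "test_files_first": ["tests/test_auth_middleware.py"],
  "source_files": ["app/middleware/auth.py", "app/config.py"],
  "parallelizable": false,
  "manual_validation": ["api_surface_review"]
}
```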
Implements a single phase with strict red-green-refactor discipline:
- Write failing tests first
- Write the minimum code to make them pass
- Refactor if needed
- Security review before committing: checks for hardcoded credentials, SQL/shell/XSS injection, insecure defaults, missing input validation
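As a rough illustration of the hardcoded-credentials part of that review, a regex scan along these lines could flag obvious offenders. This is a simplified sketch, not the plugin's actual checker — real secret scanning needs entropy analysis and far more patterns:

```python
import re

# Naive patterns for obviously hardcoded secrets (illustrative subset only)
SECRET_PATTERNS = [
    re.compile(r"""(password|passwd|secret|api_key|token)\s*=\s*["'][^"']+["']""", re.I),
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
]

def find_hardcoded_secrets(source: str) -> list[str]:
    """Return the offending lines found in a source string."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), 1):
        if any(p.search(line) for p in SECRET_PATTERNS):
            hits.append(f"line {lineno}: {line.strip()}")
    return hits

print(find_hardcoded_secrets('api_key = "sk-123"\nname = "book"'))
# → ['line 1: api_key = "sk-123"']
```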
After implementation, validates phase completion:
- Runs the full automated test suite
- Checks which manual validation triggers fire (UI changes, API changes, external integrations)
- Presents a checklist for human confirmation when required
- Only marks a phase complete when all checks pass
Manages persistent JSONL memory with file-based locking for safe multi-agent coordination. Stores and retrieves:
- `clarification_summary` — confirmed answers from the clarification dialogue
- `implementation_plan` — the full phase list
- `phase_complete:<name>` — per-phase completion records
- `review_result`, `validation_confirmed` — review and validation outcomes
- `checkpoints` — git SHAs at each phase boundary
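The mechanics can be sketched in a few lines of stdlib Python: an append-only JSONL file guarded by an advisory `fcntl` lock (Unix-only). This is a simplified illustration of the idea, not the plugin's actual `memory.py`; the file path is hypothetical:

```python
import fcntl
import json
import os

MEMORY_PATH = "factory_memory.jsonl"  # hypothetical path

def memory_set(key: str, value) -> None:
    """Append a key/value record; the exclusive lock serialises writers."""
    with open(MEMORY_PATH, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        f.write(json.dumps({"key": key, "value": value}) + "\n")
        fcntl.flock(f, fcntl.LOCK_UN)

def memory_get(key: str):
    """Last write wins: scan every record, keep the latest value for the key."""
    value = None
    if os.path.exists(MEMORY_PATH):
        with open(MEMORY_PATH) as f:
            fcntl.flock(f, fcntl.LOCK_SH)
            for line in f:
                rec = json.loads(line)
                if rec["key"] == key:
                    value = rec["value"]
            fcntl.flock(f, fcntl.LOCK_UN)
    return value

memory_set("clarification_summary", {"scope": "book review API"})
memory_set("phase_complete:schema", True)
print(memory_get("clarification_summary"))  # → {'scope': 'book review API'}
```

Append-only JSONL makes concurrent writes safe to reason about: a crashed agent can at worst leave one truncated trailing line, and replaying the log reconstructs the latest state.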
Emits OpenTelemetry spans for factory activity monitoring. Spans cover the full workflow (factory.task), individual phases (factory.phase), automated validation (factory.validation.automated), manual validation (factory.validation.manual), and memory checkpoints (factory.checkpoint). These factory-level spans are linked to the claude.session root span emitted by the hook-based tracer, so factory activity appears inside the broader session trace in Jaeger. Useful for understanding where time is spent in long multi-phase projects.
JSF installs three pre-tool hooks that block dangerous operations before they execute:
| Hook | What it blocks |
|---|---|
| `block-dangerous-bash.sh` | `rm -rf`, force operations, process kills |
| `block-dangerous-git.sh` | Force push, `reset --hard`, `checkout .` |
| `block-dangerous-sql.sh` | `DROP TABLE`, `DELETE` without a `WHERE` clause |
These run automatically on every Bash tool call. No configuration needed.
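For context on how such a hook works: a Claude Code PreToolUse hook receives the pending tool call as JSON on stdin and blocks it by exiting with code 2, with stderr fed back to Claude as the reason. Here is a minimal Python sketch of the bash-blocking idea — a simplified illustration with a tiny pattern subset, not the plugin's actual script:

```python
import json
import re
import sys

# Illustrative subset of dangerous-command patterns
DANGEROUS = [
    re.compile(r"\brm\s+-[a-z]*r[a-z]*f"),  # rm -rf and variants
    re.compile(r"\brm\s+-[a-z]*f[a-z]*r"),  # rm -fr and variants
    re.compile(r"\bkill\s+-9\b"),
]

def is_dangerous(command: str) -> bool:
    """True if the shell command matches a dangerous pattern."""
    return any(p.search(command) for p in DANGEROUS)

def run_hook(stdin=sys.stdin) -> int:
    """Read the tool-call JSON from stdin and return the hook exit code."""
    payload = json.load(stdin)
    command = payload.get("tool_input", {}).get("command", "")
    if is_dangerous(command):
        print(f"Blocked dangerous command: {command}", file=sys.stderr)
        return 2  # exit code 2 blocks the tool call
    return 0

print(is_dangerous("rm -rf /tmp/build"))  # True
```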
You have an idea. Nothing exists yet.
/jsf:start Build a REST API for managing a book review collection. PostgreSQL backend, FastAPI, JWT auth.
JSF asks clarifying questions: What endpoints? Admin vs. public access? Any existing schema to preserve? Rate limiting? What counts as "done"?
After you confirm the answers, JSF produces a 5-phase plan:
- Schema + migrations
- CRUD endpoints (unauthenticated)
- JWT auth middleware
- Rate limiting
- Integration tests + OpenAPI docs
It implements phase 1 with TDD (writes schema tests first, then the migration), runs /jsf:validate, gets your confirmation that the schema looks correct, then moves to phase 2. Repeat through all 5 phases. At the end you have tested, reviewed, committed code at each phase boundary.
You have an existing notifications service and want to add webhook support.
/jsf:start Add webhook support to the notifications service. Webhooks should fire on new notification events. HMAC-SHA256 signing. Retry on failure with exponential backoff.
During clarification, JSF asks: Which notification event types? Where is the retry state stored? Any existing webhook tables? Is the signing key per-tenant or global?
The plan JSF produces respects your existing codebase. It reads relevant files before writing any code, avoids touching code outside the webhook feature, and flags the new API surface (webhook registration endpoints) as requiring manual validation.
After phase 2 (the new endpoints), /jsf:validate pauses for manual review: "New API endpoints added — please verify the registration flow behaves as expected." You test it, confirm, and JSF proceeds.
You started a project yesterday, got through 3 of 5 phases, and your session ended.
/jsf:resume
JSF reads its memory, shows you:
```
Completed: Phase 1 (schema), Phase 2 (CRUD), Phase 3 (auth)
Next: Phase 4 — Rate limiting
```
It picks up exactly where it left off. No re-explaining the project. No re-running completed phases. The git SHA from the phase 3 checkpoint is recorded so you can diff what's been done.
You're partway through a multi-phase build and want a quick status check.
/jsf:status
Output:
```
Project: Webhook support for notifications service
Clarification: confirmed (8 keys)
Plan: 4 phases

[✓] Phase 1: Webhook table + migration
[✓] Phase 2: Registration + delivery endpoints
[→] Phase 3: HMAC signing + retry logic   ← implementing now
[ ] Phase 4: End-to-end integration tests

Pending validations: none
```
Everything is visible at a glance. When phase 3 finishes, you run /jsf:validate to advance to phase 4.
```
.
├── .claude-plugin/
│   ├── plugin.json          # Claude Code plugin manifest
│   └── marketplace.json     # Marketplace listing metadata
├── .cursor-plugin/
│   └── plugin.json          # Cursor plugin manifest
├── commands/
│   ├── start.md             # /jsf:start command
│   ├── resume.md            # /jsf:resume command
│   ├── status.md            # /jsf:status command
│   └── validate.md          # /jsf:validate command
├── skills/
│   ├── workflow/            # Master orchestration skill
│   ├── clarification/       # Structured Q&A skill
│   ├── spec-planning/       # Spec + plan generation skill
│   ├── tdd-implementation/  # TDD implementation skill
│   ├── validation-gate/     # Phase validation skill
│   ├── memory-protocol/     # Persistent memory skill
│   └── otel-tracing/        # OpenTelemetry tracing skill
├── agents/                  # Specialist agent definitions
├── hooks/
│   ├── hooks.json           # Hook configuration
│   └── scripts/             # Safety hook shell scripts
├── scripts/
│   ├── memory.py            # JSONL memory manager
│   ├── telemetry.py         # OTel span emitter (factory-level spans)
│   └── hook_tracer.py       # OTel hook tracer (session + tool-call spans)
├── rules/                   # Cursor-compatible rule files
├── docs/
│   └── ProjectGoals.md      # Design goals and requirements
└── tests/                   # Plugin test suite
```
MIT