Protocol enforcer for AI coding agents. Gates agent work behind structured reasoning steps -- spec, hypothesis, verification, challenge -- so AI agents write better code.
AI coding agents are powerful but chaotic. They skip planning, don't consider edge cases, and rarely verify their own work. Code-crucible enforces a structured protocol:
- Spec the task before writing code (acceptance criteria, edge cases, interface contracts)
- Consider approaches and evaluate tradeoffs before committing to one
- Log hypotheses when debugging, track outcomes, build error memory
- Verify changes with real commands, not just "looks right"
- Challenge code adversarially before calling it done
Every step produces a structured JSON artifact stored in a local SQLite database. The protocol creates an audit trail of reasoning -- not just code diffs, but why decisions were made.
cargo install code-crucible
Or build from source:
git clone https://github.com/Ghost-Frame/code-crucible
cd code-crucible
cargo build --release
Code-crucible communicates via JSON files. Every command takes --input (JSON) and --output (JSON):
code-crucible spec-task --input spec.json --output result.json
Planning & Design
- spec-task -- Create a task specification with acceptance criteria, edge cases, and interface contracts
- consider-approaches -- Evaluate multiple implementation approaches with pros/cons/scores
- think -- Structured deep reasoning with constraints and context
- declare-unknowns -- Document blocking and non-blocking unknowns before proceeding
Debugging
- log-hypothesis -- Record a bug hypothesis with confidence score
- log-outcome -- Document whether a hypothesis was correct, incorrect, or partial
- recall-errors -- Search past hypotheses and outcomes for pattern matching
Verification
- verify -- Run a command and validate exit code (no shell -- direct exec for security)
- challenge-code -- Adversarially review a file for security, performance, and correctness issues
- session-diff -- Show git diff stats for the current session
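To illustrate what "no shell -- direct exec" means in practice, here is a minimal Python sketch. It is not code-crucible's implementation (the README does not show it); the helper name `run_direct` and its `expected_exit` parameter are hypothetical. The point is only that arguments passed as a list reach the program literally, so shell metacharacters are never interpreted.

```python
import subprocess
import sys


def run_direct(argv, expected_exit=0):
    """Run argv without a shell and report whether the exit code matched.

    Hypothetical helper for illustration; not part of code-crucible.
    """
    # A list argument with shell=False (the default) is handed straight to
    # the program, so ';', '|' and '$(...)' are treated as plain text.
    completed = subprocess.run(argv, capture_output=True, text=True)
    return {
        "passed": completed.returncode == expected_exit,
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
    }


if __name__ == "__main__":
    # "ok; echo injected" arrives in the child process as one literal argument.
    result = run_direct(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", "ok; echo injected"]
    )
    print(result["passed"], result["stdout"].strip())
```

With a shell in the loop, the same string could have run a second command; with direct exec it is just data.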
Session Management
- checkpoint -- Save session state to a named checkpoint (with git ref)
- rollback -- Restore a previous checkpoint
- session-learn -- Record a discovery for future sessions
- session-recall -- Search past session learnings
Code Analysis
- repo-map -- Generate AST-aware repository structure map (supports Rust, TypeScript, Python, Go, JS, C, JSON)
- search-code -- Full-text code search with tree-sitter AST awareness
Each command expects specific JSON fields. Example for spec-task:
{
  "task_description": "Add rate limiting to the /api/auth endpoint",
  "task_type": "feature",
  "acceptance_criteria": [
    "Rate limit of 10 requests per minute per IP",
    "Returns 429 with Retry-After header when exceeded"
  ],
  "interface_contract": "POST /api/auth returns 429 when rate limited",
  "edge_cases": [
    "Multiple IPs behind a proxy",
    "Rate limit reset at minute boundary",
    "Concurrent requests at the limit boundary"
  ],
  "files_to_touch": ["src/middleware/rate_limit.rs", "src/routes/auth.rs"],
  "dependencies": "Redis for distributed rate counting"
}
Example for log-hypothesis:
{
  "bug_description": "Auth tokens expire 5 minutes early",
  "hypothesis": "Clock skew between auth server and API server causes premature expiration",
  "confidence": 0.8
}
All commands produce JSON with a consistent structure:
{
  "success": true,
  "id": "spec_a1b2c3d4",
  "message": "Spec created",
  "data": { ... }
}
Code-crucible can optionally integrate with a skill evolution backend -- a REST API that tracks reusable patterns, debugging approaches, and verified solutions across sessions.
Enable with the skill-backend feature:
cargo install code-crucible --features skill-backend
Set environment variables:
export CRUCIBLE_SKILL_URL=http://your-skill-server:4200
export CRUCIBLE_SKILL_API_KEY=your-api-key
This unlocks 6 additional commands:
- skill-search -- Find skills relevant to your current task
- skill-capture -- Create a new skill from a workflow description
- skill-record-exec -- Record execution success/failure (builds trust scores)
- skill-fix -- Trigger fix evolution on a failing skill
- skill-derive -- Combine parent skills into a new derived skill
- skill-lineage -- Show a skill's evolution chain
Existing commands also gain skill awareness:
- spec-task automatically searches for relevant skills and includes them in output
- verify can record pass/fail against a skill when skill_id is provided
- session-learn can auto-capture discoveries as skills when capture_as_skill: true
Any server implementing the expected REST endpoints (/skills/search, /skills/capture, /skills/{id}/execute, /skills/{id}/fix, /skills/derive, /skills/{id}/lineage) will work as a skill backend.
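As a rough aid for anyone wiring up a compatible backend, the sketch below expands the endpoint paths listed above into full URLs. Only the paths and the CRUCIBLE_SKILL_URL variable come from this README. The function names are hypothetical, and the HTTP methods, request bodies, and the way the API key is transmitted are not specified here, so the sketch deliberately stops at URL construction.

```python
import os


def skill_endpoints(base_url, skill_id):
    """Map each documented endpoint path to a full URL for one skill id."""
    base = base_url.rstrip("/")  # tolerate a trailing slash in the base URL
    return {
        "search": f"{base}/skills/search",
        "capture": f"{base}/skills/capture",
        "execute": f"{base}/skills/{skill_id}/execute",
        "fix": f"{base}/skills/{skill_id}/fix",
        "derive": f"{base}/skills/derive",
        "lineage": f"{base}/skills/{skill_id}/lineage",
    }


def endpoints_from_env(skill_id):
    """Build the URLs from CRUCIBLE_SKILL_URL, or return None if it is unset."""
    base_url = os.environ.get("CRUCIBLE_SKILL_URL")
    return skill_endpoints(base_url, skill_id) if base_url else None
```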
Code-crucible is designed to be called by AI coding agents (Claude Code, Cursor, Copilot, custom agents) as part of their workflow. The agent writes JSON input, calls the CLI, reads JSON output.
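The write-call-read loop described above might look like the following on the agent side. This is a sketch under assumptions: the wrapper names `build_command` and `call_crucible` are hypothetical, the temporary file names are arbitrary, and it assumes the CLI exits non-zero on failure (which is why `check=True` is used). Only the subcommand names and the --input/--output flags come from this README.

```python
import json
import subprocess
import tempfile
from pathlib import Path


def build_command(subcommand, input_path, output_path, binary="code-crucible"):
    """Build the argv for one CLI call using the documented --input/--output flags."""
    return [binary, subcommand, "--input", str(input_path), "--output", str(output_path)]


def call_crucible(subcommand, payload, binary="code-crucible"):
    """Write payload as JSON, run the CLI, and return the parsed JSON output.

    Hypothetical wrapper; requires code-crucible to be installed and on PATH.
    """
    with tempfile.TemporaryDirectory() as tmp:
        input_path = Path(tmp) / "input.json"
        output_path = Path(tmp) / "output.json"
        input_path.write_text(json.dumps(payload), encoding="utf-8")
        # Assumption: a failed command exits non-zero, so check=True raises.
        subprocess.run(
            build_command(subcommand, input_path, output_path, binary), check=True
        )
        return json.loads(output_path.read_text(encoding="utf-8"))
```

An agent would then call something like `call_crucible("log-hypothesis", {...})` and inspect the `success`, `id`, and `message` fields of the returned dictionary.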
Example integration in an agent's system prompt:
Before writing any code, run code-crucible spec-task with:
- Task description
- At least 2 acceptance criteria
- At least 3 edge cases
- Interface contract
- Files you plan to touch
After implementation, run code-crucible verify with the test command.
Then run code-crucible challenge-code on each modified file.
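The minimums in the prompt above can also be checked on the agent side before the CLI is invoked. The sketch below is one hedged way to do that: `check_spec` is a hypothetical helper, the field names are taken from the spec-task example in this README, and the thresholds mirror the example prompt rather than anything the CLI is documented to enforce.

```python
def check_spec(spec, min_criteria=2, min_edge_cases=3):
    """Return a list of problems with a spec-task payload; empty means it passes.

    Hypothetical pre-flight check; thresholds follow the example prompt.
    """
    problems = []
    if not spec.get("task_description", "").strip():
        problems.append("task_description is missing or empty")
    if len(spec.get("acceptance_criteria", [])) < min_criteria:
        problems.append(f"need at least {min_criteria} acceptance criteria")
    if len(spec.get("edge_cases", [])) < min_edge_cases:
        problems.append(f"need at least {min_edge_cases} edge cases")
    if not spec.get("interface_contract", "").strip():
        problems.append("interface_contract is missing or empty")
    if not spec.get("files_to_touch"):
        problems.append("files_to_touch is missing or empty")
    return problems
```

Returning a list of problems rather than raising on the first one lets the agent fix everything in a single revision of the spec.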
All state is stored in ~/.code-crucible/crucible.db (SQLite). Override with --db path/to/db.
Tables: specs, hypotheses, checkpoints, session_learns, approaches.
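Because the state is plain SQLite, the audit trail can be inspected with any SQLite client. The sketch below counts rows in whichever of the listed tables exist. The table names and default path come from this README; the helper name `table_row_counts` is hypothetical, and no column names are assumed because the schema is not documented here.

```python
import sqlite3
from pathlib import Path

DOCUMENTED_TABLES = ["specs", "hypotheses", "checkpoints", "session_learns", "approaches"]
DEFAULT_DB = Path.home() / ".code-crucible" / "crucible.db"


def table_row_counts(db_path=DEFAULT_DB):
    """Return {table: row_count} for documented tables present in the database."""
    # mode=ro opens the file read-only so inspection cannot alter agent state.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        existing = {
            row[0]
            for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        }
        counts = {}
        for table in DOCUMENTED_TABLES:
            if table in existing:
                # Names come from the fixed list above, never from user input.
                counts[table] = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        return counts
    finally:
        conn.close()
```

Pass a different path to `table_row_counts` when the database location was overridden with --db.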
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT License (LICENSE-MIT)
at your option.
Contributions welcome. Please open an issue first for significant changes.