Lockstep Dev is the open-source CLI for contracted, verifiable AI software work. It runs a local Codex or Claude session, applies policy and validator checks, and produces receipts that can be inspected later.
The CLI, local contract runner, spec format, and receipt verifier are the open-source surface. Hosted orchestration and API-backed receipt storage are optional and remain API-key gated to prevent abuse and support serious users. The TypeScript SDK is a thin API client for integrations.
Lockstep Inference is a separate commercial system for route-aware sparse-model serving. The inference stack, sparse routing primitive, resident expert scheduler, calibration paths, and model-specific serving recipes are not part of this open-source package.
AI coding agents can make useful changes quickly, but teams still need evidence. A terminal transcript is not enough for production review, compliance, or post-incident analysis. Teams need to know what prompt was used, which files were touched, which checks passed, which checks failed, and whether the final receipt was tampered with.
Lockstep wraps local execution with a contract. The contract defines the work, the boundaries, the required validation signals, and the receipt chain that proves the run state.
Lockstep does not prove that a model is correct in an absolute sense. It proves that a specific run followed a declared workflow and produced a verifiable record.
It records:
- the spec file and spec hash
- each step prompt
- agent stdout and stderr hashes
- validator results
- step hash links
- final chain hash
- receipt metadata
The result is evidence that a reviewer, CI job, or auditor can inspect without trusting a rewritten summary.
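The recorded items above can be pictured as a hash chain. The following is a minimal sketch, not Lockstep's actual receipt format: the field names (`prev_hash`, `step_hash`, `chain_hash`), the use of SHA-256 over sorted-key JSON, and the choice to anchor the chain at the spec hash are all assumptions made for illustration.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def canonical(obj: dict) -> bytes:
    """Serialize with sorted keys so the same record always hashes the same."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_receipt(spec_text: str, steps: list) -> dict:
    """Chain step records so each step hash covers the previous step's hash."""
    spec_hash = sha256_hex(spec_text.encode("utf-8"))
    prev_hash = spec_hash  # assumption: the chain starts at the spec hash
    records = []
    for step in steps:
        body = {
            "prompt": step["prompt"],
            "stdout_hash": sha256_hex(step["stdout"].encode("utf-8")),
            "stderr_hash": sha256_hex(step["stderr"].encode("utf-8")),
            "validators": step["validators"],
            "prev_hash": prev_hash,
        }
        step_hash = sha256_hex(canonical(body))
        records.append({**body, "step_hash": step_hash})
        prev_hash = step_hash
    # The final chain hash is the hash of the last step (or the spec hash if empty).
    return {"spec_hash": spec_hash, "steps": records, "chain_hash": prev_hash}


# Hypothetical two-step run used only to exercise the sketch.
example_steps = [
    {"prompt": "Add a health check", "stdout": "done", "stderr": "",
     "validators": [{"type": "command_passes", "passed": True}]},
    {"prompt": "Document the endpoint", "stdout": "done", "stderr": "",
     "validators": [{"type": "file_exists", "passed": True}]},
]
receipt = build_receipt('version: "1"\n', example_steps)
```

Because every step hash includes the previous one, changing any earlier prompt, output hash, or validator result changes every later hash and the final chain hash.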
Install the CLI globally:
npm install -g @lockstepai/lockstep
lockstep --version
lockstep doctor
Save an API key only if you want hosted orchestration and API-backed receipt storage:
lockstep login ls_test_your_api_key
lockstep login ls_live_your_api_key
Configure local defaults:
lockstep setup
Create a spec inside the target repo:
cd /path/to/repo
lockstep templates
lockstep init blank
lockstep validate
lockstep review
lockstep run
lockstep setup detects local provider CLIs and saves machine defaults to ~/.locksteprc. It asks for the default runner, default judge, delivery rigor, autonomy mode, model overrides, and Claude auth mode when Claude is involved.
lockstep init creates .lockstep.yml in the current repository. lockstep run reads that spec and executes each step. With a saved API key, runs can use the Lockstep API. Without a key, or with --local, the CLI uses the local executor.
Config precedence is:
- command flags
- .lockstep.yml
- ~/.locksteprc
- built-in defaults
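The precedence order above can be sketched as a layered lookup in which the first layer that defines a key wins. This is an illustration only, not the CLI's implementation: the `resolve_config` helper, the treatment of `None` as "flag not passed", and the default values shown are assumptions.

```python
from collections import ChainMap


def resolve_config(flags: dict, repo_spec: dict, machine_rc: dict, defaults: dict) -> dict:
    """Merge config layers; earlier layers override later ones."""
    # Assumption: a flag the user did not pass is represented as None and ignored.
    passed_flags = {key: value for key, value in flags.items() if value is not None}
    # ChainMap searches its maps left to right, which matches the precedence list.
    return dict(ChainMap(passed_flags, repo_spec, machine_rc, defaults))


# Hypothetical values for each layer.
resolved = resolve_config(
    flags={"agent": "claude", "max_retries": None},       # command flags
    repo_spec={"agent": "codex", "max_retries": 3},       # .lockstep.yml
    machine_rc={"judge_mode": "codex", "max_retries": 5}, # ~/.locksteprc
    defaults={"agent": "codex", "judge_mode": "codex",
              "execution_mode": "standard", "max_retries": 1},
)
```

Here the flag selects the runner, the repository spec supplies `max_retries`, the machine file supplies `judge_mode`, and `execution_mode` falls through to the defaults.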
Runner and judge are separate. The runner performs the work. The judge reviews evidence and AI-judge criteria.
lockstep run --runner codex --judge codex
lockstep run --runner claude --judge codex
lockstep run --runner codex --judge claude
lockstep run --runner claude --judge claude --claude-auth-mode interactive
A Lockstep spec is a YAML contract. It defines configuration, shared context, steps, commands, and validators.
version: "1"
config:
  agent: "codex"
  judge_mode: "codex"
  execution_mode: "standard"
  max_retries: 3
  step_timeout: 300
  working_directory: "."
context: |
  Shared instructions for every step.
steps:
  - name: "Implement and verify"
    prompt: |
      Describe exactly what the agent should change.
    validate:
      - type: "file_exists"
        target: "package.json"
      - type: "command_passes"
        command: "npm test"
        timeout: 120
Run validation before executing:
lockstep validate
lockstep review --raw
The ai_judge validator asks the configured judge provider to evaluate output against explicit criteria. It is useful for qualitative review, but it must not be the only validator in a step. Add at least one structural or functional check such as command_passes, test_passes, file_exists, or json_valid.
Example:
validate:
  - type: "command_passes"
    command: "npm test"
  - type: "ai_judge"
    criteria: "The implementation is minimal, safe, and matches the requested behavior."
    threshold: 0.8
    evaluation_targets:
      - "src"
Receipts are JSON files with linked step hashes. Verify one with:
lockstep verify .lockstep/receipts/<receipt>.json
Verification checks the chain, step hashes, chain hash, completeness, and receipt shape. A missing original spec is reported as a warning; a broken chain or tampered step fails verification.
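To make the warning-versus-failure distinction concrete, here is a rough sketch of what such a verifier could look like. It is not the `lockstep verify` implementation: the receipt field names (`spec_hash`, `steps`, `prev_hash`, `step_hash`, `chain_hash`), the hashing scheme, and treating a spec hash mismatch as an error are assumptions, and the completeness check is left out because the text does not define it.

```python
import hashlib
import json
from typing import Optional


def step_digest(body: dict) -> str:
    """Hash a step record (without its own step_hash) as sorted-key JSON."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def verify_receipt(receipt: dict, spec_text: Optional[str] = None) -> dict:
    """Return ok/errors/warnings for a receipt in the assumed shape."""
    errors, warnings = [], []

    # Receipt shape: the top-level keys must be present.
    missing = [key for key in ("spec_hash", "steps", "chain_hash") if key not in receipt]
    if missing:
        return {"ok": False, "errors": [f"malformed receipt, missing {missing}"],
                "warnings": warnings}

    # A missing original spec is only a warning.
    if spec_text is None:
        warnings.append("original spec not available; spec hash not checked")
    elif hashlib.sha256(spec_text.encode("utf-8")).hexdigest() != receipt["spec_hash"]:
        errors.append("spec hash mismatch")  # assumption: treated as a failure

    # Walk the chain: each step must link to the previous hash and match its own hash.
    prev = receipt["spec_hash"]
    for index, step in enumerate(receipt["steps"]):
        body = {key: value for key, value in step.items() if key != "step_hash"}
        if body.get("prev_hash") != prev:
            errors.append(f"step {index}: broken chain link")
        if step_digest(body) != step.get("step_hash"):
            errors.append(f"step {index}: step hash mismatch")
        prev = step.get("step_hash")

    # The final chain hash must equal the last step hash.
    if receipt["chain_hash"] != prev:
        errors.append("chain hash mismatch")

    return {"ok": not errors, "errors": errors, "warnings": warnings}
```

Calling `verify_receipt(receipt)` without the spec text still succeeds for an intact chain but returns a warning, while editing any field of any step produces at least one error.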
Built-in validator types:
- file_exists: requires target
- file_not_exists: requires target
- file_contains: requires path and pattern; optional is_regex
- file_not_contains: requires path and pattern; optional is_regex
- command_passes: requires command; optional timeout
- command_output: requires command and pattern; optional is_regex and timeout
- api_responds: requires url and status; optional body_contains and timeout
- json_valid: requires path; optional schema
- type_check: optional command and timeout
- lint_passes: optional command and timeout
- test_passes: requires command; optional timeout
- ai_judge: requires criteria and threshold; optional evaluation_targets, rubric, and timeout
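The required fields listed above, together with the rule that ai_judge must not be a step's only validator, lend themselves to a small pre-flight check. The sketch below is an assumption-laden illustration rather than what `lockstep validate` does: `check_validators` is a hypothetical helper, it takes a step's `validate` list already parsed into Python dictionaries, and it checks only for the presence of required fields.

```python
# Required fields per validator type, copied from the list above.
REQUIRED_FIELDS = {
    "file_exists": ("target",),
    "file_not_exists": ("target",),
    "file_contains": ("path", "pattern"),
    "file_not_contains": ("path", "pattern"),
    "command_passes": ("command",),
    "command_output": ("command", "pattern"),
    "api_responds": ("url", "status"),
    "json_valid": ("path",),
    "type_check": (),
    "lint_passes": (),
    "test_passes": ("command",),
    "ai_judge": ("criteria", "threshold"),
}


def check_validators(validators: list) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for index, validator in enumerate(validators):
        vtype = validator.get("type")
        if vtype not in REQUIRED_FIELDS:
            problems.append(f"validator {index}: unknown type {vtype!r}")
            continue
        for field in REQUIRED_FIELDS[vtype]:
            if field not in validator:
                problems.append(f"validator {index} ({vtype}): missing {field!r}")
    # ai_judge needs at least one structural or functional check alongside it.
    types = [validator.get("type") for validator in validators]
    if types and all(vtype == "ai_judge" for vtype in types):
        problems.append("ai_judge must not be the only validator in a step")
    return problems
```

For example, a step whose `validate` list contains only an ai_judge entry yields one problem, and adding a `command_passes` entry with a `command` clears it.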
List available templates:
lockstep templates
Create a spec:
lockstep init blank
lockstep init nextjs-saas
lockstep init rest-api
lockstep init solana-program
Templates are starting points. Edit .lockstep.yml before production use so prompts, validators, and working directories match the real repo.
Public commands:
lockstep --version
lockstep doctor
lockstep templates
lockstep login <api-key>
lockstep setup
lockstep init [template]
lockstep validate [spec-file]
lockstep review [--raw] [spec-file]
lockstep policy init
lockstep contract init [--dry-run] [--output <path>]
lockstep run [spec-file]
lockstep run --runner <codex|claude> --judge <codex|claude>
lockstep run --runner-model <model> --judge-model <model>
lockstep run --execution-mode <standard|yolo>
lockstep run --claude-auth-mode <auto|interactive|api-key|auth-token|oauth-token|bedrock|vertex|foundry>
lockstep run --local --dry-run
lockstep run --local --phase <n>
lockstep run --local --from-phase <n>
lockstep run --local --output <path>
lockstep verify <receipt-file>
doctor, templates, setup, and login can run from any directory. init, validate, review, policy init, contract init, and run should run from the target repository root.
Run checks from this package directory:
npm test
bash test/run-all.sh
npm run build
Use rg when searching the codebase, keep public command output stable, and update docs when command behavior changes.
MIT. See LICENSE.