Pipeline orchestration for AI-assisted software development.
Claudi coordinates multi-step Claude agent interactions with resumability, state persistence, progress visualization, and fine-grained testability. It uses the @anthropic-ai/claude-agent-sdk directly—no wrapper layer.
- Claudi
- Table of Contents
- Features
- Installation
- Quick Start
- CLI Reference
- Pipeline System
- Configuration
- MCP Integration
- Subagents
- Human Input
- Progress Display
- Logging
- Error Handling
- Programmatic API
- Architecture
- Development
- License
- Contributing
- Resumability — Persist state for crash recovery and resume capability
- State Persistence — Atomic file writes track step-by-step progress
- Progress Visualization — Docker BuildKit-inspired real-time progress display
- Fine-grained Testability — Pure functions, immutable state, dependency injection
- Direct SDK Integration — No wrapper layer; uses @anthropic-ai/claude-agent-sdk directly
- 4 Step Types — Agent, Script, Parallel, Sequence
- Recursive Composition — Nest sequences within parallel steps, and vice versa
- MCP Support — Stdio and HTTP MCP servers for custom tools
- Subagent Spawning — Define specialized agents that Claude can invoke
- Human Input — Request input from human operators during execution (approvals, clarifications)
- Structured Output — Zod schemas for validated, typed responses
- Typed Pipeline Inputs — Define input schemas with full TypeScript inference and CLI flag generation
- Pattern-Aware Builder — High-level primitives (.map(), .gate(), .converge()) and nested builder callbacks
- Cost Tracking — Per-step and aggregate cost tracking with budget enforcement
# Using Bun (recommended)
bun install claudi
# Using npm
npm install claudi
# Using yarn
yarn add claudi

- Bun v1.0+ (recommended) or Node.js v18+
- TypeScript 5+
- Anthropic API key (set ANTHROPIC_API_KEY environment variable)

bunx claudi init

This creates:

- claudi.config.json — Project configuration
- .pipelines/definitions/ — Directory for pipeline definitions
// .pipelines/definitions/analyze.ts
import { buildPipeline, z } from 'claudi/sdk';
const AnalysisResult = z.object({
summary: z.string(),
issues: z.array(
z.object({
severity: z.enum(['low', 'medium', 'high']),
description: z.string(),
file: z.string(),
})
),
score: z.number().min(0).max(100),
});
export const pipeline = buildPipeline({
id: 'analyze',
version: '1.0.0',
description: 'Analyze codebase for issues',
maxBudgetUsd: 5.0,
llmConfig: { model: 'sonnet' },
inputSchema: z.object({}),
})
.agent('scan', {
description: 'Scan codebase for issues',
prompt: 'Analyze this codebase for security vulnerabilities, code quality issues, and potential bugs.',
llmConfig: { tools: ['Read', 'Glob', 'Grep'], model: 'sonnet', maxTurns: 50 },
outputKey: 'analysis',
outputSchema: AnalysisResult,
})
.script('report', {
description: 'Generate report',
execute: async (ctx) => {
const analysis = ctx.outputs.analysis!;
console.log(`Found ${analysis.issues.length} issues. Score: ${analysis.score}/100`);
return { reportGenerated: true };
},
outputKey: 'report',
})
.build();

bunx claudi run analyze

# List all runs
bunx claudi run list
# Show specific run details
bunx claudi run show analyze-abc123
# View cost summary
bunx claudi run cost --days 7

Initialize a new Claudi project.

claudi init [options]

| Option | Description |
|---|---|
| --force, -f | Overwrite existing files |
| --dry-run | Show what would be created without writing |
Template Resolution:
Templates are resolved from the local cache (~/.claudi/templates-cache/) or built-in defaults.
Execute pipelines with real-time progress tracking.
claudi run <pipeline> [input...] [options]

| Option | Description |
|---|---|
| --stage N | Run up to step N only |
| --restart | Ignore previous progress, start fresh |
| --dry-run | Test without execution |
| --verbose | Show detailed output |
| --quiet, -q | Suppress progress output |
| --plan <runId> | Parent plan run ID |
| --todo-display <mode> | Display mode: inline, rich, json, quiet |
Input Variables:
Pass input variables as key=value pairs:
claudi run implement feature="Add dark mode" priority=high

Access in pipelines via ctx.variables.feature.
List runs:
claudi run list [pipeline]
# Output:
# Runs:
# implement-abc123 │ completed │ 5/5 steps │ $0.24 │ 2m15s
# plan-def456 │ failed │ 2/3 steps │ $0.08 │ Resumable

Show run details:

claudi run show <runId>

Resume failed/paused run:

claudi run resume <runId>

View cost aggregation:

claudi run cost [--days N] # Default: 30 days

Query registered pipelines.
# List all pipelines
claudi pipeline list
# Show pipeline details
claudi pipeline show <name>

Example output:
Pipeline: implement
Version: 1.0.0
Description: Implementation pipeline
Steps:
1. [agent] analyze - Analyze codebase
Model: sonnet | Max Turns: 50
Tools: Read, Glob, Grep
2. [parallel] features - Implement features
├─ [agent] auth - Add authentication
└─ [agent] dashboard - Add dashboard
View and query pipeline run logs.
claudi logs [runId] [options]

| Option | Description |
|---|---|
| [runId] | Show logs for a specific run ID |
| --last, -l [N] | Show logs from Nth most recent run (default: 1) |
| --level | Minimum log level: error, warn, info, debug |
| --step | Filter by step ID |
| --follow, -f | Tail logs in real-time |
| --json | Output raw JSONL (for piping to jq) |
Examples:
# View logs from most recent run (default)
claudi logs
# View logs for a specific run
claudi logs my-run-id
# View only errors from last run
claudi logs --last --level error
# Filter by step
claudi logs --last --step analyze
# Tail logs in real-time
claudi logs --follow
# Pipe raw JSONL to jq
claudi logs --last --json | jq .

A pipeline is a sequence of steps with optional defaults and hooks:
interface Pipeline<TInput = unknown> {
id: string; // Unique identifier
version: string; // Semantic version
description: string; // What this pipeline does
steps: PipelineStep[]; // Ordered steps to execute
// Typed inputs (see "Typed Pipeline Inputs" section)
inputSchema?: ZodType<TInput>; // Zod schema for validated inputs
// Optional configuration
maxBudgetUsd?: number; // Cost limit for entire run
enableTodos?: boolean; // Enable TodoWrite tool (default: true)
rateLimits?: RateLimitConfig; // API rate limiting
llmConfig?: LLMConfig; // LLM defaults for all steps (model, tools, etc.)
}

Claudi supports 4 step types, using a discriminated union pattern:
Execute Claude interactions with full SDK support.
{
type: 'agent',
id: 'analyze',
description: 'Analyze codebase',
prompt: 'Analyze this codebase for security issues.',
// LLM configuration (all nested under llmConfig)
llmConfig: {
// Model & execution
model: 'opus', // 'sonnet' | 'opus' | 'haiku'
maxTurns: 100,
timeout: 600000, // 10 minutes
permissionMode: 'acceptEdits', // Auto-accept file edits
// Tools
tools: ['Read', 'Grep', 'Glob', 'Write', 'Bash'],
disallowedTools: ['Task'], // Blocklist specific tools
// Subagents (Claude can invoke these)
agents: {
'code-reviewer': {
description: 'Expert code reviewer',
prompt: 'You are a code review specialist...',
tools: ['Read', 'Grep'],
},
},
// MCP servers for custom tools
mcpServers: {
'filesystem': {
command: 'npx',
args: ['@modelcontextprotocol/server-filesystem'],
},
},
// System prompt
systemPrompt: { type: 'preset', preset: 'claude_code' },
settingSources: ['project'], // Load CLAUDE.md
// Session management
continueConversation: true, // Maintain history
forkSession: false,
// Output validation
outputSchema: AnalysisResultSchema, // Zod schema
// Human input (optional, see "Human Input" section)
humanInput: {
mode: 'auto',
defaultTimeout: 300,
},
},
outputKey: 'analysis',
}

Execute arbitrary TypeScript code for data transformation and orchestration logic.
{
type: 'script',
id: 'transform',
description: 'Transform analysis results',
execute: async (ctx) => {
const analysis = ctx.outputs.analysis;
const filtered = analysis.issues.filter(i => i.severity === 'high');
return { highPriorityIssues: filtered };
},
outputKey: 'filteredResults',
}

Script steps are designed for pure data transformations. For conditional branching or dynamic step generation, use Dynamic Steps instead.
Execute steps concurrently with failure handling.
{
type: 'parallel',
id: 'features',
description: 'Implement features in parallel',
maxConcurrency: 3, // Limit concurrent steps
onFailure: 'continue', // 'fail-fast' | 'continue' | 'ignore'
steps: [
{ type: 'agent', id: 'auth', outputKey: 'auth_result', ... },
{ type: 'agent', id: 'dashboard', outputKey: 'dashboard_result', ... },
{ type: 'agent', id: 'api', outputKey: 'api_result', ... },
],
}

Failure Strategies:

| Strategy | Behavior |
|---|---|
| fail-fast (default) | Cancel remaining steps on first failure |
| continue | Complete all steps, aggregate failures |
| ignore | Log failures, continue as successful |
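
As a rough illustration of how the three strategies differ, the sketch below summarizes a parallel step from the results of its branches. The names `BranchResult`, `ParallelOutcome`, and `summarize` are hypothetical and not part of Claudi's API; actual cancellation of in-flight branches under fail-fast is not modeled here.

```typescript
// Hypothetical types for illustration only; not Claudi's internals.
type FailureStrategy = 'fail-fast' | 'continue' | 'ignore';
type BranchResult = { id: string; ok: boolean; error?: string };
type ParallelOutcome = { status: 'completed' | 'failed'; failures: string[] };

// Decide the overall outcome of a parallel step from its branch results.
function summarize(strategy: FailureStrategy, results: BranchResult[]): ParallelOutcome {
  const failed = results.filter((r) => !r.ok).map((r) => r.id);
  if (failed.length === 0) return { status: 'completed', failures: [] };

  switch (strategy) {
    case 'fail-fast':
      // Only the first failure is reported; remaining branches would be cancelled.
      return { status: 'failed', failures: failed.slice(0, 1) };
    case 'continue':
      // Every branch ran to completion; all failures are aggregated.
      return { status: 'failed', failures: failed };
    case 'ignore':
      // Failures are recorded (logged) but the step still counts as successful.
      return { status: 'completed', failures: failed };
  }
}

const sampleResults: BranchResult[] = [
  { id: 'auth', ok: true },
  { id: 'dashboard', ok: false, error: 'timeout' },
  { id: 'api', ok: false, error: 'validation_failed' },
];
const continued = summarize('continue', sampleResults);
const ignored = summarize('ignore', sampleResults);
const failFast = summarize('fail-fast', sampleResults);
```
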
Context Isolation:
- Each branch gets a deep clone of outputs (prevents race conditions)
- variables is shallow cloned (treat as read-only)
- After completion, outputs are merged (last-write-wins for conflicts)
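
The isolation rules above can be pictured with a minimal sketch. It assumes outputs are plain JSON-like data, runs branches one after another for brevity (real branches run concurrently), and treats "last write" as branch order; `runBranches` and `deepClone` are hypothetical helpers, not Claudi functions.

```typescript
type Outputs = Record<string, unknown>;
// A branch receives its own copy of outputs and returns the keys it wrote.
type Branch = (outputs: Outputs) => Outputs;

// JSON round-trip as a simple deep clone; adequate for plain JSON-like values.
function deepClone<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

function runBranches(outputs: Outputs, branches: Branch[]): Outputs {
  // Each branch mutates only its private clone, so nothing leaks between branches.
  const written = branches.map((branch) => branch(deepClone(outputs)));
  // Merge in branch order: a later branch overwrites an earlier one on key conflicts.
  return Object.assign({}, outputs, ...written);
}

const baseOutputs: Outputs = { analysis: { issues: ['a'] } };
const mergedOutputs = runBranches(baseOutputs, [
  (o) => {
    // Mutating the clone does not affect the other branch or the original.
    (o.analysis as { issues: string[] }).issues.push('b');
    return { auth_result: 'done', shared: 'first' };
  },
  (o) => ({
    seenIssues: (o.analysis as { issues: string[] }).issues.length,
    shared: 'second',
  }),
]);
```
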
Execute steps sequentially with optional checkpoints.
{
type: 'sequence',
id: 'main-flow',
description: 'Main implementation flow',
isCheckpoint: true, // Pause for verification after completion
steps: [
{ type: 'agent', id: 'plan', ... },
{ type: 'parallel', id: 'implement', ... },
{ type: 'agent', id: 'test', ... },
],
onStepStart: (step, ctx) => console.log(`Starting: ${step.id}`),
onStepComplete: (step, result) => console.log(`Completed: ${step.id}`),
}

Checkpoint Support:

- isCheckpoint: true pauses execution for user verification
- Enables manual intervention in long-running pipelines
- Resume with claudi run resume <runId>
Sequence and Parallel steps support dynamic step resolution via functions. This enables conditional logic and fan-out patterns without a separate step type.
Use dynamic steps to branch execution based on runtime context:
{
type: 'sequence',
id: 'adaptive-flow',
description: 'Choose approach based on complexity',
steps: (ctx) => {
const complexity = ctx.outputs.analysis?.complexity;
if (complexity === 'high') {
return [
{ type: 'agent', id: 'detailed-plan', description: 'Detailed planning', prompt: '...' },
{ type: 'agent', id: 'careful-impl', description: 'Careful implementation', prompt: '...' },
];
}
return [
{ type: 'agent', id: 'quick-impl', description: 'Quick implementation', prompt: '...' },
];
},
}

Use dynamic parallel steps to process collections:
{
type: 'parallel',
id: 'process-items',
description: 'Process all items in parallel',
steps: (ctx) => {
const items = ctx.outputs.items ?? [];
return items.map((item) => ({
type: 'agent' as const,
id: `process-${item.id}`,
description: `Process ${item.name}`,
prompt: `Process this item: ${JSON.stringify(item)}`,
}));
},
}

Dynamic steps can return an empty array to skip execution entirely:

steps: (ctx) => ctx.outputs.skip ? [] : [{ type: 'agent', ... }]

Steps can be arbitrarily nested:
{
type: 'sequence',
id: 'main',
steps: [
{
type: 'parallel',
id: 'phase-1',
steps: [
{
type: 'sequence',
id: 'feature-a',
steps: [
{ type: 'agent', id: 'plan-a', ... },
{ type: 'agent', id: 'impl-a', ... },
],
},
{
type: 'sequence',
id: 'feature-b',
// Dynamic steps for conditional inclusion
steps: (ctx) => ctx.variables.includeFeatureB
? [{ type: 'agent', id: 'impl-b', ... }]
: [],
},
],
},
],
}

All steps receive a StepContext providing:
interface StepContext<TInput = unknown> {
runId: string; // Unique run identifier
sessionId?: string; // SDK session for continuity
inputs: TInput; // Validated inputs from inputSchema (read-only)
variables: Record<string, unknown>; // Input variables (legacy)
outputs: Record<string, unknown>; // Previous step outputs
config: PipelineConfig; // Resolved configuration
stepPath: string[]; // Path to current step
emit: (event: EmittableEvent) => void; // Event emission
updateProgress?: (path, status) => void; // Progress updates
completedTasks?: Array<{...}>; // Resume support
}

Accessing Previous Outputs:
prompt: (ctx) => {
const analysis = ctx.outputs.analysis;
return `Based on this analysis: ${JSON.stringify(analysis)}, implement fixes.`;
},

Any configuration value can be static or computed at runtime:
{
type: 'agent',
id: 'dynamic-step',
description: 'Process data',
// Dynamic prompt (computed from context)
prompt: (ctx) => `Process: ${ctx.outputs.preprocessed}`,
// Dynamic llmConfig (computed from context)
llmConfig: (ctx) => ({
model: ctx.variables.useOpus ? 'opus' : 'sonnet',
maxTurns: ctx.outputs.complexity === 'high' ? 100 : 50,
tools: ctx.variables.allowBash ? ['Read', 'Bash'] : ['Read'],
}),
}

Define type-safe, schema-validated inputs for pipelines using Zod schemas. The CLI automatically generates flags from your schema.
The buildPipeline builder provides full type inference for both inputs and outputs via method chaining. Each .agent() or .script() call with outputKey and outputSchema automatically extends the output type, so subsequent steps see properly typed ctx.outputs:
import { buildPipeline, z } from 'claudi/sdk';
const AnalysisSchema = z.object({
summary: z.string(),
issues: z.array(z.object({ file: z.string(), severity: z.string() })),
});
export const pipeline = buildPipeline({
id: 'deploy',
version: '1.0.0',
description: 'Deploy to environment',
inputSchema: z.object({
environment: z.enum(['dev', 'staging', 'prod']).describe('Target environment'),
dryRun: z.boolean().default(false).describe('Test without deploying'),
}),
})
.agent('analyze', {
description: 'Analyze deployment target',
prompt: (ctx) => `Analyze deployment to ${ctx.inputs.environment}`,
llmConfig: { tools: ['Read', 'Grep'], model: 'sonnet', maxTurns: 20 },
outputKey: 'analysis',
outputSchema: AnalysisSchema,
})
.agent('deploy', {
description: 'Execute deployment',
// ctx.outputs.analysis is fully typed from the previous step
prompt: (ctx) => `Deploy. Issues found: ${ctx.outputs.analysis!.issues.length}`,
llmConfig: { tools: ['Bash'], model: 'opus', maxTurns: 30 },
})
.build();

Builder methods:

| Method | Description |
|---|---|
| .agent(id, config) | Add an agent step. With outputKey/outputSchema, extends output types |
| .script(id, config) | Add a script step. With outputKey, extends output types |
| .sequence(id, config) | Add a sequence step (nested steps run sequentially) |
| .sequence(id, callback, opts) | Add a sequence via nested builder callback for type-safe composition |
| .parallel(id, config) | Add a parallel step (nested steps run concurrently) |
| .parallel(id, callback, opts) | Add a parallel via nested builder callback for type-safe composition |
| .map(id, config) | Fan-out over a dynamic list (compiles to parallel with dynamic steps) |
| .gate(id, config) | Draft/critique/approve loop (compiles to loop with 3 agent children) |
| .converge(id, config) | Generate/fix convergence loop (compiles to loop with iteration-aware selection) |
| .outputs<T>() | Declare additional output types from dynamic nested steps |
| .build() | Produce the final Pipeline value object |
Escape hatch for dynamic steps:
When a sequence or parallel step creates sub-steps with outputKey at runtime, use .outputs<>() to declare those types:
buildPipeline({ ... })
.parallel('process', {
description: 'Process items',
steps: (ctx) => items.map((item) => ({
type: 'agent' as const,
id: `task-${item.id}`,
outputKey: `task-${item.id}`,
// ...
})),
})
.outputs<{ [x: `task-${string}`]: TaskOutput }>()
.build();

Nested builder callbacks:
The .sequence() and .parallel() methods accept a callback overload for type-safe composition. The callback receives a NestedBuilder and the output types accumulate automatically:
buildPipeline({ id: 'example', version: '1.0.0', description: 'Example', inputSchema: z.object({}) })
.sequence(
'setup',
(seq) =>
seq
.agent('scan', {
description: 'Scan project',
prompt: 'Scan the project structure',
llmConfig: { model: 'sonnet', maxTurns: 10, tools: ['Read', 'Glob'] },
outputKey: 'scan',
outputSchema: z.object({ files: z.array(z.string()) }),
})
.script('filter', {
description: 'Filter results',
execute: async (ctx) => ({ relevant: ctx.outputs.scan!.files.filter((f) => f.endsWith('.ts')) }),
outputKey: 'filtered',
}),
{ description: 'Setup phase' }
)
.build();

Pattern-aware primitives:
High-level primitives that compile to existing step types (no runner changes needed):
// .map() — fan-out over a dynamic list
.map('process-files', {
description: 'Process each file',
over: (ctx) => ctx.outputs.scan!.files,
step: (file) => ({ type: 'agent', id: `process-${file}`, description: `Process ${file}`, prompt: `Process: ${file}` }),
maxConcurrency: 3,
outputKey: 'results',
outputSchema: z.object({ success: z.boolean() }),
})
// .gate() — draft/critique/approve loop
.gate('review', {
description: 'Review until approved',
maxIterations: 3,
draft: { description: 'Write draft', prompt: 'Write a draft...', outputKey: 'draft', outputSchema: DraftSchema, llmConfig: { model: 'opus', maxTurns: 20 } },
critique: { description: 'Critique draft', prompt: 'Critique...', outputKey: 'critique', outputSchema: CritiqueSchema, llmConfig: { model: 'sonnet', maxTurns: 10 } },
approve: { description: 'Approve or reject', prompt: 'Approve?', outputKey: 'approval', outputSchema: ApprovalSchema, llmConfig: { model: 'sonnet', maxTurns: 5 } },
})
// .converge() — generate/fix convergence loop
.converge('implement', {
description: 'Implement until tests pass',
maxIterations: 5,
generate: { description: 'Generate code', prompt: 'Generate...', outputKey: 'code', outputSchema: CodeSchema, llmConfig: { model: 'opus', maxTurns: 30 } },
fix: { description: 'Fix test failures', prompt: 'Fix...', outputKey: 'code', outputSchema: CodeSchema, llmConfig: { model: 'sonnet', maxTurns: 40 } },
check: async (ctx) => { /* run tests, return result */ },
checkOutputKey: 'testResult',
until: (result) => result.allPassed,
})

The definePipeline helper provides full TypeScript inference from your Zod schema:
import { definePipeline, z } from 'claudi/sdk';
export const pipeline = definePipeline({
id: 'deploy',
version: '1.0.0',
description: 'Deploy to environment',
inputSchema: z.object({
environment: z.enum(['dev', 'staging', 'prod']).describe('Target environment'),
dryRun: z.boolean().default(false).describe('Test without deploying'),
tags: z.array(z.string()).optional().describe('Tags to apply'),
maxRetries: z.number().int().min(0).max(10).default(3).describe('Max retry attempts'),
}),
steps: [
{
type: 'agent',
id: 'deploy',
description: 'Run deployment',
// ctx.inputs is fully typed: { environment: 'dev'|'staging'|'prod', dryRun: boolean, ... }
prompt: (ctx) => `Deploy to ${ctx.inputs.environment}. Dry run: ${ctx.inputs.dryRun}`,
},
] as const,
});

To type ctx.outputs with definePipeline, use the curried form. This lets you specify TOutputs explicitly while TInput is still inferred from inputSchema:
type DeployOutputs = {
analysis: { summary: string; risk: 'low' | 'medium' | 'high' };
result: { success: boolean; url: string };
};
export const pipeline = definePipeline<DeployOutputs>()({
id: 'deploy',
version: '1.0.0',
description: 'Deploy with typed outputs',
inputSchema: z.object({ environment: z.enum(['dev', 'prod']) }),
steps: [
{
type: 'agent',
id: 'analyze',
description: 'Analyze',
prompt: (ctx) => `Analyze ${ctx.inputs.environment}`, // ctx.inputs typed
outputKey: 'analysis',
},
{
type: 'agent',
id: 'deploy',
description: 'Deploy',
prompt: (ctx) => `Risk: ${ctx.outputs.analysis?.risk}`, // ctx.outputs typed
outputKey: 'result',
},
],
});

Tip: Prefer buildPipeline when possible; it infers output types automatically from the chain, so you don't need to maintain a separate type definition.
When you define an inputSchema, the CLI automatically generates flags:
# Flags are auto-generated from schema
claudi run deploy --environment prod --dry-run --max-retries 5
# Boolean flags work as expected
claudi run deploy --environment staging --dry-run
# Array values can be repeated
claudi run deploy --environment dev --tags feature --tags release

Schema to Flag Mapping:

| Zod Type | CLI Flag Type | Example |
|---|---|---|
| z.string() | --flag value | --name "my-app" |
| z.number() | --flag 123 | --retries 3 |
| z.boolean() | --flag or --no-flag | --dry-run |
| z.enum([...]) | --flag choice | --env prod |
| z.array(z.string()) | --flag a --flag b | --tags v1 --tags v2 |
| .optional() | Flag is optional | (not required) |
| .default(val) | Uses default if omitted | (has default) |
| .describe('...') | Shown in help text | (documentation) |

Naming Convention:

- Schema fields use camelCase: maxRetries
- CLI flags use kebab-case: --max-retries
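
The naming convention can be sketched as a pair of string conversions. `toFlag` and `toField` are hypothetical helpers written for illustration; Claudi's own conversion may treat edge cases (digits, acronyms) differently.

```typescript
// camelCase schema field -> kebab-case CLI flag, e.g. maxRetries -> --max-retries
function toFlag(field: string): string {
  return '--' + field.replace(/[A-Z]/g, (upper) => '-' + upper.toLowerCase());
}

// kebab-case CLI flag -> camelCase schema field, e.g. --dry-run -> dryRun
function toField(flag: string): string {
  return flag
    .replace(/^--/, '')
    .replace(/-([a-z])/g, (_match, letter: string) => letter.toUpperCase());
}

const exampleFlag = toFlag('maxRetries');
const exampleField = toField('--dry-run');
```
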
Inputs are validated against the schema before execution:
# Valid - runs pipeline
claudi run deploy --environment prod
# Invalid - shows error
claudi run deploy --environment invalid
# Error: --environment: Invalid enum value. Expected 'dev' | 'staging' | 'prod'
# Missing required field
claudi run deploy
# Error: --environment: Required

Use --dry-run to validate inputs without executing:
claudi run deploy --environment prod --dry-run
# ✓ Inputs validated successfully:
# {
# "environment": "prod",
# "dryRun": false,
# "maxRetries": 3
# }
#
# Inputs:
# --environment Target environment (required) choices: "dev", "staging", "prod"
# --dry-run Test without deploying (default: false)
# --tags Tags to apply
# --max-retries Max retry attempts (default: 3)

Use ctx.inputs (typed) instead of ctx.variables (untyped):
{
type: 'script',
id: 'check-env',
description: 'Validate environment',
execute: async (ctx) => {
// ctx.inputs is typed from inputSchema
const { environment, dryRun, maxRetries } = ctx.inputs;
if (environment === 'prod' && !dryRun) {
console.log('Production deployment with', maxRetries, 'retries');
}
return { validated: true };
},
}

You can also define pipelines as plain objects that match the Pipeline shape. The pipeline loader accepts any object with the correct structure:
import { z } from 'claudi/sdk';
const inputSchema = z.object({
target: z.string().describe('Target to process'),
});
export const pipeline = {
id: 'process',
version: '1.0.0',
description: 'Process target',
inputSchema,
steps: [
{
type: 'agent' as const,
id: 'process',
description: 'Process',
prompt: (ctx: { inputs: { target: string } }) => `Process: ${ctx.inputs.target}`,
},
],
};

Create claudi.config.json in your project root:
{
"claudiDir": ".pipelines",
"llmConfig": {
"model": "sonnet",
"maxTurns": 50,
"budgets": {
"daily": 5.0,
"monthly": 100.0
},
"tools": ["Read", "Write", "Glob", "Grep"],
"plugins": [{ "type": "local", "path": "./my-plugin" }]
}
}

The claudiDir field (default: .pipelines) consolidates all Claudi data under a single directory. Sub-paths are conventions derived by adapters:

- definitions/ — Pipeline definition files
- .reports/ — Run state persistence
- .logs/ — Structured log files

| Field | Type | Default | Description |
|---|---|---|---|
| claudiDir | string | .pipelines | Root directory for all Claudi data |
| llmConfig.model | string | sonnet | Default model |
| llmConfig.maxTurns | number | 50 | Default max turns |
| llmConfig.tools | string[] | — | Default tools for agent steps |
| llmConfig.plugins | object[] | — | SDK plugins to load |
| llmConfig.budgets.daily | number | — | Daily cost limit (USD) |
| llmConfig.budgets.monthly | number | — | Monthly cost limit (USD) |
Set LLM configuration at the pipeline level:
const pipeline: Pipeline = {
id: 'my-pipeline',
llmConfig: {
model: 'sonnet',
maxTurns: 50,
timeout: 300000, // 5 minutes
tools: ['Read', 'Write', 'Bash'],
plugins: [{ type: 'local', path: './custom-plugin' }],
hooks: [
{ event: 'progress', handler: (event) => console.log(event) },
],
retryOnFailure: true,
maxRetries: 3,
retryDelayMs: 1000,
retryBackoffMultiplier: 2,
},
steps: [...],
};

Configuration flows through three levels: Project → Pipeline → Step. Array and object values accumulate across levels rather than being replaced; scalar values are overridden by the child:

| Type | Behavior |
|---|---|
| Scalars (model, maxTurns) | Child overrides parent |
| Arrays (tools, disallowedTools) | Union with deduplication |
| Plugins | Union by path (child wins on conflict) |
| Objects (agents, mcpServers, env) | Shallow merge (child wins on conflict) |
Example:
// claudi.config.json
{ "llmConfig": { "tools": ["Read", "Write"] } }
// .pipelines/definitions/my-pipeline.ts
const pipeline: Pipeline = {
llmConfig: { tools: ["Bash"] }, // Inherits + adds
steps: [
{
type: 'agent',
llmConfig: { tools: ["Glob"] }, // Inherits + adds
// Final tools: ["Read", "Write", "Bash", "Glob"]
},
],
};

Priority (for scalars), highest first:

- Step-level explicit config
- Pipeline llmConfig
- Project config (claudi.config.json)
- Built-in defaults
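
To make the merge rules concrete, here is a minimal sketch that folds project, pipeline, and step configuration together. `LlmConfigSketch` and `mergeConfig` are hypothetical and cover only a few fields (plugins and nested objects such as agents are omitted); the real resolver may differ in detail.

```typescript
// Hypothetical, reduced config shape for illustration.
interface LlmConfigSketch {
  model?: string;
  maxTurns?: number;
  tools?: string[];
  env?: Record<string, string>;
}

function mergeConfig(parent: LlmConfigSketch, child: LlmConfigSketch): LlmConfigSketch {
  // Scalars: the child value overrides the parent value when present.
  const merged: LlmConfigSketch = { ...parent, ...child };

  // Arrays: union with deduplication, keeping first-seen order.
  if (parent.tools || child.tools) {
    const all = [...(parent.tools ?? []), ...(child.tools ?? [])];
    merged.tools = all.filter((tool, index) => all.indexOf(tool) === index);
  }

  // Objects: shallow merge, child wins on key conflicts.
  if (parent.env || child.env) {
    merged.env = { ...parent.env, ...child.env };
  }
  return merged;
}

// Project -> Pipeline -> Step, applied in that order.
const layers: LlmConfigSketch[] = [
  { model: 'sonnet', maxTurns: 50, tools: ['Read', 'Write'] }, // project
  { tools: ['Bash'] },                                         // pipeline
  { model: 'opus', tools: ['Glob', 'Read'] },                  // step
];
const resolvedConfig = layers.reduce<LlmConfigSketch>(
  (acc, layer) => mergeConfig(acc, layer),
  {},
);
```

With these layers the tools accumulate while the step-level model replaces the project default, and maxTurns is inherited unchanged.
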
Claudi supports Model Context Protocol servers for custom tools.
{
type: 'agent',
id: 'with-mcp',
prompt: 'Query the database and save results to a file.',
llmConfig: {
mcpServers: {
'filesystem': {
command: 'node',
args: ['./mcp-servers/filesystem.js'],
env: { ROOT_DIR: '/project' },
},
'database': {
command: 'npx',
args: ['@modelcontextprotocol/server-postgres'],
env: { DATABASE_URL: process.env.DATABASE_URL },
},
},
tools: ['Read', 'MCP_filesystem_readFile', 'MCP_database_query'],
},
}

{
llmConfig: {
mcpServers: {
'remote-api': {
type: 'http',
url: 'https://api.example.com/mcp',
headers: { 'Authorization': `Bearer ${process.env.API_TOKEN}` },
},
},
},
}

Define specialized agents that Claude can invoke during execution:
{
type: 'agent',
id: 'orchestrator',
prompt: 'Implement this feature. Use the code-reviewer to verify your changes and test-runner to ensure tests pass.',
llmConfig: {
tools: ['Read', 'Write', 'Task'], // Include 'Task' to enable subagent invocation
agents: {
'code-reviewer': {
description: 'Expert code reviewer for security and quality',
prompt: 'You are a code review specialist. Focus on security vulnerabilities, performance issues, and code quality.',
tools: ['Read', 'Grep', 'Glob'],
model: 'sonnet',
},
'test-runner': {
description: 'Runs and analyzes test suites',
prompt: 'You are a test execution specialist. Run tests and analyze failures.',
tools: ['Bash', 'Read', 'Grep'],
},
'documentation-writer': {
description: 'Generates technical documentation',
prompt: 'You are a technical writer. Create clear, comprehensive documentation.',
tools: ['Read', 'Write'],
},
},
},
}

Claude will autonomously decide when to invoke subagents based on the task.
Enable agents to request input from human operators during execution. This is useful for approval workflows, clarification requests, and interactive decision points.
Add RequestInput to the step's tools array:
{
type: 'agent',
id: 'interactive-deploy',
description: 'Deploy with human approval',
prompt: 'Deploy this application. Ask for confirmation before deploying to production.',
llmConfig: { tools: ['Read', 'Write', 'RequestInput'] },
}

The RequestInput tool supports four input types:

| Type | Description | Example Use Case |
|---|---|---|
| text | Free-form text input | "Enter the deployment message" |
| confirm | Yes/no confirmation | "Deploy to production?" |
| choice | Single/multiple selection from list | "Select target environment" |
| content | Multi-line content (logs, code, etc) | "Paste the error log" |
Example agent prompt:
When ready to deploy, use the RequestInput tool to ask:
- type: "confirm"
- prompt: "Ready to deploy to production?"
If the user confirms, proceed with deployment.
Configure timeout and fallback behavior with the humanInput field:
{
type: 'agent',
id: 'interactive-step',
prompt: 'Analyze and request clarification if needed.',
llmConfig: {
tools: ['Read', 'RequestInput'],
humanInput: {
mode: 'auto', // 'interactive' | 'file' | 'auto'
defaultTimeout: 300, // 5 minutes (in seconds)
defaultOnTimeout: 'fail', // 'fail' | 'default' | 'skip'
},
},
}

Configuration Options:

| Field | Type | Default | Description |
|---|---|---|---|
| mode | 'interactive'│'file'│'auto' | 'auto' | Input collection mode |
| defaultTimeout | number | — | Default timeout in seconds (undefined = wait) |
| defaultOnTimeout | 'fail'│'default'│'skip' | 'fail' | Behavior when timeout occurs |

Mode Descriptions:

- interactive — Prompt via terminal readline (requires TTY)
- file — Write request to file, poll for response file (for background/CI)
- auto — Use interactive if TTY available, otherwise file-based

Timeout Behaviors:

- fail — Throw error and fail the step
- default — Use the default value from the request (if provided)
- skip — Return empty response and continue
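
The mode and timeout rules above can be sketched as two small functions. Both `resolveMode` and `resolveTimeout` are hypothetical names. The sketch also makes two assumptions that the text does not settle: an "empty response" is modeled as a null value, and `default` with no default supplied is treated as a failure.

```typescript
type InputMode = 'interactive' | 'file' | 'auto';
type OnTimeout = 'fail' | 'default' | 'skip';

// 'auto' picks interactive when a TTY is available, otherwise file-based.
function resolveMode(mode: InputMode, hasTty: boolean): 'interactive' | 'file' {
  if (mode === 'auto') return hasTty ? 'interactive' : 'file';
  return mode;
}

// Decide what an input request yields once its timeout has elapsed.
function resolveTimeout(behavior: OnTimeout, defaultValue?: unknown): { value: unknown } {
  switch (behavior) {
    case 'fail':
      throw new Error('Human input timed out');
    case 'default':
      // Assumption: without a default there is nothing to fall back on.
      if (defaultValue === undefined) {
        throw new Error('Human input timed out and no default was provided');
      }
      return { value: defaultValue };
    case 'skip':
      // Assumption: an "empty response" is represented as null.
      return { value: null };
  }
}

const ciMode = resolveMode('auto', false);
const fallback = resolveTimeout('default', false);
```
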
Human input configuration supports dynamic values:
{
type: 'agent',
id: 'conditional-approval',
prompt: 'Process changes with appropriate approvals.',
llmConfig: (ctx) => ({
tools: ['Read', 'RequestInput'],
humanInput: {
mode: ctx.variables.ci ? 'file' : 'auto',
defaultTimeout: ctx.variables.environment === 'prod' ? 600 : 120,
defaultOnTimeout: ctx.variables.allowSkip ? 'skip' : 'fail',
},
}),
}

Subscribe to human input events for logging, UI updates, or custom handling:
eventBus.on('step:human-input-requested', (event) => {
console.log(`Input requested: ${event.request.prompt}`);
console.log(`Type: ${event.request.type}`);
});
eventBus.on('step:human-input-received', (event) => {
console.log(`Input received from: ${event.response.source}`);
});
eventBus.on('step:human-input-timeout', (event) => {
console.log(`Input timed out, behavior: ${event.behavior}`);
});

Event Types:

| Event | Description |
|---|---|
| step:human-input-requested | Agent requested input from human |
| step:human-input-received | Human provided input response |
| step:human-input-timeout | Input request timed out |
For non-interactive environments, the file collector writes requests and polls for responses:
.pipelines/.human-input/
├── request-{uuid}.json # Written by Claudi
└── response-{uuid}.json # Written by external process/human
Request file format:
{
"type": "confirm",
"prompt": "Deploy to production?",
"default": false,
"timeout": 300
}

Response file format:
{
"value": true
}

This enables integration with external approval systems, Slack bots, or web dashboards.
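
An external approver mostly needs to pair each request file with a response file that carries the same identifier. The sketch below shows only that pairing as a pure function, based on the request-{uuid}.json and response-{uuid}.json layout described above. `buildResponse` is a hypothetical helper; reading, writing, and polling the files is left to the integrating process.

```typescript
interface ResponseFile {
  fileName: string;          // e.g. response-{uuid}.json
  body: { value: unknown };  // matches the response file format shown above
}

// Derive the response file for a given request file name and chosen value.
function buildResponse(requestFileName: string, value: unknown): ResponseFile {
  const match = /^request-(.+)\.json$/.exec(requestFileName);
  if (!match) {
    throw new Error(`Not a request file: ${requestFileName}`);
  }
  return { fileName: `response-${match[1]}.json`, body: { value } };
}

// Example: approve a pending confirmation request.
const approval = buildResponse('request-1234-abcd.json', true);
const approvalJson = JSON.stringify(approval.body);
```
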
Claudi provides Docker BuildKit-style progress visualization:
Pipeline: implement
Run ID: impl-abc123
Steps:
✓ analyze 00:15 ↑ 2.1k ↓ 1.8k $0.02
● implement 01:23 ↑ 5.4k ↓ 12.3k $0.08 Writing src/auth.ts
├─ auth 00:45 ✓
└─ dashboard 00:38 ● Reading components...
○ test —
○ deploy —
Total: $0.10 | 2/4 steps | 01:38 elapsed
Status Indicators:
- ✓ Completed
- ✗ Failed
- ● Running
- ○ Pending
- ⊘ Skipped
Metrics Displayed:
- Duration per step
- Token counts (↑ input, ↓ output)
- Cost per step
- Current tool activity
For CI/CD environments:
claudi run implement --quiet # Minimal output

Or set the CI=true environment variable for automatic detection.
Claudi logs to JSONL files with automatic run ID correlation. Every log entry emitted during a pipeline run is tagged with the runId via AsyncLocalStorage-based ambient context — no manual threading required.
Location: .pipelines/.logs/claudi-{YYYY-MM-DD}.log
Log Entry Format:
{
"ts": "2026-01-15T10:30:45.123Z",
"level": "info",
"msg": "Pipeline started",
"runId": "impl-abc123",
"stepId": "analyze",
"sessionId": "sess-xyz",
"durationMs": 1523,
"costUsd": 0.02
}

Log Levels: error, warn, info, debug
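
Because the log is plain JSONL, it can also be filtered directly in code. The sketch below is a hypothetical `filterLogs` helper, not part of Claudi. It assumes a minimum level keeps entries at that severity or higher, with error as the most severe and debug as the least.

```typescript
type LogLevel = 'error' | 'warn' | 'info' | 'debug';

interface LogEntry {
  ts: string;
  level: LogLevel;
  msg: string;
  runId?: string;
  stepId?: string;
  durationMs?: number;
  costUsd?: number;
}

// Lower number = more severe.
const SEVERITY: Record<LogLevel, number> = { error: 0, warn: 1, info: 2, debug: 3 };

// Parse JSONL text and keep entries at or above the minimum level,
// optionally restricted to a single step.
function filterLogs(jsonl: string, minLevel: LogLevel, stepId?: string): LogEntry[] {
  return jsonl
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as LogEntry)
    .filter((entry) => SEVERITY[entry.level] <= SEVERITY[minLevel])
    .filter((entry) => stepId === undefined || entry.stepId === stepId);
}

const sampleLog = [
  '{"ts":"2026-01-15T10:30:45.123Z","level":"info","msg":"Pipeline started","runId":"impl-abc123"}',
  '{"ts":"2026-01-15T10:31:00.000Z","level":"warn","msg":"Slow step","runId":"impl-abc123","stepId":"analyze"}',
  '{"ts":"2026-01-15T10:31:05.000Z","level":"error","msg":"Step failed","runId":"impl-abc123","stepId":"test"}',
].join('\n');

const warnings = filterLogs(sampleLog, 'warn');
const analyzeOnly = filterLogs(sampleLog, 'debug', 'analyze');
```
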
Use the claudi logs command for filtered, human-readable output:
# View logs from most recent run
claudi logs
# Filter by level
claudi logs --last --level error
# Filter by step
claudi logs --last --step analyze
# Tail in real-time
claudi logs --follow
# Raw JSONL for piping
claudi logs --last --json | jq .

For advanced queries, pipe raw JSONL through jq:
# Steps taking > 10 seconds
claudi logs --last --json | jq 'select(.durationMs > 10000)'
# Cost tracking
claudi logs --last --json | jq 'select(.costUsd != null) | {stepId, costUsd}'

| Type | Description |
|---|---|
| step_failed | Generic step failure |
| timeout | Step exceeded timeout |
| budget_exceeded | Cost limit reached |
| aborted | User/system abort |
| rate_limited | API rate limit hit |
| validation_failed | Output schema validation failed |
| unknown | Unexpected error |
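
Failed steps can be retried using the llmConfig fields maxRetries, retryDelayMs, and retryBackoffMultiplier. The exact schedule is not spelled out in this document; assuming the delay is multiplied by the backoff factor after each attempt, the waits would be computed roughly as follows (`retryDelays` is a hypothetical helper).

```typescript
interface RetrySettings {
  maxRetries: number;
  retryDelayMs: number;
  retryBackoffMultiplier: number;
}

// Delay before each retry, assuming geometric growth:
// delay(n) = retryDelayMs * retryBackoffMultiplier^n for n = 0..maxRetries-1.
function retryDelays(settings: RetrySettings): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < settings.maxRetries; attempt++) {
    delays.push(settings.retryDelayMs * Math.pow(settings.retryBackoffMultiplier, attempt));
  }
  return delays;
}

const schedule = retryDelays({ maxRetries: 3, retryDelayMs: 1000, retryBackoffMultiplier: 2 });
```

Under this assumption, three retries with a 1000 ms base delay and a multiplier of 2 wait 1000 ms, 2000 ms, and 4000 ms.
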
const pipeline: Pipeline = {
llmConfig: {
retryOnFailure: true,
maxRetries: 3,
retryDelayMs: 1000,
retryBackoffMultiplier: 2, // Exponential backoff
},
steps: [...],
};

Claudi provides context-specific suggestions:
Error: Step 'analyze' timed out after 300000ms
Suggestions:
• Resume with longer timeout:
claudi run resume impl-abc123 --timeout 600000
• Check step complexity - consider breaking into smaller steps
• Review recent logs: claudi logs impl-abc123
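To make the retry settings shown earlier concrete: with `retryDelayMs: 1000`, `retryBackoffMultiplier: 2`, and `maxRetries: 3`, a conventional exponential backoff waits 1000 ms, 2000 ms, and 4000 ms before successive retries. The sketch below assumes the formula `retryDelayMs * retryBackoffMultiplier ** (attempt - 1)`. Whether Claudi adds jitter or caps the delay is not stated here, and `retryDelays` is a hypothetical helper.

```typescript
// Only the three numeric retry settings from the llmConfig example are modeled.
interface RetrySettings {
  maxRetries: number;
  retryDelayMs: number;
  retryBackoffMultiplier: number;
}

// Assumed schedule: the delay before retry N is base * multiplier^(N - 1).
function retryDelays(settings: RetrySettings): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt <= settings.maxRetries; attempt++) {
    delays.push(settings.retryDelayMs * settings.retryBackoffMultiplier ** (attempt - 1));
  }
  return delays;
}

const exampleDelays = retryDelays({
  maxRetries: 3,
  retryDelayMs: 1000,
  retryBackoffMultiplier: 2,
});
// exampleDelays is [1000, 2000, 4000]
```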
# Check what failed
claudi run show impl-abc123
# Resume from failure point
claudi run resume impl-abc123

Claudi exposes an SDK entry point (claudi/sdk) for defining pipelines programmatically. Two approaches are available:
import { buildPipeline, definePipeline, z } from 'claudi/sdk';
import type { Pipeline, PipelineStep, StepContext } from 'claudi/sdk';
// Builder pattern (recommended) — output types accumulate automatically
const pipeline = buildPipeline({
id: 'my-pipeline',
version: '1.0.0',
description: 'My custom pipeline',
inputSchema: z.object({ target: z.string() }),
})
.agent('analyze', {
description: 'Analyze code',
prompt: (ctx) => `Analyze: ${ctx.inputs.target}`,
llmConfig: { tools: ['Read', 'Grep'], model: 'sonnet', maxTurns: 20 },
outputKey: 'analysis',
outputSchema: z.object({ summary: z.string() }),
})
.build();
// definePipeline — direct definition with optional typed outputs
const pipeline2 = definePipeline({
id: 'simple',
version: '1.0.0',
description: 'Simple pipeline',
inputSchema: z.object({ target: z.string() }),
steps: [
{
type: 'agent',
id: 'analyze',
description: 'Analyze code',
prompt: (ctx) => `Analyze: ${ctx.inputs.target}`,
},
],
});
export { pipeline };

The CLI entry point (claudi/cli) bootstraps the DI container and resolves the full application:
import { runCLI } from 'claudi/cli';
await runCLI();

The hexagonal architecture separates concerns through ports (interfaces) and adapters (implementations). To provide a custom adapter for any driven port, implement the port interface and register it in the composition layer:
// Implement a driven port (e.g., custom state persistence)
import type { PipelineStateRepository } from '@ports/driven';
class CustomStateRepository implements PipelineStateRepository {
async save(state) {
/* Save to database, S3, etc. (.pipelines/.reports/ by default) */
}
async load(runId) {
/* Load from storage */
}
async list() {
/* List all runs */
}
}

Subscribe to pipeline events via the EventBus driven port:
// Events are defined in the domain layer
// Adapters subscribe through the EventBus port
eventBus.on('step:start', (event) => {
console.log(`Starting step: ${event.stepId}`);
});
eventBus.on('step:complete', (event) => {
console.log(`Completed: ${event.stepId} ($${event.costUsd})`);
});
eventBus.on('step:fail', (event) => {
console.error(`Failed: ${event.stepId} - ${event.error.message}`);
});

Claudi uses a hexagonal (ports & adapters) architecture with five layers. Dependencies flow strictly inward -- outer layers depend on inner layers, never the reverse. Boundary rules are enforced at lint time by eslint-plugin-boundaries (configured in boundary-spec.js).
For full architectural details -- layer responsibilities, data flow, DI patterns, and design decisions -- see ARCHITECTURE.md.
┌────────────────────────────────────────┐
│ composition/ │
│ DI wiring (tsyringe bootstrap) │
│ Tokens, modules, container setup │
└──────────┬──────────────────┬──────────┘
│ │
┌────────────────────▼───┐ ┌────────▼────────────────┐
│ adapters/ DRIVING │ │ adapters/ DRIVEN │
│ ┌───────────────────┐ │ │ ┌────────────────────┐ │
│ │ CLI (yargs/ink) │ │ │ │ Claude LLM Gateway │ │
│ │ SDK (buildPipe- │ │ │ │ File State Repo │ │
│ │ line) │ │ │ │ Config Manager │ │
│ └───────────────────┘ │ │ │ Logger, EventBus │ │
│ │ │ │ Input Collectors │ │
└───────────┬────────────┘ │ └────────────────────┘ │
│ └──────────┬──────────────┘
│ │
┌───────────▼────────────────────────────▼──┐
│ application/ │
│ ┌──────────────────────────────────────┐ │
│ │ Use Cases: init, run, resume, list, │ │
│ │ show, cost-summary, view-logs │ │
│ ├──────────────────────────────────────┤ │
│ │ Services: pipeline-runner, │ │
│ │ step-runner, step-runner strategies│ │
│ ├──────────────────────────────────────┤ │
│ │ Errors: abort, budget, timeout, │ │
│ │ validation, checkpoint-pause, ... │ │
│ └──────────────────────────────────────┘ │
└───────────────────┬───────────────────────┘
│
┌───────────────────▼───────────────────────┐
│ ports/ │
│ Driving: *.usecase.ts (init, run, ...) │
│ Driven: *.port.ts (LLM, state, config) │
│ Internal: runner, step-runner, strategy │
└───────────────────┬───────────────────────┘
│
┌───────────────────▼───────────────────────┐
│ domain/ │
│ Entities: PipelineState, StepState │
│ Values: Pipeline, PipelineStep, │
│ StepContext, StepResult, │
│ LLMConfig, LLMMessage, │
│ ClaudiConfig │
│ Errors: StepError │
│ Events: 16 event classes │
│ (Pure business logic, zero dependencies) │
└───────────────────────────────────────────┘
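The inward dependency flow in the diagram can be sketched in a single file. Everything here is a simplified assumption: `RunState` is a trimmed stand-in for the real state entity, `StateRepository` only borrows the `save`/`load`/`list` method names from the custom adapter example, and `InMemoryStateRepository` and `markStepComplete` are hypothetical. The sketch also skips tsyringe and wires the adapter by hand.

```typescript
// domain: plain data, no dependencies (hypothetical stand-in for PipelineState).
interface RunState {
  runId: string;
  completedSteps: string[];
}

// port: depends only on the domain type.
interface StateRepository {
  save(state: RunState): Promise<void>;
  load(runId: string): Promise<RunState | undefined>;
  list(): Promise<RunState[]>;
}

// driven adapter: implements the port; keeps runs in memory, e.g. for tests.
class InMemoryStateRepository implements StateRepository {
  private readonly runs = new Map<string, RunState>();

  async save(state: RunState): Promise<void> {
    // Store a copy so later mutations by the caller do not leak in.
    this.runs.set(state.runId, { ...state, completedSteps: [...state.completedSteps] });
  }

  async load(runId: string): Promise<RunState | undefined> {
    return this.runs.get(runId);
  }

  async list(): Promise<RunState[]> {
    return Array.from(this.runs.values());
  }
}

// application: depends on the port, never on a concrete adapter.
async function markStepComplete(
  repository: StateRepository,
  runId: string,
  stepId: string,
): Promise<RunState> {
  const current = (await repository.load(runId)) ?? { runId, completedSteps: [] };
  const next: RunState = { ...current, completedSteps: [...current.completedSteps, stepId] };
  await repository.save(next);
  return next;
}

// composition: the only place that chooses which adapter backs the port.
const stateRepository: StateRepository = new InMemoryStateRepository();
```

Because `markStepComplete` only sees the `StateRepository` interface, swapping the in-memory adapter for a file- or database-backed one would not touch application code.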
Dependency rules:
| Layer | Can depend on |
|---|---|
| domain | nothing |
| ports | domain |
| application | domain, ports |
| adapters | domain, ports (NOT application) |
| composition | everything |
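The table can be read as a simple allow-list. The following sketch encodes it as data and checks a single import direction. It only illustrates the rule set and is not how eslint-plugin-boundaries is configured in boundary-spec.js; `canImport` is a hypothetical helper, and treating same-layer imports as allowed is an assumption.

```typescript
type Layer = 'domain' | 'ports' | 'application' | 'adapters' | 'composition';

// One entry per table row: the layers each layer is allowed to depend on.
const allowedDependencies: Record<Layer, readonly Layer[]> = {
  domain: [],
  ports: ['domain'],
  application: ['domain', 'ports'],
  adapters: ['domain', 'ports'],
  composition: ['domain', 'ports', 'application', 'adapters'],
};

// True if code in `from` may import code in `to`.
// Assumption: imports within the same layer are always allowed.
function canImport(from: Layer, to: Layer): boolean {
  return from === to || allowedDependencies[from].some((layer) => layer === to);
}

const adapterToApplication = canImport('adapters', 'application'); // false
const applicationToPorts = canImport('application', 'ports'); // true
```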
Key patterns:
- DI via tsyringe -- Injection tokens in tokens.ts, modules register bindings, bootstrap() wires the container
- Strategy pattern -- Step runner strategies (agent, script, parallel, sequence) registered via multi-registration
- Child container per request -- Isolation between pipeline runs
- Path aliases -- @domain/*, @ports/*, @application/*, @adapters/*, @composition/* (never relative cross-layer imports)
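As a rough illustration of the strategy pattern listed above, the sketch below dispatches each step to a per-type strategy looked up from a registry. It leaves out tsyringe and runs synchronously so that it stays self-contained, whereas real step runners would be asynchronous. `SimpleStep`, `StepRunnerStrategy`, and `runStep` are hypothetical names and do not reflect Claudi's actual interfaces.

```typescript
type StepType = 'agent' | 'script' | 'parallel' | 'sequence';

interface SimpleStep {
  id: string;
  type: StepType;
  steps?: SimpleStep[]; // children, used by parallel and sequence steps
}

// Each strategy handles one step type and returns a trace of what it ran.
interface StepRunnerStrategy {
  readonly type: StepType;
  run(step: SimpleStep, runChild: (child: SimpleStep) => string[]): string[];
}

// Stand-in for multi-registration: a flat list of all available strategies.
const strategies: StepRunnerStrategy[] = [
  { type: 'agent', run: (step) => [`agent:${step.id}`] },
  { type: 'script', run: (step) => [`script:${step.id}`] },
  {
    type: 'sequence',
    run: (step, runChild) => {
      const trace: string[] = [];
      for (const child of step.steps ?? []) trace.push(...runChild(child));
      return trace;
    },
  },
  {
    // A real parallel strategy would run children concurrently; this
    // synchronous sketch simply visits them in order.
    type: 'parallel',
    run: (step, runChild) =>
      (step.steps ?? []).reduce<string[]>((trace, child) => trace.concat(runChild(child)), []),
  },
];

// Index the strategies by step type for lookup.
const strategyRegistry: Partial<Record<StepType, StepRunnerStrategy>> = {};
for (const strategy of strategies) strategyRegistry[strategy.type] = strategy;

// Dispatch a step to its strategy; composite strategies recurse through runStep.
function runStep(step: SimpleStep): string[] {
  const strategy = strategyRegistry[step.type];
  if (!strategy) throw new Error(`No strategy registered for step type: ${step.type}`);
  return strategy.run(step, runStep);
}

const stepTrace = runStep({
  id: 'root',
  type: 'sequence',
  steps: [
    { id: 'analyze', type: 'agent' },
    {
      id: 'checks',
      type: 'parallel',
      steps: [
        { id: 'lint', type: 'script' },
        { id: 'test', type: 'script' },
      ],
    },
  ],
});
// stepTrace is ['agent:analyze', 'script:lint', 'script:test']
```

Passing `runStep` into composite strategies is what allows sequences to nest inside parallel steps and vice versa without either strategy knowing about the other.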
# Clone repository
git clone https://github.com/your-org/claudi.git
cd claudi
# Install dependencies
bun install

bun run dev # Hot-reload CLI development
bun run build # Build JS bundles + type declarations (dist/)
bun run build:bin # Compile to standalone binary (dist/claudi)
bun run build:js # Build CLI + SDK bundles (dist/)
bun run build:types # Bundle SDK type declarations (dist/typings/)
bun run clean # Remove dist/ and node_modules/
bun run test # Run tests with coverage
bun run typecheck # TypeScript checking (tsc --noEmit)
bun run lint # ESLint with architectural boundary enforcement
bun run lint:fix # Auto-fix lint issues
bun run format # Format code with Prettier
bun run format:check # Check formatting without writing

Tests live in tests/ and mirror the src/ structure. Mock driven ports for application-layer tests.
bun test # All tests
bun test tests/specific.ts # Single file

Architectural layer dependencies are enforced by eslint-plugin-boundaries at lint time. The rules are defined in boundary-spec.js. Violations will not show as TypeScript errors -- run bun run lint to catch them.
# Check for boundary violations
bun run lint

MIT
Contributions are welcome! Please read the contributing guidelines and submit pull requests to the main repository.