Create specialized agent workflows that coordinate multiple AI agents to tackle complex engineering tasks. Instead of a single agent trying to handle everything, you can orchestrate teams of focused specialists that work together.
- Edit an existing agent: Start with my-custom-agent.ts and modify it for your needs
- Test your agent: Run codebuff --agent your-agent-name
- Publish your agent: Run codebuff publish your-agent-name
- For examples, check the examples/ directory.
- Join our Discord community and ask your questions!
- Check our documentation for more details
Codebuff is an open-source AI coding assistant that edits your codebase through natural language instructions. Instead of using one model for everything, it coordinates specialized agents that work together to understand your project and make precise changes.
Codebuff scores 61% versus Claude Code's 53% on our evals, which cover 175+ coding tasks that simulate real-world work across multiple open-source repos.
When you ask Codebuff to "add authentication to my API," it might invoke:
- A File Explorer Agent to scan your codebase to understand the architecture and find relevant files
- A Planner Agent to plan which files need changes and in what order
- An Editor Agent to make precise edits
- A Reviewer Agent to validate changes
This multi-agent approach gives you better context understanding, more accurate edits, and fewer errors compared to single-model tools.
Modern software projects are complex ecosystems with thousands of files, multiple frameworks, intricate dependencies, and domain-specific requirements. A single AI agent trying to understand and modify such systems faces fundamental limitations—not just in knowledge, but in the sheer volume of information it can process at once.
Agent workflows elegantly solve this by breaking large tasks into focused sub-problems. When working with large codebases (100k+ lines), each specialist agent receives only the narrow context it needs—a security agent sees only auth code, not UI components—keeping the context for each agent manageable while ensuring comprehensive coverage.
This is about efficient AI context management, not recreating a human department. Simply creating a "frontend-developer" agent misses the point. AI agents don't have human constraints like context-switching or meetings. Their power comes from hyper-specialization, allowing them to process a narrow domain more deeply than a human could, then coordinating seamlessly with other specialists.
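As a rough illustration of the narrow-context idea, the sketch below filters a file list down to what a single specialist should see. The `Specialist` type, the `selectContext` helper, and the file paths are all hypothetical; none of them are part of Codebuff, and a real agent would get its context through its tools and prompts.

```typescript
// Hypothetical sketch: not part of the Codebuff API.
type Specialist = {
  id: string
  include: RegExp[] // path patterns this specialist cares about
}

// Keep only the files whose path matches at least one of the agent's patterns.
function selectContext(files: string[], agent: Specialist): string[] {
  return files.filter((path) => agent.include.some((pattern) => pattern.test(path)))
}

const projectFiles = [
  'src/auth/login.ts',
  'src/auth/session.ts',
  'src/ui/Button.tsx',
  'src/db/schema.ts',
]

const securityAgent: Specialist = { id: 'security-agent', include: [/^src\/auth\//] }

// The security specialist sees the two auth files and nothing from the UI.
const securityContext = selectContext(projectFiles, securityAgent)
console.log(securityContext)
```

The point of the sketch is the shape of the decision, not the mechanism: each specialist gets a small, relevant slice instead of the whole repository.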
Here's an example of a git-committer agent that creates good commit messages:
export default {
  id: 'git-committer',
  displayName: 'Git Committer',
  model: 'openai/gpt-5-nano',
  toolNames: ['read_files', 'run_terminal_command', 'end_turn'],
  instructionsPrompt:
    'You create meaningful git commits by analyzing changes, reading relevant files for context, and crafting clear commit messages that explain the "why" behind changes.',
  async *handleSteps() {
    // Analyze what changed
    yield { tool: 'run_terminal_command', command: 'git diff' }
    yield { tool: 'run_terminal_command', command: 'git log --oneline -5' }
    // Stage files and create commit with good message
    yield 'STEP_ALL'
  },
}
This agent systematically analyzes changes, reads relevant files for context, then creates commits with clear, meaningful messages that explain the "why" behind changes.
This guide covers everything you need to know about building custom Codebuff agents.
Each agent is a TypeScript file that exports an AgentDefinition object:
export default {
  id: 'my-agent', // Unique identifier (lowercase, hyphens only)
  displayName: 'My Agent', // Human-readable name
  model: 'anthropic/claude-sonnet-4', // AI model to use (OpenRouter model ID)
  toolNames: ['read_files', 'write_file'], // Available tools
  instructionsPrompt: 'You are...', // Agent behavior instructions
  spawnerPrompt: 'Use this agent when...', // When others should spawn this
  spawnableAgents: ['helper-agent'], // Agents this can spawn
  // Optional: Programmatic control
  async *handleSteps() {
    yield { tool: 'read_files', paths: ['src/config.ts'] }
    yield 'STEP' // Let AI process and respond
  },
}

- id: Unique identifier using lowercase letters and hyphens only
- displayName: Human-readable name shown in UI
- model: AI model from OpenRouter (see available models)
- instructionsPrompt: Detailed instructions defining the agent's role and behavior

- toolNames: Array of tools the agent can use (defaults to common tools)
- spawnerPrompt: Instructions for when other agents should spawn this one
- spawnableAgents: Array of agent names this agent can spawn
- handleSteps: Generator function for programmatic control

- read_files: Read file contents
- write_file: Create or modify entire files
- str_replace: Make targeted string replacements
- code_search: Search for patterns across the codebase

- run_terminal_command: Execute shell commands
- spawn_agents: Delegate tasks to other agents
- end_turn: Finish the agent's response

- web_search: Search the internet for information
- read_docs: Read technical documentation
- browser_logs: Navigate and inspect web pages
See types/tools.ts for detailed parameter information.
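Before testing an agent, it can help to sanity-check the definition against the field rules listed above. The sketch below is a hypothetical pre-flight check, not something Codebuff ships: the local `AgentDefinition` type is a simplified stand-in for the real one in types/, and it assumes that id, displayName, model, and instructionsPrompt are the required fields.

```typescript
// Simplified, hypothetical stand-in for the real AgentDefinition type.
type AgentDefinition = {
  id: string
  displayName: string
  model: string
  instructionsPrompt: string
  toolNames?: string[]
  spawnerPrompt?: string
  spawnableAgents?: string[]
}

// Returns a list of human-readable problems; an empty list means the checks passed.
function validateAgent(def: Partial<AgentDefinition>): string[] {
  const problems: string[] = []
  // Assumption: these four fields are the required ones.
  const required = ['id', 'displayName', 'model', 'instructionsPrompt'] as const
  for (const field of required) {
    if (!def[field]) problems.push(`missing ${field}`)
  }
  // Literal reading of "lowercase letters and hyphens only".
  if (def.id && !/^[a-z-]+$/.test(def.id)) {
    problems.push(`id "${def.id}" must use lowercase letters and hyphens only`)
  }
  return problems
}

const validProblems = validateAgent({
  id: 'git-committer',
  displayName: 'Git Committer',
  model: 'openai/gpt-5-nano',
  instructionsPrompt: 'You create meaningful git commits.',
})

const invalidProblems = validateAgent({ id: 'Git_Committer', displayName: 'Git Committer' })

console.log(validProblems)
console.log(invalidProblems)
```

Here the second definition is flagged for two missing fields and for an id that contains an uppercase letter and an underscore.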
Use the handleSteps generator function to mix AI reasoning with programmatic logic:
async *handleSteps() {
  // Execute a tool
  yield { tool: 'read_files', paths: ['package.json'] }
  // Let AI process results and respond
  yield 'STEP'
  // Conditional logic (needsMoreAnalysis is a placeholder for your own check)
  if (needsMoreAnalysis) {
    yield { tool: 'spawn_agents', agents: ['deep-analyzer'] }
    yield 'STEP_ALL' // Wait for spawned agents to complete
  }
  // Final AI response
  yield 'STEP'
}
- 'STEP': Let AI process and respond once
- 'STEP_ALL': Let AI continue until completion
- Tool calls: { tool: 'tool_name', ...params }
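To see how the three kinds of yielded values interleave, the following sketch drives a step generator with fake tool results and records what it asks for. Everything here is an assumption made for illustration: the `Step` and `ToolCall` types, the `dryRun` helper, and the way results are passed back through `next()` are not taken from the Codebuff runtime. The generator is written as a synchronous `function*` (the README examples use `async *handleSteps()`) so the sketch runs without any async plumbing.

```typescript
// Hypothetical types for illustration; see types/ for the real definitions.
type ToolCall = { tool: string; [param: string]: unknown }
type Step = ToolCall | 'STEP' | 'STEP_ALL'
type StepGenerator = Generator<Step, void, string | undefined>

// Example workflow: read a file, let the model respond, optionally spawn a helper.
function* exampleSteps(): StepGenerator {
  const pkg = yield { tool: 'read_files', paths: ['package.json'] }
  yield 'STEP'
  // Stand-in condition for "needs more analysis".
  if (pkg !== undefined && pkg.includes('typescript')) {
    yield { tool: 'spawn_agents', agents: ['deep-analyzer'] }
    yield 'STEP_ALL'
  }
  yield 'STEP'
}

// Walk the generator, answering tool calls with fake results and logging each step.
function dryRun(gen: StepGenerator, fakeTool: (call: ToolCall) => string): string[] {
  const log: string[] = []
  let result = gen.next()
  while (!result.done) {
    const step = result.value
    if (typeof step === 'string') {
      log.push(`model:${step}`) // 'STEP' or 'STEP_ALL': the model would run here
      result = gen.next(undefined)
    } else {
      log.push(`tool:${step.tool}`)
      result = gen.next(fakeTool(step)) // feed the fake tool result back in
    }
  }
  return log
}

const stepTrace = dryRun(exampleSteps(), (call) =>
  call.tool === 'read_files' ? '{"devDependencies":{"typescript":"^5.0.0"}}' : '',
)
console.log(stepTrace)
```

A dry run like this is a cheap way to check the branching in a workflow before spending model calls on it.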
Choose models based on your agent's needs:
- anthropic/claude-sonnet-4: Best for complex reasoning and code generation
- openai/gpt-5: Strong general-purpose capabilities
- x-ai/grok-4-fast: Fast and cost-effective for simple or medium-complexity tasks
Any model on OpenRouter: Unlike Claude Code, which locks you into Anthropic's models, Codebuff supports any model available on OpenRouter, from Claude and GPT to specialized models like Qwen and DeepSeek. Switch models for different tasks or use the latest releases without waiting for platform updates.
See OpenRouter for all available models and pricing.
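One way to keep model choices consistent across several agents is a small lookup table. The sketch below encodes the three recommendations above; the `TaskKind` labels and the `pickModel` helper are hypothetical names invented for this example.

```typescript
// Hypothetical task categories, mirroring the three recommendations above.
type TaskKind = 'complex-reasoning' | 'general-purpose' | 'simple-or-medium'

const MODEL_FOR_TASK: Record<TaskKind, string> = {
  'complex-reasoning': 'anthropic/claude-sonnet-4',
  'general-purpose': 'openai/gpt-5',
  'simple-or-medium': 'x-ai/grok-4-fast',
}

// Look up the OpenRouter model ID for a given kind of task.
function pickModel(kind: TaskKind): string {
  return MODEL_FOR_TASK[kind]
}

const reviewerModel = pickModel('complex-reasoning')
const committerModel = pickModel('simple-or-medium')
console.log(reviewerModel, committerModel)
```

Because the table is typed as `Record<TaskKind, string>`, adding a new task kind without assigning it a model is a compile-time error.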
Agents can spawn other agents to create sophisticated workflows:
// Parent agent spawns specialists
async *handleSteps() {
  yield { tool: 'spawn_agents', agents: [
    'security-scanner',
    'performance-analyzer',
    'code-reviewer'
  ]}
  yield 'STEP_ALL' // Wait for all to complete
  // Synthesize results
  yield 'STEP'
}
Reuse any published agent: Compose existing published agents to get a leg up. Codebuff agents are the new MCP!
- Be specific about the agent's role and expertise
- Include examples of good outputs
- Specify when the agent should ask for clarification
- Define the agent's limitations
- Start with file exploration tools (read_files, code_search)
- Use str_replace for targeted edits, write_file for major changes
- Always use end_turn to finish responses cleanly
- Include error checking in programmatic flows
- Provide fallback strategies for failed operations
- Log important decisions for debugging
- Choose appropriate models for the task complexity
- Minimize unnecessary tool calls
- Use spawnable agents for parallel processing
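The error-handling advice above (check for failures, fall back, log decisions) can be captured in a small helper that programmatic logic might call. This is a generic sketch rather than a Codebuff feature: `Strategy`, `firstSuccessful`, and the config example are hypothetical.

```typescript
// Hypothetical helper: try each strategy in order and record what happened.
type Strategy<T> = { name: string; run: () => T }

function firstSuccessful<T>(strategies: Strategy<T>[], log: string[]): T {
  for (const strategy of strategies) {
    try {
      const value = strategy.run()
      log.push(`ok: ${strategy.name}`)
      return value
    } catch (err) {
      // Log the failure and move on to the next fallback.
      const reason = err instanceof Error ? err.message : String(err)
      log.push(`failed: ${strategy.name} (${reason})`)
    }
  }
  throw new Error('all strategies failed')
}

const decisions: string[] = []
const rawConfig = '{ not valid json' // simulated broken file contents

const parsedConfig = firstSuccessful<{ strict: boolean }>(
  [
    { name: 'parse config.json', run: () => JSON.parse(rawConfig) as { strict: boolean } },
    { name: 'use defaults', run: () => ({ strict: false }) },
  ],
  decisions,
)

console.log(parsedConfig, decisions)
```

In this run the JSON parse throws, the defaults are used instead, and both outcomes end up in the decision log for later debugging.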
- Local Testing: codebuff --agent your-agent-name
- Debug Mode: Add logging to your handleSteps function
- Unit Testing: Test individual functions in isolation
- Integration Testing: Test agent coordination workflows
- Validate: Ensure your agent works across different codebases
- Document: Include clear usage instructions
- Publish: codebuff publish your-agent-name
- Maintain: Update as models and tools evolve
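One way to make unit testing practical is to pull decision logic out of handleSteps into plain functions that can be checked without running an agent. The sketch below does this for a "TypeScript or JavaScript expert" choice; `chooseExpert` is a hypothetical helper, and the assumption that the config arrives as a string is for illustration only.

```typescript
// Hypothetical pure helper extracted from a workflow so it can be tested in isolation.
function chooseExpert(configText: string): 'typescript-expert' | 'javascript-expert' {
  return configText.includes('typescript') ? 'typescript-expert' : 'javascript-expert'
}

// Plain checks: no agent runtime, model calls, or file system needed.
const tsChoice = chooseExpert('{"devDependencies":{"typescript":"^5.0.0"}}')
const jsChoice = chooseExpert('{"dependencies":{"express":"^4.0.0"}}')

console.log(tsChoice, jsChoice)
```

The generator then only needs to yield a spawn_agents call with whatever the helper returns, which keeps the untested surface small.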
async *handleSteps() {
  const config = yield { tool: 'read_files', paths: ['config.json'] }
  yield 'STEP'
  if (config.includes('typescript')) {
    yield { tool: 'spawn_agents', agents: ['typescript-expert'] }
  } else {
    yield { tool: 'spawn_agents', agents: ['javascript-expert'] }
  }
  yield 'STEP_ALL'
}

async *handleSteps() {
  for (let attempt = 0; attempt < 3; attempt++) {
    yield { tool: 'run_terminal_command', command: 'npm test' }
    yield 'STEP'
    // allTestsPass is a placeholder: derive it from the test command's result
    if (allTestsPass) break
    yield { tool: 'spawn_agents', agents: ['test-fixer'] }
    yield 'STEP_ALL'
  }
}
Deep customizability: Create sophisticated agent workflows with TypeScript generators that mix AI generation with programmatic control. Define custom agents that spawn subagents, implement conditional logic, and orchestrate complex multi-step processes that adapt to your specific use cases.
Fully customizable SDK: Build Codebuff's capabilities directly into your applications with a complete TypeScript SDK. Create custom tools, integrate with your CI/CD pipeline, build AI-powered development environments, or embed intelligent coding assistance into your products.
Learn more about the SDK here.
- Discord: Join our community for help and inspiration
- Examples: Study the examples/ directory for patterns
- Documentation: codebuff.com/docs and check types/ for detailed type information
- Issues: Report bugs and request features on GitHub
- Support: support@codebuff.com
Happy agent building! 🤖