An experimental AI coding agent built to explore multi-model orchestration, tool systems, and agentic workflows.
Orpheus CLI is a learning experiment, not a production tool. I built it to deeply understand how coding agents work - from tool orchestration to memory systems to multi-model coordination.
This project represents my journey exploring:
- How to build effective AI coding assistants
- Multi-model orchestration patterns
- Tool systems and agent architectures
- The real complexity behind "simple" agentic workflows
If you're looking for a production-ready coding agent, check out Claude Code, Cursor, or Aider.
- Multi-model orchestration - A "planner" model coordinating specialist agents (Claude for coding, Codex for review, Gemini for documentation)
- Unified tool registry - AGNO-inspired tool system with structured responses
- Pattern memory - SQLite + FTS5 for storing and recalling successful refactoring patterns
- Git-aware context - Automatic workspace snapshots with branch, commits, and dirty files
- Session persistence - Save/resume sessions with full context restoration
- Streaming TUI - Bubble Tea interface showing real-time tool execution
A deep dive into the philosophy, design decisions, and lessons learned from building an experimental AI coding agent.
When I started building Orpheus, I wanted to explore a fundamental question: Can we build an AI coding assistant that truly understands the codebase it's working with?
The vision was ambitious:
- Multi-model orchestration - Use the right model for each task (Claude for coding, Codex for review, Gemini for documentation)
- Persistent memory - Learn from successful refactoring patterns and reuse them
- Workspace awareness - Understand git state, modified files, and project context before making decisions
- Parallel execution - Speed up multi-file operations by running tasks concurrently
I chose Go for several reasons:
- Concurrency primitives - Goroutines and channels seemed perfect for parallel agent execution
- Single binary distribution - Easy to ship as a standalone CLI
- Performance - Compiled language for fast execution
- Type safety - Catch errors at compile time
cmd/orpheus/  # CLI entry point
internal/
├── agent/  # Core orchestrator (OrpheusAgent)
│   ├── agent.go  # Main agent with multi-model setup
│   ├── prompts.go  # System prompts
│   └── capsule.go  # Context capsule for state injection
├── cortex/  # Model provider abstraction
│   ├── unified_client.go
│   ├── anthropic_client.go  # Claude integration
│   ├── openai_client.go  # OpenAI/Codex integration
│   └── registry.go  # Tool registration
├── memory/
│   └── memory.go  # SQLite + FTS5 storage
├── tools/  # Tool implementations
│   ├── file_tools.go  # Read, write, edit
│   ├── git_tools.go  # Git status, diff, log
│   └── progressive_reader.go  # Large file handling
├── workflow/  # Task orchestration
│   ├── runner.go  # Sequential execution
│   └── parallel_runner.go  # Concurrent execution
├── tui/  # Bubble Tea terminal UI
├── stream/  # Real-time event broadcasting
└── hooks/  # Session lifecycle hooks
The architecture follows a hub-and-spoke model:
                    ┌─────────────────┐
                    │  OrpheusAgent   │  ← Executive planner
                    │  (Claude 4.5)   │
                    └────────┬────────┘
                             │
        ┌────────────────────┼────────────────────┐
        │                    │                    │
        ▼                    ▼                    ▼
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│ CodingAgent  │     │  DebugAgent  │     │   DocAgent   │
│   (Claude)   │     │   (Codex)    │     │   (Gemini)   │
└──────────────┘     └──────────────┘     └──────────────┘
        │                    │                    │
        └────────────────────┼────────────────────┘
                             │
                    ┌────────▼────────┐
                    │  Tool Registry  │
                    │   (18+ tools)   │
                    └─────────────────┘
The provider abstraction layer (internal/cortex/) is 2,400+ lines of code that handle:
- Multiple model providers (Anthropic, OpenAI, Google)
- Tool registration and execution
- SSE streaming for real-time feedback
- Request/response normalization
Design decision: I chose to create a unified interface rather than using separate clients because I wanted the ability to swap models at runtime based on task requirements.
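As a rough illustration of that decision, the sketch below shows one way a unified interface can make models swappable per task at runtime. The `Provider` interface, the `Router` type, and the `echoProvider` stand-in are assumptions made for this example, not Orpheus's actual types, and no real API is called.

```go
package main

import (
	"context"
	"fmt"
)

// Provider is a hypothetical unified interface; the real one may differ.
type Provider interface {
	Name() string
	Complete(ctx context.Context, prompt string) (string, error)
}

// echoProvider stands in for a real API client and makes no network calls.
type echoProvider struct{ name string }

func (p echoProvider) Name() string { return p.name }

func (p echoProvider) Complete(_ context.Context, prompt string) (string, error) {
	return fmt.Sprintf("[%s] %s", p.name, prompt), nil
}

// Router picks a provider per task kind at runtime.
type Router struct {
	byTask   map[string]Provider
	fallback Provider
}

func NewRouter(fallback Provider) *Router {
	return &Router{byTask: make(map[string]Provider), fallback: fallback}
}

// Assign maps a task kind (e.g. "review") to a provider. Calling it again
// later swaps the model without touching any caller.
func (r *Router) Assign(task string, p Provider) { r.byTask[task] = p }

// For returns the provider for a task kind, or the fallback if none is set.
func (r *Router) For(task string) Provider {
	if p, ok := r.byTask[task]; ok {
		return p
	}
	return r.fallback
}

func main() {
	r := NewRouter(echoProvider{name: "claude"})
	r.Assign("review", echoProvider{name: "codex"})
	r.Assign("docs", echoProvider{name: "gemini"})

	for _, task := range []string{"coding", "review", "docs"} {
		out, err := r.For(task).Complete(context.Background(), "hello")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(task, "->", out)
	}
}
```

Callers only ask the router for a task kind, so reassigning "review" to another provider changes behavior without edits at the call sites. The normalization work listed above would live inside each real `Provider` implementation.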
The memory layer (internal/memory/) is SQLite-based pattern storage with FTS5 full-text search:
- Store successful refactoring patterns
- Jaccard similarity matching for pattern recall
- Session persistence for resume functionality
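// Pattern matching uses Jaccard similarity: |A ∩ B| / |A ∪ B|
func jaccardSimilarity(a, b []string) float64 {
	setA := make(map[string]bool)
	for _, s := range a { setA[s] = true }
	seen := make(map[string]bool) // count each distinct element of b once
	intersection, union := 0, len(setA)
	for _, s := range b {
		if seen[s] { continue }
		seen[s] = true
		if setA[s] { intersection++ } else { union++ }
	}
	if union == 0 { return 0 } // both inputs empty: avoid 0/0 (NaN)
	return float64(intersection) / float64(union)
}

The tool registry provides: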
- Automatic parameter validation
- Execution timing and logging
- Event broadcasting for UI updates
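A minimal sketch of how a registry could combine these three behaviors follows. The `Tool` interface, the `Event` struct, and the `echoTool` demo are hypothetical; only the method names `RegisterTool` and `ExecuteTool` are taken from the snippets in this document, and their real signatures may differ.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Tool is a hypothetical contract; Orpheus's real interface may differ.
type Tool interface {
	Name() string
	Required() []string // parameter names that must be present
	Run(ctx context.Context, params map[string]any) (string, error)
}

// Event describes one finished call, for listeners such as a UI.
type Event struct {
	Tool     string
	Duration time.Duration
	Err      error
}

type Registry struct {
	tools  map[string]Tool
	Events chan Event // buffered; events are dropped when it is full
}

func NewRegistry() *Registry {
	return &Registry{tools: map[string]Tool{}, Events: make(chan Event, 16)}
}

func (r *Registry) RegisterTool(t Tool) { r.tools[t.Name()] = t }

// ExecuteTool validates parameters, times the call, and emits an Event.
func (r *Registry) ExecuteTool(ctx context.Context, name string, params map[string]any) (string, error) {
	t, ok := r.tools[name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", name)
	}
	for _, key := range t.Required() {
		if _, ok := params[key]; !ok {
			return "", fmt.Errorf("tool %q: missing parameter %q", name, key)
		}
	}
	start := time.Now()
	out, err := t.Run(ctx, params)
	select { // non-blocking send so a slow listener never stalls a tool call
	case r.Events <- Event{Tool: name, Duration: time.Since(start), Err: err}:
	default:
	}
	return out, err
}

// echoTool is a toy tool used only for this demonstration.
type echoTool struct{}

func (echoTool) Name() string       { return "echo" }
func (echoTool) Required() []string { return []string{"text"} }
func (echoTool) Run(_ context.Context, p map[string]any) (string, error) {
	return fmt.Sprint(p["text"]), nil
}

func main() {
	r := NewRegistry()
	r.RegisterTool(echoTool{})

	out, err := r.ExecuteTool(context.Background(), "echo", map[string]any{"text": "hi"})
	fmt.Println(out, err)
	ev := <-r.Events
	fmt.Println("event:", ev.Tool, ev.Duration)

	_, err = r.ExecuteTool(context.Background(), "echo", map[string]any{})
	fmt.Println(err)
}
```

Validation here only checks that required keys exist. A fuller version would probably check types as well, which this sketch leaves out to stay short.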
The concurrency infrastructure (internal/workflow/parallel_runner.go) uses:
- WaitGroups for synchronization
- Channels for result collection
- Per-task timing and error handling
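To show these three pieces working together, here is a self-contained sketch. The `Task` and `taskResult` shapes and the `runParallel` function are assumptions for illustration; the project's own excerpt of the loop appears right after it.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errBoom is a sample error used by the demo tasks.
var errBoom = errors.New("boom")

// Task and taskResult are hypothetical shapes for this example.
type Task struct {
	Name string
	Run  func() (string, error)
}

type taskResult struct {
	Name     string
	Output   string
	Err      error
	Duration time.Duration
}

// runParallel runs every task in its own goroutine and collects the results.
func runParallel(tasks []Task) []taskResult {
	// One buffer slot per task, so no goroutine blocks on send while the
	// caller is still inside wg.Wait().
	resultChan := make(chan taskResult, len(tasks))
	var wg sync.WaitGroup

	for _, task := range tasks {
		wg.Add(1)
		go func(task Task) {
			defer wg.Done()
			start := time.Now()
			out, err := task.Run()
			resultChan <- taskResult{
				Name: task.Name, Output: out, Err: err,
				Duration: time.Since(start), // per-task timing
			}
		}(task)
	}
	wg.Wait()
	close(resultChan) // safe: every sender has finished

	results := make([]taskResult, 0, len(tasks))
	for r := range resultChan {
		results = append(results, r)
	}
	return results // completion order, not input order
}

func main() {
	tasks := []Task{
		{Name: "ok", Run: func() (string, error) { return "done", nil }},
		{Name: "fail", Run: func() (string, error) { return "", errBoom }},
	}
	for _, r := range runParallel(tasks) {
		fmt.Printf("%s: out=%q err=%v (%s)\n", r.Name, r.Output, r.Err, r.Duration)
	}
}
```

Errors travel inside each result instead of aborting the batch, so one failing task does not hide the output of the others.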
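// Execute tasks in parallel using goroutines.
// Note: resultChan needs capacity for every task (or a concurrent reader);
// otherwise the sends block and wg.Wait() never returns.
for _, task := range tasks {
	wg.Add(1)
	go func(task Task) {
		defer wg.Done()
		result, err := t.registry.ExecuteTool(ctx, task.Tool, task.Parameters)
		resultChan <- taskResult{...}
	}(task)
}
wg.Wait()

The context capsule (internal/agent/capsule.go) is a state injection system that provides: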
- Git status (branch, HEAD, modified files)
- Recent conversation history
- Active tasks and discoveries
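One way to picture the capsule is as a small struct rendered to text and prepended to each model request. The sketch below assumes that shape; the `Capsule` fields, the `Render` method, and the output layout are all hypothetical rather than taken from capsule.go.

```go
package main

import (
	"fmt"
	"strings"
)

// Capsule is a hypothetical shape for the injected state.
type Capsule struct {
	Branch      string
	Head        string
	Modified    []string
	History     []string // oldest first, most recent last
	ActiveTasks []string
	Discoveries []string
}

// Render formats the capsule as plain text. maxHistory caps how many of the
// most recent history entries are included.
func (c Capsule) Render(maxHistory int) string {
	var b strings.Builder
	fmt.Fprintf(&b, "branch: %s; HEAD: %s\n", c.Branch, c.Head)
	writeList(&b, "Modified", c.Modified)

	hist := c.History
	if maxHistory >= 0 && len(hist) > maxHistory {
		hist = hist[len(hist)-maxHistory:]
	}
	writeList(&b, "Recent", hist)
	writeList(&b, "Task", c.ActiveTasks)
	writeList(&b, "Discovery", c.Discoveries)
	return b.String()
}

// writeList emits one bullet line per item.
func writeList(b *strings.Builder, label string, items []string) {
	for _, it := range items {
		fmt.Fprintf(b, "• %s: %s\n", label, it)
	}
}

func main() {
	c := Capsule{
		Branch:      "feature/new-auth",
		Head:        "abc123",
		Modified:    []string{"internal/auth/handler.go"},
		History:     []string{"add login route", "fix token expiry", "write tests"},
		ActiveTasks: []string{"refactor handler"},
	}
	fmt.Print(c.Render(2))
}
```

Capping the history keeps the injected text small, which matters because it is sent with every request.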
The tool registration worked well:
registry.RegisterTool(&tools.ReadFileTool{})
registry.RegisterTool(&tools.WriteFileTool{})
registry.RegisterTool(&tools.GitStatusTool{})

This made adding new capabilities straightforward and kept tool implementations isolated.
Being able to save/resume sessions with git snapshots was genuinely useful:
orpheus save fix-auth-bug
# Later...
orpheus resume fix-auth-bug
# Shows: "Changes since session: 3 files modified, 2 new commits"

Automatically capturing git state before each request helped the model make better decisions:
📂 branch: feature/new-auth; ahead: 2; behind: 0
• Modified: internal/auth/handler.go
• Untracked: test_auth.go
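A snapshot like the one above could be derived from the text that `git status --porcelain --branch` prints. The sketch below parses that format from a string; the `Snapshot` type and the `parseStatus`, `parseHeader`, and `count` helpers are hypothetical, and Orpheus may gather this state differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Snapshot is a hypothetical container for parsed workspace state.
type Snapshot struct {
	Branch    string
	Ahead     int
	Behind    int
	Modified  []string
	Untracked []string
}

// parseStatus parses the output of `git status --porcelain --branch`.
func parseStatus(out string) Snapshot {
	var s Snapshot
	for _, line := range strings.Split(out, "\n") {
		switch {
		case strings.HasPrefix(line, "## "):
			parseHeader(strings.TrimPrefix(line, "## "), &s)
		case strings.HasPrefix(line, "?? "):
			s.Untracked = append(s.Untracked, line[3:])
		case len(line) > 3:
			// Any other two-letter status counts as a tracked change here;
			// rename entries ("old -> new") are kept verbatim.
			s.Modified = append(s.Modified, line[3:])
		}
	}
	return s
}

// parseHeader handles headers such as "main...origin/main [ahead 2, behind 1]".
func parseHeader(h string, s *Snapshot) {
	if i := strings.Index(h, " ["); i >= 0 && strings.HasSuffix(h, "]") {
		for _, part := range strings.Split(h[i+2:len(h)-1], ", ") {
			if n, ok := count(part, "ahead "); ok {
				s.Ahead = n
			}
			if n, ok := count(part, "behind "); ok {
				s.Behind = n
			}
		}
		h = h[:i]
	}
	s.Branch, _, _ = strings.Cut(h, "...")
}

// count reads the integer that follows prefix, if part starts with it.
func count(part, prefix string) (int, bool) {
	if !strings.HasPrefix(part, prefix) {
		return 0, false
	}
	n, err := strconv.Atoi(strings.TrimPrefix(part, prefix))
	return n, err == nil
}

func main() {
	// Sample text in the porcelain format; a real caller could obtain it with
	// exec.Command("git", "status", "--porcelain", "--branch").Output().
	sample := "## feature/new-auth...origin/feature/new-auth [ahead 2]\n" +
		" M internal/auth/handler.go\n" +
		"?? test_auth.go\n"
	s := parseStatus(sample)
	fmt.Printf("branch: %s; ahead: %d; behind: %d\n", s.Branch, s.Ahead, s.Behind)
	fmt.Println("modified:", s.Modified, "untracked:", s.Untracked)
}
```

Parsing a string rather than shelling out inside the parser keeps the logic testable without a git repository.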
The Bubble Tea interface with real-time tool execution feedback was satisfying to use and helped with debugging.
The vision of "Claude for coding, Codex for review, Gemini for docs" was elegant in theory but added massive complexity:
- Different API formats to normalize
- Different tool calling conventions
- Coordination overhead between models
- Debugging became much harder
The SQLite + FTS5 pattern learning system was technically interesting but:
- Patterns rarely matched closely enough to be useful
- Storing/recalling added latency
- Most coding tasks are unique enough that pattern reuse doesn't help
Building Orpheus was a valuable learning experience. The codebase reflects exploration: testing hypotheses about what makes coding agents effective.
Key takeaway: The simplest agent that works is usually better than the most sophisticated agent you can build. Start simple, add complexity only when you have evidence it helps.
The most valuable output of this project isn't the code - it's understanding why certain patterns work and others don't.
git clone https://github.com/arpitnath/orpheus-cli.git
cd orpheus-cli
go build -o bin/orpheus ./cmd/orpheus

Copy .env.example to ~/.orpheus/.env:
ANTHROPIC_API_KEY=...
GEMINI_API_KEY=... # optional
OPENAI_API_KEY=... # optional
# Interactive TUI
orpheus ui
# Simple REPL
orpheus interactive
# Session management
orpheus save <name>
orpheus resume <name>
orpheus sessions

- Go 1.20+
- macOS or Linux
- API keys for providers you want to use
MIT - Do whatever you want with this code. Learn from it, fork it, improve it.
This is a learning project. The code reflects experimentation, not best practices.