Built on GSD (Get Shit Done) workflow patterns by glittercowboy. Tiller is a successor system designed for agent-first interaction with explicit state machines, CLI discoverability, and deterministic transitions. See GSD Analysis and Feature Parity Mapping for the evolution story.
Multi-session workflow automation for Claude Code. This repository contains two complementary tools:
- ahoy - Demand-side spec creation and proposal management (
specs/) - tiller - Supply-side execution tracking and state management (
plans/)
Prerequisites:
- Bun runtime (not npm/node)
- Claude Code CLI
Install from source:
# Clone the repository
git clone https://github.com/gulp/tiller
cd tiller
# Install dependencies and build
bun install
bun run build
# Link globally
bun link
# Verify installation
tiller --version
ahoy --versionTiller is a CLI tool that works with Claude Code. The plugin provides hooks to help Claude discover and use tiller commands quickly.
In Claude Code:
- Run
/plugincommand - Navigate to Marketplaces tab
- Click + Add Marketplace
- Enter marketplace source:
https://github.com/gulp/tiller(orgulp/tiller) - Navigate to Discover tab and enable "tiller" and/or "ahoy" plugins
Note: Marketplace plugins support limited hooks (SessionStart, PreCompact only).
For complete hook coverage (PreToolUse, PostToolUse, Stop, etc.):
# Run setup to install hooks
tiller setup claude
# This configures:
# - All tiller hooks (verification, sync, status)
# - Custom startup behaviors
# - Pre-commit validationThis monorepo implements a demand vs supply workflow for AI-assisted development:
Purpose: Explore ideas, draft proposals, answer "Is this worth building?"
- Create unnumbered drafts (
specs/feature-name/) - Write
scope.mdto define problem and desired state - Assign sequential IDs when ready (
ahoy number→specs/0001-feature-name/) - Create
PROPOSAL.mdwhen mature
Purpose: Execute approved work, track state, verify completion
- Multi-initiative support - Organize work into initiatives (products, components, workstreams) with session-scoped focus (
tiller focus) - Plan-based workflow - Organize work into phases and plans with clear objectives
- State machine tracking - Progress through: proposed → approved → ready → active → verifying → complete
- Cross-session persistence - Survive context compaction and session resets
- Verification gates - UAT workflows with pass/fail status and auto-completion
- Git-native sync - JSONL-based state sharing across machines (beads pattern)
Initiatives are flexible organizational units. They can be:
- Separate products (
tiller-cli,auth-service) - Components in a monorepo (
web/ui,web/auth,web/api) - Major workstreams or subsystems (
app/mobile,app/desktop)
The term "initiative" avoids overloading "feature" (which gets confusing in monorepos).
The Boundary: tiller accept is the only crossing point from demand → supply.
# Demand side (ahoy) - Exploration
ahoy draft dark-mode # Create specs/dark-mode/
ahoy discuss dark-mode # Interactive requirements gathering
ahoy research dark-mode # Explore codebase for implementation approach
# ... refine scope.md ...
ahoy number dark-mode # Ready → assign ID: specs/0003-dark-mode/
# Cross the boundary
tiller accept 0003-dark-mode --as-initiative myapp --phases 3
# Supply side (tiller) - Execution
tiller activate 01-01 # Execute first plan (phase 01, plan 01)
tiller verify 01-01 --pass # Verify completion
tiller complete 01-01 # Generate SUMMARY.mdSee ADR-0005: Demand vs Supply for the complete mental model.
Sample conversation with Claude:
You: "I want to add dark mode to my app"
Claude: "Let me create a plan for dark mode..."
[Claude creates plans/myapp/01/01-01-PLAN.md]
You: "tiller start 01-01"
Claude: "Run created in active/executing. Working on dark mode..."
[Claude implements the feature]
"Work complete. Ready for verification."
You: "tiller verify 01-01 --pass"
You: "tiller complete 01-01"
Claude: "Generated SUMMARY.md. Feature complete!"
How it works:
- Planning - Claude creates PLAN.md files with objectives and tasks
- Start -
tiller startcollapses init + approve + activate into one step (creates run inactive/executing) - Execution - Agent works on the plan, state tracked in
.tiller/runs/ - Verification -
tiller verify --passor--failcaptures UAT results - Completion -
tiller completefinalizes and generates SUMMARY.md - Persistence - State survives context resets via
.tiller/runs.jsonl
Granular workflow: For more control, use individual commands:
tiller init(→ ready) →tiller activate(→ active/executing) →tiller verify→tiller complete. See TILLER-DESIGN.md.
For detailed workflow patterns and state machine behavior, see TILLER-DESIGN.md.
Detailed design docs and architecture:
- Design Overview - Architecture and patterns
- CLI Reference - Complete command documentation
- Installation Guide - Setup and configuration
- GSD Analysis - What Tiller took from GSD and why
- GSD Feature Mapping - Command-by-command comparison
- ADRs - Architectural decision records
Note: These commands are designed for Claude to run autonomously. You typically won't run them directly - Claude discovers them via
--helpand executes them based on workflow state.If you do run commands in your terminal, add
--prettyfor human-readable output:tiller status --pretty tiller show 01-01 --prettyBy default, tiller outputs TOON format (structured data for agents).
# Create and explore
ahoy draft <name> # Create unnumbered draft (specs/name/)
ahoy show <draft> # Display draft contents
ahoy discuss <draft> # Gather requirements via interactive questions
ahoy research <draft> # Research how to implement (codebase exploration)
ahoy scope <draft> # Read/write scope.md
# Commit when ready
ahoy number <draft> # Assign sequential ID (specs/0001-name/)
ahoy lock <draft> # Mark as committed (adds .lock suffix)
ahoy status [draft] # Show draft lifecycle state# Accept spec into execution (demand → supply boundary)
tiller accept <spec-ref> --as-initiative <name> --phases <n>
# Multi-initiative management
tiller focus <initiative> # Set working initiative for session
tiller initiative list # Show all initiatives
tiller status # Show current state and next actions (respects focus)
# Run lifecycle (HSM state machine)
tiller start <ref> # Collapsed: init + approve + activate (→ active/executing)
tiller init <plan> # Create run from PLAN.md (→ ready)
tiller activate <ref> # Start execution (ready → active/executing)
tiller pause <ref> # Pause active work (→ active/paused)
tiller resume <ref> # Resume paused work (→ active/executing)
# Verification gates
tiller verify <ref> --pass # Mark verification passed (→ verifying/passed)
tiller verify <ref> --fail # Mark verification failed (→ verifying/failed)
tiller complete <ref> # Finalize run and generate SUMMARY.md (→ complete)
# Cross-machine sync (beads pattern)
tiller sync # Bidirectional: import JSONL → local, export local → JSONL
tiller sync --export # Export only: local runs → runs.jsonlTiller uses:
- HSM state machine - Hierarchical states (active/executing, verifying/testing, etc.) - see State Machine Design
- Two-layer sync - JSON cache + JSONL source of truth (beads pattern)
- TOON output format - Structured data + agent guidance - see TOON Specification
- Agent-first contracts - CLI discoverability, semantic clarity,
--helpas documentation
For complete architecture documentation, see TILLER-DESIGN.md.
stateDiagram-v2
[*] --> proposed: tiller init
proposed --> approved: tiller approve
approved --> ready
ready --> active: tiller start / activate
state active {
[*] --> executing
executing --> paused: tiller pause
paused --> executing: tiller resume
executing --> checkpoint: agent checkpoint
checkpoint --> executing: resolved
}
active --> verifying: tiller verify
state verifying {
[*] --> testing
testing --> passed: --pass
testing --> failed: --fail
failed --> fixing: create FIX-PLAN
fixing --> retesting: fixes complete
retesting --> passed: passes
retesting --> failed: still failing
}
verifying --> active: tiller rework
verifying --> complete: tiller complete
complete --> active: tiller rework
complete --> [*]
active --> abandoned: tiller abandon
verifying --> abandoned: tiller abandon
abandoned --> [*]
Shortcut:
tiller start 01-01collapsesinit → approve → activateinto one command.
See docs/TILLER-DESIGN.md for details.
Built on workflow patterns from GSD (Get Shit Done) by glittercowboy.
What Tiller took from GSD:
- Spec-driven development philosophy (Phase 1 human-led, Phase 2 AI-driven)
- PLAN.md as source of truth
- Multi-phase project structure
- Verification checkpoints and UAT workflow
What Tiller changed:
- Explicit state machine with hierarchical states (not inferred)
- TOON output format for deterministic agent guidance
- Two-layer sync pattern (JSON + JSONL)
- CLI-first design (not slash commands)
See GSD Analysis for detailed breakdown and Feature Parity Mapping for command-by-command comparison.
Additional inspirations:
- beads - CLI fluency and
bd readypattern - agent-browser - Agent-first CLI design
Models are exceptionally well-trained on CLI commands. They're fluent with discoverable patterns: --help text, man pages, file-based state, markdown docs they can grep.
- Execution order is hard to remember - Which command comes first?
/gsd:plan-phaseor/gsd:discuss-phase? - No discoverability - Models must memorize system prompts. No
--helpto reference. - Context-dependent - Commands work differently depending on what phase you're in
- Inference-heavy - Models guess what state you're in and what comes next
- Self-documenting -
tiller --help,tiller status --help, discoverable without prompts - Explicit state -
tiller statusshows exactly where you are and what's next - Deterministic - Commands do the same thing regardless of context
- File-based state - Everything in
.tiller/andplans/that agents can read/grep - TOON output - Every command returns structured data +
agent_hintfor next step
The pattern: When you give agents CLI tools with clear contracts, they become autonomous. No ceremony, just good interfaces.
Note: I'm a UX designer who built this through "vibecoding" - explaining ideas to Claude and iterating on generated code. This entire codebase was created through AI-assisted development, which shaped its design philosophy.
For the last 10 months, I've been exploring how to build tools that agents can use fluently. I tried every framework I could find:
- GSD (Get Shit Done) - Used it extensively, loved the spec-driven philosophy, but needed more explicit state tracking and deterministic transitions. Tiller directly builds on GSD's workflow patterns.
- SpecKit - Taught me about machine-readable contracts, but hard to track what's built vs. what's planned.
- beads - Brilliant for issue tracking (
bd readypattern), but I needed lifecycle gates and approval checkpoints.
See docs/GSD-ANALYSIS.md for detailed analysis of what I took from GSD and why I built Tiller.
The breakthrough came from watching agents interact with tools like beads and agent-browser. When you give models a well-designed CLI with --help and clear contracts, they become fluent without system prompts. No ceremony, just good interfaces.
This led to three core principles:
1. TOON Pattern (Token-Oriented Object Notation)
Instead of making agents guess what to do next, tell them explicitly:
accept:
spec: specs/0001-feature
phases_created: 3
plans[3]{ref,run_id,state}:
06.1,run-abc,ready
06.2,run-def,ready
06.3,run-ghi,ready
agent_hint: "Plans are ready. Next: tiller activate 06.1"
Every command returns structured data + meta-instruction. Agents know exactly what happened and what to do next.
2. State Machine as Interface
Tiller uses hierarchical states (active/executing, verifying/passed) that are semantic for both agents and humans. Every command returns the current state and available transitions. Agents query tiller status, understand where they are, proceed autonomously.
The bd ready pattern from beads: agents ask "what's unblocked?" and get a deterministic answer.
3. File-Based State Visibility
Everything lives in files that agents and humans both see:
plans/directory structure is explorable.tiller/runs.jsonlis git-tracked state (beads pattern)- SUMMARY.md files show what was accomplished
- State changes are visible in your file tree, not buried in logs
Traditional UX reduces friction for humans through clicks, forms, navigation.
Agent UX reduces friction for models - and models are fluent with:
- CLI commands with clear contracts
- File-based state they can read/write
- JSON output they can parse
--helptext and man pages
This isn't about "better tools for humans using AI." It's about designing interfaces that models natively understand.
When you give agents discoverable patterns (CLIs, filesystem structure, markdown docs they can grep), they become autonomous. Tiller is my attempt at this - an experiment in what happens when you design for agent ergonomics, not against them.
This is MIT licensed - do whatever you want with it. Fork it, improve it, break it, make it yours.
I built this through "vibecoding" with Claude and I'm not experienced at maintaining open source projects. If you see something broken or have ideas, feel free to open issues or PRs. No promises on response time, but I'm interested in seeing what people build with it.
The codebase follows some patterns documented in CLAUDE.md if you want to understand the philosophy, but there are no strict rules.
MIT - see LICENSE file