Six Phase Workflow
The workflow engine is the heart of M31 Autonomous (M31A). Every coding task is routed through up to six phases (Initialize, Discuss, Plan, Execute, Verify, Ship), providing structured execution with verification, self-healing, and git integration.
Source: internal/workflow/engine.go
```mermaid
flowchart LR
    I[Initialize] --> D[Discuss]
    D --> P[Plan]
    P --> E[Execute]
    E --> V[Verify]
    V --> S[Ship]
    I -. "Project detection,<br/>context building,<br/>code intelligence" .-> I
    D -. "LLM-based intent<br/>classification +<br/>clarifying questions" .-> D
    P -. "Optional research step +<br/>chunked planning for<br/>complex goals" .-> P
    E -. "Task execution with<br/>tool calls + self-healing" .-> E
    V -. "Build + test +<br/>coverage gates" .-> V
    S -. "Commit + ledger<br/>entry + preflight" .-> S
```
The engine supports four modes that control which phases run:
| Mode | Phases | Use Case |
|---|---|---|
| `auto` (default) | Classifies prompt complexity and adapts | Most tasks |
| `full` | All 6 phases: Init, Discuss, Plan, Execute, Verify, Ship | Complex features |
| `fast` | Skips Plan: Init, Discuss, Execute, Verify, Ship | Simple fixes |
| `direct` | Skips Discuss, Plan, Verify: Init, Execute, Ship | Trivial changes |
Mode is set via config (features.workflow_mode) or the /agent-mode command.
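The mode table can be read as a lookup from mode to phase list. The sketch below is illustrative only: `PhasesFor` and the `Phase` type are hypothetical names, not taken from `engine.go`, and `auto` is assumed to be resolved to one of the concrete modes before this lookup happens.

```go
package main

import "fmt"

// Phase is a hypothetical type; the names follow the mode table.
type Phase string

const (
	Init    Phase = "Initialize"
	Discuss Phase = "Discuss"
	Plan    Phase = "Plan"
	Execute Phase = "Execute"
	Verify  Phase = "Verify"
	Ship    Phase = "Ship"
)

// PhasesFor returns the phases a concrete mode runs, in order.
// "auto" is assumed to be resolved by intent classification first,
// so it is reported as an error here.
func PhasesFor(mode string) ([]Phase, error) {
	switch mode {
	case "full":
		return []Phase{Init, Discuss, Plan, Execute, Verify, Ship}, nil
	case "fast":
		return []Phase{Init, Discuss, Execute, Verify, Ship}, nil
	case "direct":
		return []Phase{Init, Execute, Ship}, nil
	default:
		return nil, fmt.Errorf("unknown or unresolved workflow mode %q", mode)
	}
}

func main() {
	for _, m := range []string{"full", "fast", "direct"} {
		phases, _ := PhasesFor(m)
		fmt.Println(m, phases)
	}
}
```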
Source: internal/workflow/initialize.go
- Detects project type (Go, Node.js, Rust, Python, Ruby) via marker files
- Builds project context: framework, structure, dependencies
- Records the current git commit hash as a rollback baseline
- Runs code intelligence indexing (`internal/codeintel/`)
- Stores project state in the session
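Marker-file detection can be sketched as a lookup over the project root listing. The specific marker file names below (`go.mod`, `package.json`, and so on) are conventional choices assumed for illustration; the actual list in `initialize.go` may differ, and `DetectProjectType` is a hypothetical name.

```go
package main

import "fmt"

// markerFiles pairs a marker file with a project type. The file names are
// assumptions for illustration, not read from initialize.go.
var markerFiles = []struct{ file, projectType string }{
	{"go.mod", "Go"},
	{"package.json", "Node.js"},
	{"Cargo.toml", "Rust"},
	{"pyproject.toml", "Python"},
	{"Gemfile", "Ruby"},
}

// DetectProjectType returns the first project type whose marker file
// appears in the root directory listing, or "unknown" if none match.
func DetectProjectType(rootFiles []string) string {
	present := make(map[string]bool, len(rootFiles))
	for _, f := range rootFiles {
		present[f] = true
	}
	for _, m := range markerFiles {
		if present[m.file] {
			return m.projectType
		}
	}
	return "unknown"
}

func main() {
	fmt.Println(DetectProjectType([]string{"README.md", "Cargo.toml"}))
}
```

Checking markers in a fixed order makes the result deterministic for mixed repositories, at the cost of always preferring the earlier entry.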
Source: internal/workflow/discuss.go
- LLM generates clarifying questions about the task
- Questions are presented to the user in the TUI
- User answers are collected and saved to the project state
- Supports Q&A timeout (configurable, default 300s)
- Can be skipped with empty answers
The Discuss phase uses streaming to present questions in real time. The TUI renders each question as it arrives and collects answers via interactive prompts.
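The Q&A timeout can be sketched with a `select` over an answer channel and a timer. `CollectAnswer` is a hypothetical helper, and treating a timeout the same way as an empty answer (skip) is an assumption, not confirmed behavior of `discuss.go`.

```go
package main

import (
	"fmt"
	"time"
)

// CollectAnswer waits for one answer or gives up after timeout.
// An empty answer is treated as a skip; a timeout is assumed to
// behave the same way.
func CollectAnswer(answers <-chan string, timeout time.Duration) (answer string, skipped bool) {
	select {
	case a := <-answers:
		return a, a == ""
	case <-time.After(timeout):
		return "", true
	}
}

func main() {
	ch := make(chan string, 1)
	ch <- "use PostgreSQL"
	// 300s mirrors the documented default timeout.
	a, skipped := CollectAnswer(ch, 300*time.Second)
	fmt.Println(a, skipped)
}
```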
Source: internal/workflow/intent.go
Before entering Discuss, the engine classifies the user's input intent using an LLM:
- Uses a 5-second timeout to avoid blocking
- Falls back to keyword-based `ClassifyPrompt()` on failure
- Returns an `IntentResult` with classified complexity and routing hints
- Determines whether to enter full, fast, or direct workflow mode
```mermaid
flowchart TD
    Input[User Input] --> IC{Intent Classification}
    IC -->|Complex| Full[Full Workflow<br/>Init→Discuss→Plan→Execute→Verify→Ship]
    IC -->|Moderate| Fast[Fast Workflow<br/>Init→Discuss→Execute→Verify→Ship]
    IC -->|Simple| Direct[Direct Workflow<br/>Init→Execute→Ship]
    IC -->|Timeout/Error| KB[Keyword Fallback]
    KB --> Full
    KB --> Fast
    KB --> Direct
```
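The timeout-then-fallback pattern can be sketched with `context.WithTimeout`. Everything named here is hypothetical: the `Classifier` signature stands in for the LLM call, and `keywordFallback` is a toy heuristic rather than the real `ClassifyPrompt()`. The demo uses a short timeout instead of the documented 5 seconds so it finishes quickly.

```go
package main

import (
	"context"
	"fmt"
	"strings"
	"time"
)

type Complexity string

const (
	Simple   Complexity = "simple"
	Moderate Complexity = "moderate"
	Complex  Complexity = "complex"
)

// Classifier stands in for the LLM call; its signature is an assumption.
type Classifier func(ctx context.Context, prompt string) (Complexity, error)

// keywordFallback is a toy heuristic, not the real ClassifyPrompt().
func keywordFallback(prompt string) Complexity {
	p := strings.ToLower(prompt)
	switch {
	case strings.Contains(p, "refactor") || strings.Contains(p, "implement"):
		return Complex
	case strings.Contains(p, "fix"):
		return Moderate
	default:
		return Simple
	}
}

// ClassifyIntent tries the LLM under a deadline and falls back to
// keywords on any error, including a timeout.
func ClassifyIntent(llm Classifier, prompt string, timeout time.Duration) Complexity {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	c, err := llm(ctx, prompt)
	if err != nil {
		return keywordFallback(prompt)
	}
	return c
}

// neverAnswers simulates an LLM that does not respond before the deadline.
func neverAnswers(ctx context.Context, _ string) (Complexity, error) {
	<-ctx.Done()
	return "", ctx.Err()
}

func main() {
	fmt.Println(ClassifyIntent(neverAnswers, "fix the login bug", 50*time.Millisecond))
}
```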
Source: internal/workflow/research.go
For complex goals, the engine runs a pre-plan research step:
- Investigates the codebase structure and existing patterns
- Identifies risks and potential challenges
- Surfaces relevant files and dependencies
- Output is injected into the plan context for better planning
Source: internal/workflow/plan.go
- LLM generates a structured implementation plan in markdown
- Plan includes: task breakdown, file predictions, acceptance criteria
- Plans are parsed into `Task` structs via `internal/workflow/plan_parser.go`
- Supports refinement: user feedback triggers plan regeneration (up to `MaxPlanRefinements = 5`)
- Plan content is saved to `STATE.md` in the session directory
Source: internal/workflow/plan_chunk.go
For complex goals with many files, the engine uses chunked planning:
- Outline generation: creates a high-level `PlanOutline` with waves and task stubs
- Wave expansion: each wave is fleshed out into full tasks with acceptance criteria
- Incremental execution: waves execute sequentially, allowing later waves to benefit from earlier results
```mermaid
flowchart LR
    Goal[Complex Goal] --> Outline[Generate Outline]
    Outline --> W1[Wave 1]
    Outline --> W2[Wave 2]
    Outline --> W3[Wave 3]
    W1 --> E1[Execute Wave 1]
    E1 --> W2E[Expand Wave 2]
    W2E --> E2[Execute Wave 2]
    E2 --> W3E[Expand Wave 3]
    W3E --> E3[Execute Wave 3]
```
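The expand-then-execute loop can be sketched as follows. `Wave`, `PlanOutline`, and `RunWaves` are simplified stand-ins with assumed field names; in the real engine the expansion step would call the LLM, which is replaced here by a plain function argument.

```go
package main

import "fmt"

// Wave and PlanOutline are simplified stand-ins; field names are assumptions.
type Wave struct {
	Name  string
	Stubs []string // task stubs produced by outline generation
}

type PlanOutline struct{ Waves []Wave }

// RunWaves expands and executes one wave at a time. Results from earlier
// waves are passed to expand so later waves can build on them.
func RunWaves(
	o PlanOutline,
	expand func(w Wave, prior []string) []string,
	execute func(task string) string,
) []string {
	var results []string
	for _, w := range o.Waves {
		tasks := expand(w, results)
		for _, t := range tasks {
			results = append(results, execute(t))
		}
	}
	return results
}

func main() {
	outline := PlanOutline{Waves: []Wave{
		{Name: "schema", Stubs: []string{"add users table"}},
		{Name: "api", Stubs: []string{"add /users endpoint"}},
	}}
	expand := func(w Wave, prior []string) []string {
		fmt.Printf("expanding %s with %d prior results\n", w.Name, len(prior))
		return w.Stubs // a real expansion would call the LLM here
	}
	execute := func(task string) string { return "done: " + task }
	fmt.Println(RunWaves(outline, expand, execute))
}
```

Deferring expansion until the previous wave has finished is what lets later waves see earlier results; expanding every wave up front would lose that.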
Chunking is triggered when:
- Goal is classified as `complex`
- Project has 30+ source files
- Threshold: 10+ tasks (configurable via `features.plan_chunk_threshold`)
Plan format (from prompts/plan-format.md):
- Task ID, description, action, category
- File predictions with create/modify/delete actions
- Dependencies between tasks
- Acceptance criteria per task
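As a rough picture of what a parsed plan entry might hold, the sketch below models the listed fields as Go structs. The field names and the `Validate` method are assumptions for illustration; the actual `Task` struct produced by `plan_parser.go` is not shown in this document.

```go
package main

import "fmt"

// FilePrediction names a file the task is expected to touch.
type FilePrediction struct {
	Path   string
	Action string // "create", "modify", or "delete"
}

// Task mirrors the plan format fields listed above; names are assumptions.
type Task struct {
	ID                 string
	Description        string
	Action             string
	Category           string
	Files              []FilePrediction
	DependsOn          []string // IDs of tasks that must finish first
	AcceptanceCriteria []string
}

// Validate rejects file predictions whose action is not one of the
// three listed in the plan format.
func (t Task) Validate() error {
	for _, f := range t.Files {
		switch f.Action {
		case "create", "modify", "delete":
		default:
			return fmt.Errorf("task %s: unknown file action %q for %s", t.ID, f.Action, f.Path)
		}
	}
	return nil
}

func main() {
	t := Task{
		ID:                 "T1",
		Description:        "Add health endpoint",
		Files:              []FilePrediction{{Path: "internal/api/health.go", Action: "create"}},
		AcceptanceCriteria: []string{"handler returns 200"},
	}
	fmt.Println(t.Validate())
}
```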
Source: internal/workflow/execute.go
- Tasks are executed in dependency order
- For each task, the LLM issues tool calls (Bash, FileRead, FileWrite, Edit, etc.) that the engine executes
- Tool results are fed back to the LLM for iterative refinement
- Self-healing: failed tasks are retried up to `MaxHealAttempts` (2) times
- Task runner supports parallel execution within independent groups (up to `DefaultMaxParallelTasks = 4`)
- Each completed task creates a git commit
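Dependency ordering with bounded parallelism can be sketched in two steps: split tasks into groups whose members do not depend on each other, then run each group behind a semaphore. `Groups` and `RunLevel` are hypothetical helpers, and how the real task runner forms its independent groups is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

const maxParallel = 4 // mirrors DefaultMaxParallelTasks

// Groups splits tasks into levels: every task in a level depends only on
// tasks in earlier levels. deps maps a task ID to the IDs it depends on.
func Groups(deps map[string][]string) ([][]string, error) {
	done := map[string]bool{}
	var levels [][]string
	for len(done) < len(deps) {
		var level []string
		for id, ds := range deps {
			if done[id] {
				continue
			}
			ready := true
			for _, d := range ds {
				if !done[d] {
					ready = false
					break
				}
			}
			if ready {
				level = append(level, id)
			}
		}
		if len(level) == 0 {
			return nil, fmt.Errorf("dependency cycle or unknown dependency")
		}
		sort.Strings(level) // deterministic order for display
		for _, id := range level {
			done[id] = true
		}
		levels = append(levels, level)
	}
	return levels, nil
}

// RunLevel runs one level with at most maxParallel tasks in flight.
func RunLevel(level []string, run func(id string)) {
	sem := make(chan struct{}, maxParallel)
	var wg sync.WaitGroup
	for _, id := range level {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxParallel tasks are running
		go func(id string) {
			defer wg.Done()
			defer func() { <-sem }()
			run(id)
		}(id)
	}
	wg.Wait()
}

func main() {
	deps := map[string][]string{"t1": nil, "t2": {"t1"}, "t3": {"t1"}, "t4": {"t2", "t3"}}
	levels, err := Groups(deps)
	if err != nil {
		fmt.Println(err)
		return
	}
	for i, level := range levels {
		RunLevel(level, func(id string) { fmt.Println("group", i, "ran", id) })
	}
}
```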
The execute phase uses streamLLMWithTools() which sends tool definitions to the LLM and parses native tool_call chunks from the SSE stream. Tool calls are accumulated by index and finalized into structured ToolCall objects.
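Accumulating streamed tool calls by index can be sketched as below. The `ToolCallChunk` shape is an assumption modeled on common streaming chat APIs, where the call ID and name arrive once and the JSON arguments arrive in fragments; the actual types used by `streamLLMWithTools()` are not shown here.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ToolCallChunk is an assumed shape for one streamed delta.
type ToolCallChunk struct {
	Index    int
	ID       string // typically present only on the first chunk of a call
	Name     string // typically present only on the first chunk of a call
	ArgsPart string // fragment of the JSON arguments
}

// ToolCall is the finalized call; field names are assumptions.
type ToolCall struct {
	ID, Name, Arguments string
}

// Accumulate merges chunks that share an index and returns the
// finalized calls in index order.
func Accumulate(chunks []ToolCallChunk) []ToolCall {
	type partial struct {
		id, name string
		args     strings.Builder
	}
	parts := map[int]*partial{}
	for _, c := range chunks {
		p, ok := parts[c.Index]
		if !ok {
			p = &partial{}
			parts[c.Index] = p
		}
		if c.ID != "" {
			p.id = c.ID
		}
		if c.Name != "" {
			p.name = c.Name
		}
		p.args.WriteString(c.ArgsPart)
	}
	idx := make([]int, 0, len(parts))
	for i := range parts {
		idx = append(idx, i)
	}
	sort.Ints(idx)
	calls := make([]ToolCall, 0, len(idx))
	for _, i := range idx {
		p := parts[i]
		calls = append(calls, ToolCall{ID: p.id, Name: p.name, Arguments: p.args.String()})
	}
	return calls
}

func main() {
	calls := Accumulate([]ToolCallChunk{
		{Index: 0, ID: "c1", Name: "Bash", ArgsPart: `{"cmd":`},
		{Index: 1, ID: "c2", Name: "FileRead", ArgsPart: `{"path":"a.go"}`},
		{Index: 0, ArgsPart: `"ls"}`},
	})
	for _, c := range calls {
		fmt.Println(c.ID, c.Name, c.Arguments)
	}
}
```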
Source: internal/workflow/verify.go
- Runs build commands (auto-detected or configured)
- Runs test suites
- Checks file existence for predicted files
- Validates syntax of modified files
- Reports per-task pass/fail status
- Failed tasks can trigger self-healing loops
Source: internal/workflow/coverage_gates.go
Verification includes coverage gate checks:
- Validates that acceptance criteria are met (grep-verifiable)
- Checks file creation/modification predictions
- Runs quality checks via `internal/workflow/execute_quality.go`
- Generates a structured verification report via `internal/workflow/verify_report.go`
Verification commands are auto-detected based on project type:
| Project Type | Build Command | Test Command |
|---|---|---|
| Go | `go build ./...` | `go test ./...` |
| Node.js | `npm run build` | `npm test` |
| Rust | `cargo build` | `cargo test` |
| Python | `python -m build` | `pytest` |
Custom commands can be set in the `[verify]` config section.
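The auto-detected defaults combined with custom overrides can be sketched as a map lookup. `Commands` and `ResolveCommands` are hypothetical names, and the rule that any non-empty custom value replaces the default is an assumption; how `[verify]` is actually parsed is not covered here.

```go
package main

import "fmt"

// Commands holds the build and test command for one project.
type Commands struct {
	Build string
	Test  string
}

// defaultCommands copies the table of auto-detected commands.
var defaultCommands = map[string]Commands{
	"Go":      {Build: "go build ./...", Test: "go test ./..."},
	"Node.js": {Build: "npm run build", Test: "npm test"},
	"Rust":    {Build: "cargo build", Test: "cargo test"},
	"Python":  {Build: "python -m build", Test: "pytest"},
}

// ResolveCommands starts from the defaults for the project type and lets
// any non-empty custom value win (an assumed override rule).
func ResolveCommands(projectType string, custom Commands) (Commands, error) {
	cmds, ok := defaultCommands[projectType]
	if custom.Build != "" {
		cmds.Build = custom.Build
	}
	if custom.Test != "" {
		cmds.Test = custom.Test
	}
	if !ok && (cmds.Build == "" || cmds.Test == "") {
		return Commands{}, fmt.Errorf("no verify commands for project type %q", projectType)
	}
	return cmds, nil
}

func main() {
	cmds, err := ResolveCommands("Rust", Commands{Test: "cargo test --all"})
	fmt.Println(cmds.Build, "|", cmds.Test, "|", err)
}
```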
Source: internal/workflow/ship.go
- Runs preflight checks via `internal/workflow/ship_preflight.go`
- Creates final git commit with all verified changes
- Generates structured commit message from the plan
- Appends a ledger entry with session metrics
- Uses platform-specific file locking (Unix: `flock`, Windows: direct write)
- Computes diff statistics
- Reports commit count and changed files
Transitions are validated via an explicit state machine:
```mermaid
stateDiagram-v2
    [*] --> Idle
    Idle --> Initialize
    Initialize --> Discuss
    Initialize --> Execute
    Initialize --> Idle
    Discuss --> Plan
    Discuss --> Execute
    Discuss --> Idle
    Plan --> Execute
    Plan --> Plan
    Plan --> Discuss
    Plan --> Idle
    Execute --> Verify
    Execute --> Ship
    Execute --> Idle
    Verify --> Ship
    Verify --> Execute
    Verify --> Idle
    Ship --> Idle
    Idle --> [*]
```
Plan-to-Discuss cycles are capped at 3 (maxDiscussPlanCycles) to prevent infinite oscillation. The counter resets when the workflow moves to Execute, Ship, or Idle.
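A transition table plus a cycle counter is one way to express these rules. The sketch below copies the edges from the state diagram; the `Machine` type and the exact point at which the Plan-to-Discuss counter is incremented and checked are assumptions.

```go
package main

import "fmt"

type State string

const (
	Idle       State = "Idle"
	Initialize State = "Initialize"
	Discuss    State = "Discuss"
	Plan       State = "Plan"
	Execute    State = "Execute"
	Verify     State = "Verify"
	Ship       State = "Ship"
)

// allowed lists the edges from the state diagram.
var allowed = map[State][]State{
	Idle:       {Initialize},
	Initialize: {Discuss, Execute, Idle},
	Discuss:    {Plan, Execute, Idle},
	Plan:       {Execute, Plan, Discuss, Idle},
	Execute:    {Verify, Ship, Idle},
	Verify:     {Ship, Execute, Idle},
	Ship:       {Idle},
}

const maxDiscussPlanCycles = 3

// Machine is a hypothetical wrapper around the current state.
type Machine struct {
	state  State
	cycles int // Plan -> Discuss transitions since the last reset
}

// Transition validates the edge, enforces the cycle cap, and resets the
// counter when the workflow moves to Execute, Ship, or Idle.
func (m *Machine) Transition(to State) error {
	valid := false
	for _, s := range allowed[m.state] {
		if s == to {
			valid = true
			break
		}
	}
	if !valid {
		return fmt.Errorf("invalid transition %s -> %s", m.state, to)
	}
	if m.state == Plan && to == Discuss {
		if m.cycles >= maxDiscussPlanCycles {
			return fmt.Errorf("plan/discuss cycle cap (%d) reached", maxDiscussPlanCycles)
		}
		m.cycles++
	}
	if to == Execute || to == Ship || to == Idle {
		m.cycles = 0
	}
	m.state = to
	return nil
}

func main() {
	m := &Machine{state: Idle}
	for _, s := range []State{Initialize, Discuss, Plan, Execute, Verify, Ship, Idle} {
		fmt.Println(s, m.Transition(s))
	}
}
```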
All prompts are embedded markdown files loaded at engine creation:
| File | Purpose |
|---|---|
| `prompts/base.md` | Base system prompt (cached for session lifetime) |
| `prompts/tool-use.md` | Tool usage instructions for the LLM |
| `prompts/plan-format.md` | Plan output format specification |
| `prompts/execute-task.md` | Task execution instructions |
| `prompts/discuss-questions.md` | Discuss phase question format |
| `prompts/self-heal.md` | Self-healing instructions for failed tasks |
| `prompts/demonstration-format.md` | Demonstration output format |
| `prompts/autonomous.md` | Autonomous mode instructions |
| `prompts/context-awareness.md` | Context awareness instructions |
| `prompts/code-quality.md` | Code quality standards |
| `prompts/code-intelligence.md` | Code intelligence integration |
Source: internal/workflow/prompts/
Each workflow phase can use a different model, checked in priority order:
- Per-phase overrides set interactively via the TUI model picker (`SetPhaseModel`)
- AgentsConfig from `config.toml` (`[agents]` section)
- Global default (`cfg.Agents.Default`)
- Engine's active model (fallback)
This allows, for example, using a cheap model for Discuss and Plan, and a powerful model for Execute and Verify.
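The priority order can be sketched as a chain of lookups that returns the first non-empty value. `ResolveModel` and its parameters are hypothetical; the model names in the demo are placeholders, not real configuration values.

```go
package main

import "fmt"

// ResolveModel returns the model for a phase using the documented
// priority: interactive override, [agents] entry, global default,
// then the engine's active model.
func ResolveModel(phase string, overrides, agents map[string]string, globalDefault, active string) string {
	if m := overrides[phase]; m != "" {
		return m
	}
	if m := agents[phase]; m != "" {
		return m
	}
	if globalDefault != "" {
		return globalDefault
	}
	return active
}

func main() {
	overrides := map[string]string{"execute": "big-model"} // placeholder names
	agents := map[string]string{"discuss": "small-model", "execute": "mid-model"}

	for _, phase := range []string{"discuss", "plan", "execute"} {
		fmt.Println(phase, "->", ResolveModel(phase, overrides, agents, "", "active-model"))
	}
}
```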
When features.budget_limit_usd is set, the engine tracks cumulative cost across all phases using atomic operations. Each phase checks the budget before executing:
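```go
// Fragment: the enclosing function is assumed to return an error,
// and the standard "errors" package must be imported.
if cost >= cfg.Features.BudgetLimitUSD {
	return errors.New("budget limit exceeded")
}
```

Cost is accumulated from the `Cost` field of each `PhaseResult`, stored as bits in a `uint64` for lock-free atomic updates.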
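Storing a `float64` as bits in a `uint64` is usually paired with a compare-and-swap loop. The sketch below shows that pattern under stated assumptions: `CostTracker` and its methods are hypothetical names, a limit of zero is taken to mean "not set", and `atomic.Uint64` requires Go 1.19 or later.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sync"
	"sync/atomic"
)

var ErrBudgetExceeded = errors.New("budget limit exceeded")

// CostTracker keeps a float64 total as raw bits in a uint64 so it can be
// updated without a mutex.
type CostTracker struct{ bits atomic.Uint64 }

// Add retries the compare-and-swap until no other goroutine has changed
// the total between the load and the store.
func (c *CostTracker) Add(delta float64) {
	for {
		old := c.bits.Load()
		next := math.Float64bits(math.Float64frombits(old) + delta)
		if c.bits.CompareAndSwap(old, next) {
			return
		}
	}
}

// Total returns the accumulated cost.
func (c *CostTracker) Total() float64 { return math.Float64frombits(c.bits.Load()) }

// Check reports ErrBudgetExceeded once the total reaches the limit.
// A limit of zero is assumed to mean that no budget is set.
func (c *CostTracker) Check(limit float64) error {
	if limit > 0 && c.Total() >= limit {
		return ErrBudgetExceeded
	}
	return nil
}

func main() {
	var tracker CostTracker
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			tracker.Add(0.25) // pretend each phase cost $0.25
		}()
	}
	wg.Wait()
	fmt.Println(tracker.Total(), tracker.Check(1.50))
}
```

A plain atomic add is not available for floating-point values, which is why the total is round-tripped through `math.Float64bits` and retried on contention.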
When a task fails verification, the engine can attempt self-healing:
- The failed task's error output is sent back to the LLM
- The LLM generates a fix using tool calls
- The fix is verified again
- Up to `MaxHealAttempts` (2) retries before marking as `StatusUnrecoverable`
Self-healing is triggered automatically during the Execute phase and can also be triggered manually via `HealTask(taskID)`.
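The retry loop can be sketched as follows. `Heal`, `StatusHealed`, and the callback signatures are hypothetical; only `MaxHealAttempts` and `StatusUnrecoverable` are names taken from the text above, and feeding the latest error output into the next attempt is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

const MaxHealAttempts = 2

type Status string

const (
	StatusHealed        Status = "healed" // hypothetical name
	StatusUnrecoverable Status = "unrecoverable"
)

var errStillFailing = errors.New("still failing")

// Heal asks fix (standing in for the LLM) to repair the task, then
// re-verifies. The most recent error output is passed to the next attempt.
// It returns the final status and the number of attempts used.
func Heal(errOutput string, fix func(errOutput string) error, verify func() error) (Status, int) {
	for attempt := 1; attempt <= MaxHealAttempts; attempt++ {
		if err := fix(errOutput); err != nil {
			errOutput = err.Error()
			continue
		}
		if err := verify(); err != nil {
			errOutput = err.Error()
			continue
		}
		return StatusHealed, attempt
	}
	return StatusUnrecoverable, MaxHealAttempts
}

func main() {
	calls := 0
	verify := func() error {
		calls++
		if calls < 2 {
			return errStillFailing // first re-verification fails
		}
		return nil
	}
	status, attempts := Heal("undefined: foo", func(string) error { return nil }, verify)
	fmt.Println(status, attempts)
}
```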