Merged
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -13,8 +13,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- **F089**: GitHub Copilot agent provider — new `github_copilot` provider implementing `AgentProvider` interface via the `copilot` CLI binary; single-turn execution (`-p <prompt> --output-format=json --silent`), multi-turn conversation support (`--resume=<session-id>`), CLI flag mapping for model (`--model`), mode (`--mode`: interactive/plan/autopilot), effort (`--effort`: low/medium/high), tool permissions (`--allow-tool`, `--deny-tool`, `--allow-all`); JSONL streaming display event parsing; option validation for `mode` and `effort` enums at `awf validate` time; graceful fallback to stateless mode when session ID extraction fails; registered in `RegisterDefaults()` as 6th provider; authenticate via `copilot login` or environment variables (`COPILOT_GITHUB_TOKEN`, `GH_TOKEN`)
- **F088**: Terminal User Interface (TUI) — `awf tui` launches a full-screen Bubble Tea dashboard with five tabs: Workflows (filterable list with launch/validate actions), Monitoring (real-time execution tree with status icons and live log viewport via 200ms tick polling of `ExecutionContext`), History (browse past executions from SQLite with filtering by name/status/date), Agent Conversations (Glamour-rendered Markdown chat view), and External Logs (fsnotify-based live tailing of Claude Code JSONL session files); TUI lives in `internal/interfaces/tui/` as a new interface adapter bridging to existing `WorkflowService`, `ExecutionService`, and `HistoryService` via async `tea.Cmd` factories; secret masking applied to all views; terminal state restored on exit/panic/signal; requires 256-color terminal support (graceful fallback to basic ANSI)
- **F085**: Unified display-event abstraction across all agent providers — replaces per-provider `LineExtractor` function-field with a `DisplayEventParser` returning structured `DisplayEvent` values (discriminated by `EventText` and `EventToolUse` kinds); all 5 providers (Claude, Codex, Gemini, OpenCode, OpenAI-Compatible) now emit events through the same parser contract; single interfaces-layer `RenderEvents` renderer with two display modes: default (text only, byte-equivalent to F082 behaviour) and verbose (text + tool-use markers in `[tool: Name(Arg)]` format); well-known tools (`Read`, `Write`, `Edit`, `Bash`, `Grep`, `Glob`, `Task`) display concise markers with argument truncation (≤ 40 chars); unknown tool names degrade gracefully; parser implementations return plain strings with no ANSI escapes (rendering concerns confined to interfaces layer); `output_format: json` bypasses event parsing entirely for raw passthrough; `DisplayOutput` aggregation on `AgentResult`/`ConversationResult` preserved via text-event concatenation
- **F085**: Unified display-event abstraction across all agent providers — replaces per-provider `LineExtractor` function-field with a `DisplayEventParser` returning structured `DisplayEvent` values (discriminated by `EventText` and `EventToolUse` kinds); all 6 providers (Claude, Codex, Gemini, GitHub Copilot, OpenCode, OpenAI-Compatible) now emit events through the same parser contract; single interfaces-layer `RenderEvents` renderer with two display modes: default (text only, byte-equivalent to F082 behaviour) and verbose (text + tool-use markers in `[tool: Name(Arg)]` format); well-known tools (`Read`, `Write`, `Edit`, `Bash`, `Grep`, `Glob`, `Task`) display concise markers with argument truncation (≤ 40 chars); unknown tool names degrade gracefully; parser implementations return plain strings with no ANSI escapes (rendering concerns confined to interfaces layer); `output_format: json` bypasses event parsing entirely for raw passthrough; `DisplayOutput` aggregation on `AgentResult`/`ConversationResult` preserved via text-event concatenation

### Changed

5 changes: 3 additions & 2 deletions CLAUDE.md
@@ -72,7 +72,7 @@ Use `mcp__plugin_common_serena__write_memory` or `mcp__plugin_common_serena__edi

## Project Overview

**ai-workflow-cli** (`awf`) - A Go CLI tool for orchestrating AI agents (Claude, Gemini, Codex) through YAML-configured workflows with state machine execution.
**ai-workflow-cli** (`awf`) - A Go CLI tool for orchestrating AI agents (Claude, Gemini, Codex, GitHub Copilot) through YAML-configured workflows with state machine execution.

## Architecture

@@ -217,7 +217,6 @@ func TestWorkflowValidation(t *testing.T) {

## Architecture Rules

- Pass optional turn-specific configuration (e.g., system_prompt) through options map in application layer; keeps infrastructure providers independent of turn logic
- Validate agent provider options only against what each CLI actually accepts; do not validate against API documentation if the underlying CLI rejects the option
- Plugin binaries must be discoverable at <plugins_dir>/<plugin_name>/awf-plugin-<plugin_name>; host validates binary existence and version compatibility via gRPC handshake after process start
- Commit generated protobuf files (.pb.go, _grpc.pb.go) to git; treat as source artifacts for build reproducibility, not ephemeral build outputs
@@ -240,6 +239,8 @@ func TestWorkflowValidation(t *testing.T) {
- Wire optional render callbacks alongside event parsers in stream processors; decouples rendering from parsing and enables multiple render modes (DefaultMode, VerboseMode) without modifying parser implementations
- When integrating external UI frameworks, create Bridge adapters in the interface layer that wrap application services; maintain zero infrastructure imports in bridge implementation

- Use provider name prefixes for all infrastructure provider helper methods (buildCopilot, extractCopilot, parseCopilot, validateCopilot) to prevent naming collisions across implementations

## Common Pitfalls

- Use 0o755 for executable scripts, 0o644 for data files, 0o700 for private temp files; match permissions to file purpose and access expectations
8 changes: 4 additions & 4 deletions README.md
@@ -3,17 +3,17 @@
[![Go Version](https://img.shields.io/badge/Go-1.24+-00ADD8?style=flat&logo=go)](https://go.dev/)
[![License: EUPL-1.2](https://img.shields.io/badge/License-EUPL--1.2-blue.svg)](LICENSE)

A Go CLI tool for orchestrating AI agents (Claude, Gemini, Codex, OpenAI-Compatible APIs) through YAML-configured workflows with state machine execution.
A Go CLI tool for orchestrating AI agents (Claude, Gemini, Codex, GitHub Copilot, OpenAI-Compatible APIs) through YAML-configured workflows with state machine execution.

## Features

- **State Machine Execution** - Define workflows as state machines with conditional transitions based on exit codes, command output, or custom expressions
- **Inline Error Handling** - Specify error messages and exit codes directly on steps without creating separate terminal states
- **Agent Steps** - Invoke AI agents via CLI tools (Claude, Codex, Gemini) or direct HTTP (OpenAI, Ollama, vLLM, Groq) with prompt templates, response parsing, and accurate token tracking
- **Output Formatting for Agent Steps** - Automatically strip markdown code fences and validate JSON output; human-readable streaming display controlled by `output_format` field (text vs raw NDJSON); unified display-event abstraction across all 5 providers with optional verbose mode showing tool-use markers (`[tool: Name(Arg)]`)
- **Agent Steps** - Invoke AI agents via CLI tools (Claude, Codex, Gemini, GitHub Copilot) or direct HTTP (OpenAI, Ollama, vLLM, Groq) with prompt templates, response parsing, and accurate token tracking
- **Output Formatting for Agent Steps** - Automatically strip markdown code fences and validate JSON output; human-readable streaming display controlled by `output_format` field (text vs raw NDJSON); unified display-event abstraction across all 6 providers with optional verbose mode showing tool-use markers (`[tool: Name(Arg)]`)
- **External Prompt Files** - Load agent prompts from `.md` files with full template interpolation, helper functions, and local override support
- **External Script Files** - Load commands from external script files with shebang-based interpreter dispatch, template interpolation, path resolution, and local override support
- **Conversation Mode** - Multi-turn conversations with native session resume for CLI providers (`claude`, `codex`, `gemini`, `opencode`), automatic context window management for HTTP providers, mid-conversation context injection via `inject_context` field, and token tracking across all turns
- **Conversation Mode** - Multi-turn conversations with native session resume for CLI providers (`claude`, `codex`, `gemini`, `opencode`, `github_copilot`), automatic context window management for HTTP providers, mid-conversation context injection via `inject_context` field, and token tracking across all turns
- **OpenAI-Compatible Provider** - Use any Chat Completions API (OpenAI, Ollama, vLLM, Groq) with native HTTP integration, accurate token reporting, and no CLI tool required
- **Parallel Execution** - Run multiple steps concurrently with configurable strategies
- **Loop Constructs** - For-each and while loops with full context access
2 changes: 1 addition & 1 deletion docs/README.md
@@ -28,7 +28,7 @@ Learn how to use AWF effectively:

- [Commands](user-guide/commands.md) - All CLI commands and flags
- [Interactive Input Collection](user-guide/interactive-inputs.md) - Automatic prompting for missing workflow inputs
- [Agent Steps](user-guide/agent-steps.md) - Invoke AI agents via CLI (Claude, Codex, Gemini) or HTTP APIs (OpenAI, Ollama, vLLM, Groq)
- [Agent Steps](user-guide/agent-steps.md) - Invoke AI agents via CLI (Claude, Codex, Gemini, GitHub Copilot) or HTTP APIs (OpenAI, Ollama, vLLM, Groq)
- [Output Formatting](user-guide/agent-steps.md#output-formatting) - Automatic code fence stripping and JSON validation (`output_format: json|text`)
- [Streaming Output Display & Tool Markers](user-guide/agent-steps.md#streaming-output-display--tool-markers) - Human-readable filtered output and tool-use markers for `--output streaming` and `--output buffered` modes
- [External Prompt Files](user-guide/agent-steps.md#external-prompt-files) - Load prompts from Markdown files with template interpolation
40 changes: 36 additions & 4 deletions docs/user-guide/agent-steps.md
@@ -2,7 +2,7 @@
title: "Agent Steps Guide"
---

Invoke AI agents (Claude, Codex, Gemini, OpenCode, OpenAI-Compatible) in your workflows with structured prompts and response parsing.
Invoke AI agents (Claude, Codex, Gemini, GitHub Copilot, OpenCode, OpenAI-Compatible) in your workflows with structured prompts and response parsing.

## Overview

@@ -126,6 +126,38 @@ summarize:
- `model`: Gemini model identifier — validated to start with `gemini-` prefix (see [Model Validation](#model-validation) below)
- `dangerously_skip_permissions`: Skip permission prompts (boolean, maps to `--approval-mode=yolo`). **Security warning**: bypasses all safety prompts — use only in trusted, automated environments.

### GitHub Copilot

Requires the `copilot` CLI tool installed and authentication via `copilot login` or environment variables.

```yaml
code_generate:
  type: agent
  provider: github_copilot
  prompt: "Generate a function to: {{.inputs.requirement}}"
  options:
    model: gpt-4o
    mode: interactive
  timeout: 60
  on_success: next
```

**Provider-Specific Options:**
- `model`: GitHub Copilot model identifier (e.g., `gpt-4o`, `gpt-4`, `gpt-3.5-turbo`)
- `mode`: Agent mode — one of `interactive` (default), `plan`, or `autopilot`
- `effort`: Reasoning effort level — one of `low`, `medium`, or `high`
- `allowed_tools`: Comma-separated list of tools to allow (e.g., `"bash,github_api"` → `--allow-tool bash --allow-tool github_api`)
- `denied_tools`: Comma-separated list of tools to deny (maps to `--deny-tool`)
- `allow_all`: Allow all available tools (boolean, maps to `--allow-all`)
- `system_prompt`: Custom system message (passed via prompt prepending)

**Authentication:**
GitHub Copilot CLI supports authentication via:
- `copilot login` (interactive authentication)
- `COPILOT_GITHUB_TOKEN` environment variable
- `GH_TOKEN` environment variable
- `GITHUB_TOKEN` environment variable (classic PATs not supported)
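A pre-flight check for the token variables listed above might look like the following sketch. The lookup order and the `findCopilotToken` helper are assumptions for illustration; the guide does not state which variable takes precedence, and the `copilot` CLI performs its own lookup.

```go
package main

import (
	"fmt"
	"os"
)

// copilotTokenVars lists the documented variables. The order is an assumption.
var copilotTokenVars = []string{"COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN"}

// findCopilotToken (hypothetical helper) returns the name of the first
// variable that is set to a non-empty value. The lookup function is injected
// so the check can be exercised without touching the real environment.
func findCopilotToken(lookup func(string) (string, bool)) (string, bool) {
	for _, name := range copilotTokenVars {
		if v, ok := lookup(name); ok && v != "" {
			return name, true
		}
	}
	return "", false
}

func main() {
	if name, ok := findCopilotToken(os.LookupEnv); ok {
		fmt.Println("token found in", name)
	} else {
		fmt.Println("no token variable set; run `copilot login` instead")
	}
}
```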

### OpenCode

Requires the `opencode` CLI tool installed.
@@ -242,9 +274,9 @@ options:
step validation error: model must start with "gpt-", "codex-", or match o-series pattern (e.g., o1, o3-mini)
```

### OpenCode & OpenAI-Compatible
### GitHub Copilot, OpenCode & OpenAI-Compatible

No model validation for `opencode` or `openai_compatible` providers — these use arbitrary backend models.
No model validation for `github_copilot`, `opencode`, or `openai_compatible` providers — these support arbitrary backend models.

### When Validation Occurs

@@ -699,7 +731,7 @@ Tool markers show:
- **Interleaved order** — markers appear in the same source order as agent output
- **Graceful degradation** — unknown tool names display as-is with no crash or error

This works consistently across all 5 supported providers (Claude, Codex, Gemini, OpenCode, OpenAI-Compatible). Verbose mode has no effect on `output_format: json` — raw NDJSON is always passed through unchanged.
This works consistently across all 6 supported providers (Claude, Codex, Gemini, GitHub Copilot, OpenCode, OpenAI-Compatible). Verbose mode has no effect on `output_format: json` — raw NDJSON is always passed through unchanged.
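To make the marker format concrete, here is a minimal sketch of rendering a `[tool: Name(Arg)]` marker with the argument capped at 40 characters. The `formatToolMarker` helper and the trailing `…` on truncated arguments are assumptions; the actual `RenderEvents` renderer may truncate differently.

```go
package main

import "fmt"

// maxArgLen is the documented upper bound on the displayed argument length.
const maxArgLen = 40

// formatToolMarker (hypothetical helper) renders a tool-use marker. Arguments
// longer than maxArgLen runes are cut so that the result, including the
// ellipsis, is exactly maxArgLen runes. The ellipsis is an assumption.
func formatToolMarker(name, arg string) string {
	r := []rune(arg)
	if len(r) > maxArgLen {
		arg = string(r[:maxArgLen-1]) + "…"
	}
	return fmt.Sprintf("[tool: %s(%s)]", name, arg)
}

func main() {
	fmt.Println(formatToolMarker("Read", "internal/interfaces/tui/model.go"))
	fmt.Println(formatToolMarker("Bash", "go test ./... -run TestWorkflowValidation -count=1 -v"))
}
```

The helper returns a plain string with no ANSI escapes, in line with the rule that parsers stay free of rendering concerns.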

#### Buffered Mode (`--output buffered`)

4 changes: 2 additions & 2 deletions docs/user-guide/conversation-steps.md
@@ -66,7 +66,7 @@ You'll see the agent's first reply, then a `> ` prompt where you can type. When

| Field | Required | Description |
|---|---|---|
| `provider` | Yes | Agent provider (`claude`, `gemini`, `codex`, `opencode`, `openai_compatible`) |
| `provider` | Yes | Agent provider (`claude`, `gemini`, `codex`, `opencode`, `github_copilot`, `openai_compatible`) |
| `mode` | Yes | Must be `conversation` |
| `prompt` | Yes | First user message — sent automatically as turn 1 |
| `system_prompt` | No | System message preserved for the whole session |
@@ -137,7 +137,7 @@ states:
Both steps run as `mode: single` (the default — no `mode:` line needed). There is no interactive loop. Each step runs exactly one agent turn.

- `seed` has `conversation: {}`. This marks the step as session-tracked: AWF calls `provider.ExecuteConversation` (instead of `provider.Execute`), the provider runs one turn, and the session ID returned by the CLI is captured into `state.conversation.session_id`.
- `recall` has `conversation: {continue_from: seed}`. AWF clones the conversation state from `seed` (session ID + turn history) and passes it to the provider, which resumes the session via its native flag (`claude -r <id>`, `gemini --resume <id>`, `codex resume <id>`, `opencode -s <id>`).
- `recall` has `conversation: {continue_from: seed}`. AWF clones the conversation state from `seed` (session ID + turn history) and passes it to the provider, which resumes the session via its native flag (`claude -r <id>`, `gemini --resume <id>`, `codex resume <id>`, `opencode -s <id>`, `copilot --resume=<id>`).
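The per-provider resume flags above can be summarized in a small dispatch function. This is a sketch only: `resumeArgs` is a hypothetical helper, and returning no arguments for an empty session ID generalizes the stateless fallback that the changelog describes for the GitHub Copilot provider.

```go
package main

import "fmt"

// resumeArgs (hypothetical helper) returns the command line that resumes a
// session for the given provider. An empty session ID yields nil so the
// caller can fall back to a stateless run.
func resumeArgs(provider, sessionID string) ([]string, error) {
	if sessionID == "" {
		return nil, nil // no session captured: run without resume
	}
	switch provider {
	case "claude":
		return []string{"claude", "-r", sessionID}, nil
	case "gemini":
		return []string{"gemini", "--resume", sessionID}, nil
	case "codex":
		return []string{"codex", "resume", sessionID}, nil
	case "opencode":
		return []string{"opencode", "-s", sessionID}, nil
	case "github_copilot":
		return []string{"copilot", "--resume=" + sessionID}, nil
	default:
		return nil, fmt.Errorf("provider %q has no native session resume", provider)
	}
}

func main() {
	for _, p := range []string{"claude", "gemini", "codex", "opencode", "github_copilot"} {
		args, err := resumeArgs(p, "abc123")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(args)
	}
}
```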

### Why the Empty `conversation: {}`?
