feat: add harness executor for coding agent CLI orchestration#1978
feat: add harness executor for coding agent CLI orchestration#1978
Conversation
Add a new `type: harness` step executor that spawns coding agent CLIs (Claude Code, Codex, OpenCode, Pi) as subprocess workflow steps. Each provider maps a unified config (model, effort, max_turns, output_format) to its CLI-specific flags, with provider-specific options and an extra_flags escape hatch for forward compatibility.
Rewrite codingagent.md to use the harness executor instead of bare shell scripts. Add harness section to executors.md reference.
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughAdds a new "harness" executor to run external coding-agent CLI binaries as subprocesses, a provider registry with built-in providers (claude, codex, copilot, opencode, pi), schema/spec support and DAG-level harness config, and comprehensive tests and docs for harness behavior and merging semantics. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant DAG as DAG Step
participant Executor as Harness Executor
participant Registry as Provider Registry
participant Provider as Provider Impl
participant CLI as CLI Binary
User->>DAG: trigger step (type: harness)
DAG->>Executor: Run(ctx, step)
Executor->>Executor: decode/merge harness config
Executor->>Registry: getProvider(providerName)
Registry-->>Executor: Provider instance
Executor->>Executor: resolve binary path (PATH / workdir)
Executor->>Provider: BaseArgs(prompt)
Provider-->>Executor: CLI args
Executor->>CLI: start subprocess (args, env, cwd, stdin)
CLI-->>Executor: stdout / stderr streams
Executor->>Executor: spool stdout, tail stderr
Executor->>CLI: wait for completion or ctx cancel
alt ctx cancelled
Executor->>CLI: kill process group
CLI-->>Executor: terminated (exit 124)
end
Executor-->>DAG: result (stdout file/stream, exit code, stderr tail)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
skills/dagu/references/codingagent.md (1)
55-68: Clarify the dual use ofcommand:andscript:in harness steps.The example shows both
command:andscript:fields. While line 88 explains the behavior, the example at lines 58-66 could be confusing since thescript:content references${RESEARCH}which would be interpolated before being piped to stdin. Consider adding an inline comment in the YAML example clarifying this interaction.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/dagu/references/codingagent.md` around lines 55 - 68, The YAML example for the harness step named "review" is confusing because it shows both the script: block (containing ${RESEARCH}) and a separate command: field without clarifying which is used; update the example by adding a short inline comment next to the script or command explaining that when both are present the script is interpolated (variables like ${RESEARCH} are expanded) and its output is piped to the command's stdin, so the command ("Review the research...") reads the interpolated script from stdin. Reference the "review" step, the script: block, the command: field, and the ${RESEARCH} variable in the comment so readers understand the interaction.internal/runtime/builtin/harness/provider.go (1)
27-33: Consider generating the supported provider list dynamically.The error message hardcodes
"claude, codex, opencode, pi"which could become stale if providers are added or removed. Consider generating this list from the registry keys.♻️ Proposed refactor to generate provider list dynamically
func getProvider(name string) (Provider, error) { p, ok := providers[name] if !ok { - return nil, fmt.Errorf("harness: unknown provider %q; supported: claude, codex, opencode, pi", name) + supported := make([]string, 0, len(providers)) + for k := range providers { + supported = append(supported, k) + } + return nil, fmt.Errorf("harness: unknown provider %q; supported: %v", name, supported) } return p, nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runtime/builtin/harness/provider.go` around lines 27 - 33, The error in getProvider hardcodes the supported provider names; instead, build the supported list dynamically from the providers registry: iterate over the map keys of providers, collect them into a slice, sort if deterministic output is desired, join with commas (e.g., strings.Join) and use that generated string in the fmt.Errorf so the message always reflects the actual entries in providers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/runtime/builtin/harness/harness.go`:
- Around line 215-217: The current validation only checks that step.Commands is
non-empty but allows commands with empty prompts to slip through; update the
validation in the harness creation path (the block that currently checks if
len(step.Commands) == 0) to also iterate over step.Commands and reject any
command whose prompt/text field is empty (the same field that newHarness later
expects), returning a clear error (e.g., "harness: command prompt is required")
referencing the step.Commands entries; ensure this check runs before calling
newHarness so invalid steps are caught early.
In `@internal/runtime/builtin/harness/pi.go`:
- Around line 35-40: The current argument assembly can emit both cfg.Tools and
cfg.NoTools, producing conflicting flags; update the logic in the pi.go
argument-building code so the flags are mutually exclusive: check cfg.NoTools
first and if true only append "--no-tools", otherwise if cfg.Tools is non-empty
append "--tools" with its value (i.e., ensure you do not append "--tools" when
cfg.NoTools is set), making the decision around the same args slice where
cfg.Tools and cfg.NoTools are currently handled.
In `@skills/dagu/references/codingagent.md`:
- Around line 228-235: The Codex model names in the "Model tiers" table are
incorrect; update the Codex column entries (currently 'gpt-5.1-codex-mini',
'gpt-5.4', 'gpt-5.3-codex') to real identifiers (e.g., cheap ->
'codex-mini-latest' or 'GPT-5.1-Codex-Max'; balanced -> 'gpt-5-codex' or
'GPT-5.2-Codex'; most capable -> 'GPT-5.3-Codex') or explicitly mark them as
placeholders, ensuring the table in skills/dagu/references/codingagent.md and
the "Model tiers" section uses accurate, correctly-cased model names.
---
Nitpick comments:
In `@internal/runtime/builtin/harness/provider.go`:
- Around line 27-33: The error in getProvider hardcodes the supported provider
names; instead, build the supported list dynamically from the providers
registry: iterate over the map keys of providers, collect them into a slice,
sort if deterministic output is desired, join with commas (e.g., strings.Join)
and use that generated string in the fmt.Errorf so the message always reflects
the actual entries in providers.
In `@skills/dagu/references/codingagent.md`:
- Around line 55-68: The YAML example for the harness step named "review" is
confusing because it shows both the script: block (containing ${RESEARCH}) and a
separate command: field without clarifying which is used; update the example by
adding a short inline comment next to the script or command explaining that when
both are present the script is interpolated (variables like ${RESEARCH} are
expanded) and its output is piped to the command's stdin, so the command
("Review the research...") reads the interpolated script from stdin. Reference
the "review" step, the script: block, the command: field, and the ${RESEARCH}
variable in the comment so readers understand the interaction.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0db391f9-5ce8-4972-b13e-276dc0146671
📒 Files selected for processing (11)
internal/runtime/builtin/builtin.gointernal/runtime/builtin/harness/claude.gointernal/runtime/builtin/harness/codex.gointernal/runtime/builtin/harness/harness.gointernal/runtime/builtin/harness/harness_test.gointernal/runtime/builtin/harness/opencode.gointernal/runtime/builtin/harness/pi.gointernal/runtime/builtin/harness/provider.gointernal/runtime/builtin/harness/schema.goskills/dagu/references/codingagent.mdskills/dagu/references/executors.md
- Strengthen validation to reject empty prompts, not just empty command lists (harness.go) - Make --tools and --no-tools mutually exclusive in Pi provider; --no-tools takes precedence (pi.go) - Fix Codex model names to match actual OpenAI offerings (codingagent.md, harness.md) - Fix go fmt alignment of mapstructure struct tags (harness.go)
…pilot Redesign the harness executor to pass config map entries directly as CLI flags instead of hardcoding provider-specific struct fields. Config keys become --key, values are type-dependent (string→value, bool→flag, number→N, array→repeated). This makes the executor future-proof — any new CLI flag works immediately without Go code changes. Add GitHub Copilot CLI as a new provider (binary: copilot, -p prompt). Remove hardcoded model name constants from docs — the agent should look up current model names from provider documentation. Net -463 lines: simpler code, more providers, no stale config.
Allow users to define their own harness providers via config.binary and config.prompt_args instead of using a built-in provider name. This lets any CLI tool be used as a harness step without Go code changes. Built-in: config.provider = "claude" Custom: config.binary = "gemini", config.prompt_args = ["-p"] prompt_args defaults to ["-p"] if omitted. The prompt is appended after prompt_args. All other config keys are passed as --key value.
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (4)
internal/core/harness.go (2)
60-65: Return provider names in stable order.At Line 60, map iteration makes
BuiltinHarnessProviderNames()non-deterministic. Sorting before return will stabilize error messages and test output.Suggested change
-import "maps" +import ( + "maps" + "sort" +) @@ func BuiltinHarnessProviderNames() []string { names := make([]string, 0, len(builtinHarnessProviders)) for name := range builtinHarnessProviders { names = append(names, name) } + sort.Strings(names) return names }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/harness.go` around lines 60 - 65, BuiltinHarnessProviderNames iterates a map (builtinHarnessProviders) so the returned slice order is non-deterministic; collect the keys into names as currently done, then call sort.Strings(names) before returning to stabilize output and tests, and add a sort import if missing. Ensure you reference the BuiltinHarnessProviderNames function and the builtinHarnessProviders map when making the change.
45-51: Avoid split-brain provider sources between core and runtime.At Line 45, the built-in provider list is hard-coded in
internal/core/harness.gowhile runtime registration is defined separately ininternal/runtime/builtin/harness/*.go. This creates a drift risk where spec validation and runtime execution disagree.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/harness.go` around lines 45 - 51, The hard-coded builtinHarnessProviders map in internal/core/harness.go can diverge from the runtime registrations in internal/runtime/builtin/harness/*.go; replace the static map with a single source of truth by loading the provider list from the runtime registration mechanism (e.g., expose the runtime's registration registry or init function) instead of hard-coding names: remove or deprecate builtinHarnessProviders and reference the runtime registration API used by the harness registration files so spec validation and runtime use the same provider set (look for the map named builtinHarnessProviders and the registration code under internal/runtime/builtin/harness to wire them together).internal/runtime/builtin/harness/provider.go (1)
27-29: Prevent silent provider collisions during registration.At Line 27,
registerProvideroverwrites existing entries without any signal. A duplicate provider name will mask the earlier registration and make behavior harder to debug.Suggested change
func registerProvider(p Provider) { - providers[p.Name()] = p + name := p.Name() + if _, exists := providers[name]; exists { + panic(fmt.Sprintf("harness: duplicate provider registration %q", name)) + } + providers[name] = p }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runtime/builtin/harness/provider.go` around lines 27 - 29, registerProvider currently overwrites existing entries silently; update the function (registerProvider) to detect collisions by checking providers[p.Name()] before assignment and fail fast with a clear error (e.g., panic or processLogger.Fatalf) that includes the duplicate provider name and ideally both providers' identities; keep the signature to avoid ripple changes or, if you prefer to return an error, update all callers accordingly so duplicates are not silently masked.internal/cmn/schema/dag.schema.json (1)
3301-3344: Add schema-level requirement forprompt_flagwhenprompt_modeisflag.Runtime validation already enforces this; adding it here keeps schema validation behavior consistent and fails earlier with clearer feedback.
🧩 Proposed schema parity update
"harnessNamedDefinition": { "type": "object", "additionalProperties": false, "required": ["binary"], + "allOf": [ + { + "if": { + "properties": { "prompt_mode": { "const": "flag" } }, + "required": ["prompt_mode"] + }, + "then": { + "required": ["prompt_flag"] + } + } + ], "properties": { "binary": {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/cmn/schema/dag.schema.json` around lines 3301 - 3344, Add a JSON Schema conditional to harnessNamedDefinition that requires prompt_flag when prompt_mode is "flag": inside the harnessNamedDefinition object, add an allOf (or if/then) clause referencing properties.prompt_mode and properties.prompt_flag so that if { "properties": { "prompt_mode": { "const": "flag" } }, "required": ["prompt_mode"] } then enforce { "required": ["prompt_flag"] } — this ensures prompt_flag is required at schema validation time for harnessNamedDefinition when prompt_mode is "flag".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/cmn/schema/dag.schema.json`:
- Around line 3355-3373: The harnessProviderConfig schema currently forbids
reserved keys "binary" and "prompt_args" via propertyNames.not.enum but still
allows "fallback" as an arbitrary flag; update the propertyNames.not.enum array
to also include "fallback" so nested entries cannot define a reserved "fallback"
key (refer to harnessProviderConfig -> propertyNames.not.enum and
additionalProperties -> $ref to harnessConfigValue), ensuring the schema treats
"fallback" as reserved for the top-level only.
In `@internal/core/dag.go`:
- Around line 235-237: Missing copy of Harnesses when rebuilding DAG from YAML:
in rebuildDAGRunSnapshotFromYAML assign the fresh parsed harness definitions
back to the original dag (i.e., add dag.Harnesses = fresh.Harnesses alongside
the existing copies of Env, Params, SMTP, SSH, RegistryAuths) so that
ctx.dag.Harnesses used in step validation (step.go referencing
ctx.dag.Harnesses) retains custom harness providers after queue/retry
deserialization.
In `@internal/runtime/builtin/harness/harness.go`:
- Around line 318-320: The harness currently silently truncates multi-command
steps because extractPrompt reads only step.Commands[0]; change the logic to
detect and reject steps with more than one command: update extractPrompt (or the
callers that assign prompt := extractPrompt(step)) to check len(step.Commands)
and return an error if >1, and apply the same validation in the other similar
use-site referenced (the block around the other prompt extraction at the 527-547
area) so any harness step with multiple commands fails fast with a clear error
that the harness type only accepts a single prompt/command.
- Around line 243-304: The problem is that built-in providers call
configToFlags(cfg.flags, nil) so unified keys like "max_turns" are forwarded
verbatim and not mapped to provider-specific tokens; update the caller
buildInvocation to pass the appropriate *core.HarnessDefinition (or the
provider's definition) into configToFlags instead of nil, and ensure
flagTokenForKey(key, definition) uses the HarnessDefinition mapping (or falls
back to a small normalization: convert common unified keys/underscores to
provider-expected tokens, e.g. underscore->hyphen) so BaseArgs(prompt) /
Provider invocations receive provider-specific flags rather than raw unified
keys.
In `@skills/dagu/references/codingagent.md`:
- Around line 207-220: The example YAML mixes camelCase and kebab-case for
config keys which will produce inconsistent CLI flags; update the example under
the steps -> name: task block so all non-reserved config keys use a single
convention (prefer kebab-case): change allowedTools to allowed-tools (and any
other camelCase keys) while keeping existing kebab-case keys like max-turns,
max-budget-usd, and permission-mode unchanged so forwarded CLI flags are
consistent.
In `@skills/dagu/references/executors.md`:
- Around line 431-447: The documentation for harnesses is missing the new
escape-hatch field extra_flags; add a short paragraph under the Custom
`harnesses:` section describing `extra_flags` as an array (or string) of raw CLI
switches that are forwarded verbatim to the invoked binary to support repeated
or ordering-sensitive flags, explain it bypasses the normal key=>`--key value`
mapping and is appended according to `prompt_position`/`flag_style` and any
`option_flags` overrides, and include a brief example showing repeated flags
(e.g., ["-I", "-I/usr/include"] or ["--foo","--bar"]) and note it is intended
for advanced use when precise ordering or duplicates are required.
---
Nitpick comments:
In `@internal/cmn/schema/dag.schema.json`:
- Around line 3301-3344: Add a JSON Schema conditional to harnessNamedDefinition
that requires prompt_flag when prompt_mode is "flag": inside the
harnessNamedDefinition object, add an allOf (or if/then) clause referencing
properties.prompt_mode and properties.prompt_flag so that if { "properties": {
"prompt_mode": { "const": "flag" } }, "required": ["prompt_mode"] } then enforce
{ "required": ["prompt_flag"] } — this ensures prompt_flag is required at schema
validation time for harnessNamedDefinition when prompt_mode is "flag".
In `@internal/core/harness.go`:
- Around line 60-65: BuiltinHarnessProviderNames iterates a map
(builtinHarnessProviders) so the returned slice order is non-deterministic;
collect the keys into names as currently done, then call sort.Strings(names)
before returning to stabilize output and tests, and add a sort import if
missing. Ensure you reference the BuiltinHarnessProviderNames function and the
builtinHarnessProviders map when making the change.
- Around line 45-51: The hard-coded builtinHarnessProviders map in
internal/core/harness.go can diverge from the runtime registrations in
internal/runtime/builtin/harness/*.go; replace the static map with a single
source of truth by loading the provider list from the runtime registration
mechanism (e.g., expose the runtime's registration registry or init function)
instead of hard-coding names: remove or deprecate builtinHarnessProviders and
reference the runtime registration API used by the harness registration files so
spec validation and runtime use the same provider set (look for the map named
builtinHarnessProviders and the registration code under
internal/runtime/builtin/harness to wire them together).
In `@internal/runtime/builtin/harness/provider.go`:
- Around line 27-29: registerProvider currently overwrites existing entries
silently; update the function (registerProvider) to detect collisions by
checking providers[p.Name()] before assignment and fail fast with a clear error
(e.g., panic or processLogger.Fatalf) that includes the duplicate provider name
and ideally both providers' identities; keep the signature to avoid ripple
changes or, if you prefer to return an error, update all callers accordingly so
duplicates are not silently masked.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 184b08a6-c9c6-47e7-9ba8-ef690c127a84
📒 Files selected for processing (21)
internal/cmn/schema/dag.schema.jsoninternal/cmn/schema/dag_schema_test.gointernal/core/dag.gointernal/core/harness.gointernal/core/spec/builder_test.gointernal/core/spec/dag.gointernal/core/spec/loader.gointernal/core/spec/loader_test.gointernal/core/spec/step.gointernal/runtime/builtin/builtin.gointernal/runtime/builtin/harness/claude.gointernal/runtime/builtin/harness/codex.gointernal/runtime/builtin/harness/copilot.gointernal/runtime/builtin/harness/harness.gointernal/runtime/builtin/harness/harness_test.gointernal/runtime/builtin/harness/opencode.gointernal/runtime/builtin/harness/pi.gointernal/runtime/builtin/harness/provider.gointernal/runtime/builtin/harness/schema.goskills/dagu/references/codingagent.mdskills/dagu/references/executors.md
| "harnessProviderConfig": { | ||
| "type": "object", | ||
| "propertyNames": { | ||
| "not": { | ||
| "enum": ["binary", "prompt_args"] | ||
| } | ||
| }, | ||
| "properties": { | ||
| "provider": { | ||
| "type": "string", | ||
| "description": "Harness provider name. May reference a built-in provider such as claude, codex, copilot, opencode, or pi, or a custom top-level harnesses entry. Supports ${VAR} interpolation." | ||
| } | ||
| }, | ||
| "additionalProperties": { | ||
| "$ref": "#/definitions/harnessConfigValue" | ||
| }, | ||
| "required": ["provider"], | ||
| "description": "Primary or fallback harness provider configuration. Reserved keys are provider and fallback; all other keys are passed through to the CLI as flags." | ||
| }, |
There was a problem hiding this comment.
harnessProviderConfig currently allows reserved fallback inside fallback entries.
fallback is documented as reserved, but in this definition it is treated as an arbitrary flag value. That can validate unsupported nested fallback shapes and produce ambiguous runtime behavior.
🔧 Proposed schema fix
"harnessProviderConfig": {
"type": "object",
"propertyNames": {
"not": {
- "enum": ["binary", "prompt_args"]
+ "enum": ["binary", "prompt_args", "fallback"]
}
},📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "harnessProviderConfig": { | |
| "type": "object", | |
| "propertyNames": { | |
| "not": { | |
| "enum": ["binary", "prompt_args"] | |
| } | |
| }, | |
| "properties": { | |
| "provider": { | |
| "type": "string", | |
| "description": "Harness provider name. May reference a built-in provider such as claude, codex, copilot, opencode, or pi, or a custom top-level harnesses entry. Supports ${VAR} interpolation." | |
| } | |
| }, | |
| "additionalProperties": { | |
| "$ref": "#/definitions/harnessConfigValue" | |
| }, | |
| "required": ["provider"], | |
| "description": "Primary or fallback harness provider configuration. Reserved keys are provider and fallback; all other keys are passed through to the CLI as flags." | |
| }, | |
| "harnessProviderConfig": { | |
| "type": "object", | |
| "propertyNames": { | |
| "not": { | |
| "enum": ["binary", "prompt_args", "fallback"] | |
| } | |
| }, | |
| "properties": { | |
| "provider": { | |
| "type": "string", | |
| "description": "Harness provider name. May reference a built-in provider such as claude, codex, copilot, opencode, or pi, or a custom top-level harnesses entry. Supports ${VAR} interpolation." | |
| } | |
| }, | |
| "additionalProperties": { | |
| "$ref": "#/definitions/harnessConfigValue" | |
| }, | |
| "required": ["provider"], | |
| "description": "Primary or fallback harness provider configuration. Reserved keys are provider and fallback; all other keys are passed through to the CLI as flags." | |
| }, |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/cmn/schema/dag.schema.json` around lines 3355 - 3373, The
harnessProviderConfig schema currently forbids reserved keys "binary" and
"prompt_args" via propertyNames.not.enum but still allows "fallback" as an
arbitrary flag; update the propertyNames.not.enum array to also include
"fallback" so nested entries cannot define a reserved "fallback" key (refer to
harnessProviderConfig -> propertyNames.not.enum and additionalProperties -> $ref
to harnessConfigValue), ensuring the schema treats "fallback" as reserved for
the top-level only.
| // Harnesses contains reusable custom harness definitions available to harness steps. | ||
| // Excluded from JSON: derived from DAG/base config and rebuilt from YAML when needed. | ||
| Harnesses HarnessDefinitions `json:"-"` |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo '== DAG persistence / reload paths =='
rg -n -C2 --type=go 'dag\.json|json\.(Marshal|Unmarshal)|NewEncoder|NewDecoder|BaseConfigData|YamlData' internal
echo
echo '== Runtime use of harness definitions =='
rg -n -C2 --type=go '\bHarnesses\b|\bHarnessDefinition\b|\bdefinition\b|getProvider\(' internal/core internal/runtime
echo
echo '== Step build output for harness providers =='
rg -n -C2 --type=go 'mergeHarnessConfig|validateHarnessProviderConfig|ExecutorConfig\.Config' internal/core/specRepository: dagucloud/dagu
Length of output: 50371
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== DAG reload/deserialization patterns ==="
# Find where DAGs are unmarshaled from JSON
rg -n -B3 -A3 'json\.Unmarshal.*\*core\.DAG|newDAGFromSnapshot|readDAG' internal/core internal/runtime internal/service --type=go | head -100
echo
echo "=== YAML rebuild paths ==="
# Find where DAGs are rebuilt from YamlData
rg -n -B2 -A2 'rebuildDAGRunSnapshotFromYAML|BuildDAG.*YamlData|spec\.Parse.*YamlData' internal --type=go | head -80
echo
echo "=== Harnesses field usage at runtime ==="
# Check if dag.Harnesses is accessed during execution
rg -n 'dag\.Harnesses|\.Harnesses\[' internal/runtime internal/core/spec --type=go
echo
echo "=== HarnessDefinitions initialization ==="
# Find where Harnesses field is populated
rg -n -B2 -A2 'Harnesses\s*=' internal/core --type=goRepository: dagucloud/dagu
Length of output: 3358
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== When/how are Harnesses populated? ==="
# Find where dag.Harnesses is assigned (not just cloned)
rg -n 'Harnesses\s*=' internal/core/spec --type=go -A2
echo
echo "=== Step.go line 1277 context ==="
# See the full context of the Harnesses access in step building
sed -n '1270,1285p' internal/core/spec/step.go
echo
echo "=== Does YAML parsing/BuildDAG populate Harnesses? ==="
# Find DAG initialization from YAML
rg -n -B5 -A10 'func.*BuildDAG|func.*Parse.*DAG' internal/core/spec/builder.go | head -80
echo
echo "=== Harness config building/merging ==="
# Find where harness configs are created/merged
rg -n -B2 -A5 'mergeHarnessConfig|NewHarnessDefinition|buildHarness' internal/core/spec --type=go | head -100Repository: dagucloud/dagu
Length of output: 99
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== When/how are Harnesses populated? ==="
# Find where dag.Harnesses is assigned (including nested assignments)
rg -n 'Harnesses\s*:=|Harnesses\s*=' internal/core --type=go
echo
echo "=== Step.go line 1277 full context ==="
# See the full context of the Harnesses access in step building
if [ -f internal/core/spec/step.go ]; then
sed -n '1270,1290p' internal/core/spec/step.go
fi
echo
echo "=== BuildDAG function signature and start ==="
# Find the BuildDAG function
rg -n -A30 'func BuildDAG' internal/core/spec/builder.go | head -50
echo
echo "=== Harness initialization in builder ==="
# Look for where Harnesses field is set during building
rg -n -B3 -A3 'dag\.Harnesses.*=' internal/core/spec --type=goRepository: dagucloud/dagu
Length of output: 1050
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Full rebuildDAGRunSnapshotFromYAML function ==="
sed -n '503,550p' internal/service/frontend/api/v1/dagruns.go
echo
echo "=== Where is dag.Harnesses assigned during DAG load/build? ==="
# Search more broadly in builder/loader
rg -n 'Harnesses' internal/core/spec/builder.go internal/core/spec/loader.go --type=go
echo
echo "=== DAG struct initialization ==="
# See if Harnesses is set anywhere in DAG creation
rg -n -B5 -A5 'func.*\*core\.DAG' internal/core/spec/builder.go | head -100
echo
echo "=== Check loader.go for Harnesses population ==="
cat internal/core/spec/loader.go | grep -n -A5 -B5 'Harnesses\|harness' | head -150Repository: dagucloud/dagu
Length of output: 1414
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Complete rebuildDAGRunSnapshotFromYAML to see all field copies ==="
sed -n '503,560p' internal/service/frontend/api/v1/dagruns.go
echo
echo "=== LoadYAML function ==="
rg -n -A20 'func LoadYAML' internal/core/spec --type=go
echo
echo "=== Does LoadYAML populate Harnesses? ==="
rg -n -B5 -A10 'func.*Load.*\*core\.DAG|Harnesses' internal/core/spec/loader.go --type=go | head -100
echo
echo "=== Check what fresh DAG has after LoadYAML ==="
# See if Harnesses is computed in builder
rg -n 'harness' internal/core/spec/builder.go --type=go -i | head -50Repository: dagucloud/dagu
Length of output: 6606
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Critical: Check if Harnesses are copied in rebuildDAGRunSnapshotFromYAML ==="
sed -n '545,560p' internal/service/frontend/api/v1/dagruns.go
echo
echo "=== Check builder.go to see if fresh DAG gets Harnesses populated ==="
rg -n 'Harnesses' internal/core/spec/builder.go --type=go -B3 -A3
echo
echo "=== Check if Harnesses are part of DAG initialization ==="
rg -n 'buildDAGFromConfig|NewDAG' internal/core/spec/builder.go --type=go -A20 | head -80
echo
echo "=== Look for where Harnesses field is set in builder ==="
rg -n 'dag\.Harnesses\s*=' internal/core/spec --type=goRepository: dagucloud/dagu
Length of output: 646
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== CRITICAL: rebuildDAGRunSnapshotFromYAML does NOT copy Harnesses ==="
echo "Lines 545-557 copy Env, Params, SMTP, SSH, RegistryAuths but NOT Harnesses"
echo
echo "=== Does spec.LoadYAML populate fresh.Harnesses? ==="
# Trace through builder to see if Harnesses is set during DAG building
rg -n 'Harnesses\s*:=' internal/core/spec/builder.go --type=go
echo
echo "=== Where is dag.Harnesses initialized/populated? ==="
# Search entire spec package
find internal/core/spec -name '*.go' -exec grep -n 'Harnesses' {} + | grep -v test | grep -v '//'
echo
echo "=== Check buildDAG/NewDAG to see what fields are initialized ==="
rg -n -A30 'func buildDAGFromConfig' internal/core/spec/builder.go --type=go | head -100
echo
echo "=== Check if Harnesses comes from config/base definition ==="
rg -n 'Harnesses.*base|base.*Harnesses|config.*Harness' internal/core/spec --type=goRepository: dagucloud/dagu
Length of output: 258
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Check all Harnesses references in spec/ ==="
rg 'Harnesses' internal/core/spec --type=go | grep -v test | grep -v '//'
echo
echo "=== Check where Harnesses field is first set during DAG creation ==="
rg -n 'dag\.Harnesses' internal/core/spec --type=go
echo
echo "=== Dump loader.go to find Harnesses initialization ==="
wc -l internal/core/spec/loader.go
sed -n '220,350p' internal/core/spec/loader.go
echo
echo "=== Look for mergeHarnessConfig to understand where harnesses come from ==="
rg -n -B5 -A10 'mergeHarnessConfig' internal/core/spec --type=goRepository: dagucloud/dagu
Length of output: 7355
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== FOUND IT: buildHarnesses function ==="
rg -n -A20 'func buildHarnesses' internal/core/spec/dag.go --type=go
echo
echo "=== Confirm: rebuildDAGRunSnapshotFromYAML MISSING Harnesses copy ==="
echo "rebuildDAGRunSnapshotFromYAML copies these fields from fresh DAG:"
sed -n '545,558p' internal/service/frontend/api/v1/dagruns.go
echo ""
echo "Missing: dag.Harnesses = fresh.Harnesses"
echo
echo "=== Verify DAG is built from YAML config (dag.Harnesses should be set) ==="
rg -n -B5 -A15 'type dag struct' internal/core/spec/dag.go | head -80Repository: dagucloud/dagu
Length of output: 2431
Add missing Harnesses field copy in rebuildDAGRunSnapshotFromYAML.
rebuildDAGRunSnapshotFromYAML rebuilds the DAG from its stored YamlData, which correctly repopulates custom harness definitions via the YAML parser. However, the function only copies select fields (Env, Params, SMTP, SSH, RegistryAuths) back to the original DAG, omitting Harnesses. Since step validation at line 1277 in step.go accesses ctx.dag.Harnesses to resolve custom harness providers, this omission causes custom harnesses to be lost after queue/retry deserialization.
Add the missing line after line 550 in internal/service/frontend/api/v1/dagruns.go:
dag.Harnesses = fresh.Harnesses
This ensures that custom harness definitions are carried through the serialization/deserialization cycle.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/core/dag.go` around lines 235 - 237, Missing copy of Harnesses when
rebuilding DAG from YAML: in rebuildDAGRunSnapshotFromYAML assign the fresh
parsed harness definitions back to the original dag (i.e., add dag.Harnesses =
fresh.Harnesses alongside the existing copies of Env, Params, SMTP, SSH,
RegistryAuths) so that ctx.dag.Harnesses used in step validation (step.go
referencing ctx.dag.Harnesses) retains custom harness providers after
queue/retry deserialization.
| func configToFlags(cfg map[string]any, definition *core.HarnessDefinition) []string { | ||
| keys := make([]string, 0, len(cfg)) | ||
| for k := range cfg { | ||
| if reservedKeys[k] { | ||
| continue | ||
| } | ||
| keys = append(keys, k) | ||
| } | ||
| sort.Strings(keys) | ||
|
|
||
| var args []string | ||
| for _, key := range keys { | ||
| flag := flagTokenForKey(key, definition) | ||
| switch v := cfg[key].(type) { | ||
| case bool: | ||
| if v { | ||
| args = append(args, flag) | ||
| } | ||
| case string: | ||
| if v != "" { | ||
| args = append(args, flag, v) | ||
| } | ||
| case int: | ||
| args = append(args, flag, strconv.Itoa(v)) | ||
| case int8: | ||
| args = append(args, flag, strconv.FormatInt(int64(v), 10)) | ||
| case int16: | ||
| args = append(args, flag, strconv.FormatInt(int64(v), 10)) | ||
| case int32: | ||
| args = append(args, flag, strconv.FormatInt(int64(v), 10)) | ||
| case int64: | ||
| args = append(args, flag, strconv.FormatInt(v, 10)) | ||
| case uint: | ||
| args = append(args, flag, strconv.FormatUint(uint64(v), 10)) | ||
| case uint8: | ||
| args = append(args, flag, strconv.FormatUint(uint64(v), 10)) | ||
| case uint16: | ||
| args = append(args, flag, strconv.FormatUint(uint64(v), 10)) | ||
| case uint32: | ||
| args = append(args, flag, strconv.FormatUint(uint64(v), 10)) | ||
| case uint64: | ||
| args = append(args, flag, strconv.FormatUint(v, 10)) | ||
| case float32: | ||
| args = append(args, flag, strconv.FormatFloat(float64(v), 'f', -1, 32)) | ||
| case float64: | ||
| if v == float64(int(v)) { | ||
| args = append(args, flag, strconv.Itoa(int(v))) | ||
| } else { | ||
| args = append(args, flag, strconv.FormatFloat(v, 'f', -1, 64)) | ||
| } | ||
| case []string: | ||
| for _, item := range v { | ||
| args = append(args, flag, item) | ||
| } | ||
| case []any: | ||
| for _, item := range v { | ||
| args = append(args, flag, fmt.Sprint(item)) | ||
| } | ||
| } | ||
| } | ||
| return args | ||
| } |
There was a problem hiding this comment.
Built-in providers still forward config keys verbatim, so the “unified” option surface won't actually be portable.
For built-ins, buildInvocation always calls configToFlags(cfg.flags, nil), and the Provider contract only supplies BaseArgs(prompt). That leaves no place to translate model, effort, max_turns, output_format, etc. into provider-specific flags. Today a documented key like max_turns would become --max_turns, and any provider that expects --max-turns or a different token will fail.
Also applies to: 594-598, 647-657
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/runtime/builtin/harness/harness.go` around lines 243 - 304, The
problem is that built-in providers call configToFlags(cfg.flags, nil) so unified
keys like "max_turns" are forwarded verbatim and not mapped to provider-specific
tokens; update the caller buildInvocation to pass the appropriate
*core.HarnessDefinition (or the provider's definition) into configToFlags
instead of nil, and ensure flagTokenForKey(key, definition) uses the
HarnessDefinition mapping (or falls back to a small normalization: convert
common unified keys/underscores to provider-expected tokens, e.g.
underscore->hyphen) so BaseArgs(prompt) / Provider invocations receive
provider-specific flags rather than raw unified keys.
| prompt := extractPrompt(step) | ||
| if prompt == "" { | ||
| return nil, fmt.Errorf("harness: command field (prompt) is required") |
There was a problem hiding this comment.
Reject multi-command harness steps instead of silently truncating them.
extractPrompt only reads step.Commands[0], so any additional command entries are ignored without warning. That is easy to hit when converting an existing shell step to type: harness, and the executor will run with a partial prompt.
💡 Proposed fix
func newHarness(ctx context.Context, step core.Step) (executor.Executor, error) {
+ if len(step.Commands) > 1 {
+ return nil, fmt.Errorf("harness: exactly one command entry is supported")
+ }
cfg := normalizeConfigMap(step.ExecutorConfig.Config)
var defs core.HarnessDefinitions
env := runtime.GetEnv(ctx)
@@
func validateHarnessStep(step core.Step) error {
if len(step.Commands) == 0 || extractPrompt(step) == "" {
return fmt.Errorf("harness: command field (prompt) is required")
}
+ if len(step.Commands) > 1 {
+ return fmt.Errorf("harness: exactly one command entry is supported")
+ }
cfg := step.ExecutorConfig.ConfigAlso applies to: 527-547
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/runtime/builtin/harness/harness.go` around lines 318 - 320, The
harness currently silently truncates multi-command steps because extractPrompt
reads only step.Commands[0]; change the logic to detect and reject steps with
more than one command: update extractPrompt (or the callers that assign prompt
:= extractPrompt(step)) to check len(step.Commands) and return an error if >1,
and apply the same validation in the other similar use-site referenced (the
block around the other prompt extraction at the 527-547 area) so any harness
step with multiple commands fails fast with a clear error that the harness type
only accepts a single prompt/command.
| ```yaml | ||
| steps: | ||
| - name: task | ||
| type: harness | ||
| command: "Write tests for the auth module" | ||
| config: | ||
| provider: claude | ||
| model: sonnet | ||
| effort: high | ||
| max-turns: 20 | ||
| max-budget-usd: 2.00 | ||
| permission-mode: auto | ||
| allowedTools: "Bash,Read,Edit" | ||
| bare: true |
There was a problem hiding this comment.
Use one config-key convention in the examples.
This page says harness forwards non-reserved keys as CLI flags, but the examples mix kebab-case (max-turns) with camelCase (allowedTools). If forwarding is verbatim, those produce different flags. Please normalize the examples to the actual accepted keys so readers do not copy a broken config.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@skills/dagu/references/codingagent.md` around lines 207 - 220, The example
YAML mixes camelCase and kebab-case for config keys which will produce
inconsistent CLI flags; update the example under the steps -> name: task block
so all non-reserved config keys use a single convention (prefer kebab-case):
change allowedTools to allowed-tools (and any other camelCase keys) while
keeping existing kebab-case keys like max-turns, max-budget-usd, and
permission-mode unchanged so forwarded CLI flags are consistent.
skills/dagu/references/executors.md
Outdated
| The `command` field is the prompt. `config.provider` can reference either a built-in provider or a named custom entry under top-level `harnesses:`. All non-reserved keys are passed directly as CLI flags (`--key value` for strings/numbers, `--key` for booleans). Reserved keys are `provider` and `fallback`. | ||
|
|
||
| Supported providers: `claude`, `codex`, `copilot`, `opencode`, `pi`. | ||
|
|
||
| Top-level `harness:` acts as a DAG-wide default. Step-level config overlays the DAG-level primary config, and step-level `fallback` replaces the DAG-level `fallback`. If a step omits `type:` and the DAG defines `harness:`, the step is inferred as `type: harness`. | ||
|
|
||
| `provider` may be parameterized with `${...}` and is resolved at runtime after interpolation. | ||
|
|
||
| Custom `harnesses:` definitions describe how to invoke arbitrary harness CLIs: | ||
| - `binary` | ||
| - `prefix_args` | ||
| - `prompt_mode`: `arg`, `flag`, or `stdin` | ||
| - `prompt_flag` for `flag` mode | ||
| - `prompt_position`: `before_flags` or `after_flags` | ||
| - `flag_style`: `gnu_long` or `single_dash` | ||
| - `option_flags` to override specific flag tokens | ||
|
|
There was a problem hiding this comment.
Document extra_flags here as part of the harness surface.
This section explains key/value flag forwarding, but the PR also adds extra_flags as the escape hatch for raw CLI switches. Without that callout, users have no discoverable way to express repeated or ordering-sensitive flags.
I can draft a short example for this section if helpful.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@skills/dagu/references/executors.md` around lines 431 - 447, The
documentation for harnesses is missing the new escape-hatch field extra_flags;
add a short paragraph under the Custom `harnesses:` section describing
`extra_flags` as an array (or string) of raw CLI switches that are forwarded
verbatim to the invoked binary to support repeated or ordering-sensitive flags,
explain it bypasses the normal key=>`--key value` mapping and is appended
according to `prompt_position`/`flag_style` and any `option_flags` overrides,
and include a brief example showing repeated flags (e.g., ["-I",
"-I/usr/include"] or ["--foo","--bar"]) and note it is intended for advanced use
when precise ordering or duplicates are required.
Summary
type: harnessstep executor that spawns coding agent CLIs (Claude Code, Codex, OpenCode, Pi) as subprocess workflow stepsmodel,effort,max_turns,output_format) to its CLI-specific flags, with provider-specific options and anextra_flagsescape hatchcodingagent.md,executors.md) to usetype: harnessinstead of bare shell scriptsTest plan
go build ./...passesmake lint— 0 issuesgo test -race ./internal/runtime/builtin/harness/...— 33 tests passdagu starta DAG withtype: harness+provider: claudeSummary by CodeRabbit
New Features
Documentation