Skip to content

fix(workflows): actionable error messages for common workflow authoring mistakes#1388

Merged
stack72 merged 2 commits into
mainfrom
benchmark/workflow-authoring-errors
May 14, 2026
Merged

fix(workflows): actionable error messages for common workflow authoring mistakes#1388
stack72 merged 2 commits into
mainfrom
benchmark/workflow-authoring-errors

Conversation

@stack72
Copy link
Copy Markdown
Contributor

@stack72 stack72 commented May 14, 2026

Summary

Benchmark analysis (k8s-debug-v2) showed 35% of agent turns were spent on
workflow authoring errors that could have been caught or guided by better error
messages. The agent hit 6 sequential errors before succeeding — each one
discoverable only by running the workflow and parsing failure output.

This PR adds actionable error messages at the point of failure:

  • argumentsinputs detection — step task preprocessing now catches
    arguments: (wrong) and suggests inputs: (correct) with an example. Root
    cause: swamp model type describe showed "Arguments:" which agents
    copy into workflow YAML.
  • String dependsOn detection — job and step schemas now detect bare
    strings in dependsOn arrays and show the required { job, condition }
    object format instead of a raw Zod validation dump.
  • "Workflow not found" guidance — now suggests both swamp workflow create
    and swamp doctor workflows --json so agents can self-recover.
  • "Skipping broken workflow" guidance — warns suggest
    swamp doctor workflows --json to see all errors at once.
  • Terminology alignment — method parameters header renamed from
    "Arguments:" to "Inputs:" in log output; inputs field added alongside
    arguments in JSON output (backward compatible).

Test plan

  • New test: arguments in step task throws clear error
  • New test: string dependsOn in job throws clear error
  • Updated test: workflowNotFound message includes create + doctor guidance
  • All 5939 tests pass, deno check clean, deno lint clean
  • Binary compiles successfully
  • Verified in benchmark re-runs: "Workflow not found" with guidance caused
    1-step recovery; inputs field in JSON output prevented the arguments/inputs
    confusion entirely

🤖 Generated with Claude Code

…ng mistakes

Benchmark analysis showed 35% of agent turns (9 of 26) were spent on workflow
authoring errors that could have been caught or guided by better error messages.
This fixes the five most impactful failure modes:

- Detect `arguments:` in step tasks and suggest `inputs:` (the correct field)
- Detect string `dependsOn` entries and show the required object format
- "Workflow not found" now suggests `swamp workflow create` and `swamp doctor
  workflows --json`
- "Skipping broken workflow" warnings now suggest `swamp doctor workflows --json`
- Rename method "Arguments:" to "Inputs:" in type describe log output to align
  with workflow YAML terminology
- Add `inputs` field alongside `arguments` in type describe JSON output for
  backward-compatible terminology alignment

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

  1. swamp model method describe still shows "Arguments:" — inconsistent with the PR's stated goal

    src/presentation/renderers/model_method_describe.ts:53 still emits bold(cyan("Arguments:")). The PR updated model_get.ts (used by swamp model get and, via the shared formatMethodLines, swamp type describe), but missed the separate renderer for swamp model method describe, which has its own rendering path.

    After this PR:

    • swamp model getInputs:
    • swamp type describeInputs: ✅ (inherits via formatMethodLines)
    • swamp model method describeArguments: ❌ (unchanged)

    The PR description says the root cause was `swamp model type describe` showing "Arguments:" — but `swamp model method describe` is the most focused command for inspecting a single method's signature and it still shows the old label. Any agent or user reading that output will still see "Arguments:" and may copy it into YAML.

    Fix: In src/presentation/renderers/model_method_describe.ts:53, change:

    lines.push(bold(cyan("Arguments:")));

    to:

    lines.push(bold(cyan("Inputs:")));

Suggestions

  1. The workflowNotFound message is now two sentences long and will appear inline (no line break) in log output. Consider putting the guidance on a new line for readability, e.g. \nCreate it with ... — same information, easier to scan quickly.

  2. In model_method_describe.ts, the schema is formatted via data.method.arguments (the old field name). That still works since arguments is preserved for backward compat, but it would be cleaner to switch the renderer to data.method.inputs once the label is updated, so the field name and displayed label agree.

Verdict

NEEDS CHANGES — the swamp model method describe renderer was not updated, leaving "Arguments:" in the output of the command most likely to be used when authoring workflow YAML. The fix is a one-line change.

github-actions[bot]
github-actions Bot previously approved these changes May 14, 2026
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Clean, well-motivated PR. The benchmark-driven approach to identifying the top 5 agent failure modes is solid, and the fixes are appropriately scoped. DDD layering is correct: validation logic lives in domain schemas, terminology alignment touches the presentation layer, and the libswamp import boundary is respected throughout.

Blocking Issues

None.

Suggestions

  1. Missing test for Step-level string dependsOn detectionStepDependencyFieldSchema in src/domain/workflows/step.ts:60-79 has the same string detection logic as JobDependencyFieldSchema in job.ts, but only the Job variant has a test (job_test.ts:254-273). Consider adding a parallel test in step_test.ts for completeness — CLAUDE.md calls for comprehensive unit test coverage. The logic is identical so this isn't blocking, but it's a gap.

  2. Import placement in job_test.tsimport { JobSchema } from "./job.ts" is added at line 252, after existing test code. Convention is to place imports at the top of the file. Job is already imported on line 21; JobSchema could be added to that existing import.

  3. arguments field check in step_task.ts:66 — The condition "arguments" in d && !("inputs" in d) means that if a user provides both arguments and inputs, the arguments key is silently ignored (stripped by Zod since it's not in the raw schema). This is probably fine for the intended use case (agents won't provide both), but worth noting for edge-case awareness.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. Inconsistent terminology rename — src/presentation/renderers/model_method_describe.ts:53

    The PR's stated goal is terminology alignment: rename "Arguments:" to "Inputs:" in log output to prevent agents from copying the wrong field name into workflow YAML. model_get.ts:96 was updated to "Inputs:", but model_method_describe.ts:53 still reads bold(cyan("Arguments:")). This means swamp model get displays "Inputs:" while swamp model method describe displays "Arguments:" — exactly the inconsistency this PR is supposed to fix.

    Breaking scenario: An agent runs swamp model method describe <model> run (the more targeted command), sees "Arguments:", and writes arguments: in its workflow YAML — the exact failure this PR is designed to prevent.

    Suggested fix: Change line 53 of model_method_describe.ts from "Arguments:" to "Inputs:".

Low

  1. arguments detection in step_task.ts fires for all task types — src/domain/workflows/step_task.ts:66

    The check if ("arguments" in d && !("inputs" in d)) runs before the type === "model_method" guard, so it fires for type: "workflow" tasks too. The error message example shows model_method format (modelIdOrName, methodName), which could be misleading for someone writing a workflow-type task with a typo. Not harmful since inputs is the correct field for both types, but the example could confuse.

  2. No test for StepSchema string dependsOn detection

    The PR adds a test for JobSchema string dependsOn detection (job_test.ts) but not for StepSchema (step_test.ts), despite adding identical preprocess logic to both. The step-level logic is duplicated (not shared), so a regression in one wouldn't be caught by the other's test.

Verdict

PASS — The logic changes are sound and well-tested. The medium finding (inconsistent "Arguments:" → "Inputs:" rename) weakens the PR's stated goal but doesn't cause incorrect behavior or data loss. Worth fixing before or shortly after merge.

- Fix inconsistent terminology in model_method_describe.ts: rename
  "Arguments:" to "Inputs:" to match model_get.ts and type_describe.ts
- Switch renderers to read method.inputs instead of method.arguments
- Add newline in workflowNotFound message for readability
- Move JobSchema import to top of job_test.ts
- Add missing StepSchema string dependsOn test in step_test.ts
- Update model skill JSON example to show new inputs field

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Clean, well-focused PR that adds actionable error messages to reduce agent workflow-authoring iterations. No blocking issues found.

Blocking Issues

None.

Suggestions

  1. Minor duplication in dependsOn validationJobDependencyFieldSchema (src/domain/workflows/job.ts:45-64) and StepDependencyFieldSchema (src/domain/workflows/step.ts:60-79) are nearly identical, differing only in the job: vs step: key in the example output. Per project conventions ("three similar lines is better than a premature abstraction"), this is fine as-is, but noting it in case the pattern spreads further.

What looks good

  • DDD placement: Validation logic sits in the domain-layer Zod schemas (value objects/entities), exactly where it belongs. The workflowNotFound guidance is in the application service layer — also appropriate.
  • Libswamp import boundary: All presentation files correctly import from ../../libswamp/mod.ts, not internal paths.
  • Backward compatibility: JSON output adds inputs alongside arguments without breaking existing consumers. The arguments check in step_task.ts correctly allows both fields present simultaneously (line 66: "arguments" in d && !("inputs" in d)).
  • Test coverage: Every new behavior has a corresponding test — arguments detection, string dependsOn for both Job and Step schemas, and updated workflowNotFound message.
  • Scope discipline: Changes are minimal and focused — no unrelated refactoring or feature creep.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. workflowNotFound message in JSON mode — the error field now contains a multi-line string with an embedded \n. Scripts doing jq -r .error get a two-line string, which is unexpected if they're expecting a single-line message. The code: "workflow_not_found" field is there for machine consumers to key on, so this isn't breaking, but splitting the guidance into a separate hint key (alongside error) would be cleaner for scripts. Not blocking since the code field already gives scripts a stable identifier.

  2. "Skipping broken workflow" warning structure — when the Zod error message is multi-line (as the new arguments → inputs and string dependsOn errors are), errorMsg is a JSON dump of the ZodError issues, and the doctor guidance (. Run 'swamp doctor workflows --json' to see all errors.) gets appended after that blob. The result is that the guidance is visually buried at the end of a potentially long JSON string. The new guidance is still helpful — it's better than nothing — but the placement makes it hard to spot in practice. Consider logging the doctor suggestion as a separate warn line so it stands out.

  3. Terminology note for help textswamp model get and swamp model method describe both previously showed Arguments: in their output. Any user documentation or AI agent prompts that reference that label will now see Inputs: instead. The JSON output adds inputs alongside the existing arguments field (backward compatible), so JSON consumers are fine. Just worth noting the log-mode label change is not backward compatible for users screen-scraping output.

Verdict

PASS — all referenced commands (swamp workflow create, swamp doctor workflows --json) are verified to exist, JSON output is backward compatible, and the error messages are specific and actionable with correct YAML examples.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. Inconsistent terminology alignment in Ink TUI rendererssrc/presentation/renderers/model_search.tsx:216,254 and src/presentation/renderers/type_search.tsx:205,231 still use method.arguments and display the header "Arguments:", while the log-mode renderers (model_get.ts:91-96, model_method_describe.ts:50-53) were updated to use method.inputs and "Inputs:". This means the same command (e.g. swamp model search) displays "Inputs:" in log mode but "Arguments:" in the interactive TUI picker. Not a correctness bug since both fields contain the same data, but it undermines the stated PR goal of terminology alignment. Agents using the TUI output could still learn the wrong term.

Low

  1. Mixed string/object dependsOn arrays produce slightly misleading errorsrc/domain/workflows/job.ts:47-48: For dependsOn: ["build", { job: "test", condition: { type: "succeeded" } }] (mixed array), the error says "dependsOn entries must be objects, not strings" implying all entries are strings. The message is still helpful, and a mixed array is an unlikely mistake, so this is cosmetic.

Verdict

PASS — The changes are correct and well-tested. The z.preprocess throw pattern is consistent with existing precedent in step_task.ts. Backward compatibility is properly maintained via the dual arguments/inputs fields on MethodDescribeData. All error paths have test coverage. The Ink TUI terminology inconsistency (Medium #1) is worth a follow-up but doesn't block.

@stack72 stack72 merged commit 9dc29ef into main May 14, 2026
11 checks passed
@stack72 stack72 deleted the benchmark/workflow-authoring-errors branch May 14, 2026 23:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant