feat(validate): warn on undeclared agent.output refs and field-level mismatches in explicit mode by jrob5756 · Pull Request #208 · microsoft/conductor

jrob5756 · 2026-05-18T19:35:53Z

Closes #105.

PR #125 added explicit-mode warnings for undeclared workflow.input / agent.output references. While building on top of it, I confirmed (by running real workflows that pass validation and crash at runtime) two follow-up gaps that still produce the exact TemplateError: 'dict object' has no attribute 'X' symptom from #105.

Gap A — Field-level precision

When an agent declares input: [a.output.foo], the engine (engine/context.py:_add_agent_input) copies only the foo field into the agent's ctx. A prompt that then references {{ a.output.bar }} fails at runtime — same 'dict object' has no attribute 'bar' error the issue body quotes. The validator previously tracked declared agents as a flat set[str] of names, so this slipped through.

Fix: track declared fields per root as dict[str, set[str | None]] (where None in the set means "whole-output declared, any field allowed"), and emit a warning naming the missing field when a template references a field not in the declared set. Same logic applies to static parallel groups (pg.outputs.member.field — the engine field-slices on len ≥ 3). For-each groups are intentionally skipped because the engine copies whole members regardless of .field suffix (elif is_for_each_dict or len(remaining_parts) == 2), so field-precision warnings would be false positives there.
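
A minimal sketch of the per-root tracking described above, not the validator's actual code. Only the `dict[str, set[str | None]]` shape and the meaning of `None` come from this PR; the helper names `collect_declared` and `field_warnings`, the `?`-stripping, and the message wording are assumptions made for illustration.

```python
from __future__ import annotations


def collect_declared(inputs: list[str]) -> dict[str, set[str | None]]:
    """Map each agent name to the output fields declared for it.

    None in a set means the whole output was declared ("a.output").
    """
    declared: dict[str, set[str | None]] = {}
    for ref in inputs:
        # Assumed handling of the optional "?" suffix on a declaration.
        parts = ref.rstrip("?").split(".")
        if len(parts) < 2 or parts[1] != "output":
            continue  # not an agent.output reference, e.g. workflow.input.prompt
        field = parts[2] if len(parts) >= 3 else None
        declared.setdefault(parts[0], set()).add(field)
    return declared


def field_warnings(
    declared: dict[str, set[str | None]], refs: list[tuple[str, str]]
) -> list[str]:
    """Warn for each referenced (agent, field) pair the declarations do not cover."""
    warnings = []
    for agent, field in refs:
        fields = declared.get(agent)
        if fields is None:
            warnings.append(
                f"references '{agent}.output.{field}' but '{agent}.output' is not declared"
            )
        elif None not in fields and field not in fields:
            # Only specific fields were declared, and this one is not among them.
            declared_list = ", ".join(f"{agent}.output.{f}" for f in sorted(fields))
            warnings.append(
                f"references '{agent}.output.{field}' but only declares {declared_list}"
            )
    return warnings
```

With `input: [a.output.foo, c.output]`, a reference to `a.output.bar` produces a warning, while any field on `c.output` passes because the whole output was declared.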

Gap B — Script/sub-workflow `agent.output` exclusion was too broad

agent.type not in ("script", "workflow") previously suppressed both workflow.input and agent.output warnings. But the engine's _LOCAL_RENDER_AGENT_TYPES carve-out only populates workflow.input for these types — agent.output references still raise KeyError in _add_explicit_input if undeclared.

Fix: split the gate. Keep the script/workflow/human_gate exclusion for workflow.input warnings only; for agent.output warnings, only exclude human_gate (whose prompts render in accumulate mode).
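
The split gate can be pictured as two independent exclusion sets instead of one shared condition. This is a sketch under assumptions: the constant names and the `checks_for` helper are hypothetical, and only the membership of each set is taken from the description above.

```python
from __future__ import annotations

# Hypothetical names; the real validator may structure this differently.
SKIP_WORKFLOW_INPUT_CHECK = frozenset({"script", "workflow", "human_gate"})
SKIP_AGENT_OUTPUT_CHECK = frozenset({"human_gate"})


def checks_for(agent_type: str) -> dict[str, bool]:
    """Return which explicit-mode warnings apply to an agent of this type."""
    return {
        "workflow_input": agent_type not in SKIP_WORKFLOW_INPUT_CHECK,
        "agent_output": agent_type not in SKIP_AGENT_OUTPUT_CHECK,
    }
```

Under this sketch a `script` agent skips the `workflow.input` check but is still checked for undeclared `agent.output` references, which is the case the old single condition missed.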

Other improvements

_extract_template_refs returns a typed TemplateRefs NamedTuple carrying field detail (agent_output_fields, group_member_fields, group_error_refs) alongside the existing flat agent_refs set.
AST walker filters inner-link Getattr nodes — for {{ a.output.bar }}, only the outermost Getattr (attrs=["output", "bar"]) is processed. The inner Getattr (attrs=["output"]) is the .node of the outer, so emitting it separately would falsely register a whole-output reference and suppress the field-precision warning.
AST walker detects method-call Getattr nodes (those that are the callee of a Call). {% for k,v in a.output.items() %} now registers as a whole-output ref rather than a field ref to items — eliminates a class of false positives.
human_gate agents added to the explicit-mode exclusion list — fixes a pre-existing false-positive warning. Gate prompts render via context.get_for_template() which forces mode='accumulate', so explicit-mode declarations are not required.
Documented limitations in _extract_template_refs: bracket access (a.output["x"]) and dynamic field access (a.output[var]) are intentionally not detected.
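
The two AST-walker rules above (skip inner-link Getattr nodes, treat a called Getattr as a whole-output ref) can be sketched without Jinja2 itself. The `Name`, `Getattr`, and `Call` classes below are simplified stand-ins that only mirror the node/attr nesting of the Jinja2 node types; they and `extract_output_refs` are illustrative assumptions, not the code in `_extract_template_refs`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Name:
    name: str


@dataclass(frozen=True)
class Getattr:
    node: "Expr"  # the expression the attribute is read from
    attr: str


@dataclass(frozen=True)
class Call:
    node: "Expr"  # the callee


Expr = Union[Name, Getattr, Call]


def _chain(node: Expr) -> tuple[str, list[str]] | None:
    """Flatten Getattr(Getattr(Name('a'), 'output'), 'bar') to ('a', ['output', 'bar'])."""
    attrs: list[str] = []
    while isinstance(node, Getattr):
        attrs.append(node.attr)
        node = node.node
    return (node.name, attrs[::-1]) if isinstance(node, Name) else None


def extract_output_refs(expr: Expr) -> set[tuple[str, str | None]]:
    """Return (agent, field) pairs; field is None for a whole-output reference."""
    refs: set[tuple[str, str | None]] = set()

    def visit(node: Expr, inner_link: bool = False, callee: bool = False) -> None:
        if isinstance(node, Call):
            visit(node.node, callee=True)
        elif isinstance(node, Getattr):
            if not inner_link:  # only the outermost Getattr of a chain is processed
                flat = _chain(node)
                if flat is not None:
                    root, attrs = flat
                    if callee:
                        attrs = attrs[:-1]  # drop the method name, e.g. "items"
                    if attrs[:1] == ["output"]:
                        refs.add((root, attrs[1] if len(attrs) > 1 else None))
            visit(node.node, inner_link=True)

    visit(expr)
    return refs
```

For `{{ a.output.bar }}` this yields only `("a", "bar")`, and for `a.output.items()` it yields `("a", None)`, so neither a spurious whole-output ref nor a field ref to `items` is recorded.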

Example output

Issue #105's exact example:

⚠ agent 'intake' prompt references 'workflow.input.prompt' but agent 'intake'
  does not declare 'workflow.input.prompt' in its input: list (explicit context mode)

Gap A (new):

⚠ agent 'b' args[0] references 'a.output.bar' but agent 'b' only declares
  a.output.foo in its input: list (explicit context mode)

Gap B (newly caught):

⚠ agent 'scripty' args[0] references 'a.output' but agent 'scripty' does not
  declare 'a.output' in its input: list (explicit context mode)

Testing

14 new tests across TestExtractTemplateRefs, TestExplicitModeWarnings (script/sub-workflow agent.output, sub-workflow workflow.input carve-out regression, human_gate false-positive regression), and a new TestExplicitModeFieldPrecision class (field precision, whole-output declarations, method-call filter, static parallel precision, for-each precision skip, sub-workflow input_mapping field precision).
Existing tests updated for the new NamedTuple return shape.
2640 tests pass, 15 skipped (full repo suite, -m "not performance").
make lint ✅
make typecheck: only the pre-existing dialog_evaluator.py diagnostic (verified by stashing and re-running on main).
All examples/*.yaml continue to validate without unexpected warnings.

Decisions that may need a closer look

Severity: kept as warning per #105's framing, even for field-mismatch cases where runtime failure is essentially certain. Rationale: consistency, and templates can still legitimately use guards/defaults.
TemplateRefs is a typed NamedTuple, not a dataclass — chosen for slightly cleaner tuple-style unpacking in tests and zero runtime overhead. No behavior difference.
Dict-method filter is belt-and-suspenders — the primary defense is the Call-node filter (handles a.output.items()); _DICT_METHOD_NAMES covers the rarer case of referencing a method without calling it ({{ a.output.items }} as a value).
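
As a rough illustration of the secondary dict-method filter, the sketch below maps a trailing attribute that is a `dict` method name to a whole-output reference. The contents of `_DICT_METHOD_NAMES` here are derived from `dir(dict)` purely as an assumption (the real constant may list names explicitly), and `normalize_field` is a hypothetical helper.

```python
from __future__ import annotations

# Assumption: every public dict method name counts. The real set may differ.
_DICT_METHOD_NAMES = frozenset(n for n in dir(dict) if not n.startswith("_"))


def normalize_field(field: str | None) -> str | None:
    """Treat a dict-method attribute (e.g. "items") as a whole-output reference."""
    return None if field in _DICT_METHOD_NAMES else field
```

One trade-off of a filter like this: an output field that happens to share a name with a dict method (for example `values`) would be treated as a whole-output reference and would not get a field-precision warning.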

Suggested commit on merge: feat(validate): warn on undeclared agent.output refs and field-level mismatches in explicit mode (#NNN)

…mismatches in explicit mode

PR #125 added warnings for undeclared workflow.input/agent.output refs in explicit-mode prompts. Two follow-up gaps still produced the same runtime `TemplateError: 'dict object' has no attribute 'X'` that issue #105 reports. This change also fixes one false-positive and one false-negative class that landed alongside the original PR.

Gap A: Field-level precision

When an agent declares `input: [a.output.foo]`, the engine's `_add_agent_input` only copies the `foo` field into ctx. A prompt that references `{{ a.output.bar }}` then fails at runtime. The validator previously only tracked agent NAMES in `declared_agents`, not specific fields. Now `_validate_template_references` tracks declared fields per root (`declared_agent_output_fields: dict[str, set[str | None]]`) and emits a warning naming the missing field. Same logic applies to static parallel groups (`pg.outputs.member.field`); for-each groups are skipped because the engine's whole-member copy behavior makes field-precision warnings a false positive there.

Gap B: Script/sub-workflow exclusion was too broad

The condition `agent.type not in ("script", "workflow")` suppressed BOTH `workflow.input` AND `agent.output` warnings for these types. The `_LOCAL_RENDER_AGENT_TYPES` carve-out in the engine only populates `workflow.input` for them; `agent.output` references still require declaration. Split the gate: keep the script/workflow exclusion for `workflow.input` only; for `agent.output`, only exclude `human_gate` (whose prompts render in accumulate mode via `get_for_template()`).

Namespace separation (review-driven fix)

`INPUT_REF_PATTERN` previously didn't capture whether a group reference was for `.outputs` or `.errors`. This caused:

- `input: ["pg.errors"]` to silently suppress warnings for `{{ pg.outputs.a.val }}` references (false negative)
- `input: ["pg.errors.a.foo"]` to emit warnings saying "only declares pg.outputs.a.foo" (misleading message)

The engine populates only the declared namespace into ctx, so a runtime KeyError would follow in both cases. Pattern now captures `pg_kind` (outputs|errors); declared-tracking uses three independent accumulators keyed by namespace.

Other changes

- `_extract_template_refs` returns a typed `TemplateRefs` NamedTuple carrying field detail (`agent_output_fields`, `group_member_fields`, `group_error_refs`) alongside the existing flat `agent_refs` set.
- AST walker filters inner-link Getattr nodes so `{{ a.output.bar }}` doesn't emit BOTH a `bar`-field ref AND a spurious whole-output ref from the inner Getattr in the same chain.
- AST walker detects method-call Getattr nodes (callee of a Call node) so `{% for k,v in a.output.items() %}` registers as a whole-output reference rather than a field ref to `items`.
- `_extract_template_refs` also catches `TemplateAssertionError` from `meta.find_undeclared_variables` (e.g., duplicate `{% block %}` names) so validation no longer hard-fails on templates that parse but semantically conflict.
- `human_gate` agents added to the explicit-mode exclusion list. This fixes a pre-existing false-positive warning for gate prompts (engine renders them with full accumulated context via `get_for_template()`).
- Documented limitations: bracket access (`a.output["x"]`) and dynamic field access (`a.output[var]`) are not detected.

Tests

- 27 new tests across `TestExtractTemplateRefs`, `TestInputRefPatternExtensions`, `TestExplicitModeWarnings`, and new `TestExplicitModeFieldPrecision`. Covers field precision (including whole-output vs specific-field), namespace separation (outputs vs errors), method-call filter, prefix-chain dedup, optional `?` suffix, human_gate / script / sub-workflow carve-outs, static parallel matrix (whole-group / whole-member / exact-field), for-each precision skip, and the `TemplateAssertionError` graceful-degradation path.
- Existing tests updated for the new NamedTuple return shape.
- Full suite (2653 tests) passes; lint clean; all `examples/*.yaml` still validate without unexpected warnings.

Refs #105

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

jrob5756 · 2026-05-18T20:06:05Z

Review-driven update

I ran a multi-agent review (code-review, test-analyzer, silent-failure-hunter, type-design-analyzer, code-simplifier) before flagging this PR for human review. Two agents independently caught the same real bug I had missed, plus a related TemplateAssertionError leak. Force-pushed 249423d to address them along with smaller polish.

Bugs fixed (verified at runtime, not just on paper)

pg.errors / pg.outputs namespace conflation in the declared-input tracking. INPUT_REF_PATTERN used a non-capturing (?:outputs|errors) group, so declaration-side tracking merged the two namespaces. This caused two symptoms:
- False negative: input: ["pg.errors"] + prompt: "{{ pg.outputs.a.val }}" → no warning. The runtime would KeyError on "outputs".
- Misleading message: input: ["pg.errors.a.foo"] + prompt: "{{ pg.outputs.a.bar }}" → warning text said "only declares pg.outputs.a.foo" — but the user declared pg.errors.a.foo, not pg.outputs.a.foo.
Fixed by adding a pg_kind named capture and splitting tracking into three independent accumulators (declared_agent_output_fields, declared_group_output_member_fields + declared_groups_with_outputs, declared_group_errors). Iteration in _validate_template_references now goes over the structured TemplateRefs maps directly (one block per namespace) so the kinds can't leak across.
TemplateAssertionError leak: _extract_template_refs only caught TemplateSyntaxError. Templates that parse but trip a semantic check during meta.find_undeclared_variables (e.g., duplicate {% block %} names) would hard-fail validation. Now also caught — render-time will surface the precise error if the workflow actually runs.
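
To illustrate the namespace fix, here is a simplified stand-in for the group part of `INPUT_REF_PATTERN`. Only the `pg_kind` group name comes from this PR; the rest of the pattern, the other group names, and the `classify` helper are assumptions, and the sketch covers just the two group namespaces (the agent-output accumulator is omitted).

```python
from __future__ import annotations

import re

# Hypothetical, simplified pattern. The capturing pg_kind group replaces a
# non-capturing (?:outputs|errors), so the namespace survives matching.
GROUP_REF = re.compile(
    r"^(?P<group>\w+)\.(?P<pg_kind>outputs|errors)"
    r"(?:\.(?P<member>\w+))?(?:\.(?P<field>\w+))?\??$"
)


def classify(inputs: list[str]) -> dict[str, set[tuple[str, str | None, str | None]]]:
    """Accumulate group declarations separately per namespace."""
    declared: dict[str, set[tuple[str, str | None, str | None]]] = {
        "outputs": set(),
        "errors": set(),
    }
    for ref in inputs:
        m = GROUP_REF.match(ref)
        if m:  # agent.output and workflow.input refs do not match this pattern
            declared[m["pg_kind"]].add((m["group"], m["member"], m["field"]))
    return declared
```

Because each declaration lands in exactly one accumulator, `input: ["pg.errors"]` can no longer satisfy a `pg.outputs.*` reference, and a warning can name the namespace the user actually declared.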

Test additions (13 new, total 27 new on the PR)

Namespace separation matrix: three tests asserting that outputs and errors declarations don't suppress warnings for the other namespace, plus an explicit regression test that the message never reads "only declares pg.outputs.a.foo" when the declaration was for pg.errors.a.foo.
Static parallel matrix: whole-group / whole-member / exact-field / bare-member-with-specific-decl shapes.
Optional ? suffix in explicit-mode declarations.
Mixed bare + specific extractor case: {{ a.output }} {{ a.output.foo }} exercises both prefix-chain dedup paths simultaneously.
Method-call filter on specific-field decl — the previous test used a whole-output decl which didn't actually exercise the filter logic (test-analyzer's catch).
pg_kind regex capture regression.
TemplateAssertionError graceful-degradation path.

Code polish (from code-simplifier suggestions)

Track group-outputs declaration set during the parsing pass instead of recomputing {g for (g, _) in declared_group_output_member_fields} per template iteration.
Hoist declared_list join out of the field-precision inner loop.
Split compound guards in the group-output block to mirror the agent-output block style.
Skipped: helper extraction for the three "unknown agent" error blocks (modest savings, light risk) and field-name renaming on TemplateRefs (touches the NamedTuple shape — could be a follow-up).

Verification

151 validator/CLI tests pass (up from 138).
Full suite: 2653 passed, 15 skipped.
make lint clean.
All examples/*.yaml validate without unexpected warnings.
Both runtime-failure scenarios from issue #105 are now caught at validation time with informative messages.

Diff vs previous push:

src/conductor/config/validator.py   | +/- changes
tests/test_config/test_validator.py | 13 new tests

jrob5756 force-pushed the feat/105-explicit-mode-undeclared-input-warning branch from 970033c to 249423d on May 18, 2026 20:05

jrob5756 merged commit 8ec298d into main May 18, 2026
9 checks passed

jrob5756 deleted the feat/105-explicit-mode-undeclared-input-warning branch May 18, 2026 20:39

jrob5756 merged 1 commit into main from feat/105-explicit-mode-undeclared-input-warning
