feat(validate): warn on undeclared agent.output refs and field-level mismatches in explicit mode#208
Conversation
…mismatches in explicit mode PR #125 added warnings for undeclared workflow.input/agent.output refs in explicit-mode prompts. Two follow-up gaps still produced the same runtime `TemplateError: 'dict object' has no attribute 'X'` that issue #105 reports. This change also fixes one false-positive and one false-negative class that landed alongside the original PR. Gap A — Field-level precision When an agent declares `input: [a.output.foo]`, the engine's `_add_agent_input` only copies the `foo` field into ctx. A prompt that references `{{ a.output.bar }}` then fails at runtime. The validator previously only tracked agent NAMES in `declared_agents`, not specific fields. Now `_validate_template_references` tracks declared fields per root (`declared_agent_output_fields: dict[str, set[str | None]]`) and emits a warning naming the missing field. Same logic applies to static parallel groups (`pg.outputs.member.field`); for-each groups are skipped because the engine's whole-member copy behavior makes field-precision warnings a false positive there. Gap B — Script/sub-workflow exclusion was too broad The condition `agent.type not in ("script", "workflow")` suppressed BOTH `workflow.input` AND `agent.output` warnings for these types. The `_LOCAL_RENDER_AGENT_TYPES` carve-out in the engine only populates `workflow.input` for them — `agent.output` references still require declaration. Split the gate: keep the script/workflow exclusion for `workflow.input` only; for `agent.output`, only exclude `human_gate` (whose prompts render in accumulate mode via `get_for_template()`). Namespace separation (review-driven fix) `INPUT_REF_PATTERN` previously didn't capture whether a group reference was for `.outputs` or `.errors`. This caused: - `input: ["pg.errors"]` to silently suppress warnings for `{{ pg.outputs.a.val }}` references (false negative) - `input: ["pg.errors.a.foo"]` to emit warnings saying "only declares pg.outputs.a.foo" (misleading message) The engine populates only the declared namespace into ctx, so a runtime KeyError would follow in both cases. Pattern now captures `pg_kind` (outputs|errors); declared-tracking uses three independent accumulators keyed by namespace. Other changes - `_extract_template_refs` returns a typed `TemplateRefs` NamedTuple carrying field detail (`agent_output_fields`, `group_member_fields`, `group_error_refs`) alongside the existing flat `agent_refs` set. - AST walker filters inner-link Getattr nodes so `{{ a.output.bar }}` doesn't emit BOTH a `bar`-field ref AND a spurious whole-output ref from the inner Getattr in the same chain. - AST walker detects method-call Getattr nodes (callee of a Call node) so `{% for k,v in a.output.items() %}` registers as a whole-output reference rather than a field ref to `items`. - `_extract_template_refs` also catches `TemplateAssertionError` from `meta.find_undeclared_variables` (e.g., duplicate `{% block %}` names) so validation no longer hard-fails on templates that parse but semantically conflict. - `human_gate` agents added to the explicit-mode exclusion list — fixes a pre-existing false-positive warning for gate prompts (engine renders them with full accumulated context via `get_for_template()`). - Documented limitations: bracket access (`a.output["x"]`) and dynamic field access (`a.output[var]`) are not detected. Tests - 27 new tests across `TestExtractTemplateRefs`, `TestInputRefPatternExtensions`, `TestExplicitModeWarnings`, and new `TestExplicitModeFieldPrecision`. Covers field precision (including whole-output vs specific-field), namespace separation (outputs vs errors), method-call filter, prefix-chain dedup, optional `?` suffix, human_gate / script / sub-workflow carve-outs, static parallel matrix (whole-group / whole-member / exact-field), for-each precision skip, and the `TemplateAssertionError` graceful-degradation path. - Existing tests updated for the new NamedTuple return shape. - Full suite (2653 tests) passes; lint clean; all `examples/*.yaml` still validate without unexpected warnings. Refs #105 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
970033c to
249423d
Compare
Review-driven updateI ran a multi-agent review (code-review, test-analyzer, silent-failure-hunter, type-design-analyzer, code-simplifier) before flagging this PR for human review. Two agents independently caught the same real bug I had missed, plus a related TemplateAssertionError leak. Force-pushed Bugs fixed (verified at runtime, not just on paper)
Test additions (13 new, total 27 new on the PR)
Code polish (from code-simplifier suggestions)
Verification
Diff vs previous push: |
Closes #105.
PR #125 added explicit-mode warnings for undeclared
workflow.input/agent.outputreferences. While building on top of it, I confirmed (by running real workflows that pass validation and crash at runtime) two follow-up gaps that still produce the exactTemplateError: 'dict object' has no attribute 'X'symptom from #105.Gap A — Field-level precision
When an agent declares
input: [a.output.foo], the engine (engine/context.py:_add_agent_input) copies only thefoofield into the agent's ctx. A prompt that then references{{ a.output.bar }}fails at runtime — same'dict object' has no attribute 'bar'error the issue body quotes. The validator previously tracked declared agents as a flatset[str]of names, so this slipped through.Fix: track declared fields per root as
dict[str, set[str | None]](whereNonein the set means "whole-output declared, any field allowed"), and emit a warning naming the missing field when a template references a field not in the declared set. Same logic applies to static parallel groups (pg.outputs.member.field— the engine field-slices onlen ≥ 3). For-each groups are intentionally skipped because the engine copies whole members regardless of.fieldsuffix (elif is_for_each_dict or len(remaining_parts) == 2), so field-precision warnings would be false positives there.Gap B — Script/sub-workflow
agent.outputexclusion was too broadagent.type not in ("script", "workflow")previously suppressed bothworkflow.inputandagent.outputwarnings. But the engine's_LOCAL_RENDER_AGENT_TYPEScarve-out only populatesworkflow.inputfor these types —agent.outputreferences still raiseKeyErrorin_add_explicit_inputif undeclared.Fix: split the gate. Keep the script/workflow/human_gate exclusion for
workflow.inputwarnings only; foragent.outputwarnings, only excludehuman_gate(whose prompts render in accumulate mode).Other improvements
_extract_template_refsreturns a typedTemplateRefsNamedTuple carrying field detail (agent_output_fields,group_member_fields,group_error_refs) alongside the existing flatagent_refsset.{{ a.output.bar }}, only the outermost Getattr (attrs=["output", "bar"]) is processed. The inner Getattr (attrs=["output"]) is the.nodeof the outer, so emitting it separately would falsely register a whole-output reference and suppress the field-precision warning.Call).{% for k,v in a.output.items() %}now registers as a whole-output ref rather than a field ref toitems— eliminates a class of false positives.human_gateagents added to the explicit-mode exclusion list — fixes a pre-existing false-positive warning. Gate prompts render viacontext.get_for_template()which forcesmode='accumulate', so explicit-mode declarations are not required._extract_template_refs: bracket access (a.output["x"]) and dynamic field access (a.output[var]) are intentionally not detected.Example output
Issue #105's exact example:
Gap A (new):
Gap B (newly caught):
Testing
TestExtractTemplateRefs,TestExplicitModeWarnings(script/sub-workflowagent.output, sub-workflowworkflow.inputcarve-out regression, human_gate false-positive regression), and a newTestExplicitModeFieldPrecisionclass (field precision, whole-output declarations, method-call filter, static parallel precision, for-each precision skip, sub-workflowinput_mappingfield precision).-m "not performance").make lint✅make typecheck: only the pre-existingdialog_evaluator.pydiagnostic (verified by stashing and re-running onmain).examples/*.yamlcontinue to validate without unexpected warnings.Decisions that may need a closer look
dataclass— chosen for slightly cleaner tuple-style unpacking in tests and zero runtime overhead. No behavior difference.a.output.items());_DICT_METHOD_NAMEScovers the rarer case of referencing a method without calling it ({{ a.output.items }}as a value).Suggested commit on merge:
feat(validate): warn on undeclared agent.output refs and field-level mismatches in explicit mode (#NNN)