Batch workflows fail through MCP server with unresolved template variables #79

@spinje

Summary

Batch workflows that execute successfully through the CLI fail through the MCP server with the error "Unresolved variables in parameter 'prompt':", and no variable names follow the colon. The MCP execution path skips template validation, and that validation has a critical side effect: it registers the batch context variables (item, __index__) that batch nodes depend on.

Steps to Reproduce

  1. Save any workflow with batch LLM nodes (e.g., release-announcements, whose prompts contain ${item.platform} and ${item.format_rules})

  2. Execute via CLI — works:

uv run pflow release-announcements version=0.5.0 changelog_path=./CHANGELOG.md slack_channel=test discord_channel_id=1234
# ✓ Workflow completed in 51.078s — all 8 nodes succeed, all 3 batch items resolve

  3. Execute via MCP tool (fails):

workflow_execute(workflow="release-announcements", parameters={
    "version": "0.5.0",
    "changelog_path": "./CHANGELOG.md",
    "slack_channel": "test",
    "discord_channel_id": "1234"
})
# ❌ Unresolved variables in parameter 'prompt':
#
# Available context keys:
#   • item (dict)          ← item IS available, but templates fail anyway
#   • version (str): 0.5.0
#   • extract-changelog (dict)
#   ... etc

Note: The error message lists item (dict) among the available context keys yet still reports unresolved variables, and it does not say which variables are unresolved (the text after the colon is empty).
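
To illustrate what a more informative message could look like, here is a minimal sketch that names each unresolved reference. It is not pflow's resolver: find_unresolved() and format_error() are hypothetical helpers, and the only thing taken from this report is the ${path.to.value} template form.

```python
import re

# Assumption: templates use the ${path.to.value} form seen in the prompts above.
TEMPLATE_REF = re.compile(r"\$\{([^}]+)\}")


def find_unresolved(text, context):
    """Return the template references in `text` that `context` cannot resolve."""
    unresolved = []
    for ref in TEMPLATE_REF.findall(text):
        current = context
        for part in ref.split("."):
            if isinstance(current, dict) and part in current:
                current = current[part]
            else:
                unresolved.append(ref)
                break
    return unresolved


def format_error(param, unresolved):
    """Build an error line that names every unresolved reference."""
    names = ", ".join("${" + ref + "}" for ref in unresolved)
    return "Unresolved variables in parameter '" + param + "': " + names


prompt = "Announce ${version} on ${item.platform} following ${item.format_rules}"
context = {"version": "0.5.0", "item": {"platform": "slack"}}
print(format_error("prompt", find_unresolved(prompt, context)))
# prints: Unresolved variables in parameter 'prompt': ${item.format_rules}
```

With names in the message, a reader could tell right away whether the failing reference is a batch variable or something else.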

Root Cause

The CLI and MCP execution paths diverge in how they handle validation before execution. Template validation has a critical side effect: it registers batch context variables (item, __index__) via _extract_node_outputs() in template_validator.py:795-851. The MCP path never triggers this registration.

CLI path (works):

  1. cli/main.py:2181-2182 — calls _validate_before_execution() when auto_repair=False
  2. _validate_before_execution() calls WorkflowValidator.validate() with execution_params
  3. This triggers template validation in template_validator.py
  4. _extract_node_outputs() at lines 795-851 detects batch config and registers:
    • item (or custom alias from batch.as) as an available variable
    • __index__ as an available variable
  5. Template validation passes — ${item.platform}, ${item.format_rules} etc. are recognized
  6. Execution proceeds with validate=False (already validated) — batch templates resolve at runtime

MCP path (fails):

  1. mcp_server/services/execution_service.py:256 — calls execute_workflow() with enable_repair=False
  2. workflow_execution.py:611-620 — the enable_repair=False branch skips all validation
  3. Calls executor.execute_workflow() with validate=False
  4. executor_service.py:107 passes validate=False to compile_ir_to_flow()
  5. compiler.py:1236-1242 — skips template validation entirely when validate=False
  6. Batch variables (item, __index__) are never registered
  7. Template resolution encounters ${item.platform} and fails

The architectural issue

Template validation in _extract_node_outputs() performs two distinct roles:

  1. Validation — checking that template references are valid (skippable)
  2. Context registration — registering batch variables that runtime resolution depends on (not skippable)

These are conflated. The validate=False flag disables both, but only validation (role 1) should be skippable. Context registration (role 2) is a prerequisite for batch execution, not an optional validation step.
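
The conflation can be modeled with a toy example. This is an illustrative sketch, not pflow's code: extract_node_outputs() and compile_workflow() are simplified stand-ins, and the IR shape (a "nodes" list whose entries carry an optional "batch" dict with an "as" alias) is an assumption based on the description above.

```python
def extract_node_outputs(node, available):
    """Toy stand-in for _extract_node_outputs(): records names templates may use."""
    batch = node.get("batch")
    if batch is not None:
        # Side effect described in this issue: batch variables are registered here.
        available.add(batch.get("as", "item"))
        available.add("__index__")
    available.add(node["id"])


def compile_workflow(ir, validate):
    """Toy compile step: registration only happens inside the validation branch."""
    available = set(ir.get("inputs", []))
    if validate:
        for node in ir["nodes"]:
            extract_node_outputs(node, available)
    return available


ir = {
    "inputs": ["version"],
    "nodes": [{"id": "announce", "batch": {"as": "item"}}],
}
print(sorted(compile_workflow(ir, validate=True)))
# prints: ['__index__', 'announce', 'item', 'version']
print(sorted(compile_workflow(ir, validate=False)))
# prints: ['version']
```

In this model, turning validation off also removes item and __index__ from the set of known names, which mirrors the behavior reported for the MCP path.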

Key files

| File | Lines | Role |
|---|---|---|
| src/pflow/cli/main.py | 2081-2131, 2181-2182 | CLI pre-validation (has _validate_before_execution()) |
| src/pflow/execution/workflow_execution.py | 611-620 | MCP branch skips validation entirely |
| src/pflow/execution/executor_service.py | 103-110 | Passes validate to compiler |
| src/pflow/runtime/compiler.py | 1236-1242 | Skips template validation when validate=False |
| src/pflow/runtime/template_validator.py | 795-851 | _extract_node_outputs() registers batch variables |
| src/pflow/mcp_server/services/execution_service.py | 256-264 | MCP entry; enable_repair=False always |

Possible fixes

Option A: MCP runs pre-validation too
Add a validation step in the MCP execution path equivalent to CLI's _validate_before_execution(). This ensures batch context is registered before compilation.
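
A rough sketch of the shape Option A might take, assuming the MCP service can be handed a validator and an executor. The validate and execute callables here are hypothetical stand-ins, not pflow's real signatures; the stubs exist only so the sketch runs on its own.

```python
def run_with_prevalidation(workflow_ir, params, validate, execute):
    """Validate first, then execute with validation disabled, as the CLI path does.

    `validate` and `execute` are hypothetical callables standing in for the
    validator and executor; they are not pflow's real signatures.
    """
    errors = validate(workflow_ir, params)
    if errors:
        return {"success": False, "errors": errors}
    return execute(workflow_ir, params, validate=False)


# Stub collaborators so the sketch is self-contained.
def fake_validate(workflow_ir, params):
    missing = [name for name in workflow_ir.get("inputs", []) if name not in params]
    return ["Missing input: " + name for name in missing]


def fake_execute(workflow_ir, params, validate):
    return {"success": True, "validated_again": validate}


ir = {"inputs": ["version"], "nodes": []}
print(run_with_prevalidation(ir, {"version": "0.5.0"}, fake_validate, fake_execute))
# prints: {'success': True, 'validated_again': False}
print(run_with_prevalidation(ir, {}, fake_validate, fake_execute))
# prints: {'success': False, 'errors': ['Missing input: version']}
```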

Option B: Separate context registration from validation
Extract batch variable registration out of template validation into a standalone step that always runs during compilation, regardless of the validate flag. This is the cleaner architectural fix — context setup shouldn't be gated behind a validation flag.
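
One way Option B could be structured, as a sketch under stated assumptions: register_batch_context(), validate_templates(), and compile_ir() are hypothetical names, and the IR shape (nodes with an optional "batch" dict and a "prompt" string) is assumed for illustration rather than taken from pflow.

```python
import re

# Assumption: templates use the ${path.to.value} form.
TEMPLATE_REF = re.compile(r"\$\{([^}]+)\}")


def register_batch_context(node, scope):
    """Always-on step: make batch variables visible to the node's templates."""
    batch = node.get("batch")
    if batch is not None:
        scope.add(batch.get("as", "item"))
        scope.add("__index__")


def validate_templates(node, scope):
    """Optional step: report references whose root name is not in scope."""
    errors = []
    for ref in TEMPLATE_REF.findall(node.get("prompt", "")):
        if ref.split(".")[0] not in scope:
            errors.append(node["id"] + ": unknown reference ${" + ref + "}")
    return errors


def compile_ir(ir, validate=True):
    """Return (per-node scopes, validation errors)."""
    available = set(ir.get("inputs", []))
    scopes, errors = {}, []
    for node in ir["nodes"]:
        scope = set(available)
        register_batch_context(node, scope)  # not gated by `validate`
        if validate:
            errors.extend(validate_templates(node, scope))
        scopes[node["id"]] = scope
        available.add(node["id"])  # later nodes may reference this node's output
    return scopes, errors


ir = {
    "inputs": ["version"],
    "nodes": [{"id": "announce", "batch": {"as": "item"},
               "prompt": "Post ${version} to ${item.platform}"}],
}
scopes, errors = compile_ir(ir, validate=False)
print(sorted(scopes["announce"]))
# prints: ['__index__', 'item', 'version']
```

Registration runs on every compile, while the validate flag only controls whether references are checked against the resulting scope.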

Option C: Always validate for batch workflows
Detect batch nodes in the workflow IR and force validate=True when batch config is present. This is a quick fix, but it does not address the underlying conflation.
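
Option C could be as small as the following sketch. The helper names has_batch_nodes() and effective_validate() are hypothetical, and the IR shape (a "nodes" list whose entries may carry a "batch" key) is an assumption.

```python
def has_batch_nodes(ir):
    """Return True if any node in the workflow IR carries batch config."""
    return any(node.get("batch") is not None for node in ir.get("nodes", []))


def effective_validate(ir, validate):
    """Force validation on whenever the workflow contains batch nodes."""
    return validate or has_batch_nodes(ir)


batch_ir = {"nodes": [{"id": "announce", "batch": {"as": "item"}}]}
linear_ir = {"nodes": [{"id": "fetch"}, {"id": "summarize"}]}

print(effective_validate(batch_ir, validate=False))   # prints: True
print(effective_validate(linear_ir, validate=False))  # prints: False
```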

Impact

  • All batch workflows fail through MCP — any workflow with batch config on a node
  • Non-batch workflows are unaffected — simple linear chains work fine through MCP
  • CLI is unaffected — has the pre-validation step
  • Pre-existing bug — exists on main, not caused by the markdown format migration (verified: parsed IR is byte-for-byte identical between JSON and markdown formats)

Environment

  • pflow version: development (pre-0.8.0)
  • Python: 3.10+
  • Discovered during Task 107 (markdown workflow format) integration testing
