Skip to content

Enhance robustness of plan-execute argument passing #231

@ShuxinLin

Description

@ShuxinLin

Summary

The plan-execute workflow has several silent failure modes in argument passing between steps. When things go wrong, errors are swallowed and empty dicts are returned — making debugging very hard and causing downstream tool calls to fail with missing-parameter errors rather than a clear root cause.

Identified Issues

1. Silent JSON parse failure in planner (planner.py:77-78)

When the LLM produces a malformed #ArgsN: line, json.loads fails and the args silently become {}. No warning is logged.

except json.JSONDecodeError:
    args[n] = {}  # completely silent

2. _has_placeholders() only checks top-level string values (executor.py:184-189)

Nested placeholders inside dicts or lists are not detected, so those steps bypass LLM resolution and call the tool with raw {step_N} strings.

3. Missing step references silently ignored in context (executor.py:221-224)

If a step references {step_N} but step N failed or doesn't exist, it's silently dropped from the context fed to the LLM. The LLM then receives incomplete context and may produce wrong or empty resolved values.

context_text = "\n".join(
    f"Step {n}: {context[n].response}"
    for n in sorted(referenced)
    if n in context          # missing step → no context, no warning
)

4. Unparseable LLM resolution response silently becomes {} (executor.py:264-265)

If _parse_json can't extract valid JSON from the LLM's resolution response, it returns {}. The resolved args dict then lacks all placeholder-sourced keys — the subsequent tool call fails with a confusing missing-parameter error.

5. No retry or fallback for LLM resolution call (executor.py:236)

A single LLM call resolves all placeholders. Transient LLM failures or empty responses cascade directly to tool failure with no retry.

Proposed Fix

  • Log warnings at all silent fallback sites (planner.py, executor.py) so failures are visible in logs.
  • Recurse into nested structures in _has_placeholders() and _resolve_args_with_llm() to handle dicts/lists containing placeholders.
  • Warn when referenced steps are missing from context, and surface this in the StepResult.error field.
  • Warn when _parse_json falls back to {}, including the raw LLM output (truncated) for diagnostics.
  • Add a single retry for the LLM resolution call on empty/unparseable response.

Impact

These are all silent data-loss bugs in the hot path of every multi-step plan. Fixing them improves debuggability and correctness for any plan with inter-step argument dependencies.

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions