Orchestrator Context Explosion: Unbounded Accumulation of Agent Responses #202

@StuartLeitch

Description

Problem Statement

The FastAgent orchestrator accumulates all text content from every agent response (including tool calls, raw tool results, and intermediate processing steps) and passes this ever-growing context to each subsequent sub-agent. This design causes context to grow as O(N) in the number of plan steps for the orchestrator and as O(N²) for workers with history enabled, leading to performance degradation, API token exhaustion, and system failures.

Evidence

1. Unbounded Text Accumulation

The orchestrator's _execute_step method captures all agent output without filtering:

# src/mcp_agent/agents/workflow/orchestrator_agent.py, lines 343-347
result = await future_obj
result_text = result.all_text()  # Extracts ALL text content

The all_text() method concatenates everything (a minimal sketch of the effect follows this list):

  • Direct LLM responses
  • Tool call arguments and results (including large file contents, database dumps, etc.)
  • Error messages and stack traces
  • Any embedded resource text
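
For illustration, a minimal sketch of that accumulation pattern, assuming a simplified message model (the class and field names here are hypothetical, not the actual fast-agent types):

# Hypothetical simplification of an all_text()-style helper: every content
# part -- LLM text, tool call arguments, raw tool results, errors, embedded
# resources -- is joined into one string with no filtering and no size cap.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ContentPart:
    kind: str  # e.g. "text", "tool_call", "tool_result", "error", "resource"
    text: str

@dataclass
class AgentResponse:
    parts: List[ContentPart] = field(default_factory=list)

    def all_text(self) -> str:
        # A 50KB tool result is carried forward verbatim into the
        # orchestrator's accumulated context, alongside everything else.
        return "\n".join(part.text for part in self.parts)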

2. Full Context Propagation

Every sub-agent receives the complete execution history:

# src/mcp_agent/agents/workflow/orchestrator_agent.py, lines 321-324
task_description = TASK_PROMPT_TEMPLATE.format(
    objective=previous_result.objective,
    task=task.description,
    context=context  # Contains ALL accumulated results from ALL previous steps
)

The format_plan_result() function serializes the entire PlanResult object, including all StepResult and TaskWithResult entries, into XML format without truncation.
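
As a rough sketch of that behavior (illustrative only, not the actual serialization code; the data shapes are assumed), the serialized context is simply the concatenation of every completed task result wrapped in markup:

# Hypothetical stand-in for format_plan_result(): every task's full output is
# inlined verbatim into the XML context, so the string grows with the sum of
# all prior step results and is never truncated or summarized.
def format_plan_result(step_results: list[dict]) -> str:
    parts = ["<plan_result>"]
    for i, step in enumerate(step_results):
        parts.append(f"  <step index='{i}'>")
        for task in step["tasks"]:
            parts.append(f"    <task_result>{task['result']}</task_result>")
        parts.append("  </step>")
    parts.append("</plan_result>")
    return "\n".join(parts)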

3. Quadratic Growth with Worker History

Worker agents default to use_history=True, so they retain their own message history. Since each message contains the full orchestrator context at that point, the effective prompt compounds across iterations (a quick numeric check follows this list):

  • Iteration 1: Worker receives context of size X
  • Iteration 2: Worker receives context of size 2X, but also retains the previous X in history
  • Iteration N: Worker's effective prompt size ≈ Σ(k=1 to N) k·X = X·N(N+1)/2 = O(N²)
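
A quick numeric check of that sum, in plain Python and independent of the orchestrator code (X is an illustrative per-iteration context size):

# Effective prompt size for a history-enabled worker, assuming the
# orchestrator context grows by X per iteration and the worker keeps every
# earlier prompt in its history.
X = 10_000  # bytes of new context per iteration (illustrative)

def effective_prompt_size(n: int) -> int:
    # History holds contexts of size X, 2X, ..., (n-1)X; the new prompt adds n*X.
    return sum(k * X for k in range(1, n + 1))  # = X * n * (n + 1) / 2

for n in (1, 2, 5, 10):
    print(n, effective_prompt_size(n))
# 1 10000
# 2 30000
# 5 150000
# 10 550000  -> growth is quadratic, not linear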

4. No Context Management

The codebase lacks any mechanism for the following (a minimal sketch of such a guard follows this list):

  • Context size limits or token counting
  • Result summarization or compression
  • Selective context sharing
  • Context window management
  • Tool result filtering
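
For contrast, even the simplest guard along the following lines would bound what sub-agents receive. This is an illustrative sketch only, not a proposed API, and all names are hypothetical; real token counting would use the model's tokenizer rather than character counts:

# Illustrative only: a crude character-budget guard of the kind the
# orchestrator currently lacks.
MAX_CONTEXT_CHARS = 32_000

def bounded_context(accumulated_results: list[str], budget: int = MAX_CONTEXT_CHARS) -> str:
    # Keep the most recent results whole and drop older ones once the budget
    # is exceeded, marking the truncation explicitly.
    kept: list[str] = []
    used = 0
    for result in reversed(accumulated_results):
        if used + len(result) > budget:
            kept.append("[earlier results truncated]")
            break
        kept.append(result)
        used += len(result)
    return "\n\n".join(reversed(kept))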

Implications

1. Performance Degradation

  • Each agent invocation takes progressively longer as context grows
  • API latency increases non-linearly with workflow complexity
  • Simple 5-step workflows can take 10x longer than necessary

2. Context Window Exhaustion

  • Modern LLMs have context limits (8k-128k tokens)
  • A single large tool result (e.g., reading a 50KB file) gets replicated in every subsequent prompt
  • Workflows fail mid-execution when context exceeds model limits

3. Cost Explosion

  • API costs scale with token usage
  • A workflow with 5 steps and 3 iterations can consume 15x more tokens than the actual content requires
  • Each worker processes redundant historical data

4. Quality Degradation

  • Relevant task information gets buried in historical noise
  • LLMs exhibit "lost in the middle" problems with large contexts
  • Agents may focus on irrelevant details from unrelated previous steps

Reproducible Example

# Simple workflow that demonstrates the issue:
1. Agent A: Read a 10KB configuration file
2. Agent B: Analyze the configuration
3. Agent C: Generate a report

# Actual context sizes:
- Agent A receives: objective (0.1KB)
- Agent B receives: objective + Agent A's full output including raw file content (10.1KB)
- Agent C receives: objective + Agent A output + Agent B output (20.2KB+)
# If any agent is reused, its history compounds the problem quadratically
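
The arithmetic behind those numbers, as a quick sanity check (sizes in KB; the assumption that Agent B's output is roughly as large as the file it analyzes is illustrative):

# Context received by each agent = objective + every previous agent's full
# output, so growth is linear per step even before any history is kept.
objective_kb = 0.1
agent_a_output_kb = 10.0  # raw 10KB config file echoed back in the result
agent_b_output_kb = 10.1  # analysis that quotes the accumulated context

agent_a_context = objective_kb                                          # 0.1 KB
agent_b_context = objective_kb + agent_a_output_kb                      # 10.1 KB
agent_c_context = objective_kb + agent_a_output_kb + agent_b_output_kb  # 20.2 KB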

System Impact

This is not an edge case but a fundamental architectural issue that affects:

  • Every orchestrator workflow (context always accumulates)
  • Every multi-step plan (more steps = more accumulation)
  • Every agent reuse (history compounds the problem)
  • Every tool that returns substantial output (multiplies the impact)

The current design makes it effectively impossible to run complex, multi-step workflows with multiple iterations without hitting context limits or experiencing severe performance degradation.
