Description
Problem Statement
The FastAgent orchestrator accumulates all text content from every agent response (including tool calls, raw tool results, and intermediate processing steps) and passes this ever-growing context to each subsequent sub-agent. As a result, context size grows as O(N) in the number of steps for the orchestrator and as O(N²) for workers with history enabled, leading to performance degradation, API token exhaustion, and system failures.
Evidence
1. Unbounded Text Accumulation
The orchestrator's _execute_step method captures all agent output without filtering:
# src/mcp_agent/agents/workflow/orchestrator_agent.py, lines 343-347
result = await future_obj
result_text = result.all_text() # Extracts ALL text content
The all_text() method concatenates everything (a simplified sketch follows this list):
- Direct LLM responses
- Tool call arguments and results (including large file contents, database dumps, etc.)
- Error messages and stack traces
- Any embedded resource text
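For illustration only, the following sketch approximates the concatenation behavior described above; the class and method shapes are simplified stand-ins, not the actual fast-agent types:
# Illustrative sketch only: simplified stand-ins for the real response/content types.
from dataclasses import dataclass
from typing import List

@dataclass
class ContentPart:
    kind: str   # e.g. "text", "tool_call", "tool_result", "resource"
    text: str   # raw text payload, however large

@dataclass
class AgentResponse:
    parts: List[ContentPart]

    def all_text(self) -> str:
        # No filtering by kind and no size cap: a 50KB tool result is copied
        # into the orchestrator context verbatim.
        return "\n".join(part.text for part in self.parts)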
2. Full Context Propagation
Every sub-agent receives the complete execution history:
# src/mcp_agent/agents/workflow/orchestrator_agent.py, lines 321-324
task_description = TASK_PROMPT_TEMPLATE.format(
    objective=previous_result.objective,
    task=task.description,
    context=context,  # Contains ALL accumulated results from ALL previous steps
)
The format_plan_result() function serializes the entire PlanResult object, including all StepResult and TaskWithResult entries, into XML format without truncation.
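The pattern is roughly the following; the dataclasses below are placeholders standing in for the real PlanResult, StepResult, and TaskWithResult models, not the actual implementation:
# Illustrative sketch: placeholder models standing in for PlanResult and friends.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TaskWithResult:
    description: str
    result: str  # the full all_text() output of the worker that ran the task

@dataclass
class StepResult:
    objective: str
    tasks: List[TaskWithResult] = field(default_factory=list)

def format_plan_result(steps: List[StepResult]) -> str:
    # Every task result from every completed step is emitted verbatim,
    # so the serialized context grows linearly with the number of steps.
    parts = ["<plan_result>"]
    for step in steps:
        parts.append(f"  <step objective={step.objective!r}>")
        for task in step.tasks:
            parts.append(f"    <task description={task.description!r}>")
            parts.append(f"      {task.result}")  # no truncation or summarization
            parts.append("    </task>")
        parts.append("  </step>")
    parts.append("</plan_result>")
    return "\n".join(parts)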
3. Quadratic Growth with Worker History
Worker agents default to use_history=True, causing them to retain their own message history. Since each message contains the full orchestrator context at that point:
- Iteration 1: the worker receives context of size X
- Iteration 2: the worker receives context of size 2X and still retains the previous X in its history
- Iteration N: the worker's effective prompt size ≈ Σ(k=1 to N) k·X = X·N(N+1)/2 = O(N²), checked numerically below
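A quick numerical check of that sum (plain arithmetic, independent of the codebase):
# Effective prompt size for a worker with history enabled, assuming the
# orchestrator context grows by X per iteration (X in KB here).
X = 10  # KB added per iteration, e.g. one large tool result

def effective_prompt_kb(n: int) -> int:
    # sum over iterations 1..n of k*X  ==  X * n * (n + 1) / 2  ->  O(n^2)
    return sum(k * X for k in range(1, n + 1))

for n in (1, 2, 5, 10):
    print(n, effective_prompt_kb(n), "KB")
# 1 -> 10 KB, 2 -> 30 KB, 5 -> 150 KB, 10 -> 550 KB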
4. No Context Management
The codebase lacks any mechanism for the following (a minimal sketch of what even a basic guard could look like appears after this list):
- Context size limits or token counting
- Result summarization or compression
- Selective context sharing
- Context window management
- Tool result filtering
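None of this exists in the orchestrator path today. For reference, even a minimal guard along these lines (a hypothetical helper, using a rough 4-characters-per-token estimate) would bound the context handed to each worker:
# Hypothetical helper, not part of fast-agent: bound the context size before
# it is formatted into the task prompt. Uses a crude ~4 chars/token estimate.
MAX_CONTEXT_TOKENS = 4_000

def estimate_tokens(text: str) -> int:
    return len(text) // 4

def clamp_context(context: str, max_tokens: int = MAX_CONTEXT_TOKENS) -> str:
    if estimate_tokens(context) <= max_tokens:
        return context
    # Keep the most recent portion; a real fix would summarize or filter instead.
    keep_chars = max_tokens * 4
    return "[earlier results truncated]\n" + context[-keep_chars:]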
Implications
1. Performance Degradation
- Each agent invocation takes progressively longer as context grows
- API latency increases non-linearly with workflow complexity
- Simple 5-step workflows can take 10x longer than necessary
2. Context Window Exhaustion
- Modern LLMs have finite context windows (typically 8k-128k tokens)
- A single large tool result (e.g., reading a 50KB file) gets replicated in every subsequent prompt
- Workflows fail mid-execution when context exceeds model limits
3. Cost Explosion
- API costs scale with token usage
- A workflow with 5 steps and 3 iterations can consume 15x more tokens than the actual content requires
- Each worker processes redundant historical data
4. Quality Degradation
- Relevant task information gets buried in historical noise
- LLMs exhibit "lost in the middle" problems with large contexts
- Agents may focus on irrelevant details from unrelated previous steps
Reproducible Example
# Simple workflow that demonstrates the issue:
1. Agent A: Read a 10KB configuration file
2. Agent B: Analyze the configuration
3. Agent C: Generate a report
# Actual context sizes:
- Agent A receives: objective (0.1KB)
- Agent B receives: objective + Agent A's full output including raw file content (10.1KB)
- Agent C receives: objective + Agent A output + Agent B output (20.2KB+)
# If any agent is reused, its history compounds the problem quadratically (simulated below)
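The sizes in this example can be reproduced without any LLM calls; the following hypothetical script simulates the orchestrator appending each agent's full output to the shared context:
# Simulate the context handed to each agent in the 3-step example above.
objective = "x" * 100             # ~0.1 KB objective
outputs = {
    "Agent A": "c" * 10_000,      # raw 10KB config file echoed via all_text()
    "Agent B": "a" * 10_000,      # analysis that quotes much of the config
    "Agent C": "r" * 2_000,       # final report
}

context = objective
for name, output in outputs.items():
    print(f"{name} receives {len(context) / 1000:.1f} KB of context")
    context += output             # the full output is appended, unfiltered

# Prints roughly 0.1 KB, 10.1 KB, and 20.1 KB, matching the sizes listed above.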
System Impact
This is not an edge case but a fundamental architectural issue that affects:
- Every orchestrator workflow (context always accumulates)
- Every multi-step plan (more steps = more accumulation)
- Every agent reuse (history compounds the problem)
- Every tool that returns substantial output (multiplies the impact)
The current design makes it effectively impossible to run complex, multi-step workflows with multiple iterations without hitting context limits or experiencing severe performance degradation.