fix: enable auto-compaction for sub-agents and improve context overflow detection#25180

Open
emco1234 wants to merge 4 commits into anomalyco:dev from emco1234:fix/subagent-context-compaction

Conversation

emco1234 commented Apr 30, 2026

Issue for this PR

Closes #25187

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Sub-agents hang indefinitely on context overflow because compaction never triggers for them. The root cause is in session/processor.ts: overflow detection is purely reactive — it only checks token counts from the API response AFTER the stream finishes. Some providers (z.ai, some OpenAI-compatible endpoints) accept context overflows silently without returning errors or accurate usage stats, so the processor never detects the overflow and the sub-agent hangs forever.

This PR adds a proactive context estimation step in the processor that runs BEFORE the LLM stream starts. It estimates the outgoing payload size using Token.estimate() (char-count / 4 heuristic) and compares it against the model's context window. If the estimate exceeds 85% of the limit, compaction is triggered immediately — without making an API call that would likely fail or hang.
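For illustration, the shape of the check is roughly the following. This is a simplified sketch, not the actual processor.ts code: the names, signatures, and message shape are stand-ins, and the guard for a missing context limit reflects the follow-up commits later in this thread.

```ts
// Hedged sketch only — simplified stand-in for the real processor.ts logic.
const CONTEXT_USE_THRESHOLD = 0.85

function estimateTokens(text: string): number {
  // char-count / 4 heuristic, as Token.estimate() is described above
  return Math.ceil(text.length / 4)
}

async function maybeCompactBeforeStream(
  messages: { role: string; content: string }[],
  contextLimit: number,
  compact: () => Promise<void>,
): Promise<void> {
  // Skip the proactive check when the model reports no context limit
  // (context === 0); reactive overflow detection still applies.
  if (contextLimit <= 0) return

  const estimated = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0)
  if (estimated > contextLimit * CONTEXT_USE_THRESHOLD) {
    // Compact now instead of sending a request that would likely fail or hang.
    await compact()
  }
}
```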

The second change adds missing overflow error patterns in provider/error.ts. Some z.ai/GLM error messages (e.g. "token limit exceeded") weren't matched by existing patterns, so they were classified as generic API errors → retry loop → hang.
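For illustration, overflow classification by error message might look like the sketch below. Only "token limit exceeded" is quoted in this PR; the other patterns and the function name are hypothetical stand-ins, not the real provider/error.ts contents.

```ts
// Illustrative sketch — not the actual provider/error.ts pattern list.
const OVERFLOW_PATTERNS: RegExp[] = [
  /context length exceeded/i, // hypothetical existing pattern
  /maximum context length/i,  // hypothetical existing pattern
  /token limit exceeded/i,    // newly matched z.ai / GLM wording
]

function isContextOverflow(message: string): boolean {
  // Classify as overflow (triggers compaction) rather than a generic
  // API error (which would be retried and hang).
  return OVERFLOW_PATTERNS.some((pattern) => pattern.test(message))
}
```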

How did you verify your code works?

  • TypeScript compilation passes (npx tsc --noEmit --skipLibCheck)
  • The proactive check uses the same Token.estimate() function already used elsewhere in compaction.ts
  • The 85% threshold is conservative — it won't trigger for normal-sized conversations
  • The change is backward-compatible — existing compaction flow (reactive detection + error-based detection) is untouched
  • overflow.ts is intentionally NOT modified — its reactive behavior is correct for the main agent which typically has accurate model metadata

Screenshots / recordings

N/A — backend logic change, no UI impact.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

fix: enable auto-compaction for sub-agents and improve context overflow detection

Sub-agents (task tool delegates) can hang indefinitely when their context
overflows because compaction never triggers. This happens through three
gaps in the current implementation:

1. **overflow.ts**: When a model doesn't report context limits (context === 0),
   overflow detection is completely disabled (`return false`). This affects
   providers like z.ai that may not expose model limits in a standard way.

2. **processor.ts**: Overflow detection is purely reactive — it only checks
   token counts AFTER the API responds. Some providers (notably z.ai) accept
   overflows silently without returning errors or accurate usage stats, so
   the processor never detects the overflow and the sub-agent hangs.

3. **error.ts**: Not all z.ai/GLM overflow error responses are matched by
   existing patterns, so some overflow errors are classified as generic API
   errors instead of ContextOverflowError. These get retried (not compacted),
   causing the "loads for days" behavior.

Changes:

- overflow.ts: Replace `if (context === 0) return 0` with a 128k default
  fallback, so overflow detection is never fully disabled (a minimal sketch
  follows this commit message). Remove the corresponding guard in isOverflow().

- processor.ts: Add proactive context estimation BEFORE the LLM stream
  starts. Estimates token count from the outgoing message payload using
  Token.estimate(). If the estimated size exceeds 85% of the model's
  context window, triggers compaction immediately without making an API
  call that would likely fail or hang.

- error.ts: Add additional overflow detection patterns for z.ai GLM models
  and generic token-limit messages.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
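
A minimal sketch of the 128k fallback this commit describes (later commits in this thread restore overflow.ts and keep a fallback only in the pre-stream check); the constant and function name are placeholders, not the real overflow.ts code.

```ts
// Hypothetical sketch of the fallback described above.
const DEFAULT_CONTEXT_LIMIT = 128_000

function effectiveContextLimit(reported: number): number {
  // Previously `if (context === 0) return 0`, which disabled overflow
  // detection entirely for providers that don't report model limits (e.g. z.ai).
  return reported > 0 ? reported : DEFAULT_CONTEXT_LIMIT
}
```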
@github-actions

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

  1. Open an issue describing the bug/feature (if one doesn't exist)
  2. Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.

github-actions Bot added the needs:compliance label ("This means the issue will auto-close after 2 hours.") on Apr 30, 2026
@github-actions

The following comment was made by an LLM; it may be inaccurate:

Based on my search, I found several related PRs that address compaction and context overflow issues:

Potentially Related PRs:

  1. #20718 fix(compaction): prune context before overflow compaction to prevent unrecoverable 413 errors — Handles compaction before overflow errors similar to your proactive approach

  2. #20516 feat(compaction): harden compaction system with breaker, retry, caps, and budget — General compaction system hardening that may overlap with your improvements

  3. #17936 fix(core): honor compaction.auto on provider overflow — Deals with automatic compaction on provider-reported overflow, related to your overflow detection fixes

  4. #16073 fix: improve compaction continuation to prevent agent from stopping — Addresses compaction flow issues similar to your sub-agent hang prevention

  5. #15532 fix(session): prevent infinite loop in auto-compaction when assistant ended its turn — Prevents auto-compaction infinite loops, related to context management

  6. #18683 fix: treat Anthropic long context billing error as context overflow — Error pattern recognition for overflow detection similar to your z.ai/GLM pattern additions

These PRs all address aspects of compaction and overflow detection, though your PR appears to be the first comprehensive fix specifically targeting proactive estimation before API calls for sub-agents and unknown model limits. You may want to review #20718 and #17936 to ensure compatibility.

The proactive check in processor.ts is the primary fix for sub-agent
context overflow. It has its own context-limit fallback (128k) scoped
to the pre-stream estimation, so it doesn't affect the main agent's
overflow behavior. Restoring overflow.ts to its original state avoids
premature compaction for models with large context windows (200k+).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
github-actions Bot removed the needs:issue and needs:compliance labels on May 1, 2026

github-actions Bot commented May 1, 2026

Thanks for updating your PR! It now meets our contributing guidelines. 👍

The proactive compaction check now uses the model's real context
limit directly. At 200k context it triggers at 170k, at 500k at 425k.
When the model doesn't report a limit (context === 0), the proactive
check is skipped entirely — reactive overflow detection still applies.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bojun-Vvibe added a commit to Bojun-Vvibe/oss-contributions that referenced this pull request May 1, 2026
Four PRs across opencode (3) and codex (1):
- anomalyco/opencode#25166: merge-as-is, server.mdx adds /global/config docs
- anomalyco/opencode#25143: merge-as-is, ecosystem.mdx adds opencode-swarm row
- anomalyco/opencode#25180: merge-after-nits, pre-stream Token.estimate
  proactive compaction at processor.ts:543-568 + 3 new overflow regex arms
- openai/codex#20524: merge-as-is, notify deprecation across README +
  TOML doc + JSON schema + struct doc + 2 metrics + DeprecationNotice
  event + 153-line user_notification.rs delete

Development

Successfully merging this pull request may close these issues.

Sub-agents hang indefinitely on context overflow — no compaction triggered
