runtime: catastrophic memory explosion triggers macOS jetsam kill in long-running TUI sessions #2978
Open
Labels: area/agent, area/sessions, area/tools, area/tui, automated, status/needs-triage
Summary
Long-running `docker-agent run` TUI sessions experience sudden catastrophic memory explosions, growing from a stable ~168 MB baseline to 26+ GB in under 2 minutes, which trigger a macOS jetsam SIGKILL. Observed across 7+ separate sessions over two weeks, consistently on macOS Apple Silicon.

This is not a gradual memory leak. The process runs stably for hours, then a specific operation triggers an unbounded allocation cascade.
Environment
`~/.cagent/session.db`: 687 MB, 1168 sessions, 51649 items, 531 MB of message JSON
Confirmed evidence
1. macOS Jetsam report (`JetsamEvent-2026-06-02-174418.ips`)
Process entry (PID 66283) at time of kill:
At kill time the system had only 534 MB free pages — 70+ system daemons were jetsammed in the same cascade.
2. RSS timeseries (30-second samples, PID 48218)
Process ran flat at 168 MB for 8+ hours, then:
168 MB → 26 GB in 90 seconds. CPU spiking to 126–144% confirms multiple goroutines allocating simultaneously.
3. Trigger pattern
Immediately before the explosion the agent ran a large filesystem search (`Search Files Content` across `~`, 48-second runtime, 2008 matches across 1191 files).
Hypothesis: a large tool result triggers a cascade of in-memory copies (tool output buffer → message list append → `session_items` WAL write → context window serialisation for Anthropic API), each step holding its own copy, with no back-pressure or size cap.
Steps to reproduce (approximate)
- […] `max_history_items` cap)
- […] `Search Files Content` across `~` or a large repo with thousands of matches)
- […] `while sleep 30; do ps -o rss= -p $PID; done`
Expected behaviour
Tool results exceeding a size threshold should be truncated or streamed rather than fully buffered. The session serialisation path should not hold multiple full copies of large payloads simultaneously.
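To make the expected behaviour concrete, here is a minimal sketch of a size-capped output buffer in Go. The type `cappedBuffer` and its constructor are hypothetical names invented for illustration; nothing here is taken from the actual `shell.go` or dispatcher code, and the 16-byte limit in `main` is only for demonstration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// cappedBuffer is a hypothetical io.Writer that retains at most limit bytes
// and only counts whatever arrives after the cap is reached.
type cappedBuffer struct {
	buf     bytes.Buffer
	limit   int
	dropped int
}

func newCappedBuffer(limit int) *cappedBuffer {
	return &cappedBuffer{limit: limit}
}

// Write always reports len(p) bytes as consumed, so a producer writing into
// it does not see a short-write error once the cap is hit.
func (c *cappedBuffer) Write(p []byte) (int, error) {
	room := c.limit - c.buf.Len()
	if room < 0 {
		room = 0
	}
	keep := len(p)
	if keep > room {
		keep = room
	}
	c.buf.Write(p[:keep])
	c.dropped += len(p) - keep
	return len(p), nil
}

// String returns the retained output, followed by a marker if bytes were dropped.
func (c *cappedBuffer) String() string {
	if c.dropped == 0 {
		return c.buf.String()
	}
	return fmt.Sprintf("%s\n[truncated: %d bytes omitted]", c.buf.String(), c.dropped)
}

// Compile-time check that cappedBuffer satisfies io.Writer.
var _ io.Writer = (*cappedBuffer)(nil)

func main() {
	out := newCappedBuffer(16) // tiny cap, for demonstration only
	fmt.Fprint(out, "match 1\nmatch 2\nmatch 3\n")
	fmt.Println(out.String())
}
```

Because the type satisfies `io.Writer`, a buffer like this could in principle be assigned to the `Stdout` field of an `exec.Cmd` in place of an unbounded `bytes.Buffer`. Whether that matches how the tool output is actually collected is an assumption.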
Suggested investigation points
- `pkg/tools/builtin/shell/shell.go`: tool output accumulation (no size cap on stdout buffer)
- `pkg/runtime/toolexec/dispatcher.go`: tool result handling before entering message list
- […] `session_items` to sqlite
- `pkg/runtime/streaming.go`: context window construction for next API call
- […] `message_count=163` being sent whole with no `max_history_items` cap set
Additional context
- […] `~/.cagent/session.db` in-memory representation during serialisation is significantly larger than its 687 MB on-disk size
- […] `largestProcess` verdict, same sudden RSS profile after hours of stability
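
As a rough illustration of the `max_history_items` investigation point listed earlier, the sketch below caps how many history items are handed to the next API call without copying their payloads. The `message` type and the `lastN` helper are hypothetical, and the real semantics of `max_history_items` are not confirmed here. A real implementation would probably also need to preserve the system prompt and keep tool calls paired with their results.

```go
package main

import "fmt"

// message is a hypothetical stand-in for a session item.
type message struct {
	Role    string
	Content string
}

// lastN returns at most max trailing messages. The result is a sub-slice that
// shares the backing array of history, so no message payload is copied.
// A max of zero or less is treated as "no cap" (an assumption of this sketch).
func lastN(history []message, max int) []message {
	if max <= 0 || len(history) <= max {
		return history
	}
	return history[len(history)-max:]
}

func main() {
	// 163 items mirrors the message_count reported in this issue.
	history := make([]message, 163)
	for i := range history {
		history[i] = message{Role: "user", Content: fmt.Sprintf("item %d", i)}
	}
	window := lastN(history, 50)
	fmt.Println(len(window), window[0].Content)
}
```

Returning a sub-slice rather than a fresh copy is deliberate: the hypothesis in this report is that each pipeline stage holds its own copy of large payloads, so a windowing step should avoid adding one more.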