Webapp serve can wedge under message.part.delta storm from one session #17977

@dzianisv

Summary

A single active session can wedge the shared opencode serve instance and make unrelated webapp routes hang.

This is happening in production right now on http://100.68.120.26:4096.

Current impact

  • The session page hangs or loads blank, including for sessions unrelated to the one causing the load.
  • curl --max-time 8 http://100.68.120.26:4096/global/health times out with 0 bytes.
  • curl --max-time 8 http://100.68.120.26:4096/L1VzZXJzL2VuZ2luZWVyL3dvcmtzcGFjZS92aWJlYnJvd3Nlci92aWJl/session/ses_30526a392ffegNvJj3VONZughB times out with 0 bytes.

Evidence

  • Port 4096 listener is the root .opencode serve --hostname 100.68.120.26 process.
  • The root process ballooned to multi-GB RSS / footprint and stopped responding before sending headers.
  • watchdog.ndjson shows the new root process (20427 parent / 20428 child) jumping from ~3.3 GB tree RSS to ~5.36 GB quickly, with repeated spikes afterward.
  • sample on the child showed ~6.5 GB physical footprint and ~8.7 GB peak during the wedge.
  • DB size is small (opencode.db ~11 MB, 62 sessions, 839 messages, 2910 parts), so this is not a giant-history issue.
  • Serve logs show the process healthy until it enters an extreme message.part.delta publish storm around 2026-03-17 08:40 UTC-07, after which unrelated web routes stop making progress.

Likely trigger

The likely triggering session is:

  • session id: ses_3050c2b49ffejDdZpPUK5zEPsU
  • title: 6 months review and update of CV
  • directory: /Users/engineer/workspace/vibebrowser/vibe

Root cause hypothesis

This looks like an event-stream amplification problem in the shared server, not a defect in one specific session page.

Relevant paths:

  • packages/opencode/src/session/processor.ts
    • emits one message.part.delta bus event per model chunk for text-delta and reasoning-delta
  • packages/opencode/src/bus/index.ts
    • logs every bus publish at info level, including every message.part.delta
  • packages/opencode/src/server/routes/global.ts
    • subscribes to GlobalBus and calls await stream.writeSSE(...) inside an async event-emitter listener
  • packages/opencode/src/server/server.ts
    • same async-per-event SSE pattern for /event
  • packages/opencode/src/acp/agent.ts
    • secondary amplification risk: refetches the full message on every message.part.delta

The most direct webapp failure mode is that the server-side SSE relays queue unbounded async writes during a delta storm. Since EventEmitter.emit() does not await async listeners, a hot session can accumulate huge numbers of pending writeSSE() promises and JSON payloads, driving memory/CPU up until the shared serve instance stops responding.
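
The accumulation described above can be reproduced in isolation. The following is a minimal sketch, not the opencode code: `slowWrite` is a hypothetical stand-in for `stream.writeSSE(...)`, and the event name and payload shape are assumptions. It shows that `emit()` returns before an async listener finishes, so a burst of events starts every write before any of them completes.

```typescript
import { EventEmitter } from "node:events"

// Returns how many writes are still in flight right after a burst of emits.
function inFlightAfterBurst(burst: number): number {
  let pending = 0

  // Hypothetical stand-in for stream.writeSSE(): an async write that takes 5 ms.
  const slowWrite = async (_payload: string): Promise<void> => {
    pending++
    await new Promise<void>((resolve) => setTimeout(resolve, 5))
    pending--
  }

  const bus = new EventEmitter()
  // Same shape as the relay listeners: an async listener that awaits the write.
  bus.on("event", async (payload: string) => {
    await slowWrite(payload)
  })

  // emit() calls the listener synchronously and ignores the returned promise,
  // so nothing here waits for a write to finish before starting the next one.
  for (let i = 0; i < burst; i++) {
    bus.emit("event", JSON.stringify({ type: "message.part.delta", delta: "x" }))
  }
  return pending
}

console.log(inFlightAfterBurst(10_000)) // prints 10000
```

Every pending write also keeps its serialized payload alive, which is consistent with the memory growth reported in the evidence section.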

Expected fix shape

  • Add a bounded/coalescing server-side relay for /global/event and /event.
  • Merge repeated message.part.delta chunks per part while a write is in flight.
  • Drop stale delta events when a newer message.part.updated supersedes them in the queue.
  • Stop logging every message.part.delta at info level.
  • Optionally cache part type in ACP so it does not fetch the full message on every delta.
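
The first three items above could look roughly like the sketch below. It is an illustration under stated assumptions, not a proposed patch: `CoalescingRelay`, the simplified `RelayEvent` shapes (`partID`, `delta`, `text`), and the drop-oldest overflow policy are all hypothetical, and the real bus payloads carry more fields. The relay keeps one write in flight, merges queued deltas per part, and removes a queued delta when a `message.part.updated` for the same part arrives.

```typescript
type DeltaEvent = { type: "message.part.delta"; partID: string; delta: string }
type UpdatedEvent = { type: "message.part.updated"; partID: string; text: string }
type RelayEvent = DeltaEvent | UpdatedEvent

class CoalescingRelay {
  private queue: RelayEvent[] = []
  // At most one queued (not yet written) delta per part.
  private pendingDelta = new Map<string, DeltaEvent>()
  private draining = false
  private current: Promise<void> = Promise.resolve()
  private write: (event: RelayEvent) => Promise<void>
  private maxQueue: number
  dropped = 0

  constructor(write: (event: RelayEvent) => Promise<void>, maxQueue = 1024) {
    this.write = write
    this.maxQueue = maxQueue
  }

  get queued(): number {
    return this.queue.length
  }

  // Resolves once the current drain loop has written everything queued so far.
  idle(): Promise<void> {
    return this.current
  }

  push(event: RelayEvent): void {
    if (event.type === "message.part.delta") {
      const queuedDelta = this.pendingDelta.get(event.partID)
      if (queuedDelta) {
        queuedDelta.delta += event.delta // merge instead of queuing another write
        return
      }
      const copy = { ...event }
      this.pendingDelta.set(event.partID, copy)
      this.enqueue(copy)
    } else {
      // A full update supersedes any delta still waiting for the same part.
      const stale = this.pendingDelta.get(event.partID)
      if (stale) {
        this.queue.splice(this.queue.indexOf(stale), 1)
        this.pendingDelta.delete(event.partID)
      }
      this.enqueue(event)
    }
    if (!this.draining) {
      // Assumption: a real relay would close the stream on a write error.
      this.current = this.drain().catch(() => {})
    }
  }

  private enqueue(event: RelayEvent): void {
    if (this.queue.length >= this.maxQueue) {
      // Assumed overflow policy: drop the oldest queued event and count it.
      const oldest = this.queue.shift()!
      if (oldest.type === "message.part.delta") this.pendingDelta.delete(oldest.partID)
      this.dropped++
    }
    this.queue.push(event)
  }

  private async drain(): Promise<void> {
    this.draining = true
    try {
      while (this.queue.length > 0) {
        const next = this.queue.shift()!
        if (next.type === "message.part.delta") this.pendingDelta.delete(next.partID)
        await this.write(next) // only one write is ever in flight
      }
    } finally {
      this.draining = false
    }
  }
}

// Usage: the listener stays synchronous and only hands events to the relay.
const relay = new CoalescingRelay(async (event) => {
  console.log(JSON.stringify(event))
})
relay.push({ type: "message.part.delta", partID: "p1", delta: "Hel" })
relay.push({ type: "message.part.delta", partID: "p1", delta: "lo" })
```

With this shape, memory is bounded by `maxQueue` plus the size of the merged deltas rather than by the number of chunks the model emits. Dropping the oldest event on overflow is only one option; disconnecting the slow client so that it resyncs may be safer for correctness.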

Repro notes

The failure is live as of 2026-03-17.

Useful local artifacts:

  • ~/.local/share/opencode/log/2026-03-17T081027.log
  • ~/.local/share/opencode/log/watchdog.ndjson
  • ~/.local/share/opencode/log/memory/2026-03-17T16-24-11-664Z

Labels

  • core: Anything pertaining to core functionality of the application (opencode server stuff)
  • perf: Indicates a performance issue or need for optimization
