
fix(edge,shareddoc): ws auth close via message listener + flush save deadlock#157

Merged
kptdobe merged 5 commits into main from fix/ws-auth-close-and-flush-save
May 20, 2026

Conversation

@kptdobe
Contributor

@kptdobe kptdobe commented May 20, 2026

Summary

Follow-up to #156 and #149.

The network connectivity errors had two causes. First, the server closed the WebSocket too early. Then, once that change was reverted, the flushSave guard introduced a deadlock: a 401 or 403 during a save closes the connections, which triggers a flushSave, which waits for the initial pending save.

  • WS auth close: wsAuthFailureResponse now defers close() until the client sends its first message (via addEventListener('message', ...)). Calling close() immediately after accept(), before the 101 response is established, caused a CF Workers "Network connection lost." runtime exception.

  • flushSave deadlock: The if (saving) { return; } guard in flushSave was too broad: it made external callers (e.g. flush-on-disconnect, flush-request messages) return immediately instead of waiting for the in-flight save. The fix removes the guard from flushSave, so external callers now correctly await savingPromise, and instead skips flushSave inside closeConn when doc.saving is true, which is the only path that could deadlock (persistence.update → closeConn → flushSave).

  • Log noise: Reduced severity for expected auth failures in persistence.update: 401 → console.warn (message only), 403 → console.log (message only), other errors → console.error (full Error object).
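
The flushSave change described above can be illustrated with a minimal sketch. It is a simplified model under stated assumptions, not the repository's code: createDoc, the put callback, and the conns set are hypothetical, and only the names saving, savingPromise, flushSave, closeConn, and update are taken from the description.

```javascript
// Hypothetical, simplified model of the save path. Names mirror the PR text,
// but the bodies are assumptions, not the repository's implementation.
function createDoc(put) {
  const doc = {
    saving: false,
    savingPromise: null,
    conns: new Set(),

    // Stand-in for persistence.update: runs the PUT and, if it fails,
    // closes every connection while the save is still marked in-flight.
    update() {
      doc.saving = true;
      doc.savingPromise = (async () => {
        try {
          await put();
        } catch (err) {
          for (const conn of [...doc.conns]) {
            await doc.closeConn(conn);
          }
        } finally {
          doc.saving = false;
        }
      })();
      return doc.savingPromise;
    },

    // External callers always wait for the in-flight save (no early return).
    async flushSave() {
      if (doc.savingPromise) {
        await doc.savingPromise;
      }
    },

    // Skip flushSave while a save is in flight: awaiting savingPromise from
    // inside that same save would never resolve.
    async closeConn(conn) {
      doc.conns.delete(conn);
      if (!doc.saving) {
        await doc.flushSave();
      }
    },
  };
  return doc;
}
```

In this model, removing the `!doc.saving` check in closeConn reproduces the hang: the failing save awaits closeConn, which awaits flushSave, which awaits the save itself.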

Test plan

  • npm test — 126 tests passing
  • WS auth tests assert listeners are registered and close fires on first client message
  • New shareddoc test asserts second flushSave waits for in-flight PUT before resolving
  • Three new log-level tests assert correct severity per HTTP status

🤖 Generated with Claude Code

kptdobe and others added 2 commits May 20, 2026 10:04
- wsAuthFailureResponse now defers close() until the client sends its
  first message, preventing the CF Workers "Network connection lost."
  exception that occurred when close() was called before the 101
  response was fully established.

- flushSave no longer returns early when a save is in-flight; external
  callers now correctly await savingPromise. The deadlock that could
  occur when persistence.update → closeConn → flushSave re-entered the
  save is prevented by skipping flushSave in closeConn when doc.saving
  is true.

- Reduced log noise for expected auth failures in persistence.update:
  401 → console.warn (message only), 403 → console.log (message only),
  other errors → console.error (full Error object).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
kptdobe and others added 3 commits May 20, 2026 12:02
- wsAuthFailureResponse: arm a 5s safety timeout so clients that never
  send a first message still get closed; clear it from message/error/close
  listeners.

- closeConn: replace ydoc.saving duplicate-state guard with an isReentrant
  flag passed explicitly from persistence.update's closeAll loop, eliminating
  the dual-state drift risk and making the intent explicit at the call site.

- persistence.update error logging: guard err?.message?.startsWith against
  non-Error throws; unify 401 and 403 to console.warn with full Error object;
  keep console.error for all other failures.

- Tests: extract testWsUpgradeAuthFailure helper; assert triggerMessage is
  defined before calling; add safety-timeout test; add re-entrancy deadlock
  regression test; update log-level assertions to expect Error object.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
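
The deferred close with a safety timeout, as described in the commit message above, might look roughly like the following. This is a sketch only: deferClose is a hypothetical helper, not the repository's wsAuthFailureResponse, and the close code and reason are left to the caller because the PR does not state them. The socket argument is assumed to be any object exposing addEventListener() and close(), such as the server side of a WebSocketPair after accept().

```javascript
// Sketch only: `deferClose` is a hypothetical helper name.
// `socket` needs addEventListener(type, fn) and close(code, reason).
function deferClose(socket, code, reason, timeoutMs = 5000) {
  let closed = false;
  let timer = null;

  // Cancel the safety timeout if it is still armed.
  const disarm = () => {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
  };

  // Close at most once, whichever trigger fires first.
  const closeOnce = () => {
    disarm();
    if (closed) return;
    closed = true;
    socket.close(code, reason);
  };

  // First client message: per the PR, closing here no longer races the
  // 101 response.
  socket.addEventListener('message', closeOnce);

  // If the socket errors or closes on its own, only the timer needs clearing.
  socket.addEventListener('error', () => { closed = true; disarm(); });
  socket.addEventListener('close', () => { closed = true; disarm(); });

  // Safety net for clients that never send a first message.
  timer = setTimeout(closeOnce, timeoutMs);
}
```

Clearing the timer from all three listeners keeps a stale timeout from calling close() on a socket that has already gone away.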
401 → console.warn (message only), 403 → console.log (message only),
other errors → console.error (full Error). Uses safe optional chaining
to guard against non-Error throws.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
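
The severity mapping in the message above could be sketched as follows. The helper name logSaveError and the assumption that the error message starts with the HTTP status (for example "401 ...") are hypothetical; the PR only mentions a startsWith check on err.message. Note that the PR also describes a variant in which 401 and 403 are both logged via console.warn with the full Error object, which this sketch does not implement.

```javascript
// Hypothetical helper; assumes the error message begins with the HTTP status.
// Returns the level used so callers (and tests) can inspect the decision.
function logSaveError(err, logger = console) {
  // Guard against non-Error throws (strings, null, objects without message).
  const message = typeof err?.message === 'string' ? err.message : '';

  if (message.startsWith('401')) {
    logger.warn(message); // expected auth failure: message only
    return 'warn';
  }
  if (message.startsWith('403')) {
    logger.log(message); // expected auth failure: message only
    return 'log';
  }
  logger.error(err); // anything else: full thrown value
  return 'error';
}
```

The explicit typeof check is slightly stricter than optional chaining alone, since a non-string message property would otherwise make the startsWith call throw.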
@kptdobe kptdobe merged commit ed718ee into main May 20, 2026
5 checks passed
@kptdobe kptdobe deleted the fix/ws-auth-close-and-flush-save branch May 20, 2026 12:11
adobe-bot pushed a commit that referenced this pull request May 20, 2026
## [1.5.3](v1.5.2...v1.5.3) (2026-05-20)

### Bug Fixes

* **edge,shareddoc:** ws auth close via message listener + flush save deadlock ([#157](#157)) ([ed718ee](ed718ee))

### Reverts

* remove wsAuthFailureResponse ([#149](#149)) ([#156](#156)) ([cf6ad46](cf6ad46))
@adobe-bot
Collaborator

🎉 This PR is included in version 1.5.3 🎉

Your semantic-release bot 📦🚀
