Skip to content

fix: block y-websocket auto-reconnect during async IMS refresh#943

Merged
chrischrischris merged 1 commit into
mainfrom
imsrace
May 18, 2026
Merged

fix: block y-websocket auto-reconnect during async IMS refresh#943
chrischrischris merged 1 commit into
mainfrom
imsrace

Conversation

@chrischrischris
Copy link
Copy Markdown
Contributor

Why

Three remaining bugs around the IMS token refresh path that survived #937 / #941. Together they explain a continuous baseline of HEAD 401s on da-admin (660k/day in Coralogix) even after those PRs landed.

Bug 1 — Race between async refresh and y-websocket's 100ms reconnect timer

y-websocket's onclose schedules setTimeout(setupWS, 2^wsUnsuccessfulReconnects * 100ms) synchronously. On a 4401 close the upgrade returns 101 + immediate close, so onopen briefly fires and wsUnsuccessfulReconnects stays at 0 — backoff is 100ms, shorter than a typical adobeIMS.refreshToken() round-trip. The auto-reconnect fired with the still-stale token and burned 2–3 HEAD 401s on da-admin per token expiry per tab.

Bug 2 — Concurrent close handlers racing lastSentToken

Because bug 1 produced multiple 4401 closes back-to-back, multiple async handlers ran concurrently. Handler A finishes first and writes lastSentToken = X2. Handler B (started during A's await) then sees fresh === lastSentToken (X2 === X2) and incorrectly concludes "no new token", sets shouldConnect = false, and shows the auth banner — even though the refresh worked.

Bug 3 — checkDoc doubles the HEAD volume on auth-close

createAwarenessStatusWidget has a second connection-close listener that calls checkDoc(path)daFetch(path, { method: 'HEAD' }) on every close. On a 4401/4403 the doc is fine — da-collab signalled an auth failure — so this just fires another HEAD on da-admin which daFetch's refresh-and-retry compounds further.

What

  • Flip shouldConnect=false synchronously before the first await in the close handler. The pending setTimeout(setupWS) becomes a no-op, eliminating bugs 1 and 2 (no concurrent close handlers because no auto-reconnects fire during the refresh window).
  • Drive the reconnect ourselves via provider.connect() once we have the fresh token.
  • Skip checkDoc when the close code is 4401/4403.

Trust model

Unchanged. 4403 still stops reconnection. 4401 with no new token still bails and surfaces the modal. Anonymous users still bail without showing the modal.

Tests

  • New: Blocks y-websocket auto-reconnect during the in-flight refresh on 4401 — controls refreshToken via an unresolved promise and asserts shouldConnect===false during the await, then true after.
  • New: Skips checkDoc HEAD on auth-close codes (4401/4403) — patches window.fetch and asserts no HEAD is issued on 4401/4403, but is on 1006.
  • All existing close-handler tests still pass.

🤖 Generated with Claude Code

@aem-code-sync
Copy link
Copy Markdown

aem-code-sync Bot commented May 18, 2026

Hello, I'm the AEM Code Sync Bot and I will run some actions to deploy your branch.
In case there are problems, just click the checkbox below to rerun the respective action.

  • Re-sync branch
Commits

The 4401 close handler awaits adobeIMS.refreshToken() before deciding
whether to retry. y-websocket's onclose schedules its own
setTimeout(setupWS, ...) synchronously; on a 4401 the upgrade returned
101 + immediate close, so onopen briefly fired and
wsUnsuccessfulReconnects stayed at 0 - backoff is 2^0*100=100ms, shorter
than a typical IMS refresh round-trip. The auto-reconnect fired with the
still-stale token and burned 2-3 HEAD 401s on da-admin per token expiry
per tab.

Flip provider.shouldConnect=false synchronously before the first await
so the pending setTimeout becomes a no-op, then drive the reconnect
ourselves via provider.connect() once we have the fresh token. Same
fix removes the bug 2 race where two concurrent close handlers could
mutate lastSentToken out from under each other and stop reconnection
on a successful refresh.

Also skip the createAwarenessStatusWidget checkDoc HEAD on auth-close
codes - those signal auth failure not doc-deleted, and the daFetch
refresh-and-retry doubles the HEAD 401 traffic for no benefit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@chrischrischris chrischrischris changed the title fix(prose): block y-websocket auto-reconnect during async IMS refresh fix: block y-websocket auto-reconnect during async IMS refresh May 18, 2026
@chrischrischris chrischrischris merged commit 79f46a7 into main May 18, 2026
4 checks passed
@chrischrischris chrischrischris deleted the imsrace branch May 18, 2026 22:22
@mhaack mhaack mentioned this pull request May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants