RTC: Predefined retry schedules for disconnect dialog, make more lenient #76966
alecgeatches merged 11 commits into trunk
Size Change: +159 B (0%) Total Size: 7.74 MB
Flaky tests detected in c73fee2. 🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/23868299497
…ent (#76966)
* Increase disconnected timeout to disconnect dialog, allow more requests to fix while disconnected
* Show disconnected modal only after a failure, not mid-cycle
* Decrease disconnected debounce to 25 to allow for the next-try to succeed at roughly 30s
* Reduce DISCONNECTED_DEBOUNCE_MS to 16s
* Replace backoff system with predefined poll intervals for easier reasoning
* Simplify manual retry, reset failure counter
* In "Retry" dialog, ensure "Retrying..." stays up for at least 500ms. Don't reset the timing intervals on retry
* Change manual retry on the disconnect dialog reset the time to 15s for the next request
* Rename isSubsequentRetry to hasRetried
* Minor refactor, remove isConnected, showModal ternary
* Fix disconnection unit tests
Co-authored-by: alecgeatches <alecgeatches@git.wordpress.org>
Co-authored-by: chriszarate <czarate@git.wordpress.org>
Co-authored-by: ingeniumed <ingeniumed@git.wordpress.org>
Co-authored-by: maxschmeling <maxschmeling@git.wordpress.org>
I just cherry-picked this PR to the wp/7.0 branch to get it included in the next release: 03a3d23
What?
This PR replaces the exponential backoff and debounce for the disconnect dialog with predefined retry schedules. The dialog now waits through a fixed list of retry times before showing, with separate schedules for solo and collaborative editing. After the dialog appears, retries continue in the background at 30s intervals.
Additionally, the disconnect dialog gains a "Retrying..." state, so it's clearer when a connection attempt is being made (even if it fails immediately).
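One way to keep a "Retrying..." label visible even when the attempt fails immediately is to hold the result until a minimum time has passed. The sketch below is illustrative only: `withMinimumDuration` and `MIN_RETRYING_MS` are hypothetical names, not identifiers from this PR, and the 500ms floor is taken from the commit list.

```typescript
// Hypothetical helper, not the PR's actual implementation.
const MIN_RETRYING_MS = 500; // floor mentioned in the commit list

function sleep(ms: number): Promise<void> {
	return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Runs `attempt` and settles with its outcome, but never sooner than `minMs`
// after the call started. A fast failure is delayed, then rethrown.
async function withMinimumDuration<T>(
	attempt: () => Promise<T>,
	minMs: number = MIN_RETRYING_MS
): Promise<T> {
	const startedAt = Date.now();
	try {
		return await attempt();
	} finally {
		// Wait out whatever is left of the minimum display time.
		const remaining = minMs - (Date.now() - startedAt);
		if (remaining > 0) {
			await sleep(remaining);
		}
	}
}
```

A dialog could set its "Retrying..." state before calling `withMinimumDuration(() => reconnect())` and clear it once the promise settles, where `reconnect` stands in for whatever function performs the connection attempt.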
Why?
The previous system used exponential backoff (pollInterval * 2, capped at 30s) and ran a time-based debounce depending on the number of collaborators. The two mechanisms interacted in ways that made changes like #76704 difficult to reason about. Even though we increased both the poll interval and disconnect debounce by 4x, the exponential math worked out so that just one failed request could cause a disconnect dialog.
Because the backoff multiplied the base polling interval, the behavior was different in solo mode, and not in a useful way. Solo editing had fewer retries before showing the dialog, even though solo users are the least likely to hit CRDT conflicts from a brief disconnect. Collaborative editing had more retries, despite being the mode where stale state matters most. Adjusting either the backoff multiplier or the debounce timer required recalculating the tables for both modes to make sure the timing still made sense.
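To make the interaction concrete, here is a simplified model of a doubling backoff racing a debounce timer. It is a sketch under stated assumptions, not the removed code: the function names, the base intervals, and the 20s debounce in the usage note are all hypothetical, and request duration is ignored.

```typescript
// Simplified, hypothetical model of the old behavior.

// Delays before each retry when the interval doubles after every failure
// and is capped at `capMs`.
function backoffDelays(baseMs: number, capMs: number, count: number): number[] {
	const delays: number[] = [];
	let delay = baseMs;
	for (let i = 0; i < count; i++) {
		delay = Math.min(delay * 2, capMs);
		delays.push(delay);
	}
	return delays;
}

// Number of retries that fire before a debounce window of `debounceMs`
// (measured from the first failure) closes and the dialog would appear.
function retriesBeforeDialog(baseMs: number, capMs: number, debounceMs: number): number {
	if (baseMs <= 0 || capMs <= 0) {
		throw new Error('intervals must be positive');
	}
	let elapsed = 0;
	let delay = baseMs;
	let retries = 0;
	while (true) {
		delay = Math.min(delay * 2, capMs);
		if (elapsed + delay >= debounceMs) {
			return retries;
		}
		elapsed += delay;
		retries++;
	}
}
```

With these made-up inputs, `retriesBeforeDialog(1000, 30000, 20000)` returns 3 while `retriesBeforeDialog(4000, 30000, 20000)` returns 1: a longer base interval leaves room for fewer retries inside the same window, which is the kind of coupling that had to be recalculated by hand.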
Predefined retry schedules fix all of this by making the timing readable at a glance. Each mode has its own array of delays, and you can see exactly when each retry happens and when the dialog appears without doing any math.
How?
The exponential backoff in the polling manager is replaced with two explicit retry delay arrays in config:
These values were chosen somewhat arbitrarily, but they give solo connections more time to reconnect.
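For readers without the diff at hand, a schedule-based config could look roughly like the following. The array values and every identifier here are assumptions made for illustration. They were picked only so that the totals match the ~26s (solo) and ~15s (collaborative) figures in the testing notes, and they are not the values committed in this PR. The 30s background interval is the one figure taken from the description.

```typescript
// HYPOTHETICAL values and names, chosen to match the ~26s / ~15s totals only.
const SOLO_RETRY_DELAYS_MS: readonly number[] = [2000, 4000, 8000, 12000];
const COLLAB_RETRY_DELAYS_MS: readonly number[] = [1000, 2000, 4000, 8000];

// After the schedule is exhausted, retries continue at a fixed 30s interval.
const BACKGROUND_RETRY_MS = 30000;

// Delay to wait after the Nth consecutive failure (failureCount starts at 1).
function retryDelay(schedule: readonly number[], failureCount: number): number {
	return schedule[failureCount - 1] ?? BACKGROUND_RETRY_MS;
}

// Time from the first failure until the schedule runs out, ignoring how long
// each request takes to fail.
function timeUntilDialog(schedule: readonly number[]): number {
	return schedule.reduce((total, delay) => total + delay, 0);
}
```

Reading the timing then needs no arithmetic beyond a sum: `timeUntilDialog(SOLO_RETRY_DELAYS_MS)` is 26000 and `timeUntilDialog(COLLAB_RETRY_DELAYS_MS)` is 15000 for these illustrative arrays.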
Additionally, we no longer show the dialog mid-cycle. Previously, the dialog debounce timer could fire partway through a cycle, so the dialog might appear with only a few seconds left before the next retry. Now the dialog is shown only immediately after a failed request.
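The "only after a failure" rule can be expressed by making the dialog decision inside the failure handler rather than in a separate timer. This is a minimal sketch with hypothetical names (`ConnectionState`, `onPollFailure`, `onPollSuccess`); it assumes the dialog should appear once every scheduled retry has failed.

```typescript
// Hypothetical state shape, for illustration only.
interface ConnectionState {
	failureCount: number;
	showDialog: boolean;
}

// Called right after a request fails. The dialog flag can only change here,
// so it never flips on in the middle of a wait between retries.
function onPollFailure(state: ConnectionState, scheduleLength: number): ConnectionState {
	const failureCount = state.failureCount + 1;
	return {
		failureCount,
		// The initial failure plus one failure per scheduled retry.
		showDialog: failureCount > scheduleLength,
	};
}

// Any successful request clears the failure streak and hides the dialog.
function onPollSuccess(): ConnectionState {
	return { failureCount: 0, showDialog: false };
}
```

With a four-entry schedule, the first four failures leave `showDialog` false and the fifth sets it to true.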
Here's the new behavior for both modes:
Testing Instructions
Enable RTC via Settings -> Writing -> "Enable real-time collaboration" checkbox.
Solo editing (dialog after ~26s):
Collaborative editing (dialog after ~15s):
Use of AI Tools
AI assistance: Yes
Tool(s): Claude Code
Used for: Reviewing the previous disconnect timing logic, designing the fixed retry schedule, and implementing the changes