Make reconnect timeout mandatory with 5-minute default#1443
Conversation
The retry timeout was optional and defaulted to infinite, so a permanently down server caused the reload loop to spin forever. When the timeout did fire it returned a generic "reconnect timed out" message, hiding the underlying connect failure. Default the timeout to 5 minutes in both the Rust Backoff and the JS ReloadDelay, and propagate the most recent connect error through to the closed() future / closed promise so callers can see why the loop stopped.
| timeout?: DOMHighResTimeStamp; | ||
| // Resets after each successful connection. Set to 0 for unlimited retries. | ||
| // default: 300000 (5 minutes) | ||
| timeout: DOMHighResTimeStamp; |
There was a problem hiding this comment.
Keep the timeout? API. Set it to 5 minutes if undefined.
| pub struct Reconnect { | ||
| abort: tokio::task::AbortHandle, | ||
| closed_rx: tokio::sync::watch::Receiver<bool>, | ||
| closed_rx: tokio::sync::watch::Receiver<Option<String>>, |
There was a problem hiding this comment.
anyhow::Error instead of a String?
- JS: keep ReloadDelay.timeout optional, default to 5 minutes when undefined. - Rust: store Arc<anyhow::Error> in the closed channel instead of String, so the structured error is preserved instead of being stringified at the boundary.
WalkthroughThis pull request unifies timeout handling for reconnection logic across JavaScript and Rust implementations. The timeout configuration changes from an optional field (where omission disabled timeouts) to a required 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
✨ Simplify code
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
rs/moq-native/src/reconnect.rs (1)
152-159: 🏗️ Heavy liftCover the new timeout contract, not just the default value.
This test only proves the default is
300s. The behavioral changes here aretimeout == 0meaning unlimited retries andclosed()surfacing the last connection error on timeout; those paths need regression tests.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@rs/moq-native/src/reconnect.rs` around lines 152 - 159, Add tests that cover the new timeout contract beyond Backoff::default(): create one test (e.g., test_backoff_unlimited_timeout) that constructs a Backoff with timeout == Duration::ZERO and verifies retry behavior does not stop due to timeout (simulate advancing the backoff or counting retry attempts to ensure retries continue); and create another test (e.g., test_backoff_closed_returns_last_error) that exercises Backoff::closed() by driving retries until a timeout is reached and asserting closed() returns the last connection error observed. Use the existing Backoff struct and methods (Backoff::default(), Backoff::closed()) and simulate failures/advancing time as done elsewhere in reconnect.rs to trigger the two distinct code paths (timeout == 0 unlimited retries, and timeout expiry returning last error).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@js/net/src/connection/reload.ts`:
- Around line 132-137: Compute the remaining timeout budget (remaining = timeout
- (performance.now() - this.#retryStart)) before scheduling the next retry; if
remaining <= 0 call this.#closedReject(...) immediately, otherwise pass
Math.min(remaining, desiredDelay) to the effect.timer call so the next wait is
clamped to the remaining time. Update the logic around timeout, this.delay and
the effect.timer invocation (use this.delay or the delay value you currently
compute for the next retry) and keep using this.#retryStart and
this.#closedReject for consistency.
In `@rs/moq-native/src/reconnect.rs`:
- Around line 92-96: The loop currently sleeps for the full delay after a failed
connect, which can exceed the configured backoff.timeout; modify the reconnect
loop around where delay is awaited (referencing backoff.delay, backoff.timeout,
retry_start, and last_error) to compute the remaining_time =
backoff.timeout.saturating_sub(retry_start.elapsed()) and sleep for
min(backoff.delay, remaining_time) (or skip sleeping if remaining_time is_zero),
so the sleep is capped to the remaining timeout budget and the function returns
the timed-out Err using last_error when no time remains.
---
Nitpick comments:
In `@rs/moq-native/src/reconnect.rs`:
- Around line 152-159: Add tests that cover the new timeout contract beyond
Backoff::default(): create one test (e.g., test_backoff_unlimited_timeout) that
constructs a Backoff with timeout == Duration::ZERO and verifies retry behavior
does not stop due to timeout (simulate advancing the backoff or counting retry
attempts to ensure retries continue); and create another test (e.g.,
test_backoff_closed_returns_last_error) that exercises Backoff::closed() by
driving retries until a timeout is reached and asserting closed() returns the
last connection error observed. Use the existing Backoff struct and methods
(Backoff::default(), Backoff::closed()) and simulate failures/advancing time as
done elsewhere in reconnect.rs to trigger the two distinct code paths (timeout
== 0 unlimited retries, and timeout expiry returning last error).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7a49ef0a-2d8a-46a4-bfff-6b0b5754a3d3
📒 Files selected for processing (2)
js/net/src/connection/reload.tsrs/moq-native/src/reconnect.rs
| const timeout = this.delay.timeout ?? 300000; | ||
| if (timeout > 0) { | ||
| const elapsed = performance.now() - this.#retryStart; | ||
| if (elapsed >= this.delay.timeout) { | ||
| if (elapsed >= timeout) { | ||
| console.warn("reconnect timed out"); | ||
| this.#closedReject(new Error("reconnect timed out")); | ||
| this.#closedReject(err instanceof Error ? err : new Error(String(err))); |
There was a problem hiding this comment.
Clamp the next retry to the remaining timeout budget.
This check only rejects after a failed attempt. If this.#delay is larger than timeout - elapsed, the following effect.timer(...) can keep the loop alive well past the documented "maximum total time" before closed rejects.
Suggested fix
- const timeout = this.delay.timeout ?? 300000;
- if (timeout > 0) {
- const elapsed = performance.now() - this.#retryStart;
- if (elapsed >= timeout) {
- console.warn("reconnect timed out");
- this.#closedReject(err instanceof Error ? err : new Error(String(err)));
- return;
- }
- }
+ const timeout = this.delay.timeout ?? 300000;
+ let retryDelay = this.#delay;
+ if (timeout > 0) {
+ const elapsed = performance.now() - this.#retryStart;
+ const remaining = timeout - elapsed;
+ if (remaining <= 0) {
+ console.warn("reconnect timed out");
+ this.#closedReject(err instanceof Error ? err : new Error(String(err)));
+ return;
+ }
+ retryDelay = Math.min(retryDelay, remaining);
+ }
const tick = this.#tick.peek() + 1;
- effect.timer(() => this.#tick.update((prev) => Math.max(prev, tick)), this.#delay);
+ effect.timer(() => this.#tick.update((prev) => Math.max(prev, tick)), retryDelay);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const timeout = this.delay.timeout ?? 300000; | |
| if (timeout > 0) { | |
| const elapsed = performance.now() - this.#retryStart; | |
| if (elapsed >= this.delay.timeout) { | |
| if (elapsed >= timeout) { | |
| console.warn("reconnect timed out"); | |
| this.#closedReject(new Error("reconnect timed out")); | |
| this.#closedReject(err instanceof Error ? err : new Error(String(err))); | |
| const timeout = this.delay.timeout ?? 300000; | |
| let retryDelay = this.#delay; | |
| if (timeout > 0) { | |
| const elapsed = performance.now() - this.#retryStart; | |
| const remaining = timeout - elapsed; | |
| if (remaining <= 0) { | |
| console.warn("reconnect timed out"); | |
| this.#closedReject(err instanceof Error ? err : new Error(String(err))); | |
| return; | |
| } | |
| retryDelay = Math.min(retryDelay, remaining); | |
| } | |
| const tick = this.#tick.peek() + 1; | |
| effect.timer(() => this.#tick.update((prev) => Math.max(prev, tick)), retryDelay); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@js/net/src/connection/reload.ts` around lines 132 - 137, Compute the
remaining timeout budget (remaining = timeout - (performance.now() -
this.#retryStart)) before scheduling the next retry; if remaining <= 0 call
this.#closedReject(...) immediately, otherwise pass Math.min(remaining,
desiredDelay) to the effect.timer call so the next wait is clamped to the
remaining time. Update the logic around timeout, this.delay and the effect.timer
invocation (use this.delay or the delay value you currently compute for the next
retry) and keep using this.#retryStart and this.#closedReject for consistency.
| if !backoff.timeout.is_zero() && retry_start.elapsed() > backoff.timeout { | ||
| let timeout = backoff.timeout; | ||
| return Err(last_error | ||
| .map(|e| e.context(format!("reconnect timed out after {timeout:?}"))) | ||
| .unwrap_or_else(|| anyhow::anyhow!("reconnect timed out after {timeout:?}"))); |
There was a problem hiding this comment.
Cap the backoff sleep to the remaining timeout budget.
The timeout is checked before the next connect attempt, but the loop still sleeps for the full delay after each failure. If only 2 seconds remain and delay is 30 seconds, this can exceed the configured maximum retry time by another 28 seconds before returning.
Suggested fix
- if !backoff.timeout.is_zero() && retry_start.elapsed() > backoff.timeout {
- let timeout = backoff.timeout;
- return Err(last_error
- .map(|e| e.context(format!("reconnect timed out after {timeout:?}")))
- .unwrap_or_else(|| anyhow::anyhow!("reconnect timed out after {timeout:?}")));
- }
+ let mut sleep_for = delay;
+ if !backoff.timeout.is_zero() {
+ let elapsed = retry_start.elapsed();
+ let timeout = backoff.timeout;
+ if elapsed >= timeout {
+ return Err(last_error
+ .map(|e| e.context(format!("reconnect timed out after {timeout:?}")))
+ .unwrap_or_else(|| anyhow::anyhow!("reconnect timed out after {timeout:?}")));
+ }
+ sleep_for = sleep_for.min(timeout.saturating_sub(elapsed));
+ }
@@
- tokio::time::sleep(delay).await;
+ tokio::time::sleep(sleep_for).await;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if !backoff.timeout.is_zero() && retry_start.elapsed() > backoff.timeout { | |
| let timeout = backoff.timeout; | |
| return Err(last_error | |
| .map(|e| e.context(format!("reconnect timed out after {timeout:?}"))) | |
| .unwrap_or_else(|| anyhow::anyhow!("reconnect timed out after {timeout:?}"))); | |
| let mut sleep_for = delay; | |
| if !backoff.timeout.is_zero() { | |
| let elapsed = retry_start.elapsed(); | |
| let timeout = backoff.timeout; | |
| if elapsed >= timeout { | |
| return Err(last_error | |
| .map(|e| e.context(format!("reconnect timed out after {timeout:?}"))) | |
| .unwrap_or_else(|| anyhow::anyhow!("reconnect timed out after {timeout:?}"))); | |
| } | |
| sleep_for = sleep_for.min(timeout.saturating_sub(elapsed)); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@rs/moq-native/src/reconnect.rs` around lines 92 - 96, The loop currently
sleeps for the full delay after a failed connect, which can exceed the
configured backoff.timeout; modify the reconnect loop around where delay is
awaited (referencing backoff.delay, backoff.timeout, retry_start, and
last_error) to compute the remaining_time =
backoff.timeout.saturating_sub(retry_start.elapsed()) and sleep for
min(backoff.delay, remaining_time) (or skip sleeping if remaining_time is_zero),
so the sleep is capped to the remaining timeout budget and the function returns
the timed-out Err using last_error when no time remains.
Summary
Changes the reconnect timeout from an optional field to a mandatory one with a sensible 5-minute default. This ensures reconnection attempts always have a bounded timeout rather than potentially retrying indefinitely.
Changes
Rust (
rs/moq-native/src/reconnect.rs)Backoff.timeoutfromOption<Duration>toDurationwith adefault_value = "5m"in the CLI argument parserDuration::from_secs(300)(5 minutes) instead ofNoneif let Some(timeout)guard since timeout is now always presentclosed_rxwatch channel now carriesOption<String>(the error message) instead of a boolean flag, allowing callers to see the actual connection error that triggered the timeoutlast_errorand include it in the timeout error context for better diagnosticsclosed()method to extract and return the error message from the watch channelTypeScript (
js/net/src/connection/reload.ts)ReloadDelay.timeoutfrom optional to mandatory with a default of300000ms (5 minutes)Reloadconstructor to includetimeout: 300000if (this.delay.timeout !== undefined)guard#closedRejectinstead of a generic "reconnect timed out" messageImplementation Details
https://claude.ai/code/session_01HdKf7HkEmT2pJr9Fs143yh