fix: respect reconnect=false and clamp server-supplied retry: 0 #135
Draft
kinyoklion wants to merge 1 commit into rl/sdk-2345/parser-error-reconnect-state from
Three pre-existing reconnect-control gaps surfaced during the multi-agent review of #134 (SDK-2347):

1. `BackoffRetry::change_base_delay` accepted any duration, including `Duration::ZERO`. A server emitting `retry: 0` collapsed the backoff to zero across every reconnect path, producing a tight reconnect loop. Clamp the input to a 1 ms floor.
2. The EOF arm of `ReconnectingRequest::poll_next` unconditionally scheduled a reconnect, even when `reconnect_opts.reconnect` was false. Honor the flag and transition to `StreamClosed` when reconnect is disabled, matching every other error path.
3. The parse-error arm only transitioned state when reconnect was enabled. With reconnect disabled, the parser stayed poisoned and the next poll drained to EOF, where (2) above papered over the bug. Transition to `StreamClosed` so the documented "do not use the stream after error" contract holds.

Tests:
- `test_change_base_delay_clamps_to_minimum` pins the retry floor.
- `parser_error_closes_stream_when_reconnect_disabled` asserts the stream returns `None` after a parse error when reconnect is off.
- `eof_closes_stream_when_reconnect_disabled` asserts the stream returns `None` after end-of-body when reconnect is off.
kinyoklion commented May 8, 2026
/// Floor applied to a server-supplied SSE `retry:` value. A server that
/// sends `retry: 0` would otherwise collapse the backoff to zero and
/// reconnect would become a tight loop.
const MINIMUM_BASE_DELAY: Duration = Duration::from_millis(1);
We may want this larger.
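For reference while deciding on the floor, here is a minimal sketch of how such a clamp might be applied. `BackoffSketch` and its methods are hypothetical stand-ins, not the crate's actual `BackoffRetry`; only the `MINIMUM_BASE_DELAY` constant and the 1 ms value come from this PR.

```rust
use std::time::Duration;

/// Floor for the base delay, mirroring the 1 ms constant in this PR.
const MINIMUM_BASE_DELAY: Duration = Duration::from_millis(1);

/// Hypothetical, simplified stand-in for the crate's `BackoffRetry`.
/// It models only the base delay, not jitter or exponential growth.
struct BackoffSketch {
    base_delay: Duration,
}

impl BackoffSketch {
    /// Build a sketch with the given base delay, clamped to the floor.
    fn new(base_delay: Duration) -> Self {
        Self {
            base_delay: base_delay.max(MINIMUM_BASE_DELAY),
        }
    }

    /// Apply a server-supplied `retry:` value without going below the floor.
    fn change_base_delay(&mut self, requested: Duration) {
        // `Duration` implements `Ord`, so `max` selects the larger value.
        self.base_delay = requested.max(MINIMUM_BASE_DELAY);
    }

    /// Current base delay after clamping.
    fn base_delay(&self) -> Duration {
        self.base_delay
    }
}
```

Under this sketch, raising the floor would be a one-line change to the constant. Values at or above the floor pass through unchanged.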
Summary
Stacked on top of #134. Addresses three pre-existing reconnect-control gaps surfaced during the multi-agent review of that PR.
1. `retry: 0` collapses backoff to zero. A server emitting `retry: 0` set `BackoffRetry::base_delay` to `Duration::ZERO`, after which every reconnect path (`client.rs:474, 525, 547, 612` and the new parse-error path) computed `next_delay() == 0` and reconnected immediately, producing a tight loop. Clamps `change_base_delay` to a 1 ms floor.
2. EOF arm ignored `reconnect_opts.reconnect`. The body-exhausted branch unconditionally scheduled `WaitingToReconnect` even when reconnect was disabled. Now honors the flag and transitions to `StreamClosed`, matching every other error path.
3. Parse-error arm with `reconnect=false` left the parser poisoned. No state transition happened, so the next poll drained the broken body to EOF, where (2) above papered over the bug. Now transitions to `StreamClosed` so the documented "do not use the stream after error" contract holds.

Context
Stacked PR: base is the `rl/sdk-2345/parser-error-reconnect-state` branch from #134, not `main`. Will retarget once #134 lands.
Tracked in SDK-2347. Predecessor: SDK-2345 / #134.
Surfaced from the multi-agent review of #134 (findings 1, 2, and the corresponding suggested follow-ups 4 and 5). All three were pre-existing — #134 only made the parse-error path more visible.
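The intended behavior of fixes 2 and 3 in the Summary can be sketched as a small transition function. This is a simplified model under stated assumptions, not the code in `client.rs`: `State`, `Terminal`, and `next_state` are hypothetical names, and only the `WaitingToReconnect` and `StreamClosed` state names are taken from this PR.

```rust
/// Hypothetical, simplified model of the stream states named in this PR.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Streaming,
    WaitingToReconnect,
    StreamClosed,
}

/// Hypothetical labels for the two arms this PR changes.
#[derive(Debug, Clone, Copy)]
enum Terminal {
    /// The response body was exhausted.
    Eof,
    /// The parser rejected the incoming bytes.
    ParseError,
}

/// State to enter after a terminal condition while streaming.
/// Both arms consult the reconnect flag: enabled schedules a reconnect,
/// disabled closes the stream so later polls yield `None`.
fn next_state(cause: Terminal, reconnect: bool) -> State {
    match (cause, reconnect) {
        (Terminal::Eof, true) | (Terminal::ParseError, true) => State::WaitingToReconnect,
        (Terminal::Eof, false) | (Terminal::ParseError, false) => State::StreamClosed,
    }
}
```

The match is written out per arm to make explicit that neither EOF nor a parse error may skip the flag check, which is the gap both fixes close.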
Test plan
- `test_change_base_delay_clamps_to_minimum` (`retry.rs`): pins the 1 ms floor against `change_base_delay(Duration::ZERO)` and a sub-floor value.
- `parser_error_closes_stream_when_reconnect_disabled` (`client.rs`): asserts the stream emits one `InvalidLine` then `None` when reconnect is off.
- `eof_closes_stream_when_reconnect_disabled` (`client.rs`): asserts the stream emits the event, `Eof`, then `None` when reconnect is off.
- `parser_error_schedules_reconnect_immediately` from #134 ("fix: schedule reconnect after parse error during streaming") still passes.
- `cargo test`: 63 lib tests + 1 doc test pass.
- `cargo fmt --check` clean.