Skip to content

Retry stateful reconnect attempts #50745

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

BrennanConroy
Copy link
Member

Add a few retires when attempting to reconnect during stateful reconnect.
Increases the likelihood of reconnecting successfully during a short network break.

Could consider adding a Task.Delay(1000) between each iteration.

@BrennanConroy BrennanConroy added the area-signalr Includes: SignalR clients and servers label Sep 15, 2023
@davidfowl
Copy link
Member

Why wouldn't we a number?

@@ -2082,7 +2082,7 @@ public bool ShouldProcessMessage(HubMessage message)
{
if (!_messageBuffer.ShouldProcessMessage(message))
{
Log.DroppingMessage(_logger, ((HubInvocationMessage)message).GetType().Name, ((HubInvocationMessage)message).InvocationId);
Log.DroppingMessage(_logger, message.GetType().Name, (message as HubInvocationMessage)?.InvocationId ?? "(null}");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Log.DroppingMessage(_logger, message.GetType().Name, (message as HubInvocationMessage)?.InvocationId ?? "(null}");
Log.DroppingMessage(_logger, message.GetType().Name, (message as HubInvocationMessage)?.InvocationId ?? "(null)");

@BrennanConroy
Copy link
Member Author

Why wouldn't we a number?

I assume you mean configurable? Because RC2 ends today, and we probably would want something more like IRetryPolicy.

Copy link
Member

@davidfowl davidfowl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is gonna bite us 🥲 . Do we read IConfiguration in the client? In JavaScript we can read a property without public API 😄.

@adityamandaleeka
Copy link
Member

Are we still on the fence about this for 8? Is this strictly better than what we have today? Or is the concern that someone may want to dial back the retries via config and not be able to?

@BrennanConroy
Copy link
Member Author

Are we still on the fence about this for 8?

Yeah, no one has asked for it yet, and David has reservations.

Is this strictly better than what we have today?

I think it is, it increases the odds of a successful stateful reconnect.
The main downside is that in theory there could be a slightly longer delay for the connection to be seen as disconnected if the last reconnect attempt took a long time to fail to connect. We could add a timeout though to mitigate that.

We could wait and see if customers have feedback relating to retrying the reconnect and try getting this change into a patch.

@adityamandaleeka
Copy link
Member

We could wait and see if customers have feedback relating to retrying the reconnect and try getting this change into a patch.

Makes sense to me, and this seems like a reasonable thing to add in servicing.

@ghost ghost added the pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun label Oct 3, 2023
@wtgodbe wtgodbe removed the pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun label Feb 13, 2024
@dotnet dotnet deleted a comment Feb 13, 2024
@dotnet-policy-service dotnet-policy-service bot added the area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework label Feb 13, 2024
@halter73
Copy link
Member

I assume you mean configurable? Because RC2 ends today, and we probably would want something more like IRetryPolicy.

Why doesn't this use the IRetryPolicy? Is it that we want to allow developers to specify a different policy for stateful reconnects and fall back to the current policy for stateless reconnects?

@wtgodbe wtgodbe removed the area-infrastructure Includes: MSBuild projects/targets, build scripts, CI, Installers and shared framework label Feb 13, 2024
@dotnet-policy-service dotnet-policy-service bot added the pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun label Feb 20, 2024
@danmoseley danmoseley requested a review from Copilot February 14, 2025 04:06
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (6)

src/SignalR/clients/csharp/Http.Connections.Client/src/Internal/WebSocketsTransport.cs:427

  • [nitpick] The error message 'Reconnect attempt failed.' is generic. Consider providing more context about the failure, such as including the exception message.
throw new InvalidOperationException("Reconnect attempt failed.", innerException: exception);

src/SignalR/clients/csharp/Http.Connections.Client/src/Internal/WebSocketsTransport.cs:409

  • The new retry behavior should be covered by tests to ensure it works as expected. Add tests to verify the retry logic.
for (var i = 0; i < 3; i++)

src/SignalR/clients/ts/signalr/src/HttpConnection.ts:445

  • [nitpick] The variable 'error' should be explicitly typed to avoid ambiguity. Consider using 'let error: any;' or a more specific type.
let error;

src/SignalR/clients/ts/signalr/src/HttpConnection.ts:461

  • The 'catch' block should capture and log the exception for better debugging. Consider using 'catch (ex) { this._logger.log(LogLevel.Error, Reconnect attempt failed with error: ${ex}); callStop = true; }'.
catch {

src/SignalR/clients/ts/signalr/src/HttpConnection.ts:446

  • Ensure that the retry mechanism is covered by tests to verify its behavior. Add tests that simulate transient network issues and validate the reconnect attempts.
for (let i = 0; i < 3; i++) {

src/SignalR/clients/ts/signalr/tests/HttpConnection.test.ts:1995

  • Consider adding a test case to verify that the error is logged if all retries fail. This will ensure that the new behavior is fully tested.
TestWebSocket.webSocket.onclose(new TestEvent());

@@ -2082,7 +2082,7 @@ public bool ShouldProcessMessage(HubMessage message)
{
if (!_messageBuffer.ShouldProcessMessage(message))
{
Log.DroppingMessage(_logger, ((HubInvocationMessage)message).GetType().Name, ((HubInvocationMessage)message).InvocationId);
Log.DroppingMessage(_logger, message.GetType().Name, (message as HubInvocationMessage)?.InvocationId ?? "(null}");
Copy link
Preview

Copilot AI Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The closing parenthesis in the error message is incorrect. It should be "(null)" instead of "(null}".

Suggested change
Log.DroppingMessage(_logger, message.GetType().Name, (message as HubInvocationMessage)?.InvocationId ?? "(null}");
Log.DroppingMessage(_logger, message.GetType().Name, (message as HubInvocationMessage)?.InvocationId ?? "(null)");

Copilot uses AI. Check for mistakes.

@BrennanConroy
Copy link
Member Author

Will come back in the future.

@dotnet-policy-service dotnet-policy-service bot added this to the 8.0.18 milestone Jun 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-signalr Includes: SignalR clients and servers pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants