Connection disconnected with error 'Error: Server timeout elapsed without receiving a message from the server.'. #42778

Closed
1 task done
MicroTrendsTom opened this issue Jul 18, 2022 · 16 comments
Labels
area-blazor Includes: Blazor, Razor Components feature-blazor-deployment Issues related to deploying Blazor feature-blazor-server ✔️ Resolution: Answered Resolved because the question asked by the original author has been answered. question Status: Resolved

Comments

@MicroTrendsTom

MicroTrendsTom commented Jul 18, 2022

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

The browser shows the disconnect/reconnect message repeatedly, in some cases multiple times per minute.

The disconnects are frequent enough to make the system unusable unless the client is in very close proximity to the server in the USA.
Users in other countries, particularly in Asia, report that it is unusable.
The Asia regions were stable previously; for example, soak tests could run all night without a disconnect.
For the last 2 to 3 months, however, it has been very bad. I have tried server-side configuration changes in code, forced sticky sessions, and changes on the app configuration side.

This occurs in local debugging, Dev, Stage, and Production, so I'm assuming it's related to the SignalR connection.


I tried pretty much everything I could find on the web over many weeks before submitting this issue, but to no avail so far.
So I thought I would drop you a line. Thank you in advance.

We have gone all in on Azure + Blazor, so this is mission critical for us :-)

Expected Behavior

Very few disconnects, if any.

Steps To Reproduce

Navigate to the following URL from a location in Asia:
https://alphawebtrader.azurewebsites.net/
Open the browser console and view the output.
Wait for the disconnect/reconnect.

Exceptions (if any)

Error: Connection disconnected with error 'Error: Server timeout elapsed without receiving a message from the server.'.

.NET Version

6

Anything else?

Visual Studio 2020

alphawebtrader.azurewebsites.net-1658144604996.log
Program.cs.txt
AlphaWebTrader.csproj.txt

@javiercn javiercn added area-blazor Includes: Blazor, Razor Components feature-blazor-server feature-blazor-deployment Issues related to deploying Blazor labels Jul 18, 2022
@javiercn
Member

javiercn commented Jul 18, 2022

@MicroTrendsTom thanks for contacting us.

Are you using Azure SignalR service to scale-out the connections?

Never mind, I saw it in your settings.

Also, are you deploying to a single region within Azure and accessing from different continents?

The first thing we would recommend would be to try to put servers closer to where your users are. Given that Blazor is a real-time framework, traffic across continents can make the experience suffer, since at such distances, even at the speed of light you have limitations, so each server roundtrip can result in a few hundred milliseconds of latency.
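The speed-of-light point above can be made concrete with a rough calculation. This is a minimal sketch that only accounts for propagation delay; the fiber speed (about 200,000 km/s, roughly two thirds of the speed of light in a vacuum) is a common approximation, and the 15,000 km path length and the `minRoundTripMs` helper are hypothetical values chosen for illustration. Real round trips add routing, queuing, and processing time on top of this floor.

```javascript
// Approximate signal speed in optical fiber (an assumption, not a measured value).
const FIBER_SPEED_KM_PER_S = 200000;

// Hypothetical helper: lower bound on round-trip time from propagation delay alone.
function minRoundTripMs(pathKm) {
  // Out and back (2x), converted from seconds to milliseconds.
  return (2 * pathKm * 1000) / FIBER_SPEED_KM_PER_S;
}

// Illustrative 15,000 km path between a US region and a distant user.
console.log(minRoundTripMs(15000)); // 150
```

Since a Blazor Server interaction typically needs at least one round trip before the UI updates, a floor of this size is already noticeable before any other overhead is added.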

Another thing to check is if there is a way to increment the timeout settings on the client, since I believe what is failing is the ping mechanism. @BrennanConroy is this possible?

It might also help to capture the network traffic to try and detect if something is going on at the server when these things happen. (Is the app restarting/the server rebooting? Which will cause the existing circuits to get destroyed)

Does the connection recover or do you need to refresh the page to make it go away? Does the problem persist after refreshing the page?

@javiercn javiercn added the Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. label Jul 18, 2022
@ghost

ghost commented Jul 18, 2022

Hi @MicroTrendsTom. We have added the "Needs: Author Feedback" label to this issue, which indicates that we have an open question for you before we can take further action. This issue will be closed automatically in 7 days if we do not hear back from you by then - please feel free to re-open it if you come back to this issue after that time.

@MicroTrendsTom
Author

MicroTrendsTom commented Jul 18, 2022

Hi

Dedicated SignalR: yes, as you saw.

"Also, are you deploying to a single region within Azure and accessing from different continents?"
Yes, a single region only: North Central US. That was working OK in Asia until about 2 to 3 months ago. According to customer reports, Australia is now unusable but was OK before. Singapore is not bad: on and off, a few disconnects an hour or more. Thailand is very sketchy. Maybe a submarine cable was damaged, or something changed regionally with the networks.

Client-to-server ping times average 15 ms to 30 ms, but I'm not sure about line quality (jitter and so on).

Side-by-side tests
I have run other Blazor websites hosted in the USA alongside it, such as https://blazorhelpwebsite.com/ and blazotrain.com, but there were no errors on those, even on pages with no dynamic elements.

OK, noted: we could use geo-regional web servers with Azure.

"Another thing to check is if there is a way to increment the timeout settings on the client, since I believe what is failing is the ping mechanism. @BrennanConroy is this possible?"
This sounds like a good idea; I will wait for the answer on that one. I have provided Program.cs.txt, which has some settings in it.

builder.Services.AddSignalR().AddAzureSignalR(options =>
{
    options.ServerStickyMode = Microsoft.Azure.SignalR.ServerStickyMode.Required;
    options.MaxPollIntervalInSeconds = 10;
});

Is this related?

"Does the connection recover or do you need to refresh the page to make it go away? Does the problem persist after refreshing the page?"

The system manages to reconnect about 99% of the time.
After a refresh, the problem comes back.
After a service restart, the problem also persists.
The problem is the same on any page, with static or dynamic content.

some settings here:
Program.cs.txt.txt

Browser Network:
network-request-stack
network-timing
network

Further tests on my end:

  1. Remove the SignalR server and run a local debug version, to prove it is not a code-based issue or a SignalR server issue.
  2. Create a blank website first and see what out-of-the-box Blazor does when run from the same app server alongside the problematic site, to verify it is not a rogue website codebase.
  3. Geo-located server test.
  4. Remove Google Analytics etc.

@ghost ghost added Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. and removed Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. labels Jul 18, 2022
@BrennanConroy
Member

options.KeepAliveInterval = TimeSpan.FromSeconds(120);

Did you also change the client side code (serverTimeoutInMilliseconds) to not expect a ping for (recommended) double that time?

@mkArtakMSFT mkArtakMSFT added Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. and removed Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. labels Jul 18, 2022
@ghost

ghost commented Jul 18, 2022

Hi @MicroTrendsTom. We have added the "Needs: Author Feedback" label to this issue, which indicates that we have an open question for you before we can take further action. This issue will be closed automatically in 7 days if we do not hear back from you by then - please feel free to re-open it if you come back to this issue after that time.

@MicroTrendsTom
Author

MicroTrendsTom commented Jul 18, 2022

OK, will check this.

1) Geo location test
I have just tested a SignalR instance in the Southeast Asia region; it still has the same issue, with the same error, as the one in North Central US.
[2022-07-18T16:29:33.057Z] Information: WebSocket connected to wss://signalr-sea-001.service.signalr.net/client/?hub=componenthub&asrs.op=%2F_blazor...
[2022-07-18T16:29:29.475Z] Error: Connection disconnected with error 'Error: Server timeout elapsed without receiving a message from the server.'.

Another test would be a server app and SignalR in the same zone, with the client in Southeast Asia.

  2. Ruled out the Blazor analytics JScript.

  3. options.KeepAliveInterval = TimeSpan.FromSeconds(120)- set to options.KeepAliveInterval = TimeSpan.FromSeconds(240);
    Tested; the reconnect message still appears:
    [2022-07-18T17:27:29.610Z] Error: Connection disconnected with error 'Error: Server timeout elapsed without receiving a message from the server.'

@ghost ghost added Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. and removed Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. labels Jul 18, 2022
@BrennanConroy
Member

3. options.KeepAliveInterval = TimeSpan.FromSeconds(120)- set to options.KeepAliveInterval = TimeSpan.FromSeconds(240);

That is not what I said, is this a typo?

@MicroTrendsTom
Author

  3. options.KeepAliveInterval = TimeSpan.FromSeconds(120)- set to options.KeepAliveInterval = TimeSpan.FromSeconds(240);

That is not what I said, is this a typo?

OK, sorry. Should I set it to TimeSpan.FromSeconds(60)?
It was changed to 120 as part of a test.

@BrennanConroy
Member

set it to TimeSpan.FromSeconds(60)

Changing this setting isn't going to help unless you also change the serverTimeoutInMilliseconds option on the client. The issue you are seeing is a timeout on the client-side, which is very likely caused by the keep alive interval on the server being changed without also updating the client.
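The mismatch described here can be checked with simple arithmetic. The sketch below assumes the documented SignalR defaults (a 15-second server keep-alive interval and a 30-second server timeout in the JavaScript client) and that the connection is otherwise idle, so pings are the only server messages. The helper names `clientTimesOutBetweenPings` and `recommendedServerTimeoutMs` are hypothetical and exist only for illustration.

```javascript
// Assumed defaults: server KeepAliveInterval of 15 s, client serverTimeoutInMilliseconds of 30 s.
const DEFAULT_KEEP_ALIVE_MS = 15000;
const DEFAULT_SERVER_TIMEOUT_MS = 30000;

// Hypothetical helper: true when, on an idle connection, the client's timeout
// elapses before the server's next keep-alive ping is due.
function clientTimesOutBetweenPings(keepAliveMs, serverTimeoutMs) {
  return serverTimeoutMs <= keepAliveMs;
}

// Hypothetical helper: the "double the keep-alive interval" rule of thumb from this thread.
function recommendedServerTimeoutMs(keepAliveMs) {
  return 2 * keepAliveMs;
}

// Defaults on both sides: the ping arrives well inside the timeout window.
console.log(clientTimesOutBetweenPings(DEFAULT_KEEP_ALIVE_MS, DEFAULT_SERVER_TIMEOUT_MS)); // false

// Server raised to 120 s while the client keeps its default timeout.
console.log(clientTimesOutBetweenPings(120000, DEFAULT_SERVER_TIMEOUT_MS)); // true
console.log(recommendedServerTimeoutMs(120000)); // 240000
```

Under these assumptions, raising the server interval further (for example to 240 seconds) only widens the gap, which is consistent with the error persisting in the test above.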

@MicroTrendsTom
Author

OK, gotcha, understood.
Where do I set serverTimeoutInMilliseconds?
It sounds like it would be here?

<script src="_framework/blazor.server.js"></script>

Or in the Program.cs builder?

@javiercn
Member

@MicroTrendsTom You can do it like this

<body>
    ...

    <script src="_framework/blazor.server.js" autostart="false"></script>
    <script>
      Blazor.start({
        configureSignalR: function (builder) {
          builder.serverTimeoutInMilliseconds = ...
        }
      });
    </script>
</body>
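One caveat worth checking against the client version in use: in the SignalR JavaScript client, serverTimeoutInMilliseconds is a property of the built HubConnection, so assigning it to the builder may have no effect in some versions. A pattern that appears in the .NET 6-era Blazor guidance is to build the connection, set the property on it, and have the builder return that same connection. The sketch below follows that pattern; the `applyServerTimeout` helper and the 60000 value are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: make the connection produced by a SignalR
// HubConnectionBuilder carry the given server timeout.
function applyServerTimeout(builder, serverTimeoutMs) {
  // Build the connection once and set the timeout on the connection itself.
  const connection = builder.build();
  connection.serverTimeoutInMilliseconds = serverTimeoutMs;
  // When Blazor later calls build(), return the already-configured connection.
  builder.build = () => connection;
  return builder;
}

// Runs only in a page that loaded blazor.server.js with autostart="false".
if (typeof Blazor !== "undefined") {
  Blazor.start({
    // 60000 ms is an illustrative value, not a recommendation from this thread.
    configureSignalR: (builder) => applyServerTimeout(builder, 60000),
  });
}
```

Keeping the logic in a small function makes it possible to exercise it with a stub builder outside the browser.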

@MicroTrendsTom
Author

MicroTrendsTom commented Jul 19, 2022

OK, awesome assistance, guys. Thank you.

Information summary from the above:
If options.KeepAliveInterval is changed on the server side, serverTimeoutInMilliseconds must also be changed on the client side.
#42778 (comment)

Why were the defaults changed?

  1. Mobile phone browsers were cutting out or requiring a reload too quickly, so settings were changed.
  2. In addition, a sticky session was required to prevent an exception, so the detailed setting was added.

Ad hoc changes were made, and the end result caused the issue.
It seems a little knowledge is dangerous: these actions fixed some issues but broke the normally reliable Blazor defaults.

Actions:
Using this guide: https://github.com/dotnet/AspNetCore.Docs/blob/main/aspnetcore/signalr/configuration.md
Rolling back to defaults to test.

@MicroTrendsTom
Author

Quick system soak test for disconnect/reconnect
Results so far: the website https://alphawebtrader.azurewebsites.net/ has not disconnected at all, shows no errors, and has outperformed a parallel test of a clone with the prior configuration.

Based on the information and knowledge provided above, it would seem that this is resolved, albeit over a very short test interval.
I will continue to test, prepare to roll out to prod, and give feedback in a few hours.

@javiercn
Member

@MicroTrendsTom thanks for the additional details.

I am glad that we were able to figure out the cause of the issue. We will be improving the docs in this area as a result.

@guardrex can you add something to the docs in the deployment section? I think the two important bits of information to add are:

  • Guidance for "global" deployments:
    • Prefer deploying to the same region your users are in.
    • Considerations about increased latency when dealing with traffic across continents.
  • How to update the timeouts if the app is aggressively displaying the reconnection UI because of ping timeouts.
    • How to set the value on the server and the client.
    • How to select a value for the server (at the very least, double the maximum round-trip time you expect between client and server).
    • How to select a value for the client (double the server value, as per @BrennanConroy's recommendation).
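The last two bullets can be read as a small selection rule. The sketch below is one interpretation, not official guidance: it takes "the value for the server" to mean the keep-alive interval, keeps it at or above an assumed 15-second default, and doubles it for the client timeout. The `pickTimeouts` helper and the floor at the default are assumptions made for illustration.

```javascript
// Assumed server default keep-alive interval (15 s).
const KEEP_ALIVE_FLOOR_MS = 15000;

// Hypothetical helper: derive both settings from the worst round trip you expect.
function pickTimeouts(maxRoundTripMs) {
  // Server: at least double the maximum expected round trip, never below the default.
  const keepAliveMs = Math.max(KEEP_ALIVE_FLOOR_MS, 2 * maxRoundTripMs);
  // Client: double the server value.
  const serverTimeoutMs = 2 * keepAliveMs;
  return { keepAliveMs, serverTimeoutMs };
}

// A 300 ms worst-case round trip stays on the defaults.
console.log(pickTimeouts(300)); // { keepAliveMs: 15000, serverTimeoutMs: 30000 }

// A (deliberately extreme) 10 s worst-case round trip raises both values.
console.log(pickTimeouts(10000)); // { keepAliveMs: 20000, serverTimeoutMs: 40000 }
```

Under this reading, typical intercontinental round trips of a few hundred milliseconds do not call for changing either default, which matches the outcome of rolling back to defaults in this thread.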

@javiercn javiercn added question ✔️ Resolution: Answered Resolved because the question asked by the original author has been answered. and removed Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. labels Jul 19, 2022
@ghost ghost added the Status: Resolved label Jul 19, 2022
@MicroTrendsTom
Author

Sounds great. The problem has definitely been resolved by the above.
We had a few disconnects, but so did RDP sessions, so nothing unexpected.

Documentation on optimizing for the mobile browser experience would also be a real win.
Out of the box, the mobile experience tends to be not so great, and users abandon Blazor on mobile in favor of desktop because of reconnects, reloads, and dropped circuits. Blunders can also be made when trying to resolve this.

It would be good if SignalR could do geo scale-out to give clients the nearest endpoint, without necessarily adding a web app in that region too.

@ghost

ghost commented Jul 20, 2022

This issue has been resolved and has not had any activity for 1 day. It will be closed for housekeeping purposes.

See our Issue Management Policies for more information.

@ghost ghost closed this as completed Jul 20, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Aug 19, 2022