(blazor) Failed to reconnect to the server #10325

Closed
Stamo-Gochev opened this issue May 17, 2019 · 21 comments

Comments

@Stamo-Gochev Stamo-Gochev commented May 17, 2019

Describe the bug

When running a server-side blazor project, after some time, the SignalR connection is lost and the following overlay is displayed ("Failed to reconnect to the server"):
image
However, clicking the retry button does not work - the overlay is still present although the following is logged in the dev tools:

Information: Normalizing '_blazor' to 'https://localhost:44369/<APP-NAME>/_blazor'.
Information: WebSocket connected to wss://localhost:44369/<APP-NAME>/_blazor?id=eKQVfTXvMHxKidi6ivsLag.

Probably #8710 will fix this, but is there a workaround for preview 5? Also note that there are times when an overlay with "Attempting to reconnect to the server" is displayed and the app actually reconnects successfully. However, once the app reaches a state in which the "Failed to reconnect to the server" message is displayed, the "Retry" button does not reconnect it.

To Reproduce

  1. Run the sample project from the default server-side blazor template
  2. Leave the browser tab without any interaction for some time until the overlay is displayed (this mimics the behavior of an end user). The alternative is to disconnect manually in order to get the notification for the connection loss. Note that calling:
window['Blazor']._internal.forceCloseConnection()

results in the "Attempting to reconnect to the server" overlay, and that reconnection succeeds.

Expected behavior

The retry button works.

Additional context

dotnet --info

.NET Core SDK (reflecting any global.json):
 Version:   3.0.100-preview5-011568
 Commit:    b487ff10aa

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.17134
 OS Platform: Windows
 RID:         win10-x64
 Base Path:   C:\Program Files\dotnet\sdk\3.0.100-preview5-011568\

Host (useful for support):
  Version: 3.0.0-preview5-27626-15
  Commit:  61f30f5a23

.NET Core SDKs installed:
  1.1.13 [C:\Program Files\dotnet\sdk]
  2.1.503 [C:\Program Files\dotnet\sdk]
  2.1.600-preview-009472 [C:\Program Files\dotnet\sdk]
  2.1.602 [C:\Program Files\dotnet\sdk]
  2.1.700-preview-009597 [C:\Program Files\dotnet\sdk]
  2.1.700-preview-009618 [C:\Program Files\dotnet\sdk]
  2.2.103 [C:\Program Files\dotnet\sdk]
  3.0.100-preview5-011568 [C:\Program Files\dotnet\sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.7 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.9 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.2.1 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.2.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.2.3 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.7 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.9 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.2.1 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.2.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.2.3 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.0.0-preview5-19227-01 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 1.0.15 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 1.1.12 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.0.0 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.7 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.9 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.2.1 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.2.2 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.2.3 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.0.0-preview-27324-5 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.0.0-preview5-27626-15 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]

  Microsoft.WindowsDesktop.App 3.0.0-preview5-27626-15 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download
@Stamo-Gochev Stamo-Gochev changed the title Failed to reconnect to the server (blazor) Failed to reconnect to the server May 17, 2019
@Eilon Eilon added the area-blazor label May 17, 2019

@ivanchev ivanchev commented May 20, 2019

Some more info:

Testing the same website on our staging server, I get the following log when it fails to reconnect:

POST http://<HOST>/_blazor/negotiate 503 (Service Unavailable)
blazor.server.js:1 [2019-05-20T08:15:01.098Z] Error: Failed to complete negotiation with the server: Error: Service Unavailable
blazor.server.js:1 [2019-05-20T08:15:01.099Z] Error: Failed to start the connection: Error: Service Unavailable
blazor.server.js:15 [2019-05-20T08:15:01.099Z] Error: Error: Service Unavailable
blazor.server.js:15 [2019-05-20T08:15:01.101Z] Error: Error: Cannot send data if the connection is not in the 'Connected' State.
@danroth27 danroth27 added this to To do in Blazor via automation May 20, 2019
@danroth27 danroth27 added this to the 3.0.0-preview6 milestone May 20, 2019
Member

@danroth27 danroth27 commented May 20, 2019

@mkArtakMSFT Please assign for investigation.

Member

@danroth27 danroth27 commented May 20, 2019

@Stamo-Gochev This should only happen if the server gets shut down or recycled. For example, in VS this can happen if you edit a C# file: VS detects the file change and then restarts the app.

Author

@Stamo-Gochev Stamo-Gochev commented May 20, 2019

When testing with IIS Express (running from Visual Studio), I am not changing any of the code. The case is similar when hosting with IIS/Kestrel. What might cause the server to shut down? Is there a way to handle the reconnection from the server?


@ivanchev ivanchev commented May 20, 2019

We're hosting this on IIS as well on the staging server, with the same results. Using the default app pool, without any specific configuration. The app is published with Release configuration.

Member

@danroth27 danroth27 commented May 20, 2019

Is there any chance that the app was recycled in the staging environment when this happened?


@ivanchev ivanchev commented May 20, 2019

We will check if we can monitor the logs, and cross reference them with any dropped connections. Will write back as soon as we have a clear answer to this.


@ivanchev ivanchev commented May 21, 2019

The log doesn't show any useful info. Here's the error from the browser console at the time of the crash:
[2019-05-21T06:53:45.046Z] Error: Connection disconnected with error 'Error: WebSocket closed with status code: 1006 ().'.

And here's the log from the staging server at that time:

2019-05-21 06:53:45 172.31.176.190 GET /blazor-ui/_blazor id=I72uhps1Z_6HB_lwn8HgOA 80 - 192.168.14.187 Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+10_13_3)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/74.0.3729.131+Safari/537.36 - 101 0 0 681762
2019-05-21 06:53:45 172.31.176.190 GET /blazor-ui/_blazor id=65lcXYvl-m9nk9laC8u_8Q 443 - 192.168.149.61 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/74.0.3729.131+Safari/537.36 - 101 0 0 870607
2019-05-21 06:53:45 172.31.176.190 GET /blazor-ui/_blazor id=jNnshkk6BZAwG0aO86sUeg 80 - 192.168.14.187 Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+10_13_3)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/74.0.3729.131+Safari/537.36 - 101 0 0 692329
2019-05-21 06:53:45 172.31.176.190 GET /blazor-ui/_blazor id=MhPWBrlevszZ_cf_0_Isbw 80 - 192.168.14.187 Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+10_13_3)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/74.0.3729.131+Safari/537.36 - 101 0 0 639775
2019-05-21 06:53:45 172.31.176.190 GET /blazor-ui/_blazor id=oyZBbLadu59gsb5_Y-bADg 80 - 192.168.14.187 Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+10_13_3)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/74.0.3729.131+Safari/537.36 - 101 0 0 660625
2019-05-21 06:53:45 172.31.176.190 GET /blazor-ui/_blazor id=lqhrIZLfGNeVvYvCN0vLcw 80 - 192.168.14.187 Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+10_13_3)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/74.0.3729.131+Safari/537.36 - 101 0 0 798451
#Software: Microsoft Internet Information Services 8.5
#Version: 1.0
#Date: 2019-05-21 06:54:21
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
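
For reading the IIS entries above: each line follows the column order given in the #Fields header. A minimal sketch of splitting one entry into named fields (the `parseW3cLine` helper is hypothetical, written only for illustration; it assumes space-separated values with no embedded spaces, which holds for the lines shown):

```javascript
// Hypothetical helper: map one W3C extended log line onto the names in its #Fields header.
function parseW3cLine(fieldsHeader, line) {
  const names = fieldsHeader.replace(/^#Fields:\s*/, '').trim().split(/\s+/);
  const values = line.trim().split(/\s+/);
  if (values.length !== names.length) {
    throw new Error(`expected ${names.length} fields, got ${values.length}`);
  }
  return Object.fromEntries(names.map((name, i) => [name, values[i]]));
}

const fields =
  '#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username ' +
  'c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken';

// First entry from the staging log quoted above.
const entry = parseW3cLine(
  fields,
  '2019-05-21 06:53:45 172.31.176.190 GET /blazor-ui/_blazor id=I72uhps1Z_6HB_lwn8HgOA 80 - ' +
    '192.168.14.187 Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+10_13_3)+AppleWebKit/537.36+' +
    '(KHTML,+like+Gecko)+Chrome/74.0.3729.131+Safari/537.36 - 101 0 0 681762'
);

console.log(entry['sc-status']); // "101"
console.log(Math.round(Number(entry['time-taken']) / 1000)); // 682 (seconds)
```

Status 101 is the HTTP "Switching Protocols" response of a WebSocket upgrade, and IIS reports time-taken in milliseconds, so this connection had been open for roughly eleven minutes when the entry was written.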

I had several tabs loaded, and they all crashed at the same time.

Could you also clarify what you mean by recycling the app: the app pool or the app itself?

The current setup is the app published to a destination folder, which is hosted through IIS. This is done only through a build, and the build is not triggered during the test period, when the crash happens. In other words the files of the hosted application are not modified at all.

Author

@Stamo-Gochev Stamo-Gochev commented May 21, 2019

The findings so far show that the problem is caused by an (unexpected) restarting of the w3wp.exe process on our end (on the staging environment) and I am not able to reproduce the problem when the app is hosted on my IIS (by making sure nothing is modified, restarted, etc.). We will further investigate, but the issue seems to be caused by our environment.

@javiercn javiercn added the cost: XS label May 21, 2019
Author

@Stamo-Gochev Stamo-Gochev commented May 29, 2019

After further testing, it turned out that there are situations in which the testing environment needs to be shut down, e.g. when installing updates or restarting the machine. So is there a way to fix the "Retry" button functionality? I am not sure if the SignalR connection can be restored in these cases.

Member

@danroth27 danroth27 commented May 29, 2019

@Stamo-Gochev The Retry button specifically tries to reconnect back to the state ("circuit") for the current UI session. Blazor doesn't provide an out of the box mechanism to persist and rehydrate this state, so after a server restart the state will be gone and no amount of retrying to connect to that state will help. Instead the user will need to refresh the browser.

We have an issue open to make this clearer in the default UI (#10496). You can also customize this UI yourself: https://docs.microsoft.com/en-us/aspnet/core/blazor/hosting-models#reconnection-to-the-same-server.

Lastly, you should be able to do work in your app and components to support persisting the required state so that the app can survive a server restart. For example, the app could persist the circuit state into a memory cache, and the components could leverage local storage in the browser. We don't have great docs or samples on how to do this yet, but it's something we expect to provide before we ship.
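
As a rough illustration of the browser side of that idea, here is a sketch of a small wrapper around local storage that a component could call through JS interop to save and restore its own state. The `createStateStore` helper, the `app-state:` prefix, and the `counter` key are all hypothetical and are not part of Blazor:

```javascript
// Hypothetical helper: keep small pieces of UI state in a Storage-like object
// (window.localStorage in the browser) so they survive a lost circuit and a page reload.
function createStateStore(storage, prefix = 'app-state:') {
  return {
    save(key, value) {
      storage.setItem(prefix + key, JSON.stringify(value));
    },
    load(key, fallback = null) {
      const raw = storage.getItem(prefix + key);
      if (raw === null) return fallback;
      try {
        return JSON.parse(raw);
      } catch (e) {
        return fallback; // corrupted entry: fall back instead of throwing
      }
    },
    clear(key) {
      storage.removeItem(prefix + key);
    },
  };
}

// Browser usage (assumption: invoked from a component via JS interop):
//   const store = createStateStore(window.localStorage);
//   store.save('counter', { currentCount: 5 });
//   const restored = store.load('counter', { currentCount: 0 });
```

The storage object is passed in rather than read from `window` so the same code can be exercised outside a browser. Anything stored this way is visible to the user, so it suits UI state and not secrets.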

Author

@Stamo-Gochev Stamo-Gochev commented May 30, 2019

Thanks for the suggestions. We have already tried customizing the "Retry" button to reload the page, as this was the only thing that fixed the problem during testing. Still, we will test the ideas for restoring the circuit once they are added to the docs.


@agonzalezm agonzalezm commented May 30, 2019

I have this problem if I modify and save an HTML file while the app is running: the project is recompiled, and the browser tries to refresh and says it is attempting to reconnect to the server, but it waits just 2 seconds and fails while the solution is still compiling. If I hit Retry it doesn't do anything; I have to press F5 in the browser so that it loads the app again.

Contributor

@javiercn javiercn commented May 30, 2019

@danroth27 I think @agonzalezm's issue might be related to VS recompiling the app on changes?

@agonzalezm Are you using server-side Blazor with an HTML file?

@danroth27 I'm not very familiar with how the recompilation works in VS, and @SteveSandersonMS is out. I think all these issues are related to the server restarting for some reason. Would you be OK if the outcome of this item is to change the reconnection UI to handle the case where the circuit is gone and display a different message, such as "your session was terminated, refresh the page" (with a button to trigger a refresh)?


@agonzalezm agonzalezm commented May 30, 2019

@javiercn Yes, the issue is that while the app is recompiling after changes, the browser doesn't wait until it has been recompiled before refreshing.

I mean modifying HTML code in a .razor file, not an .html file.

Contributor

@mkArtakMSFT mkArtakMSFT commented May 30, 2019

@agonzalezm The behavior you're experiencing is expected. Changes to .razor files result in recompilation, which restarts the server, and that drops existing circuits (including their state). So you must refresh your browser.

It seems there is no other work pending on this issue, as the only pending work is being tracked in the issues referenced above.

Blazor automation moved this from To do to Done May 30, 2019
@mkArtakMSFT mkArtakMSFT added question and removed investigate labels May 30, 2019

@edgolub edgolub commented Oct 3, 2019

Why is this closed?

I'm running this with the latest version of Blazor and the app is hosted via IIS. If I keep my tab open for a few hours, the app gives me a "not connected" error. This is with a release build for production.

This is really not something that can be shown regularly to users in production.

A solution I would prefer: can I detect on the frontend with JavaScript that the connection was closed, and force a page reload if the reconnect fails 3 times and, say, the user didn't try to reconnect himself?

Member

@danroth27 danroth27 commented Oct 3, 2019

Hi @edgolub,

This issue is closed because there is no known Blazor work tracked here. Blazor Server apps will go into a "not connected" state if the connection is lost for any reason, including the network connection being dropped or the server being restarted. If the server is restarted, there is no way for the app to reconnect to the same state, because the server state is gone. The only way forward is to refresh the browser. Blazor won't refresh the browser automatically, but you can set up your app to do so. For example, you could add the following script to your _Host.cshtml file:

<script>
    // Wait until a 'reload' button appears
    new MutationObserver((mutations, observer) => {
        if (document.querySelector('#components-reconnect-modal h5 a')) {
            // Now every 10 seconds, see if the server appears to be back, and if so, reload
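            async function attemptReload() {
                try {
                    // Check the server really is back; an error status (e.g. 503) still resolves, so test response.ok
                    const response = await fetch('');
                    if (response.ok) {
                        location.reload();
                    }
                } catch (e) {
                    // Network error: the server is still unreachable, so wait for the next attempt
                }
            }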
            observer.disconnect();
            attemptReload();
            setInterval(attemptReload, 10000);
        }
    }).observe(document.body, { childList: true, subtree: true });
</script>

This uses the JS DOM mutation observer API to detect when we’re offering a “reload” button, and automatically does a reload in that case. It has to be careful not to reload before the server actually comes back, otherwise the kiosk would get stuck displaying a “network error” type screen. There is still the potential that the refresh will fail, and if it does then there's no way to retry without user intervention.
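
For anyone who wants a cap on the number of attempts, as asked above, here is a sketch of the same polling idea with a bounded retry count. `reloadWhenServerIsBack` and its options are hypothetical names, not Blazor APIs; the probe, the reload action, and the delay are passed in so the logic can be tried outside a browser:

```javascript
// Hypothetical helper: poll until probe() reports a healthy server, then call onReady().
// Resolves to the attempt number that succeeded, or -1 if every attempt failed.
async function reloadWhenServerIsBack({ probe, onReady, sleep, maxAttempts = 30, delayMs = 10000 }) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    let healthy = false;
    try {
      healthy = await probe();
    } catch (e) {
      healthy = false; // network error: the server is still unreachable
    }
    if (healthy) {
      onReady();
      return attempt;
    }
    if (attempt < maxAttempts) {
      await sleep(delayMs);
    }
  }
  return -1; // gave up; leave the choice of what to show to the caller
}

// Browser wiring (assumption: same-origin page, as in the script above):
//   reloadWhenServerIsBack({
//     probe: async () => (await fetch('', { cache: 'no-store' })).ok,
//     onReady: () => location.reload(),
//     sleep: (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
//     maxAttempts: 3,
//   });
```

When the helper resolves to -1 the page is left as it is, so the user can still reload manually.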


@rolfik rolfik commented Nov 4, 2019

Daniel (@danroth27), thank you for the code to auto-refresh the browser.
I now use it in my kiosk Blazor application.
Still, it would be better to make it an optional part of Blazor itself, without relying on the '#components-reconnect-modal h5 a' selector, which could change.
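
One possible way to avoid depending on that selector is to key on CSS classes instead. This is only a sketch: it assumes you supply your own element with id `components-reconnect-modal`, and that Blazor toggles the `components-reconnect-show`, `components-reconnect-hide`, `components-reconnect-failed` and `components-reconnect-rejected` classes on it as described in the documentation for the released 3.x versions, which may not match preview builds. `reconnectAction` and `watchReconnectModal` are hypothetical names:

```javascript
// Decide what to do from the modal's class attribute.
// Assumption: "rejected" means the server was reached but the circuit is gone,
// and "failed" means the reconnection attempts ran out (e.g. network failure).
function reconnectAction(className) {
  const classes = className.split(/\s+/).filter(Boolean);
  if (classes.includes('components-reconnect-rejected')) return 'reload';
  if (classes.includes('components-reconnect-failed')) return 'retry';
  return 'none';
}

// Browser wiring (hypothetical): watch only the class attribute of the custom modal.
function watchReconnectModal(modal, handlers) {
  const observer = new MutationObserver(() => {
    const action = reconnectAction(modal.className);
    if (action !== 'none') handlers[action]();
  });
  observer.observe(modal, { attributes: true, attributeFilter: ['class'] });
  return observer;
}

// Usage, assuming the custom modal element is present in _Host.cshtml:
//   const modal = document.getElementById('components-reconnect-modal');
//   if (modal) {
//     watchReconnectModal(modal, {
//       reload: () => location.reload(),
//       retry: () => { /* e.g. probe the server first, then retry or reload */ },
//     });
//   }
```

Keeping the decision in a plain function makes it easy to adjust if the class names differ in your Blazor version.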

Member

@danroth27 danroth27 commented Nov 4, 2019

@rolfik Thanks for the feedback! We are considering this as part of #9256


@YehudaKremer YehudaKremer commented Nov 22, 2019

If you are sure you want to reload on every connection loss,
you can reload immediately, based on @danroth27's answer:

<style>
   #components-reconnect-modal {
       display: none !important;
   }
</style>
<script>
   new MutationObserver(() => document.querySelector('#components-reconnect-modal') && location.reload())
      .observe(document.body, { childList: true });
</script>

Put it right before the closing tag:

</body>
@msftbot msftbot bot locked as resolved and limited conversation to collaborators Dec 22, 2019
@jaredpar jaredpar removed this from Done in Blazor Jan 7, 2020