SignalR WebSocket Connectivity Issue in FireFox #48305

Open
1 task done
MGulraiz opened this issue May 18, 2023 · 9 comments
Labels
area-signalr Includes: SignalR clients and servers

Comments

@MGulraiz

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

I am getting "Error: Failed to start the transport 'WebSockets': Error: WebSocket failed to connect" while using SignalR in an Angular application with .NET 6.0. The issue occurs only in Mozilla Firefox, and it occurs at random. Refreshing the page does not help, but closing and reopening the browser resolves the issue.

After approximately 1-2 minutes of being stuck in the "connecting" state, SignalR throws an error and automatically falls back to the Server-Sent Events (SSE) transport, after which the connection is established successfully.

The WebSocket connection works fine in Chrome.

Expected Behavior

A WebSocket connection should be established in Firefox, just as it is in Chrome.

Steps To Reproduce

No response

Exceptions (if any)

No response

.NET Version

.NET version: 6.0.3
SignalR version: 6.0.3

Anything else?

[image: connecting]

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-signalr Includes: SignalR clients and servers label May 18, 2023
@BrennanConroy
Member

Get a network trace
https://learn.microsoft.com/aspnet/core/signalr/diagnostics?view=aspnetcore-7.0#network-traces

@MGulraiz
Author

I will try to obtain a network trace, but the issue is random, which makes it quite difficult for me to reproduce. However, I do have the network activity from my browser from the last time it happened.
[image: blockrequest]

@MGulraiz
Author

2023-5-19_v1.2.zip
@BrennanConroy Please find the attached network trace. It appears that SignalR is taking more than one minute to send a second request after the negotiate request, which might be causing the issue.

@ttutko

ttutko commented Jul 27, 2023

I have also just started seeing something very similar to this, if not the exact same issue. It's on an airgapped network, so I can't provide a trace from it. I will also point out that this is Firefox 114.0.2 on a Linux host (Pop!_OS 22.04).

@fabianeichfeldt

I have been observing similar problems for a few weeks.
From what I can see:
The initial WebSocket connection request seems to fail; afterwards it falls back to Server-Sent Events, which takes forever. In the end, the WebSocket connection upgrade works after 40 seconds.
[image]

Interesting observation: Problems occur only on local deployment. For remote connections it works as intended:
[image]

Used versions
Firefox: 117.0.1
OS: Ubuntu 22.04
JS SignalR Client: 6.0.11
ASP.NET Core: 6.0.22

The backend does not report anything suspicious, but I can test this with more aggressive logging settings as well, if you can provide them.

@qwertie

qwertie commented Sep 22, 2023

This happens to me rather often, as Firefox on Windows 10 is my primary browser. Once the glitch appears on http://localhost:3000, it usually persists across different tabs until the browser is closed & restarted. Something peculiar about the "WebSocket failed to connect" message is that it doesn't appear on initial page load, it appears during reload, and when it appears it is always the very first message; notably, it appears before the constructor for new signalR.HubConnectionBuilder() is invoked.

But is it just a SignalR thing, or are all WebSockets broken? Well, SignalR connections on different domains still work, so the problem is limited to a particular domain, and maybe to localhost only. I found that if I ran this code in Chrome:

let socket = new WebSocket("ws://localhost:5001/api/coordinator");
socket.onopen = function(e) { console.warn("Connection established", e); };
socket.onerror = function(e) { console.error(e); };

It immediately prints "Connection established". Running the same code in Firefox (in its "glitched" state), nothing happens (neither success nor error). So SignalR seems blameless.
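
Because the glitched state produces neither `onopen` nor `onerror`, a probe like the one above stays silent indefinitely. Below is a minimal sketch of a probe that also reports the hang. The helper name `probeWebSocket` and the 5-second default are assumptions made for illustration; neither comes from SignalR or from this thread.

```javascript
// Hypothetical helper: resolves to "open", "error", or "timeout".
// WebSocketCtor is injectable so the helper can be exercised outside a browser.
function probeWebSocket(url, timeoutMs = 5000, WebSocketCtor = globalThis.WebSocket) {
  return new Promise((resolve) => {
    const socket = new WebSocketCtor(url);
    let settled = false;

    const finish = (outcome) => {
      if (settled) return; // report only the first outcome
      settled = true;
      clearTimeout(timer);
      try { socket.close(); } catch (_) { /* ignore close failures */ }
      resolve(outcome);
    };

    // If neither event fires in time, report the hang explicitly.
    const timer = setTimeout(() => finish("timeout"), timeoutMs);
    socket.onopen = () => finish("open");
    socket.onerror = () => finish("error");
  });
}
```

In the browser console this could be run as `probeWebSocket("ws://localhost:5001/api/coordinator").then(console.warn)`; a result of `"timeout"` would correspond to the "nothing happens" state described above.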

However, the SSE fallback doesn't happen after 1-2 minutes; it takes about 13 minutes and 20 seconds (800 seconds) which seems excessive! This is with @microsoft/signalr v6.0.7. When it finally does establish an SSE connection, something goes wrong and the connection fails afterward. The console output looks like this:

GET ws://localhost:5001/api/coordinator?id=ctDwol9eSfT2BzjvWpU9Pg [HTTP/1.1 404 Not Found 3ms]
Firefox can’t establish a connection to the server at ws://localhost:5001/api/coordinator?id=ctDwol9eSfT2BzjvWpU9Pg. [WebSocketTransport.ts:74](c:/Dev/Barreleye/UI/node_modules/@microsoft/signalr/src/WebSocketTransport.ts)
[2023-09-22T14:20:44.381Z] Information: (WebSockets transport) There was an error with the transport. [Utils.ts:199](c:/Dev/Barreleye/UI/node_modules/@microsoft/signalr/src/Utils.ts)
[2023-09-22T14:20:44.382Z] Error: Failed to start the transport 'WebSockets': Error: WebSocket failed to connect. The connection could not be found on the server, either the endpoint may not be a SignalR endpoint, the connection ID is not present on the server, or there is a proxy blocking WebSockets. If you have multiple servers check that sticky sessions are enabled. [localhost:3000:2367:25](http://localhost:3000/)
[2023-09-22T14:20:44.399Z] Information: SSE connected to http://localhost:5001/api/coordinator?id=dnZ0lqjAl6EVx8EJjfCn-Q [Utils.ts:199](c:/Dev/Barreleye/UI/node_modules/@microsoft/signalr/src/Utils.ts)
XHR POST http://localhost:5001/api/coordinator?id=dnZ0lqjAl6EVx8EJjfCn-Q [HTTP/1.1 404 Not Found 14997ms]
[2023-09-22T14:20:59.400Z] Error: Connection disconnected with error 'Error: Server returned handshake error: Handshake was canceled.'. [localhost:3000:2367:25](http://localhost:3000/)
Uncaught (in promise) Error: Server returned handshake error: Handshake was canceled.
    _processHandshakeResponse HubConnection.ts:614
    _processIncomingData HubConnection.ts:536
    node_modules vendors-node_modules_microsoft_signalr_dist_esm_HubConnectionBuilder_js-node_modules_mui_icon-f68620.chunk.js:1447
    node_modules vendors-node_modules_microsoft_signalr_dist_esm_HubConnectionBuilder_js-node_modules_mui_icon-f68620.chunk.js:3227
    node_modules vendors-node_modules_microsoft_signalr_dist_esm_HubConnectionBuilder_js-node_modules_mui_icon-f68620.chunk.js:3222
    connect ServerSentEventsTransport.ts:51
    _startTransport HttpConnection.ts:438
    _createTransport HttpConnection.ts:393
[HubConnection.ts:614](c:/Dev/Barreleye/UI/node_modules/@microsoft/signalr/src/HubConnection.ts)
Failed to connect to /api/coordinator; will retry. Error: No Connection with that ID: Status code '404'

Note 1: After the "SSE connected" message, it pauses for a few seconds before the next message appears.
Note 2: the last message comes from a catch block I put around the call to signalR.HubConnection.start().
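
The pause in Note 1 can be measured directly from the log timestamps. This is a small sketch, assuming a hypothetical helper `gapMs` that extracts the leading bracketed ISO timestamp from two console lines and subtracts them; the sample lines are abbreviated copies of the ones in the output above.

```javascript
// Hypothetical helper: milliseconds between the leading [timestamp] of two log lines.
function gapMs(earlierLine, laterLine) {
  const stamp = (line) => {
    const match = /^\[([^\]]+)\]/.exec(line);
    if (!match) throw new Error("no leading [timestamp] in: " + line);
    return Date.parse(match[1]);
  };
  return stamp(laterLine) - stamp(earlierLine);
}

const sseConnected = "[2023-09-22T14:20:44.399Z] Information: SSE connected to ...";
const disconnected = "[2023-09-22T14:20:59.400Z] Error: Connection disconnected with error ...";
console.log(gapMs(sseConnected, disconnected)); // 15001
```

That is almost exactly 15 seconds, which lines up with the 14997 ms the XHR POST took before returning 404. It also matches the documented 15-second default of the server-side `HandshakeTimeout` hub option, although nothing in this thread confirms that this is the cause.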

It looks like the server (Microsoft.AspNetCore.SignalR.Core.dll in dotnet\shared\Microsoft.AspNetCore.App\7.0.7 folder) is rejecting the SSE connection attempt, but I don't know why. I haven't added any configuration to disable SSE (I just call IServiceCollection.AddSignalR(); and my Hub has an uninteresting constructor.)

@BrennanConroy
Member

The more logs the better, here are docs on getting server and client logs: https://learn.microsoft.com/aspnet/core/signalr/diagnostics?view=aspnetcore-7.0 (trace is the most verbose)
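
For the client side, this amounts to passing a log level to the connection builder. Below is a minimal sketch, assuming the `@microsoft/signalr` module is passed in as `signalR`; the wrapper name `buildTracedConnection` is hypothetical.

```javascript
// Hypothetical wrapper; `signalR` is the imported @microsoft/signalr module.
function buildTracedConnection(signalR, url) {
  return new signalR.HubConnectionBuilder()
    .withUrl(url)
    // Trace is the most verbose client log level.
    .configureLogging(signalR.LogLevel.Trace)
    .build();
}
```

With `import * as signalR from "@microsoft/signalr"`, calling `buildTracedConnection(signalR, "http://localhost:5001/api/coordinator")` should return a connection that writes trace-level output to the browser console once `start()` is called.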

@fabianeichfeldt

Please find my detailed logs below. We are using authorization, and I kept the auth logs in place to show that there is no general timing issue in the network communication.
backend.log
frontend.log

Hope this helps.

@fabianeichfeldt

Has anyone had time to check the logs in the meantime? Is anything else required to tackle the issue?
