SignalR core: confusion about the usage of sticky sessions when scaling out the app #11678
Sorta. By default it's required because SignalR will make at least 2 requests regardless of the transport: first the negotiate request to determine which transports the server supports, second the attempt to connect using that transport. It may make more requests to other transports as it tries to fall back. This all requires sticky sessions, because SignalR Core requires the transport request to go back to the same server the negotiate request was made on. Without ARR affinity it'll fail with the 404 you're seeing above. SignalR stores local state about a connection on the machine the connection was made to.

Furthermore, for non-WebSocket transports, sending from client to server requires sticky sessions (I know you don't care about this scenario). It also matters for Long Polling because even receiving from server to client requires multiple requests, and those need to go back to the same server where that state is stored.

Now for the more nuanced answer: SignalR Core does support a direct-to-WebSocket connection that avoids the negotiate request (via an option on the client side).

PS: The Azure SignalR Service handles this for you, so you can avoid making your web tier more stateful.
Hi, thanks for replying. So, summarizing, there are three options available:

1. switch ARR affinity on (sticky sessions);
2. use WebSockets only and skip the negotiate request;
3. use the Azure SignalR Service.
Unfortunately we can't adopt the third solution, because several of our customers have their own infrastructure or want to run their installation in clouds other than Azure. The only viable solutions for our application are the first and the second. The main concern with the second solution is that, if for any reason WebSockets are not available in some customer networks, our app won't be able to work properly. We have to consider quite a wide range of different scenarios.

I have a question related to the first solution (switching ARR affinity on). Some customers of ours have multi-data-center installations in Azure: we deploy our application in two different data centers and use a traffic manager to route client requests to the proper data center. Based on my understanding, the ARR affinity available in Azure App Service works at the load balancer level using a cookie, while the traffic manager sits in front of the load balancers of the two data centers. How can I get sticky sessions in such a scenario? Is Azure ARR affinity able to handle it?
Hi, I tried out the solution of skipping negotiation from the JS client and I confirm that it works fine, even when the web application is scaled out to multiple instances and ARR affinity is switched off. That's great for us. Just as a reference for readers, the connection options required are the following:
With regard to my question on the traffic manager scenario, I found this documentation, which seems to confirm that sticky sessions are not available when running behind the Azure traffic manager.
Using sticky sessions instead of skipping negotiation could be helpful for us in scenarios where, for any reason, the WebSocket protocol is not available on the users' side. Do you confirm that there is no way to have sticky sessions in our multi-data-center installations? Thanks for helping. Enrico
@EnricoMassoneDeltatre as far as I know, traffic between multi-data-center installations is routed only at the traffic manager level. And since that routing, as you noted earlier, is purely DNS based, there won't be any knowledge of any session state. In cases where your product can't make use of WebSockets, I think you'll have to enable ARR affinity. Hope that helps, even though I know it's more than likely not the answer you're looking for.
Hi @RyanHill-MSFT, thanks for replying. We will go with the WebSockets-only option (avoiding protocol negotiation), because we need to support the multi-data-center scenario. The best compromise seems to be requiring WebSocket support as a prerequisite for installing our product.

Unfortunately, this change in the behavior of SignalR from the old version to the .NET Core version (the old one was stateless, while the new one is stateful) can be quite painful for cloud applications. To be honest, this limitation affects other similar products such as socket.io, as documented here, so this design seems to be reasonable. Today I noticed that the sticky sessions requirement for scaling out is actually documented here; I didn't notice it before opening this issue. Thanks for the help!
Hi,
we are migrating an old ASP.NET MVC application written in .NET Framework 4.7.1 to ASP.NET Core 2.2.
The process has been quite smooth until now, but we are facing some issues porting from the "old" SignalR library for .NET Framework 4.7.1 to the new .NET Core SignalR library. This is a summary of our current environment:
- ASP.NET Core 2.2 MVC web application
- Microsoft.AspNetCore.SignalR NuGet package, version 1.1.0
- aspnet-signalr JavaScript library, version 1.1.4
- Google Chrome, version 75.0.3770.100
- AngularJS, version 1.6.5
When the app service plan is scaled out to only 1 instance, everything works like a charm. The JavaScript client connects to the web application using WebSockets as the transport protocol, the web server notifications are received, and the client-side application doesn't break when the browser page is reloaded. The application doesn't break even if WebSockets are turned off: in that case SignalR falls back to Server-Sent Events and everything works as expected. In this configuration (only 1 instance of the app service plan), turning ARR affinity on or off doesn't have any impact on the application's behavior: in any case everything works fine.
The troubles begin when we scale out the app service plan to 2 instances.
We are aware of the issues pointed out in this guide, related to the fact that each instance of the web application only knows the clients connected to it, while being completely unaware of the clients connected to the other instances. In our case this is not an issue: each instance of our web application receives messages from a service bus, and we have the guarantee that each message is received by all the existing instances, so the fact that each instance is only able to notify its own clients is not a problem for us (in any case, all the existing clients will be notified by the node to which they are connected via SignalR). For this reason we use neither the Azure SignalR Service nor the Redis backplane.

The issues we are experiencing when we scale out are probably related to the fact that we have ARR affinity disabled, because our application is completely stateless and the old version of SignalR didn't require ARR affinity (we have always had ARR affinity disabled and we never experienced any issue with SignalR when we scaled out the old app built with .NET Framework 4.7.1).
The behavior we get when we scale out is the following: sometimes the clients work fine (they connect to the backend using WebSockets and receive notifications as expected), sometimes a page refresh in the browser breaks the behavior (after the page reloads, SignalR stops working), and some other times SignalR doesn't work from the first page load (no page reload is needed to break the application). All of this is completely random: sometimes it happens and sometimes it doesn't. There is no clear error pattern.
Interestingly, we get two different types of error:
Here are the errors that we get in the Google chrome console in scenario A:
Here are the errors that we get in the Google chrome console in scenario B:
Some guides online (for instance this blog post) seem to state that ARR affinity is not required when WebSockets are enabled, regardless of the number of instances of the web application. Put another way, it seems that ARR affinity is required only when communication protocols other than WebSockets are used.
So, here is my question: is ARR affinity always required when the app is scaled out to multiple instances, regardless of the type of communication protocol used, even when both the server and the client are able to use WebSockets?
Thanks for helping
Enrico