Connection drops while adding a user to groups #3408

Open
gzroman opened this Issue Feb 10, 2015 · 4 comments

@gzroman
gzroman commented Feb 10, 2015

1. The client connects to the server through long polling. Success.
2. The server sends some data to the client. Success.
3. The client asks the server to add it to groups. Success.
4. The server executes
foreach (long id in chats)
{
_hubContext.Groups.Add(myConnectionId, id.ToString());
}
This runs 40 times (the number of groups in which the user is a participant).
5. A reconnect happens after the client sends its next request.
6. The client can't reconnect until the timer expires.

Info:
DeadlockErrorTimeout = 60 sec
TransportConnectTimeout = 60 sec

Log:

13:10:35.6690954 - null - ChangeState(Disconnected, Connecting)
13:10:38.0817652 - 058a8df7-5aa8-4e20-98a9-1e609151121a - LP Connect: http://some_server.azurewebsites.net/signalr/connect?clientProtocol=1.4&transport=longPolling&connectionData=[{"Name":"hub_name"}]&connectionToken=some_token&noCache=8c86ff15-899c-4dd5-8f70-7e303ae18b0e
13:10:38.4934077 - 058a8df7-5aa8-4e20-98a9-1e609151121a - LP: OnMessage({"C":"d-BF6B3427-c,0|d,0|e,0|f,1","S":1,"M":[]})
13:10:38.7172713 - 058a8df7-5aa8-4e20-98a9-1e609151121a - LP Poll: http://some_server.azurewebsites.net/signalr/poll?clientProtocol=1.4&transport=longPolling&connectionData=[{"Name":"hub_name"}]&connectionToken=some_token&messageId=d-BF6B3427-c%2C0%7Cd%2C0%7Ce%2C0%7Cf%2C1&noCache=0f341b43-359e-4ac1-b3a8-33539ef3c60b
13:10:39.2021384 - 058a8df7-5aa8-4e20-98a9-1e609151121a - ChangeState(Connecting, Connected)
13:10:39.2645825 - 058a8df7-5aa8-4e20-98a9-1e609151121a - LP: OnMessage({"C":"d-BF6B3427-c,0|d,2|e,0|f,1","M":[{"$id":"86","$type":"Microsoft.AspNet.SignalR.Hubs.ClientHubInvocation, Microsoft.AspNet.SignalR.Core","H":"hub_name","M":"getConnectionId","A":["058a8df7-5aa8-4e20-98a9-1e609151121a"]},{"$id":"87","$type":"Microsoft.AspNet.SignalR.Hubs.ClientHubInvocation, Microsoft.AspNet.SignalR.Core","H":"hub_name","M":"getMachineName","A":["RD00155D88D459"]}]})
13:10:39.2827184 - 058a8df7-5aa8-4e20-98a9-1e609151121a - LP Poll: http://some_server.azurewebsites.net/signalr/poll?clientProtocol=1.4&transport=longPolling&connectionData=[{"Name":"hub_name"}]&connectionToken=some_token&messageId=d-BF6B3427-c%2C0%7Cd%2C2%7Ce%2C0%7Cf%2C1&noCache=46374b37-a773-4d5b-998f-990ae103fb39
13:10:59.4758803 - 058a8df7-5aa8-4e20-98a9-1e609151121a - LP: OnMessage( {"C":"some_string","G":"some_string","M":[]})
13:10:59.4799089 - 058a8df7-5aa8-4e20-98a9-1e609151121a - LP Poll: http://some_server.azurewebsites.net/signalr/poll?clientProtocol=1.4&transport=longPolling&connectionData=[{"Name":"hub_name"}]&connectionToken=some_token&messageId=message_id&noCache=e3d25078-9b97-4356-a72e-eca37b7db16e
13:11:00.3715846 - 058a8df7-5aa8-4e20-98a9-1e609151121a - ChangeState(Connected, Reconnecting)
13:11:00.4320144 - 058a8df7-5aa8-4e20-98a9-1e609151121a - OnError(System.Exception: Error HRESULT E_FAIL has been returned from a call to a COM component.
at System.Net.Browser.ClientHttpWebRequest.InternalEndGetResponse(IAsyncResult asyncResult)
at System.Net.Browser.ClientHttpWebRequest.<>c__DisplayClasse.b__d(Object sendState)
at System.Net.Browser.AsyncHelper.<>c__DisplayClass1.b__0(Object sendState))

[HUB] Error: Error HRESULT E_FAIL has been returned from a call to a COM component.
[HUB] LOG:
13:11:02.5289168 - 058a8df7-5aa8-4e20-98a9-1e609151121a - LP Reconnect: http://some_server.azurewebsites.net/signalr/reconnect?clientProtocol=1.4&transport=longPolling&connectionData=[{"Name":"hub_name"}]&connectionToken=some_token&messageId=message_id&noCache=728487a2-3c62-4feb-8f60-e72a9efd32a7
13:11:03.3014345 - 058a8df7-5aa8-4e20-98a9-1e609151121a - OnError(System.Exception: Error HRESULT E_FAIL has been returned from a call to a COM component.
at System.Net.Browser.ClientHttpWebRequest.InternalEndGetResponse(IAsyncResult asyncResult)
at System.Net.Browser.ClientHttpWebRequest.<>c__DisplayClasse.b__d(Object sendState)
at System.Net.Browser.AsyncHelper.<>c__DisplayClass1.b__0(Object sendState))

@gzroman
gzroman commented Feb 10, 2015

The issue appears on WP only. The iOS and Android SignalR clients work well with the same backend.

@moozzyk
Contributor
moozzyk commented Feb 10, 2015

Looks like #3400. Since it is a COM error thrown from the system components it is hard to tell what is happening. I am afraid that without a repro it will be hard to figure out what's going on and to fix the issue.

@gzroman
gzroman commented Feb 11, 2015

The source code for the client and server is available at https://onedrive.live.com/redir?resid=BFEADAEA32CC29A6!40457&authkey=!AMYJXlMLYbiAwhY&ithint=folder%2c

For your convenience, the repro server is already running in the cloud.

Test script:

  1. Launch the client app on the WP simulator or a phone and wait until the connection is established. On success you will see in the log:

[HUB] StateChanged (Connecting -> Connected)
[HUB] Connection. Transport: longPolling
[HUB] Received: {
"H": "TestChatHub",
"M": "getConnectionId",
"A": [
"048e88da-7e7a-4176-9896-e5b176674764"
]
}
[HUB] Received: {
"H": "TestChatHub",
"M": "getMachineName",
"A": [
"RD00155D88D459"
]
}


  2. Set the number of groups to more than 47 (the default is 100).

  3. Press the "Start Test" button and you will see in the log:


"TaskHost.exe" (CoreCLR: Silverlight AppDomain). Loaded "D:\WPSystem\Apps{6475D277-9B49-4DE4-8380-1FA67F49F369}\Install\Microsoft.Threading.Tasks.Extensions.DLL". Symbols loaded.
[HttpClient] request: http://testserversignalr.azurewebsites.net/api/test/Joins?connectionid=048e88da-7e7a-4176-9896-e5b176674764&countGroups=100
"TaskHost.exe" (CoreCLR: Silverlight AppDomain). Loaded "C:\windows\system32\en-US\mscorlib.debug.resources.dll". Module was built without symbols.
[HttpClient] response status: OK
[HttpClient] response data:
[HUB] StateChanged (Connected -> Reconnecting)
[HUB] Reconnecting...
[HUB] Error: Error HRESULT E_FAIL has been returned from a call to a COM component.
[HUB] LOG:
09:46:40.8227158 - null - ChangeState(Disconnected, Connecting)
...


@moozzyk
Contributor
moozzyk commented Feb 11, 2015

Thanks for the repro. The problem here is that on each reconnect, all the groups the client is subscribed to are sent back to the server in the form of a groups token. When the client is subscribed to many groups, the groups token can get big. The requests use the GET method, so the token travels in the URL, and they must be hitting some kind of limit that causes the exception. You can reproduce this without SignalR with the following code:

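// Requires: using System; using System.Diagnostics; using System.IO; using System.Net;
// Run inside an async method.
var uriString = "http://testserversignalr.azurewebsites.net/?" + new string('A', 2040);
var request = (HttpWebRequest)WebRequest.Create(uriString);
// Dispose the response as well as the reader once the body has been read.
using (var response = await request.GetResponseAsync())
using (var sr = new StreamReader(response.GetResponseStream()))
{
    var contents = sr.ReadToEnd();
    // Print at most the first 100 characters; guards against shorter responses.
    Debug.WriteLine(contents.Substring(0, Math.Min(100, contents.Length)));
}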

(2040 seems to be the limit - if you use 2039 in the snippet above you won't get the exception).
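A threshold like this can be narrowed down by bisection instead of trying lengths one by one. The following is only a sketch: `LimitProbe`, `FindSmallestFailing`, and the simulated probe are hypothetical names introduced here, and the search assumes that every length above the threshold fails and every length below it succeeds. On a device, the probe would issue the GET request with `new string('A', n)` and report whether it threw.

```csharp
using System;

static class LimitProbe
{
    // Returns the smallest length in [low, high] for which fails(length) is true.
    // Assumes fails is monotonic (false below the threshold, true from it onward)
    // and that fails(high) is true.
    public static int FindSmallestFailing(int low, int high, Func<int, bool> fails)
    {
        while (low < high)
        {
            int mid = low + (high - low) / 2;
            if (fails(mid))
            {
                high = mid;      // threshold is at mid or below
            }
            else
            {
                low = mid + 1;   // threshold is above mid
            }
        }
        return low;
    }

    static void Main()
    {
        // Simulated probe standing in for the real HTTP request (assumption:
        // the request fails from 2040 characters onward, as reported above).
        Func<int, bool> simulated = n => n >= 2040;
        Console.WriteLine(FindSmallestFailing(1, 4096, simulated));
    }
}
```

With a real probe, each call costs one HTTP request, so bisecting the range 1..4096 needs about a dozen requests.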
In general, it is uncommon for a single client to be subscribed to that many groups. The ultimate solution on the SignalR side would probably be to use POST instead of GET for reconnects (and other requests that send the groupsToken), but this would require quite big changes on both the server and the client. There are also some ideas about optimizing how we create the groupsToken in bug #1341. I don't think there is a good workaround for this bug with the current codebase.
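To get a feel for how quickly a token carried in the query string grows with the number of groups, here is a rough sketch. It does not use SignalR's actual token format: `GroupsTokenEstimate` and `EstimateTokenLength` are hypothetical, and the token is modelled as a Base64-encoded JSON array of `hub.group` names. The real groupsToken is additionally protected by the server, so actual sizes will differ; only the growth trend is meaningful.

```csharp
using System;
using System.Linq;
using System.Text;

static class GroupsTokenEstimate
{
    // Padding length at which the GET request started failing in the repro above.
    // Treat it as approximate: the full URL is longer than the padding alone.
    const int ObservedLimit = 2040;

    // HYPOTHETICAL stand-in for the groups token: a JSON array of "hub.group"
    // names, Base64-encoded and URL-escaped. Not the real SignalR format.
    public static int EstimateTokenLength(string hubName, int groupCount)
    {
        var names = Enumerable.Range(1, groupCount)
                              .Select(i => "\"" + hubName + "." + i + "\"");
        var json = "[" + string.Join(",", names) + "]";
        var base64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(json));
        return Uri.EscapeDataString(base64).Length;
    }

    static void Main()
    {
        foreach (var count in new[] { 10, 40, 47, 100 })
        {
            int length = EstimateTokenLength("TestChatHub", count);
            Console.WriteLine("{0} groups -> ~{1} characters ({2})", count, length,
                length >= ObservedLimit ? "at or over the observed limit" : "under the observed limit");
        }
    }
}
```

The estimate grows roughly linearly with the number of groups and with the length of their names, so longer group names reach any fixed URL limit with fewer groups.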
