Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OWIN WebSockets SendAsync on IIS sometimes returns a task that never completes #155

Closed
darkl opened this issue Jan 10, 2018 · 23 comments
Closed

Comments

@darkl
Copy link

@darkl darkl commented Jan 10, 2018

This issue was found while investigating this issue.

When running a long running WebSocket server hosted on IIS, sometimes SendAsync returns a task that never completes nor terminates. I would expect the task to fault at some point, but it remains in state WaitingForActivation.

A repro can be found here. First run OwinHangSample and then WebSocketClient. You will see that after some time (a few minutes), the WebSocketClient stops receiving messages. Debugging the sample further (for instance storing the SendAsync task in a static variable), shows that the last SendAsync Task is stuck in WaitingForActivation state.

Elad

@Tratcher
Copy link
Member

@Tratcher Tratcher commented Jan 10, 2018

Katana provides a minimal wrapper for System.Web.WebSockets in this scenario. Can you reproduce this using System.Web directly?
See

return WebSocket.SendAsync(buffer, OpCodeToEnum(messageType), endOfMessage, cancel);

@darkl
Copy link
Author

@darkl darkl commented Jan 10, 2018

Probably it is possible to reproduce this using System.Web directly. What further actions do you suggest (i.e. how should I modify the repro, and where should I move the issue to)?

Thanks
Elad

@Tratcher
Copy link
Member

@Tratcher Tratcher commented Jan 10, 2018

If you can eliminate your Microsoft.Owin dependencies and repro with only System.Web then I can help you track down the right person to investigate.

@darkl
Copy link
Author

@darkl darkl commented Jan 10, 2018

Created a branch without OWIN dependencies.

Thanks!
Elad

@Tratcher
Copy link
Member

@Tratcher Tratcher commented Jan 10, 2018

@Eilon can you recommend someone for System.Web WebSockets debugging?

@Eilon
Copy link
Member

@Eilon Eilon commented Jan 11, 2018

@Tratcher usually I'd recommend you 😄 I can give you a name in person.

@flin-8
Copy link

@flin-8 flin-8 commented Jan 17, 2018

Hi guys,
Do we know where the issue lies yet?

@Tratcher
Copy link
Member

@Tratcher Tratcher commented Jan 17, 2018

Not yet. I've passed this off to the Asp.Net 4 servicing team to investigate.

@flin-8
Copy link

@flin-8 flin-8 commented Jan 18, 2018

Great thanks!

@Jinhuafei
Copy link

@Jinhuafei Jinhuafei commented Jan 18, 2018

@flin-zap It looks like the client side causes the issue. If you enable socket tracing, you will see the following log. At the end of the log, the receive operation never completed. I also tried two other websocket clients(System.Net.WebSockets.ClientWebSocket and JavaScript WebSocket), and all worked well.

System.Net.Sockets Verbose: 0 : [7728] Entering Socket#45004109::ReceiveAsync()
System.Net.Sockets Verbose: 0 : [7728] Exiting Socket#45004109::ReceiveAsync() -> Boolean#1
System.Net.Sockets Verbose: 0 : [13504] Entering Socket#45004109::SendAsync()
System.Net.Sockets Verbose: 0 : [13504] Entering Socket#45004109::SendAsync(Boolean#1)
System.Net.Sockets Verbose: 0 : [7728] Data from Socket#45004109::FinishOperation(SendAsync)
System.Net.Sockets Verbose: 0 : [7728] 00000000 : 89 94 4D C8 F0 48 7C E7-C1 70 62 FA C0 79 75 E8 : ..M..H|..pb..yu.
System.Net.Sockets Verbose: 0 : [7728] 00000010 : C2 72 7E FA CA 7D 7B E8-A0 05 : .r~..}{...
System.Net.Sockets Verbose: 0 : [7728] Data from Socket#45004109::FinishOperation(ReceiveAsync)
System.Net.Sockets Verbose: 0 : [7728] 00000000 : 8A 14 : ..
System.Net.Sockets Verbose: 0 : [7728] Entering Socket#45004109::ReceiveAsync()
System.Net.Sockets Verbose: 0 : [7728] Exiting Socket#45004109::ReceiveAsync() -> Boolean#1
System.Net.Sockets Verbose: 0 : [7728] Data from Socket#45004109::FinishOperation(ReceiveAsync)
System.Net.Sockets Verbose: 0 : [7728] 00000000 : 31 2F 31 38 2F 32 30 31-38 20 32 3A 33 32 3A 35 : 1/18/2018 2:32:5
System.Net.Sockets Verbose: 0 : [7728] 00000010 : 36 20 50 4D : 6 PM
System.Net.Sockets Verbose: 0 : [7728] Entering Socket#45004109::ReceiveAsync()
System.Net.Sockets Verbose: 0 : [7728] Exiting Socket#45004109::ReceiveAsync() -> Boolean#1

@darkl
Copy link
Author

@darkl darkl commented Jan 19, 2018

Dear @Jinhuafei,

Thanks for your help. I understand that this is the client's fault, but what would you suggest to do on the server side? I would expect the Task to fault after some timeout. Should the server developer handle this issue themselves?

Elad

@darkl
Copy link
Author

@darkl darkl commented Jan 19, 2018

By the way, the original code works fine when using OWIN self-host. Any ideas why?

Elad

@darkl
Copy link
Author

@darkl darkl commented Jan 20, 2018

I tried to debug the WebSocket4Net codebase. The code is waiting for a reply after calling ReceiveAsync. As I mentioned in my previous comment, this does work fine on OWIN self-host, and on many other WebSockets implementations (for instance vtortola).
I couldn't find anything wrong while debugging the WebSocket4Net code. I feel like this issue is caused by both IIS and WebSocket4Net, but it's hard for me to point exactly how.

Elad

@bigbearzhu
Copy link

@bigbearzhu bigbearzhu commented Jan 22, 2018

Narrowed down this a bit more. Fiddler shows that the server doesn't reply with the message sent through. Interestingly it shows that a Ping message was sent to the server from WebSocket4Net almost at the same time of the last message sent. I tried sending a Ping message manually immediately after sending a message. It hangs very quickly (< 5 messages). It seems IIS just don't like Ping and normal message coming together.

A fork to show the issue can be found here

When I set EnableAutoSendPing on the websocket to false, it seems working fine.

@Tratcher
Copy link
Member

@Tratcher Tratcher commented Jan 22, 2018

@pan-wang @anurse does this look like any of the IIS websocket issues you've seen?

@anurse
Copy link

@anurse anurse commented Jan 22, 2018

Might be aspnet/WebSockets#100 - I believe that was fixed in ANCM, not sure if the fix was in ANCM or the core websocket handling though. It's possible a similar fix is needed elsewhere.

@Tratcher
Copy link
Member

@Tratcher Tratcher commented Jan 23, 2018

This resembles a known IIS bug that was fixed in recent Windows releases (1703 or 1709). What OS version are you testing on? See winver.exe.

@darkl
Copy link
Author

@darkl darkl commented Jan 23, 2018

image

@bigbearzhu
Copy link

@bigbearzhu bigbearzhu commented Jan 23, 2018

I have tested using the code on latest Windows 10 Version 1709 (OS Build 16299.125) with IIS Express (iisexpress 10.0.14358.1000). It still failed almost immediately. Fortunately, on the same machine, IIS (w3wp 10.0.16299.15) doesn't fail like before, but it can still get stuck for a few seconds and then continued every minute or so. I believe it is much better now!

At the same time, we tested this on another server machine with Windows Server 2016 Version 1607 (OS Build 14393.2007) It does have same issue with both IIS Express and IIS. So I guess it confirmed @Tratcher's reply. @Tratcher do you have reference to the original bug? Thanks

@Tratcher
Copy link
Member

@Tratcher Tratcher commented Jan 23, 2018

aspnet/WebSockets#100 is the only public reference. I'll check if there's a corresponding fix for IIS Express.

@bigbearzhu
Copy link

@bigbearzhu bigbearzhu commented Jan 23, 2018

Thanks @Tratcher for your effort! Let's hope this update will be rolled into Azure Web App hosting OS soon. So far I could only disable the auto ping to work around it as the hosting OS is still 10.0.14393.0.

@Tratcher
Copy link
Member

@Tratcher Tratcher commented Jan 25, 2018

It looks like the fix has not been ported to IIS Express yet. There is a plan to update express, but no specific schedule.

@Tratcher Tratcher added this to the Discussions milestone Jan 26, 2018
@darkl
Copy link
Author

@darkl darkl commented Aug 23, 2020

Resolved in newer Windows versions
image

@darkl darkl closed this as completed Aug 23, 2020
@msftbot msftbot bot locked as resolved and limited conversation to collaborators Sep 22, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants