This repository has been archived by the owner on Dec 18, 2018. It is now read-only.

Server deadlock #103

Closed
main-- opened this issue May 23, 2015 · 20 comments

Comments

@main--

main-- commented May 23, 2015

Right now, Kestrel invokes Task.Wait() in several places.

In combination with MVC, this has quite fatal consequences: MVC asynchronously renders the page. It then disposes the output writer which in turn flushes the output stream. Because IDisposable is incompatible with async/await, all of this is done synchronously, leading to the synchronous Write call. As a result, multiple requests (2 were already enough in all of my tests) lead to more and more server threads getting hung up in Task.Wait, until they finally deadlock the entire server as the tasks they are waiting for never run.

@Tragetaschen
Contributor

I don't get what you are talking about. What doesn't work with async/await/IDisposable? What multiple server threads are you talking about in Kestrel?

@main--
Author

main-- commented May 23, 2015

IDisposable.Dispose returns void, so the MVC writer has no other choice: it must call the synchronous Write. What I'm experiencing right now is that Kestrel stops mid-response after a few requests and then does nothing at all. It's completely locked up because every thread hangs at Task.Wait().
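
To illustrate the point about Dispose, here is a minimal, self-contained sketch. It is not MVC's or Kestrel's actual code: CountingStream and DisposeDemo are hypothetical names, and the stream only counts how it is written to. It shows that disposing a StreamWriter that still holds buffered text ends in a synchronous Stream.Write, while awaiting FlushAsync first leaves Dispose with nothing to write.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical response stream: it stores nothing and only counts write calls.
public sealed class CountingStream : Stream
{
    public int SyncWrites;
    public int AsyncWrites;

    public override void Write(byte[] buffer, int offset, int count) => SyncWrites++;

    public override Task WriteAsync(byte[] buffer, int offset, int count, CancellationToken ct)
    {
        AsyncWrites++;
        return Task.CompletedTask;
    }

    public override void Flush() { }
    public override Task FlushAsync(CancellationToken ct) => Task.CompletedTask;

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}

public static class DisposeDemo
{
    public static async Task<CountingStream> RenderAsync(bool flushBeforeDispose)
    {
        var body = new CountingStream();
        using (var writer = new StreamWriter(body))
        {
            // Small enough to stay in the writer's internal buffer.
            await writer.WriteAsync("<p>hello</p>");

            if (flushBeforeDispose)
                await writer.FlushAsync(); // drains the buffer asynchronously
        } // Dispose() returns void, so anything still buffered is written synchronously here
        return body;
    }
}
```

With flushBeforeDispose set to false, the stream should record one synchronous write and no asynchronous ones; with true, it should be the other way around.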

@mdekrey

mdekrey commented May 26, 2015

I'm noticing the same issue cross-platform. I'm having trouble creating a reliable trivial test. Relevant stack trace appears to be:

mscorlib.dll!System.Threading.Tasks.Task.Wait(int millisecondsTimeout, System.Threading.CancellationToken cancellationToken)     
Microsoft.AspNet.Server.Kestrel.dll!Microsoft.AspNet.Server.Kestrel.Http.FrameResponseStream.Write(byte[] buffer, int offset, int count)     
Microsoft.AspNet.Mvc.Core.dll!Microsoft.AspNet.Mvc.HttpResponseStreamWriter.FlushInternal(bool flushStream = false, bool flushEncoder)   
Microsoft.AspNet.Mvc.Core.dll!Microsoft.AspNet.Mvc.HttpResponseStreamWriter.Dispose(bool disposing)  
mscorlib.dll!System.IO.TextWriter.Dispose()  
Microsoft.AspNet.Mvc.Core.dll!Microsoft.AspNet.Mvc.ViewExecutor.ExecuteAsync(Microsoft.AspNet.Mvc.Rendering.IView view = {Microsoft.AspNet.Mvc.ResultExecutingContext}, Microsoft.AspNet.Mvc.ActionContext actionContext = null, Microsoft.AspNet.Mvc.ViewDataDictionary viewData = {Microsoft.AspNet.Mvc.Razor.RazorView}, Microsoft.AspNet.Mvc.ITempDataDictionary tempData = {Microsoft.AspNet.Mvc.ViewDataDictionary}, Microsoft.AspNet.Mvc.Rendering.HtmlHelperOptions htmlHelperOptions = {Microsoft.AspNet.Mvc.TempDataDictionary}, Microsoft.Net.Http.Headers.MediaTypeHeaderValue contentType = {Microsoft.AspNet.Mvc.Rendering.HtmlHelperOptions})     
mscorlib.dll!System.Runtime.CompilerServices.AsyncMethodBuilderCore.MoveNextRunner.InvokeMoveNext(object stateMachine)   

This will eventually leave my server running with no errors, accepting new requests but not responding, with the last request before the hang partly sent (probably at a TextWriter buffer boundary, judging by the stack trace) and open.

Update: Kestrel entry from project.lock.json:

  "Kestrel/1.0.0-beta5-11745": {
    "dependencies": {
      "Microsoft.AspNet.Hosting": "1.0.0-beta5-11877",
      "Microsoft.AspNet.Server.Kestrel": "1.0.0-beta5-11745"
    },
    "frameworkAssemblies": [
      "mscorlib",
      "System",
      "System.Core",
      "Microsoft.CSharp"
    ],
    "compile": {
      "lib/dnx451/Kestrel.dll": {}
    },
    "runtime": {
      "lib/dnx451/Kestrel.dll": {}
    }
  }

@Tragetaschen
Contributor

When the mentioned Write hangs, the Post'ed work on the libuv thread (from WriteAsync down to SocketOutput.Write) doesn't run to finish the TCS. Do you have a (console) logger running to see if any exception brought down the loop?
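
A rough sketch of how a blocking wait on work posted to a single-threaded loop can hang. This is an assumption about the failure mode, not a confirmed diagnosis of Kestrel: SingleThreadLoop, BlockingWrite and LoopDemo are made-up names, and nothing here uses libuv. The waits carry timeouts so the sample terminates instead of hanging.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a single-threaded event loop (not libuv).
public sealed class SingleThreadLoop : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public SingleThreadLoop()
    {
        // Runs posted work items one at a time, in order.
        _thread = new Thread(() =>
        {
            foreach (var work in _queue.GetConsumingEnumerable())
                work();
        });
        _thread.IsBackground = true;
        _thread.Start();
    }

    public void Post(Action work) => _queue.Add(work);

    public void Dispose()
    {
        _queue.CompleteAdding(); // let the loop drain and exit
        _thread.Join();
    }
}

public static class LoopDemo
{
    // Posts the "write" to the loop, then blocks until it completes or the timeout expires.
    private static bool BlockingWrite(SingleThreadLoop loop, TimeSpan timeout)
    {
        var done = new TaskCompletionSource<bool>();
        loop.Post(() => done.SetResult(true));
        return done.Task.Wait(timeout); // false: the posted work did not run in time
    }

    // The caller is on another thread, so the loop is free to run the posted work.
    public static bool WriteFromOtherThread()
    {
        using (var loop = new SingleThreadLoop())
            return BlockingWrite(loop, TimeSpan.FromSeconds(5));
    }

    // The caller is the loop thread itself, so it blocks the only thread that could run the work.
    public static bool WriteFromLoopThread()
    {
        using (var loop = new SingleThreadLoop())
        {
            var result = new TaskCompletionSource<bool>();
            loop.Post(() => result.SetResult(BlockingWrite(loop, TimeSpan.FromMilliseconds(200))));
            return result.Task.Result;
        }
    }
}
```

WriteFromOtherThread should return true, while WriteFromLoopThread should return false once its 200 ms wait expires. Without the timeout, that second wait would never return.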

@mdekrey

mdekrey commented May 27, 2015

Unfortunately, no. This is all I have:

Started
info    : [Microsoft.AspNet.Mvc.Routing.InnerAttributeRoute] Request successfully matched the route with name '' and template ''.
verbose : [Microsoft.AspNet.Mvc.MvcRouteHandler] Executing action Samples.Pages.Controllers.BlogController.Home
info : [Microsoft.Framework.DependencyInjection.DataProtectionServices] User profile is available. Using 'C:\Users\Matt\AppData\Local\ASP.NET\DataProtection-Keys' as key repository and Windows DPAPI to encrypt keys at rest.
verbose : [Microsoft.AspNet.Mvc.ViewResult] The view 'Home' was found.

Samples.Pages is my own project - not really a public sample (yet). In this case, it hung on the first request. It was to BlogController.Home() via attribute routing which fired up the view 'Home'. This loaded a couple partial async views (to display each blog entry on the home page).

Interestingly, if I leave out writing the blog entry body <p>@blogEntry.Body</p>, where Body is a string, from the partial view, I don't tend to get the hang. As I said, I'm having a hard time showing a reliable trivial demonstration.

@crozone

crozone commented May 27, 2015

Just writing to confirm I am noticing the exact same thing on an Ubuntu LTS machine, latest DNX runtime, as well as locally on Windows 8.1. I'm not sure what triggered the issue - whether it was the new libuv release (1.4), or the new Kestrel runtime. Up until I saw this issue, I assumed it was a problem with my own code deadlocking or something to do with the ASP.NET layer, but it seems the issue isn't present in IIS.

I usually can't even get a single response to complete successfully on Kestrel - the response is written to the browser in full but the connection never closes (the little loading spinner in Firefox never disappears). After that first request, the server accepts no further connections (so embedded images don't load, and often Bootstrap libraries and other things that aren't in the main HTML don't load either).

I have noticed that if I put a breakpoint somewhere and pause during the request (it can be in the controller or a view), slowing the entire request down, it completes successfully. This is definitely indicative of some sort of race condition; getting stuck on that Write() seems to fit the bill.

No exceptions in logging, log reads:

info : [Microsoft.Framework.DependencyInjection.DataProtectionServices] User profile is available. Using 'C:\Users\USER\AppData\Local\ASP.NET\DataProtection-Keys' as key repository and Windows DPAPI to encrypt keys at rest.
Started
verbose : [Microsoft.AspNet.Mvc.Routing.InnerAttributeRoute] Request did not match any attribute route.
info : [Microsoft.AspNet.Routing.Template.TemplateRoute] Request successfully matched the route with name 'AreaRoute' and template '{area:exists}/{controller:exists}/{action}/{id?}'.
verbose : [Microsoft.AspNet.Mvc.MvcRouteHandler] Executing action Thingo.Web.Areas.Main.Controllers.HomeController.Index
verbose : [Microsoft.AspNet.Mvc.ViewResult] The view 'Index' was found.

And that's all I get before it hangs indefinitely.

@bojanrajkovic

I'm seeing the same thing with our MVC-using apps. I was able to work around the issue by introducing a middleware that replaces the response body stream with an in-memory one, like so:

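using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNet.Builder; // RequestDelegate (namespace as of the ASP.NET 5 betas)
using Microsoft.AspNet.Http;    // HttpContext

public class StupidMiddleware
{
    readonly RequestDelegate next;

    public StupidMiddleware(RequestDelegate next)
    {
        this.next = next;
    }

    public async Task Invoke(HttpContext ctx)
    {
        var resp = ctx.Response;

        // Replace the body stream with a fake in-memory one, so that any
        // synchronous writes from later middleware (e.g. MVC) never reach Kestrel
        var realBodyStream = resp.Body;
        var fakeBody = new MemoryStream();
        resp.Body = fakeBody;

        await next(ctx);

        // The fake body stream is closed by now, so make a new one from its
        // contents (ToArray still works on a closed MemoryStream)
        fakeBody = new MemoryStream(fakeBody.ToArray());

        // Swap the real body stream back in
        resp.Body = realBodyStream;

        // Send the buffered response to the real stream asynchronously
        await fakeBody.CopyToAsync(resp.Body);
    }
}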

I reproduced the deadlock again and attached to the Kestrel process. The stack trace for the hung thread is in https://gist.github.com/bojanrajkovic/45f31c6713608ee96c70, and shows Kestrel's FrameResponseStream.Write method waiting on something. This seems like it might be a deadlock as a result of inconsistent async/await usage (http://blog.stephencleary.com/2012/07/dont-block-on-async-code.html), but I don't know enough about Kestrel internals to say. An attached console logger doesn't show anything out of the ordinary -- the last two lines are as follows:

verbose : [Microsoft.AspNet.Mvc.MvcRouteHandler] Executing action Foo.Controllers.AccountController.ViewLogin
verbose : [Microsoft.AspNet.Mvc.ViewResult] The view 'Login' was found.

The wrapping middleware above seems to take care of the problem by making all of the elements that MVC deals with async-safe, thus preventing the deadlock.
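
The same buffering idea can be sketched without any ASP.NET types. This is only an illustration under stated assumptions: AsyncOnlyStream, BufferingDemo and RenderAsync are hypothetical names, and AsyncOnlyStream stands in for a server stream on which synchronous writes are a problem (here they simply throw). The renderer writes and disposes synchronously against a MemoryStream, and only the final copy touches the "real" stream, via WriteAsync.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical "real" response stream that rejects synchronous writes.
public sealed class AsyncOnlyStream : Stream
{
    public readonly MemoryStream Received = new MemoryStream();

    public override void Write(byte[] buffer, int offset, int count) =>
        throw new InvalidOperationException("synchronous write reached the real stream");

    public override Task WriteAsync(byte[] buffer, int offset, int count, CancellationToken ct)
    {
        Received.Write(buffer, offset, count);
        return Task.CompletedTask;
    }

    public override void Flush() { }
    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}

public static class BufferingDemo
{
    // Stand-in for a renderer that flushes synchronously when its writer is disposed.
    private static Task RenderAsync(Stream body)
    {
        using (var writer = new StreamWriter(body))
            writer.Write("<p>hello</p>");
        return Task.CompletedTask;
    }

    // Lets the renderer write to an in-memory buffer, then copies the
    // finished bytes to the real stream with a single asynchronous write.
    public static async Task<string> RunAsync()
    {
        var real = new AsyncOnlyStream();
        var buffer = new MemoryStream();

        await RenderAsync(buffer);

        // ToArray() still works after the MemoryStream has been closed.
        var bytes = buffer.ToArray();
        await real.WriteAsync(bytes, 0, bytes.Length);

        return Encoding.UTF8.GetString(real.Received.ToArray());
    }
}
```

The trade-off is that the whole response sits in memory before the first byte is sent, which is fine as a stopgap but not something you'd want for large responses.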

@mdekrey

mdekrey commented May 28, 2015

Thanks for the workaround, @bojanrajkovic! I've boiled it down (https://gist.github.com/mdekrey/59b93ab52cd130c4a2a8) if you're already using MVC.

It should be noted that this needs to go first in your Configure method, or it won't be doing anything!

@bojanrajkovic

Nice, didn't know about NonDisposableStream. We ended up switching to WebListener.

@adalinesimonian

Can confirm that I am able to replicate this problem on OS X Yosemite: after a number of requests, Kestrel would stop responding to any more. Kill process, restart, repeat. Very odd problem.

So far, thanks to @mdekrey's variant of @bojanrajkovic's workaround, it's been up and running and now I can actually test ASP.NET applications for more than a few minutes.

@balneaves

This makes Kestrel behave much better for me too (OSX/Yosemite/DNX latest). However, it does seem to make Kestrel refuse to quit when you press Ctrl+C.

After I press Ctrl+C it just sits there... then if I go to my browser and refresh the MVC page, the browser shows an error and Kestrel quits, usually with one of the following 2 errors:

System.Exception: Error -16 EBUSY resource busy or locked
  at Microsoft.AspNet.Server.Kestrel.Networking.Libuv.Check (Int32 statusCode) [0x00000] in <filename unknown>:0
  at Microsoft.AspNet.Server.Kestrel.Networking.Libuv.loop_close (Microsoft.AspNet.Server.Kestrel.Networking.UvLoopHandle handle) [0x00000] in <filename unknown>:0
  at Microsoft.AspNet.Server.Kestrel.Networking.UvLoopHandle.ReleaseHandle () [0x00000] in <filename unknown>:0
  at System.Runtime.InteropServices.SafeHandle.Dispose (Boolean disposing) [0x00000] in <filename unknown>:0
  at System.Runtime.InteropServices.SafeHandle.Finalize () [0x00000] in <filename unknown>:0

Or

Unhandled Exception: System.NullReferenceException: Object reference not set to an instance of an object
  at Microsoft.AspNet.Server.Kestrel.Networking.UvHandle.ReleaseHandle () [0x00000] in <filename unknown>:0
  at System.Runtime.InteropServices.SafeHandle.Dispose (Boolean disposing) [0x00000] in <filename unknown>:0
  at System.Runtime.InteropServices.SafeHandle.Finalize () [0x00000] in <filename unknown>:0

@Tragetaschen
Contributor

That particular issue is tracked here: #9

@mdekrey

mdekrey commented Jun 7, 2015

Agreed with @Tragetaschen - the Ctrl+C behavior was malfunctioning long before adding the workaround provided by @bojanrajkovic.

@davidfowl
Member

I believe this was fixed by @halter73 in 7e125fa

@mdekrey

mdekrey commented Jun 13, 2015

Appears to be working for me as of Kestrel/1.0.0-beta6-11833 (tested on Windows only as of yet). Thanks for the notice, @davidfowl, and thanks to @halter73, too!

@Tragetaschen
Contributor

If that's the problem, #60 solved this by filling the async gap from the FrameResponseStream into the Write method. All the callbacks are ultimately only used to transport a possible error to the higher layers which a TaskCompletionSource does very well.

@mdekrey

mdekrey commented Jun 14, 2015

Doesn't look like it is resolved on OSX, dnvm version 1.0.0-beta6-12032, with Kestrel/1.0.0-beta6-11833 according to the project.lock.json.

@halter73
Member

@mdekrey I think I resolved this deadlock issue. I noticed a similar deadlock when running https://github.com/aspnet/MusicStore on Linux and fixed it with this commit.

This is the same commit @davidfowl linked to earlier in the thread, but you'll need Kestrel 1.0.0-beta6-11852 or later to try it out.

@corruptmem

I'm getting deadlocks when making simultaneous requests with Kestrel/1.0.0-beta6-11943 on dnx 1.0.0-beta6-12250 clr x64 for Windows and dnx 1.0.0-beta6-12232 mono x64 on Linux/Mac.

If I switch to any host besides Kestrel on Windows, the issue disappears.

@glennc
Member

glennc commented Sep 1, 2015

Closing as we think this is fixed. If anyone repros with beta8 then create a new issue and link to this one.

@glennc glennc closed this as completed Sep 1, 2015