Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock when generating schema on parallel tasks ?? #778

Closed
kimbell opened this issue Sep 19, 2018 · 20 comments
Closed

Deadlock when generating schema on parallel tasks ?? #778

kimbell opened this issue Sep 19, 2018 · 20 comments

Comments

@kimbell
Copy link

kimbell commented Sep 19, 2018

I have a .NET Core 2.1 application. In this I have code that calls await JsonSchema4.FromTypeAsync<T>();
I also have 400+ tests, and for each of them the code that creates the schema is called. All my tests are asynchronous.

When running the tests through ReSharper, the majority of tests randomly hang until I stop test execution. If I run them one-by-one, things work fine, so I'm thinking deadlock might be an issue.

I downloaded the source code and built against that, then I started commenting out bits to see where things stopped working. I have tried creating a standalone project for this issue, but so far unsuccessful.

When digging around the code, I had to make two changes for it to work for me.

In XmlDocumentationExtensions.cs, there are a number of calls to

await _lock.WaitAsync()

this I changed to

await _lock.WaitAsync().ConfigureAwait(false);

Since ASP .NET Core doesn't have a synchronization context, this baffles me a bit.

In DynamicApis.cs I had to change

        public static async Task<bool> FileExistsAsync(string filePath)
        {
            if (!SupportsFileApis)
                throw new NotSupportedException("The System.IO.File API is not available on this platform.");

            if (string.IsNullOrEmpty(filePath))
                return false;

            return await FromResult(false);
            //return await FromResult((bool)FileType.GetRuntimeMethod("Exists",
            //    new[] { typeof(string) }).Invoke(null, new object[] { filePath }));
        }

I've also fired up Process Monitor so see if I could see anything related to file access, but haven't found anything.

@kimbell
Copy link
Author

kimbell commented Sep 19, 2018

I found another hack that works for my tests.

By adding the following code, things run without any problem.

var dynamicApisType = typeof(DynamicApis);

var fileTypeField = dynamicApisType.GetField("FileType", BindingFlags.NonPublic | BindingFlags.Static);
fileTypeField?.SetValue(null, null);

var directoryTypeField = dynamicApisType.GetField("DirectoryType", BindingFlags.NonPublic | BindingFlags.Static);
directoryTypeField?.SetValue(null, null);

var pathTypeField = dynamicApisType.GetField("PathType", BindingFlags.NonPublic | BindingFlags.Static);
pathTypeField?.SetValue(null, null);

@RicoSuter
Copy link
Owner

Returning false in FileExistsAsync essentially disables the whole XML Docs feature... it must be a problem with the locks...

@RicoSuter
Copy link
Owner

Can you provide a simple unit test to reproduce this problem?

@kimbell
Copy link
Author

kimbell commented Sep 19, 2018

As I mentioned in the original post, I tried setting up a test project that would reproduce the error, but so far I haven't had any luck. I'll keep working my brain to see if I can figure out what the magical thing is to trigger this error.

@RicoSuter
Copy link
Owner

Maybe multiple projects/xml files?

@kimbell
Copy link
Author

kimbell commented Sep 20, 2018

I did some more testing today using multiple projects, but so far no luck. The project where the problem surfaced is using a number of internal nuget packages, so I'm trying to distill it down to what is actually needed.

@RicoSuter
Copy link
Owner

If i update NJS with a possible fix, can you verify in your own project?

@kimbell
Copy link
Author

kimbell commented Sep 20, 2018

Sure.

@kimbell
Copy link
Author

kimbell commented Sep 20, 2018

if you make the change in a branch, I can get that and build against it and see how things behave.

@RicoSuter
Copy link
Owner

RicoSuter commented Sep 20, 2018

Can you try with the patch/fix-xml-docs-deadlock branch?

@kimbell
Copy link
Author

kimbell commented Sep 20, 2018

I'll try it out when I get into work tomorrow.

@kimbell
Copy link
Author

kimbell commented Sep 21, 2018

I added the updated code, but it didn't have much effect. I've been exploring and trying things out for a couple of hours.

I started by adding in a bunch of Debug.WriteLine() statements to see what was going on around the lock. I discovered that I didn't get some statements that I expected; adding ConfigureAwait(false); solved that. I searched through the code and found other places where it was missing, so I added them in.

If I remove the locking code from GetXmlDocumentationTagAsync things work as expected, so there is something fishy going on there. Looking at your AsyncLock implementation, it's very different from https://github.com/StephenCleary/AsyncEx/blob/master/src/Nito.AsyncEx.Coordination/AsyncLock.cs

That package is not compatible with the current frameworks supported by NJS.

The changes I've made are in this patch file.
deadlock.txt

@RicoSuter
Copy link
Owner

I dont understand this, only the public methods are synchronized and all private methods aren't - also the synchronized public methods do not call other synchronized public methods (reclaim semaphore => deadlock). Maybe you can switch from WaitAsync() to Wait(), run your tests in debug mode, wait until it blocks, and examine the Debug > Windows > Threads window?

@RicoSuter
Copy link
Owner

I've updated the branch with your changes, except the uncommented sync code in GetXmlDocumentationTagAsync (we need to synchronize this method) - where is the deadlock!?

@RicoSuter
Copy link
Owner

My AsyncLock is just a wrapper around SemaphoreSlim, I dont want to reference another library just for this lock :-) - it must be something stupid...

@RicoSuter
Copy link
Owner

I've switched to this implementation, probably wont change much: http://applications.lt/awaiting-in-csharp-lock-block/

@kimbell
Copy link
Author

kimbell commented Sep 21, 2018

When I changed the code from WaitAsync() to Wait(), all the tests run as expected.

@RicoSuter
Copy link
Owner

What? That is really strange...

@RicoSuter
Copy link
Owner

Ok, I've switched to sync lock, can you verify with latest commit in the branch? Sync lock is fine for me, but still a little bothering because it makes a difference...

@kimbell
Copy link
Author

kimbell commented Sep 21, 2018

Tried with the latest commit, and things work as expected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants