Deadlock caused by lock in static constructor #1193

Closed
@reima

Summary

I've found a rather subtle deadlock condition in Noda Time that can be triggered by creating patterns from different threads.

Reproduction

Here's a minimal program that reproduces the issue:

using NodaTime.Text;
using System.Diagnostics;
using System.Globalization;
using System.Threading;

namespace NodaTimeBug
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            var enUsThread = new Thread(
                () =>
                {
                    Debugger.Break();
                    LocalDateTimePattern.Create("ddMMyyyy", new CultureInfo("en-US", false));
                }) { Name = "en-US" };

            var invariantThread = new Thread(
                () =>
                {
                    LocalDateTimePattern.Create("ddMMyyyy", CultureInfo.InvariantCulture);
                }) { Name = "Invariant" };

            enUsThread.Start();
            invariantThread.Start();

            enUsThread.Join();
            invariantThread.Join();
        }
    }
}

The program may not deadlock every time, as there is a race condition involved. To reproduce the deadlock consistently, follow these steps in Visual Studio:

  1. Create a breakpoint in Cache.cs at the line
    if (dictionary.TryGetValue(key, out TValue value))
  2. Start debugging.
  3. The Debugger.Break() should get hit first. While the program is paused, freeze the thread named "en-US" and continue.
  4. The breakpoint in Cache.cs should get hit next. While the program is paused, freeze the thread named "Invariant", and thaw the thread named "en-US". Then continue.
  5. The breakpoint in Cache.cs should get hit again. Continue.
  6. Wait for a second or so and pause the program. Thaw the thread "Invariant". All threads should now be running again. Continue.
  7. Observe that the program is now deadlocked.

Cause

Suppose the thread "Invariant" starts running first. It eventually calls into NodaFormatInfo.InvariantInfo.ParsePattern, which calls Cache.GetOrAdd on its cache field. Cache.GetOrAdd locks on its mutex. Suppose the thread is now preempted. Note that NodaFormatInfo.InvariantInfo.cache.mutex is locked by the thread "Invariant".

Thread "en-US" now runs. It also first goes through a Cache.GetOrAdd, which locks a mutex. But that's not relevant here, as this is a different mutex (from the NodaFormatInfo for the "en-US" culture). The important part is that LocalDateTimePatternParser.ParsePattern is called through the valueFactory of the Cache object. But the method cannot run immediately, because it references static fields from LocalDateTimePattern.Patterns that have not yet been initialized. That means that the CLR must call the static constructor of LocalDateTimePattern.Patterns first.

To ensure the static constructor is called only once, the CLR acquires a unique lock for the static constructor of LocalDateTimePattern.Patterns. The static initialization calls into LocalDateTimePattern.CreateWithInvariantCulture, which calls into NodaFormatInfo.InvariantInfo.ParsePattern, which calls Cache.GetOrAdd on its cache field. This method tries to lock its mutex. This mutex is currently locked by the thread "Invariant", which means the thread "en-US" blocks. Note that the unique lock for the static constructor of LocalDateTimePattern.Patterns is still held by the thread "en-US".

Suppose the thread "Invariant" is now scheduled again. It is still in the Cache.GetOrAdd method. Similar to the other thread, LocalDateTimePatternParser.ParsePattern is called through the valueFactory of the Cache. For the same reasons as before, the CLR tries to acquire the unique lock for the static constructor of LocalDateTimePattern.Patterns. But this lock is held by the thread "en-US". Hence the deadlock.

Here are the stack traces for both threads at the time of the deadlock:

Invariant

[Managed to Native Transition]
NodaTime.dll!NodaTime.Text.FixedFormatInfoPatternParser<NodaTime.LocalDateTime>..ctor.AnonymousMethod__0(string patternText) Line 24
NodaTime.dll!NodaTime.Utility.Cache<string, NodaTime.Text.IPattern<NodaTime.LocalDateTime>>.GetOrAdd(string key) Line 60
NodaTime.dll!NodaTime.Text.FixedFormatInfoPatternParser<NodaTime.LocalDateTime>.ParsePattern(string pattern) Line 28
NodaTime.dll!NodaTime.Text.LocalDateTimePattern.Create(string patternText, NodaTime.Globalization.NodaFormatInfo formatInfo, NodaTime.LocalDateTime templateValue) Line 169
NodaTime.dll!NodaTime.Text.LocalDateTimePattern.Create(string patternText, System.Globalization.CultureInfo cultureInfo, NodaTime.LocalDateTime templateValue) Line 191
NodaTime.dll!NodaTime.Text.LocalDateTimePattern.Create(string patternText, System.Globalization.CultureInfo cultureInfo) Line 204
NodaTimeBug.exe!NodaTimeBug.Program.Main.AnonymousMethod__0_1() Line 23
mscorlib.dll!System.Threading.ThreadHelper.ThreadStart_Context(object state) Line 41
mscorlib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) Line 293
mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) Line 268
mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state) Line 261
mscorlib.dll!System.Threading.ThreadHelper.ThreadStart() Line 60

en-US

mscorlib.dll!System.Threading.Monitor.Enter(object obj, ref bool lockTaken) Line 30
NodaTime.dll!NodaTime.Utility.Cache<string, NodaTime.Text.IPattern<NodaTime.LocalDateTime>>.GetOrAdd(string key) Line 43
NodaTime.dll!NodaTime.Text.FixedFormatInfoPatternParser<NodaTime.LocalDateTime>.ParsePattern(string pattern) Line 28
NodaTime.dll!NodaTime.Text.LocalDateTimePattern.Create(string patternText, NodaTime.Globalization.NodaFormatInfo formatInfo, NodaTime.LocalDateTime templateValue) Line 169
NodaTime.dll!NodaTime.Text.LocalDateTimePattern.CreateWithInvariantCulture(string patternText) Line 230
NodaTime.dll!NodaTime.Text.LocalDateTimePattern.Patterns.Patterns() Line 87
[Native to Managed Transition]
[Managed to Native Transition]
NodaTime.dll!NodaTime.Text.FixedFormatInfoPatternParser<NodaTime.LocalDateTime>..ctor.AnonymousMethod__0(string patternText) Line 24
NodaTime.dll!NodaTime.Utility.Cache<string, NodaTime.Text.IPattern<NodaTime.LocalDateTime>>.GetOrAdd(string key) Line 60
NodaTime.dll!NodaTime.Text.FixedFormatInfoPatternParser<NodaTime.LocalDateTime>.ParsePattern(string pattern) Line 28
NodaTime.dll!NodaTime.Text.LocalDateTimePattern.Create(string patternText, NodaTime.Globalization.NodaFormatInfo formatInfo, NodaTime.LocalDateTime templateValue) Line 169
NodaTime.dll!NodaTime.Text.LocalDateTimePattern.Create(string patternText, System.Globalization.CultureInfo cultureInfo, NodaTime.LocalDateTime templateValue) Line 191
NodaTime.dll!NodaTime.Text.LocalDateTimePattern.Create(string patternText, System.Globalization.CultureInfo cultureInfo) Line 204
NodaTimeBug.exe!NodaTimeBug.Program.Main.AnonymousMethod__0_0() Line 16
mscorlib.dll!System.Threading.ThreadHelper.ThreadStart_Context(object state) Line 41
mscorlib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) Line 293
mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) Line 268
mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state) Line 261
mscorlib.dll!System.Threading.ThreadHelper.ThreadStart() Line 60
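
The lock ordering described under Cause can be modelled without Noda Time. The following is only a sketch under stated assumptions: every name in it (StaticInitDeadlockModel, CacheMutex, the nested Patterns class, the two events) is hypothetical and merely stands in for the real types. The events force the interleaving that the debugger steps above produce by freezing threads, and the Join timeouts are only there so the caller can observe the hang instead of hanging itself.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the scenario above; none of these names are Noda Time's.
public static class StaticInitDeadlockModel
{
    // Plays the role of NodaFormatInfo.InvariantInfo.cache.mutex.
    private static readonly object CacheMutex = new object();
    private static readonly ManualResetEventSlim MutexHeld = new ManualResetEventSlim(false);
    private static readonly ManualResetEventSlim InitializerRunning = new ManualResetEventSlim(false);

    // Plays the role of LocalDateTimePattern.Patterns.
    private static class Patterns
    {
        internal static readonly string Value;

        // Explicit static constructor, so it runs at the first access to Value.
        static Patterns()
        {
            InitializerRunning.Set();
            lock (CacheMutex) // blocks here: the "Invariant" thread holds the mutex
            {
                Value = "pattern";
            }
        }
    }

    // Returns true if neither thread finished within the timeouts.
    public static bool Deadlocks()
    {
        var invariant = new Thread(() =>
        {
            lock (CacheMutex)
            {
                MutexHeld.Set();
                InitializerRunning.Wait();    // let "en-US" enter the static constructor
                GC.KeepAlive(Patterns.Value); // blocks on the type initializer lock
            }
        }) { Name = "Invariant", IsBackground = true };

        var enUs = new Thread(() =>
        {
            MutexHeld.Wait();             // "Invariant" already owns the mutex
            GC.KeepAlive(Patterns.Value); // runs the static constructor on this thread
        }) { Name = "en-US", IsBackground = true };

        invariant.Start();
        enUs.Start();

        bool invariantDone = invariant.Join(TimeSpan.FromSeconds(2));
        bool enUsDone = enUs.Join(TimeSpan.FromSeconds(2));
        return !invariantDone && !enUsDone;
    }
}
```

Calling StaticInitDeadlockModel.Deadlocks() should return true: "Invariant" waits for the type initializer while holding the mutex, and "en-US" waits for the mutex while running the type initializer. Both threads are background threads, so the process can still exit.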

Solution (?)

Generally speaking, blocking (directly or indirectly) inside a static constructor or initializer should probably be avoided as much as possible. As this example shows, it's not always obvious when a static constructor will run. This makes it very difficult to prevent deadlocks like this, as the order the locks are taken in can be hard to control.

So in my opinion the goal should be to eliminate blocking calls from static constructors (or initializers). Unfortunately I'm not versed enough in the intricacies of Noda Time to make that change myself or propose an alternative solution to this particular problem.
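
One possible direction, offered only as a sketch and not as Noda Time's actual Cache implementation or its eventual fix: make sure the cache never holds a lock while it runs the valueFactory. ConcurrentDictionary.GetOrAdd is documented to invoke the factory outside its internal locks, so a thread that ends up waiting for a static constructor inside the factory would not be holding a cache lock at that point. The class name FactoryOutsideLockCache is hypothetical, and the sketch leaves out any size limiting or eviction the real cache may have.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical cache shape; not Noda Time's actual Cache<TKey, TValue>.
public sealed class FactoryOutsideLockCache<TKey, TValue>
{
    private readonly ConcurrentDictionary<TKey, TValue> dictionary =
        new ConcurrentDictionary<TKey, TValue>();
    private readonly Func<TKey, TValue> valueFactory;

    public FactoryOutsideLockCache(Func<TKey, TValue> valueFactory)
    {
        this.valueFactory = valueFactory ?? throw new ArgumentNullException(nameof(valueFactory));
    }

    // The factory is invoked without any dictionary lock held, so it may
    // safely trigger static initialization that itself uses this cache.
    public TValue GetOrAdd(TKey key) => dictionary.GetOrAdd(key, valueFactory);
}
```

The trade-off is that under contention the factory may run more than once for the same key, with only one result being stored. That is acceptable only if the factory is free of side effects and its results are interchangeable.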
