Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[threads] Use refcounts for coordinating finalization and detaching #12391

Merged
merged 1 commit into from Jan 15, 2019

Commits on Jan 11, 2019

  1. [threads] Use refcounts for coordinating finalization and detaching

    Reverts a29ad08 (mono#9914)
    
    The basic problem we want to solve is the following:
      1. All access to InternalThread:state must be protected by the
         InternalThread:synch_cs mutex
      2. We must destroy the mutex when we are done with the thread.
      3. We don't know which happens later - detaching the machine thread or
         finalizing its InternalThread managed object.
    
    The solution is to replace InternalThread:synch_cs by InternalThread:longlived
    which is a refcounted struct that holds the synch_cs.  The refcount starts out
    at 2 when the thread is attached to the runtime and when we create the managed
    InternalThread object that represents it.
    Both detaching and finalizing the managed object will decrement the refounct,
    and whichever one happens last will be responsible for destroying the mutex.
    
    This addresses mono#11956 which was a race
    condition due to the previous attempt to fix this lifetime problem.  The
    previous attempt incorrectly used CAS in mono_thread_detach_internal while
    continuing to use locking of synch_cs elsewhere. In particular
    mono_thread_suspend_all_other_threads could race with
    mono_thread_detach_internal: it expects to take the thread lock and test
    thread->state and use the thread->suspended event, while detaching deletes
    thread->suspended without taking a lock.
    
    As a result we had a concurrency bug: in suspend_all_other_threads it's
    possible to see both the old (non-Stopped) value of thread->state and the
    new (NULL) value of thread->suspended.  Which leads to crashes.
    
    ---
    
    Background - why we don't know if detaching or finalization happens first.
    
    1. InternalThread normally outlives the machine thread. This can happen because
    when one thread starts another it can hold a reference to the fresh thread's
    Thread object which holds a reference to the InternalThread. So after the
    machine thread is done, the older thread can query the state of the younger
    Thread object. This is the normal situation.
    
    2. During shutdown we can have the opposite situation: the InternalThread
    objects are finalized first (this happens during root domain finalization), but
    the machine threads are still running, and they may still return to
    start_wrapper_internal and call detach_internal. So in this case we have an
    InternalThread whose finalizer ran first and detach will run second.
    lambdageek committed Jan 11, 2019
    Configuration menu
    Copy the full SHA
    6af8eed View commit details
    Browse the repository at this point in the history