Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[scudo] Add two missing locks to enable/disable. #79670

Merged
merged 4 commits into from
Jan 29, 2024

Conversation

eugenis
Copy link
Contributor

@eugenis eugenis commented Jan 27, 2024

The StackDepot lock was actually observed deadlocking in the code that uses malloc after fork but before exec. The other one was found by code inspection.

The StackDepot lock was actually observed deadlocking in the code that uses
malloc after fork but before exec. The other one was found by code
inspection.
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 27, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Evgenii Stepanov (eugenis)

Changes

The StackDepot lock was actually observed deadlocking in the code that uses malloc after fork but before exec. The other one was found by code inspection.


Full diff: https://github.com/llvm/llvm-project/pull/79670.diff

5 Files Affected:

  • (modified) compiler-rt/lib/scudo/standalone/combined.h (+2)
  • (modified) compiler-rt/lib/scudo/standalone/primary32.h (+2)
  • (modified) compiler-rt/lib/scudo/standalone/primary64.h (+2)
  • (modified) compiler-rt/lib/scudo/standalone/release.h (+16)
  • (modified) compiler-rt/lib/scudo/standalone/stack_depot.h (+8)
diff --git a/compiler-rt/lib/scudo/standalone/combined.h b/compiler-rt/lib/scudo/standalone/combined.h
index 4624f83d142a0de..80774073522a72c 100644
--- a/compiler-rt/lib/scudo/standalone/combined.h
+++ b/compiler-rt/lib/scudo/standalone/combined.h
@@ -691,10 +691,12 @@ class Allocator {
     Quarantine.disable();
     Primary.disable();
     Secondary.disable();
+    Depot.disable();
   }
 
   void enable() NO_THREAD_SAFETY_ANALYSIS {
     initThreadMaybe();
+    Depot.enable();
     Secondary.enable();
     Primary.enable();
     Quarantine.enable();
diff --git a/compiler-rt/lib/scudo/standalone/primary32.h b/compiler-rt/lib/scudo/standalone/primary32.h
index 4d03b282d000def..89fffc50dcd0096 100644
--- a/compiler-rt/lib/scudo/standalone/primary32.h
+++ b/compiler-rt/lib/scudo/standalone/primary32.h
@@ -261,6 +261,7 @@ template <typename Config> class SizeClassAllocator32 {
   }
 
   void disable() NO_THREAD_SAFETY_ANALYSIS {
+    RegionPageMap::disable();
     // The BatchClassId must be locked last since other classes can use it.
     for (sptr I = static_cast<sptr>(NumClasses) - 1; I >= 0; I--) {
       if (static_cast<uptr>(I) == SizeClassMap::BatchClassId)
@@ -281,6 +282,7 @@ template <typename Config> class SizeClassAllocator32 {
         continue;
       getSizeClassInfo(I)->Mutex.unlock();
     }
+    RegionPageMap::enable();
   }
 
   template <typename F> void iterateOverBlocks(F Callback) {
diff --git a/compiler-rt/lib/scudo/standalone/primary64.h b/compiler-rt/lib/scudo/standalone/primary64.h
index 9a642d23620e1ea..4adb53b3748ffe3 100644
--- a/compiler-rt/lib/scudo/standalone/primary64.h
+++ b/compiler-rt/lib/scudo/standalone/primary64.h
@@ -341,9 +341,11 @@ template <typename Config> class SizeClassAllocator64 {
     }
     getRegionInfo(SizeClassMap::BatchClassId)->MMLock.lock();
     getRegionInfo(SizeClassMap::BatchClassId)->FLLock.lock();
+    RegionPageMap::disable();
   }
 
   void enable() NO_THREAD_SAFETY_ANALYSIS {
+    RegionPageMap::enable();
     getRegionInfo(SizeClassMap::BatchClassId)->FLLock.unlock();
     getRegionInfo(SizeClassMap::BatchClassId)->MMLock.unlock();
     for (uptr I = 0; I < NumClasses; I++) {
diff --git a/compiler-rt/lib/scudo/standalone/release.h b/compiler-rt/lib/scudo/standalone/release.h
index b6f76a4d20585ab..6e90ec97e0f3c32 100644
--- a/compiler-rt/lib/scudo/standalone/release.h
+++ b/compiler-rt/lib/scudo/standalone/release.h
@@ -168,6 +168,14 @@ class BufferPool {
     return Buf.BufferIndex != StaticBufferCount;
   }
 
+  void disable() NO_THREAD_SAFETY_ANALYSIS {
+    Mutex.lock();
+  }
+
+  void enable() NO_THREAD_SAFETY_ANALYSIS {
+    Mutex.unlock();
+  }
+
 private:
   Buffer getDynamicBuffer(const uptr NumElements) {
     // When using a heap-based buffer, precommit the pages backing the
@@ -325,6 +333,14 @@ class RegionPageMap {
 
   uptr getBufferNumElements() const { return BufferNumElements; }
 
+  static void disable() NO_THREAD_SAFETY_ANALYSIS {
+    Buffers.disable();
+  }
+
+  static void enable() NO_THREAD_SAFETY_ANALYSIS {
+    Buffers.enable();
+  }
+
 private:
   // We may consider making this configurable if there are cases which may
   // benefit from this.
diff --git a/compiler-rt/lib/scudo/standalone/stack_depot.h b/compiler-rt/lib/scudo/standalone/stack_depot.h
index 12c35eb2a4f3383..ab5e96680cab58c 100644
--- a/compiler-rt/lib/scudo/standalone/stack_depot.h
+++ b/compiler-rt/lib/scudo/standalone/stack_depot.h
@@ -136,6 +136,14 @@ class StackDepot {
   u64 operator[](uptr RingPos) const {
     return atomic_load_relaxed(&Ring[RingPos & RingMask]);
   }
+
+  void disable() NO_THREAD_SAFETY_ANALYSIS {
+    RingEndMu.lock();
+  }
+
+  void enable() NO_THREAD_SAFETY_ANALYSIS {
+    RingEndMu.unlock();
+  }
 };
 
 } // namespace scudo

Copy link

github-actions bot commented Jan 27, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Comment on lines 171 to 178
void disable() NO_THREAD_SAFETY_ANALYSIS {
Mutex.lock();
}

void enable() NO_THREAD_SAFETY_ANALYSIS {
Mutex.unlock();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need the disable()/enable() here and RegionPageMap down below. They are utility data structures (the lock is to guard the static storage)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this lock is special, see the other comment.

Comment on lines 140 to 146
void disable() NO_THREAD_SAFETY_ANALYSIS {
RingEndMu.lock();
}

void enable() NO_THREAD_SAFETY_ANALYSIS {
RingEndMu.unlock();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you share more details about the deadlock? From where the RingEndMu is used (only in the insert() above and it seems to me that it's not disabling the StackDepot. For example, find() is still working when we call disable().
I'm wondering if we want a different lock to guard all the operations (we may not be able to lock in the user because it passes the address of Depot)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, StackDepot is not actually being disabled here. The idea is grab all allocator locks in a pthread_atfork before the fork, and release them after. This allows malloc to be used in a fork child of a multithreaded process, which is expressly forbidden by the standard, but very widely used. For example, Android's init uses std::string after fork when spawning services, for example in android::init::EnterNamespaces.

Any lock that is necessary to serve an allocator call must be handled this way. Otherwise there is a possibility that the lock is held during the call to fork, which results in it being held forever in the child process, and the next operation that needs it deadlocks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks for the explanation.

For the use case in release.h, they are supposed to be acquired after the region mutexes of primary allocator. Which means once you disable the primary allocator, those mutexes won't be held. Given the constraint of thread safety analysis, we don't have an easy way to express that but I can add some runtime debug checks to ensure that assumption.

For the StackDepot, even the names enable/disable seem to be slightly misleading here but I think it's fine according to the context. Can we add some short comment to the enable/disable ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point about the BufferPool lock, removed it from the pull request.
Added a comment in StackDepot.

Copy link
Contributor

@ChiaHungDuan ChiaHungDuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@eugenis eugenis merged commit c82f3ca into llvm:main Jan 29, 2024
4 checks passed
@eugenis eugenis deleted the scudo-atfork branch January 29, 2024 22:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants