[scudo] Use a tryLock in secondary release to OS #132827

Merged: 1 commit into llvm:main on Mar 25, 2025

Conversation

cferris1000 (Contributor)

In the caching part of the secondary path, when about to release memory to the OS, we always block while acquiring the lock. However, if multiple threads attempt this at the same time, the threads that waited will likely find nothing left to release once they acquire the lock. Change the algorithm so a thread skips the release if another release is already in progress.

Also, hoist the lock out of the releaseOlderThan function. In the store path we can then use tryLock and skip the release if another thread is already releasing, while a forced release call still waits for the lock, which guarantees that all entries are released by the time that call returns.
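As a quick illustration of the intended locking behavior, here is a minimal standalone sketch of the try-lock-or-skip pattern. It uses std::mutex and std::atomic in place of scudo's HybridMutex, ScopedLock, and atomic_uptr, and the class and method names are made up for the example; the real change is in the diff below.

#include <atomic>
#include <cstdint>
#include <mutex>

// Illustrative only: std::mutex / std::atomic stand in for scudo's
// HybridMutex and atomic_uptr, and the member names are simplified.
class CacheSketch {
public:
  // Store path: opportunistic release. If another thread already holds the
  // lock it is almost certainly releasing the same entries, so skip instead
  // of blocking, and count the skip so it shows up in the stats.
  void maybeReleaseOlderThan(uint64_t TimeNs, int32_t IntervalMs) {
    if (IntervalMs < 0)
      return;
    if (Mutex.try_lock()) {
      releaseOlderThan(TimeNs - static_cast<uint64_t>(IntervalMs) * 1000000);
      Mutex.unlock();
    } else {
      ReleaseToOsSkips.fetch_add(1, std::memory_order_relaxed);
    }
  }

  // Forced path: a full release must never be skipped, so always wait for
  // the lock; when this returns, every eligible entry has been released.
  void releaseToOS() {
    std::lock_guard<std::mutex> L(Mutex);
    releaseOlderThan(UINT64_MAX);
  }

private:
  // Caller must hold Mutex (REQUIRES(Mutex) in the real code).
  void releaseOlderThan(uint64_t Time) { (void)Time; /* walk the LRU list */ }

  std::mutex Mutex;
  std::atomic<size_t> ReleaseToOsSkips{0};
};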

llvmbot commented Mar 24, 2025

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Christopher Ferris (cferris1000)


Full diff: https://github.com/llvm/llvm-project/pull/132827.diff

1 File Affected:

  • (modified) compiler-rt/lib/scudo/standalone/secondary.h (+24-10)
diff --git a/compiler-rt/lib/scudo/standalone/secondary.h b/compiler-rt/lib/scudo/standalone/secondary.h
index f3f91c40e7a03..286e5d332f57c 100644
--- a/compiler-rt/lib/scudo/standalone/secondary.h
+++ b/compiler-rt/lib/scudo/standalone/secondary.h
@@ -206,11 +206,13 @@ class MapAllocatorCache {
     computePercentage(SuccessfulRetrieves, CallsToRetrieve, &Integral,
                       &Fractional);
     const s32 Interval = atomic_load_relaxed(&ReleaseToOsIntervalMs);
-    Str->append(
-        "Stats: MapAllocatorCache: EntriesCount: %zu, "
-        "MaxEntriesCount: %u, MaxEntrySize: %zu, ReleaseToOsIntervalMs = %d\n",
-        LRUEntries.size(), atomic_load_relaxed(&MaxEntriesCount),
-        atomic_load_relaxed(&MaxEntrySize), Interval >= 0 ? Interval : -1);
+    Str->append("Stats: MapAllocatorCache: EntriesCount: %zu, "
+                "MaxEntriesCount: %u, MaxEntrySize: %zu, ReleaseToOsSkips: "
+                "%zu, ReleaseToOsIntervalMs = %d\n",
+                LRUEntries.size(), atomic_load_relaxed(&MaxEntriesCount),
+                atomic_load_relaxed(&MaxEntrySize),
+                atomic_load_relaxed(&ReleaseToOsSkips),
+                Interval >= 0 ? Interval : -1);
     Str->append("Stats: CacheRetrievalStats: SuccessRate: %u/%u "
                 "(%zu.%02zu%%)\n",
                 SuccessfulRetrieves, CallsToRetrieve, Integral, Fractional);
@@ -343,8 +345,15 @@ class MapAllocatorCache {
       unmapCallBack(EvictMemMap);
 
     if (Interval >= 0) {
-      // TODO: Add ReleaseToOS logic to LRU algorithm
-      releaseOlderThan(Time - static_cast<u64>(Interval) * 1000000);
+      // It is very likely that multiple threads trying to do a release at the
+      // same time will not actually release any extra elements. Therefore,
+      // let any other thread continue, skipping the release.
+      if (Mutex.tryLock()) {
+        // TODO: Add ReleaseToOS logic to LRU algorithm
+        releaseOlderThan(Time - static_cast<u64>(Interval) * 1000000);
+        Mutex.unlock();
+      } else
+        atomic_fetch_add(&ReleaseToOsSkips, 1U, memory_order_relaxed);
     }
   }
 
@@ -488,7 +497,12 @@ class MapAllocatorCache {
     return true;
   }
 
-  void releaseToOS() { releaseOlderThan(UINT64_MAX); }
+  void releaseToOS() EXCLUDES(Mutex) {
+    // Since this is a request to release everything, always wait for the
+    // lock so that we guarantee all entries are released after this call.
+    ScopedLock L(Mutex);
+    releaseOlderThan(UINT64_MAX);
+  }
 
   void disableMemoryTagging() EXCLUDES(Mutex) {
     ScopedLock L(Mutex);
@@ -554,8 +568,7 @@ class MapAllocatorCache {
     Entry.Time = 0;
   }
 
-  void releaseOlderThan(u64 Time) EXCLUDES(Mutex) {
-    ScopedLock L(Mutex);
+  void releaseOlderThan(u64 Time) REQUIRES(Mutex) {
     if (!LRUEntries.size() || OldestTime == 0 || OldestTime > Time)
       return;
     OldestTime = 0;
@@ -573,6 +586,7 @@ class MapAllocatorCache {
   atomic_s32 ReleaseToOsIntervalMs = {};
   u32 CallsToRetrieve GUARDED_BY(Mutex) = 0;
   u32 SuccessfulRetrieves GUARDED_BY(Mutex) = 0;
+  atomic_uptr ReleaseToOsSkips = {};
 
   CachedBlock Entries[Config::getEntriesArraySize()] GUARDED_BY(Mutex) = {};
   NonZeroLengthArray<CachedBlock, Config::getQuarantineSize()>

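For readers less familiar with Clang's thread-safety analysis, the EXCLUDES(Mutex) to REQUIRES(Mutex) change on releaseOlderThan is what shifts the locking responsibility to the callers. Below is a rough sketch of the two annotations written directly against Clang's capability attributes (scudo wraps these in its own macros); the SketchMutex/SketchCache names are made up, and this assumes compilation with -Wthread-safety.

// Sketch only; attribute spellings follow Clang's thread-safety documentation.
#define CAPABILITY(x) __attribute__((capability(x)))
#define ACQUIRE(...) __attribute__((acquire_capability(__VA_ARGS__)))
#define RELEASE(...) __attribute__((release_capability(__VA_ARGS__)))
#define REQUIRES(...) __attribute__((requires_capability(__VA_ARGS__)))
#define EXCLUDES(...) __attribute__((locks_excluded(__VA_ARGS__)))

class CAPABILITY("mutex") SketchMutex {
public:
  void lock() ACQUIRE();
  void unlock() RELEASE();
};

class SketchCache {
  SketchMutex M;

  // Old shape: the function took the lock itself, so callers must not hold M.
  void releaseOlderThanLocked(unsigned long long Time) EXCLUDES(M);

  // New shape: locking is hoisted to the callers (tryLock in the store path,
  // ScopedLock in releaseToOS), so the analysis checks that M is held here.
  void releaseOlderThan(unsigned long long Time) REQUIRES(M);
};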
cferris1000 (Contributor, Author)

I verified that system_server on Android devices does hit this in regular operation, so this seems worthwhile.

ajordanr-google (Contributor) left a comment

This all looks good, fairly clear, and I see the value of this! Also neat that it's represented in the stats. This seems like a strict improvement over the previous setup.

cferris1000 merged commit 134cb88 into llvm:main on Mar 25, 2025
14 checks passed
cferris1000 deleted the secondary_trylock branch on March 25, 2025 at 23:32