Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[scudo] Add hooks to mark the range of realloc #73883

Closed
wants to merge 0 commits into from

Conversation

ChiaHungDuan
Copy link
Contributor

realloc may involve both allocation and deallocation. Given that the reporting the events is not atomic and which may lead the hook user to a false case that the double-use pattern happens. In general, this can be resolved on the hook side. To alleviate the task of handling it, we add two new hooks to mark the range and reorder the reporting.

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 30, 2023

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: None (ChiaHungDuan)

Changes

realloc may involve both allocation and deallocation. Given that the reporting the events is not atomic and which may lead the hook user to a false case that the double-use pattern happens. In general, this can be resolved on the hook side. To alleviate the task of handling it, we add two new hooks to mark the range and reorder the reporting.


Full diff: https://github.com/llvm/llvm-project/pull/73883.diff

4 Files Affected:

  • (modified) compiler-rt/lib/scudo/standalone/combined.h (+9-2)
  • (modified) compiler-rt/lib/scudo/standalone/include/scudo/interface.h (+6-1)
  • (modified) compiler-rt/lib/scudo/standalone/tests/wrappers_c_test.cpp (+36-6)
  • (modified) compiler-rt/lib/scudo/standalone/wrappers_c.inc (+49-16)
diff --git a/compiler-rt/lib/scudo/standalone/combined.h b/compiler-rt/lib/scudo/standalone/combined.h
index b1700e5ecef7f5b..0a7d6492e1194ab 100644
--- a/compiler-rt/lib/scudo/standalone/combined.h
+++ b/compiler-rt/lib/scudo/standalone/combined.h
@@ -827,10 +827,15 @@ class Allocator {
   // for it, which then forces realloc to copy the usable size of a chunk as
   // opposed to its actual size.
   uptr getUsableSize(const void *Ptr) {
-    initThreadMaybe();
     if (UNLIKELY(!Ptr))
       return 0;
 
+    return getAllocSize(Ptr);
+  }
+
+  uptr getAllocSize(const void *Ptr) {
+    initThreadMaybe();
+
 #ifdef GWP_ASAN_HOOKS
     if (UNLIKELY(GuardedAlloc.pointerIsMine(Ptr)))
       return GuardedAlloc.getSize(Ptr);
@@ -839,9 +844,11 @@ class Allocator {
     Ptr = getHeaderTaggedPointer(const_cast<void *>(Ptr));
     Chunk::UnpackedHeader Header;
     Chunk::loadHeader(Cookie, Ptr, &Header);
-    // Getting the usable size of a chunk only makes sense if it's allocated.
+
+    // Getting the alloc size of a chunk only makes sense if it's allocated.
     if (UNLIKELY(Header.State != Chunk::State::Allocated))
       reportInvalidChunkState(AllocatorAction::Sizing, const_cast<void *>(Ptr));
+
     return getSize(Ptr, &Header);
   }
 
diff --git a/compiler-rt/lib/scudo/standalone/include/scudo/interface.h b/compiler-rt/lib/scudo/standalone/include/scudo/interface.h
index 6c0c521f8d82f85..e4534fc9ec2f721 100644
--- a/compiler-rt/lib/scudo/standalone/include/scudo/interface.h
+++ b/compiler-rt/lib/scudo/standalone/include/scudo/interface.h
@@ -17,10 +17,15 @@ extern "C" {
 __attribute__((weak)) const char *__scudo_default_options(void);
 
 // Post-allocation & pre-deallocation hooks.
-// They must be thread-safe and not use heap related functions.
 __attribute__((weak)) void __scudo_allocate_hook(void *ptr, size_t size);
 __attribute__((weak)) void __scudo_deallocate_hook(void *ptr);
 
+// These hooks are used to mark the scope of doing realloc(). Note that the
+// allocation/deallocation are still reported through the hooks above, this is
+// only used when you want to group two operations in the realloc().
+__attribute__((weak)) void __scudo_realloc_begin_hook(void *old_ptr);
+__attribute__((weak)) void __scudo_realloc_end_hook(void *old_ptr);
+
 void __scudo_print_stats(void);
 
 typedef void (*iterate_callback)(uintptr_t base, size_t size, void *arg);
diff --git a/compiler-rt/lib/scudo/standalone/tests/wrappers_c_test.cpp b/compiler-rt/lib/scudo/standalone/tests/wrappers_c_test.cpp
index 623550535b6c8e2..7b47c872ae88d71 100644
--- a/compiler-rt/lib/scudo/standalone/tests/wrappers_c_test.cpp
+++ b/compiler-rt/lib/scudo/standalone/tests/wrappers_c_test.cpp
@@ -61,8 +61,13 @@ struct AllocContext {
 struct DeallocContext {
   void *Ptr;
 };
+struct ReallocContext {
+  void *Begin;
+  void *End;
+};
 static AllocContext AC;
 static DeallocContext DC;
+static ReallocContext RC;
 
 #if (SCUDO_ENABLE_HOOKS_TESTS == 1)
 __attribute__((visibility("default"))) void __scudo_allocate_hook(void *Ptr,
@@ -73,6 +78,14 @@ __attribute__((visibility("default"))) void __scudo_allocate_hook(void *Ptr,
 __attribute__((visibility("default"))) void __scudo_deallocate_hook(void *Ptr) {
   DC.Ptr = Ptr;
 }
+__attribute__((visibility("default"))) void
+__scudo_realloc_begin_hook(void *Ptr) {
+  RC.Begin = Ptr;
+}
+__attribute__((visibility("default"))) void
+__scudo_realloc_end_hook(void *Ptr) {
+  RC.End = Ptr;
+}
 #endif // (SCUDO_ENABLE_HOOKS_TESTS == 1)
 }
 
@@ -83,9 +96,12 @@ class ScudoWrappersCTest : public Test {
       printf("Hooks are enabled but hooks tests are disabled.\n");
   }
 
-  void invalidateAllocHookPtrAs(UNUSED void *Ptr) {
-    if (SCUDO_ENABLE_HOOKS_TESTS)
+  void invalidateHookPtrsAs(UNUSED void *Ptr) {
+    if (SCUDO_ENABLE_HOOKS_TESTS) {
       AC.Ptr = Ptr;
+      DC.Ptr = Ptr;
+      RC.Begin = RC.End = Ptr;
+    }
   }
   void verifyAllocHookPtr(UNUSED void *Ptr) {
     if (SCUDO_ENABLE_HOOKS_TESTS)
@@ -99,6 +115,12 @@ class ScudoWrappersCTest : public Test {
     if (SCUDO_ENABLE_HOOKS_TESTS)
       EXPECT_EQ(Ptr, DC.Ptr);
   }
+  void verifyReallocHooksScope(UNUSED void *Ptr) {
+    if (SCUDO_ENABLE_HOOKS_TESTS) {
+      EXPECT_EQ(Ptr, RC.Begin);
+      EXPECT_EQ(Ptr, RC.End);
+    }
+  }
 };
 using ScudoWrappersCDeathTest = ScudoWrappersCTest;
 
@@ -258,26 +280,30 @@ TEST_F(ScudoWrappersCTest, AlignedAlloc) {
 }
 
 TEST_F(ScudoWrappersCDeathTest, Realloc) {
+  invalidateHookPtrsAs(reinterpret_cast<void *>(0xdeadbeef));
   // realloc(nullptr, N) is malloc(N)
   void *P = realloc(nullptr, Size);
   EXPECT_NE(P, nullptr);
   verifyAllocHookPtr(P);
   verifyAllocHookSize(Size);
+  verifyReallocHooksScope(nullptr);
   free(P);
   verifyDeallocHookPtr(P);
 
+  invalidateHookPtrsAs(reinterpret_cast<void *>(0xdeadbeef));
   P = malloc(Size);
   EXPECT_NE(P, nullptr);
   // realloc(P, 0U) is free(P) and returns nullptr
   EXPECT_EQ(realloc(P, 0U), nullptr);
   verifyDeallocHookPtr(P);
+  verifyReallocHooksScope(P);
 
   P = malloc(Size);
   EXPECT_NE(P, nullptr);
   EXPECT_LE(Size, malloc_usable_size(P));
   memset(P, 0x42, Size);
 
-  invalidateAllocHookPtrAs(reinterpret_cast<void *>(0xdeadbeef));
+  invalidateHookPtrsAs(reinterpret_cast<void *>(0xdeadbeef));
   void *OldP = P;
   P = realloc(P, Size * 2U);
   EXPECT_NE(P, nullptr);
@@ -285,14 +311,16 @@ TEST_F(ScudoWrappersCDeathTest, Realloc) {
   for (size_t I = 0; I < Size; I++)
     EXPECT_EQ(0x42, (reinterpret_cast<uint8_t *>(P))[I]);
   if (OldP == P) {
-    verifyAllocHookPtr(reinterpret_cast<void *>(0xdeadbeef));
+    verifyDeallocHookPtr(OldP);
+    verifyAllocHookPtr(OldP);
   } else {
     verifyAllocHookPtr(P);
     verifyAllocHookSize(Size * 2U);
     verifyDeallocHookPtr(OldP);
   }
+  verifyReallocHooksScope(OldP);
 
-  invalidateAllocHookPtrAs(reinterpret_cast<void *>(0xdeadbeef));
+  invalidateHookPtrsAs(reinterpret_cast<void *>(0xdeadbeef));
   OldP = P;
   P = realloc(P, Size / 2U);
   EXPECT_NE(P, nullptr);
@@ -300,11 +328,13 @@ TEST_F(ScudoWrappersCDeathTest, Realloc) {
   for (size_t I = 0; I < Size / 2U; I++)
     EXPECT_EQ(0x42, (reinterpret_cast<uint8_t *>(P))[I]);
   if (OldP == P) {
-    verifyAllocHookPtr(reinterpret_cast<void *>(0xdeadbeef));
+    verifyDeallocHookPtr(OldP);
+    verifyAllocHookPtr(OldP);
   } else {
     verifyAllocHookPtr(P);
     verifyAllocHookSize(Size / 2U);
   }
+  verifyReallocHooksScope(OldP);
   free(P);
 
   EXPECT_DEATH(P = realloc(P, Size), "");
diff --git a/compiler-rt/lib/scudo/standalone/wrappers_c.inc b/compiler-rt/lib/scudo/standalone/wrappers_c.inc
index e213da73393b5da..fde459bf5bca631 100644
--- a/compiler-rt/lib/scudo/standalone/wrappers_c.inc
+++ b/compiler-rt/lib/scudo/standalone/wrappers_c.inc
@@ -27,6 +27,51 @@ static void reportDeallocation(void *ptr) {
     if (__scudo_deallocate_hook)
       __scudo_deallocate_hook(ptr);
 }
+static void reportReallocBegin(void *old_ptr) {
+  if (SCUDO_ENABLE_HOOKS)
+    if (__scudo_realloc_begin_hook)
+      __scudo_realloc_begin_hook(old_ptr);
+}
+static void reportReallocEnd(void *old_ptr) {
+  if (SCUDO_ENABLE_HOOKS)
+    if (__scudo_realloc_end_hook)
+      __scudo_realloc_end_hook(old_ptr);
+}
+
+static void *reallocImpl(void *ptr, size_t size) {
+  if (!ptr) {
+    void *Ptr = SCUDO_ALLOCATOR.allocate(size, scudo::Chunk::Origin::Malloc,
+                                         SCUDO_MALLOC_ALIGNMENT);
+    reportAllocation(Ptr, size);
+    return Ptr;
+  }
+  if (size == 0) {
+    reportDeallocation(ptr);
+    SCUDO_ALLOCATOR.deallocate(ptr, scudo::Chunk::Origin::Malloc);
+    return nullptr;
+  }
+
+  // Given that the reporting of deallocation and allocation are not atomic, we
+  // always pretend the old pointer will be released so that the user don't need
+  // to worry about the false double-use case from the view of hooks.
+  // For example, before the reporting all the events in the `realloc` has
+  // finished, another thread may have got the pointer released by the
+  // `realloc` and report it to the hook. It may be misinterpreted as double-use
+  // if it's not handled properly on the hook side.
+  reportDeallocation(ptr);
+  void *NewPtr = SCUDO_ALLOCATOR.reallocate(ptr, size, SCUDO_MALLOC_ALIGNMENT);
+  if (NewPtr != nullptr) {
+    // Note that even if NewPtr == ptr, the size has changed. We still need to
+    // report the new size.
+    reportAllocation(NewPtr, size);
+  } else {
+    // If `realloc` fails, the old pointer is not released. Report the old
+    // pointer as allocated.
+    reportAllocation(ptr, SCUDO_ALLOCATOR.getAllocSize(ptr));
+  }
+
+  return NewPtr;
+}
 
 extern "C" {
 
@@ -163,23 +208,11 @@ INTERFACE WEAK void *SCUDO_PREFIX(pvalloc)(size_t size) {
 }
 
 INTERFACE WEAK void *SCUDO_PREFIX(realloc)(void *ptr, size_t size) {
-  if (!ptr) {
-    void *Ptr = SCUDO_ALLOCATOR.allocate(size, scudo::Chunk::Origin::Malloc,
-                                         SCUDO_MALLOC_ALIGNMENT);
-    reportAllocation(Ptr, size);
-    return scudo::setErrnoOnNull(Ptr);
-  }
-  if (size == 0) {
-    reportDeallocation(ptr);
-    SCUDO_ALLOCATOR.deallocate(ptr, scudo::Chunk::Origin::Malloc);
-    return nullptr;
-  }
+  reportReallocBegin(ptr);
 
-  void *NewPtr = SCUDO_ALLOCATOR.reallocate(ptr, size, SCUDO_MALLOC_ALIGNMENT);
-  if (NewPtr != ptr) {
-    reportAllocation(NewPtr, size);
-    reportDeallocation(ptr);
-  }
+  void *NewPtr = reallocImpl(ptr, size);
+
+  reportReallocEnd(ptr);
 
   return scudo::setErrnoOnNull(NewPtr);
 }

__attribute__((weak)) void __scudo_allocate_hook(void *ptr, size_t size);
__attribute__((weak)) void __scudo_deallocate_hook(void *ptr);

// These hooks are used to mark the scope of doing realloc(). Note that the
// allocation/deallocation are still reported through the hooks above, this is
// only used when you want to group two operations in the realloc().
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to describe this as:

only used when a hook user wants to know that the deallocate/allocate operation are a single realloc operation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Done

@@ -163,23 +208,11 @@ INTERFACE WEAK void *SCUDO_PREFIX(pvalloc)(size_t size) {
}

INTERFACE WEAK void *SCUDO_PREFIX(realloc)(void *ptr, size_t size) {
if (!ptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it make more sense to leave the !ptr and size == 0 checks here before calling the callback functions. Even though the original user did call realloc in these cases, the actual operations are not reallocs, so I think it make sense to skip the reportRealloc hooks in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hesitating if we want to do that. My argument is that these two cases are defined in the realloc spec so it's still part of the realloc logic.

For some cases (may not be useful through...), you can tell the differences between deallocations from free() and realloc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I don't have strong opinion on this. Let me know if you think it's better to exclude them

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong opinion either, but I'd lean on the side of handling them separately. It would make it easier to describe the contract of the new hooks (always __scudo_realloc_begin_hook followed by __scudo_deallocate_hook followed by __scudo_allocate_hook followed by __scudo_realloc_end_hook) and client code wouldn't have to second-guess what the intent of the realloc call was.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wound't say there's the contract always __scudo_realloc_begin_hook followed by __scudo_deallocate_hook followed by __scudo_allocate_hook followed by __scudo_realloc_end_hook BTW. It's still reasonable to do it in old way but with the scope markers. To me, it says the begin/end for realloc, covering the whole range then it does let the user doesn't need to second-guess what the intent is.

But like I said, no strong opinion here. Will change it to the form desired

if (!ptr) {
void *Ptr = SCUDO_ALLOCATOR.allocate(size, scudo::Chunk::Origin::Malloc,
SCUDO_MALLOC_ALIGNMENT);
reportAllocation(Ptr, size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this isn't your problem, but if Ptr is nullptr, should the reportAllocation hook be called? This wouldn't normally happen, but if you are working with user data setting the size, I can see cases where an invalid size is passed in and fails to allocate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have covered that case. In reportAllocation, it only reports the ptr if it's non-nullptr.

static void reportAllocation(void *ptr, size_t size) {                              
  if (SCUDO_ENABLE_HOOKS)                                                           
    if (__scudo_allocate_hook && ptr)                                               
      __scudo_allocate_hook(ptr, size);                                             
}               

compiler-rt/lib/scudo/standalone/wrappers_c.inc Outdated Show resolved Hide resolved
compiler-rt/lib/scudo/standalone/wrappers_c.inc Outdated Show resolved Hide resolved
// always pretend the old pointer will be released so that the user don't need
// to worry about the false double-use case from the view of hooks.
// For example, before the reporting all the events in the `realloc` has
// finished, another thread may have got the pointer released by the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another thread may have got the pointer released by the -> another thread may have released the pointer in realloc and reported it through the hook.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, I try to say that another thread gets the old pointer released by realloc.
Reword it a little bit. Let me know if it's more clear

__attribute__((weak)) void __scudo_allocate_hook(void *ptr, size_t size);
__attribute__((weak)) void __scudo_deallocate_hook(void *ptr);

// These hooks are used to mark the scope of doing realloc(). Note that the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a more detailed description of the order in which the hooks are called? e.g.

  1. __scudo_realloc_begin_hook is always called at the beginning of realloc
  2. either __scudo_allocate_hook or __scudo_deallocate_hook or both (__scudo_deallocate_hook and then __scudo_allocate_hook)
  3. __scudo_realloc_end_hook is always called at the end

I know that it's easy to see what the code does, but having a comment explicitly describing the intended behavior can be useful in my opinion to distinguish what is official and what is an implementation detail

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the hook names are self-explained. Every hooks called in between begin/end indicate they are called during realloc. I'm not sure those comments will make it more clear

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is that I expect clients to build their mini-finite-state-machines to process realloc notifications. Having known and well-defined transitions would help maintaining them

@ChiaHungDuan
Copy link
Contributor Author

The reorder of invocation of deallocation/allocation hooks has been split out to #74149, this one will change to add new hooks only

@ChiaHungDuan
Copy link
Contributor Author

For some reason I just created another one for the rest part of this change in #74353

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants