Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang][runtime] Validate pointer DEALLOCATE #78612

Merged
merged 1 commit into from
Jan 25, 2024
Merged

Conversation

klausler
Copy link
Contributor

The standard requires a compiler to diagnose an incorrect use of a pointer in a DEALLOCATE statement. The pointer must be associated with an entire object that was allocated as a pointer (not allocatable) by an ALLOCATE statement.

Implement by appending a validation footer to pointer allocations. This is an extra allocated word that encodes the base address of the allocation. If it is not found after the data payload when the pointer is deallocated, signal an error. There is a chance of a false positive result, but that should be vanishingly unlikely.

This change requires all pointer allocations (not allocatables) to take place in the runtime in PointerAllocate(), which might be slower in cases that could otherwise be handled with a native memory allocation operation. I believe that memory allocation of pointers is less common than with allocatables, which are not affected. If this turns out to become a performance problem, we can inline the creation and initialization of the footer word.

Fixes #78391.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 18, 2024

@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-flang-runtime

Author: Peter Klausler (klausler)

Changes

The standard requires a compiler to diagnose an incorrect use of a pointer in a DEALLOCATE statement. The pointer must be associated with an entire object that was allocated as a pointer (not allocatable) by an ALLOCATE statement.

Implement by appending a validation footer to pointer allocations. This is an extra allocated word that encodes the base address of the allocation. If it is not found after the data payload when the pointer is deallocated, signal an error. There is a chance of a false positive result, but that should be vanishingly unlikely.

This change requires all pointer allocations (not allocatables) to take place in the runtime in PointerAllocate(), which might be slower in cases that could otherwise be handled with a native memory allocation operation. I believe that memory allocation of pointers is less common than with allocatables, which are not affected. If this turns out to become a performance problem, we can inline the creation and initialization of the footer word.

Fixes #78391.


Full diff: https://github.com/llvm/llvm-project/pull/78612.diff

8 Files Affected:

  • (modified) flang/include/flang/Runtime/descriptor.h (+1)
  • (modified) flang/include/flang/Runtime/magic-numbers.h (+5)
  • (modified) flang/lib/Lower/Allocatable.cpp (+3-1)
  • (modified) flang/runtime/descriptor.cpp (+5-1)
  • (modified) flang/runtime/pointer.cpp (+35-8)
  • (modified) flang/runtime/stat.cpp (+4)
  • (modified) flang/runtime/stat.h (+1)
  • (modified) flang/test/Lower/Intrinsics/c_loc.f90 (+31-14)
diff --git a/flang/include/flang/Runtime/descriptor.h b/flang/include/flang/Runtime/descriptor.h
index e36b37c1a917eb..38ab4155735976 100644
--- a/flang/include/flang/Runtime/descriptor.h
+++ b/flang/include/flang/Runtime/descriptor.h
@@ -375,6 +375,7 @@ class Descriptor {
   // allocation.  Does not allocate automatic components or
   // perform default component initialization.
   RT_API_ATTRS int Allocate();
+  RT_API_ATTRS void SetByteStrides();
 
   // Deallocates storage; does not call FINAL subroutines or
   // deallocate allocatable/automatic components.
diff --git a/flang/include/flang/Runtime/magic-numbers.h b/flang/include/flang/Runtime/magic-numbers.h
index 196b13ad3755b3..38ccc5e7d3df65 100644
--- a/flang/include/flang/Runtime/magic-numbers.h
+++ b/flang/include/flang/Runtime/magic-numbers.h
@@ -63,6 +63,11 @@ same allocatable.
 #endif
 #define FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE 109
 
+#if 0
+Additional status code for a bad pointer DEALLOCATE.
+#endif
+#define FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION 110
+
 #if 0
 ieee_class_type values
 The sequence is that of F18 Clause 17.2p3, but nothing depends on that.
diff --git a/flang/lib/Lower/Allocatable.cpp b/flang/lib/Lower/Allocatable.cpp
index 898f34786a248e..affb483e5b82b4 100644
--- a/flang/lib/Lower/Allocatable.cpp
+++ b/flang/lib/Lower/Allocatable.cpp
@@ -454,7 +454,9 @@ class AllocateStmtHelper {
                            const fir::MutableBoxValue &box) {
     if (!box.isDerived() && !errorManager.hasStatSpec() &&
         !alloc.type.IsPolymorphic() && !alloc.hasCoarraySpec() &&
-        !useAllocateRuntime) {
+        !useAllocateRuntime && !box.isPointer()) {
+      // Pointers must use PointerAllocate so that their deallocations
+      // can be validated.
       genInlinedAllocation(alloc, box);
       return;
     }
diff --git a/flang/runtime/descriptor.cpp b/flang/runtime/descriptor.cpp
index 34ca33a6a8e306..e1f1602756b3a2 100644
--- a/flang/runtime/descriptor.cpp
+++ b/flang/runtime/descriptor.cpp
@@ -162,6 +162,11 @@ RT_API_ATTRS int Descriptor::Allocate() {
   }
   // TODO: image synchronization
   raw_.base_addr = p;
+  SetByteStrides();
+  return 0;
+}
+
+RT_API_ATTRS void Descriptor::SetByteStrides() {
   if (int dims{rank()}) {
     std::size_t stride{ElementBytes()};
     for (int j{0}; j < dims; ++j) {
@@ -170,7 +175,6 @@ RT_API_ATTRS int Descriptor::Allocate() {
       stride *= dimension.Extent();
     }
   }
-  return 0;
 }
 
 RT_API_ATTRS int Descriptor::Destroy(
diff --git a/flang/runtime/pointer.cpp b/flang/runtime/pointer.cpp
index f83c00089813eb..a565f960aae8f5 100644
--- a/flang/runtime/pointer.cpp
+++ b/flang/runtime/pointer.cpp
@@ -129,17 +129,32 @@ int RTDEF(PointerAllocate)(Descriptor &pointer, bool hasStat,
   if (!pointer.IsPointer()) {
     return ReturnError(terminator, StatInvalidDescriptor, errMsg, hasStat);
   }
-  int stat{ReturnError(terminator, pointer.Allocate(), errMsg, hasStat)};
-  if (stat == StatOk) {
-    if (const DescriptorAddendum * addendum{pointer.Addendum()}) {
-      if (const auto *derived{addendum->derivedType()}) {
-        if (!derived->noInitializationNeeded()) {
-          stat = Initialize(pointer, *derived, terminator, hasStat, errMsg);
-        }
+  std::size_t byteSize{pointer.Elements() * pointer.ElementBytes()};
+  // Add space for a footer to validate during DEALLOCATE.
+  constexpr std::size_t align{sizeof(std::uintptr_t)};
+  byteSize = ((byteSize + align - 1) / align) * align;
+  std::size_t total{byteSize + sizeof(std::uintptr_t)};
+  void *p{std::malloc(total)};
+  if (!p) {
+    return ReturnError(terminator, CFI_ERROR_MEM_ALLOCATION, errMsg, hasStat);
+  }
+  pointer.set_base_addr(p);
+  pointer.SetByteStrides();
+  // Fill the footer word with the XOR of the ones' complement of
+  // the base address, which is a value that would be highly unlikely
+  // to appear accidentally at the right spot.
+  std::uintptr_t *footer{
+      reinterpret_cast<std::uintptr_t *>(static_cast<char *>(p) + byteSize)};
+  *footer = ~reinterpret_cast<std::uintptr_t>(p);
+  int stat{StatOk};
+  if (const DescriptorAddendum * addendum{pointer.Addendum()}) {
+    if (const auto *derived{addendum->derivedType()}) {
+      if (!derived->noInitializationNeeded()) {
+        stat = Initialize(pointer, *derived, terminator, hasStat, errMsg);
       }
     }
   }
-  return stat;
+  return ReturnError(terminator, stat, errMsg, hasStat);
 }
 
 int RTDEF(PointerAllocateSource)(Descriptor &pointer, const Descriptor &source,
@@ -163,6 +178,18 @@ int RTDEF(PointerDeallocate)(Descriptor &pointer, bool hasStat,
   if (!pointer.IsAllocated()) {
     return ReturnError(terminator, StatBaseNull, errMsg, hasStat);
   }
+  // Validate the footer.  This should fail if the pointer doesn't
+  // span the entire object, or the object was not allocated as a
+  // pointer.
+  std::size_t byteSize{pointer.Elements() * pointer.ElementBytes()};
+  constexpr std::size_t align{sizeof(std::uintptr_t)};
+  byteSize = ((byteSize + align - 1) / align) * align;
+  void *p{pointer.raw().base_addr};
+  std::uintptr_t *footer{
+      reinterpret_cast<std::uintptr_t *>(static_cast<char *>(p) + byteSize)};
+  if (*footer != ~reinterpret_cast<std::uintptr_t>(p)) {
+    return ReturnError(terminator, StatBadPointerDeallocation, errMsg, hasStat);
+  }
   return ReturnError(terminator,
       pointer.Destroy(/*finalize=*/true, /*destroyPointers=*/true, &terminator),
       errMsg, hasStat);
diff --git a/flang/runtime/stat.cpp b/flang/runtime/stat.cpp
index 24368fa6a1ae1a..525a4e36cdc773 100644
--- a/flang/runtime/stat.cpp
+++ b/flang/runtime/stat.cpp
@@ -66,6 +66,10 @@ RT_API_ATTRS const char *StatErrorString(int stat) {
   case StatMoveAllocSameAllocatable:
     return "MOVE_ALLOC passed the same address as to and from";
 
+  case StatBadPointerDeallocation:
+    return "DEALLOCATE of a pointer that is not the whole content of a pointer "
+           "ALLOCATE";
+
   default:
     return nullptr;
   }
diff --git a/flang/runtime/stat.h b/flang/runtime/stat.h
index e2b0658b122c97..55cdac46eb3a57 100644
--- a/flang/runtime/stat.h
+++ b/flang/runtime/stat.h
@@ -51,6 +51,7 @@ enum Stat {
   StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT,
   StatMoveAllocSameAllocatable =
       FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE,
+  StatBadPointerDeallocation = FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION,
 };
 
 RT_API_ATTRS const char *StatErrorString(int);
diff --git a/flang/test/Lower/Intrinsics/c_loc.f90 b/flang/test/Lower/Intrinsics/c_loc.f90
index e73e63913566c9..f46b80fd9b980e 100644
--- a/flang/test/Lower/Intrinsics/c_loc.f90
+++ b/flang/test/Lower/Intrinsics/c_loc.f90
@@ -177,20 +177,37 @@ subroutine c_loc_arraysection()
 ! CHECK:         %[[VAL_2:.*]] = fir.zero_bits !fir.ptr<i32>
 ! CHECK:         fir.store %[[VAL_2]] to %[[VAL_1]] : !fir.ref<!fir.ptr<i32>>
 ! CHECK:         %[[VAL_3:.*]] = fir.alloca !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}> {bindc_name = "ptr", uniq_name = "_QFc_loc_non_save_pointer_scalarEptr"}
-! CHECK:         %[[VAL_4:.*]] = fir.allocmem i32 {fir.must_be_heap = true, uniq_name = "_QFc_loc_non_save_pointer_scalarEi.alloc"}
-! CHECK:         %[[VAL_5:.*]] = fir.convert %[[VAL_4]] : (!fir.heap<i32>) -> !fir.ptr<i32>
-! CHECK:         fir.store %[[VAL_5]] to %[[VAL_1]] : !fir.ref<!fir.ptr<i32>>
-! CHECK:         %[[VAL_6:.*]] = arith.constant 10 : i32
-! CHECK:         %[[VAL_7:.*]] = fir.load %[[VAL_1]] : !fir.ref<!fir.ptr<i32>>
-! CHECK:         fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ptr<i32>
-! CHECK:         %[[VAL_8:.*]] = fir.load %[[VAL_1]] : !fir.ref<!fir.ptr<i32>>
-! CHECK:         %[[VAL_9:.*]] = fir.embox %[[VAL_8]] : (!fir.ptr<i32>) -> !fir.box<i32>
-! CHECK:         %[[VAL_10:.*]] = fir.alloca !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>
-! CHECK-DAG:         %[[VAL_11:.*]] = fir.box_addr %[[VAL_9]] : (!fir.box<i32>) -> !fir.ref<i32>
-! CHECK-DAG:         %[[VAL_12:.*]] = fir.convert %[[VAL_11]] : (!fir.ref<i32>) -> i64
-! CHECK-DAG:         %[[VAL_13:.*]] = fir.field_index __address, !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>
-! CHECK-DAG:         %[[VAL_14:.*]] = fir.coordinate_of %[[VAL_10]], %[[VAL_13]] : (!fir.ref<!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>>, !fir.field) -> !fir.ref<i64>
-! CHECK:         fir.store %[[VAL_12]] to %[[VAL_14]] : !fir.ref<i64>
+! CHECK:         %[[VAL_false:.*]] = arith.constant false
+! CHECK:         %[[VAL_4:.*]] = fir.absent !fir.box<none>
+! CHECK:         %[[VAL_5:.*]] = fir.address_of(@{{.*}}) : !fir.ref<!fir.char<1,{{.*}}>>
+! CHECK:         %[[C_LN:.*]] = arith.constant {{.*}} : i32
+! CHECK:         %[[VAL_6:.*]] = fir.zero_bits !fir.ptr<i32>
+! CHECK:         %[[VAL_7:.*]] = fir.embox %[[VAL_6:.*]] : (!fir.ptr<i32>) -> !fir.box<!fir.ptr<i32>>
+! CHECK:         fir.store %[[VAL_7:.*]] to %[[VAL_0:.*]] : !fir.ref<!fir.box<!fir.ptr<i32>>>
+! CHECK:         %[[VAL_8:.*]] = fir.convert %[[VAL_0:.*]] : (!fir.ref<!fir.box<!fir.ptr<i32>>>) -> !fir.ref<!fir.box<none>>
+! CHECK:         %[[VAL_9:.*]] = fir.convert %[[VAL_5:.*]] : (!fir.ref<!fir.char<1,{{.*}}>>) -> !fir.ref<i8>
+! CHECK:         %[[VAL_10:.*]] = fir.call @_FortranAPointerAllocate(%[[VAL_8:.*]], %[[VAL_false:.*]], %[[VAL_4:.*]], %[[VAL_9:.*]], %[[C_LN:.*]]) fastmath<contract> : (!fir.ref<!fir.box<none>>, i1, !fir.box<none>, !fir.ref<i8>, i32) -> i32
+! CHECK:         %[[VAL_11:.*]] = fir.load %[[VAL_0:.*]] : !fir.ref<!fir.box<!fir.ptr<i32>>>
+! CHECK:         %[[VAL_12:.*]] = fir.box_addr %[[VAL_11:.*]] : (!fir.box<!fir.ptr<i32>>) -> !fir.ptr<i32>
+! CHECK:         fir.store %[[VAL_12:.*]] to %[[VAL_1:.*]] : !fir.ref<!fir.ptr<i32>>
+! CHECK:         %[[C_10:.*]] = arith.constant 10 : i32
+! CHECK:         %[[VAL_13:.*]] = fir.load %[[VAL_1:.*]] : !fir.ref<!fir.ptr<i32>>
+! CHECK:         fir.store %[[C_10]] to %[[VAL_13:.*]] : !fir.ptr<i32>
+! CHECK:         %[[VAL_14:.*]] = fir.load %[[VAL_1:.*]] : !fir.ref<!fir.ptr<i32>>
+! CHECK:         %[[VAL_15:.*]] = fir.embox %[[VAL_14:.*]] : (!fir.ptr<i32>) -> !fir.box<i32>
+! CHECK:         %[[VAL_16:.*]] = fir.alloca !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>
+! CHECK:         %[[VAL_17:.*]] = fir.field_index __address, !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>
+! CHECK:         %[[VAL_18:.*]] = fir.coordinate_of %[[VAL_16:.*]], %[[VAL_17:.*]] : (!fir.ref<!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>>, !fir.field) -> !fir.ref<i64>
+! CHECK:         %[[VAL_19:.*]] = fir.box_addr %[[VAL_15:.*]] : (!fir.box<i32>) -> !fir.ref<i32>
+! CHECK:         %[[VAL_20:.*]] = fir.convert %[[VAL_19:.*]] : (!fir.ref<i32>) -> i64
+! CHECK:         fir.store %[[VAL_20:.*]] to %[[VAL_18:.*]] : !fir.ref<i64>
+! CHECK:         %[[VAL_21:.*]] = fir.field_index __address, !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>
+! CHECK:         %[[VAL_22:.*]] = fir.coordinate_of %[[VAL_16:.*]], %[[VAL_21:.*]] : (!fir.ref<!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>>, !fir.field) -> !fir.ref<i64>
+! CHECK:         %[[VAL_23:.*]] = fir.field_index __address, !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>
+! CHECK:         %[[VAL_24:.*]] = fir.coordinate_of %[[VAL_3:.*]], %[[VAL_23:.*]] : (!fir.ref<!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>>, !fir.field) -> !fir.ref<i64>
+! CHECK:         %[[VAL_25:.*]] = fir.load %[[VAL_22:.*]] : !fir.ref<i64>
+! CHECK:         fir.store %[[VAL_25:.*]] to %[[VAL_24:.*]] : !fir.ref<i64>
+! CHECK:         return
 ! CHECK:       }
 
 subroutine c_loc_non_save_pointer_scalar()

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lowering change looks good, thanks.
I think your patch is safe to catch deallocation of allocated pointer subojects, but could still segfault if the program pass a pointer pointing to something that is not a pointer or pointer suboject. Although, given this last case always crashes today, this still seems to be an improvement.

void *p{pointer.raw().base_addr};
std::uintptr_t *footer{
reinterpret_cast<std::uintptr_t *>(static_cast<char *>(p) + byteSize)};
if (*footer != ~reinterpret_cast<std::uintptr_t>(p)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Playing the devil's advocate here, there is a slight chance for this *footer read to crash if this is a POINTER pointing to something like a whole allocatable (the read would be after the allocated memory for the allocatable), or pointing some array target that is neither an allocatable/pointer (the read could be outside of the stack/data memory).

But the only safe way I can think of to do the check your patch is adding without this issue would be to maintain some runtime pointer allocation table, and this may be overkill/no very parallelism friendly.

The standard requires a compiler to diagnose an incorrect use
of a pointer in a DEALLOCATE statement.  The pointer must be
associated with an entire object that was allocated as a pointer
(not allocatable) by an ALLOCATE statement.

Implement by appending a validation footer to pointer allocations.
This is an extra allocated word that encodes the base address of
the allocation.  If it is not found after the data payload when
the pointer is deallocated, signal an error.  There is a chance
of a false positive result, but that should be vanishingly
unlikely.

This change requires all pointer allocations (not allocatables)
to take place in the runtime in PointerAllocate(), which might
be slower in cases that could otherwise be handled with a native
memory allocation operation.  I believe that memory allocation
of pointers is less common than with allocatables, which are not
affected.  If this turns out to become a performance problem, we
can inline the creation and initialization of the footer word.

Fixes llvm#78391.
@klausler klausler merged commit a3bbe62 into llvm:main Jan 25, 2024
3 of 4 checks passed
@klausler klausler deleted the bug78391 branch January 25, 2024 22:44
@@ -454,7 +454,9 @@ class AllocateStmtHelper {
const fir::MutableBoxValue &box) {
if (!box.isDerived() && !errorManager.hasStatSpec() &&
!alloc.type.IsPolymorphic() && !alloc.hasCoarraySpec() &&
!useAllocateRuntime) {
!useAllocateRuntime && !box.isPointer()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be the same check in genDeallocate below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bad deallocations are being caught in my testing without it, but I don't really understand the flow here. Should I just add the check to genDeallocate() anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bet it's the presence of stat= in my tests that makes the difference. PointerDeallocate() should be called even without stat=. Will update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang:runtime flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Flang] Pointer deallocation for zero-size target with STAT specifier causes runtime segfault at –O0.
4 participants