Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang][cuda] Make sure to issue freemem for the allocated temp #98078

Merged
merged 3 commits into from
Jul 12, 2024

Conversation

clementval
Copy link
Contributor

When implicit data transfer is created, make sure we generate the freemem op on the allocmem result value and not the declare op value.

@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Jul 8, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 8, 2024

@llvm/pr-subscribers-flang-fir-hlfir

Author: Valentin Clement (バレンタイン クレメン) (clementval)

Changes

When implicit data transfer is created, make sure we generate the freemem op on the allocmem result value and not the declare op value.


Full diff: https://github.com/llvm/llvm-project/pull/98078.diff

2 Files Affected:

  • (modified) flang/lib/Lower/Bridge.cpp (+7-2)
  • (modified) flang/test/Lower/CUDA/cuda-data-transfer.cuf (+14-3)
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 60422dd336762a..60845f706defe1 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -4294,8 +4294,13 @@ class FirConverter : public Fortran::lower::AbstractConverter {
           auto [temp, cleanup] =
               hlfir::createTempFromMold(loc, builder, entity);
           auto needCleanup = fir::getIntIfConstant(cleanup);
-          if (needCleanup && *needCleanup)
-            temps.push_back(temp);
+          if (needCleanup && *needCleanup) {
+            if (auto declareOp =
+                    mlir::dyn_cast<hlfir::DeclareOp>(temp.getDefiningOp()))
+              temps.push_back(declareOp.getMemref());
+            else
+              temps.push_back(temp);
+          }
           addSymbol(sym,
                     hlfir::translateToExtendedValue(loc, builder, temp).first,
                     /*forced=*/true);
diff --git a/flang/test/Lower/CUDA/cuda-data-transfer.cuf b/flang/test/Lower/CUDA/cuda-data-transfer.cuf
index 5dbf39c58c449e..874c31c580719d 100644
--- a/flang/test/Lower/CUDA/cuda-data-transfer.cuf
+++ b/flang/test/Lower/CUDA/cuda-data-transfer.cuf
@@ -103,7 +103,7 @@ end
 ! CHECK: cuf.data_transfer %[[ADEV]]#1 to %[[DECL_TEMP]]#0 {transfer_kind = #cuf.cuda_transfer<device_host>} : !fir.ref<!fir.array<10xi32>>, !fir.heap<!fir.array<10xi32>>
 ! CHECK: %[[ELEMENTAL:.*]] = hlfir.elemental %{{.*}} unordered : (!fir.shape<1>) -> !hlfir.expr<10xi32>
 ! CHECK: hlfir.assign %[[ELEMENTAL]] to %[[BHOST]]#0 : !hlfir.expr<10xi32>, !fir.ref<!fir.array<10xi32>>
-! CHECK: fir.freemem %[[DECL_TEMP]]#0 : !fir.heap<!fir.array<10xi32>>
+! CHECK: fir.freemem %[[TEMP]] : !fir.heap<!fir.array<10xi32>>
 
 subroutine sub3()
   use mod1
@@ -213,8 +213,6 @@ subroutine sub10(a, b)
   res = a + b
 end subroutine
 
-
-
 ! CHECK-LABEL: func.func @_QPsub10(
 ! CHECK-SAME: %[[ARG0:.*]]: !fir.ref<i32> {cuf.data_attr = #cuf.cuda<device>, fir.bindc_name = "a"}
 
@@ -222,3 +220,16 @@ end subroutine
 ! CHECK: cuf.data_transfer %[[A]]#1 to %{{.*}}#0 {transfer_kind = #cuf.cuda_transfer<device_host>} : !fir.ref<i32>, !fir.ref<i32>
 ! CHECK-NOT: cuf.data_transfer
 
+subroutine sub11(a, b, n)
+  integer :: n
+  integer :: a(n)
+  integer, allocatable, device :: b(:)
+  integer :: res(10)
+
+  res = a + b
+end subroutine
+
+! CHECK-LABEL: func.func @_QPsub11
+! CHECK: %[[TEMP:.*]] = fir.allocmem !fir.array<?xi32>, %14#1 {bindc_name = ".tmp", uniq_name = ""}
+! CHECK: cuf.data_transfer
+! CHECK: fir.freemem %[[TEMP]] : !fir.heap<!fir.array<?xi32>>

@clementval clementval merged commit 9b6504e into llvm:main Jul 12, 2024
7 checks passed
@clementval clementval deleted the cuf_freemem branch July 12, 2024 00:15
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
…#98078)

When implicit data transfer is created, make sure we generate the
`freemem` op on the `allocmem` result value and not the declare op
value.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants