[acc] Change acc declare_action recipe #157764
@llvm/pr-subscribers-openacc
Author: Susan Tan (ス-ザン タン) (SusanTan)
Changes
Change the declare_action recipe from using acc.update_device to acc.declare_enter for prealloc/postalloc, and to acc.declare_exit for predealloc/postdealloc, since update_device is not meant for accommodating acc declare allocatables.
Full diff: https://github.com/llvm/llvm-project/pull/157764.diff
3 Files Affected:
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index bbe749f8c8805..8aa40f84c474f 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -244,17 +244,16 @@ static void createDeclareAllocFuncWithArg(mlir::OpBuilder &modBuilder,
if (unwrapFirBox)
asFortranDesc << accFirDescriptorPostfix.str();
- // Updating descriptor must occur before the mapping of the data so that
- // attached data pointer is not overwritten.
- mlir::acc::UpdateDeviceOp updateDeviceOp =
- createDataEntryOp<mlir::acc::UpdateDeviceOp>(
- builder, loc, registerFuncOp.getArgument(0), asFortranDesc, bounds,
- /*structured=*/false, /*implicit=*/true,
- mlir::acc::DataClause::acc_update_device, descTy,
- /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{});
- llvm::SmallVector<int32_t> operandSegments{0, 0, 0, 1};
- llvm::SmallVector<mlir::Value> operands{updateDeviceOp.getResult()};
- createSimpleOp<mlir::acc::UpdateOp>(builder, loc, operands, operandSegments);
+ // Use declare_enter for the descriptor so the runtime mirrors allocation
+ // semantics instead of issuing an update. This ensures the descriptor's
+ // device-side metadata is established via a structured begin.
+ EntryOp descEntryOp = createDataEntryOp<EntryOp>(
+ builder, loc, registerFuncOp.getArgument(0), asFortranDesc, bounds,
+ /*structured=*/false, /*implicit=*/true, clause, descTy,
+ /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{});
+ mlir::acc::DeclareEnterOp::create(
+ builder, loc, mlir::acc::DeclareTokenType::get(descEntryOp.getContext()),
+ mlir::ValueRange(descEntryOp.getAccVar()));
if (unwrapFirBox) {
mlir::Value desc =
@@ -3989,17 +3988,16 @@ static void createDeclareAllocFunc(mlir::OpBuilder &modBuilder,
asFortranDesc << accFirDescriptorPostfix.str();
llvm::SmallVector<mlir::Value> bounds;
- // Updating descriptor must occur before the mapping of the data so that
- // attached data pointer is not overwritten.
- mlir::acc::UpdateDeviceOp updateDeviceOp =
- createDataEntryOp<mlir::acc::UpdateDeviceOp>(
- builder, loc, addrOp, asFortranDesc, bounds,
- /*structured=*/false, /*implicit=*/true,
- mlir::acc::DataClause::acc_update_device, addrOp.getType(),
- /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{});
- llvm::SmallVector<int32_t> operandSegments{0, 0, 0, 1};
- llvm::SmallVector<mlir::Value> operands{updateDeviceOp.getResult()};
- createSimpleOp<mlir::acc::UpdateOp>(builder, loc, operands, operandSegments);
+ // Use declare_enter for the descriptor so the runtime mirrors allocation
+ // semantics instead of issuing an update. This ensures the descriptor's
+ // device-side metadata is established via a structured begin.
+ EntryOp descEntryOp = createDataEntryOp<EntryOp>(
+ builder, loc, addrOp, asFortranDesc, bounds,
+ /*structured=*/false, /*implicit=*/true, clause, addrOp.getType(),
+ /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{});
+ mlir::acc::DeclareEnterOp::create(
+ builder, loc, mlir::acc::DeclareTokenType::get(descEntryOp.getContext()),
+ mlir::ValueRange(descEntryOp.getAccVar()));
if (unwrapFirBox) {
auto loadOp = fir::LoadOp::create(builder, loc, addrOp.getResult());
@@ -4092,15 +4090,15 @@ static void createDeclareDeallocFunc(mlir::OpBuilder &modBuilder,
if (unwrapFirBox)
asFortran << accFirDescriptorPostfix.str();
llvm::SmallVector<mlir::Value> bounds;
- mlir::acc::UpdateDeviceOp updateDeviceOp =
- createDataEntryOp<mlir::acc::UpdateDeviceOp>(
+ // Use declare_exit for the descriptor to end the structured declare region
+ // instead of issuing an update.
+ mlir::acc::GetDevicePtrOp descEntryOp =
+ createDataEntryOp<mlir::acc::GetDevicePtrOp>(
builder, loc, addrOp, asFortran, bounds,
- /*structured=*/false, /*implicit=*/true,
- mlir::acc::DataClause::acc_update_device, addrOp.getType(),
+ /*structured=*/false, /*implicit=*/true, clause, addrOp.getType(),
/*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{});
- llvm::SmallVector<int32_t> operandSegments{0, 0, 0, 1};
- llvm::SmallVector<mlir::Value> operands{updateDeviceOp.getResult()};
- createSimpleOp<mlir::acc::UpdateOp>(builder, loc, operands, operandSegments);
+ mlir::acc::DeclareExitOp::create(builder, loc, mlir::Value{},
+ mlir::ValueRange(descEntryOp.getAccVar()));
modBuilder.setInsertionPointAfter(postDeallocOp);
}
diff --git a/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90
index 6869af863644d..f9a8f7bf0469b 100644
--- a/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90
+++ b/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90
@@ -1,8 +1,8 @@
! This test checks lowering of OpenACC declare directive in function and
! subroutine specification parts.
-
! RUN: bbc -fopenacc -emit-hlfir --openacc-unwrap-fir-box=true --openacc-generate-default-bounds=true %s -o - | FileCheck %s
+
module acc_declare
contains
@@ -258,8 +258,6 @@ subroutine acc_declare_allocate()
! CHECK-LABEL: func.func private @_QMacc_declareFacc_declare_allocateEa_acc_declare_update_desc_post_alloc(
! CHECK-SAME: %[[ARG0:.*]]: !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) {
-! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[ARG0]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {implicit = true, name = "a_desc", structured = false}
-! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>)
! CHECK: %[[LOAD:.*]] = fir.load %[[ARG0]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
! CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[LOAD]] {acc.declare = #acc.declare<dataClause = acc_create>} : (!fir.box<!fir.heap<!fir.array<?xi32>>>) -> !fir.heap<!fir.array<?xi32>>
! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap<!fir.array<?xi32>>) -> !fir.heap<!fir.array<?xi32>> {name = "a", structured = false}
@@ -281,7 +279,7 @@ subroutine acc_declare_allocate()
! CHECK-SAME: %[[ARG0:.*]]: !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) {
! CHECK: %[[LOAD:.*]] = fir.load %[[ARG0]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
! CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[LOAD]] : (!fir.box<!fir.heap<!fir.array<?xi32>>>) -> !fir.heap<!fir.array<?xi32>>
-! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[BOX_ADDR]] : !fir.heap<!fir.array<?xi32>>) -> !fir.heap<!fir.array<?xi32>> {implicit = true, name = "a_desc", structured = false}
+! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[BOX_ADDR]] : !fir.heap<!fir.array<?xi32>>) -> !fir.heap<!fir.array<?xi32>>
! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.heap<!fir.array<?xi32>>)
! CHECK: return
! CHECK: }
@@ -355,8 +353,8 @@ module acc_declare_allocatable_test
! CHECK-LABEL: func.func private @_QMacc_declare_allocatable_testEdata1_acc_declare_update_desc_post_alloc() {
! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
-! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[GLOBAL_ADDR]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {implicit = true, name = "data1_desc", structured = false}
-! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>)
+! CHECK: %[[CREATE_DESC:.*]] = acc.create varPtr(%[[GLOBAL_ADDR]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {implicit = true, name = "data1_desc", structured = false}
+! CHECK: acc.declare_enter dataOperands(%[[CREATE_DESC]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>)
! CHECK: %[[LOAD:.*]] = fir.load %[[GLOBAL_ADDR]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
! CHECK: %[[BOXADDR:.*]] = fir.box_addr %[[LOAD]] {acc.declare = #acc.declare<dataClause = acc_create>} : (!fir.box<!fir.heap<!fir.array<?xi32>>>) -> !fir.heap<!fir.array<?xi32>>
! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOXADDR]] : !fir.heap<!fir.array<?xi32>>) -> !fir.heap<!fir.array<?xi32>> {name = "data1", structured = false}
@@ -376,8 +374,8 @@ module acc_declare_allocatable_test
! CHECK-LABEL: func.func private @_QMacc_declare_allocatable_testEdata1_acc_declare_update_desc_post_dealloc() {
! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
-! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[GLOBAL_ADDR]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {implicit = true, name = "data1_desc", structured = false}
-! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>)
+! CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {dataClause = #acc<data_clause acc_create>, implicit = true, name = "data1_desc", structured = false}
+! CHECK: acc.declare_exit dataOperands(%[[DEVPTR]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>)
! CHECK: return
! CHECK: }
diff --git a/flang/test/Lower/OpenACC/acc-declare.f90 b/flang/test/Lower/OpenACC/acc-declare.f90
index 4d95ffa10edaf..3b17dee796619 100644
--- a/flang/test/Lower/OpenACC/acc-declare.f90
+++ b/flang/test/Lower/OpenACC/acc-declare.f90
@@ -1,8 +1,8 @@
! This test checks lowering of OpenACC declare directive in function and
! subroutine specification parts.
-
! RUN: bbc -fopenacc -emit-hlfir %s -o - | FileCheck %s
+
module acc_declare
contains
@@ -250,8 +250,8 @@ subroutine acc_declare_allocate()
! CHECK-LABEL: func.func private @_QMacc_declareFacc_declare_allocateEa_acc_declare_update_desc_post_alloc(
! CHECK-SAME: %[[ARG0:.*]]: !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) {
-! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[ARG0]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {implicit = true, name = "a", structured = false}
-! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>)
+! CHECK: %[[CREATE_DESC:.*]] = acc.create varPtr(%[[ARG0]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {implicit = true, name = "a", structured = false}
+! CHECK: acc.declare_enter dataOperands(%[[CREATE_DESC]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>)
! CHECK: return
! CHECK: }
@@ -330,15 +330,15 @@ module acc_declare_allocatable_test
! CHECK-LABEL: func.func private @_QMacc_declare_allocatable_testEdata1_acc_declare_update_desc_post_alloc() {
! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
-! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[GLOBAL_ADDR]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {implicit = true, name = "data1", structured = false}
-! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>)
+! CHECK: %[[CREATE_DESC:.*]] = acc.create varPtr(%[[GLOBAL_ADDR]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {implicit = true, name = "data1", structured = false}
+! CHECK: acc.declare_enter dataOperands(%[[CREATE_DESC]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>)
! CHECK: return
! CHECK: }
! CHECK-LABEL: func.func private @_QMacc_declare_allocatable_testEdata1_acc_declare_update_desc_post_dealloc() {
! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
-! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[GLOBAL_ADDR]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {implicit = true, name = "data1", structured = false}
-! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>)
+! CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {dataClause = #acc<data_clause acc_create>, implicit = true, name = "data1", structured = false}
+! CHECK: acc.declare_exit dataOperands(%[[DEVPTR]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>)
! CHECK: return
! CHECK: }
! CHECK: %[[CREATE_DESC:.*]] = acc.create varPtr(%[[ARG0]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {implicit = true, name = "a", structured = false}
! CHECK: acc.declare_enter dataOperands(%[[CREATE_DESC]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>)
If the device box is only created post allocation, what happens if there is a descriptor query request on the device before allocation?
It will get the empty descriptor with no data attached to it. This is because the constructor also calls acc.declare_enter, which ensures that the descriptor is already present at device attach time. We handle acc.declare_enter differently downstream in these two scenarios.
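As a rough illustration of the answer above, here is a toy model of the ordering it describes: a constructor-time declare_enter makes an empty descriptor present, and a later post-allocation declare_enter attaches data. Everything in it (`ToyDevice`, `DescState`, `declareEnter`, `query`) is hypothetical and invented for this sketch; it is not the OpenACC runtime, the Flang lowering, or the downstream handling mentioned in the reply.

```cpp
#include <string>
#include <unordered_map>

// Toy model only: hypothetical names, not a real runtime or Flang API.
enum class DescState { Absent, Empty, Attached };

class ToyDevice {
public:
  // Models declare_enter on a descriptor. The first call (constructor time,
  // host variable not yet allocated) creates an empty descriptor; a call
  // after allocation marks the data as attached.
  void declareEnter(const std::string &name, bool hostAllocated) {
    bool &attached = table[name]; // inserts `false` if the entry is missing
    if (hostAllocated)
      attached = true;
  }

  // Models a device-side descriptor query.
  DescState query(const std::string &name) const {
    auto it = table.find(name);
    if (it == table.end())
      return DescState::Absent;
    return it->second ? DescState::Attached : DescState::Empty;
  }

private:
  // Maps a variable name to "data attached?" for its device descriptor.
  std::unordered_map<std::string, bool> table;
};
```

In this model, a query issued between the two `declareEnter` calls sees `DescState::Empty` rather than `DescState::Absent`, which is the situation the reply describes for a descriptor query before allocation.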
end subroutine
! CHECK-LABEL: func.func private @_QMacc_declareFacc_declare_allocateEa_acc_declare_update_desc_post_alloc(
Should we update the name?
@razvanlupusoru could you please weigh in here?
Yes please :) It's OK to avoid "update_desc" in the name in both old behavior and new behavior.
updated!
LGTM. It would be good to get Razvan's review as well.
Thank you! And thank you for keeping the old decomposed behavior also - we may consider getting rid of it in future but I did not want it done as part of this work.
Change the declare_action recipe from using acc.update_device to acc.declare_enter for prealloc/postalloc, and to acc.declare_exit for predealloc/postdealloc, since update_device is not meant for accommodating acc declare allocatables.
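The mapping in this description can be summarized in a small standalone sketch. The enum and function names below (`DeclareHook`, `descriptorOpFor`) are hypothetical and exist only for illustration; the actual change builds `mlir::acc::DeclareEnterOp` and `mlir::acc::DeclareExitOp` in flang/lib/Lower/OpenACC.cpp, as shown in the diff.

```cpp
#include <string>

// Hypothetical names for illustration; these are not Flang identifiers.
enum class DeclareHook { PreAlloc, PostAlloc, PreDealloc, PostDealloc };

// Returns the name of the op that the recipe uses for the descriptor in a
// given hook, following the PR description: declare_enter for the allocation
// hooks and declare_exit for the deallocation hooks.
inline std::string descriptorOpFor(DeclareHook hook) {
  switch (hook) {
  case DeclareHook::PreAlloc:
  case DeclareHook::PostAlloc:
    return "acc.declare_enter";
  case DeclareHook::PreDealloc:
  case DeclareHook::PostDealloc:
    return "acc.declare_exit";
  }
  return ""; // not reached for valid enum values
}
```

Before this change, the descriptor path in these hooks emitted `acc.update_device` followed by `acc.update`, as the removed lines in the diff show.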