-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Flang][MLIR][OpenMP] Use function-attached target attributes for OpenMP lowering #78291
Conversation
@llvm/pr-subscribers-flang-openmp @llvm/pr-subscribers-mlir Author: Sergio Afonso (skatrak) ChangesThis patch removes the omp.target module attribute, since the information it held on the target CPU and features is available to each function as a function attribute. Target outlining during the MLIR to LLVM IR translation stage is also updated, so that these attributes are passed along to the newly created function. Full diff: https://github.com/llvm/llvm-project/pull/78291.diff 9 Files Affected:
diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h
index b61224ff4f1b3c..5c59c99675699d 100644
--- a/flang/include/flang/Tools/CrossToolHelpers.h
+++ b/flang/include/flang/Tools/CrossToolHelpers.h
@@ -109,17 +109,6 @@ void setOffloadModuleInterfaceAttributes(
}
}
-// Shares assinging of the OpenMP OffloadModuleInterface and its TargetCPU
-// attribute accross Flang tools (bbc/flang)
-void setOffloadModuleInterfaceTargetAttribute(mlir::ModuleOp &module,
- llvm::StringRef targetCPU, llvm::StringRef targetFeatures) {
- // Should be registered by the OpenMPDialect
- if (auto offloadMod = llvm::dyn_cast<mlir::omp::OffloadModuleInterface>(
- module.getOperation())) {
- offloadMod.setTarget(targetCPU, targetFeatures);
- }
-}
-
void setOpenMPVersionAttribute(mlir::ModuleOp &module, int64_t version) {
module.getOperation()->setAttr(
mlir::StringAttr::get(module.getContext(), llvm::Twine{"omp.version"}),
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 397e403847588e..85dc1a0836421e 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -301,9 +301,6 @@ bool CodeGenAction::beginSourceFileAction() {
Fortran::common::LanguageFeature::OpenMP)) {
setOffloadModuleInterfaceAttributes(*mlirModule,
ci.getInvocation().getLangOpts());
- setOffloadModuleInterfaceTargetAttribute(
- *mlirModule, targetMachine.getTargetCPU(),
- targetMachine.getTargetFeatureString());
setOpenMPVersionAttribute(*mlirModule,
ci.getInvocation().getLangOpts().OpenMPVersion);
}
diff --git a/flang/test/Lower/OpenMP/FIR/target_cpu_features.f90 b/flang/test/Lower/OpenMP/FIR/target_cpu_features.f90
index 179b71b3f0cfa5..46051e03179e18 100644
--- a/flang/test/Lower/OpenMP/FIR/target_cpu_features.f90
+++ b/flang/test/Lower/OpenMP/FIR/target_cpu_features.f90
@@ -1,5 +1,5 @@
!REQUIRES: amdgpu-registered-target, nvptx-registered-target
-!RUN: %flang_fc1 -emit-fir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -fopenmp -fopenmp-is-target-device %s -o - | FileCheck %s
+!RUN: %flang_fc1 -emit-fir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -fopenmp -fopenmp-is-target-device %s -o - | FileCheck --check-prefix=AMDGCN %s
!RUN: %flang_fc1 -emit-hlfir -triple nvptx64-nvidia-cuda -target-cpu sm_80 -fopenmp -fopenmp-is-target-device %s -o - | FileCheck --check-prefix=NVPTX %s
@@ -7,16 +7,20 @@
! Target_Enter Simple
!===============================================================================
-!CHECK: omp.target = #omp.target<target_cpu = "gfx908",
-!CHECK-SAME: target_features = "+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot10-insts,
-!CHECK-SAME: +dot2-insts,+dot3-insts,+dot4-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,
-!CHECK-SAME: +gfx8-insts,+gfx9-insts,+gws,+image-insts,+mai-insts,+s-memrealtime,+s-memtime-inst,
-!CHECK-SAME: +wavefrontsize64">
-!NVPTX: omp.target = #omp.target<target_cpu = "sm_80", target_features = "+ptx61,+sm_80">
-!CHECK-LABEL: func.func @_QPomp_target_simple()
+!AMDGCN-LABEL: func.func @_QPomp_target_simple() attributes {
+!AMDGCN-SAME: target_cpu = "gfx908"
+!AMDGCN-SAME: target_features = #llvm.target_features<["+16-bit-insts", "+ci-insts",
+!AMDGCN-SAME: "+dl-insts", "+dot1-insts", "+dot10-insts", "+dot2-insts", "+dot3-insts",
+!AMDGCN-SAME: "+dot4-insts", "+dot5-insts", "+dot6-insts", "+dot7-insts", "+dpp",
+!AMDGCN-SAME: "+gfx8-insts", "+gfx9-insts", "+gws", "+image-insts", "+mai-insts",
+!AMDGCN-SAME: "+s-memrealtime", "+s-memtime-inst", "+wavefrontsize64"]>
+
+!NVPTX-LABEL: func.func @_QPomp_target_simple() attributes {
+!NVPTX-SAME: target_cpu = "sm_80"
+!NVPTX-SAME: target_features = #llvm.target_features<["+ptx61", "+sm_80"]>
+
subroutine omp_target_simple
! Directive needed to prevent subroutine from being filtered out when
! compiling for the device.
!$omp declare target
end subroutine omp_target_simple
-
diff --git a/flang/test/Lower/OpenMP/target_cpu_features.f90 b/flang/test/Lower/OpenMP/target_cpu_features.f90
index ea1e5e38fca88e..aa0b049565b99a 100644
--- a/flang/test/Lower/OpenMP/target_cpu_features.f90
+++ b/flang/test/Lower/OpenMP/target_cpu_features.f90
@@ -1,21 +1,25 @@
!REQUIRES: amdgpu-registered-target, nvptx-registered-target
-!RUN: %flang_fc1 -emit-hlfir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -fopenmp -fopenmp-is-target-device %s -o - | FileCheck %s
+!RUN: %flang_fc1 -emit-hlfir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -fopenmp -fopenmp-is-target-device %s -o - | FileCheck --check-prefix=AMDGCN %s
!RUN: %flang_fc1 -emit-hlfir -triple nvptx64-nvidia-cuda -target-cpu sm_80 -fopenmp -fopenmp-is-target-device %s -o - | FileCheck --check-prefix=NVPTX %s
!===============================================================================
! Target_Enter Simple
!===============================================================================
-!CHECK: omp.target = #omp.target<target_cpu = "gfx908",
-!CHECK-SAME: target_features = "+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot10-insts,
-!CHECK-SAME: +dot2-insts,+dot3-insts,+dot4-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,
-!CHECK-SAME: +gfx8-insts,+gfx9-insts,+gws,+image-insts,+mai-insts,+s-memrealtime,+s-memtime-inst,
-!CHECK-SAME: +wavefrontsize64">
-!NVPTX: omp.target = #omp.target<target_cpu = "sm_80", target_features = "+ptx61,+sm_80">
-!CHECK-LABEL: func.func @_QPomp_target_simple()
+!AMDGCN-LABEL: func.func @_QPomp_target_simple() attributes {
+!AMDGCN-SAME: target_cpu = "gfx908"
+!AMDGCN-SAME: target_features = #llvm.target_features<["+16-bit-insts", "+ci-insts",
+!AMDGCN-SAME: "+dl-insts", "+dot1-insts", "+dot10-insts", "+dot2-insts", "+dot3-insts",
+!AMDGCN-SAME: "+dot4-insts", "+dot5-insts", "+dot6-insts", "+dot7-insts", "+dpp",
+!AMDGCN-SAME: "+gfx8-insts", "+gfx9-insts", "+gws", "+image-insts", "+mai-insts",
+!AMDGCN-SAME: "+s-memrealtime", "+s-memtime-inst", "+wavefrontsize64"]>
+
+!NVPTX-LABEL: func.func @_QPomp_target_simple() attributes {
+!NVPTX-SAME: target_cpu = "sm_80"
+!NVPTX-SAME: target_features = #llvm.target_features<["+ptx61", "+sm_80"]>
+
subroutine omp_target_simple
! Directive needed to prevent subroutine from being filtered out when
! compiling for the device.
!$omp declare target
end subroutine omp_target_simple
-
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index d614f2666a85ab..3ebb05d810cafd 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -72,15 +72,6 @@ def FlagsAttr : OpenMP_Attr<"Flags", "flags"> {
let assemblyFormat = "`<` struct(params) `>`";
}
-def TargetAttr : OpenMP_Attr<"Target", "target"> {
- let parameters = (ins
- StringRefParameter<>:$target_cpu,
- StringRefParameter<>:$target_features
- );
-
- let assemblyFormat = "`<` struct(params) `>`";
-}
-
class OpenMP_Op<string mnemonic, list<Trait> traits = []> :
Op<OpenMP_Dialect, mnemonic, traits>;
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td
index 89d04af64766fc..653ad033560c5b 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td
@@ -205,34 +205,6 @@ def OffloadModuleInterface : OpInterface<"OffloadModuleInterface"> {
assumeTeamsOversubscription, assumeThreadsOversubscription,
assumeNoThreadState, assumeNoNestedParallelism, noGPULib, openmpDeviceVersion));
}]>,
- InterfaceMethod<
- /*description=*/[{
- Get the Target attribute on the current module if it exists
- and return the attribute, if it doesn't exist it returns a nullptr.
- }],
- /*retTy=*/"mlir::omp::TargetAttr",
- /*methodName=*/"getTarget",
- (ins), [{}], [{
- if (Attribute flags = $_op->getAttr("omp.target"))
- return ::llvm::dyn_cast_or_null<mlir::omp::TargetAttr>(flags);
- return nullptr;
- }]>,
- InterfaceMethod<
- /*description=*/[{
- Set the attribute target on the current module with the
- specified string arguments - name of cpu and corresponding features.
- }],
- /*retTy=*/"void",
- /*methodName=*/"setTarget",
- (ins "llvm::StringRef":$targetCPU,
- "llvm::StringRef":$targetFeatures), [{}], [{
- if (targetCPU.empty())
- return;
- $_op->setAttr(("omp." + mlir::omp::TargetAttr::getMnemonic()).str(),
- mlir::omp::TargetAttr::get($_op->getContext(),
- targetCPU.str(),
- targetFeatures.str()));
- }]>,
InterfaceMethod<
/*description=*/[{
Set a StringAttr on the current module containing the host IR file path. This
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index e7aebc3ce4be56..1092fff9ed4dab 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -2333,6 +2333,7 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
if (!targetOpSupported(opInst))
return failure();
+ auto parentFn = opInst.getParentOfType<LLVM::LLVMFuncOp>();
auto targetOp = cast<omp::TargetOp>(opInst);
auto &targetRegion = targetOp.getRegion();
DataLayout dl = DataLayout(opInst.getParentOfType<ModuleOp>());
@@ -2342,6 +2343,22 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
using InsertPointTy = llvm::OpenMPIRBuilder::InsertPointTy;
auto bodyCB = [&](InsertPointTy allocaIP,
InsertPointTy codeGenIP) -> InsertPointTy {
+ // Forward target-cpu and target-features function attributes from the
+ // original function to the new outlined function.
+ llvm::Function *llvmParentFn =
+ moduleTranslation.lookupFunction(parentFn.getName());
+ llvm::Function *llvmOutlinedFn = codeGenIP.getBlock()->getParent();
+ assert(llvmParentFn && llvmOutlinedFn &&
+ "Both parent and outlined functions must exist at this point");
+
+ if (auto attr = llvmParentFn->getFnAttribute("target-cpu");
+ attr.isStringAttribute())
+ llvmOutlinedFn->addFnAttr(attr);
+
+ if (auto attr = llvmParentFn->getFnAttribute("target-features");
+ attr.isStringAttribute())
+ llvmOutlinedFn->addFnAttr(attr);
+
builder.restoreIP(codeGenIP);
unsigned argIndex = 0;
for (auto &mapOp : mapOperands) {
@@ -2360,11 +2377,11 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
};
llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder);
- StringRef parentName = opInst.getParentOfType<LLVM::LLVMFuncOp>().getName();
+ StringRef parentName = parentFn.getName();
// Override parent name if early outlining function
- if (auto earlyOutlineOp = llvm::dyn_cast<mlir::omp::EarlyOutliningInterface>(
- opInst.getParentOfType<LLVM::LLVMFuncOp>().getOperation())) {
+ if (auto earlyOutlineOp =
+ llvm::dyn_cast<mlir::omp::EarlyOutliningInterface>(*parentFn)) {
llvm::StringRef outlineParentName = earlyOutlineOp.getParentName();
parentName = outlineParentName.empty() ? parentName : outlineParentName;
}
diff --git a/mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir b/mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir
index a21e6d61a56185..8756a22e6798db 100644
--- a/mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir
+++ b/mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir
@@ -3,7 +3,7 @@
// The aim of the test is to check the LLVM IR codegen for the device
// for omp target parallel construct
-module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<"dlti.alloca_memory_space", 5 : ui32>>, llvm.data_layout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9", llvm.target_triple = "amdgcn-amd-amdhsa", omp.is_gpu = true, omp.is_target_device = true, omp.target = #omp.target<target_cpu = "gfx90a", target_features = "">} {
+module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<"dlti.alloca_memory_space", 5 : ui32>>, llvm.data_layout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9", llvm.target_triple = "amdgcn-amd-amdhsa", omp.is_gpu = true, omp.is_target_device = true} {
llvm.func @_QQmain_omp_outline_1(%arg0: !llvm.ptr) attributes {omp.declare_target = #omp.declaretarget<device_type = (host), capture_clause = (to)>, omp.outline_parent_name = "_QQmain"} {
%0 = omp.map_info var_ptr(%arg0 : !llvm.ptr, i32) map_clauses(from) capture(ByRef) -> !llvm.ptr {name = "d"}
omp.target map_entries(%0 -> %arg2 : !llvm.ptr) {
diff --git a/mlir/test/Target/LLVMIR/omptarget-target-cpu-features.mlir b/mlir/test/Target/LLVMIR/omptarget-target-cpu-features.mlir
new file mode 100644
index 00000000000000..fddb799142820b
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/omptarget-target-cpu-features.mlir
@@ -0,0 +1,23 @@
+// Test that the target_features and target_cpu llvm.func attributes are
+// forwarded to outlined target region functions.
+
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+module attributes {omp.is_target_device = false} {
+ llvm.func @omp_target_region() attributes {
+ target_cpu = "x86-64",
+ target_features = #llvm.target_features<["+mmx", "+sse"]>
+ } {
+ omp.target {
+ omp.terminator
+ }
+ llvm.return
+ }
+}
+
+// CHECK: define void @omp_target_region() #[[ATTRS:.*]] {
+// CHECK: define internal void @__omp_offloading_{{.*}}_omp_target_region_{{.*}}() #[[ATTRS]] {
+
+// CHECK: attributes #[[ATTRS]] = {
+// CHECK-SAME: "target-cpu"="x86-64"
+// CHECK-SAME: "target-features"="+mmx,+sse"
|
@llvm/pr-subscribers-mlir-openmp Author: Sergio Afonso (skatrak) ChangesThis patch removes the omp.target module attribute, since the information it held on the target CPU and features is available to each function as a function attribute. Target outlining during the MLIR to LLVM IR translation stage is also updated, so that these attributes are passed along to the newly created function. Full diff: https://github.com/llvm/llvm-project/pull/78291.diff 9 Files Affected:
diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h
index b61224ff4f1b3cd..5c59c99675699d2 100644
--- a/flang/include/flang/Tools/CrossToolHelpers.h
+++ b/flang/include/flang/Tools/CrossToolHelpers.h
@@ -109,17 +109,6 @@ void setOffloadModuleInterfaceAttributes(
}
}
-// Shares assinging of the OpenMP OffloadModuleInterface and its TargetCPU
-// attribute accross Flang tools (bbc/flang)
-void setOffloadModuleInterfaceTargetAttribute(mlir::ModuleOp &module,
- llvm::StringRef targetCPU, llvm::StringRef targetFeatures) {
- // Should be registered by the OpenMPDialect
- if (auto offloadMod = llvm::dyn_cast<mlir::omp::OffloadModuleInterface>(
- module.getOperation())) {
- offloadMod.setTarget(targetCPU, targetFeatures);
- }
-}
-
void setOpenMPVersionAttribute(mlir::ModuleOp &module, int64_t version) {
module.getOperation()->setAttr(
mlir::StringAttr::get(module.getContext(), llvm::Twine{"omp.version"}),
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 397e403847588e0..85dc1a0836421eb 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -301,9 +301,6 @@ bool CodeGenAction::beginSourceFileAction() {
Fortran::common::LanguageFeature::OpenMP)) {
setOffloadModuleInterfaceAttributes(*mlirModule,
ci.getInvocation().getLangOpts());
- setOffloadModuleInterfaceTargetAttribute(
- *mlirModule, targetMachine.getTargetCPU(),
- targetMachine.getTargetFeatureString());
setOpenMPVersionAttribute(*mlirModule,
ci.getInvocation().getLangOpts().OpenMPVersion);
}
diff --git a/flang/test/Lower/OpenMP/FIR/target_cpu_features.f90 b/flang/test/Lower/OpenMP/FIR/target_cpu_features.f90
index 179b71b3f0cfa5c..46051e03179e183 100644
--- a/flang/test/Lower/OpenMP/FIR/target_cpu_features.f90
+++ b/flang/test/Lower/OpenMP/FIR/target_cpu_features.f90
@@ -1,5 +1,5 @@
!REQUIRES: amdgpu-registered-target, nvptx-registered-target
-!RUN: %flang_fc1 -emit-fir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -fopenmp -fopenmp-is-target-device %s -o - | FileCheck %s
+!RUN: %flang_fc1 -emit-fir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -fopenmp -fopenmp-is-target-device %s -o - | FileCheck --check-prefix=AMDGCN %s
!RUN: %flang_fc1 -emit-hlfir -triple nvptx64-nvidia-cuda -target-cpu sm_80 -fopenmp -fopenmp-is-target-device %s -o - | FileCheck --check-prefix=NVPTX %s
@@ -7,16 +7,20 @@
! Target_Enter Simple
!===============================================================================
-!CHECK: omp.target = #omp.target<target_cpu = "gfx908",
-!CHECK-SAME: target_features = "+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot10-insts,
-!CHECK-SAME: +dot2-insts,+dot3-insts,+dot4-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,
-!CHECK-SAME: +gfx8-insts,+gfx9-insts,+gws,+image-insts,+mai-insts,+s-memrealtime,+s-memtime-inst,
-!CHECK-SAME: +wavefrontsize64">
-!NVPTX: omp.target = #omp.target<target_cpu = "sm_80", target_features = "+ptx61,+sm_80">
-!CHECK-LABEL: func.func @_QPomp_target_simple()
+!AMDGCN-LABEL: func.func @_QPomp_target_simple() attributes {
+!AMDGCN-SAME: target_cpu = "gfx908"
+!AMDGCN-SAME: target_features = #llvm.target_features<["+16-bit-insts", "+ci-insts",
+!AMDGCN-SAME: "+dl-insts", "+dot1-insts", "+dot10-insts", "+dot2-insts", "+dot3-insts",
+!AMDGCN-SAME: "+dot4-insts", "+dot5-insts", "+dot6-insts", "+dot7-insts", "+dpp",
+!AMDGCN-SAME: "+gfx8-insts", "+gfx9-insts", "+gws", "+image-insts", "+mai-insts",
+!AMDGCN-SAME: "+s-memrealtime", "+s-memtime-inst", "+wavefrontsize64"]>
+
+!NVPTX-LABEL: func.func @_QPomp_target_simple() attributes {
+!NVPTX-SAME: target_cpu = "sm_80"
+!NVPTX-SAME: target_features = #llvm.target_features<["+ptx61", "+sm_80"]>
+
subroutine omp_target_simple
! Directive needed to prevent subroutine from being filtered out when
! compiling for the device.
!$omp declare target
end subroutine omp_target_simple
-
diff --git a/flang/test/Lower/OpenMP/target_cpu_features.f90 b/flang/test/Lower/OpenMP/target_cpu_features.f90
index ea1e5e38fca88ef..aa0b049565b99af 100644
--- a/flang/test/Lower/OpenMP/target_cpu_features.f90
+++ b/flang/test/Lower/OpenMP/target_cpu_features.f90
@@ -1,21 +1,25 @@
!REQUIRES: amdgpu-registered-target, nvptx-registered-target
-!RUN: %flang_fc1 -emit-hlfir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -fopenmp -fopenmp-is-target-device %s -o - | FileCheck %s
+!RUN: %flang_fc1 -emit-hlfir -triple amdgcn-amd-amdhsa -target-cpu gfx908 -fopenmp -fopenmp-is-target-device %s -o - | FileCheck --check-prefix=AMDGCN %s
!RUN: %flang_fc1 -emit-hlfir -triple nvptx64-nvidia-cuda -target-cpu sm_80 -fopenmp -fopenmp-is-target-device %s -o - | FileCheck --check-prefix=NVPTX %s
!===============================================================================
! Target_Enter Simple
!===============================================================================
-!CHECK: omp.target = #omp.target<target_cpu = "gfx908",
-!CHECK-SAME: target_features = "+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot10-insts,
-!CHECK-SAME: +dot2-insts,+dot3-insts,+dot4-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,
-!CHECK-SAME: +gfx8-insts,+gfx9-insts,+gws,+image-insts,+mai-insts,+s-memrealtime,+s-memtime-inst,
-!CHECK-SAME: +wavefrontsize64">
-!NVPTX: omp.target = #omp.target<target_cpu = "sm_80", target_features = "+ptx61,+sm_80">
-!CHECK-LABEL: func.func @_QPomp_target_simple()
+!AMDGCN-LABEL: func.func @_QPomp_target_simple() attributes {
+!AMDGCN-SAME: target_cpu = "gfx908"
+!AMDGCN-SAME: target_features = #llvm.target_features<["+16-bit-insts", "+ci-insts",
+!AMDGCN-SAME: "+dl-insts", "+dot1-insts", "+dot10-insts", "+dot2-insts", "+dot3-insts",
+!AMDGCN-SAME: "+dot4-insts", "+dot5-insts", "+dot6-insts", "+dot7-insts", "+dpp",
+!AMDGCN-SAME: "+gfx8-insts", "+gfx9-insts", "+gws", "+image-insts", "+mai-insts",
+!AMDGCN-SAME: "+s-memrealtime", "+s-memtime-inst", "+wavefrontsize64"]>
+
+!NVPTX-LABEL: func.func @_QPomp_target_simple() attributes {
+!NVPTX-SAME: target_cpu = "sm_80"
+!NVPTX-SAME: target_features = #llvm.target_features<["+ptx61", "+sm_80"]>
+
subroutine omp_target_simple
! Directive needed to prevent subroutine from being filtered out when
! compiling for the device.
!$omp declare target
end subroutine omp_target_simple
-
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index d614f2666a85abc..3ebb05d810cafdf 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -72,15 +72,6 @@ def FlagsAttr : OpenMP_Attr<"Flags", "flags"> {
let assemblyFormat = "`<` struct(params) `>`";
}
-def TargetAttr : OpenMP_Attr<"Target", "target"> {
- let parameters = (ins
- StringRefParameter<>:$target_cpu,
- StringRefParameter<>:$target_features
- );
-
- let assemblyFormat = "`<` struct(params) `>`";
-}
-
class OpenMP_Op<string mnemonic, list<Trait> traits = []> :
Op<OpenMP_Dialect, mnemonic, traits>;
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td
index 89d04af64766fc2..653ad033560c5b4 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td
@@ -205,34 +205,6 @@ def OffloadModuleInterface : OpInterface<"OffloadModuleInterface"> {
assumeTeamsOversubscription, assumeThreadsOversubscription,
assumeNoThreadState, assumeNoNestedParallelism, noGPULib, openmpDeviceVersion));
}]>,
- InterfaceMethod<
- /*description=*/[{
- Get the Target attribute on the current module if it exists
- and return the attribute, if it doesn't exist it returns a nullptr.
- }],
- /*retTy=*/"mlir::omp::TargetAttr",
- /*methodName=*/"getTarget",
- (ins), [{}], [{
- if (Attribute flags = $_op->getAttr("omp.target"))
- return ::llvm::dyn_cast_or_null<mlir::omp::TargetAttr>(flags);
- return nullptr;
- }]>,
- InterfaceMethod<
- /*description=*/[{
- Set the attribute target on the current module with the
- specified string arguments - name of cpu and corresponding features.
- }],
- /*retTy=*/"void",
- /*methodName=*/"setTarget",
- (ins "llvm::StringRef":$targetCPU,
- "llvm::StringRef":$targetFeatures), [{}], [{
- if (targetCPU.empty())
- return;
- $_op->setAttr(("omp." + mlir::omp::TargetAttr::getMnemonic()).str(),
- mlir::omp::TargetAttr::get($_op->getContext(),
- targetCPU.str(),
- targetFeatures.str()));
- }]>,
InterfaceMethod<
/*description=*/[{
Set a StringAttr on the current module containing the host IR file path. This
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index e7aebc3ce4be569..1092fff9ed4daba 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -2333,6 +2333,7 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
if (!targetOpSupported(opInst))
return failure();
+ auto parentFn = opInst.getParentOfType<LLVM::LLVMFuncOp>();
auto targetOp = cast<omp::TargetOp>(opInst);
auto &targetRegion = targetOp.getRegion();
DataLayout dl = DataLayout(opInst.getParentOfType<ModuleOp>());
@@ -2342,6 +2343,22 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
using InsertPointTy = llvm::OpenMPIRBuilder::InsertPointTy;
auto bodyCB = [&](InsertPointTy allocaIP,
InsertPointTy codeGenIP) -> InsertPointTy {
+ // Forward target-cpu and target-features function attributes from the
+ // original function to the new outlined function.
+ llvm::Function *llvmParentFn =
+ moduleTranslation.lookupFunction(parentFn.getName());
+ llvm::Function *llvmOutlinedFn = codeGenIP.getBlock()->getParent();
+ assert(llvmParentFn && llvmOutlinedFn &&
+ "Both parent and outlined functions must exist at this point");
+
+ if (auto attr = llvmParentFn->getFnAttribute("target-cpu");
+ attr.isStringAttribute())
+ llvmOutlinedFn->addFnAttr(attr);
+
+ if (auto attr = llvmParentFn->getFnAttribute("target-features");
+ attr.isStringAttribute())
+ llvmOutlinedFn->addFnAttr(attr);
+
builder.restoreIP(codeGenIP);
unsigned argIndex = 0;
for (auto &mapOp : mapOperands) {
@@ -2360,11 +2377,11 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder,
};
llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder);
- StringRef parentName = opInst.getParentOfType<LLVM::LLVMFuncOp>().getName();
+ StringRef parentName = parentFn.getName();
// Override parent name if early outlining function
- if (auto earlyOutlineOp = llvm::dyn_cast<mlir::omp::EarlyOutliningInterface>(
- opInst.getParentOfType<LLVM::LLVMFuncOp>().getOperation())) {
+ if (auto earlyOutlineOp =
+ llvm::dyn_cast<mlir::omp::EarlyOutliningInterface>(*parentFn)) {
llvm::StringRef outlineParentName = earlyOutlineOp.getParentName();
parentName = outlineParentName.empty() ? parentName : outlineParentName;
}
diff --git a/mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir b/mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir
index a21e6d61a56185a..8756a22e6798dbb 100644
--- a/mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir
+++ b/mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir
@@ -3,7 +3,7 @@
// The aim of the test is to check the LLVM IR codegen for the device
// for omp target parallel construct
-module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<"dlti.alloca_memory_space", 5 : ui32>>, llvm.data_layout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9", llvm.target_triple = "amdgcn-amd-amdhsa", omp.is_gpu = true, omp.is_target_device = true, omp.target = #omp.target<target_cpu = "gfx90a", target_features = "">} {
+module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<"dlti.alloca_memory_space", 5 : ui32>>, llvm.data_layout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9", llvm.target_triple = "amdgcn-amd-amdhsa", omp.is_gpu = true, omp.is_target_device = true} {
llvm.func @_QQmain_omp_outline_1(%arg0: !llvm.ptr) attributes {omp.declare_target = #omp.declaretarget<device_type = (host), capture_clause = (to)>, omp.outline_parent_name = "_QQmain"} {
%0 = omp.map_info var_ptr(%arg0 : !llvm.ptr, i32) map_clauses(from) capture(ByRef) -> !llvm.ptr {name = "d"}
omp.target map_entries(%0 -> %arg2 : !llvm.ptr) {
diff --git a/mlir/test/Target/LLVMIR/omptarget-target-cpu-features.mlir b/mlir/test/Target/LLVMIR/omptarget-target-cpu-features.mlir
new file mode 100644
index 000000000000000..fddb799142820b7
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/omptarget-target-cpu-features.mlir
@@ -0,0 +1,23 @@
+// Test that the target_features and target_cpu llvm.func attributes are
+// forwarded to outlined target region functions.
+
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+module attributes {omp.is_target_device = false} {
+ llvm.func @omp_target_region() attributes {
+ target_cpu = "x86-64",
+ target_features = #llvm.target_features<["+mmx", "+sse"]>
+ } {
+ omp.target {
+ omp.terminator
+ }
+ llvm.return
+ }
+}
+
+// CHECK: define void @omp_target_region() #[[ATTRS:.*]] {
+// CHECK: define internal void @__omp_offloading_{{.*}}_omp_target_region_{{.*}}() #[[ATTRS]] {
+
+// CHECK: attributes #[[ATTRS]] = {
+// CHECK-SAME: "target-cpu"="x86-64"
+// CHECK-SAME: "target-features"="+mmx,+sse"
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Please wait for @DominikAdamski
Not for this patch, but would we need to add this info for all outlined functions?
I think we probably should, since otherwise there would be no target information for the new functions. |
a1701f9
to
ca41e03
Compare
ca41e03
to
4246063
Compare
52669d2
to
0d8be7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM,
Nit: please check if old OpenMP attribute is not present in other tests.
0d8be7c
to
cd0c557
Compare
Thanks for the review, I checked that it's not present. I'll wait one more day before merging to give time in case there are any other concerns. |
…nMP lowering This patch removes the `omp.target` module attribute, since the information it held on the target CPU and features is available through the `fir.target_cpu` and `fir.target_features` module attributes. Target outlining during the MLIR to LLVM IR translation stage is updated, so that these attributes, at that point available as `llvm.func` attributes, are passed along to the newly created function.
cd0c557
to
a206adf
Compare
…nMP lowering (llvm#78291) This patch removes the omp.target module attribute, since the information it held on the target CPU and features is available through the fir.target_cpu and fir.target_features module attributes. Target outlining during the MLIR to LLVM IR translation stage is updated, so that these attributes, at that point available as llvm.func attributes, are passed along to the newly created function.
This patch removes the omp.target module attribute, since the information it held on the target CPU and features is available through the fir.target_cpu and fir.target_features module attributes. Target outlining during the MLIR to LLVM IR translation stage is updated, so that these attributes, at that point available as llvm.func attributes, are passed along to the newly created function.