diff --git a/flang/docs/OpenMP-descriptor-management.md b/flang/docs/OpenMP-descriptor-management.md new file mode 100644 index 0000000000000..90a20282e0512 --- /dev/null +++ b/flang/docs/OpenMP-descriptor-management.md @@ -0,0 +1,125 @@ + + +# OpenMP dialect: Fortran descriptor type mapping for offload + +The initial method for mapping Fortran types tied to descriptors for OpenMP offloading is to treat these types +as a special case of OpenMP record type (C/C++ structure/class, Fortran derived type etc.) mapping as far as the +runtime is concerned. Where the box (descriptor information) is the holding container and the underlying +data pointer is contained within the container, and we must generate explicit maps for both the pointer member and +the container. As an example, a small C++ program that is equivalent to the concept described, with the +`mock_descriptor` class being representative of the class utilised for descriptors in Clang: + +```C++ +struct mock_descriptor { + long int x; + std::byte x1, x2, x3, x4; + void *pointer; + long int lx[1][3]; +}; + +int main() { +mock_descriptor data; +#pragma omp target map(tofrom: data, data.pointer[:upper_bound]) +{ + do something... +} + + return 0; +} +``` + +In the above, we have to map both the containing structure, with its non-pointer members and the +data pointed to by the pointer contained within the structure to appropriately access the data. This +is effectively what is done with descriptor types for the time being. Other pointers that are part +of the descriptor container such as the addendum should also be treated as the data pointer is +treated. + +Currently, Flang will lower these descriptor types in the OpenMP lowering (lower/OpenMP.cpp) similarly +to all other map types, generating an omp.MapInfoOp containing relevant information required for lowering +the OpenMP dialect to LLVM-IR during the final stages of the MLIR lowering. However, after +the lowering to FIR/HLFIR has been performed an OpenMP dialect specific pass for Fortran, +`OMPDescriptorMapInfoGenPass` (Optimizer/OMPDescriptorMapInfoGen.cpp) will expand the +`omp.MapInfoOp`'s containing descriptors (which currently will be a `BoxType` or `BoxAddrOp`) into multiple +mappings, with one extra per pointer member in the descriptor that is supported on top of the original +descriptor map operation. These pointers members are linked to the parent descriptor by adding them to +the member field of the original descriptor map operation, they are then inserted into the relevant map +owning operation's (`omp.TargetOp`, `omp.DataOp` etc.) map operand list and in cases where the owning operation +is `IsolatedFromAbove`, it also inserts them as `BlockArgs` to canonicalize the mappings and simplify lowering. + +An example transformation by the `OMPDescriptorMapInfoGenPass`: + +``` + +... +%12 = omp.map_info var_ptr(%1#1 : !fir.ref>>>, !fir.box>>) map_clauses(tofrom) capture(ByRef) bounds(%11) -> !fir.ref>>> {name = "arg_alloc"} +... +omp.target map_entries(%12 -> %arg1, %13 -> %arg2 : !fir.ref>>>, !fir.ref) { + ^bb0(%arg1: !fir.ref>>>, %arg2: !fir.ref): +... + +====> + +... +%12 = fir.box_offset %1#1 base_addr : (!fir.ref>>>) -> !fir.llvm_ptr>> +%13 = omp.map_info var_ptr(%1#1 : !fir.ref>>>, !fir.array) var_ptr_ptr(%12 : !fir.llvm_ptr>>) map_clauses(tofrom) capture(ByRef) bounds(%11) -> !fir.llvm_ptr>> {name = ""} +%14 = omp.map_info var_ptr(%1#1 : !fir.ref>>>, !fir.box>>) map_clauses(tofrom) capture(ByRef) members(%13 : !fir.llvm_ptr>>) -> !fir.ref>>> {name = "arg_alloc"} +... +omp.target map_entries(%13 -> %arg1, %14 -> %arg2, %15 -> %arg3 : !fir.llvm_ptr>>, !fir.ref>>>, !fir.ref) { + ^bb0(%arg1: !fir.llvm_ptr>>, %arg2: !fir.ref>>>, %arg3: !fir.ref): +... + +``` + +In later stages of the compilation flow when the OpenMP dialect is being lowered to LLVM-IR these descriptor +mappings are treated as if they were structure mappings with explicit member maps on the same directive as +their parent was mapped. + +This implementation utilises the member field of the `map_info` operation to indicate that the pointer +descriptor elements which are contained in their own `map_info` operation are part of their respective +parent descriptor. This allows the descriptor containing the descriptor pointer member to be mapped +as a composite entity during lowering, with the correct mappings being generated to tie them together, +allowing the OpenMP runtime to map them correctly, attaching the pointer member to the parent +structure so it can be accessed during execution. If we opt to not treat the descriptor as a single +entity we have issues with the member being correctly attached to the parent and being accessible, +this can cause runtime segfaults on the device when we try to access the data through the parent. It +may be possible to avoid this member mapping, treating them as individual entities, but treating a +composite mapping as an individual mapping could lead to problems such as the runtime taking +liberties with the mapping it usually wouldn't if it knew they were linked, we would also have to +be careful to maintian the correct order of mappings as we lower, if we misorder the maps, it'd be +possible to overwrite already written data, e.g. if we write the descriptor data pointer first, and +then the containing descriptor, we would overwrite the descriptor data pointer with the incorrect +address. + +This method is generic in the sense that the OpenMP dialect doesn't need to understand that it is mapping a +Fortran type containing a descriptor, it just thinks it's a record type from either Fortran or C++. However, +it is a little rigid in how the descriptor mappings are handled as there is no specialisation or possibility +to specialise the mappings for possible edge cases without polluting the dialect or lowering with further +knowledge of Fortran and the FIR dialect. + +# OpenMP dialect differences from OpenACC dialect + +The descriptor mapping for OpenMP currently works differently to the planned direction for OpenACC, however, +it is possible and would likely be ideal to align the method with OpenACC in the future. + +Currently the OpenMP specification is less descriptive and has less stringent rules around descriptor based +types so does not require as complex a set of descriptor management rules as OpenACC (although, in certain +cases for the interim adopting OpenACC's rules where it makes sense could be useful). To handle the more +complex descriptor mapping rules OpenACC has opted to utilise a more runtime oriented approach, where +specialized runtime functions for handling descriptor mapping for OpenACC are created and these runtime +function handles are attatched to a special OpenACC dialect operation. When this operation is lowered it +will lower to the attatched OpenACC descriptor mapping runtime function. This sounds like it will work +(no implementation yet) similarly to some of the existing HLFIR operations which optionally lower to +Fortran runtime calls. + +This methodology described by OpenACC which utilises runtime functions to handle specialised mappings allows +more flexibility as a significant amount of the mapping logic can be moved into the runtime from the compiler. +It also allows specialisation of the mapping for fortran specific types. This may be a desireable approach +to take for OpenMP in the future, in particular if we find need to specialise mapping further for +descriptors or other Fortran types. However, for the moment the currently chosen implementation for OpenMP +appears sufficient as far as the OpenMP specification and current testing can show. diff --git a/flang/include/flang/Optimizer/CodeGen/CodeGenOpenMP.h b/flang/include/flang/Optimizer/CodeGen/CodeGenOpenMP.h new file mode 100644 index 0000000000000..1832d4967b7ec --- /dev/null +++ b/flang/include/flang/Optimizer/CodeGen/CodeGenOpenMP.h @@ -0,0 +1,26 @@ +//===------- Optimizer/CodeGen/CodeGenOpenMP.h - OpenMP codegen -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#ifndef FORTRAN_OPTIMIZER_CODEGEN_CODEGENOPENMP_H +#define FORTRAN_OPTIMIZER_CODEGEN_CODEGENOPENMP_H + +#include "mlir/Pass/Pass.h" +#include "mlir/Pass/PassRegistry.h" + +namespace fir { +class LLVMTypeConverter; + +/// Specialised conversion patterns of OpenMP operations for FIR to LLVM +/// dialect, utilised in cases where the default OpenMP dialect handling cannot +/// handle all cases for intermingled fir types and operations. +void populateOpenMPFIRToLLVMConversionPatterns( + LLVMTypeConverter &converter, mlir::RewritePatternSet &patterns); + +} // namespace fir + +#endif // FORTRAN_OPTIMIZER_CODEGEN_CODEGENOPENMP_H diff --git a/flang/include/flang/Optimizer/Dialect/FIRType.h b/flang/include/flang/Optimizer/Dialect/FIRType.h index 0fb8e6a442a32..a526b4ddf3b98 100644 --- a/flang/include/flang/Optimizer/Dialect/FIRType.h +++ b/flang/include/flang/Optimizer/Dialect/FIRType.h @@ -321,6 +321,9 @@ bool isBoxNone(mlir::Type ty); /// e.g. !fir.box> bool isBoxedRecordType(mlir::Type ty); +/// Return true iff `ty` is a type that contains descriptor information. +bool isTypeWithDescriptor(mlir::Type ty); + /// Return true iff `ty` is a scalar boxed record type. /// e.g. !fir.box> /// !fir.box>> diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h index 6970da8698ae8..aefb277f7966b 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.h +++ b/flang/include/flang/Optimizer/Transforms/Passes.h @@ -76,6 +76,7 @@ std::unique_ptr createAlgebraicSimplificationPass(const mlir::GreedyRewriteConfig &config); std::unique_ptr createPolymorphicOpConversionPass(); +std::unique_ptr createOMPDescriptorMapInfoGenPass(); std::unique_ptr createOMPFunctionFilteringPass(); std::unique_ptr> createOMPMarkDeclareTargetPass(); diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index e3c45d41f04cc..270b83774bcbf 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -318,6 +318,18 @@ def LoopVersioning : Pass<"loop-versioning", "mlir::func::FuncOp"> { let dependentDialects = [ "fir::FIROpsDialect" ]; } +def OMPDescriptorMapInfoGenPass + : Pass<"omp-descriptor-map-info-gen", "mlir::func::FuncOp"> { + let summary = "expands OpenMP MapInfo operations containing descriptors"; + let description = [{ + Expands MapInfo operations containing descriptor types into multiple + MapInfo's for each pointer element in the descriptor that requires + explicit individual mapping by the OpenMP runtime. + }]; + let constructor = "::fir::createOMPDescriptorMapInfoGenPass()"; + let dependentDialects = ["mlir::omp::OpenMPDialect"]; +} + def OMPMarkDeclareTargetPass : Pass<"omp-mark-declare-target", "mlir::ModuleOp"> { let summary = "Marks all functions called by an OpenMP declare target function as declare target"; diff --git a/flang/include/flang/Tools/CLOptions.inc b/flang/include/flang/Tools/CLOptions.inc index 96d3869cd0939..8dee3074a5d4f 100644 --- a/flang/include/flang/Tools/CLOptions.inc +++ b/flang/include/flang/Tools/CLOptions.inc @@ -274,6 +274,7 @@ inline void createHLFIRToFIRPassPipeline( /// rather than the host device. inline void createOpenMPFIRPassPipeline( mlir::PassManager &pm, bool isTargetDevice) { + pm.addPass(fir::createOMPDescriptorMapInfoGenPass()); pm.addPass(fir::createOMPMarkDeclareTargetPass()); if (isTargetDevice) pm.addPass(fir::createOMPFunctionFilteringPass()); diff --git a/flang/lib/Lower/OpenMP.cpp b/flang/lib/Lower/OpenMP.cpp index be2117efbabc0..0a68aba162618 100644 --- a/flang/lib/Lower/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP.cpp @@ -1821,27 +1821,25 @@ bool ClauseProcessor::processLink( static mlir::omp::MapInfoOp createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc, - mlir::Value baseAddr, std::stringstream &name, - mlir::SmallVector bounds, uint64_t mapType, - mlir::omp::VariableCaptureKind mapCaptureType, - mlir::Type retTy) { - mlir::Value varPtr, varPtrPtr; - mlir::TypeAttr varType; - + mlir::Value baseAddr, mlir::Value varPtrPtr, std::string name, + mlir::SmallVector bounds, + mlir::SmallVector members, uint64_t mapType, + mlir::omp::VariableCaptureKind mapCaptureType, mlir::Type retTy, + bool isVal = false) { if (auto boxTy = baseAddr.getType().dyn_cast()) { baseAddr = builder.create(loc, baseAddr); retTy = baseAddr.getType(); } - varPtr = baseAddr; - varType = mlir::TypeAttr::get( + mlir::TypeAttr varType = mlir::TypeAttr::get( llvm::cast(retTy).getElementType()); mlir::omp::MapInfoOp op = builder.create( - loc, retTy, varPtr, varType, varPtrPtr, bounds, + loc, retTy, baseAddr, varType, varPtrPtr, members, bounds, builder.getIntegerAttr(builder.getIntegerType(64, false), mapType), builder.getAttr(mapCaptureType), - builder.getStringAttr(name.str())); + builder.getStringAttr(name)); + return op; } @@ -1904,6 +1902,7 @@ bool ClauseProcessor::processMap( std::get(mapClause->v.t).v) { llvm::SmallVector bounds; std::stringstream asFortran; + Fortran::lower::AddrAndBoundsInfo info = Fortran::lower::gatherDataOperandAddrAndBounds< Fortran::parser::OmpObject, mlir::omp::DataBoundsOp, @@ -1911,21 +1910,29 @@ bool ClauseProcessor::processMap( converter, firOpBuilder, semanticsContext, stmtCtx, ompObject, clauseLocation, asFortran, bounds, treatIndexAsSection); + auto origSymbol = + converter.getSymbolAddress(*getOmpObjectSymbol(ompObject)); + mlir::Value symAddr = info.addr; + if (origSymbol && fir::isTypeWithDescriptor(origSymbol.getType())) + symAddr = origSymbol; + // Explicit map captures are captured ByRef by default, // optimisation passes may alter this to ByCopy or other capture // types to optimise mlir::Value mapOp = createMapInfoOp( - firOpBuilder, clauseLocation, info.addr, asFortran, bounds, + firOpBuilder, clauseLocation, symAddr, mlir::Value{}, + asFortran.str(), bounds, {}, static_cast< std::underlying_type_t>( mapTypeBits), - mlir::omp::VariableCaptureKind::ByRef, info.addr.getType()); + mlir::omp::VariableCaptureKind::ByRef, symAddr.getType()); mapOperands.push_back(mapOp); if (mapSymTypes) - mapSymTypes->push_back(info.addr.getType()); + mapSymTypes->push_back(symAddr.getType()); if (mapSymLocs) - mapSymLocs->push_back(info.addr.getLoc()); + mapSymLocs->push_back(symAddr.getLoc()); + if (mapSymbols) mapSymbols->push_back(getOmpObjectSymbol(ompObject)); } @@ -2032,12 +2039,22 @@ bool ClauseProcessor::processMotionClauses( converter, firOpBuilder, semanticsContext, stmtCtx, ompObject, clauseLocation, asFortran, bounds, treatIndexAsSection); + auto origSymbol = + converter.getSymbolAddress(*getOmpObjectSymbol(ompObject)); + mlir::Value symAddr = info.addr; + if (origSymbol && fir::isTypeWithDescriptor(origSymbol.getType())) + symAddr = origSymbol; + + // Explicit map captures are captured ByRef by default, + // optimisation passes may alter this to ByCopy or other capture + // types to optimise mlir::Value mapOp = createMapInfoOp( - firOpBuilder, clauseLocation, info.addr, asFortran, bounds, + firOpBuilder, clauseLocation, symAddr, mlir::Value{}, + asFortran.str(), bounds, {}, static_cast< std::underlying_type_t>( mapTypeBits), - mlir::omp::VariableCaptureKind::ByRef, info.addr.getType()); + mlir::omp::VariableCaptureKind::ByRef, symAddr.getType()); mapOperands.push_back(mapOp); } @@ -2812,7 +2829,8 @@ static void genBodyOfTargetOp( std::stringstream name; firOpBuilder.setInsertionPoint(targetOp); mlir::Value mapOp = createMapInfoOp( - firOpBuilder, copyVal.getLoc(), copyVal, name, bounds, + firOpBuilder, copyVal.getLoc(), copyVal, mlir::Value{}, name.str(), + bounds, llvm::SmallVector{}, static_cast< std::underlying_type_t>( llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT), @@ -2934,18 +2952,21 @@ genTargetOp(Fortran::lower::AbstractConverter &converter, llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT; mlir::omp::VariableCaptureKind captureKind = mlir::omp::VariableCaptureKind::ByRef; - if (auto refType = baseOp.getType().dyn_cast()) { - auto eleType = refType.getElementType(); - if (fir::isa_trivial(eleType) || fir::isa_char(eleType)) { - captureKind = mlir::omp::VariableCaptureKind::ByCopy; - } else if (!fir::isa_builtin_cptr_type(eleType)) { - mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; - mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; - } + + mlir::Type eleType = baseOp.getType(); + if (auto refType = baseOp.getType().dyn_cast()) + eleType = refType.getElementType(); + + if (fir::isa_trivial(eleType) || fir::isa_char(eleType)) { + captureKind = mlir::omp::VariableCaptureKind::ByCopy; + } else if (!fir::isa_builtin_cptr_type(eleType)) { + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; } mlir::Value mapOp = createMapInfoOp( - converter.getFirOpBuilder(), baseOp.getLoc(), baseOp, name, bounds, + converter.getFirOpBuilder(), baseOp.getLoc(), baseOp, mlir::Value{}, + name.str(), bounds, {}, static_cast< std::underlying_type_t>( mapFlag), diff --git a/flang/lib/Optimizer/CodeGen/CMakeLists.txt b/flang/lib/Optimizer/CodeGen/CMakeLists.txt index 0daa97b00dfa0..175ab9fefda2a 100644 --- a/flang/lib/Optimizer/CodeGen/CMakeLists.txt +++ b/flang/lib/Optimizer/CodeGen/CMakeLists.txt @@ -2,6 +2,7 @@ add_flang_library(FIRCodeGen BoxedProcedure.cpp CGOps.cpp CodeGen.cpp + CodeGenOpenMP.cpp PreCGRewrite.cpp TBAABuilder.cpp Target.cpp diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index 8b0d47ec08ec3..f89f28c006dec 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -13,6 +13,7 @@ #include "flang/Optimizer/CodeGen/CodeGen.h" #include "CGOps.h" +#include "flang/Optimizer/CodeGen/CodeGenOpenMP.h" #include "flang/Optimizer/Dialect/FIRAttr.h" #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" @@ -3959,6 +3960,11 @@ class FIRToLLVMLowering mlir::populateMathToLibmConversionPatterns(pattern); mlir::populateComplexToLLVMConversionPatterns(typeConverter, pattern); mlir::populateVectorToLLVMConversionPatterns(typeConverter, pattern); + + // Flang specific overloads for OpenMP operations, to allow for special + // handling of things like Box types. + fir::populateOpenMPFIRToLLVMConversionPatterns(typeConverter, pattern); + mlir::ConversionTarget target{*context}; target.addLegalDialect(); // The OpenMP dialect is legal for Operations without regions, for those diff --git a/flang/lib/Optimizer/CodeGen/CodeGenOpenMP.cpp b/flang/lib/Optimizer/CodeGen/CodeGenOpenMP.cpp new file mode 100644 index 0000000000000..a6fa05fe06542 --- /dev/null +++ b/flang/lib/Optimizer/CodeGen/CodeGenOpenMP.cpp @@ -0,0 +1,98 @@ +//===-- CodeGenOpenMP.cpp -------------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Coding style: https://mlir.llvm.org/getting_started/DeveloperGuide/ +// +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/CodeGen/CodeGenOpenMP.h" + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/LowLevelIntrinsics.h" +#include "flang/Optimizer/CodeGen/CodeGen.h" +#include "flang/Optimizer/Dialect/FIRDialect.h" +#include "flang/Optimizer/Dialect/FIROps.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/Dialect/Support/FIRContext.h" +#include "flang/Optimizer/Support/FatalError.h" +#include "flang/Optimizer/Support/InternalNames.h" +#include "mlir/Conversion/LLVMCommon/ConversionTarget.h" +#include "mlir/Conversion/LLVMCommon/Pattern.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Transforms/DialectConversion.h" + +using namespace fir; + +#define DEBUG_TYPE "flang-codegen-openmp" + +// fir::LLVMTypeConverter for converting to LLVM IR dialect types. +#include "flang/Optimizer/CodeGen/TypeConverter.h" + +namespace { +/// A pattern that converts the region arguments in a single-region OpenMP +/// operation to the LLVM dialect. The body of the region is not modified and is +/// expected to either be processed by the conversion infrastructure or already +/// contain ops compatible with LLVM dialect types. +template +class OpenMPFIROpConversion : public mlir::ConvertOpToLLVMPattern { +public: + explicit OpenMPFIROpConversion(const fir::LLVMTypeConverter &lowering) + : mlir::ConvertOpToLLVMPattern(lowering) {} + + const fir::LLVMTypeConverter &lowerTy() const { + return *static_cast( + this->getTypeConverter()); + } +}; + +// FIR Op specific conversion for MapInfoOp that overwrites the default OpenMP +// Dialect lowering, this allows FIR specific lowering of types, required for +// descriptors of allocatables currently. +struct MapInfoOpConversion + : public OpenMPFIROpConversion { + using OpenMPFIROpConversion::OpenMPFIROpConversion; + + mlir::LogicalResult + matchAndRewrite(mlir::omp::MapInfoOp curOp, OpAdaptor adaptor, + mlir::ConversionPatternRewriter &rewriter) const override { + const mlir::TypeConverter *converter = getTypeConverter(); + llvm::SmallVector resTypes; + if (failed(converter->convertTypes(curOp->getResultTypes(), resTypes))) + return mlir::failure(); + + llvm::SmallVector newAttrs; + mlir::omp::MapInfoOp newOp; + for (mlir::NamedAttribute attr : curOp->getAttrs()) { + if (auto typeAttr = mlir::dyn_cast(attr.getValue())) { + mlir::Type newAttr; + if (fir::isTypeWithDescriptor(typeAttr.getValue())) { + newAttr = lowerTy().convertBoxTypeAsStruct( + mlir::cast(typeAttr.getValue())); + } else { + newAttr = converter->convertType(typeAttr.getValue()); + } + newAttrs.emplace_back(attr.getName(), mlir::TypeAttr::get(newAttr)); + } else { + newAttrs.push_back(attr); + } + } + + rewriter.replaceOpWithNewOp( + curOp, resTypes, adaptor.getOperands(), newAttrs); + + return mlir::success(); + } +}; +} // namespace + +void fir::populateOpenMPFIRToLLVMConversionPatterns( + LLVMTypeConverter &converter, mlir::RewritePatternSet &patterns) { + patterns.add(converter); +} diff --git a/flang/lib/Optimizer/Dialect/FIRType.cpp b/flang/lib/Optimizer/Dialect/FIRType.cpp index 9c8812276a0a4..8a2c681d95860 100644 --- a/flang/lib/Optimizer/Dialect/FIRType.cpp +++ b/flang/lib/Optimizer/Dialect/FIRType.cpp @@ -334,6 +334,12 @@ bool isAllocatableOrPointerArray(mlir::Type ty) { return false; } +bool isTypeWithDescriptor(mlir::Type ty) { + if (mlir::isa(unwrapRefType(ty))) + return true; + return false; +} + bool isPolymorphicType(mlir::Type ty) { // CLASS(T) or CLASS(*) if (mlir::isa(fir::unwrapRefType(ty))) diff --git a/flang/lib/Optimizer/Transforms/CMakeLists.txt b/flang/lib/Optimizer/Transforms/CMakeLists.txt index fc067ad358539..ba2e267996150 100644 --- a/flang/lib/Optimizer/Transforms/CMakeLists.txt +++ b/flang/lib/Optimizer/Transforms/CMakeLists.txt @@ -17,6 +17,7 @@ add_flang_library(FIRTransforms AddDebugFoundation.cpp PolymorphicOpConversion.cpp LoopVersioning.cpp + OMPDescriptorMapInfoGen.cpp OMPFunctionFiltering.cpp OMPMarkDeclareTarget.cpp VScaleAttr.cpp diff --git a/flang/lib/Optimizer/Transforms/OMPDescriptorMapInfoGen.cpp b/flang/lib/Optimizer/Transforms/OMPDescriptorMapInfoGen.cpp new file mode 100644 index 0000000000000..6ffcf0746c76f --- /dev/null +++ b/flang/lib/Optimizer/Transforms/OMPDescriptorMapInfoGen.cpp @@ -0,0 +1,168 @@ +//===- OMPDescriptorMapInfoGen.cpp +//---------------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +//===----------------------------------------------------------------------===// +/// \file +/// An OpenMP dialect related pass for FIR/HLFIR which expands MapInfoOp's +/// containing descriptor related types (fir::BoxType's) into multiple +/// MapInfoOp's containing the parent descriptor and pointer member components +/// for individual mapping, treating the descriptor type as a record type for +/// later lowering in the OpenMP dialect. +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/Dialect/Support/KindMapping.h" +#include "flang/Optimizer/Transforms/Passes.h" +#include "mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/IR/BuiltinDialect.h" +#include "mlir/IR/BuiltinOps.h" +#include "mlir/IR/Operation.h" +#include "mlir/IR/SymbolTable.h" +#include "mlir/Pass/Pass.h" +#include "mlir/Support/LLVM.h" +#include "llvm/ADT/SmallPtrSet.h" +#include + +namespace fir { +#define GEN_PASS_DEF_OMPDESCRIPTORMAPINFOGENPASS +#include "flang/Optimizer/Transforms/Passes.h.inc" +} // namespace fir + +namespace { +class OMPDescriptorMapInfoGenPass + : public fir::impl::OMPDescriptorMapInfoGenPassBase< + OMPDescriptorMapInfoGenPass> { + + void genDescriptorMemberMaps(mlir::omp::MapInfoOp op, + fir::FirOpBuilder &builder, + mlir::Operation *target) { + mlir::Location loc = builder.getUnknownLoc(); + mlir::Value descriptor = op.getVarPtr(); + + // If we enter this function, but the mapped type itself is not the + // descriptor, then it's likely the address of the descriptor so we + // must retrieve the descriptor SSA. + if (!fir::isTypeWithDescriptor(op.getVarType())) { + if (auto addrOp = mlir::dyn_cast_if_present( + op.getVarPtr().getDefiningOp())) { + descriptor = addrOp.getVal(); + } + } + + // The fir::BoxOffsetOp only works with !fir.ref> types, as + // allowing it to access non-reference box operations can cause some + // problematic SSA IR. However, in the case of assumed shape's the type + // is not a !fir.ref, in these cases to retrieve the appropriate + // !fir.ref> to access the data we need to map we must + // perform an alloca and then store to it and retrieve the data from the new + // alloca. + if (mlir::isa(descriptor.getType())) { + mlir::OpBuilder::InsertPoint insPt = builder.saveInsertionPoint(); + builder.setInsertionPointToStart(builder.getAllocaBlock()); + auto alloca = builder.create(loc, descriptor.getType()); + builder.restoreInsertionPoint(insPt); + builder.create(loc, descriptor, alloca); + descriptor = alloca; + } + + mlir::Value baseAddrAddr = builder.create( + loc, descriptor, fir::BoxFieldAttr::base_addr); + + // Member of the descriptor pointing at the allocated data + mlir::Value baseAddr = builder.create( + loc, baseAddrAddr.getType(), descriptor, + llvm::cast( + fir::unwrapRefType(baseAddrAddr.getType())) + .getElementType(), + baseAddrAddr, mlir::SmallVector{}, op.getBounds(), + builder.getIntegerAttr(builder.getIntegerType(64, false), + op.getMapType().value()), + builder.getAttr( + mlir::omp::VariableCaptureKind::ByRef), + builder.getStringAttr("") /*name*/); + + // TODO: map the addendum segment of the descriptor, similarly to the + // above base address/data pointer member. + + if (auto mapClauseOwner = + llvm::dyn_cast(target)) { + llvm::SmallVector newMapOps; + mlir::OperandRange mapOperandsArr = mapClauseOwner.getMapOperands(); + + for (size_t i = 0; i < mapOperandsArr.size(); ++i) { + if (mapOperandsArr[i] == op) { + // Push new implicit maps generated for the descriptor. + newMapOps.push_back(baseAddr); + + // for TargetOp's which have IsolatedFromAbove we must align the + // new additional map operand with an appropriate BlockArgument, + // as the printing and later processing currently requires a 1:1 + // mapping of BlockArgs to MapInfoOp's at the same placement in + // each array (BlockArgs and MapOperands). + if (auto targetOp = llvm::dyn_cast(target)) + targetOp.getRegion().insertArgument(i, baseAddr.getType(), loc); + } + newMapOps.push_back(mapOperandsArr[i]); + } + mapClauseOwner.getMapOperandsMutable().assign(newMapOps); + } + + mlir::Value newDescParentMapOp = builder.create( + op->getLoc(), op.getResult().getType(), descriptor, + fir::unwrapRefType(descriptor.getType()), mlir::Value{}, + mlir::SmallVector{baseAddr}, + mlir::SmallVector{}, + builder.getIntegerAttr(builder.getIntegerType(64, false), + op.getMapType().value()), + op.getMapCaptureTypeAttr(), op.getNameAttr()); + op.replaceAllUsesWith(newDescParentMapOp); + op->erase(); + } + + // This pass executes on mlir::ModuleOp's finding omp::MapInfoOp's containing + // descriptor based types (allocatables, pointers, assumed shape etc.) and + // expanding them into multiple omp::MapInfoOp's for each pointer member + // contained within the descriptor. + void runOnOperation() override { + mlir::func::FuncOp func = getOperation(); + mlir::ModuleOp module = func->getParentOfType(); + fir::KindMapping kindMap = fir::getKindMapping(module); + fir::FirOpBuilder builder{module, std::move(kindMap)}; + + func->walk([&](mlir::omp::MapInfoOp op) { + if (fir::isTypeWithDescriptor(op.getVarType()) || + mlir::isa_and_present( + op.getVarPtr().getDefiningOp())) { + builder.setInsertionPoint(op); + // TODO: Currently only supports a single user for the MapInfoOp, this + // is fine for the moment as the Fortran Frontend will generate a + // new MapInfoOp per Target operation for the moment. However, when/if + // we optimise/cleanup the IR, it likely isn't too difficult to + // extend this function, it would require some modification to create a + // single new MapInfoOp per new MapInfoOp generated and share it across + // all users appropriately, making sure to only add a single member link + // per new generation for the original originating descriptor MapInfoOp. + assert(llvm::hasSingleElement(op->getUsers()) && + "OMPDescriptorMapInfoGen currently only supports single users " + "of a MapInfoOp"); + genDescriptorMemberMaps(op, builder, *op->getUsers().begin()); + } + }); + } +}; + +} // namespace + +namespace fir { +std::unique_ptr createOMPDescriptorMapInfoGenPass() { + return std::make_unique(); +} +} // namespace fir diff --git a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir index 6efa4d0a09586..beb399ec3ac05 100644 --- a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir +++ b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir @@ -893,3 +893,22 @@ func.func @omp_critical_() { } return } + +// ----- + +// CHECK-LABEL: llvm.func @omp_map_info_descriptor_type_conversion +// CHECK-SAME: %[[ARG_0:.*]]: !llvm.ptr) + +func.func @omp_map_info_descriptor_type_conversion(%arg0 : !fir.ref>>) { + // CHECK: %[[GEP:.*]] = llvm.getelementptr %[[ARG_0]][0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> + %0 = fir.box_offset %arg0 base_addr : (!fir.ref>>) -> !fir.llvm_ptr> + // CHECK: %[[MEMBER_MAP:.*]] = omp.map_info var_ptr(%[[GEP]] : !llvm.ptr, i32) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""} + %1 = omp.map_info var_ptr(%0 : !fir.llvm_ptr>, i32) map_clauses(tofrom) capture(ByRef) -> !fir.llvm_ptr> {name = ""} + // CHECK: %[[DESC_MAP:.*]] = omp.map_info var_ptr(%[[ARG_0]] : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(always, delete) capture(ByRef) members(%[[MEMBER_MAP]] : !llvm.ptr) -> !llvm.ptr {name = ""} + %2 = omp.map_info var_ptr(%arg0 : !fir.ref>>, !fir.box>) map_clauses(always, delete) capture(ByRef) members(%1 : !fir.llvm_ptr>) -> !fir.ref>> {name = ""} + // CHECK: omp.target_exit_data map_entries(%[[DESC_MAP]] : !llvm.ptr) + omp.target_exit_data map_entries(%2 : !fir.ref>>) + return +} + +// ----- diff --git a/flang/test/Integration/OpenMP/map-types-and-sizes.f90 b/flang/test/Integration/OpenMP/map-types-and-sizes.f90 index f0a0e5e765b41..7c438302e6398 100644 --- a/flang/test/Integration/OpenMP/map-types-and-sizes.f90 +++ b/flang/test/Integration/OpenMP/map-types-and-sizes.f90 @@ -30,8 +30,8 @@ subroutine mapType_array !$omp end target end subroutine mapType_array -!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [1 x i64] [i64 8] -!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [1 x i64] [i64 547] +!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [3 x i64] [i64 0, i64 24, i64 4] +!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [3 x i64] [i64 32, i64 281474976710657, i64 281474976711187] subroutine mapType_ptr integer, pointer :: a !$omp target @@ -39,6 +39,37 @@ subroutine mapType_ptr !$omp end target end subroutine mapType_ptr +!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [3 x i64] [i64 0, i64 24, i64 4] +!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [3 x i64] [i64 32, i64 281474976710657, i64 281474976711187] +subroutine mapType_allocatable + integer, allocatable :: a + allocate(a) + !$omp target + a = 10 + !$omp end target + deallocate(a) +end subroutine mapType_allocatable + +!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [3 x i64] [i64 0, i64 24, i64 4] +!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [3 x i64] [i64 32, i64 281474976710657, i64 281474976710675] +subroutine mapType_ptr_explicit + integer, pointer :: a + !$omp target map(tofrom: a) + a = 10 + !$omp end target +end subroutine mapType_ptr_explicit + +!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [3 x i64] [i64 0, i64 24, i64 4] +!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [3 x i64] [i64 32, i64 281474976710657, i64 281474976710675] +subroutine mapType_allocatable_explicit + integer, allocatable :: a + allocate(a) + !$omp target map(tofrom: a) + a = 10 + !$omp end target + deallocate(a) +end subroutine mapType_allocatable_explicit + !CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [2 x i64] [i64 8, i64 4] !CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [2 x i64] [i64 544, i64 800] subroutine mapType_c_ptr @@ -58,3 +89,24 @@ subroutine mapType_char a = 'b' !$omp end target end subroutine mapType_char + +!CHECK-LABEL: define void @maptype_ptr_explicit_() { +!CHECK: %[[ALLOCA:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1, align 8 +!CHECK: %[[ALLOCA_GEP:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 }, ptr %[[ALLOCA]], i32 1 +!CHECK: %[[ALLOCA_GEP_INT:.*]] = ptrtoint ptr %[[ALLOCA_GEP]] to i64 +!CHECK: %[[ALLOCA_INT:.*]] = ptrtoint ptr %[[ALLOCA]] to i64 +!CHECK: %[[SIZE_DIFF:.*]] = sub i64 %[[ALLOCA_GEP_INT]], %[[ALLOCA_INT]] +!CHECK: %[[DIV:.*]] = sdiv exact i64 %[[SIZE_DIFF]], ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64) +!CHECK: %[[OFFLOAD_SIZE_ARR:.*]] = getelementptr inbounds [3 x i64], ptr %.offload_sizes, i32 0, i32 0 +!CHECK: store i64 %[[DIV]], ptr %[[OFFLOAD_SIZE_ARR]], align 8 + + +!CHECK-LABEL: define void @maptype_allocatable_explicit_() { +!CHECK: %[[ALLOCA:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1, align 8 +!CHECK: %[[ALLOCA_GEP:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 }, ptr %[[ALLOCA]], i32 1 +!CHECK: %[[ALLOCA_GEP_INT:.*]] = ptrtoint ptr %[[ALLOCA_GEP]] to i64 +!CHECK: %[[ALLOCA_INT:.*]] = ptrtoint ptr %[[ALLOCA]] to i64 +!CHECK: %[[SIZE_DIFF:.*]] = sub i64 %[[ALLOCA_GEP_INT]], %[[ALLOCA_INT]] +!CHECK: %[[DIV:.*]] = sdiv exact i64 %[[SIZE_DIFF]], ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64) +!CHECK: %[[OFFLOAD_SIZE_ARR:.*]] = getelementptr inbounds [3 x i64], ptr %.offload_sizes, i32 0, i32 0 +!CHECK: store i64 %[[DIV]], ptr %[[OFFLOAD_SIZE_ARR]], align 8 diff --git a/flang/test/Lower/OpenMP/FIR/array-bounds.f90 b/flang/test/Lower/OpenMP/FIR/array-bounds.f90 index 0e0aeaed0b553..3cd284c46e727 100644 --- a/flang/test/Lower/OpenMP/FIR/array-bounds.f90 +++ b/flang/test/Lower/OpenMP/FIR/array-bounds.f90 @@ -35,6 +35,7 @@ module assumed_array_routines contains !ALL-LABEL: func.func @_QMassumed_array_routinesPassumed_shape_array( !ALL-SAME: %[[ARG0:.*]]: !fir.box> {fir.bindc_name = "arr_read_write"}) +!ALL: %[[INTERMEDIATE_ALLOCA:.*]] = fir.alloca !fir.box> !ALL: %[[ALLOCA:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QMassumed_array_routinesFassumed_shape_arrayEi"} !ALL: %[[C0:.*]] = arith.constant 1 : index !ALL: %[[C1:.*]] = arith.constant 0 : index @@ -44,20 +45,20 @@ module assumed_array_routines !ALL: %[[C0_1:.*]] = arith.constant 0 : index !ALL: %[[DIMS1:.*]]:3 = fir.box_dims %arg0, %[[C0_1]] : (!fir.box>, index) -> (index, index, index) !ALL: %[[BOUNDS:.*]] = omp.bounds lower_bound(%[[C3]] : index) upper_bound(%[[C4]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) start_idx(%[[C0]] : index) {stride_in_bytes = true} -!ALL: %[[ADDROF:.*]] = fir.box_addr %arg0 : (!fir.box>) -> !fir.ref> -!ALL: %[[MAP:.*]] = omp.map_info var_ptr(%[[ADDROF]] : !fir.ref>, !fir.array) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.ref> {name = "arr_read_write(2:5)"} +!ALL: %[[BOXADDRADDR:.*]] = fir.box_offset %0 base_addr : (!fir.ref>>) -> !fir.llvm_ptr>> +!ALL: %[[MAP_MEMBER:.*]] = omp.map_info var_ptr(%0 : !fir.ref>>, !fir.array) var_ptr_ptr(%[[BOXADDRADDR]] : !fir.llvm_ptr>>) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.llvm_ptr>> {name = ""} +!ALL: %[[MAP:.*]] = omp.map_info var_ptr(%0 : !fir.ref>>, !fir.box>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_MEMBER]] : !fir.llvm_ptr>>) -> !fir.ref> {name = "arr_read_write(2:5)"} !ALL: %[[MAP2:.*]] = omp.map_info var_ptr(%[[ALLOCA]] : !fir.ref, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref {name = "i"} -!ALL: omp.target map_entries(%[[MAP]] -> %{{.*}}, %[[MAP2]] -> %{{.*}} : !fir.ref>, !fir.ref) { - +!ALL: omp.target map_entries(%[[MAP_MEMBER]] -> %{{.*}}, %[[MAP]] -> %{{.*}}, %[[MAP2]] -> %{{.*}} : !fir.llvm_ptr>>, !fir.ref>, !fir.ref) { subroutine assumed_shape_array(arr_read_write) - integer, intent(inout) :: arr_read_write(:) + integer, intent(inout) :: arr_read_write(:) !$omp target map(tofrom:arr_read_write(2:5)) do i = 2, 5 arr_read_write(i) = i end do !$omp end target - end subroutine assumed_shape_array + end subroutine assumed_shape_array !ALL-LABEL: func.func @_QMassumed_array_routinesPassumed_size_array( !ALL-SAME: %[[ARG0:.*]]: !fir.ref> {fir.bindc_name = "arr_read_write"}) @@ -71,17 +72,16 @@ end subroutine assumed_shape_array !ALL: %[[MAP:.*]] = omp.map_info var_ptr(%[[ARG0]] : !fir.ref>, !fir.array) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.ref> {name = "arr_read_write(2:5)"} !ALL: %[[MAP2:.*]] = omp.map_info var_ptr(%[[ALLOCA]] : !fir.ref, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref {name = "i"} !ALL: omp.target map_entries(%[[MAP]] -> %{{.*}}, %[[MAP2]] -> %{{.*}} : !fir.ref>, !fir.ref) { + subroutine assumed_size_array(arr_read_write) + integer, intent(inout) :: arr_read_write(*) - subroutine assumed_size_array(arr_read_write) - integer, intent(inout) :: arr_read_write(*) - - !$omp target map(tofrom:arr_read_write(2:5)) - do i = 2, 5 - arr_read_write(i) = i - end do - !$omp end target - end subroutine assumed_size_array - end module assumed_array_routines + !$omp target map(tofrom:arr_read_write(2:5)) + do i = 2, 5 + arr_read_write(i) = i + end do + !$omp end target + end subroutine assumed_size_array +end module assumed_array_routines !DEVICE-NOT:func.func @_QPcall_assumed_shape_and_size_array() { @@ -113,7 +113,6 @@ end module assumed_array_routines !HOST:fir.call @_QMassumed_array_routinesPassumed_size_array(%[[ARG1]]) fastmath : (!fir.ref>) -> () !HOST:return !HOST:} - subroutine call_assumed_shape_and_size_array use assumed_array_routines integer :: arr_read_write(20) diff --git a/flang/test/Lower/OpenMP/FIR/target.f90 b/flang/test/Lower/OpenMP/FIR/target.f90 index 5d36699bf0e90..06772771647de 100644 --- a/flang/test/Lower/OpenMP/FIR/target.f90 +++ b/flang/test/Lower/OpenMP/FIR/target.f90 @@ -450,8 +450,9 @@ end subroutine omp_target_device_ptr subroutine omp_target_device_addr integer, pointer :: a !CHECK: %[[VAL_0:.*]] = fir.alloca !fir.box> {bindc_name = "a", uniq_name = "_QFomp_target_device_addrEa"} - !CHECK: %[[MAP:.*]] = omp.map_info var_ptr({{.*}}) map_clauses(tofrom) capture(ByRef) -> {{.*}} {name = "a"} - !CHECK: omp.target_data map_entries(%[[MAP]] : {{.*}}) use_device_addr(%[[VAL_0]] : !fir.ref>>) { + !CHECK: %[[MAP_MEMBERS:.*]] = omp.map_info var_ptr({{.*}} : !fir.ref>>, i32) var_ptr_ptr({{.*}} : !fir.llvm_ptr>) map_clauses(tofrom) capture(ByRef) -> !fir.llvm_ptr> {name = ""} + !CHECK: %[[MAP:.*]] = omp.map_info var_ptr({{.*}} : !fir.ref>>, !fir.box>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_MEMBERS]] : !fir.llvm_ptr>) -> !fir.ref>> {name = "a"} + !CHECK: omp.target_data map_entries(%[[MAP_MEMBERS]], %[[MAP]] : {{.*}}) use_device_addr(%[[VAL_0]] : !fir.ref>>) { !$omp target data map(tofrom: a) use_device_addr(a) !CHECK: ^bb0(%[[VAL_1:.*]]: !fir.ref>>): !CHECK: {{.*}} = fir.load %[[VAL_1]] : !fir.ref>> diff --git a/flang/test/Lower/OpenMP/allocatable-array-bounds.f90 b/flang/test/Lower/OpenMP/allocatable-array-bounds.f90 new file mode 100644 index 0000000000000..adf74efa9b596 --- /dev/null +++ b/flang/test/Lower/OpenMP/allocatable-array-bounds.f90 @@ -0,0 +1,117 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s --check-prefixes HOST + +!HOST-LABEL: func.func @_QPread_write_section() { + +!HOST: %[[ALLOCA_1:.*]] = fir.alloca !fir.box>> {bindc_name = "sp_read", uniq_name = "_QFread_write_sectionEsp_read"} +!HOST: %[[DECLARE_1:.*]]:2 = hlfir.declare %[[ALLOCA_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFread_write_sectionEsp_read"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) + +!HOST: %[[ALLOCA_2:.*]] = fir.alloca !fir.box>> {bindc_name = "sp_write", uniq_name = "_QFread_write_sectionEsp_write"} +!HOST: %[[DECLARE_2:.*]]:2 = hlfir.declare %[[ALLOCA_2]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFread_write_sectionEsp_write"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) + +!HOST: %[[LOAD_1:.*]] = fir.load %[[DECLARE_1]]#0 : !fir.ref>>> +!HOST: %[[LOAD_2:.*]] = fir.load %[[DECLARE_1]]#1 : !fir.ref>>> +!HOST: %[[CONSTANT_1:.*]] = arith.constant 0 : index +!HOST: %[[BOX_1:.*]]:3 = fir.box_dims %[[LOAD_2]], %[[CONSTANT_1]] : (!fir.box>>, index) -> (index, index, index) +!HOST: %[[CONSTANT_2:.*]] = arith.constant 0 : index +!HOST: %[[BOX_2:.*]]:3 = fir.box_dims %[[LOAD_1]], %[[CONSTANT_2]] : (!fir.box>>, index) -> (index, index, index) +!HOST: %[[CONSTANT_3:.*]] = arith.constant 2 : index +!HOST: %[[LB_1:.*]] = arith.subi %[[CONSTANT_3]], %[[BOX_1]]#0 : index +!HOST: %[[CONSTANT_4:.*]] = arith.constant 5 : index +!HOST: %[[UB_1:.*]] = arith.subi %[[CONSTANT_4]], %[[BOX_1]]#0 : index +!HOST: %[[LOAD_3:.*]] = fir.load %[[DECLARE_1]]#1 : !fir.ref>>> +!HOST: %[[CONSTANT_3:.*]] = arith.constant 0 : index +!HOST: %[[BOX_3:.*]]:3 = fir.box_dims %[[LOAD_3]], %[[CONSTANT_3]] : (!fir.box>>, index) -> (index, index, index) +!HOST: %[[BOUNDS_1:.*]] = omp.bounds lower_bound(%[[LB_1]] : index) upper_bound(%[[UB_1]] : index) extent(%[[BOX_3]]#1 : index) stride(%[[BOX_2]]#2 : index) start_idx(%[[BOX_1]]#0 : index) {stride_in_bytes = true} +!HOST: %[[VAR_PTR_PTR:.*]] = fir.box_offset %[[DECLARE_1]]#1 base_addr : (!fir.ref>>>) -> !fir.llvm_ptr>> +!HOST: %[[MAP_INFO_MEMBER:.*]] = omp.map_info var_ptr(%[[DECLARE_1]]#1 : !fir.ref>>>, !fir.array) var_ptr_ptr(%[[VAR_PTR_PTR]] : !fir.llvm_ptr>>) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS_1]]) -> !fir.llvm_ptr>> {name = ""} +!HOST: %[[MAP_INFO_1:.*]] = omp.map_info var_ptr(%[[DECLARE_1]]#1 : !fir.ref>>>, !fir.box>>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_INFO_MEMBER]] : !fir.llvm_ptr>>) -> !fir.ref>>> {name = "sp_read(2:5)"} + +!HOST: %[[LOAD_3:.*]] = fir.load %[[DECLARE_2]]#0 : !fir.ref>>> +!HOST: %[[LOAD_4:.*]] = fir.load %[[DECLARE_2]]#1 : !fir.ref>>> +!HOST: %[[CONSTANT_5:.*]] = arith.constant 0 : index +!HOST: %[[BOX_3:.*]]:3 = fir.box_dims %[[LOAD_4]], %[[CONSTANT_5]] : (!fir.box>>, index) -> (index, index, index) +!HOST: %[[CONSTANT_6:.*]] = arith.constant 0 : index +!HOST: %[[BOX_4:.*]]:3 = fir.box_dims %[[LOAD_3]], %[[CONSTANT_6]] : (!fir.box>>, index) -> (index, index, index) +!HOST: %[[CONSTANT_7:.*]] = arith.constant 2 : index +!HOST: %[[LB_2:.*]] = arith.subi %[[CONSTANT_7]], %[[BOX_3]]#0 : index +!HOST: %[[CONSTANT_8:.*]] = arith.constant 5 : index +!HOST: %[[UB_2:.*]] = arith.subi %[[CONSTANT_8]], %[[BOX_3]]#0 : index +!HOST: %[[LOAD_5:.*]] = fir.load %[[DECLARE_2]]#1 : !fir.ref>>> +!HOST: %[[CONSTANT_5:.*]] = arith.constant 0 : index +!HOST: %[[BOX_5:.*]]:3 = fir.box_dims %[[LOAD_5]], %[[CONSTANT_5]] : (!fir.box>>, index) -> (index, index, index) +!HOST: %[[BOUNDS_2:.*]] = omp.bounds lower_bound(%[[LB_2]] : index) upper_bound(%[[UB_2]] : index) extent(%[[BOX_5]]#1 : index) stride(%[[BOX_4]]#2 : index) start_idx(%[[BOX_3]]#0 : index) {stride_in_bytes = true} +!HOST: %[[VAR_PTR_PTR:.*]] = fir.box_offset %[[DECLARE_2]]#1 base_addr : (!fir.ref>>>) -> !fir.llvm_ptr>> +!HOST: %[[MAP_INFO_MEMBER:.*]] = omp.map_info var_ptr(%[[DECLARE_2]]#1 : !fir.ref>>>, !fir.array) var_ptr_ptr(%[[VAR_PTR_PTR]] : !fir.llvm_ptr>>) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS_2]]) -> !fir.llvm_ptr>> {name = ""} +!HOST: %[[MAP_INFO_2:.*]] = omp.map_info var_ptr(%[[DECLARE_2]]#1 : !fir.ref>>>, !fir.box>>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_INFO_MEMBER]] : !fir.llvm_ptr>>) -> !fir.ref>>> {name = "sp_write(2:5)"} + +subroutine read_write_section() + integer, allocatable :: sp_read(:) + integer, allocatable :: sp_write(:) + allocate(sp_read(10)) + allocate(sp_write(10)) + sp_write = (/0,0,0,0,0,0,0,0,0,0/) + sp_read = (/1,2,3,4,5,6,7,8,9,10/) + +!$omp target map(tofrom:sp_read(2:5)) map(tofrom:sp_write(2:5)) + do i = 2, 5 + sp_write(i) = sp_read(i) + end do +!$omp end target +end subroutine read_write_section + +module assumed_allocatable_array_routines + contains + +!HOST-LABEL: func.func @_QMassumed_allocatable_array_routinesPassumed_shape_array( + +!HOST: %[[DECLARE:.*]]:2 = hlfir.declare %[[ARG:.*]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QMassumed_allocatable_array_routinesFassumed_shape_arrayEarr_read_write"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) +!HOST: %[[LOAD_1:.*]] = fir.load %[[DECLARE]]#0 : !fir.ref>>> +!HOST: %[[LOAD_2:.*]] = fir.load %[[DECLARE]]#1 : !fir.ref>>> +!HOST: %[[CONSTANT_1:.*]] = arith.constant 0 : index +!HOST: %[[BOX_1:.*]]:3 = fir.box_dims %[[LOAD_2]], %[[CONSTANT_1]] : (!fir.box>>, index) -> (index, index, index) +!HOST: %[[CONSTANT_2:.*]] = arith.constant 0 : index +!HOST: %[[BOX_2:.*]]:3 = fir.box_dims %[[LOAD_1]], %[[CONSTANT_2]] : (!fir.box>>, index) -> (index, index, index) +!HOST: %[[CONSTANT_3:.*]] = arith.constant 2 : index +!HOST: %[[LB:.*]] = arith.subi %[[CONSTANT_3]], %[[BOX_1]]#0 : index +!HOST: %[[CONSTANT_4:.*]] = arith.constant 5 : index +!HOST: %[[UB:.*]] = arith.subi %[[CONSTANT_4]], %[[BOX_1]]#0 : index +!HOST: %[[LOAD_3:.*]] = fir.load %[[DECLARE]]#1 : !fir.ref>>> +!HOST: %[[CONSTANT_3:.*]] = arith.constant 0 : index +!HOST: %[[BOX_3:.*]]:3 = fir.box_dims %[[LOAD_3]], %[[CONSTANT_3]] : (!fir.box>>, index) -> (index, index, index) +!HOST: %[[BOUNDS:.*]] = omp.bounds lower_bound(%[[LB]] : index) upper_bound(%[[UB]] : index) extent(%[[BOX_3]]#1 : index) stride(%[[BOX_2]]#2 : index) start_idx(%[[BOX_1]]#0 : index) {stride_in_bytes = true} +!HOST: %[[VAR_PTR_PTR:.*]] = fir.box_offset %[[DECLARE]]#1 base_addr : (!fir.ref>>>) -> !fir.llvm_ptr>> +!HOST: %[[MAP_INFO_MEMBER:.*]] = omp.map_info var_ptr(%[[DECLARE]]#1 : !fir.ref>>>, !fir.array) var_ptr_ptr(%[[VAR_PTR_PTR]] : !fir.llvm_ptr>>) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.llvm_ptr>> {name = ""} +!HOST: %[[MAP_INFO:.*]] = omp.map_info var_ptr(%[[DECLARE]]#1 : !fir.ref>>>, !fir.box>>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_INFO_MEMBER]] : !fir.llvm_ptr>>) -> !fir.ref>>> {name = "arr_read_write(2:5)"} +subroutine assumed_shape_array(arr_read_write) + integer, allocatable, intent(inout) :: arr_read_write(:) + +!$omp target map(tofrom:arr_read_write(2:5)) + do i = 2, 5 + arr_read_write(i) = i + end do +!$omp end target +end subroutine assumed_shape_array +end module assumed_allocatable_array_routines + +!HOST-LABEL: func.func @_QPcall_assumed_shape_and_size_array() { +!HOST: %[[ALLOCA:.*]] = fir.alloca !fir.box>> {bindc_name = "arr_read_write", uniq_name = "_QFcall_assumed_shape_and_size_arrayEarr_read_write"} +!HOST: %[[DECLARE:.*]]:2 = hlfir.declare %[[ALLOCA]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFcall_assumed_shape_and_size_arrayEarr_read_write"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) +!HOST: %[[ALLOCA_MEM:.*]] = fir.allocmem !fir.array, %{{.*}} {fir.must_be_heap = true, uniq_name = "_QFcall_assumed_shape_and_size_arrayEarr_read_write.alloc"} +!HOST: %[[SHAPE:.*]] = fir.shape %{{.*}} : (index) -> !fir.shape<1> +!HOST: %[[EMBOX:.*]] = fir.embox %[[ALLOCA_MEM]](%[[SHAPE]]) : (!fir.heap>, !fir.shape<1>) -> !fir.box>> +!HOST: fir.store %[[EMBOX]] to %[[DECLARE]]#1 : !fir.ref>>> +!HOST: %[[LOAD:.*]] = fir.load %[[DECLARE]]#0 : !fir.ref>>> +!HOST: %[[CONSTANT_1:.*]] = arith.constant 10 : index +!HOST: %[[CONSTANT_2:.*]] = arith.constant 20 : index +!HOST: %[[CONSTANT_3:.*]] = arith.constant 1 : index +!HOST: %[[CONSTANT_4:.*]] = arith.constant 11 : index +!HOST: %[[SHAPE:.*]] = fir.shape %[[CONSTANT_4]] : (index) -> !fir.shape<1> +!HOST: %[[DESIGNATE:.*]] = hlfir.designate %[[LOAD]] (%[[CONSTANT_1]]:%[[CONSTANT_2]]:%[[CONSTANT_3]]) shape %[[SHAPE]] : (!fir.box>>, index, index, index, !fir.shape<1>) -> !fir.ref> +!HOST: fir.call @_QPassumed_size_array(%[[DESIGNATE]]) fastmath : (!fir.ref>) -> () +subroutine call_assumed_shape_and_size_array + use assumed_allocatable_array_routines + integer, allocatable :: arr_read_write(:) + allocate(arr_read_write(20)) + call assumed_size_array(arr_read_write(10:20)) + deallocate(arr_read_write) +end subroutine call_assumed_shape_and_size_array diff --git a/flang/test/Lower/OpenMP/allocatable-map.f90 b/flang/test/Lower/OpenMP/allocatable-map.f90 new file mode 100644 index 0000000000000..ddc20b582b26e --- /dev/null +++ b/flang/test/Lower/OpenMP/allocatable-map.f90 @@ -0,0 +1,13 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s --check-prefixes="HLFIRDIALECT" + +!HLFIRDIALECT: %[[POINTER:.*]]:2 = hlfir.declare %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFpointer_routineEpoint"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) +!HLFIRDIALECT: %[[BOX_OFF:.*]] = fir.box_offset %[[POINTER]]#1 base_addr : (!fir.ref>>) -> !fir.llvm_ptr> +!HLFIRDIALECT: %[[POINTER_MAP_MEMBER:.*]] = omp.map_info var_ptr(%[[POINTER]]#1 : !fir.ref>>, i32) var_ptr_ptr(%[[BOX_OFF]] : !fir.llvm_ptr>) map_clauses(implicit, tofrom) capture(ByRef) -> !fir.llvm_ptr> {name = ""} +!HLFIRDIALECT: %[[POINTER_MAP:.*]] = omp.map_info var_ptr(%[[POINTER]]#1 : !fir.ref>>, !fir.box>) map_clauses(implicit, tofrom) capture(ByRef) members(%[[POINTER_MAP_MEMBER]] : !fir.llvm_ptr>) -> !fir.ref>> {name = "point"} +!HLFIRDIALECT: omp.target map_entries({{.*}}, %[[POINTER_MAP_MEMBER]] -> {{.*}}, %[[POINTER_MAP]] -> {{.*}} : {{.*}}, !fir.llvm_ptr>, !fir.ref>>) { +subroutine pointer_routine() + integer, pointer :: point +!$omp target map(tofrom:pointer) + point = 1 +!$omp end target +end subroutine pointer_routine diff --git a/flang/test/Lower/OpenMP/array-bounds.f90 b/flang/test/Lower/OpenMP/array-bounds.f90 index 92c0c5307f0a1..7d76ff4b106a0 100644 --- a/flang/test/Lower/OpenMP/array-bounds.f90 +++ b/flang/test/Lower/OpenMP/array-bounds.f90 @@ -40,6 +40,7 @@ module assumed_array_routines !HOST-LABEL: func.func @_QMassumed_array_routinesPassumed_shape_array( !HOST-SAME: %[[ARG0:.*]]: !fir.box> {fir.bindc_name = "arr_read_write"}) { +!HOST: %[[INTERMEDIATE_ALLOCA:.*]] = fir.alloca !fir.box> !HOST: %[[ARG0_DECL:.*]]:2 = hlfir.declare %[[ARG0]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QMassumed_array_routinesFassumed_shape_arrayEarr_read_write"} : (!fir.box>) -> (!fir.box>, !fir.box>) !HOST: %[[C0:.*]] = arith.constant 1 : index !HOST: %[[C1:.*]] = arith.constant 0 : index @@ -49,9 +50,10 @@ module assumed_array_routines !HOST: %[[C0_1:.*]] = arith.constant 0 : index !HOST: %[[DIMS1:.*]]:3 = fir.box_dims %[[ARG0_DECL]]#1, %[[C0_1]] : (!fir.box>, index) -> (index, index, index) !HOST: %[[BOUNDS:.*]] = omp.bounds lower_bound(%[[C3]] : index) upper_bound(%[[C4]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) start_idx(%[[C0]] : index) {stride_in_bytes = true} -!HOST: %[[ADDROF:.*]] = fir.box_addr %[[ARG0_DECL]]#0 : (!fir.box>) -> !fir.ref> -!HOST: %[[MAP:.*]] = omp.map_info var_ptr(%[[ADDROF]] : !fir.ref>, !fir.array) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.ref> {name = "arr_read_write(2:5)"} -!HOST: omp.target map_entries(%[[MAP]] -> %{{.*}}, {{.*}} -> {{.*}} : !fir.ref>, !fir.ref) { +!HOST: %[[VAR_PTR_PTR:.*]] = fir.box_offset %0 base_addr : (!fir.ref>>) -> !fir.llvm_ptr>> +!HOST: %[[MAP_INFO_MEMBER:.*]] = omp.map_info var_ptr(%[[INTERMEDIATE_ALLOCA]] : !fir.ref>>, !fir.array) var_ptr_ptr(%[[VAR_PTR_PTR]] : !fir.llvm_ptr>>) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.llvm_ptr>> {name = ""} +!HOST: %[[MAP:.*]] = omp.map_info var_ptr(%[[INTERMEDIATE_ALLOCA]] : !fir.ref>>, !fir.box>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_INFO_MEMBER]] : !fir.llvm_ptr>>) -> !fir.ref> {name = "arr_read_write(2:5)"} +!HOST: omp.target map_entries(%[[MAP_INFO_MEMBER]] -> %{{.*}}, %[[MAP]] -> %{{.*}}, {{.*}} -> {{.*}} : !fir.llvm_ptr>>, !fir.ref>, !fir.ref) { subroutine assumed_shape_array(arr_read_write) integer, intent(inout) :: arr_read_write(:) @@ -60,11 +62,12 @@ subroutine assumed_shape_array(arr_read_write) arr_read_write(i) = i end do !$omp end target - end subroutine assumed_shape_array + end subroutine assumed_shape_array !HOST-LABEL: func.func @_QMassumed_array_routinesPassumed_size_array( !HOST-SAME: %[[ARG0:.*]]: !fir.ref> {fir.bindc_name = "arr_read_write"}) { +!HOST: %[[INTERMEDIATE_ALLOCA:.*]] = fir.alloca !fir.box> !HOST: %[[ARG0_SHAPE:.*]] = fir.shape %{{.*}} : (index) -> !fir.shape<1> !HOST: %[[ARG0_DECL:.*]]:2 = hlfir.declare %[[ARG0]](%[[ARG0_SHAPE]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QMassumed_array_routinesFassumed_size_arrayEarr_read_write"} : (!fir.ref>, !fir.shape<1>) -> (!fir.box>, !fir.ref>) !HOST: %[[ALLOCA:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QMassumed_array_routinesFassumed_size_arrayEi"} @@ -72,20 +75,20 @@ end subroutine assumed_shape_array !HOST: %[[C4_1:.*]] = arith.subi %c4, %c1{{.*}} : index !HOST: %[[EXT:.*]] = arith.addi %[[C4_1]], %c1{{.*}} : index !HOST: %[[BOUNDS:.*]] = omp.bounds lower_bound(%c1{{.*}} : index) upper_bound(%c4{{.*}} : index) extent(%[[EXT]] : index) stride(%[[DIMS0]]#2 : index) start_idx(%c1{{.*}} : index) {stride_in_bytes = true} -!HOST: %[[ADDR:.*]] = fir.box_addr %[[ARG0_DECL]]#0 : (!fir.box>) -> !fir.ref> -!HOST: %[[MAP:.*]] = omp.map_info var_ptr(%[[ADDR]] : !fir.ref>, !fir.array) map_clauses(tofrom) capture(ByRef) bounds(%7) -> !fir.ref> {name = "arr_read_write(2:5)"} -!HOST: omp.target map_entries(%[[MAP]] -> %{{.*}}, {{.*}} -> {{.*}} : !fir.ref>, !fir.ref) { - subroutine assumed_size_array(arr_read_write) - integer, intent(inout) :: arr_read_write(*) - - !$omp target map(tofrom:arr_read_write(2:5)) - do i = 2, 5 - arr_read_write(i) = i - end do - !$omp end target - end subroutine assumed_size_array - end module assumed_array_routines +!HOST: %[[VAR_PTR_PTR:.*]] = fir.box_offset %[[INTERMEDIATE_ALLOCA]] base_addr : (!fir.ref>>) -> !fir.llvm_ptr>> +!HOST: %[[MAP_INFO_MEMBER:.*]] = omp.map_info var_ptr(%[[INTERMEDIATE_ALLOCA]] : !fir.ref>>, !fir.array) var_ptr_ptr(%[[VAR_PTR_PTR]] : !fir.llvm_ptr>>) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.llvm_ptr>> {name = ""} +!HOST: %[[MAP:.*]] = omp.map_info var_ptr(%[[INTERMEDIATE_ALLOCA]] : !fir.ref>>, !fir.box>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_INFO_MEMBER]] : !fir.llvm_ptr>>) -> !fir.ref> {name = "arr_read_write(2:5)"} +!HOST: omp.target map_entries(%[[MAP_INFO_MEMBER]] -> %{{.*}}, %[[MAP]] -> %{{.*}}, {{.*}} -> {{.*}} : !fir.llvm_ptr>>, !fir.ref>, !fir.ref) { + subroutine assumed_size_array(arr_read_write) + integer, intent(inout) :: arr_read_write(*) + !$omp target map(tofrom:arr_read_write(2:5)) + do i = 2, 5 + arr_read_write(i) = i + end do + !$omp end target + end subroutine assumed_size_array +end module assumed_array_routines !HOST-LABEL:func.func @_QPcall_assumed_shape_and_size_array() { !HOST: %[[C20:.*]] = arith.constant 20 : index diff --git a/flang/test/Lower/OpenMP/target.f90 b/flang/test/Lower/OpenMP/target.f90 index e9255cc9b1c0f..fa07b7f71d514 100644 --- a/flang/test/Lower/OpenMP/target.f90 +++ b/flang/test/Lower/OpenMP/target.f90 @@ -445,8 +445,9 @@ subroutine omp_target_device_addr integer, pointer :: a !CHECK: %[[VAL_0:.*]] = fir.alloca !fir.box> {bindc_name = "a", uniq_name = "_QFomp_target_device_addrEa"} !CHECK: %[[VAL_0_DECL:.*]]:2 = hlfir.declare %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFomp_target_device_addrEa"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) - !CHECK: %[[MAP:.*]] = omp.map_info var_ptr({{.*}}) map_clauses(tofrom) capture(ByRef) -> {{.*}} {name = "a"} - !CHECK: omp.target_data map_entries(%[[MAP]] : {{.*}}) use_device_addr(%[[VAL_0_DECL]]#1 : !fir.ref>>) { + !CHECK: %[[MAP_MEMBERS:.*]] = omp.map_info var_ptr({{.*}} : !fir.ref>>, i32) var_ptr_ptr({{.*}} : !fir.llvm_ptr>) map_clauses(tofrom) capture(ByRef) -> !fir.llvm_ptr> {name = ""} + !CHECK: %[[MAP:.*]] = omp.map_info var_ptr({{.*}} : !fir.ref>>, !fir.box>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_MEMBERS]] : !fir.llvm_ptr>) -> !fir.ref>> {name = "a"} + !CHECK: omp.target_data map_entries(%[[MAP_MEMBERS]], %[[MAP]] : {{.*}}) use_device_addr(%[[VAL_0_DECL]]#1 : !fir.ref>>) { !$omp target data map(tofrom: a) use_device_addr(a) !CHECK: ^bb0(%[[VAL_1:.*]]: !fir.ref>>): !CHECK: %[[VAL_1_DECL:.*]]:2 = hlfir.declare %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFomp_target_device_addrEa"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) diff --git a/flang/test/Transforms/omp-descriptor-map-info-gen.fir b/flang/test/Transforms/omp-descriptor-map-info-gen.fir new file mode 100644 index 0000000000000..22594ec88c9cb --- /dev/null +++ b/flang/test/Transforms/omp-descriptor-map-info-gen.fir @@ -0,0 +1,44 @@ +// RUN: fir-opt --omp-descriptor-map-info-gen %s | FileCheck %s + +module attributes {omp.is_target_device = false} { + func.func @test_descriptor_expansion_pass(%arg0: !fir.box>) { + %0 = fir.alloca !fir.box> + %1 = fir.zero_bits !fir.heap + %2:2 = hlfir.declare %arg0 {fortran_attrs = #fir.var_attrs, uniq_name = "test"} : (!fir.box>) -> (!fir.box>, !fir.box>) + %3 = fir.embox %1 : (!fir.heap) -> !fir.box> + fir.store %3 to %0 : !fir.ref>> + %4:2 = hlfir.declare %0 {fortran_attrs = #fir.var_attrs, uniq_name = "test2"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) + %5 = fir.allocmem i32 {fir.must_be_heap = true} + %6 = fir.embox %5 : (!fir.heap) -> !fir.box> + fir.store %6 to %4#1 : !fir.ref>> + %c0 = arith.constant 1 : index + %c1 = arith.constant 0 : index + %c2 = arith.constant 10 : index + %dims:3 = fir.box_dims %2#1, %c1 : (!fir.box>, index) -> (index, index, index) + %bounds = omp.bounds lower_bound(%c1 : index) upper_bound(%c2 : index) extent(%dims#1 : index) stride(%dims#2 : index) start_idx(%c0 : index) {stride_in_bytes = true} + %7 = fir.box_addr %2#1 : (!fir.box>) -> !fir.ref> + %8 = omp.map_info var_ptr(%4#1 : !fir.ref>>, !fir.box>) map_clauses(tofrom) capture(ByRef) -> !fir.ref>> + %9 = omp.map_info var_ptr(%7 : !fir.ref>, !fir.array) map_clauses(from) capture(ByRef) bounds(%bounds) -> !fir.ref> + omp.target map_entries(%8 -> %arg1, %9 -> %arg2 : !fir.ref>>, !fir.ref>) { + ^bb0(%arg1: !fir.ref>>, %arg2: !fir.ref>): + omp.terminator + } + return + } +} + +// CHECK: func.func @test_descriptor_expansion_pass(%[[ARG0:.*]]: !fir.box>) { +// CHECK: %[[ALLOCA:.*]] = fir.alloca !fir.box> +// CHECK: %[[ALLOCA2:.*]] = fir.alloca !fir.box> +// CHECK: %[[DECLARE1:.*]]:2 = hlfir.declare %[[ARG0]] {fortran_attrs = #fir.var_attrs, uniq_name = "test"} : (!fir.box>) -> (!fir.box>, !fir.box>) +// CHECK: %[[DECLARE2:.*]]:2 = hlfir.declare %[[ALLOCA2]] {fortran_attrs = #fir.var_attrs, uniq_name = "test2"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) +// CHECK: %[[BOUNDS:.*]] = omp.bounds lower_bound(%{{.*}} : index) upper_bound(%{{.*}} : index) extent(%{{.*}} : index) stride(%{{.*}} : index) start_idx(%{{.*}} : index) {stride_in_bytes = true} +// CHECK: %[[BASE_ADDR_OFF:.*]] = fir.box_offset %[[DECLARE2]]#1 base_addr : (!fir.ref>>) -> !fir.llvm_ptr> +// CHECK: %[[DESC_MEMBER_MAP:.*]] = omp.map_info var_ptr(%[[DECLARE2]]#1 : !fir.ref>>, i32) var_ptr_ptr(%[[BASE_ADDR_OFF]] : !fir.llvm_ptr>) map_clauses(tofrom) capture(ByRef) -> !fir.llvm_ptr> {name = ""} +// CHECK: %[[DESC_PARENT_MAP:.*]] = omp.map_info var_ptr(%[[DECLARE2]]#1 : !fir.ref>>, !fir.box>) map_clauses(tofrom) capture(ByRef) members(%[[DESC_MEMBER_MAP]] : !fir.llvm_ptr>) -> !fir.ref>> +// CHECK: fir.store %[[DECLARE1]]#1 to %[[ALLOCA]] : !fir.ref>> +// CHECK: %[[BASE_ADDR_OFF_2:.*]] = fir.box_offset %[[ALLOCA]] base_addr : (!fir.ref>>) -> !fir.llvm_ptr>> +// CHECK: %[[DESC_MEMBER_MAP_2:.*]] = omp.map_info var_ptr(%[[ALLOCA]] : !fir.ref>>, !fir.array) var_ptr_ptr(%[[BASE_ADDR_OFF_2]] : !fir.llvm_ptr>>) map_clauses(from) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.llvm_ptr>> {name = ""} +// CHECK: %[[DESC_PARENT_MAP_2:.*]] = omp.map_info var_ptr(%[[ALLOCA]] : !fir.ref>>, !fir.box>) map_clauses(from) capture(ByRef) members(%15 : !fir.llvm_ptr>>) -> !fir.ref> +// CHECK: omp.target map_entries(%[[DESC_MEMBER_MAP]] -> %[[ARG1:.*]], %[[DESC_PARENT_MAP]] -> %[[ARG2:.*]], %[[DESC_MEMBER_MAP_2]] -> %[[ARG3:.*]], %[[DESC_PARENT_MAP_2]] -> %[[ARG4:.*]] : {{.*}}) { +// CHECK: ^bb0(%[[ARG1]]: !fir.llvm_ptr>, %[[ARG2]]: !fir.ref>>, %[[ARG3]]: !fir.llvm_ptr>>, %[[ARG4]]: !fir.ref>): diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 451828ec4ba77..ca36350548577 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -1194,6 +1194,7 @@ def MapInfoOp : OpenMP_Op<"map_info", [AttrSizedOperandSegments]> { let arguments = (ins OpenMP_PointerLikeType:$var_ptr, TypeAttr:$var_type, Optional:$var_ptr_ptr, + Variadic:$members, Variadic:$bounds, /* rank-0 to rank-{n-1} */ OptionalAttr:$map_type, OptionalAttr:$map_capture_type, @@ -1233,13 +1234,17 @@ def MapInfoOp : OpenMP_Op<"map_info", [AttrSizedOperandSegments]> { - `var_type`: The type of the variable to copy. - `var_ptr_ptr`: Used when the variable copied is a member of a class, structure or derived type and refers to the originating struct. + - `members`: Used to indicate mapped child members for the current MapInfoOp, + represented as other MapInfoOp's, utilised in cases where a parent structure + type and members of the structure type are being mapped at the same time. + For example: map(to: parent, parent->member, parent->member2[:10]) - `bounds`: Used when copying slices of array's, pointers or pointer members of - objects (e.g. derived types or classes), indicates the bounds to be copied - of the variable. When it's an array slice it is in rank order where rank 0 - is the inner-most dimension. + objects (e.g. derived types or classes), indicates the bounds to be copied + of the variable. When it's an array slice it is in rank order where rank 0 + is the inner-most dimension. - 'map_clauses': OpenMP map type for this map capture, for example: from, to and - always. It's a bitfield composed of the OpenMP runtime flags stored in - OpenMPOffloadMappingFlags. + always. It's a bitfield composed of the OpenMP runtime flags stored in + OpenMPOffloadMappingFlags. - 'map_capture_type': Capture type for the variable e.g. this, byref, byvalue, byvla this can affect how the variable is lowered. - `name`: Holds the name of variable as specified in user clause (including bounds). @@ -1251,6 +1256,7 @@ def MapInfoOp : OpenMP_Op<"map_info", [AttrSizedOperandSegments]> { `var_ptr_ptr` `(` $var_ptr_ptr `:` type($var_ptr_ptr) `)` | `map_clauses` `(` custom($map_type) `)` | `capture` `(` custom($map_capture_type) `)` + | `members` `(` $members `:` type($members) `)` | `bounds` `(` $bounds `)` ) `->` type($omp_ptr) attr-dict }]; @@ -1272,7 +1278,8 @@ def MapInfoOp : OpenMP_Op<"map_info", [AttrSizedOperandSegments]> { // 2.14.2 target data Construct //===---------------------------------------------------------------------===// -def Target_DataOp: OpenMP_Op<"target_data", [AttrSizedOperandSegments]>{ +def Target_DataOp: OpenMP_Op<"target_data", [AttrSizedOperandSegments, + MapClauseOwningOpInterface]>{ let summary = "target data construct"; let description = [{ Map variables to a device data environment for the extent of the region. @@ -1329,7 +1336,8 @@ def Target_DataOp: OpenMP_Op<"target_data", [AttrSizedOperandSegments]>{ //===---------------------------------------------------------------------===// def Target_EnterDataOp: OpenMP_Op<"target_enter_data", - [AttrSizedOperandSegments]>{ + [AttrSizedOperandSegments, + MapClauseOwningOpInterface]>{ let summary = "target enter data construct"; let description = [{ The target enter data directive specifies that variables are mapped to @@ -1375,7 +1383,8 @@ def Target_EnterDataOp: OpenMP_Op<"target_enter_data", //===---------------------------------------------------------------------===// def Target_ExitDataOp: OpenMP_Op<"target_exit_data", - [AttrSizedOperandSegments]>{ + [AttrSizedOperandSegments, + MapClauseOwningOpInterface]>{ let summary = "target exit data construct"; let description = [{ The target exit data directive specifies that variables are mapped to a @@ -1421,7 +1430,8 @@ def Target_ExitDataOp: OpenMP_Op<"target_exit_data", //===---------------------------------------------------------------------===// def Target_UpdateDataOp: OpenMP_Op<"target_update_data", - [AttrSizedOperandSegments]>{ + [AttrSizedOperandSegments, + MapClauseOwningOpInterface]>{ let summary = "target update data construct"; let description = [{ The target update directive makes the corresponding list items in the device @@ -1453,13 +1463,13 @@ def Target_UpdateDataOp: OpenMP_Op<"target_update_data", let arguments = (ins Optional:$if_expr, Optional:$device, UnitAttr:$nowait, - Variadic:$motion_operands); + Variadic:$map_operands); let assemblyFormat = [{ oilist(`if` `(` $if_expr `:` type($if_expr) `)` | `device` `(` $device `:` type($device) `)` | `nowait` $nowait - | `motion_entries` `(` $motion_operands `:` type($motion_operands) `)` + | `motion_entries` `(` $map_operands `:` type($map_operands) `)` ) attr-dict }]; @@ -1470,7 +1480,8 @@ def Target_UpdateDataOp: OpenMP_Op<"target_update_data", // 2.14.5 target construct //===----------------------------------------------------------------------===// -def TargetOp : OpenMP_Op<"target",[IsolatedFromAbove, OutlineableOpenMPOpInterface, AttrSizedOperandSegments]> { +def TargetOp : OpenMP_Op<"target",[IsolatedFromAbove, MapClauseOwningOpInterface, + OutlineableOpenMPOpInterface, AttrSizedOperandSegments]> { let summary = "target construct"; let description = [{ The target construct includes a region of code which is to be executed diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td index 198a9a2357f2f..ed086d36424c1 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td @@ -18,7 +18,7 @@ include "mlir/IR/OpBase.td" def OutlineableOpenMPOpInterface : OpInterface<"OutlineableOpenMPOpInterface"> { let description = [{ OpenMP operations whose region will be outlined will implement this - interface. These operations will + interface. }]; let cppNamespace = "::mlir::omp"; @@ -31,6 +31,28 @@ def OutlineableOpenMPOpInterface : OpInterface<"OutlineableOpenMPOpInterface"> { ]; } +def MapClauseOwningOpInterface : OpInterface<"MapClauseOwningOpInterface"> { + let description = [{ + OpenMP operations which own a list of omp::MapInfoOp's implement this interface + to allow generic access to deal with map operands to more easily manipulate + this class of operations. + }]; + + let cppNamespace = "::mlir::omp"; + + let methods = [ + InterfaceMethod<"Get map operands", "::mlir::OperandRange", "getMapOperands", + (ins), [{ + return $_op.getMapOperands(); + }]>, + InterfaceMethod<"Get mutable map operands", "::mlir::MutableOperandRange", + "getMapOperandsMutable", + (ins), [{ + return $_op.getMapOperandsMutable(); + }]>, + ]; +} + def ReductionClauseInterface : OpInterface<"ReductionClauseInterface"> { let description = [{ OpenMP operations that support reduction clause have this interface. diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp index 13cc16125a273..381f17d080419 100644 --- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp +++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp @@ -973,7 +973,7 @@ LogicalResult ExitDataOp::verify() { } LogicalResult UpdateDataOp::verify() { - return verifyMapClause(*this, getMotionOperands()); + return verifyMapClause(*this, getMapOperands()); } LogicalResult TargetOp::verify() { diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 17ce14fe642be..79956f82ed141 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -1640,6 +1640,7 @@ getRefPtrIfDeclareTarget(mlir::Value value, // value) more than neccessary. struct MapInfoData : llvm::OpenMPIRBuilder::MapInfosTy { llvm::SmallVector IsDeclareTarget; + llvm::SmallVector IsAMember; llvm::SmallVector MapClause; llvm::SmallVector OriginalValue; // Stripped off array/pointer to get the underlying @@ -1676,14 +1677,14 @@ uint64_t getArrayElementSizeInBits(LLVM::LLVMArrayType arrTy, DataLayout &dl) { // This function is somewhat equivalent to Clang's getExprTypeSize inside of // CGOpenMPRuntime.cpp. llvm::Value *getSizeInBytes(DataLayout &dl, const mlir::Type &type, - Operation *clauseOp, llvm::IRBuilderBase &builder, + Operation *clauseOp, llvm::Value *basePointer, + llvm::Type *baseType, llvm::IRBuilderBase &builder, LLVM::ModuleTranslation &moduleTranslation) { // utilising getTypeSizeInBits instead of getTypeSize as getTypeSize gives // the size in inconsistent byte or bit format. uint64_t underlyingTypeSzInBits = dl.getTypeSizeInBits(type); - if (auto arrTy = llvm::dyn_cast_if_present(type)) { + if (auto arrTy = llvm::dyn_cast_if_present(type)) underlyingTypeSzInBits = getArrayElementSizeInBits(arrTy, dl); - } if (auto memberClause = mlir::dyn_cast_if_present(clauseOp)) { @@ -1729,16 +1730,16 @@ void collectMapDataFromMapOperands(MapInfoData &mapData, DataLayout &dl, llvm::IRBuilderBase &builder) { for (mlir::Value mapValue : mapOperands) { - assert(mlir::isa(mapValue.getDefiningOp()) && - "missing map info operation or incorrect map info operation type"); if (auto mapOp = mlir::dyn_cast_if_present( mapValue.getDefiningOp())) { + mlir::Value offloadPtr = + mapOp.getVarPtrPtr() ? mapOp.getVarPtrPtr() : mapOp.getVarPtr(); mapData.OriginalValue.push_back( - moduleTranslation.lookupValue(mapOp.getVarPtr())); + moduleTranslation.lookupValue(offloadPtr)); mapData.Pointers.push_back(mapData.OriginalValue.back()); if (llvm::Value *refPtr = - getRefPtrIfDeclareTarget(mapOp.getVarPtr(), + getRefPtrIfDeclareTarget(offloadPtr, moduleTranslation)) { // declare target mapData.IsDeclareTarget.push_back(true); mapData.BasePointers.push_back(refPtr); @@ -1747,10 +1748,11 @@ void collectMapDataFromMapOperands(MapInfoData &mapData, mapData.BasePointers.push_back(mapData.OriginalValue.back()); } - mapData.Sizes.push_back(getSizeInBytes(dl, mapOp.getVarType(), mapOp, - builder, moduleTranslation)); mapData.BaseType.push_back( moduleTranslation.convertType(mapOp.getVarType())); + mapData.Sizes.push_back(getSizeInBytes( + dl, mapOp.getVarType(), mapOp, mapData.BasePointers.back(), + mapData.BaseType.back(), builder, moduleTranslation)); mapData.MapClause.push_back(mapOp.getOperation()); mapData.Types.push_back( llvm::omp::OpenMPOffloadMappingFlags(mapOp.getMapType().value())); @@ -1758,10 +1760,205 @@ void collectMapDataFromMapOperands(MapInfoData &mapData, mapOp.getLoc(), *moduleTranslation.getOpenMPBuilder())); mapData.DevicePointers.push_back( llvm::OpenMPIRBuilder::DeviceInfoTy::None); + + // Check if this is a member mapping and correctly assign that it is, if + // it is a member of a larger object. + // TODO: Need better handling of members, and distinguishing of members + // that are implicitly allocated on device vs explicitly passed in as + // arguments. + // TODO: May require some further additions to support nested record + // types, i.e. member maps that can have member maps. + mapData.IsAMember.push_back(false); + for (mlir::Value mapValue : mapOperands) { + if (auto map = mlir::dyn_cast_if_present( + mapValue.getDefiningOp())) { + for (auto member : map.getMembers()) { + if (member == mapOp) { + mapData.IsAMember.back() = true; + } + } + } + } + } + } +} + +// This creates two insertions into the MapInfosTy data structure for the +// "parent" of a set of members, (usually a container e.g. +// class/structure/derived type) when subsequent members have also been +// explicitly mapped on the same map clause. Certain types, such as Fortran +// descriptors are mapped like this as well, however, the members are +// implicit as far as a user is concerned, but we must explicitly map them +// internally. +// +// This function also returns the memberOfFlag for this particular parent, +// which is utilised in subsequent member mappings (by modifying there map type +// with it) to indicate that a member is part of this parent and should be +// treated by the runtime as such. Important to achieve the correct mapping. +static llvm::omp::OpenMPOffloadMappingFlags mapParentWithMembers( + LLVM::ModuleTranslation &moduleTranslation, llvm::IRBuilderBase &builder, + llvm::OpenMPIRBuilder &ompBuilder, DataLayout &dl, + llvm::OpenMPIRBuilder::MapInfosTy &combinedInfo, MapInfoData &mapData, + uint64_t mapDataIndex, bool isTargetParams) { + // Map the first segment of our structure + combinedInfo.Types.emplace_back( + isTargetParams + ? llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TARGET_PARAM + : llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_NONE); + combinedInfo.DevicePointers.emplace_back( + llvm::OpenMPIRBuilder::DeviceInfoTy::None); + combinedInfo.Names.emplace_back(LLVM::createMappingInformation( + mapData.MapClause[mapDataIndex]->getLoc(), ompBuilder)); + combinedInfo.BasePointers.emplace_back(mapData.BasePointers[mapDataIndex]); + combinedInfo.Pointers.emplace_back(mapData.Pointers[mapDataIndex]); + + // Calculate size of the parent object being mapped based on the + // addresses at runtime, highAddr - lowAddr = size. This of course + // doesn't factor in allocated data like pointers, hence the further + // processing of members specified by users, or in the case of + // Fortran pointers and allocatables, the mapping of the pointed to + // data by the descriptor (which itself, is a structure containing + // runtime information on the dynamically allocated data). + llvm::Value *lowAddr = builder.CreatePointerCast( + mapData.Pointers[mapDataIndex], builder.getPtrTy()); + llvm::Value *highAddr = builder.CreatePointerCast( + builder.CreateConstGEP1_32(mapData.BaseType[mapDataIndex], + mapData.Pointers[mapDataIndex], 1), + builder.getPtrTy()); + llvm::Value *size = builder.CreateIntCast( + builder.CreatePtrDiff(builder.getInt8Ty(), highAddr, lowAddr), + builder.getInt64Ty(), + /*isSigned=*/false); + combinedInfo.Sizes.push_back(size); + + // This creates the initial MEMBER_OF mapping that consists of + // the parent/top level container (same as above effectively, except + // with a fixed initial compile time size and seperate maptype which + // indicates the true mape type (tofrom etc.) and that it is a part + // of a larger mapping and indicating the link between it and it's + // members that are also explicitly mapped). + llvm::omp::OpenMPOffloadMappingFlags mapFlag = + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; + if (isTargetParams) + mapFlag &= ~llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TARGET_PARAM; + + llvm::omp::OpenMPOffloadMappingFlags memberOfFlag = + ompBuilder.getMemberOfFlag(combinedInfo.BasePointers.size() - 1); + ompBuilder.setCorrectMemberOfFlag(mapFlag, memberOfFlag); + + combinedInfo.Types.emplace_back(mapFlag); + combinedInfo.DevicePointers.emplace_back( + llvm::OpenMPIRBuilder::DeviceInfoTy::None); + combinedInfo.Names.emplace_back(LLVM::createMappingInformation( + mapData.MapClause[mapDataIndex]->getLoc(), ompBuilder)); + combinedInfo.BasePointers.emplace_back(mapData.BasePointers[mapDataIndex]); + combinedInfo.Pointers.emplace_back(mapData.Pointers[mapDataIndex]); + combinedInfo.Sizes.emplace_back(mapData.Sizes[mapDataIndex]); + + return memberOfFlag; +} + +// This function is intended to add explicit mappings of members +static void processMapMembersWithParent( + LLVM::ModuleTranslation &moduleTranslation, llvm::IRBuilderBase &builder, + llvm::OpenMPIRBuilder &ompBuilder, DataLayout &dl, + llvm::OpenMPIRBuilder::MapInfosTy &combinedInfo, MapInfoData &mapData, + uint64_t mapDataIndex, llvm::omp::OpenMPOffloadMappingFlags memberOfFlag) { + + auto parentClause = + mlir::dyn_cast(mapData.MapClause[mapDataIndex]); + + for (auto mappedMembers : parentClause.getMembers()) { + auto memberClause = + mlir::dyn_cast(mappedMembers.getDefiningOp()); + int memberDataIdx = -1; + for (size_t i = 0; i < mapData.MapClause.size(); ++i) { + if (mapData.MapClause[i] == memberClause) + memberDataIdx = i; + } + + assert(memberDataIdx >= 0 && "could not find mapped member of structure"); + + // Same MemberOfFlag to indicate its link with parent and other members + // of, and we flag that it's part of a pointer and object coupling. + auto mapFlag = + llvm::omp::OpenMPOffloadMappingFlags(memberClause.getMapType().value()); + mapFlag &= ~llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TARGET_PARAM; + ompBuilder.setCorrectMemberOfFlag(mapFlag, memberOfFlag); + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_PTR_AND_OBJ; + combinedInfo.Types.emplace_back(mapFlag); + combinedInfo.DevicePointers.emplace_back( + llvm::OpenMPIRBuilder::DeviceInfoTy::None); + combinedInfo.Names.emplace_back( + LLVM::createMappingInformation(memberClause.getLoc(), ompBuilder)); + + combinedInfo.BasePointers.emplace_back(mapData.BasePointers[memberDataIdx]); + + std::vector idx{builder.getInt64(0)}; + llvm::Value *offsetAddress = nullptr; + if (!memberClause.getBounds().empty()) { + if (mapData.BaseType[memberDataIdx]->isArrayTy()) { + for (int i = memberClause.getBounds().size() - 1; i >= 0; --i) { + if (auto boundOp = mlir::dyn_cast_if_present( + memberClause.getBounds()[i].getDefiningOp())) { + idx.push_back( + moduleTranslation.lookupValue(boundOp.getLowerBound())); + } + } + } else { + std::vector dimensionIndexSizeOffset{ + builder.getInt64(1)}; + for (size_t i = 1; i < memberClause.getBounds().size(); ++i) { + if (auto boundOp = mlir::dyn_cast_if_present( + memberClause.getBounds()[i].getDefiningOp())) { + dimensionIndexSizeOffset.push_back(builder.CreateMul( + moduleTranslation.lookupValue(boundOp.getExtent()), + dimensionIndexSizeOffset[i - 1])); + } + } + + for (int i = memberClause.getBounds().size() - 1; i >= 0; --i) { + if (auto boundOp = mlir::dyn_cast_if_present( + memberClause.getBounds()[i].getDefiningOp())) { + if (!offsetAddress) + offsetAddress = builder.CreateMul( + moduleTranslation.lookupValue(boundOp.getLowerBound()), + dimensionIndexSizeOffset[i]); + else + offsetAddress = builder.CreateAdd( + offsetAddress, + builder.CreateMul( + moduleTranslation.lookupValue(boundOp.getLowerBound()), + dimensionIndexSizeOffset[i])); + } + } + } } + + llvm::Value *memberIdx = + builder.CreateLoad(builder.getPtrTy(), mapData.Pointers[memberDataIdx]); + memberIdx = builder.CreateInBoundsGEP( + mapData.BaseType[memberDataIdx], memberIdx, + offsetAddress ? std::vector{offsetAddress} : idx, + "member_idx"); + combinedInfo.Pointers.emplace_back(memberIdx); + combinedInfo.Sizes.emplace_back(mapData.Sizes[memberDataIdx]); } } +static void processMapWithMembersOf( + LLVM::ModuleTranslation &moduleTranslation, llvm::IRBuilderBase &builder, + llvm::OpenMPIRBuilder &ompBuilder, DataLayout &dl, + llvm::OpenMPIRBuilder::MapInfosTy &combinedInfo, MapInfoData &mapData, + uint64_t mapDataIndex, bool isTargetParams) { + llvm::omp::OpenMPOffloadMappingFlags memberOfParentFlag = + mapParentWithMembers(moduleTranslation, builder, ompBuilder, dl, + combinedInfo, mapData, mapDataIndex, isTargetParams); + processMapMembersWithParent(moduleTranslation, builder, ompBuilder, dl, + combinedInfo, mapData, mapDataIndex, + memberOfParentFlag); +} + // Generate all map related information and fill the combinedInfo. static void genMapInfos(llvm::IRBuilderBase &builder, LLVM::ModuleTranslation &moduleTranslation, @@ -1788,9 +1985,25 @@ static void genMapInfos(llvm::IRBuilderBase &builder, // utilise the size from any component of MapInfoData, if we can't // something is missing from the initial MapInfoData construction. for (size_t i = 0; i < mapData.MapClause.size(); ++i) { + // NOTE/TODO: We currently do not handle member mapping seperately from it's + // parent or explicit mapping of a parent and member in the same operation, + // this will need to change in the near future, for now we primarily handle + // descriptor mapping from fortran, generalised as mapping record types + // with implicit member maps. This lowering needs further generalisation to + // fully support fortran derived types, and C/C++ structures and classes. + if (mapData.IsAMember[i]) + continue; + + auto mapInfoOp = mlir::dyn_cast(mapData.MapClause[i]); + if (!mapInfoOp.getMembers().empty()) { + processMapWithMembersOf(moduleTranslation, builder, *ompBuilder, dl, + combinedInfo, mapData, i, isTargetParams); + continue; + } + // Declare Target Mappings are excluded from being marked as - // OMP_MAP_TARGET_PARAM as they are not passed as parameters, they're marked - // with OMP_MAP_PTR_AND_OBJ instead. + // OMP_MAP_TARGET_PARAM as they are not passed as parameters, they're + // marked with OMP_MAP_PTR_AND_OBJ instead. auto mapFlag = mapData.Types[i]; if (mapData.IsDeclareTarget[i]) mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_PTR_AND_OBJ; @@ -1932,7 +2145,7 @@ convertOmpTargetData(Operation *op, llvm::IRBuilderBase &builder, deviceID = intAttr.getInt(); RTLFn = llvm::omp::OMPRTL___tgt_target_data_update_mapper; - mapOperands = updateDataOp.getMotionOperands(); + mapOperands = updateDataOp.getMapOperands(); return success(); }) .Default([&](Operation *op) { @@ -2441,9 +2654,14 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder, }; llvm::SmallVector kernelInput; - for (size_t i = 0; i < mapData.MapClause.size(); ++i) { + for (size_t i = 0; i < mapOperands.size(); ++i) { // declare target arguments are not passed to kernels as arguments - if (!mapData.IsDeclareTarget[i]) + // TODO: We currently do not handle cases where a member is explicitly + // passed in as an argument, this will likley need to be handled in + // the near future, rather than using IsAMember, it may be better to + // test if the relevant BlockArg is used within the target region and + // then use that as a basis for exclusion in the kernel inputs. + if (!mapData.IsDeclareTarget[i] && !mapData.IsAMember[i]) kernelInput.push_back(mapData.OriginalValue[i]); } diff --git a/mlir/test/Dialect/OpenMP/ops.mlir b/mlir/test/Dialect/OpenMP/ops.mlir index ccf72ae31d439..65a704d18107b 100644 --- a/mlir/test/Dialect/OpenMP/ops.mlir +++ b/mlir/test/Dialect/OpenMP/ops.mlir @@ -2124,3 +2124,17 @@ func.func @omp_target_update_data (%if_cond : i1, %device : si32, %map1: memref< return } +// CHECK-LABEL: omp_targets_is_allocatable +// CHECK-SAME: (%[[ARG0:.*]]: !llvm.ptr, %[[ARG1:.*]]: !llvm.ptr) +func.func @omp_targets_is_allocatable(%arg0: !llvm.ptr, %arg1: !llvm.ptr) -> () { + // CHECK: %[[MAP0:.*]] = omp.map_info var_ptr(%[[ARG0]] : !llvm.ptr, i32) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""} + %mapv1 = omp.map_info var_ptr(%arg0 : !llvm.ptr, i32) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""} + // CHECK: %[[MAP1:.*]] = omp.map_info var_ptr(%[[ARG1]] : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(tofrom) capture(ByRef) members(%[[MAP0]] : !llvm.ptr) -> !llvm.ptr {name = ""} + %mapv2 = omp.map_info var_ptr(%arg1 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(tofrom) capture(ByRef) members(%mapv1 : !llvm.ptr) -> !llvm.ptr {name = ""} + // CHECK: omp.target map_entries(%[[MAP0]] -> {{.*}}, %[[MAP1]] -> {{.*}} : !llvm.ptr, !llvm.ptr) + omp.target map_entries(%mapv1 -> %arg2, %mapv2 -> %arg3 : !llvm.ptr, !llvm.ptr) { + ^bb0(%arg2: !llvm.ptr, %arg3 : !llvm.ptr): + omp.terminator + } + return +} diff --git a/mlir/test/Target/LLVMIR/omptarget-fortran-allocatable-types-host.mlir b/mlir/test/Target/LLVMIR/omptarget-fortran-allocatable-types-host.mlir new file mode 100644 index 0000000000000..831cd05871c4e --- /dev/null +++ b/mlir/test/Target/LLVMIR/omptarget-fortran-allocatable-types-host.mlir @@ -0,0 +1,148 @@ +// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s + +// This test checks the offload sizes, map types and base pointers and pointers +// provided to the OpenMP kernel argument structure are correct when lowering +// to LLVM-IR from MLIR when the fortran allocatables flag is switched on and +// a fortran allocatable descriptor type is provided alongside the omp.map_info, +// the test utilises mapping of array sections, full arrays and individual +// allocated scalars. + +module attributes {omp.is_target_device = false} { + llvm.func @_QQmain() { + %0 = llvm.mlir.constant(5 : index) : i64 + %1 = llvm.mlir.constant(2 : index) : i64 + %2 = llvm.mlir.constant(1 : index) : i64 + %3 = llvm.mlir.addressof @_QFEfull_arr : !llvm.ptr + %4 = llvm.mlir.constant(1 : i64) : i64 + %5 = llvm.alloca %4 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> {bindc_name = "scalar"} : (i64) -> !llvm.ptr + %6 = llvm.mlir.addressof @_QFEsect_arr : !llvm.ptr + %7 = llvm.mlir.constant(0 : i64) : i64 + %8 = llvm.getelementptr %3[0, 7, %7, 0] : (!llvm.ptr, i64) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> + %9 = llvm.load %8 : !llvm.ptr -> i64 + %10 = llvm.getelementptr %3[0, 7, %7, 1] : (!llvm.ptr, i64) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> + %11 = llvm.load %10 : !llvm.ptr -> i64 + %12 = llvm.getelementptr %3[0, 7, %7, 2] : (!llvm.ptr, i64) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> + %13 = llvm.load %12 : !llvm.ptr -> i64 + %14 = llvm.sub %11, %2 : i64 + %15 = omp.bounds lower_bound(%7 : i64) upper_bound(%14 : i64) extent(%11 : i64) stride(%13 : i64) start_idx(%9 : i64) {stride_in_bytes = true} + %16 = llvm.getelementptr %3[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> + %17 = omp.map_info var_ptr(%16 : !llvm.ptr, f32) map_clauses(tofrom) capture(ByRef) bounds(%15) -> !llvm.ptr {name = "full_arr"} + %18 = omp.map_info var_ptr(%3 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)>) map_clauses(tofrom) capture(ByRef) members(%17 : !llvm.ptr) -> !llvm.ptr {name = "full_arr"} + %19 = llvm.getelementptr %6[0, 7, %7, 0] : (!llvm.ptr, i64) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> + %20 = llvm.load %19 : !llvm.ptr -> i64 + %21 = llvm.getelementptr %6[0, 7, %7, 1] : (!llvm.ptr, i64) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> + %22 = llvm.load %21 : !llvm.ptr -> i64 + %23 = llvm.getelementptr %6[0, 7, %7, 2] : (!llvm.ptr, i64) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> + %24 = llvm.load %23 : !llvm.ptr -> i64 + %25 = llvm.sub %1, %20 : i64 + %26 = llvm.sub %0, %20 : i64 + %27 = omp.bounds lower_bound(%25 : i64) upper_bound(%26 : i64) extent(%22 : i64) stride(%24 : i64) start_idx(%20 : i64) {stride_in_bytes = true} + %28 = llvm.getelementptr %6[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> + %29 = omp.map_info var_ptr(%6 : !llvm.ptr, i32) var_ptr_ptr(%28 : !llvm.ptr) map_clauses(tofrom) capture(ByRef) bounds(%27) -> !llvm.ptr {name = "sect_arr(2:5)"} + %30 = omp.map_info var_ptr(%6 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)>) map_clauses(tofrom) capture(ByRef) members(%29 : !llvm.ptr) -> !llvm.ptr {name = "sect_arr(2:5)"} + %31 = llvm.getelementptr %5[0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)> + %32 = omp.map_info var_ptr(%5 : !llvm.ptr, f32) var_ptr_ptr(%31 : !llvm.ptr) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = "scalar"} + %33 = omp.map_info var_ptr(%5 : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(tofrom) capture(ByRef) members(%32 : !llvm.ptr) -> !llvm.ptr {name = "scalar"} + omp.target map_entries(%17 -> %arg0, %18 -> %arg1, %29 -> %arg2, %30 -> %arg3, %32 -> %arg4, %33 -> %arg5 : !llvm.ptr, !llvm.ptr, !llvm.ptr, !llvm.ptr, !llvm.ptr, !llvm.ptr) { + ^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr, %arg2: !llvm.ptr, %arg3: !llvm.ptr, %arg4: !llvm.ptr, %arg5: !llvm.ptr): + omp.terminator + } + llvm.return + } + llvm.mlir.global internal @_QFEfull_arr() {addr_space = 0 : i32} : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> { + %0 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> + llvm.return %0 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> + } + llvm.mlir.global internal @_QFEsect_arr() {addr_space = 0 : i32} : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> { + %0 = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> + llvm.return %0 : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> + } +} + +// CHECK: @[[FULL_ARR_GLOB:.*]] = internal global { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } undef +// CHECK: @[[ARR_SECT_GLOB:.*]] = internal global { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } undef +// CHECK: @.offload_sizes = private unnamed_addr constant [9 x i64] [i64 0, i64 48, i64 0, i64 0, i64 48, i64 0, i64 0, i64 24, i64 4] +// CHECK: @.offload_maptypes = private unnamed_addr constant [9 x i64] [i64 32, i64 281474976710657, i64 281474976710675, i64 32, i64 1125899906842625, i64 1125899906842643, i64 32, i64 1970324836974593, i64 1970324836974611] +// CHECK: @.offload_mapnames = private constant [9 x ptr] [ptr @{{.*}}, ptr @{{.*}}, ptr @{{.*}}, ptr @{{.*}}, ptr @{{.*}}, ptr @{{.*}}, ptr @{{.*}}, ptr @{{.*}}, ptr @{{.*}}] + +// CHECK: define void @_QQmain() +// CHECK: %[[SCALAR_ALLOCA:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 }, i64 1, align 8 +// CHECK: %[[FULL_ARR_SIZE5:.*]] = load i64, ptr getelementptr inbounds ({ ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr @[[FULL_ARR_GLOB]], i32 0, i32 7, i64 0, i32 1), align 4 +// CHECK: %[[FULL_ARR_SIZE4:.*]] = sub i64 %[[FULL_ARR_SIZE5]], 1 +// CHECK: %[[ARR_SECT_OFFSET3:.*]] = load i64, ptr getelementptr inbounds ({ ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr @[[ARR_SECT_GLOB]], i32 0, i32 7, i64 0, i32 0), align 4 +// CHECK: %[[ARR_SECT_OFFSET2:.*]] = sub i64 2, %[[ARR_SECT_OFFSET3]] +// CHECK: %[[ARR_SECT_SIZE4:.*]] = sub i64 5, %[[ARR_SECT_OFFSET3]] +// CHECK: %[[SCALAR_BASE:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 }, ptr %[[SCALAR_ALLOCA]], i32 0, i32 0 +// CHECK: %[[FULL_ARR_SIZE3:.*]] = sub i64 %[[FULL_ARR_SIZE4]], 0 +// CHECK: %[[FULL_ARR_SIZE2:.*]] = add i64 %[[FULL_ARR_SIZE3]], 1 +// CHECK: %[[FULL_ARR_SIZE1:.*]] = mul i64 1, %[[FULL_ARR_SIZE2]] +// CHECK: %[[FULL_ARR_SIZE:.*]] = mul i64 %[[FULL_ARR_SIZE1]], 4 +// CHECK: %[[ARR_SECT_SIZE3:.*]] = sub i64 %[[ARR_SECT_SIZE4]], %[[ARR_SECT_OFFSET2]] +// CHECK: %[[ARR_SECT_SIZE2:.*]] = add i64 %[[ARR_SECT_SIZE3]], 1 +// CHECK: %[[ARR_SECT_SIZE1:.*]] = mul i64 1, %[[ARR_SECT_SIZE2]] +// CHECK: %[[ARR_SECT_SIZE:.*]] = mul i64 %[[ARR_SECT_SIZE1]], 4 +// CHECK: %[[FULL_ARR_DESC_SIZE:.*]] = sdiv exact i64 sub (i64 ptrtoint (ptr getelementptr inbounds ({ ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr @_QFEfull_arr, i32 1) to i64), i64 ptrtoint (ptr @_QFEfull_arr to i64)), ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64) +// CHECK: %[[LFULL_ARR:.*]] = load ptr, ptr @_QFEfull_arr, align 8 +// CHECK: %[[FULL_ARR_PTR:.*]] = getelementptr inbounds float, ptr %[[LFULL_ARR]], i64 0 +// CHECK: %[[ARR_SECT_DESC_SIZE:.*]] = sdiv exact i64 sub (i64 ptrtoint (ptr getelementptr inbounds ({ ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr @_QFEsect_arr, i32 1) to i64), i64 ptrtoint (ptr @_QFEsect_arr to i64)), ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64) +// CHECK: %[[ARR_SECT_OFFSET1:.*]] = mul i64 %[[ARR_SECT_OFFSET2]], 1 +// CHECK: %[[LARR_SECT:.*]] = load ptr, ptr @_QFEsect_arr, align 8 +// CHECK: %[[ARR_SECT_PTR:.*]] = getelementptr inbounds i32, ptr %[[LARR_SECT]], i64 %[[ARR_SECT_OFFSET1]] +// CHECK: %[[SCALAR_DESC_SZ4:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8 }, ptr %[[SCALAR_ALLOCA]], i32 1 +// CHECK: %[[SCALAR_DESC_SZ3:.*]] = ptrtoint ptr %[[SCALAR_DESC_SZ4]] to i64 +// CHECK: %[[SCALAR_DESC_SZ2:.*]] = ptrtoint ptr %[[SCALAR_ALLOCA]] to i64 +// CHECK: %[[SCALAR_DESC_SZ1:.*]] = sub i64 %[[SCALAR_DESC_SZ3]], %[[SCALAR_DESC_SZ2]] +// CHECK: %[[SCALAR_DESC_SZ:.*]] = sdiv exact i64 %[[SCALAR_DESC_SZ1]], ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64) +// CHECK: %[[SCALAR_PTR_LOAD:.*]] = load ptr, ptr %[[SCALAR_BASE]], align 8 +// CHECK: %[[SCALAR_PTR:.*]] = getelementptr inbounds float, ptr %[[SCALAR_PTR_LOAD]], i64 0 + +// CHECK: %[[OFFLOADBASEPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_baseptrs, i32 0, i32 0 +// CHECK: store ptr @_QFEfull_arr, ptr %[[OFFLOADBASEPTRS]], align 8 +// CHECK: %[[OFFLOADPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_ptrs, i32 0, i32 0 +// CHECK: store ptr @_QFEfull_arr, ptr %[[OFFLOADPTRS]], align 8 + +// CHECK: %[[OFFLOADSIZES:.*]] = getelementptr inbounds [9 x i64], ptr %.offload_sizes, i32 0, i32 0 +// CHECK: store i64 %[[FULL_ARR_DESC_SIZE]], ptr %[[OFFLOADSIZES]], align 8 + +// CHECK: %[[OFFLOADBASEPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_baseptrs, i32 0, i32 1 +// CHECK: store ptr @_QFEfull_arr, ptr %[[OFFLOADBASEPTRS]], align 8 +// CHECK: %[[OFFLOADPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_ptrs, i32 0, i32 1 +// CHECK: store ptr @_QFEfull_arr, ptr %[[OFFLOADPTRS]], align 8 +// CHECK: %[[OFFLOADBASEPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_baseptrs, i32 0, i32 2 +// CHECK: store ptr @_QFEfull_arr, ptr %[[OFFLOADBASEPTRS]], align 8 +// CHECK: %[[OFFLOADPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_ptrs, i32 0, i32 2 +// CHECK: store ptr %[[FULL_ARR_PTR]], ptr %[[OFFLOADPTRS]], align 8 +// CHECK: %[[OFFLOADSIZES:.*]] = getelementptr inbounds [9 x i64], ptr %.offload_sizes, i32 0, i32 2 +// CHECK: store i64 %[[FULL_ARR_SIZE]], ptr %[[OFFLOADSIZES]], align 8 + +// CHECK: %[[OFFLOADBASEPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_baseptrs, i32 0, i32 3 +// CHECK: store ptr @_QFEsect_arr, ptr %[[OFFLOADBASEPTRS]], align 8 +// CHECK: %[[OFFLOADPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_ptrs, i32 0, i32 3 +// CHECK: store ptr @_QFEsect_arr, ptr %[[OFFLOADPTRS]], align 8 +// CHECK: %[[OFFLOADSIZES:.*]] = getelementptr inbounds [9 x i64], ptr %.offload_sizes, i32 0, i32 3 +// CHECK: store i64 %[[ARR_SECT_DESC_SIZE]], ptr %[[OFFLOADSIZES]], align 8 +// CHECK: %[[OFFLOADBASEPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_baseptrs, i32 0, i32 4 +// CHECK: store ptr @_QFEsect_arr, ptr %[[OFFLOADBASEPTRS]], align 8 +// CHECK: %[[OFFLOADPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_ptrs, i32 0, i32 4 +// CHECK: store ptr @_QFEsect_arr, ptr %[[OFFLOADPTRS]], align 8 +// CHECK: %[[OFFLOADBASEPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_baseptrs, i32 0, i32 5 +// CHECK: store ptr @_QFEsect_arr, ptr %[[OFFLOADBASEPTRS]], align 8 +// CHECK: %[[OFFLOADPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_ptrs, i32 0, i32 5 +// CHECK: store ptr %[[ARR_SECT_PTR]], ptr %[[OFFLOADPTRS]], align 8 +// CHECK: %[[OFFLOADSIZES:.*]] = getelementptr inbounds [9 x i64], ptr %.offload_sizes, i32 0, i32 5 +// CHECK: store i64 %[[ARR_SECT_SIZE]], ptr %[[OFFLOADSIZES]], align 8 + +// CHECK: %[[OFFLOADBASEPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_baseptrs, i32 0, i32 6 +// CHECK: store ptr %[[SCALAR_ALLOCA]], ptr %[[OFFLOADBASEPTRS]], align 8 +// CHECK: %[[OFFLOADPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_ptrs, i32 0, i32 6 +// CHECK: store ptr %[[SCALAR_ALLOCA]], ptr %[[OFFLOADPTRS]], align 8 +// CHECK: %[[OFFLOADSIZES:.*]] = getelementptr inbounds [9 x i64], ptr %.offload_sizes, i32 0, i32 6 +// CHECK: store i64 %[[SCALAR_DESC_SZ]], ptr %[[OFFLOADSIZES]], align 8 +// CHECK: %[[OFFLOADBASEPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_baseptrs, i32 0, i32 7 +// CHECK: store ptr %[[SCALAR_ALLOCA]], ptr %[[OFFLOADBASEPTRS]], align 8 +// CHECK: %[[OFFLOADPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_ptrs, i32 0, i32 7 +// CHECK: store ptr %[[SCALAR_ALLOCA]], ptr %[[OFFLOADPTRS]], align 8 +// CHECK: %[[OFFLOADBASEPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_baseptrs, i32 0, i32 8 +// CHECK: store ptr %[[SCALAR_BASE]], ptr %[[OFFLOADBASEPTRS]], align 8 +// CHECK: %[[OFFLOADPTRS:.*]] = getelementptr inbounds [9 x ptr], ptr %.offload_ptrs, i32 0, i32 8 +// CHECK: store ptr %[[SCALAR_PTR]], ptr %[[OFFLOADPTRS]], align 8 diff --git a/openmp/libomptarget/test/offloading/fortran/target-map-allocatable-array-section-1d-bounds.f90 b/openmp/libomptarget/test/offloading/fortran/target-map-allocatable-array-section-1d-bounds.f90 new file mode 100644 index 0000000000000..99dbe99d40497 --- /dev/null +++ b/openmp/libomptarget/test/offloading/fortran/target-map-allocatable-array-section-1d-bounds.f90 @@ -0,0 +1,46 @@ +! Offloading test checking interaction of a +! two 1-D allocatable arrays with a target region +! while providing the map upper and lower bounds +! REQUIRES: flang, amdgcn-amd-amdhsa +! UNSUPPORTED: nvptx64-nvidia-cuda +! UNSUPPORTED: nvptx64-nvidia-cuda-LTO +! UNSUPPORTED: aarch64-unknown-linux-gnu +! UNSUPPORTED: aarch64-unknown-linux-gnu-LTO +! UNSUPPORTED: x86_64-pc-linux-gnu +! UNSUPPORTED: x86_64-pc-linux-gnu-LTO + +! RUN: %libomptarget-compile-fortran-run-and-check-generic +program main + integer, allocatable :: sp_read(:), sp_write(:) + allocate(sp_read(10)) + allocate(sp_write(10)) + + do i = 1, 10 + sp_read(i) = i + sp_write(i) = 0 + end do + + !$omp target map(tofrom:sp_read(2:6)) map(tofrom:sp_write(2:6)) + do i = 1, 10 + sp_write(i) = sp_read(i) + end do + !$omp end target + + do i = 1, 10 + print *, sp_write(i) + end do + + deallocate(sp_read) + deallocate(sp_write) +end program + +! CHECK: 0 +! CHECK: 2 +! CHECK: 3 +! CHECK: 4 +! CHECK: 5 +! CHECK: 6 +! CHECK: 0 +! CHECK: 0 +! CHECK: 0 +! CHECK: 0 diff --git a/openmp/libomptarget/test/offloading/fortran/target-map-allocatable-array-section-3d-bounds.f90 b/openmp/libomptarget/test/offloading/fortran/target-map-allocatable-array-section-3d-bounds.f90 new file mode 100644 index 0000000000000..0786e0fd744e7 --- /dev/null +++ b/openmp/libomptarget/test/offloading/fortran/target-map-allocatable-array-section-3d-bounds.f90 @@ -0,0 +1,44 @@ +! Offloading test checking interaction of allocatables +! with multi-dimensional bounds (3-D in this case) and +! a target region +! REQUIRES: flang, amdgcn-amd-amdhsa +! UNSUPPORTED: nvptx64-nvidia-cuda +! UNSUPPORTED: nvptx64-nvidia-cuda-LTO +! UNSUPPORTED: aarch64-unknown-linux-gnu +! UNSUPPORTED: aarch64-unknown-linux-gnu-LTO +! UNSUPPORTED: x86_64-pc-linux-gnu +! UNSUPPORTED: x86_64-pc-linux-gnu-LTO + +! RUN: %libomptarget-compile-fortran-run-and-check-generic +program main + integer, allocatable :: inArray(:,:,:) + integer, allocatable :: outArray(:,:,:) + + allocate(inArray(3,3,3)) + allocate(outArray(3,3,3)) + + do i = 1, 3 + do j = 1, 3 + do k = 1, 3 + inArray(i, j, k) = 42 + outArray(i, j, k) = 0 + end do + end do + end do + +!$omp target map(tofrom:inArray(1:3, 1:3, 2:2), outArray(1:3, 1:3, 1:3)) + do j = 1, 3 + do k = 1, 3 + outArray(k, j, 2) = inArray(k, j, 2) + end do + end do +!$omp end target + +print *, outArray + +deallocate(inArray) +deallocate(outArray) + +end program + +! CHECK: 0 0 0 0 0 0 0 0 0 42 42 42 42 42 42 42 42 42 0 0 0 0 0 0 0 0 0 diff --git a/openmp/libomptarget/test/offloading/fortran/target-map-allocatable-map-scopes.f90 b/openmp/libomptarget/test/offloading/fortran/target-map-allocatable-map-scopes.f90 new file mode 100644 index 0000000000000..bb47d3de96d2a --- /dev/null +++ b/openmp/libomptarget/test/offloading/fortran/target-map-allocatable-map-scopes.f90 @@ -0,0 +1,66 @@ +! Offloading test checking interaction of allocatables +! with target in different scopes +! REQUIRES: flang, amdgcn-amd-amdhsa +! UNSUPPORTED: nvptx64-nvidia-cuda +! UNSUPPORTED: nvptx64-nvidia-cuda-LTO +! UNSUPPORTED: aarch64-unknown-linux-gnu +! UNSUPPORTED: aarch64-unknown-linux-gnu-LTO +! UNSUPPORTED: x86_64-pc-linux-gnu +! UNSUPPORTED: x86_64-pc-linux-gnu-LTO + +! RUN: %libomptarget-compile-fortran-run-and-check-generic +module test + contains + subroutine func_arg(arg_alloc) + integer, allocatable, intent (inout) :: arg_alloc(:) + + !$omp target map(tofrom: arg_alloc) + do index = 1, 10 + arg_alloc(index) = arg_alloc(index) + index + end do + !$omp end target + + print *, arg_alloc + end subroutine func_arg +end module + +subroutine func + integer, allocatable :: local_alloc(:) + allocate(local_alloc(10)) + + !$omp target map(tofrom: local_alloc) + do index = 1, 10 + local_alloc(index) = index + end do + !$omp end target + + print *, local_alloc + + deallocate(local_alloc) +end subroutine func + + +program main + use test + integer, allocatable :: map_ptr(:) + + allocate(map_ptr(10)) + + !$omp target map(tofrom: map_ptr) + do index = 1, 10 + map_ptr(index) = index + end do + !$omp end target + + call func + + print *, map_ptr + + call func_arg(map_ptr) + + deallocate(map_ptr) +end program + +! CHECK: 1 2 3 4 5 6 7 8 9 10 +! CHECK: 1 2 3 4 5 6 7 8 9 10 +! CHECK: 2 4 6 8 10 12 14 16 18 20 diff --git a/openmp/libomptarget/test/offloading/fortran/target-map-enter-exit-allocatables.f90 b/openmp/libomptarget/test/offloading/fortran/target-map-enter-exit-allocatables.f90 new file mode 100644 index 0000000000000..865be95ba9682 --- /dev/null +++ b/openmp/libomptarget/test/offloading/fortran/target-map-enter-exit-allocatables.f90 @@ -0,0 +1,44 @@ +! Offloading test checking interaction of allocatables +! with enter, exit and target +! REQUIRES: flang, amdgcn-amd-amdhsa +! UNSUPPORTED: nvptx64-nvidia-cuda +! UNSUPPORTED: nvptx64-nvidia-cuda-LTO +! UNSUPPORTED: aarch64-unknown-linux-gnu +! UNSUPPORTED: aarch64-unknown-linux-gnu-LTO +! UNSUPPORTED: x86_64-pc-linux-gnu +! UNSUPPORTED: x86_64-pc-linux-gnu-LTO + +! RUN: %libomptarget-compile-fortran-run-and-check-generic +program main + integer, allocatable :: A(:) + allocate(A(10)) + + !$omp target enter data map(alloc: A) + + !$omp target + do I = 1, 10 + A(I) = I + end do + !$omp end target + + !$omp target exit data map(from: A) + + !$omp target exit data map(delete: A) + + do i = 1, 10 + print *, A(i) + end do + + deallocate(A) +end program + +! CHECK: 1 +! CHECK: 2 +! CHECK: 3 +! CHECK: 4 +! CHECK: 5 +! CHECK: 6 +! CHECK: 7 +! CHECK: 8 +! CHECK: 9 +! CHECK: 10 diff --git a/openmp/libomptarget/test/offloading/fortran/target-map-enter-exit-array.f90 b/openmp/libomptarget/test/offloading/fortran/target-map-enter-exit-array.f90 new file mode 100644 index 0000000000000..4a9fb6ee177f6 --- /dev/null +++ b/openmp/libomptarget/test/offloading/fortran/target-map-enter-exit-array.f90 @@ -0,0 +1,41 @@ +! Offloading test checking interaction of fixed size +! arrays with enter, exit and target +! REQUIRES: flang, amdgcn-amd-amdhsa +! UNSUPPORTED: nvptx64-nvidia-cuda +! UNSUPPORTED: nvptx64-nvidia-cuda-LTO +! UNSUPPORTED: aarch64-unknown-linux-gnu +! UNSUPPORTED: aarch64-unknown-linux-gnu-LTO +! UNSUPPORTED: x86_64-pc-linux-gnu +! UNSUPPORTED: x86_64-pc-linux-gnu-LTO + +! RUN: %libomptarget-compile-fortran-run-and-check-generic +program main + integer :: A(10) + + !$omp target enter data map(alloc: A) + + !$omp target + do I = 1, 10 + A(I) = I + end do + !$omp end target + + !$omp target exit data map(from: A) + + !$omp target exit data map(delete: A) + + do i = 1, 10 + print *, A(i) + end do +end program + +! CHECK: 1 +! CHECK: 2 +! CHECK: 3 +! CHECK: 4 +! CHECK: 5 +! CHECK: 6 +! CHECK: 7 +! CHECK: 8 +! CHECK: 9 +! CHECK: 10 diff --git a/openmp/libomptarget/test/offloading/fortran/target-map-pointer-scopes-enter-exit.f90 b/openmp/libomptarget/test/offloading/fortran/target-map-pointer-scopes-enter-exit.f90 new file mode 100644 index 0000000000000..dee75af06927b --- /dev/null +++ b/openmp/libomptarget/test/offloading/fortran/target-map-pointer-scopes-enter-exit.f90 @@ -0,0 +1,83 @@ +! Offloading test checking interaction of pointers +! with target in different scopes +! REQUIRES: flang, amdgcn-amd-amdhsa +! UNSUPPORTED: nvptx64-nvidia-cuda +! UNSUPPORTED: nvptx64-nvidia-cuda-LTO +! UNSUPPORTED: aarch64-unknown-linux-gnu +! UNSUPPORTED: aarch64-unknown-linux-gnu-LTO +! UNSUPPORTED: x86_64-pc-linux-gnu +! UNSUPPORTED: x86_64-pc-linux-gnu-LTO + +! RUN: %libomptarget-compile-fortran-run-and-check-generic +module test + contains + subroutine func_arg(arg_alloc) + integer, pointer, intent (inout) :: arg_alloc(:) + + !$omp target enter data map(alloc: arg_alloc) + + !$omp target + do index = 1, 10 + arg_alloc(index) = arg_alloc(index) + index + end do + !$omp end target + + !$omp target exit data map(from: arg_alloc) + + !$omp target exit data map(delete: arg_alloc) + + print *, arg_alloc + end subroutine func_arg +end module + +subroutine func + integer, pointer :: local_alloc(:) + allocate(local_alloc(10)) + + !$omp target enter data map(alloc: local_alloc) + + !$omp target + do index = 1, 10 + local_alloc(index) = index + end do + !$omp end target + + !$omp target exit data map(from: local_alloc) + + !$omp target exit data map(delete: local_alloc) + + print *, local_alloc + + deallocate(local_alloc) +end subroutine func + + +program main + use test + integer, pointer :: map_ptr(:) + allocate(map_ptr(10)) + + !$omp target enter data map(alloc: map_ptr) + + !$omp target + do index = 1, 10 + map_ptr(index) = index + end do + !$omp end target + + !$omp target exit data map(from: map_ptr) + + !$omp target exit data map(delete: map_ptr) + + call func + + print *, map_ptr + + call func_arg(map_ptr) + + deallocate(map_ptr) +end program + +! CHECK: 1 2 3 4 5 6 7 8 9 10 +! CHECK: 1 2 3 4 5 6 7 8 9 10 +! CHECK: 2 4 6 8 10 12 14 16 18 20 diff --git a/openmp/libomptarget/test/offloading/fortran/target-map-pointer-target-array-section-3d-bounds.f90 b/openmp/libomptarget/test/offloading/fortran/target-map-pointer-target-array-section-3d-bounds.f90 new file mode 100644 index 0000000000000..ff2298cf5dbc9 --- /dev/null +++ b/openmp/libomptarget/test/offloading/fortran/target-map-pointer-target-array-section-3d-bounds.f90 @@ -0,0 +1,43 @@ +! Offloading test checking interaction of pointer +! and target with target where 3-D bounds have +! been specified +! REQUIRES: flang, amdgcn-amd-amdhsa +! UNSUPPORTED: nvptx64-nvidia-cuda +! UNSUPPORTED: nvptx64-nvidia-cuda-LTO +! UNSUPPORTED: aarch64-unknown-linux-gnu +! UNSUPPORTED: aarch64-unknown-linux-gnu-LTO +! UNSUPPORTED: x86_64-pc-linux-gnu +! UNSUPPORTED: x86_64-pc-linux-gnu-LTO + +! RUN: %libomptarget-compile-fortran-run-and-check-generic +program main + integer, pointer :: inArray(:,:,:) + integer, pointer :: outArray(:,:,:) + integer, target :: in(3,3,3) + integer, target :: out(3,3,3) + + inArray => in + outArray => out + + do i = 1, 3 + do j = 1, 3 + do k = 1, 3 + inArray(i, j, k) = 42 + outArray(i, j, k) = 0 + end do + end do + end do + +!$omp target map(tofrom:inArray(1:3, 1:3, 2:2), outArray(1:3, 1:3, 1:3)) + do j = 1, 3 + do k = 1, 3 + outArray(k, j, 2) = inArray(k, j, 2) + end do + end do +!$omp end target + + print *, outArray + +end program + +! CHECK: 0 0 0 0 0 0 0 0 0 42 42 42 42 42 42 42 42 42 0 0 0 0 0 0 0 0 0 diff --git a/openmp/libomptarget/test/offloading/fortran/target-map-pointer-target-scopes.f90 b/openmp/libomptarget/test/offloading/fortran/target-map-pointer-target-scopes.f90 new file mode 100644 index 0000000000000..d9a7000719f0e --- /dev/null +++ b/openmp/libomptarget/test/offloading/fortran/target-map-pointer-target-scopes.f90 @@ -0,0 +1,64 @@ +! Offloading test checking interaction of pointer +! and target with target across multiple scopes +! REQUIRES: flang, amdgcn-amd-amdhsa +! UNSUPPORTED: nvptx64-nvidia-cuda +! UNSUPPORTED: nvptx64-nvidia-cuda-LTO +! UNSUPPORTED: aarch64-unknown-linux-gnu +! UNSUPPORTED: aarch64-unknown-linux-gnu-LTO +! UNSUPPORTED: x86_64-pc-linux-gnu +! UNSUPPORTED: x86_64-pc-linux-gnu-LTO + +! RUN: %libomptarget-compile-fortran-run-and-check-generic +module test + contains + subroutine func_arg(arg_alloc) + integer, pointer, intent (inout) :: arg_alloc(:) + + !$omp target map(tofrom: arg_alloc) + do index = 1, 10 + arg_alloc(index) = arg_alloc(index) + index + end do + !$omp end target + + print *, arg_alloc + end subroutine func_arg +end module + +subroutine func + integer, pointer :: local_alloc(:) + integer, target :: b(10) + local_alloc => b + + !$omp target map(tofrom: local_alloc) + do index = 1, 10 + local_alloc(index) = index + end do + !$omp end target + + print *, local_alloc + end subroutine func + + + program main + use test + integer, pointer :: map_ptr(:) + integer, target :: b(10) + + map_ptr => b + + !$omp target map(tofrom: map_ptr) + do index = 1, 10 + map_ptr(index) = index + end do + !$omp end target + + call func + + print *, map_ptr + + call func_arg(map_ptr) +end program + +!CHECK: 1 2 3 4 5 6 7 8 9 10 +!CHECK: 1 2 3 4 5 6 7 8 9 10 +!CHECK: 2 4 6 8 10 12 14 16 18 20