-
Notifications
You must be signed in to change notification settings - Fork 12k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[flang][openacc] Enable lowering support for OpenACC atomic operations #65776
Conversation
Since the OpenACC atomics specification is a subset of OpenMP atomics, the same lowering implementation can be used. This change extracts out the necessary pieces from the OpenMP lowering and puts them in a shared spot. The shared spot is a header file so that each implementation can template specialize directly. After putting the OpenMP implementation in a common spot, the following changes were needed to make it work for OpenACC: * Ensure parsing works correctly by avoiding hardcoded offsets. * Templatize based on atomic type. * The checking whether it is OpenMP or OpenACC is done by checking for OmpAtomicClauseList (OpenACC does not implement this so we just templatize with void). It was preferable to check this instead of atomic type because in some cases, like atomic capture, the read/write/update implementations are called - and we want compile time evaluation of these conditional parts. * The memory order and hint are used only for OpenMP. * Generate acc dialect operations instead of omp dialect operations.
I will abandon previous revision on phabricator: https://reviews.llvm.org/D158772 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very nice to me. Thank you!
flang/lib/Lower/DirectivesCommon.h
Outdated
static inline void genOmpAccAtomicCaptureStatement( | ||
Fortran::lower::AbstractConverter &converter, | ||
Fortran::lower::pft::Evaluation &eval, mlir::Value fromAddress, | ||
mlir::Value toAddress, const AtomicListT *leftHandClauseList, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: some Flang sources use C++ [[maybe_unused]]
attribute - this will allow avoiding dummy references in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know. The dummy references was a recommendation from description of LLVM_ATTRIBUTE_UNUSED. I like your recommendation better.
flang/lib/Lower/DirectivesCommon.h
Outdated
template <typename AtomicListT> | ||
static inline void genOmpAccAtomicCaptureStatement( | ||
Fortran::lower::AbstractConverter &converter, | ||
Fortran::lower::pft::Evaluation &eval, mlir::Value fromAddress, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it looks like eval
is unused, but we may keep it since this is coming from the original OpenMP code. It may be cleaned up later like in #65678
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems OK to change now since already the code loses its history and there were multiple modifications to make it work for OpenACC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I plan to merge in about 2 hours. :) Thank you for the reviews! |
@llvm/pr-subscribers-flang-fir-hlfir ChangesSince the OpenACC atomics specification is a subset of OpenMP atomics, the same lowering implementation can be used. This change extracts out the necessary pieces from the OpenMP lowering and puts them in a shared spot. The shared spot is a header file so that each implementation can template specialize directly. After putting the OpenMP implementation in a common spot, the following changes were needed to make it work for OpenACC:
Patch is 69.58 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/65776.diff 9 Files Affected:
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 75784f4e5a72be2..cbe108194dd212f 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -2287,9 +2287,11 @@ class FirConverter : public Fortran::lower::AbstractConverter { void genFIR(const Fortran::parser::OpenACCConstruct &acc) { mlir::OpBuilder::InsertPoint insertPt = builder->saveInsertionPoint(); + localSymbols.pushScope(); genOpenACCConstruct(*this, bridge.getSemanticsContext(), getEval(), acc); for (Fortran::lower::pft::Evaluation &e : getEval().getNestedEvaluations()) genFIR(e); + localSymbols.popScope(); builder->restoreInsertionPoint(insertPt); } diff --git a/flang/lib/Lower/DirectivesCommon.h b/flang/lib/Lower/DirectivesCommon.h new file mode 100644 index 000000000000000..35825a20b4cf93f --- /dev/null +++ b/flang/lib/Lower/DirectivesCommon.h @@ -0,0 +1,593 @@ +//===-- Lower/DirectivesCommon.h --------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Coding style: https://mlir.llvm.org/getting_started/DeveloperGuide/ +// +//===----------------------------------------------------------------------===// +/// +/// A location to place directive utilities shared across multiple lowering +/// files, e.g. utilities shared in OpenMP and OpenACC. The header file can +/// be used for both declarations and templated/inline implementations. +//===----------------------------------------------------------------------===// + +#ifndef FORTRAN_LOWER_DIRECTIVES_COMMON_H +#define FORTRAN_LOWER_DIRECTIVES_COMMON_H + +#include "flang/Common/idioms.h" +#include "flang/Lower/Bridge.h" +#include "flang/Lower/ConvertExpr.h" +#include "flang/Lower/ConvertVariable.h" +#include "flang/Lower/OpenACC.h" +#include "flang/Lower/OpenMP.h" +#include "flang/Lower/PFTBuilder.h" +#include "flang/Lower/StatementContext.h" +#include "flang/Optimizer/Builder/BoxValue.h" +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/Todo.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Parser/parse-tree.h" +#include "flang/Semantics/openmp-directive-sets.h" +#include "flang/Semantics/tools.h" +#include "mlir/Dialect/OpenACC/OpenACC.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/Dialect/SCF/IR/SCF.h" +#include "llvm/Frontend/OpenMP/OMPConstants.h" +#include + +namespace Fortran { +namespace lower { + +/// Checks if the assignment statement has a single variable on the RHS. +static inline bool checkForSingleVariableOnRHS( + const Fortran::parser::AssignmentStmt &assignmentStmt) { + const Fortran::parser::Expr &expr{ + std::get(assignmentStmt.t)}; + const Fortran::common::Indirection *designator = + std::get_if>( + &expr.u); + const Fortran::parser::Name *name = + designator + ? Fortran::semantics::getDesignatorNameIfDataRef(designator->value()) + : nullptr; + return name != nullptr; +} + +/// Checks if the symbol on the LHS of the assignment statement is present in +/// the RHS expression. +static inline bool +checkForSymbolMatch(const Fortran::parser::AssignmentStmt &assignmentStmt) { + const auto &var{std::get(assignmentStmt.t)}; + const auto &expr{std::get(assignmentStmt.t)}; + const auto *e{Fortran::semantics::GetExpr(expr)}; + const auto *v{Fortran::semantics::GetExpr(var)}; + auto varSyms{Fortran::evaluate::GetSymbolVector(*v)}; + const Fortran::semantics::Symbol &varSymbol{*varSyms.front()}; + for (const Fortran::semantics::Symbol &symbol : + Fortran::evaluate::GetSymbolVector(*e)) + if (varSymbol == symbol) + return true; + return false; +} + +/// Populates \p hint and \p memoryOrder with appropriate clause information +/// if present on atomic construct. +static inline void genOmpAtomicHintAndMemoryOrderClauses( + Fortran::lower::AbstractConverter &converter, + const Fortran::parser::OmpAtomicClauseList &clauseList, + mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { + if (const auto *ompClause = + std::get_if(&clause.u)) { + if (const auto *hintClause = + std::get_if(&ompClause->u)) { + const auto *expr = Fortran::semantics::GetExpr(hintClause->v); + uint64_t hintExprValue = *Fortran::evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + } + } else if (const auto *ompMemoryOrderClause = + std::get_if( + &clause.u)) { + if (std::get_if( + &ompMemoryOrderClause->v.u)) { + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), + mlir::omp::ClauseMemoryOrderKind::Acquire); + } else if (std::get_if( + &ompMemoryOrderClause->v.u)) { + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), + mlir::omp::ClauseMemoryOrderKind::Relaxed); + } else if (std::get_if( + &ompMemoryOrderClause->v.u)) { + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), + mlir::omp::ClauseMemoryOrderKind::Seq_cst); + } else if (std::get_if( + &ompMemoryOrderClause->v.u)) { + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), + mlir::omp::ClauseMemoryOrderKind::Release); + } + } + } +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +template +static inline void genOmpAccAtomicCaptureStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + [[maybe_unused]] const AtomicListT *leftHandClauseList, + [[maybe_unused]] const AtomicListT *rightHandClauseList, + mlir::Type elementType) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + if constexpr (std::is_same()) { + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, + hint, memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, + hint, memoryOrder); + firOpBuilder.create( + currentLocation, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), hint, memoryOrder); + } else { + firOpBuilder.create( + currentLocation, fromAddress, toAddress, + mlir::TypeAttr::get(elementType)); + } +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +template +static inline void genOmpAccAtomicWriteStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, + [[maybe_unused]] const AtomicListT *rightHandClauseList, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + if constexpr (std::is_same()) { + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, + hint, memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, + hint, memoryOrder); + firOpBuilder.create(currentLocation, lhsAddr, + rhsExpr, hint, memoryOrder); + } else { + firOpBuilder.create(currentLocation, lhsAddr, + rhsExpr); + } +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +template +static inline void genOmpAccAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, + [[maybe_unused]] const AtomicListT *leftHandClauseList, + [[maybe_unused]] const AtomicListT *rightHandClauseList) { + // Generate `omp.atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + const auto *varDesignator = + std::get_if>( + &assignmentStmtVariable.u); + assert(varDesignator && "Variable designator for atomic update assignment " + "statement does not exist"); + const Fortran::parser::Name *name = + Fortran::semantics::getDesignatorNameIfDataRef(varDesignator->value()); + if (!name) + TODO(converter.getCurrentLocation(), + "Array references as atomic update variable"); + assert(name && name->symbol && + "No symbol attached to atomic update variable"); + if (Fortran::semantics::IsAllocatableOrPointer(name->symbol->GetUltimate())) + converter.bindSymbol(*name->symbol, lhsAddr); + + // Lowering is in two steps : + // subroutine sb + // integer :: a, b + // !$omp atomic update + // a = a + b + // end subroutine + // + // 1. Lower to scf.execute_region_op + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = scf.execute_region -> i32 { + // %3 = fir.load %0 : !fir.ref + // %4 = fir.load %1 : !fir.ref + // %5 = arith.addi %3, %4 : i32 + // scf.yield %5 : i32 + // } + // return + // } + auto tempOp = + firOpBuilder.create(currentLocation, varType); + firOpBuilder.createBlock(&tempOp.getRegion()); + mlir::Block &block = tempOp.getRegion().back(); + firOpBuilder.setInsertionPointToEnd(&block); + Fortran::lower::StatementContext stmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), stmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + // Insert the terminator: YieldOp. + firOpBuilder.create(currentLocation, convertResult); + firOpBuilder.setInsertionPointToStart(&block); + + // 2. Create the omp.atomic.update Operation using the Operations in the + // temporary scf.execute_region Operation. + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = fir.load %1 : !fir.ref + // %4 = arith.addi %arg0, %3 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + mlir::Value updateVar = converter.getSymbolAddress(*name->symbol); + if (auto decl = updateVar.getDefiningOp()) + updateVar = decl.getBase(); + + firOpBuilder.setInsertionPointAfter(tempOp); + + mlir::Operation *atomicUpdateOp = nullptr; + if constexpr (std::is_same()) { + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, + hint, memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, + hint, memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, updateVar, hint, memoryOrder); + } else { + atomicUpdateOp = firOpBuilder.create( + currentLocation, updateVar); + } + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + llvm::SmallVector ops; + for (mlir::Operation &op : tempOp.getRegion().getOps()) + ops.push_back(&op); + + // SCF Yield is converted to OMP Yield. All other operations are copied + for (mlir::Operation *op : ops) { + if (auto y = mlir::dyn_cast(op)) { + firOpBuilder.setInsertionPointToEnd( + &atomicUpdateOp->getRegion(0).front()); + if constexpr (std::is_same()) { + firOpBuilder.create(currentLocation, + y.getResults()); + } else { + firOpBuilder.create(currentLocation, + y.getResults()); + } + op->erase(); + } else { + op->remove(); + atomicUpdateOp->getRegion(0).front().push_back(op); + } + } + + // Remove the load and replace all uses of load with the block argument + for (mlir::Operation &op : atomicUpdateOp->getRegion(0).getOps()) { + fir::LoadOp y = mlir::dyn_cast(&op); + if (y && y.getMemref() == updateVar) + y.getRes().replaceAllUsesWith(val); + } + + tempOp.erase(); +} + +/// Processes an atomic construct with write clause. +template +void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, + const AtomicT &atomicWrite) { + const AtomicListT *rightHandClauseList = nullptr; + const AtomicListT *leftHandClauseList = nullptr; + if constexpr (std::is_same()) { + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + } + + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList); +} + +/// Processes an atomic construct with read clause. +template +void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, + const AtomicT &atomicRead) { + const AtomicListT *rightHandClauseList = nullptr; + const AtomicListT *leftHandClauseList = nullptr; + if constexpr (std::is_same()) { + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + } + + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType); +} + +/// Processes an atomic construct with update clause. +template +void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const AtomicT &atomicUpdate) { + const AtomicListT *rightHandClauseList = nullptr; + const AtomicListT *leftHandClauseList = nullptr; + if constexpr (std::is_same()) { + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + } + + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assign... |
llvm#65776) Since the OpenACC atomics specification is a subset of OpenMP atomics, the same lowering implementation can be used. This change extracts out the necessary pieces from the OpenMP lowering and puts them in a shared spot. The shared spot is a header file so that each implementation can template specialize directly. After putting the OpenMP implementation in a common spot, the following changes were needed to make it work for OpenACC: * Ensure parsing works correctly by avoiding hardcoded offsets. * Templatize based on atomic type. * The checking whether it is OpenMP or OpenACC is done by checking for OmpAtomicClauseList (OpenACC does not implement this so we just templatize with void). It was preferable to check this instead of atomic type because in some cases, like atomic capture, the read/write/update implementations are called - and we want compile time evaluation of these conditional parts. * The memory order and hint are used only for OpenMP. * Generate acc dialect operations instead of omp dialect operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the late review. I mainly worked on the OpenMP side of atomic construct.
Thank you for the sharing of code between OpenACC and OpenMP. Overall, the sharing between OpenMP and OpenACC looks good to me. Minor comments have already been pointed out by others and fixed.
Thanks!
llvm#65776) Since the OpenACC atomics specification is a subset of OpenMP atomics, the same lowering implementation can be used. This change extracts out the necessary pieces from the OpenMP lowering and puts them in a shared spot. The shared spot is a header file so that each implementation can template specialize directly. After putting the OpenMP implementation in a common spot, the following changes were needed to make it work for OpenACC: * Ensure parsing works correctly by avoiding hardcoded offsets. * Templatize based on atomic type. * The checking whether it is OpenMP or OpenACC is done by checking for OmpAtomicClauseList (OpenACC does not implement this so we just templatize with void). It was preferable to check this instead of atomic type because in some cases, like atomic capture, the read/write/update implementations are called - and we want compile time evaluation of these conditional parts. * The memory order and hint are used only for OpenMP. * Generate acc dialect operations instead of omp dialect operations.
Since the OpenACC atomics specification is a subset of OpenMP atomics, the same lowering implementation can be used. This change extracts out the necessary pieces from the OpenMP lowering and puts them in a shared spot. The shared spot is a header file so that each implementation can template specialize directly.
After putting the OpenMP implementation in a common spot, the following changes were needed to make it work for OpenACC: