[OpenACC][CIR] Generate private recipe pointer/array 'alloca's #160911
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -10,6 +10,8 @@ | |||
// | ||||
//===----------------------------------------------------------------------===// | ||||
|
||||
#include <numeric> | ||||
|
||||
#include "CIRGenOpenACCRecipe.h" | ||||
|
||||
namespace clang::CIRGen { | ||||
|
@@ -35,6 +37,110 @@ mlir::Block *OpenACCRecipeBuilderBase::createRecipeBlock(mlir::Region &region, | |||
return builder.createBlock(&region, region.end(), types, locs); | ||||
} | ||||
|
||||
mlir::Value OpenACCRecipeBuilderBase::makeBoundsAlloca( | ||||
mlir::Block *block, SourceRange exprRange, mlir::Location loc, | ||||
std::string_view allocaName, size_t numBounds, | ||||
llvm::ArrayRef<QualType> boundTypes) { | ||||
mlir::OpBuilder::InsertionGuard guardCase(builder); | ||||
|
||||
// Get the range of bounds arguments, which are all but the 1st arg. | ||||
llvm::ArrayRef<mlir::BlockArgument> boundsRange = | ||||
block->getArguments().drop_front(1); | ||||
|
||||
// boundTypes contains the before and after of each bounds, so it ends up | ||||
// having 1 extra. Assert this is the case to ensure we don't call this in the | ||||
// wrong 'block'. | ||||
assert(boundsRange.size() + 1 == boundTypes.size()); | ||||
|
||||
mlir::Type itrTy = cgf.cgm.convertType(cgf.getContext().UnsignedLongLongTy); | ||||
auto idxType = mlir::IndexType::get(&cgf.getMLIRContext()); | ||||
|
||||
auto getUpperBound = [&](mlir::Value bound) { | ||||
auto upperBoundVal = | ||||
mlir::acc::GetUpperboundOp::create(builder, loc, idxType, bound); | ||||
return mlir::UnrealizedConversionCastOp::create(builder, loc, itrTy, | ||||
upperBoundVal.getResult()) | ||||
.getResult(0); | ||||
}; | ||||
|
||||
auto isArrayTy = [&](QualType ty) { | ||||
if (ty->isArrayType() && !ty->isConstantArrayType()) | ||||
cgf.cgm.errorNYI(exprRange, "OpenACC recipe init for VLAs"); | ||||
return ty->isConstantArrayType(); | ||||
}; | ||||
|
||||
mlir::Type topLevelTy = cgf.convertType(boundTypes.back()); | ||||
cir::PointerType topLevelTyPtr = builder.getPointerTo(topLevelTy); | ||||
// Do an alloca for the 'top' level type without bounds. | ||||
mlir::Value initialAlloca = builder.createAlloca( | ||||
loc, topLevelTyPtr, topLevelTy, allocaName, | ||||
cgf.getContext().getTypeAlignInChars(boundTypes.back())); | ||||
|
||||
bool lastBoundWasArray = isArrayTy(boundTypes.back()); | ||||
|
||||
// Since we're iterating the types in reverse, this sets up for each index | ||||
// corresponding to the boundsRange to be the 'after application of the | ||||
// bounds. | ||||
llvm::ArrayRef<QualType> boundResults = boundTypes.drop_back(1); | ||||
|
||||
// Collect the 'do we have any allocas needed after this type' list. | ||||
llvm::SmallVector<bool> allocasLeftArr; | ||||
llvm::ArrayRef<QualType> resultTypes = boundTypes.drop_front(); | ||||
std::transform_inclusive_scan( | ||||
resultTypes.begin(), resultTypes.end(), | ||||
std::back_inserter(allocasLeftArr), std::plus<bool>{}, | ||||
[](QualType ty) { return !ty->isConstantArrayType(); }); | ||||
|
||||
Comment on lines +89 to +93

Not sure if I misunderstood something, but I think this part will not compile, from the header file, this part will report a type mismatch error ->
Error message
I don't know why in The code will compile if we use the other overloading version of

@erichkeane I just encountered the same compilation error. As @AmrDeveloper says, the default parameter fixes this: #161428

This is strange, it compiles for me! I wonder if the lookup/etc rules have changed or are just different in our compilers. Thank you for the patch, I'll look at it ASAP.

Thanks, you two! I looked at the review; that is fine. It appears to be a mistake in the library, I suspect they're copying the restriction from |
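
For readers following this thread, the flagged call computes, for each result type, whether any type up to and including it is not a constant array. A minimal standalone sketch of that scan follows. It assumes plain `bool` inputs in place of `QualType`, uses a hypothetical helper name `allocasLeftFlags`, and swaps `std::plus<bool>` for a logical-or lambda. It is not meant to reproduce the change in #161428.

```cpp
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of the patch).
// isConstantArray[i] is true when the i-th result type is a constant array.
// The returned flags[i] is true when any type in [0, i] is NOT a constant
// array, which is what the inclusive scan in the diff accumulates.
std::vector<bool> allocasLeftFlags(const std::vector<bool> &isConstantArray) {
  std::vector<bool> flags;
  std::transform_inclusive_scan(
      isConstantArray.begin(), isConstantArray.end(),
      std::back_inserter(flags),
      // Binary op: combine the running flag with the current element.
      [](bool seen, bool cur) { return seen || cur; },
      // Unary op: applied to each input before it is combined.
      [](bool isConstArr) { return !isConstArr; });
  return flags;
}
```

Writing the combining step as a lambda over `bool` keeps the accumulator type explicit, which may sidestep differences between standard library implementations.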
||||
// Keep track of the number of 'elements' that we're allocating. Individual | ||||
// allocas should multiply this by the size of its current allocation. | ||||
mlir::Value cumulativeElts; | ||||
for (auto [bound, resultType, allocasLeft] : llvm::reverse( | ||||
llvm::zip_equal(boundsRange, boundResults, allocasLeftArr))) { | ||||
|
||||
// if there is no further 'alloca' operation we need to do, we can skip | ||||
// creating the UB/multiplications/etc. | ||||
if (!allocasLeft) | ||||
break; | ||||
|
||||
// First: figure out the number of elements in the current 'bound' list. | ||||
mlir::Value eltsPerSubArray = getUpperBound(bound); | ||||
mlir::Value eltsToAlloca; | ||||
|
||||
// IF we are in a sub-bounds, the total number of elements to alloca is | ||||
// the product of that one and the current 'bounds' size. That is, | ||||
// arr[5][5], we would need 25 elements, not just 5. Else it is just the | ||||
// current number of elements. | ||||
if (cumulativeElts) | ||||
eltsToAlloca = builder.createMul(loc, eltsPerSubArray, cumulativeElts); | ||||
else | ||||
eltsToAlloca = eltsPerSubArray; | ||||
|
||||
if (!lastBoundWasArray) { | ||||
// If we have to do an allocation, figure out the size of the | ||||
// allocation. alloca takes the number of bytes, not elements. | ||||
TypeInfoChars eltInfo = cgf.getContext().getTypeInfoInChars(resultType); | ||||
cir::ConstantOp eltSize = builder.getConstInt( | ||||
loc, itrTy, eltInfo.Width.alignTo(eltInfo.Align).getQuantity()); | ||||
mlir::Value curSize = builder.createMul(loc, eltsToAlloca, eltSize); | ||||
|
||||
mlir::Type eltTy = cgf.convertType(resultType); | ||||
cir::PointerType ptrTy = builder.getPointerTo(eltTy); | ||||
builder.createAlloca(loc, ptrTy, eltTy, "openacc.init.bounds", | ||||
cgf.getContext().getTypeAlignInChars(resultType), | ||||
curSize); | ||||
|
||||
// TODO: OpenACC : At this point we should be copying the addresses of | ||||
// each element of this to the last allocation. At the moment, that is | ||||
// not yet implemented. | ||||
cgf.cgm.errorNYI(exprRange, "OpenACC recipe alloca copying"); | ||||
} | ||||
|
||||
cumulativeElts = eltsToAlloca; | ||||
lastBoundWasArray = isArrayTy(resultType); | ||||
} | ||||
return initialAlloca; | ||||
} | ||||
|
||||
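
The element and byte bookkeeping in the loop above can be checked with ordinary integers. This is a minimal sketch, assuming upper bounds that are already known as integers in place of the values produced by `mlir::acc::GetUpperboundOp`. The helper names `cumulativeElements` and `allocationBytes` are hypothetical and not part of the patch.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: product of the upper bounds seen so far, mirroring how
// cumulativeElts is multiplied by each new bound. Starting at 1 gives the same
// result as the "first bound is used as-is" branch in the diff.
std::uint64_t cumulativeElements(const std::vector<std::uint64_t> &upperBounds) {
  std::uint64_t total = 1;
  for (std::uint64_t ub : upperBounds)
    total *= ub;
  return total;
}

// Hypothetical helper: the size is a byte count, so the element count is
// scaled by the element width rounded up to a multiple of its alignment.
std::uint64_t allocationBytes(std::uint64_t elements, std::uint64_t width,
                              std::uint64_t align) {
  std::uint64_t paddedWidth = (width + align - 1) / align * align;
  return elements * paddedWidth;
}
```

For the `arr[5][5]` case mentioned in the code comment, `cumulativeElements({5, 5})` gives 25 elements, and with a 4-byte, 4-aligned element `allocationBytes(25, 4, 4)` gives 100 bytes.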
mlir::Value | ||||
OpenACCRecipeBuilderBase::createBoundsLoop(mlir::Value subscriptedValue, | ||||
mlir::Value bound, | ||||
|
@@ -258,7 +364,11 @@ void OpenACCRecipeBuilderBase::createPrivateInitRecipe( | |||
cgf.emitAutoVarAlloca(*allocaDecl, builder.saveInsertionPoint()); | ||||
cgf.emitAutoVarInit(tempDeclEmission); | ||||
} else { | ||||
cgf.cgm.errorNYI(exprRange, "private-init with bounds"); | ||||
makeBoundsAlloca(block, exprRange, loc, "openacc.private.init", numBounds, | ||||
boundTypes); | ||||
|
||||
if (initExpr) | ||||
cgf.cgm.errorNYI(exprRange, "private-init with bounds initialization"); | ||||
} | ||||
|
||||
mlir::acc::YieldOp::create(builder, locEnd); | ||||
|
I am still not completely clear on the why here.
Ah, I basically need all of the types for one reason or another, both before and after each 'bounds' operation, depending on what we need. So I just store them all in `boundTypes`. For example, the 'top level' allocation happens at the least-bounded type. See 84 and 89 for cases where we need the first set or second set (befores vs afters). Consider A[N]: 1 bound operation, 2 types (type of A, and type of an element of A). Or A[N][M]: 2 bound operations, 3 types, etc.
The assert is a sanity check for me on that case.
Or is there something else I can clarify? I know I was a little verbose in my comments (I kept losing my way getting this right), so perhaps I wrote something confusing?
Thank you.
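
The "N bound operations, N + 1 types" relationship from the reply above can be illustrated at compile time with standard type traits. This is a sketch for constant arrays only. The helper `typeChainLength` is hypothetical, and the sketch does not model pointers or the order in which the patch stores the types in `boundTypes`.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: counts T itself plus one type per array dimension that
// can be peeled off, i.e. the number of "before and after" types for T.
template <typename T>
constexpr std::size_t typeChainLength() {
  if constexpr (std::is_array_v<T>)
    return 1 + typeChainLength<std::remove_extent_t<T>>();
  else
    return 1;
}

// A[N]: 1 bound operation, 2 types (int[5] and int).
static_assert(std::rank_v<int[5]> == 1);
static_assert(typeChainLength<int[5]>() == 2);

// A[N][M]: 2 bound operations, 3 types (int[5][7], int[7], and int).
static_assert(std::rank_v<int[5][7]> == 2);
static_assert(std::is_same_v<std::remove_extent_t<int[5][7]>, int[7]>);
static_assert(typeChainLength<int[5][7]>() == 3);
```

This is the same invariant the `assert(boundsRange.size() + 1 == boundTypes.size())` check in the diff enforces at run time.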