-
Notifications
You must be signed in to change notification settings - Fork 15.1k
[OpenMPIRBuilder] Add ThreadLimit and NumTeams clauses to teams construct #68364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This patch adds `num_teams` (upperbound) and `thread_limit` clauses to `OpenMPIRBuilder`.
@llvm/pr-subscribers-flang-openmp ChangesThis patch adds support for Added testcases for the same. Full diff: https://github.com/llvm/llvm-project/pull/68364.diff 3 Files Affected:
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
index 75da461cfd8d95e..95f30d4a0d5cd5d 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -1917,8 +1917,13 @@ class OpenMPIRBuilder {
///
/// \param Loc The location where the teams construct was encountered.
/// \param BodyGenCB Callback that will generate the region code.
+ /// \param NumTeamsUpper Upper bound on the number of teams.
+ /// \param ThreadLimit on the number of threads that may participate in a
+ /// contention group created by each team.
InsertPointTy createTeams(const LocationDescription &Loc,
- BodyGenCallbackTy BodyGenCB);
+ BodyGenCallbackTy BodyGenCB,
+ Value *NumTeamsUpper = nullptr,
+ Value *ThreadLimit = nullptr);
/// Generate conditional branch and relevant BasicBlocks through which private
/// threads copy the 'copyin' variables from Master copy to threadprivate
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 72e1af55fe63f60..b6d79e068f61e6e 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -5733,7 +5733,8 @@ OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicCompare(
OpenMPIRBuilder::InsertPointTy
OpenMPIRBuilder::createTeams(const LocationDescription &Loc,
- BodyGenCallbackTy BodyGenCB) {
+ BodyGenCallbackTy BodyGenCB, Value *NumTeamsUpper,
+ Value *ThreadLimit) {
if (!updateToLocation(Loc))
return InsertPointTy();
@@ -5771,6 +5772,17 @@ OpenMPIRBuilder::createTeams(const LocationDescription &Loc,
BasicBlock *AllocaBB =
splitBB(Builder, /*CreateBranch=*/true, "teams.alloca");
+ // Push num_teams
+ if (NumTeamsUpper || ThreadLimit) {
+ NumTeamsUpper =
+ NumTeamsUpper == nullptr ? Builder.getInt32(0) : NumTeamsUpper;
+ ThreadLimit = ThreadLimit == nullptr ? Builder.getInt32(0) : ThreadLimit;
+ Value *ThreadNum = getOrCreateThreadID(Ident);
+ Builder.CreateCall(
+ getOrCreateRuntimeFunctionPtr(OMPRTL___kmpc_push_num_teams),
+ {Ident, ThreadNum, NumTeamsUpper, ThreadLimit});
+ }
+
OutlineInfo OI;
OI.EntryBB = AllocaBB;
OI.ExitBB = ExitBB;
diff --git a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
index fd524f6067ee0ea..fb87389023910c2 100644
--- a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
+++ b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
@@ -4082,6 +4082,163 @@ TEST_F(OpenMPIRBuilderTest, CreateTeams) {
[](Instruction &inst) { return isa<ICmpInst>(&inst); }));
}
+TEST_F(OpenMPIRBuilderTest, CreateTeamsWithThreadLimit) {
+ using InsertPointTy = OpenMPIRBuilder::InsertPointTy;
+ OpenMPIRBuilder OMPBuilder(*M);
+ OMPBuilder.initialize();
+ F->setName("func");
+ IRBuilder<> &Builder = OMPBuilder.Builder;
+ Builder.SetInsertPoint(BB);
+
+ Function *FakeFunction =
+ Function::Create(FunctionType::get(Builder.getVoidTy(), false),
+ GlobalValue::ExternalLinkage, "fakeFunction", M.get());
+
+ auto BodyGenCB = [&](InsertPointTy AllocaIP, InsertPointTy CodeGenIP) {
+ Builder.restoreIP(CodeGenIP);
+ Builder.CreateCall(FakeFunction, {});
+ };
+
+ OpenMPIRBuilder::LocationDescription Loc({Builder.saveIP(), DL});
+ Builder.restoreIP(
+ OMPBuilder.createTeams(Builder, BodyGenCB, nullptr, F->arg_begin()));
+
+ Builder.CreateRetVoid();
+ OMPBuilder.finalize();
+
+ ASSERT_FALSE(verifyModule(*M));
+
+ Function *PushNumTeamsRTL =
+ OMPBuilder.getOrCreateRuntimeFunctionPtr(OMPRTL___kmpc_push_num_teams);
+ EXPECT_EQ(PushNumTeamsRTL->getNumUses(), 1U);
+
+ CallInst *PushNumTeamsCallInst =
+ findSingleCall(F, OMPRTL___kmpc_push_num_teams, OMPBuilder);
+ ASSERT_NE(PushNumTeamsCallInst, nullptr);
+
+ EXPECT_EQ(PushNumTeamsCallInst->getArgOperand(2), Builder.getInt32(0));
+ EXPECT_EQ(PushNumTeamsCallInst->getArgOperand(3), &*F->arg_begin());
+
+ // Verifying that the next instruction to execute is kmpc_fork_teams
+ BranchInst *BrInst =
+ dyn_cast<BranchInst>(PushNumTeamsCallInst->getNextNonDebugInstruction());
+ ASSERT_NE(BrInst, nullptr);
+ ASSERT_EQ(BrInst->getNumSuccessors(), 1U);
+ Instruction *NextInstruction =
+ BrInst->getSuccessor(0)->getFirstNonPHIOrDbgOrLifetime();
+ CallInst *ForkTeamsCI = dyn_cast_if_present<CallInst>(NextInstruction);
+ ASSERT_NE(ForkTeamsCI, nullptr);
+ EXPECT_EQ(ForkTeamsCI->getCalledFunction(),
+ OMPBuilder.getOrCreateRuntimeFunctionPtr(OMPRTL___kmpc_fork_teams));
+}
+
+TEST_F(OpenMPIRBuilderTest, CreateTeamsWithNumTeams) {
+ using InsertPointTy = OpenMPIRBuilder::InsertPointTy;
+ OpenMPIRBuilder OMPBuilder(*M);
+ OMPBuilder.initialize();
+ F->setName("func");
+ IRBuilder<> &Builder = OMPBuilder.Builder;
+ Builder.SetInsertPoint(BB);
+
+ Function *FakeFunction =
+ Function::Create(FunctionType::get(Builder.getVoidTy(), false),
+ GlobalValue::ExternalLinkage, "fakeFunction", M.get());
+
+ auto BodyGenCB = [&](InsertPointTy AllocaIP, InsertPointTy CodeGenIP) {
+ Builder.restoreIP(CodeGenIP);
+ Builder.CreateCall(FakeFunction, {});
+ };
+
+ OpenMPIRBuilder::LocationDescription Loc({Builder.saveIP(), DL});
+ Builder.restoreIP(OMPBuilder.createTeams(Builder, BodyGenCB, F->arg_begin()));
+
+ Builder.CreateRetVoid();
+ OMPBuilder.finalize();
+
+ ASSERT_FALSE(verifyModule(*M));
+
+ Function *PushNumTeamsRTL =
+ OMPBuilder.getOrCreateRuntimeFunctionPtr(OMPRTL___kmpc_push_num_teams);
+ EXPECT_EQ(PushNumTeamsRTL->getNumUses(), 1U);
+
+ CallInst *PushNumTeamsCallInst =
+ findSingleCall(F, OMPRTL___kmpc_push_num_teams, OMPBuilder);
+ ASSERT_NE(PushNumTeamsCallInst, nullptr);
+
+ EXPECT_EQ(PushNumTeamsCallInst->getArgOperand(2), &*F->arg_begin());
+ EXPECT_EQ(PushNumTeamsCallInst->getArgOperand(3), Builder.getInt32(0));
+
+ // Verifying that the next instruction to execute is kmpc_fork_teams
+ BranchInst *BrInst =
+ dyn_cast<BranchInst>(PushNumTeamsCallInst->getNextNonDebugInstruction());
+ ASSERT_NE(BrInst, nullptr);
+ ASSERT_EQ(BrInst->getNumSuccessors(), 1U);
+ Instruction *NextInstruction =
+ BrInst->getSuccessor(0)->getFirstNonPHIOrDbgOrLifetime();
+ CallInst *ForkTeamsCI = dyn_cast_if_present<CallInst>(NextInstruction);
+ ASSERT_NE(ForkTeamsCI, nullptr);
+ EXPECT_EQ(ForkTeamsCI->getCalledFunction(),
+ OMPBuilder.getOrCreateRuntimeFunctionPtr(OMPRTL___kmpc_fork_teams));
+}
+
+TEST_F(OpenMPIRBuilderTest, CreateTeamsWithNumTeamsAndThreadLimit) {
+ using InsertPointTy = OpenMPIRBuilder::InsertPointTy;
+ OpenMPIRBuilder OMPBuilder(*M);
+ OMPBuilder.initialize();
+ F->setName("func");
+ IRBuilder<> &Builder = OMPBuilder.Builder;
+ Builder.SetInsertPoint(BB);
+
+ BasicBlock *CodegenBB = splitBB(Builder, true);
+ Builder.SetInsertPoint(CodegenBB);
+
+ Value *NumTeamsUpper =
+ Builder.CreateAdd(F->arg_begin(), Builder.getInt32(10), "numTeamsUpper");
+ Value *ThreadLimit =
+ Builder.CreateAdd(F->arg_begin(), Builder.getInt32(20), "threadLimit");
+
+ Function *FakeFunction =
+ Function::Create(FunctionType::get(Builder.getVoidTy(), false),
+ GlobalValue::ExternalLinkage, "fakeFunction", M.get());
+
+ auto BodyGenCB = [&](InsertPointTy AllocaIP, InsertPointTy CodeGenIP) {
+ Builder.restoreIP(CodeGenIP);
+ Builder.CreateCall(FakeFunction, {});
+ };
+
+ OpenMPIRBuilder::LocationDescription Loc({Builder.saveIP(), DL});
+ Builder.restoreIP(
+ OMPBuilder.createTeams(Builder, BodyGenCB, NumTeamsUpper, ThreadLimit));
+
+ Builder.CreateRetVoid();
+ OMPBuilder.finalize();
+
+ ASSERT_FALSE(verifyModule(*M));
+
+ Function *PushNumTeamsRTL =
+ OMPBuilder.getOrCreateRuntimeFunctionPtr(OMPRTL___kmpc_push_num_teams);
+ EXPECT_EQ(PushNumTeamsRTL->getNumUses(), 1U);
+
+ CallInst *PushNumTeamsCallInst =
+ findSingleCall(F, OMPRTL___kmpc_push_num_teams, OMPBuilder);
+ ASSERT_NE(PushNumTeamsCallInst, nullptr);
+
+ EXPECT_EQ(PushNumTeamsCallInst->getArgOperand(2), NumTeamsUpper);
+ EXPECT_EQ(PushNumTeamsCallInst->getArgOperand(3), ThreadLimit);
+
+ // Verifying that the next instruction to execute is kmpc_fork_teams
+ BranchInst *BrInst =
+ dyn_cast<BranchInst>(PushNumTeamsCallInst->getNextNonDebugInstruction());
+ ASSERT_NE(BrInst, nullptr);
+ ASSERT_EQ(BrInst->getNumSuccessors(), 1U);
+ Instruction *NextInstruction =
+ BrInst->getSuccessor(0)->getFirstNonPHIOrDbgOrLifetime();
+ CallInst *ForkTeamsCI = dyn_cast_if_present<CallInst>(NextInstruction);
+ ASSERT_NE(ForkTeamsCI, nullptr);
+ EXPECT_EQ(ForkTeamsCI->getCalledFunction(),
+ OMPBuilder.getOrCreateRuntimeFunctionPtr(OMPRTL___kmpc_fork_teams));
+}
+
/// Returns the single instruction of InstTy type in BB that uses the value V.
/// If there is more than one such instruction, returns null.
template <typename InstTy>
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change for OMP 5.0 and before?
InsertPointTy createTeams(const LocationDescription &Loc, | ||
BodyGenCallbackTy BodyGenCB); | ||
BodyGenCallbackTy BodyGenCB, | ||
Value *NumTeamsUpper = nullptr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the LowerBound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lowerbound isn't handled for clang either - it is not even recognized as syntactically correct. I don't know what the generated code should look like, as @__kmpc_push_num_teams
does not accept a lowerbound on number of teams. So, I kept it unimplemented for future work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, there are two functions in the OpenMP runtime. It is OK to leave with a TODO, just write a status of the support in Clang and the OpenMP runtime.
KMP_EXPORT void __kmpc_push_num_teams(ident_t *loc, kmp_int32 global_tid,
kmp_int32 num_teams,
kmp_int32 num_threads);
/* Function for OpenMP 5.1 num_teams clause */
KMP_EXPORT void __kmpc_push_num_teams_51(ident_t *loc, kmp_int32 global_tid,
kmp_int32 num_teams_lb,
kmp_int32 num_teams_ub,
kmp_int32 num_threads);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I had not seen this. I think I will update the PR to handle this. Thank you for pointing this out! 👍
|
||
OpenMPIRBuilder::LocationDescription Loc({Builder.saveIP(), DL}); | ||
Builder.restoreIP( | ||
OMPBuilder.createTeams(Builder, BodyGenCB, nullptr, F->arg_begin())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment for what is nullptr
and why using F->arg_begin()
is fine. Same for the createTeams
invocation in the other tests.
No, it should be for all the standards since |
9f36870
to
e01c867
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is OK with me.
The only question is whether we should use the __kmpc_push_num_teams
in the case when the lowerbound
is not specified.
Please wait for a day to see whether there is any opinion on this topic.
This patch adds support for
thread_limit
and bounds onnum_teams
clause for the teams construct in OpenMP.Added testcases for the same.