-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[OpenMPIRBuilder] Added createTeams
#65767
Conversation
This patch adds a generator for the teams construct. The generated IR looks like the following: ``` current_fn() { ... call @__kmpc_fork_teams(ptr @Ident, i32 num_args, ptr @outlined_omp_teams, ...args) ... } outlined_omp_teams(ptr %global_tid, ptr %bound_tid, ...args) { ; teams body } ``` It does this by first generating the body in the current function. Then we outline the body in a temporary function. We then create the @outlined_omp_teams function and embed the temporary outlined function in this function. We then emit the call to runtime function.
…lvm#65762) This whitespaces breaking `Format` CI. ``` $ trap 'kill -- $$' INT TERM QUIT; clang/utils/ci/run-buildbot check-format + set -o pipefail + unset LANG + unset LC_ALL + unset LC_COLLATE ++ basename clang/utils/ci/run-buildbot + PROGNAME=run-buildbot + [[ 1 == 0 ]] + [[ 1 -gt 0 ]] + case ${1} in + BUILDER=check-format + shift + [[ 0 -gt 0 ]] ++ git rev-parse --show-toplevel + MONOREPO_ROOT=/var/lib/buildkite-agent/builds/linux-56-fdf668759-vtv88-1/llvm-project/clang-ci + BUILD_DIR=/var/lib/buildkite-agent/builds/linux-56-fdf668759-vtv88-1/llvm-project/clang-ci/build/check-format + INSTALL_DIR=/var/lib/buildkite-agent/builds/linux-56-fdf668759-vtv88-1/llvm-project/clang-ci/build/check-format/install + cmake --version cmake version 3.23.3 CMake suite maintained and supported by Kitware (kitware.com/cmake). + ninja --version 1.10.1 + case "${BUILDER}" in + grep -rnI '[[:blank:]]$' clang/lib clang/include clang/docs clang/include/clang/Basic/riscv_vector.td:580: let Name = op # eew # "_v", IRName = op, MaskedIRName = op # "_mask", clang/include/clang/Basic/riscv_vector.td:1786: let RequiredFeatures = ["ZvfhminOrZvfh"] in ``` Signed-off-by: yronglin <yronglin777@gmail.com>
This reverts the following commits: - 5d7b75e [NFC][memprof] Temporarly remove RTSanitizerCommonSymbolizerInternal - edb211c [NFC][memprof] Temporarly remove RTSanitizerCommonSymbolizerInternal - 4d14b4a [sanitizer] Add CMake flag to build with internal symbolizer They break macOS nodes because CMake can't evaluate generator expressions: Error evaluating generator expression: $<TARGET_OBJECTS:RTSanitizerCommonSymbolizerInternal.osx> $<TARGET_OBJECTS:RTSanitizerCommonSymbolizerInternal.ios> $<TARGET_OBJECTS:RTSanitizerCommonSymbolizerInternal.iossim>
llvm#65738) Due to a copy-paste error in 32b7c1f this was incorrectly mapped to `TosaProfileEnum::MainTraining` rather than `TosaProfileEnum::Undefined`.
…rent SDNode (llvm#65765) SITargetLowering::adjustWritemask calls SelectionDAG::UpdateNodeOperands to update an EXTRACT_SUBREG node in-place to refer to a new IMAGE_LOAD instruction, before we delete the old IMAGE_LOAD instruction. But in UpdateNodeOperands can do CSE on the fly and return a different EXTRACT_SUBREG node, so the original EXTRACT_SUBREG node would still exist and would refer to the old deleted IMAGE_LOAD instruction. This caused errors like: t31: v3i32,ch = <<Deleted Node!>> # D:1 This target-independent node should have been selected! UNREACHABLE executed at lib/CodeGen/SelectionDAG/InstrEmitter.cpp:1209! Fix it by detecting the CSE case and replacing all uses of the original EXTRACT_SUBREG node with the CSE'd one. Recommit with a fix for a use-after-free bug in the first version of this patch (llvm#65340) which was caught by asan.
This allows front ends to insert s_nops - this is most often when a delay less than s_sleep 1 is required.
…lvm#65671) Fix Function Multi Versioning crash reported in llvm#65669
The return value of the function is only used to get the debug location. Directly return the debug location, as this avoids an extra null check in the caller.
llvm::ArrayType::get( | ||
llvm::PointerType::getUnqual(M.getContext()), Names.size()), | ||
llvm::ArrayType::get(llvm::PointerType::getUnqual(M.getContext()), | ||
Names.size()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you clang-formated the entire file. Either only format your changes, or pre-commit a NFC that formats the existing file. This way it clutters the changes with unrelated ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry. I assumed the file would be formatted. I will update this soon.
BodyGenCallbackTy BodyGenCB) { | ||
if (!updateToLocation(Loc)) { | ||
return Loc.IP; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style: No braces
} | ||
FunctionCallee WrapperFuncVal = M.getOrInsertFunction( | ||
"outlined_omp_teams", | ||
FunctionType::get(Builder.getVoidTy(), WrapperArgTys, false)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this function generated? I assume it's internal, if so, we cannot use the name.
Builder.SetInsertPoint(ExitBB); | ||
|
||
return Builder.saveIP(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this different from the parallel handling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is similar to task handling. Parallel handling uses CodeExtractor to correct the arguments I think. I didn't quite understand that - CodeExtractor and OutlineInfo seem to do similar things.
@@ -2005,6 +2005,13 @@ class OpenMPIRBuilder { | |||
/// \param Loc The insert and source location description. | |||
void createTargetDeinit(const LocationDescription &Loc); | |||
|
|||
/// Generator for `#omp teams` | |||
/// | |||
/// \param Loc The location where the task construct was encountered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit : task construct -> teams construct
This commit contains refactorings around __dynamic_cast without changing its behavior. Some important changes include: - Refactor __dynamic_cast into various small helper functions; - Move dynamic_cast_stress.pass.cpp to libcxx/benchmarks and refactor it into a benchmark. The benchmark performance numbers are updated as well. Differential Revision: https://reviews.llvm.org/D138006
Like concepts checking, a trailing return type of a lambda in a dependent context may refer to captures in which case they may need to be rebuilt, so the map of local decl should include captures. This patch reveal a pre-existing issue. `this` is always recomputed by TreeTransform. `*this` (like all captures) only become `const` after the parameter list. However, if try to recompute the value of `this` (in a parameter) during template instantiation while determining the type of the call operator, we will determine it to be const (unless the lambda is mutable). There is no good way to know at that point that we are in a parameter or not, the easiest/best solution is to transform the type of this. Note that doing so break a handful of HLSL tests. So this is a prototype at this point. Fixes llvm#65067 Fixes llvm#63675 Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D159126
… /W4 Differential Revision: https://reviews.llvm.org/D96408
Note that these are non-canonical. At IR, we will generally canonicalize to the intptrty width form if knownbits allows.
The first test case added in the LIT test demonstrates the problem. Even though we did not consider the inner loop as a candidate for the transformation due to the array_coor with a slice, we decided to version the outer loop for the same function argument. During the cloning of the outer loop we dropped the slicing completely producing invalid code. I restructured the code so that we record all arg uses that cannot be transformed (regardless of the reason), and then fixup the usage information across the loop nests. I also noticed that we may generate redundant contiguity checks for the inner loops, so I fixed it since it was easy with the new way of keeping the usage data.
…#65707) This is a copy of the corresponding ArrayValueCopy analysis for non-overlapping array slices. It is required to achieve the same performance for Polyhedron/nf, though, additional changes are needed in the alias analysis for disambiguating host associated accesses.
…tes (llvm#65685) With `-fpseudo-probe-for-profiling`, the dwarf discriminator for a callsite will be overwritten to pseudo probe related information for that callsite. The probe information is encoded in a special format (i.e., with all lowest three digits be one) in order to be distinguished from regular dwarf discriminator. The special encoding format will be decoded to zero by the regular discriminator logic. This means all callsites would have a zero discriminator in both the sample profile and the compiler, for classic AutoFDO. This is inconvenient in that no decent classic AutoFDO can be generated from a pseudo probe build. I'm mitigating the issue by allowing callsite probe id to be used as the base dwarf discriminator for classic AutoFDO, since probe id is also unique and can be used to differentiate callsites on the same source line.
This patch adds a generator for the teams construct. The generated IR looks like the following: ``` current_fn() { ... call @__kmpc_fork_teams(ptr @Ident, i32 num_args, ptr @outlined_omp_teams, ...args) ... } outlined_omp_teams(ptr %global_tid, ptr %bound_tid, ...args) { ; teams body } ``` It does this by first generating the body in the current function. Then we outline the body in a temporary function. We then create the @outlined_omp_teams function and embed the temporary outlined function in this function. We then emit the call to runtime function.
…nto teams_ompbuilder
Apologies for the ping, unfortunately I have messed up rebasing my branch. I will close this pull request to avoid future notifications for others in this thread. :) |
This patch adds a generator for the teams construct. The generated IR looks like the following:
It does this by first generating the body in the current function. Then we outline the
body in a temporary function. We then create the @outlined_omp_teams function and embed
the temporary outlined function in this function. We then emit the call to runtime
function.