Rename mlir-clang to cgeist #221
Merged
chelini reviewed on Jun 17, 2022: tools/cgeist/Test/polybench/datamining/correlation/correlation.c
mmoadeli added a commit to InteonCo/Polygeist that referenced this pull request on Aug 1, 2022:
* LLVM global initialization (llvm#167) * Add template member test * Add LLVM global initialization * Fix sign/zero integer extension (llvm#162) * bump LLVM (llvm#166) * bump LLVM * fix * more fixes * fix * Fix build * Use actual llvm hash Co-authored-by: William S. Moses <gh@wsmoses.com> * Silence compound literals warning (NFC) * Variable Redeclaration Fixes (llvm#172) * Non ODR constexpr * Fix redeclarable decls * Support addassign binary operation for memrefs (llvm#173) * Support add assign for memrefs * Add test case for memref add assign * Fix constructor of LLVM global (llvm#175) * Visibility * Use ctors * Constructor global llvm * Constexpr (llvm#176) * add std initializer * Fix loop inc bound * Add type align (llvm#177) * LLVM Rebase and type size/alignment (llvm#178) * Add type align * Add align * Increase * Bump LLVM * Bump DL * format * Add canonicalizer * Fix SCF lowering * SCF lowering * Fix format * OpenMPOpt * Add OpenMPOpt * Fix tests * Allow building * Fix OpenMP upstream (llvm#180) * Fix crash when adding malloc/free functions to module (llvm#184) Happens when two threads try to do it at the same time * Transform MemcpyToSymbol call ops (llvm#183) * Remove bad assertion (llvm#186) * Handle noexcept (llvm#182) * Fix raising with index cast (llvm#185) * Fix raising with index cast * Fix affine raising * Create for with break handling * Canonicalize to for with with break * look through * Fix bug and bump llvm * Fix rebase * Fix build * Fix negative step * Fix raise bug * Fix format * Barrier hoist and similar * Bump LLVM * Bump and fix * Add excluded middle * Fix raising of integers * Mem2reg with affine if * Internal opts * Parallel interchange * Fix distribute bug * Fix format * Update affine raise * Fix for raising and distribute * Fix affine raising * Handle wrapping if and affine if with barrier * WIP min cut cache optimisation * Some fixes * mincut fixes * fixes * Recalculate values after the barrier * Recalculate vals * Fix * Working with 
mincut optimisation disabled * recalculation fix * Load cached values only once * Do not add barriers prior for's with only recomputable values preceeding * Lower polygeist cache load to memref load * Support fors and ifs of different types with preceeding recomputable ops Large scale refactoring to eliminate code duplication * Fix test * Support for nested if's with barriers Properly remap values to cloned recomputations in parallel regions * For and if interchange fixes * Support for interchanging while loops * Refactor recomputable insertion into a separate function * Fix rebase * Add test case * Recognise command line option "distribute.mincut" for cpuify * Correct handling of recomputed for boundaries/if conditions * Add test with if bound cache load * Add missing check for recomputables before while when interchanging * Remove unneeded code and comments * Handle some TODO's * Finalise rebase * Add unneeded alloca deletions * Fix test case * Fix rebase * Fix tests for alloca scope change * Fix test allocdist * Remove wrong comment * Fix Openmp opt * Fix barrier memory semantics * Fix mem2reg bug * Fix OpenMPOpt interchange and bump if (llvm#191) * Fix if interchange * Fix allocation region * Fix format * Fix test * Fix sizecheck * Fix affine for raising * Fix canonicalize * Fix memory issue * fix idx * Fix mem2reg switch * Add Try/catch (llvm#193) * Add Try * Fix placement new * Update clang-mlir.cc * Add an option for openmp opt (llvm#195) * Clang-tidy (NFC) run-clang-tidy -fix tools/mlir-clang/Lib/* -checks=-*,llvm* run-clang-tidy -fix lib/polygeist/* -checks=-*, llvm* run-clang-tidy -fix tools/mlir-clang/mlir-clang.cc -checks=-*,llvm* Checks: llvm-header-guard llvm-include-order llvm-namespace-comment llvm-prefer-isa-or-dyn-cast-in-conditionals llvm-prefer-register-over-unsigned llvm-qualified-auto llvm-twine-local * Parallel LICM and pointer math bugfixes (llvm#194) * Fix * Only allow constant rems * foo * Fix emit direct callee * Add fetch add * No 
condition for * Add test * Fix subclass abi * New derived handling * Fix Parallel LICM & inheritence (llvm#196) * Fix Parallel LICM condition * Fix ParallelLICM * Fix inheritence base lowering * Fix address merged offset of base * Add tests * Handle new virtual/pure only * Stabilize ptraddsubtest * Fix flag name * Override cudaGetLastError * Fix format * Prefix ABI (llvm#197) * Prefix ABI * Fix format Co-authored-by: William S. Moses <gh@wsmoses.com> * Fix llvm struct abi field lowering (llvm#198) * Fix lookup where no base is created in llvm abi * Add test * Bump LLVM * Fix API change * Fix union * Add test * Foreach and prefix malloc/free (llvm#199) * Don't prefix malloc/free * C++ Foreach * Cleanup Mem2Reg and GPU (llvm#204) * Upgrade mem2reg * Fix test * Correct barrier elimination * Fix affine raising * Fix affine raise * Fix affine raising * Lowering update * Simplify affine and whilelicm * Fix affine raise * Fix infinite loop in memset/cpy to load/store * Optionally disable loop unroll * Fix cmpi bug * Fix allocation location * Add memcpy behavior * Properly handle continue * Do not duplicate already cached values crossing a barrier * Mem2reg and licm with affine * Fix unused scf inductive var * Fix parallel affine raise * Remove print * Partially speed up mem2reg * Reduce max unroll size to 32 * Add rank reduction * Disable RankReduction, fix for InductiveVarRemoval * Work around affineexpr bug * Improve ordering of loop distribute * Improve if reg2mem * Make cacheload * Fix reg2mem for (2) * Skip recursive load/store * Now ignoring barriers * No unnecessary internal store forwarding * Attempt fix * Add inner serialize * Fix set operands * Fix infinite loop * Remove unnecessary yield * Fix inner serialize * Do not use cacheload when not mincut * Fix recompute threshold * Strengthen alias analysis * No infinite loop * Add single execution query * Fix bug * Fix compile bug * First pass fix tests * Bump LLVM * Cleanup polygeist opt tests * Fix tests * Fix 
format Co-authored-by: Ivan Radanov Ivanov <ivanov.i.aa@m.titech.ac.jp> * Distribute immediately after wrap (llvm#207) * Fix barriers betting removed when they are required for interchange * Fix distributing after wrap * Make dsitrib after for respect mincut setting * Ignore CacheLoadOp's memory effects when checking recomputability * Use FullyRecomputable function in while wrap and interchange * Make CacheLoad a no side effect op * Reuse code for recomputability check * clang-format * Remove NoSideEffects from CacheLoad, handle it explicitly * Fix collectEffect behaviour for CanonicalizeFor * Fix distributing around the wrong barrier after a wrap * Forgot to set singleExecution=true for recomputable check before ifs * Update tests * Comments * Do not replicate reads if possible in distribute around barriers * Fix some resultless operations such as yield not getting recomputed * Fix op recalculation when distributing around barrier * Improved check for recomputability in distribute * Update tests * clang format * Bump llvm - FuncOp namespace - include path for moveLoopInvariantCode * Global function code gen fix (llvm#208) * Call device stub from host code for global functions * Update tests * Don't reverse inputs * Add a test case for cuda global code gen * Add nocuda{inc,lib} options and fix cuda test * clang format * Add barebones cuda header for cuda tests * Handles basic syntax using ext_vector_type type. (llvm#179) * Handles basic syntax using ext_vector_type type. * Improvement is needed to support more complex syntax (such as ext.xyz) * Refactors implementation of VisitExtVectorElementExpr using CommonArrayLookup. * Adds test. 
* CUDA Stream support (llvm#213) * CUDA Stream support * Async lowering [WIP] * Fix lowering to moccuda * Convert to malloc/free * Fix non-async * Update LLVM * Fix build * Install polygeist tools (llvm#216) * Emit some std:: clang builtin functions (llvm#217) * Add support for some builtin std:: functions clang::Builtin::BImove clang::Builtin::BImove_if_noexcept clang::Builtin::BIforward clang::Builtin::BIas_const * Add test * Infer location of LLVM source tree (llvm#218) * Rename mlir-clang to cgeist (llvm#221) * Rename mlir-clang to cgeist * Remove todos * Update build.yml * update README (llvm#226) * Handle allocation when allocating struct with new (llvm#227) * Move tests in 'Verification' (NFC) (llvm#225) * [BUGFIX] correctly emitting calls with addressoff operator (llvm#224) [BUGFIX] correctly emitting calls with addressoff operator Formatting moved test to cgeist dir reduced lit test reduced lit test Co-authored-by: pietro.ghiglio <pietro.ghiglio@codeplay.com> * [BUGFIX] Handling elaborated types is VisitArrayLoopInit (llvm#230) authored-by: pietro.ghiglio <pietro.ghiglio@codeplay.com> * Erase op (llvm#231) * update test for LLVM 15 * update sycl dialect include * update merge conflict * fix Unexpectedly Passed errors * fix virt and cuda failing tests * Fixes a number of failures after merge. - Removes un-necessary // XFAIL: *s - Removes a verify in driver.cc, which is added by Codeplay earlier. * [FIX] Positional arguments needs to sorted by their order of construction. 
Merge breaked that order * [FIX] Remove expected failed on polybench's tests * [CLEAN] Remove debug utilities Co-authored-by: William Moses <gh@wsmoses.com> Co-authored-by: lorenzo chelini <l.chelini@icloud.com> Co-authored-by: Ivan <ivanov.i.aa@m.titech.ac.jp> Co-authored-by: Stephen Neuendorffer <stephen.neuendorffer@xilinx.com> Co-authored-by: PietroGhg <38155419+PietroGhg@users.noreply.github.com> Co-authored-by: pietro.ghiglio <pietro.ghiglio@codeplay.com> Co-authored-by: Jefferson Le Quellec <jefferson.lequellec@codeplay.com>
No description provided.