update to latest alpaka 1.0.0-dev #242
Merged
psychocoderHPC merged 3 commits into alpaka-group:dev from psychocoderHPC:topic-updateAlpakaVersion on Aug 22, 2023
alpaka renamed `Pltf` to `Platform` and made the platform an object. Additionally, this commit fixes the examples, which were broken because the changes from alpaka-group#225 were not taken into account.
76c6bba28c Reduce example work sizes (#2084) 516e9f9b2e Add alpaka_RELOCATABLE_DEVICE_CODE option (#1467) 79c3113a98 fix CUDA CMake support bb74c9129e Disable recursive macro expansion warning for icpx a6f9e6e053 Reduce console output 93da137545 Simplify alpaka_add_{executable,library} b539d47d8d remove atomics from MemFence 23edf577e5 Move Gitlab clang jobs to Ubuntu 22.04 eeb7bdce9f Move ASan CI job to clang-16 44d1109168 Update README.md and temporarily disable SYCL runtime checks 78d436c228 Move m_extentWidthBytes outside of debug guards e7ebee94a4 Add oneAPI CI jobs 7aa8b043ca Ensure CallbackThread/ThreadPool propagate exceptions 575f64dfaf Test value categories for enqueued tasks 12980a9865 Add missing <cstdint> 7aeafa5269 Remove remainders of Accessor b23e3cf4e0 change formatting for clang-format 2edf839a23 change fixed size_t to auto ca69fc644b include feedback 73e46fcb3c change some const variables to constexpr effde28370 Remove ALPAKA_SYCL_BACKEND_ONEAPI 0222a7aecd apply reviewer comemnts e0bc8cbb62 event test missing checks 813c970bcb add new event tests e4ee1e0e21 fix host thread event implementation and evenet tests 6c442c71cb fix accessors for the SYCL backend 74c320e7c6 Add a CI run with UBSan ede19d7b9e add mdspan tests to the CI e793c4ef3a Add a test that a task is destroyed after execution 388483ce6b Modernize CMake 062a9feda4 Fix missing <cstdint> include ac7b41daf2 Mark mdspan includes as SYSTEM includes fcec7c2fc7 Fix compilation of MdSpan tests b6eb4b62ee Drop Accessor 727f55b71b Update special CUDA jobs 53e17b8aa9 Rewrite counterBasedRng example using mdspan d62dd59bfa Add math::copysign 558d2698cd add CUDA 12.2 9388d8f249 Fix compilation of bufferCopy example d8a41f26c2 ci job generator: print warning if parameter value is not supported by the alpaka-job-coverage library a0d731d43a Remove unused variable b524591014 Add CUDA/HIP headers e2a994ebae Forward declare AccGpuUniformCudaHipRt to avoid a dependency loop 49e90324aa Add the 
alpaka_DISABLE_VENDOR_RNG option 2b265c01fa Make the vendor-specific random number generators optional f32efc2664 Add missing include 2ed16fbf58 GitLab CI: job generator checks if container images exists 7ebf53fab7 Enable release builds for gcc-9 + CUDA 8e5ae6e749 remove alpaka_SYCL_ENABLE_IOSTREAM from cmake 67ef8a736f Fix a dangling pointer in the SYCL memory buffer deleter 16e32caad3 Remove the dependency on Intel MKL 4f787e1ca3 Rename FenceTest to fenceTest 99e5b04fad Refactor OpenMP2 collective queue b322395974 Add clang-16 test runner (#1971) fa0af94515 modified allocMappedBuf in the tests b046d0d0c0 add the platform as an argument to allocMappedBuf f117eb1a0c Add missing ALPAKA_UNIFORM_CUDA_HIP_RT_CHECK calls in debug mode (#2034) ee1b7c4b3a update SYCL README 6002d3ebfb Complete renaming Pltf to Platform eba6db5d8e Add one more digit to the logs ea0120f07e Implement math tests for ternary ops and fma 308be5dce9 Add the fused multiply-add functions 255c5d1fe6 documentation: QueueCpuOmp2Collective 459d32612b add missing ADL tests for math hyperbolic functions ab4eb3fee6 fix callback thread task lifetime 164edf14f6 Trivial clean up of some SYCL-related headers dfdca84d33 add math::log2 and math::log10 8cf861bd6a Rename Pltf to Platform 5251061e13 CI: enable Clang 15 as CUDA compiler for release builds b0fbddf9da fix CallBack thread data race b5d541b4d3 Update the main SYCL include file name 56cd5cdc78 Various fixes related to the SYCL back-end ac0143dbdd Support compile-time warp size in SYCL kernels cffed4c8dc First draft adding the warp size as a kernel trait d9fbf7bf36 Rewrite the SYCL memcpy and memset operations c1fe0763f1 Generalise the SYCL CpuSelector to non-Intel CPUs 24ca7fdfd7 Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations (part 4) 09e65a28c9 Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations (part 3) 76a13a774f Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations (part 2) 251482ede2 
Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations (part 1) 467384a5ca queue test: fix catch2 usages within threads 087009956a Fix the delagating constructor in KernelExecutionFixture fa7ce64499 Fix compilation errors in PltfGenericSycl 41e99568f1 Make alpaka platforms full objects 1eadba4f27 Fix typos in the comments in alpaka/vec/Vec (#2019) bfc35edcdb Remove conflicting entry from .clang-format (#2018) a35e49ee27 Update the Any test to work with a sub-group size of 4 6281b5f106 update alpaka-job-matrix-library to 1.3.5 007f1ee632 Update the separable compilation test 8490bb7a38 Mark mysqrt as ALPAKA_FN_EXTERN abb32ddca3 Remove the requirement that the native handle is an int eab49eefbe Update the bufSlicingTest 6abab49170 Update the type expected by View::operator[] 334e26586f Implement logical operations on alpaka::Vec 9cbf890ae4 Update .readthedocs.yml to version 2 2f989fe7ca Do not run tests on 0-dimensional accelerators 64f3f35091 Move NonZeroTestDims to TestDims.hpp ff04bf3d99 Silence clang-16 warnings 909613a05e Enable two phase lookup with MSVC 29f6ed2b83 Simplify ConcurrentExecPool and rename to ThreadPool 5cbd95bb27 fix host callback unit test 4d378772c5 Refactor TaskKernelCpuThreads 3923c0828a Refactor ndloop 49e6f2ce83 Use a nested namespace specifier and struct c41e56b38b Remove cleanup actions from CPU device da34256c83 Remove detaching logic from ConcurrentExecPool e0577b4087 Replace ConcurrentExecPool by CallbackThread in QueueGenericThreadsNonBlockingImpl 3158f18ee2 Add a benchmark for enqueue of a host func 375f4f0e16 Fix compilation with TSAN and serial backend 8fe1d1c8dc Use nested namespace specifier 37add41746 Avoid unnecessary copy aba05a272c Include missing header c1e0d3b6dc Demangle the kernel names in the integration tests 78e984d463 SYCL: update to the SYCL 2020 standard f92989616f SYCL: revert spurious changes dd9e30c67a Fix wrong CallbackThread termination 620ba96104 Fix a typo 5957371fd6 Make 
ALPAKA_UNIFORM_CUDA_HIP_RT_CHECK_IGNORE multiline safe 9ece8221a4 Improve debug output in QueueTest 625c7acb79 Allow CallbackThread to take any callable type 50cc03fb64 Update copyright information 6e521624c3 SYCL: change the default stream size to 8 KiB 0ce95ec66c SYCL: update to the SYCL 2020 standard d7ce88dbc7 Make unused arguments anonymous 12392a4d7d Fix CMake build instructions for SYCL back-end. 2caf919872 Disable CATCH_CONFIG_FAST_COMPILE 08e1037f79 Restrict atomic tests to the supported types edd1eb7883 Do not run tests on 0-dimensional accelerators cb470230d4 Add a meta type to select non-zero integral constants 8de720070e fix callback-thread tasks lifetime ef23ccd184 Add Xcode 14.3.1 test runner 3fe070aba0 Update SYCL CMake and remove Xilinx support d7d873e96c Try to fix amalgamation CI 2346940a13 Update CMake / Boost minor versions and copyright info (#1969) d6eb7146d4 Add gcc-13 test runner 3838fbcd16 Fix ill-formed spelling of ctor in C++20 20cd62e9f5 Fix amalgamation CI job 46cdf9bfe2 Refactoring b1b5fc9956 Add CI job to create amalgamated alpaka.hpp 2af4dcc210 Use quotes for including local alpaka headers 9a128adb5a CI: test ROCm 5.5 56c12da983 Update Catch2 version requirement 99e131d121 Update to Catch2 v3.3.2 59cb5ebca1 Update Catch2 version requirement 0c27664de8 Update to Catch2 v3.3.1 0b151d0378 Disable MSVC + CUDA jobs 786ce2c3b1 Add CUDA 12.1 support (#1957) 0b25fc9e7e GitLab CI: enable nvcc 12.x c++20 test 461e1017e4 GitLab CI: support alpaka-job-coverage 1.3.0 88860c99cc fixed alpine image for Gitlab CI job generator 89a411fea6 Remove OpenMP 5 back-end 1983489609 Remove OpenACC back-end 07a8458ed9 Update documentation for kernel arguments 11a6ac1342 Deactivate icpx omp5 job a9f5b59da0 Removed ext::oneapi namespace bd515b89d8 Removed experimental namespace from SYCL 6c3ab90687 CI: fix compile issue if cpuonly job is executed on a GPU runner (#1939) 90dc85db96 Avoid repeated writing to shared mem in BabelStream dot 10c7c4c77f Reserve the 
devices vector memory before filling it 193f2c4be3 CI: fix job generator af21e943d3 HIP <=5.3 avoid compiler error 8ea325d31e Restore macOS OpenMP jobs (#1922) 2346ca6a51 Consider CUDA 11.7 stream memory operations a2a9695e96 Add ROCm 5.4 support (#1915) 1d8772cc14 Update CI container to version 3.1 759d754577 Use SPDX License identifiers b849ce43db Replace BOOST_LANG_HIP with HIP_VERSION b5b2d00475 Fix compilation warnings ecc06294a0 Always inline with ALPAKA_FN_INLINE 162773e2a7 Remove old clang-cuda workaround 74ee8c8c11 Raise required CMake version to 3.22 5573fc351c Disable tests if used by add_subdirectory a68c866cc6 GitLab CI: split compile only and runtime test in two child pipelines 40140fa153 add CUDA to job generator a36139a041 fix Clang installation in CI 100a047e7e make GitLab CI jobs interruptible 9716c5673d Make SYCL runtime objects static 60ede546fe Update CMake and Boost point releases (#1903) 2e99977245 Implement trait constants 8ef9ccb3a1 fix agc-manager detection in the CI da31798980 Enable more C++20 jobs 9946b859af Manually install sanitizer libraries 7a75cb3eaf Update to Xcode 14.2 28338849bd Use hipMallocAsync/hipFreeAsync with HIP 5.2.0 and later 921a6bf8bb Use hipLaunchHostFunc with HIP 5.4.0 and later d44d2ab572 Add clang-15 to CI 4e6fecb3d7 CUDA CI update 6674ce6bab add HIP to the CI generator parameter e28a9f34c3 Make use of mdspan configurable in CI 242ea84aad Rewrite bufferCopy example using mdspan 0aadfedd73 Add customizable function getMdSpan/getTransposedMdSpan(View) e5107da242 Add mdspan to cmake 4363bbb912 Drop MSVC 2019 97ea65a63d remove support for HIP 4.x 24d8a3ee11 use agc-image for GitLab CI f79f08a235 add support for agc-manager e02b42624f fix icpx error implicit conversion 89c93d1075 Drop legacy compilers and CUDA versions 7582d6c123 Fix undefined constant with nvcc 12.0 76fb556517 Select serial accelerator for tests/examples (#1843) 1c18024e33 gitlab CI run more jobs as compile only 1c6ea20f46 Port babelstream from cupla 
to alpaka 291cff54cd Run clang-format 908ef12064 Add cupla version of BabelStream e31eed92ac remove boost 1.73 warning 4a7c9db41e Mark CATCH cmake variables as advanced a8460487c3 Add gcc-12 + OpenACC CI job 8434b9ea79 Update to Catch2 v3.2.1 7b77a28461 fix CUDA memory allocation mapped/async a2f8d778a7 Collapse compiler matrix (#1860) 4b50b39267 Add clang-14 to CI 3308d8bbbd Avoid use-after-free of m_cvWakeup ccb8683d7c Fix use after move in QueueGenericThreadsNonBlocking 2c2588989d Refactor QueueGenericThreadsNonBlocking a690c3d206 Refactor ConcurrentExecPool 5f499f8c2f Refactor ITaskPkg and TaskPkg 2bf0149dd4 Refactor ThreadSafeQueue 19bed293a4 Merge ConcurrentExecPool primary template with specialization 3c33af6542 Drop CUDA 9.2 85abc80984 Drop Boost.fiber back-end b3be00fc07 fix CUDA CI ef234bc98e Create a patch if clang-format CI fails (#1823) bde1dc6a6c Fix missing `final` keyword for acceleriator inheritance c95c9d0891 Test calling getValidWorkDiv with Idx type directly 8fa8648389 Refactor subDivideGridElems 4494a2c9b9 Fix createView for containers without a size argument d0d7c14253 Add a new example demonstrating parallel loop patterns fbecfb5e8e add math hyperbolic functions (#1828) db2457997c CI: add HIP/ROCm 5.3 6e7e50a1df Make BlockSyncTestKernel::gridThreadExtentPerDim constepr function 162c330cfa Update CI NVHPC versions to 22.3 bc3b863846 GetDevProps<AccOacc>: report m_multiprocessorCount = 1 c47bf10dd7 CI: change ROCm CI node (#1844) 058785a838 Drop alpaka/time df795ddc16 fix warning calling `__host__` from `__host__ __device__` function c9377d33fc CI: remove OMP2 backend tests for MACOSX 12e0b302ef Run clang-format eccab29627 Add some tests for subDivideGridElems b8ddf35c39 Run clang-format dcbf43aa9d Enable new formatting options 96c3920cff Update to clang-format-14 d3064f036f Upgrade to GH checkout action v3 1111fd083c delete copy, assign, move and move assign operator for accelerators ffb8307194 Update to Catch2 v3.1.1 b004375c3d Implement 
accelerator tags 08724b5f40 CI: test ROCm 5.2.3 (#1812) c73f8b7605 Apply suggestions from code review c235fd67f0 Add example counterBasedRng 4695951762 CudaVectorArrayWrapper: Add convertability to/from std::array 3fbfb08076 Add PhiloxStatelessVector f8ee7c9bf1 Add PhiloxStateless d98d7707a6 Make mangled CUDA kernel name as short as possible (#1795) 6feb271d80 Rename result and reference values e824302444 Add tests for elementwise_min and max functions 201f53f26d Add elementwise_min and max functions 3742e88648 Workaround nvcc 11.3 116a36712e Add deduction guide for Vec c1d6ace30c Upgrade clang/CUDA headercheck CI to clang 13 and CUDA 11.2 0e112bd0ff add job generator to gitlab ci bca3bfbf60 Remove the functions to pin/unpin an existing buffer 77b060355a Add a comment and a unit test that default engine type is trivially copyable 28cc847ce4 Make Philox random engines trivially copyable b518e8c943 Document alpaka::allocAsyncBuf f4c0b639d4 Document alpaka::allocMappedBuf 1f3babfec3 Improve error handling for memory de/allocation 2f8c6b0423 Move allocAsyncBufIfSupported to mem/buf/Traits.hpp 47e3278fb3 Update some tests to use allocMappedBufIfSupported 9d7de18e2e Add a trait for pinned/mapped memory allocation capability 31a847236d Change the interface of allocMappedBuf c4424f2a9d PhiloxBaseCommon,PhiloxConstants: constexpr workaround for GCC 540397c429 Use CUDART_VERSION instead of BOOST_LANG_CUDA 0d2cec0bc3 Add missing template parameters 30d205f46d Update copyright notice 6c990fe660 Apply code style and formtting a72874556e Use a nested namespace definition a3380a364b Query te OS for free pages instead of reading /proc/meminfo 5afdda9869 Move includes to the global namespace d8c6e5f94c Drop support for icc/icpc d92f22850e fix clang CUDA atomics f27d78c23c HIP: use emulated `atomicAdd(float*,float)` 31993fcbb0 HIP: workaround atomicMax and atomicMin 071417f50b HIP: usa atomic load within atomicCas emulation a96bc8c733 OpenACC: test only 32bit and 64bit atomics 
cab648ec0e disable OpenACC float comparisn warning 543310c0ab workaround for clang 9 with cuda 9.2 62db17a094 refactore atomic unit tests d1c34cde30 `alpaka::AtomicCas` add floating point support 3d76d95222 refactor HIP/CUDA atomic implementations 0b96515b1c HIP: use build-in `atomicAdd(double)` 8acfbe42d6 Add gcc-12 to CI f5118e82e2 Simplify clang installation 5a4691c826 Add ViewConst 4b6ead16de Fix misleading parameter name in DefaultQueue a310c437b0 Remove redudant check fbd1ac0c32 CI Update for ROCm e2958beb22 Remove ALPAKA_FN_HOST_ACC on defaulted functions a82be7374e CI Update for macOS 01a80e42bf OpenMP fixes for clang 13 6754e5bf7c OpenACC fixes for gcc12 dd3352be8f Set policy CMP0091 to NEW e76b69b16b Remove support for clang-5 afb49a0c47 Accumulate memcpy/memset static_asserts 815490192f Upgrade to Catch2 v3 7ff5fdd478 Allow temporary destination views in memset/memcpy 38c24f6c4e Refactor TaskCopyOacc 5da464ff40 Diagnose CallbackThread joining itself git-subtree-dir: alpaka git-subtree-split: 76c6bba28c7a94a58b420e91ba135705f59cde44
alpaka subtree version: https://github.com/alpaka-group/alpaka/tree/eba6db5d8efc3c2585470085e76ba3dcab510e49
Add compatibility fix to keep support for alpaka 0.9.0
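
For readers unfamiliar with the kind of change the commits above describe, the following is a minimal, self-contained sketch of a compile-time compatibility switch between a "platform as a type" interface and a "platform as an object" interface. It is not the code from this pull request. Everything in the `toy` namespace and the `TOY_PLATFORM_IS_OBJECT` macro are hypothetical stand-ins invented for illustration. A real shim would presumably key on version information provided by the alpaka headers and call the real alpaka functions instead.

```cpp
#include <cstddef>

// Toy stand-ins for illustration only. These are NOT the real alpaka types
// or functions; they merely mimic the two calling conventions.
namespace toy
{
    struct Dev
    {
        std::size_t idx;
    };

    // Newer convention: the platform is an object that is passed by value/reference.
    struct Platform
    {
    };

    inline Dev getDevByIdx(Platform const& /*platform*/, std::size_t idx)
    {
        return Dev{idx};
    }

    // Older convention: the platform is only a type, given as a template argument.
    struct Pltf
    {
    };

    template<typename TPltf>
    Dev getDevByIdx(std::size_t idx)
    {
        return Dev{idx};
    }
} // namespace toy

// Hypothetical switch (an assumption of this sketch): 1 selects the
// object-based convention, 0 selects the type-based one.
#ifndef TOY_PLATFORM_IS_OBJECT
#    define TOY_PLATFORM_IS_OBJECT 1
#endif

// Single entry point for user code; the version difference is hidden here.
inline toy::Dev getDevice(std::size_t idx)
{
#if TOY_PLATFORM_IS_OBJECT
    auto const platform = toy::Platform{};
    return toy::getDevByIdx(platform, idx);
#else
    return toy::getDevByIdx<toy::Pltf>(idx);
#endif
}
```

Calling code would then use `getDevice(0)` regardless of which convention is active, so only the wrapper needs to change when the older interface is eventually dropped.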