update to latest alpaka 1.0.0-dev #242

Merged

psychocoderHPC
Member

@psychocoderHPC psychocoderHPC commented Jul 28, 2023

alpaka subtree version: https://github.com/alpaka-group/alpaka/tree/eba6db5d8efc3c2585470085e76ba3dcab510e49

GIT_AUTHOR_NAME="Third Party" GIT_AUTHOR_EMAIL="mallocMC@hzdr.de" git subtree pull --prefix alpaka git@github.com:alpaka-group/alpaka.git develop --squash

Add a compatibility fix to keep support for alpaka 0.9.0

  • The platform interface changed as well, so further updates are required.
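
A compatibility fix of this kind could follow a pattern like the one sketched below. This is only an illustration under stated assumptions, not the code from this pull request: the namespaces `legacy`, `modern`, and `compat`, the macro `DEMO_PLATFORM_IS_OBJECT`, and the helper `firstDeviceIndex` are all hypothetical stand-ins, and no real alpaka headers or signatures are used. The idea is that call sites always pass a platform object, while a thin wrapper decides whether the underlying library treats the platform as a pure type (older style) or as an object (newer style).

```cpp
#include <cstddef>

// Mock of a library where the platform is only a type (older style).
// Everything in `legacy` and `modern` is a hypothetical stand-in.
namespace legacy
{
    struct Dev
    {
        std::size_t idx;
    };

    struct Pltf
    {
        // Devices are looked up through a static function on the type.
        static constexpr Dev getDevByIdx(std::size_t idx)
        {
            return Dev{idx};
        }
    };
} // namespace legacy

// Mock of a library where the platform is a real object (newer style).
namespace modern
{
    struct Dev
    {
        std::size_t idx;
    };

    struct Platform
    {
        // Devices are looked up through a member function on an instance.
        constexpr Dev getDevByIdx(std::size_t idx) const
        {
            return Dev{idx};
        }
    };
} // namespace modern

// Hypothetical switch; a real project would derive this from the
// library's version macro instead of defining it by hand.
#ifndef DEMO_PLATFORM_IS_OBJECT
#    define DEMO_PLATFORM_IS_OBJECT 1
#endif

namespace compat
{
#if DEMO_PLATFORM_IS_OBJECT
    using Platform = modern::Platform;
    using Dev = modern::Dev;

    constexpr Dev getDevByIdx(Platform const& platform, std::size_t idx)
    {
        return platform.getDevByIdx(idx);
    }
#else
    using Platform = legacy::Pltf;
    using Dev = legacy::Dev;

    // The platform argument is accepted but ignored, so call sites look
    // the same for both library generations.
    constexpr Dev getDevByIdx(Platform const&, std::size_t idx)
    {
        return Platform::getDevByIdx(idx);
    }
#endif
} // namespace compat

// Call site written once against the wrapper.
constexpr std::size_t firstDeviceIndex()
{
    compat::Platform const platform{};
    return compat::getDevByIdx(platform, 0).idx;
}
```

With this layout, only the `compat` namespace needs to know which interface is in use; building with `-DDEMO_PLATFORM_IS_OBJECT=0` selects the older code path without touching any call site.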

@psychocoderHPC psychocoderHPC added this to the 2.6.0crp milestone Jul 28, 2023
@psychocoderHPC psychocoderHPC marked this pull request as draft July 28, 2023 12:49
@psychocoderHPC psychocoderHPC marked this pull request as ready for review July 28, 2023 13:58
@psychocoderHPC psychocoderHPC force-pushed the topic-updateAlpakaVersion branch 3 times, most recently from 195cf87 to e6e16da Compare August 22, 2023 08:34
psychocoderHPC and others added 3 commits August 22, 2023 10:36
alpaka renamed `Pltf` to `Platform` and made the platform an object.

- Additionally, this commit fixes the examples, which were broken because
  the changes from alpaka-group#225 were not taken into account.
76c6bba28c Reduce example work sizes (#2084)
516e9f9b2e Add alpaka_RELOCATABLE_DEVICE_CODE option (#1467)
79c3113a98 fix CUDA CMake support
bb74c9129e Disable recursive macro expansion warning for icpx
a6f9e6e053 Reduce console output
93da137545 Simplify alpaka_add_{executable,library}
b539d47d8d remove atomics from MemFence
23edf577e5 Move Gitlab clang jobs to Ubuntu 22.04
eeb7bdce9f Move ASan CI job to clang-16
44d1109168 Update README.md and temporarily disable SYCL runtime checks
78d436c228 Move m_extentWidthBytes outside of debug guards
e7ebee94a4 Add oneAPI CI jobs
7aa8b043ca Ensure CallbackThread/ThreadPool propagate exceptions
575f64dfaf Test value categories for enqueued tasks
12980a9865 Add missing <cstdint>
7aeafa5269 Remove remainders of Accessor
b23e3cf4e0 change formatting for clang-format
2edf839a23 change fixed size_t to auto
ca69fc644b include feedback
73e46fcb3c change some const variables to constexpr
effde28370 Remove ALPAKA_SYCL_BACKEND_ONEAPI
0222a7aecd apply reviewer comemnts
e0bc8cbb62 event test missing checks
813c970bcb add new event tests
e4ee1e0e21 fix host thread event implementation and evenet tests
6c442c71cb fix accessors for the SYCL backend
74c320e7c6 Add a CI run with UBSan
ede19d7b9e add mdspan tests to the CI
e793c4ef3a Add a test that a task is destroyed after execution
388483ce6b Modernize CMake
062a9feda4 Fix missing <cstdint> include
ac7b41daf2 Mark mdspan includes as SYSTEM includes
fcec7c2fc7 Fix compilation of MdSpan tests
b6eb4b62ee Drop Accessor
727f55b71b Update special CUDA jobs
53e17b8aa9 Rewrite counterBasedRng example using mdspan
d62dd59bfa Add math::copysign
558d2698cd add CUDA 12.2
9388d8f249 Fix compilation of bufferCopy example
d8a41f26c2 ci job generator: print warning if parameter value is not supported by the alpaka-job-coverage library
a0d731d43a Remove unused variable
b524591014 Add CUDA/HIP headers
e2a994ebae Forward declare AccGpuUniformCudaHipRt to avoid a dependency loop
49e90324aa Add the alpaka_DISABLE_VENDOR_RNG option
2b265c01fa Make the vendor-specific random number generators optional
f32efc2664 Add missing include
2ed16fbf58 GitLab CI: job generator checks if container images exists
7ebf53fab7 Enable release builds for gcc-9 + CUDA
8e5ae6e749 remove alpaka_SYCL_ENABLE_IOSTREAM from cmake
67ef8a736f Fix a dangling pointer in the SYCL memory buffer deleter
16e32caad3 Remove the dependency on Intel MKL
4f787e1ca3 Rename FenceTest to fenceTest
99e5b04fad Refactor OpenMP2 collective queue
b322395974 Add clang-16 test runner (#1971)
fa0af94515 modified allocMappedBuf in the tests
b046d0d0c0 add the platform as an argument to allocMappedBuf
f117eb1a0c Add missing ALPAKA_UNIFORM_CUDA_HIP_RT_CHECK calls in debug mode (#2034)
ee1b7c4b3a update SYCL README
6002d3ebfb Complete renaming Pltf to Platform
eba6db5d8e Add one more digit to the logs
ea0120f07e Implement math tests for ternary ops and fma
308be5dce9 Add the fused multiply-add functions
255c5d1fe6 documentation: QueueCpuOmp2Collective
459d32612b add missing ADL tests for math hyperbolic functions
ab4eb3fee6 fix callback thread task lifetime
164edf14f6 Trivial clean up of some SYCL-related headers
dfdca84d33 add math::log2 and math::log10
8cf861bd6a Rename Pltf to Platform
5251061e13 CI: enable Clang 15 as CUDA compiler for release builds
b0fbddf9da fix CallBack thread data race
b5d541b4d3 Update the main SYCL include file name
56cd5cdc78 Various fixes related to the SYCL back-end
ac0143dbdd Support compile-time warp size in SYCL kernels
cffed4c8dc First draft adding the warp size as a kernel trait
d9fbf7bf36 Rewrite the SYCL memcpy and memset operations
c1fe0763f1 Generalise the SYCL CpuSelector to non-Intel CPUs
24ca7fdfd7 Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations (part 4)
09e65a28c9 Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations (part 3)
76a13a774f Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations (part 2)
251482ede2 Rewrite the SYCL backend for the SYCL 2020 standard and USM allocations (part 1)
467384a5ca queue test: fix catch2 usages within threads
087009956a Fix the delagating constructor in KernelExecutionFixture
fa7ce64499 Fix compilation errors in PltfGenericSycl
41e99568f1 Make alpaka platforms full objects
1eadba4f27 Fix typos in the comments in alpaka/vec/Vec (#2019)
bfc35edcdb Remove conflicting entry from .clang-format (#2018)
a35e49ee27 Update the Any test to work with a sub-group size of 4
6281b5f106 update alpaka-job-matrix-library to 1.3.5
007f1ee632 Update the separable compilation test
8490bb7a38 Mark mysqrt as ALPAKA_FN_EXTERN
abb32ddca3 Remove the requirement that the native handle is an int
eab49eefbe Update the bufSlicingTest
6abab49170 Update the type expected by View::operator[]
334e26586f Implement logical operations on alpaka::Vec
9cbf890ae4 Update .readthedocs.yml to version 2
2f989fe7ca Do not run tests on 0-dimensional accelerators
64f3f35091 Move NonZeroTestDims to TestDims.hpp
ff04bf3d99 Silence clang-16 warnings
909613a05e Enable two phase lookup with MSVC
29f6ed2b83 Simplify ConcurrentExecPool and rename to ThreadPool
5cbd95bb27 fix host callback unit test
4d378772c5 Refactor TaskKernelCpuThreads
3923c0828a Refactor ndloop
49e6f2ce83 Use a nested namespace specifier and struct
c41e56b38b Remove cleanup actions from CPU device
da34256c83 Remove detaching logic from ConcurrentExecPool
e0577b4087 Replace ConcurrentExecPool by CallbackThread in QueueGenericThreadsNonBlockingImpl
3158f18ee2 Add a benchmark for enqueue of a host func
375f4f0e16 Fix compilation with TSAN and serial backend
8fe1d1c8dc Use nested namespace specifier
37add41746 Avoid unnecessary copy
aba05a272c Include missing header
c1e0d3b6dc Demangle the kernel names in the integration tests
78e984d463 SYCL: update to the SYCL 2020 standard
f92989616f SYCL: revert spurious changes
dd9e30c67a Fix wrong CallbackThread termination
620ba96104 Fix a typo
5957371fd6 Make ALPAKA_UNIFORM_CUDA_HIP_RT_CHECK_IGNORE multiline safe
9ece8221a4 Improve debug output in QueueTest
625c7acb79 Allow CallbackThread to take any callable type
50cc03fb64 Update copyright information
6e521624c3 SYCL: change the default stream size to 8 KiB
0ce95ec66c SYCL: update to the SYCL 2020 standard
d7ce88dbc7 Make unused arguments anonymous
12392a4d7d Fix CMake build instructions for SYCL back-end.
2caf919872 Disable CATCH_CONFIG_FAST_COMPILE
08e1037f79 Restrict atomic tests to the supported types
edd1eb7883 Do not run tests on 0-dimensional accelerators
cb470230d4 Add a meta type to select non-zero integral constants
8de720070e fix callback-thread tasks lifetime
ef23ccd184 Add Xcode 14.3.1 test runner
3fe070aba0 Update SYCL CMake and remove Xilinx support
d7d873e96c Try to fix amalgamation CI
2346940a13 Update CMake / Boost minor versions and copyright info (#1969)
d6eb7146d4 Add gcc-13 test runner
3838fbcd16 Fix ill-formed spelling of ctor in C++20
20cd62e9f5 Fix amalgamation CI job
46cdf9bfe2 Refactoring
b1b5fc9956 Add CI job to create amalgamated alpaka.hpp
2af4dcc210 Use quotes for including local alpaka headers
9a128adb5a CI: test ROCm 5.5
56c12da983 Update Catch2 version requirement
99e131d121 Update to Catch2 v3.3.2
59cb5ebca1 Update Catch2 version requirement
0c27664de8 Update to Catch2 v3.3.1
0b151d0378 Disable MSVC + CUDA jobs
786ce2c3b1 Add CUDA 12.1 support (#1957)
0b25fc9e7e GitLab CI: enable nvcc 12.x c++20 test
461e1017e4 GitLab CI: support alpaka-job-coverage 1.3.0
88860c99cc fixed alpine image for Gitlab CI job generator
89a411fea6 Remove OpenMP 5 back-end
1983489609 Remove OpenACC back-end
07a8458ed9 Update documentation for kernel arguments
11a6ac1342 Deactivate icpx omp5 job
a9f5b59da0 Removed ext::oneapi namespace
bd515b89d8 Removed experimental namespace from SYCL
6c3ab90687 CI: fix compile issue if cpuonly job is executed on a GPU runner (#1939)
90dc85db96 Avoid repeated writing to shared mem in BabelStream dot
10c7c4c77f Reserve the devices vector memory before filling it
193f2c4be3 CI: fix job generator
af21e943d3 HIP <=5.3 avoid compiler error
8ea325d31e Restore macOS OpenMP jobs (#1922)
2346ca6a51 Consider CUDA 11.7 stream memory operations
a2a9695e96 Add ROCm 5.4 support (#1915)
1d8772cc14 Update CI container to version 3.1
759d754577 Use SPDX License identifiers
b849ce43db Replace BOOST_LANG_HIP with HIP_VERSION
b5b2d00475 Fix compilation warnings
ecc06294a0 Always inline with ALPAKA_FN_INLINE
162773e2a7 Remove old clang-cuda workaround
74ee8c8c11 Raise required CMake version to 3.22
5573fc351c Disable tests if used by add_subdirectory
a68c866cc6 GitLab CI: split compile only and runtime test in two child pipelines
40140fa153 add CUDA to job generator
a36139a041 fix Clang installation in CI
100a047e7e make GitLab CI jobs interruptible
9716c5673d Make SYCL runtime objects static
60ede546fe Update CMake and Boost point releases (#1903)
2e99977245 Implement trait constants
8ef9ccb3a1 fix agc-manager detection in the CI
da31798980 Enable more C++20 jobs
9946b859af Manually install sanitizer libraries
7a75cb3eaf Update to Xcode 14.2
28338849bd Use hipMallocAsync/hipFreeAsync with HIP 5.2.0 and later
921a6bf8bb Use hipLaunchHostFunc with HIP 5.4.0 and later
d44d2ab572 Add clang-15 to CI
4e6fecb3d7 CUDA CI update
6674ce6bab add HIP to the CI generator parameter
e28a9f34c3 Make use of mdspan configurable in CI
242ea84aad Rewrite bufferCopy example using mdspan
0aadfedd73 Add customizable function getMdSpan/getTransposedMdSpan(View)
e5107da242 Add mdspan to cmake
4363bbb912 Drop MSVC 2019
97ea65a63d remove support for HIP 4.x
24d8a3ee11 use agc-image for GitLab CI
f79f08a235 add support for agc-manager
e02b42624f fix icpx error implicit conversion
89c93d1075 Drop legacy compilers and CUDA versions
7582d6c123 Fix undefined constant with nvcc 12.0
76fb556517 Select serial accelerator for tests/examples (#1843)
1c18024e33 gitlab CI run more jobs as compile only
1c6ea20f46 Port babelstream from cupla to alpaka
291cff54cd Run clang-format
908ef12064 Add cupla version of BabelStream
e31eed92ac remove boost 1.73 warning
4a7c9db41e Mark CATCH cmake variables as advanced
a8460487c3 Add gcc-12 + OpenACC CI job
8434b9ea79 Update to Catch2 v3.2.1
7b77a28461 fix CUDA memory allocation mapped/async
a2f8d778a7 Collapse compiler matrix (#1860)
4b50b39267 Add clang-14 to CI
3308d8bbbd Avoid use-after-free of m_cvWakeup
ccb8683d7c Fix use after move in QueueGenericThreadsNonBlocking
2c2588989d Refactor QueueGenericThreadsNonBlocking
a690c3d206 Refactor ConcurrentExecPool
5f499f8c2f Refactor ITaskPkg and TaskPkg
2bf0149dd4 Refactor ThreadSafeQueue
19bed293a4 Merge ConcurrentExecPool primary template with specialization
3c33af6542 Drop CUDA 9.2
85abc80984 Drop Boost.fiber back-end
b3be00fc07 fix CUDA CI
ef234bc98e  Create a patch if clang-format CI fails (#1823)
bde1dc6a6c Fix missing `final` keyword for acceleriator inheritance
c95c9d0891 Test calling getValidWorkDiv with Idx type directly
8fa8648389 Refactor subDivideGridElems
4494a2c9b9 Fix createView for containers without a size argument
d0d7c14253 Add a new example demonstrating parallel loop patterns
fbecfb5e8e add math hyperbolic functions (#1828)
db2457997c CI: add HIP/ROCm 5.3
6e7e50a1df Make BlockSyncTestKernel::gridThreadExtentPerDim constepr function
162c330cfa Update CI NVHPC versions to 22.3
bc3b863846 GetDevProps<AccOacc>: report m_multiprocessorCount = 1
c47bf10dd7 CI: change ROCm CI node (#1844)
058785a838 Drop alpaka/time
df795ddc16 fix warning calling `__host__` from `__host__ __device__` function
c9377d33fc CI: remove OMP2 backend tests for MACOSX
12e0b302ef Run clang-format
eccab29627 Add some tests for subDivideGridElems
b8ddf35c39 Run clang-format
dcbf43aa9d Enable new formatting options
96c3920cff Update to clang-format-14
d3064f036f Upgrade to GH checkout action v3
1111fd083c delete copy, assign, move and move assign operator for accelerators
ffb8307194 Update to Catch2 v3.1.1
b004375c3d Implement accelerator tags
08724b5f40 CI: test ROCm 5.2.3 (#1812)
c73f8b7605 Apply suggestions from code review
c235fd67f0 Add example counterBasedRng
4695951762 CudaVectorArrayWrapper: Add convertability to/from std::array
3fbfb08076 Add PhiloxStatelessVector
f8ee7c9bf1 Add PhiloxStateless
d98d7707a6 Make mangled CUDA kernel name as short as possible (#1795)
6feb271d80 Rename result and reference values
e824302444 Add tests for elementwise_min and max functions
201f53f26d Add elementwise_min and max functions
3742e88648 Workaround nvcc 11.3
116a36712e Add deduction guide for Vec
c1d6ace30c Upgrade clang/CUDA headercheck CI to clang 13 and CUDA 11.2
0e112bd0ff add job generator to gitlab ci
bca3bfbf60 Remove the functions to pin/unpin an existing buffer
77b060355a Add a comment and a unit test that default engine type is trivially copyable
28cc847ce4 Make Philox random engines trivially copyable
b518e8c943 Document alpaka::allocAsyncBuf
f4c0b639d4 Document alpaka::allocMappedBuf
1f3babfec3 Improve error handling for memory de/allocation
2f8c6b0423 Move allocAsyncBufIfSupported to mem/buf/Traits.hpp
47e3278fb3 Update some tests to use allocMappedBufIfSupported
9d7de18e2e Add a trait for pinned/mapped memory allocation capability
31a847236d Change the interface of allocMappedBuf
c4424f2a9d PhiloxBaseCommon,PhiloxConstants: constexpr workaround for GCC
540397c429 Use CUDART_VERSION instead of BOOST_LANG_CUDA
0d2cec0bc3 Add missing template parameters
30d205f46d Update copyright notice
6c990fe660 Apply code style and formtting
a72874556e Use a nested namespace definition
a3380a364b Query te OS for free pages instead of reading /proc/meminfo
5afdda9869 Move includes to the global namespace
d8c6e5f94c Drop support for icc/icpc
d92f22850e fix clang CUDA atomics
f27d78c23c HIP: use emulated `atomicAdd(float*,float)`
31993fcbb0 HIP: workaround atomicMax and atomicMin
071417f50b HIP: usa atomic load within atomicCas emulation
a96bc8c733 OpenACC: test only 32bit and 64bit atomics
cab648ec0e disable OpenACC float comparisn warning
543310c0ab workaround for clang 9 with cuda 9.2
62db17a094 refactore atomic unit tests
d1c34cde30 `alpaka::AtomicCas` add floating point support
3d76d95222 refactor HIP/CUDA atomic implementations
0b96515b1c HIP: use build-in `atomicAdd(double)`
8acfbe42d6 Add gcc-12 to CI
f5118e82e2 Simplify clang installation
5a4691c826 Add ViewConst
4b6ead16de Fix misleading parameter name in DefaultQueue
a310c437b0 Remove redudant check
fbd1ac0c32 CI Update for ROCm
e2958beb22 Remove ALPAKA_FN_HOST_ACC on defaulted functions
a82be7374e CI Update for macOS
01a80e42bf OpenMP fixes for clang 13
6754e5bf7c OpenACC fixes for gcc12
dd3352be8f Set policy CMP0091 to NEW
e76b69b16b Remove support for clang-5
afb49a0c47 Accumulate memcpy/memset static_asserts
815490192f Upgrade to Catch2 v3
7ff5fdd478 Allow temporary destination views in memset/memcpy
38c24f6c4e Refactor TaskCopyOacc
5da464ff40 Diagnose CallbackThread joining itself

git-subtree-dir: alpaka
git-subtree-split: 76c6bba28c7a94a58b420e91ba135705f59cde44
@psychocoderHPC psychocoderHPC merged commit c82b95f into alpaka-group:dev Aug 22, 2023
1 check passed
@psychocoderHPC psychocoderHPC deleted the topic-updateAlpakaVersion branch August 22, 2023 08:46