This release contains one major change and some minor improvements to
compilation and performance.
Please download the RAJA-v0.8.0.tar.gz file below. The other archives will not work because of the way RAJA uses git submodules.
Major changes include:
- Build system updated to use the latest version of BLT (or close to it). Depending on how you build RAJA, this may require changes to how information is passed to CMake. The relevant sections of the RAJA User Guide now describe how this is done.
Other notable changes include:
Features (These are not yet documented and should be considered experimental. There will be documentation and usage examples in the next RAJA release.):
- New thread, warp, and bitmask policies for CUDA.
- Added AtomicLocalArray type which returns data elements wrapped in an AtomicRef object.
Bug fixes:
- Fixed issue in RangeStrideSegment iteration.
- Fixed the 'align hint' macro to eliminate a compile warning when the XL compiler is used with nvcc.
- Fixed issues that arose when the CUDA architecture level (e.g., sm_*) was set too low, which generated compiler warnings or errors. Caveats for RAJA features (mostly atomic operations) available at different CUDA architecture levels have been added to the User Guide.
Performance improvements:
- Some performance improvements in RAJA::kernel usage with the CUDA back-end.
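
For readers unfamiliar with the segment type named in the RangeStrideSegment fix: a strided range segment visits begin, begin + stride, and so on, stopping before end. The sketch below is plain C++ written only to illustrate that iteration order. It is not RAJA code and does not reproduce the fix itself; the helper name `strided_indices` is hypothetical, and the handling of a zero stride is an assumption made for this sketch.

```cpp
#include <vector>

// Hypothetical helper (not part of RAJA): lists the indices visited by a
// strided half-open range that starts at `begin` and stops before `end`.
std::vector<long> strided_indices(long begin, long end, long stride)
{
  std::vector<long> out;
  if (stride > 0) {
    // Positive stride walks upward; `end` is excluded.
    for (long i = begin; i < end; i += stride) out.push_back(i);
  } else if (stride < 0) {
    // Negative stride walks downward; `end` is still excluded.
    for (long i = begin; i > end; i += stride) out.push_back(i);
  }
  // Assumption for this sketch: a zero stride yields an empty list.
  return out;
}
```

Calling `strided_indices(0, 10, 3)` lists the indices 0, 3, 6, 9, and `strided_indices(5, -1, -1)` counts down from 5 to 0.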