Release v0.9.0
This new stable version significantly improves the performance and code quality of the generated kernel programs.
- Fixed invalid range checks in memory buffer implementations.
- Fixed invalid 32-bit offsets in memory buffer implementations.
- Fixed if-conversion transformation generating invalid programs in some cases (#232, #233).
- Fixed code-analysis issues that could cause invalid analysis results (#220).
- Added support for 64-bit length buffers and views (#196, #210, #215, #216).
  Note that this feature includes breaking changes that might affect existing code bases. Please refer to the upgrade guide for more information.
- Added new if-conversion transformation to improve performance (#183).
- Added support for 16-bit float (Half) types (#180, #208).
- Added initial support for fixed array buffers (#200).
- Added support for non-capturing lambda kernels (#79, #136).
- Added support for multidimensional ExchangeBuffers (#148).
- Extended ExchangeBuffers to support conversions to Span and Memory instances (#122).
- Fixed invalid lowering of arrays in divergent control flow (#201).
- Fixed invalid handling of prefixed IL instructions (#204, #211).
Special thanks to @MoFtZ, @Yey007 and @jgiannuzzi for contributing to this release.