
Release v0.8.0-beta1

Pre-release
m4rs-mt released this 03 Jan 02:52
· 1865 commits to master since this release
  • Added support for on-the-fly specialization of kernels using dynamic partial evaluation.
  • Added support for dynamic shared memory (CPU & Cuda backends).
  • Added new KernelConfig structure to specify launch dimensions for explicitly grouped kernels.
  • Reworked explicitly grouped kernel launchers to use the new KernelConfig structure instead of GroupedIndex types.
  • Simplified static Grid and Group properties.
  • Added new Index1 structure to avoid name clashes with new System.Index structure.
  • Added additional tuple conversion methods to Index2 and Index3 types.
  • Added new EntryPointDescription structure to specify an entry point and its index type.
  • Added RuntimeKernelConfig structure to combine static and dynamic information about a particular kernel launch.
  • Removed all GroupedIndex types.
  • Extended PTXInstructions to support bool-based IOs in PTXBackend (#68).
  • Extended ExchangeBuffer to use new page-locked memory allocation (if available).
  • Extended CudaAPI to support page-locked host-memory allocation functions.
  • Reworked implementation of GetSubView in the context of generic and multidimensional array views (#19).
  • Fixed several issues in the scope of address-space inference.
  • Fixed critical code generation issues that could occur when replacing values.
  • Fixed invalid pointer types in the scope of AtomicCAS operations on AMD hardware (#67).
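
For readers new to explicitly grouped kernels, the following is a minimal conceptual sketch of what a launch configuration made of a grid dimension and a group dimension describes. It is a plain Python model and does not use the ILGPU API: `LaunchConfig`, `global_indices`, and `launch` are hypothetical names chosen for illustration, and the threads run sequentially here rather than in parallel. It only shows the usual index arithmetic, in which a thread's global index is its grid index times the group size plus its index within the group.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Tuple


@dataclass(frozen=True)
class LaunchConfig:
    """Hypothetical stand-in for a launch configuration (not the ILGPU type)."""
    grid_dim: int   # number of groups in the grid
    group_dim: int  # number of threads in each group

    @property
    def size(self) -> int:
        """Total number of threads described by this configuration."""
        return self.grid_dim * self.group_dim


def global_indices(config: LaunchConfig) -> Iterator[Tuple[int, int, int]]:
    """Yield (grid_idx, group_idx, global_idx) for every thread of the launch."""
    for grid_idx in range(config.grid_dim):
        for group_idx in range(config.group_dim):
            yield grid_idx, group_idx, grid_idx * config.group_dim + group_idx


def launch(config: LaunchConfig,
           kernel: Callable[[int, List[int]], None],
           data: List[int]) -> None:
    """Run `kernel` once per thread, sequentially, skipping out-of-range threads."""
    for _, _, global_idx in global_indices(config):
        if global_idx < len(data):  # bounds guard for the padded last group
            kernel(global_idx, data)


def double_index(i: int, d: List[int]) -> None:
    """Example kernel body: store twice the global index."""
    d[i] = 2 * i


data = [0] * 10
# 3 groups of 4 threads cover 12 indices; the last 2 fall outside `data`.
launch(LaunchConfig(grid_dim=3, group_dim=4), double_index, data)
```

In this model the kernel receives only its data and a computed index, and the dimensions travel in a separate configuration value. That is the general shape of the change from GroupedIndex parameters to a KernelConfig described in the list above.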