Release v0.8.0-beta1 · m4rs-mt/ILGPU

Added support for on-the-fly specialization of kernels using dynamic partial evaluation.
Added support for dynamic shared memory (CPU & Cuda backends).
Added new KernelConfig structure to specify launch dimensions for explicitly grouped kernels.
Reworked explicitly grouped kernel launchers to use the new KernelConfig structure instead of GroupedIndex types.
Simplified static Grid and Group properties.
Added new Index1 structure to avoid name clashes with new System.Index structure.
Added additional tuple conversion methods to Index2 and Index3 types.
Added new EntryPointDescription structure to specify an entry point and its index type.
Added RuntimeKernelConfig structure to combine static and dynamic information about a particular kernel launch.
Removed all GroupedIndex types.
Extended PTXInstructions to support bool-based IOs in PTXBackend (#68).
Extended ExchangeBuffer to use new page-locked memory allocation (if available).
Extended CudaAPI to support page-locked host-memory allocation functions.
Reworked implementation of GetSubView in the context of generic and multidimensional array views (#19).
Fixed several issues in the scope of address-space inference.
Fixed critical code generation issues that could occur when replacing values.
Fixed invalid pointer types in the scope of AtomicCAS operations on AMD hardware (#67).
