A nearly complete collection of prefix sum algorithms implemented in CUDA, D3D12, Unity and WGPU. Theoretically portable to all wave/warp/subgroup sizes.
unity
cuda
gpgpu
hlsl
d3d12
compute-shaders
chained-scan-with-decoupled-lookback
inclusive-prefix-sum
exclusive-prefix-sum
Updated Nov 1, 2024 - C++