A simple library-less CUDA implementation of the OneSweep sorting algorithm.
Updated Feb 26, 2024 - Cuda
CUDA implementations of Merge Sort and Bitonic Sort that sort large arrays in parallel on the GPU. Includes both CPU and GPU versions, along with a performance comparison between them.