This is an updated implementation of GPU Sample Sort. The algorithm is the same as in the publication, but the code has been cleaned up and modified to work with more recent versions of CUDA and C++ compilers. Scripts for benchmarking the algorithm and for generating charts are included as well.
The code has been compiled and tested with CUDA 10 and MSVC 2017 15.9 on Windows 10, and with CUDA 9 and GCC 6.5 on Ubuntu 18.10.
- CUDA 9+
- A compatible C++ compiler (e.g. GCC, MSVC)
- CMake 3.12
- Python 3.6 for generating charts