M-Gjerde/SYCLOneSweep

SYCL Implementation of Nvidia's OneSweep

Welcome to the SYCL implementation of Nvidia's OneSweep! This open-source project is a high-performance implementation of the OneSweep sorting algorithm, benchmarked on 1 billion 32-bit key/value pairs on an Nvidia RTX 4090 GPU.

Page: https://m-gjerde.github.io/SYCLOneSweep

Highlights

  • Blazing Fast Performance: Sorts 1 billion (2^30) 32-bit key/value pairs in 0.17 seconds (5.88 GKeys/s) on an Nvidia RTX 4090. This is slightly slower than the CUDA implementation, and further optimization is still possible.
  • Open Source: Completely free to use and modify under the MIT license.
  • SYCL Compatibility: Leverages SYCL for cross-platform compatibility and future-proofing.
  • Efficient Resource Usage: Optimized to make the most of GPU resources without requiring proprietary CUDA libraries.

Getting Started

Prerequisites

  • SYCL-compatible compiler (e.g., Intel DPC++, Codeplay ComputeCpp)
  • CMake (for building the project)

Installation

  1. Clone the repository:
    git clone https://github.com/M-Gjerde/SYCLOneSweep
    cd SYCLOneSweep
  2. Configure and build with CMake:
    mkdir build
    cd build
    cmake ..
    make
  3. Run the resulting binary:
    ./onesweep_sycl

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Special Thanks

Special thanks to the b0nes164/OneSweep repository, whose library-free CUDA implementation inspired and facilitated this project: https://github.com/b0nes164/OneSweep
