Benchmark_SpTRSV_using_CSC

A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves (SpTRSV)

Introduction

This repository contains the source code for the Euro-Par '16 paper "A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves" by Weifeng Liu, Ang Li, Jonathan D. Hogg, Iain S. Duff, and Brian Vinter.

Update (14 Feb. 2020, cuda): A deadlock problem on CUDA 10 has been fixed.

Update (13 Feb. 2017, cuda): A caching problem on Tesla P100 has been fixed. Thanks to Hartwig Anzt for identifying the problem and Ang Li for fixing it!

Update (30 Nov. 2016): This algorithm has been improved to support both forward and backward substitution, and multiple right-hand sides. See https://github.com/bhSPARSE/Benchmark_SpTRSM_using_CSC for a newer version of this work.

Update (25 Feb. 2016, opencl): An OpenCL version has been added.
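
For orientation, SpTRSV solves L x = b for x, where L is a sparse lower-triangular matrix. The following is a minimal serial sketch of column-oriented forward substitution over CSC storage. It is not the synchronization-free GPU kernel from the paper. The struct CscLower, the function sptrsv_csc_serial, and the convention that each column stores its diagonal entry first are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Lower-triangular matrix in CSC (compressed sparse column) form.
// Assumption: within each column the diagonal entry is stored first.
struct CscLower {
    int n;                     // matrix dimension
    std::vector<int> col_ptr;  // size n + 1, start offset of each column
    std::vector<int> row_idx;  // size nnz, row index of each entry
    std::vector<double> val;   // size nnz, value of each entry
};

// Solve L * x = b by serial, column-oriented forward substitution.
std::vector<double> sptrsv_csc_serial(const CscLower& L,
                                      const std::vector<double>& b) {
    assert(L.col_ptr.size() == static_cast<std::size_t>(L.n) + 1);
    assert(b.size() == static_cast<std::size_t>(L.n));

    std::vector<double> x(L.n, 0.0);
    // left_sum[i] accumulates sum over j < i of L(i, j) * x[j].
    std::vector<double> left_sum(L.n, 0.0);

    for (int j = 0; j < L.n; ++j) {
        const int start = L.col_ptr[j];
        const int end = L.col_ptr[j + 1];
        assert(start < end && L.row_idx[start] == j);  // diagonal present

        // All contributions to row j come from earlier columns,
        // so x[j] can be finalized here.
        x[j] = (b[j] - left_sum[j]) / L.val[start];

        // Push x[j] into the partial sums of the rows below the diagonal.
        for (int p = start + 1; p < end; ++p) {
            left_sum[L.row_idx[p]] += L.val[p] * x[j];
        }
    }
    return x;
}
```

For the 3x3 matrix with rows (2, 0, 0), (1, 1, 0), (0, 3, 4) and b = (2, 3, 18), this sketch returns x = (1, 2, 3).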

nVidia GPU (CUDA) version

Execution

Set CUDA path in the Makefile,
Run make,
Run ./sptrsv example.mtx.
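
The .mtx argument above is a Matrix Market file. If no test matrix is at hand, a small one can be generated. The sketch below writes a 4x4 lower-triangular matrix in Matrix Market coordinate format. The function name write_example_mtx and the matrix values are hypothetical, and it is an assumption that a lower-triangular input with a nonzero diagonal is suitable for the benchmark.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Writes a small 4x4 lower-triangular matrix to `path` in Matrix Market
// coordinate format. Returns the number of entries written, or -1 on error.
int write_example_mtx(const std::string& path) {
    struct Entry { int row; int col; double val; };  // 1-based indices
    const Entry entries[] = {
        {1, 1, 1.0}, {2, 1, 0.5}, {4, 1, 0.25},  // column 1
        {2, 2, 2.0}, {3, 2, -1.0},               // column 2
        {3, 3, 4.0},                             // column 3
        {4, 4, 3.0},                             // column 4
    };
    const int n = 4;
    const int nnz = static_cast<int>(sizeof(entries) / sizeof(entries[0]));

    std::ofstream out(path);
    if (!out) return -1;

    // Header line, then "rows cols nnz", then one "row col value" per entry.
    out << "%%MatrixMarket matrix coordinate real general\n";
    out << n << ' ' << n << ' ' << nnz << '\n';
    for (const Entry& e : entries) {
        assert(e.row >= e.col);  // keep the matrix lower triangular
        out << e.row << ' ' << e.col << ' ' << e.val << '\n';
    }
    out.flush();
    return out ? nnz : -1;
}
```

Calling write_example_mtx("example.mtx") from a small driver program produces a file that can be passed to ./sptrsv as shown in the steps above.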

Tested environments

nVidia GeForce GTX Titan X GPU in a host with CUDA v7.5 and Ubuntu 15.10 64-bit Linux installed.
nVidia Tesla K40c GPU in a host with CUDA v7.5 and Enterprise Linux installed.
nVidia GeForce GT 650M GPU in a host with CUDA v7.5 and Mac OS X 10.9.2 installed.

Data type

The code supports both double precision and single precision SpTRSV. Use make VALUE_TYPE=double for double precision or make VALUE_TYPE=float for single precision.
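
How the Makefile forwards VALUE_TYPE to the compiler is not described here. One common pattern is a preprocessor define such as -DVALUE_TYPE=float. The sketch below illustrates that pattern only; the alias value_t and the helper value_bytes are hypothetical names, and the default of double is an assumption.

```cpp
#include <cstddef>

// Assumption: the build passes -DVALUE_TYPE=double or -DVALUE_TYPE=float.
// Fall back to double when nothing is passed.
#ifndef VALUE_TYPE
#define VALUE_TYPE double
#endif

typedef VALUE_TYPE value_t;  // hypothetical alias used by the rest of the code

// Bytes needed to store n matrix or vector values at the chosen precision.
inline std::size_t value_bytes(std::size_t n) {
    return n * sizeof(value_t);
}
```

With this pattern, switching precision requires only a rebuild with a different VALUE_TYPE, with no source changes.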

AMD GPU (OpenCL 2.0) version

Execution

Set OpenCL path in the Makefile,
Run make,
Run ./sptrsv example.mtx.

Tested environments (note that an OpenCL 2.0 device is required to run the code)

AMD Radeon Fury X GPU in a host with AMD APP SDK 2.9.1 and Ubuntu 15.04 64-bit Linux installed.
AMD Radeon 290X GPU in a host with AMD APP SDK 2.9.1 and Ubuntu 15.04 64-bit Linux installed.

Data type

The code supports both double precision and single precision SpTRSV. Use make VALUE_TYPE=double for double precision or make VALUE_TYPE=float for single precision.
