Conjugate Gradient solver written in CUDA
C C++
Switch branches/tags
Nothing to show
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.
test Add test successful message, and docs on testing to README Feb 16, 2012
.gitignore Add gitignore Aug 10, 2011
CHANGES Initial commit Jun 6, 2011
README Initial commit Jun 6, 2011


README for Conjugate Gradient GPU Solver - Graham Markall, 30/06/2009

(Presentation referenced below is at )

The included solver is not exactly the same as the one whose performance results are reported in the accompanying presentation - this version of the solver has been altered to make the code more straightforward, at the expense of some efficiency (more kernel launches are required).

The file implements a jacobi-preconditioner and conjugate gradient solver using the Compressed Sparse Row matrix format. The CSR matrix is expected to contain FORTRAN indices in the row_ptr and col_idx - that is, the indices are expected to begin at 1, and not 0. The function csr_c_to_fortran function in test/sparse_conversions.h gives an example of how to convert to the 1-based indexing if you have 0-based indexing.

The solver should work on any G200-series GPU (280GTX, Tesla C1060 etc...).

To make use of the solver in your code use:


  call gpucg_solve(row_ptr, size(row_ptr), col_idx, size(col_idx), val, size(val), rhs, size(rhs), x)

This assumes that a CSR matrix is is stored in the arrays row_ptr, col_idx and val. The known right-hand side vector is stored in rhs, and x is a vector in which the solution to the system will be placed. row_ptr and col_idx are arrays of integers, val, rhs and x are arrays of doubles.

In C:

  gpucg_solve_(*row_ptr, &size_row_ptr, *col_idx, &size_col_idx, *val, &size_val, *rhs, &size_rhs, *x)

This assumes that *row_ptr, *col_idx and *val contain pointers to the arrays representing the CSR matrix. *rhs is a pointer to the array storing the known right-hand side vector. *x is a pointer to an area of memory where the solution may be stored. These are all of type double*. size_row_ptr etc. all contain the the length of the respective array, and are integers.

To compile the solver, use:

  nvcc -arch sm_XX -c -o gpu_solve.o

Then link this with your code in the usual way.


There is a test for the CG solver in the test directory. You may need to adjust the Makefile to suit your system. The test solves a small system and checks that the solution is sufficiently close to a reference solution. To run the tests:

cd test


If you have any issues using this code please contact me at and I will attempt to be of assistance.