PyOpenCL wrappers for AMD clMathLibraries.clBLAS.
It currently supports only a small subset of the BLAS library (specifically the SWAP, SCAL, COPY, AXPY, GEMV, and GEMM families of functions), for real numbers (32-bit or 64-bit floats) and complex numbers (complex64 and complex128).
Install the OpenCL headers (on Debian or Ubuntu):
sudo apt-get install opencl-headers
Download the latest clBLAS release (2.4.0 at the time of writing) and unpack it somewhere (I suggest unpacking to
Inside, you will find an
include directory and a
lib64 on 64-bit machines).
Your machine will need to know where to find the libraries
when running the program.
On Linux, you can do this by putting a file in
that contains the full path to the
(the file name must end in
sudo ldconfig (you can do
sudo ldconfig -v | grep libclBLAS
to make sure that the library has been detected).
I am not sure how to add the libraries on other systems.
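As a rough cross-check from Python, you can ask the standard library whether the dynamic linker can locate the library. This is only a sketch: the helper name `clblas_is_visible` is made up for illustration, and it assumes the library is registered under the name `clBLAS` (as the `libclBLAS` grep above suggests).

```python
from ctypes.util import find_library


def clblas_is_visible(name="clBLAS"):
    """Return True if a shared library with this name can be located.

    `name` is given without the `lib` prefix or file extension.
    The default assumes the library is called libclBLAS.
    """
    # find_library returns a file name or path on success, None otherwise.
    return find_library(name) is not None


if __name__ == "__main__":
    print("clBLAS found" if clblas_is_visible() else "clBLAS not found")
```

If this reports that the library is not found, importing the package will most likely fail in the same way.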
Edit setup.py and change the include dirs in the extension
to point to your OpenCL include directory
and your clBLAS include directory (which you just installed), respectively.
Also change the library dir to point to
your clBLAS library directory (again, which you just installed).
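To make the three settings concrete, here is a minimal sketch of how they could be assembled. It is not the contents of this project's setup.py: the helper name `clblas_extension_kwargs`, the path arguments, and the fallback to a `lib` directory are all assumptions for illustration. The keyword names `include_dirs`, `library_dirs`, and `libraries` are the standard ones accepted by a setuptools `Extension`.

```python
import os


def clblas_extension_kwargs(opencl_include, clblas_root):
    """Build include/library settings for an extension that links to clBLAS.

    opencl_include: directory containing the OpenCL headers (you supply this).
    clblas_root: directory where clBLAS was unpacked (you supply this).
    """
    # Use lib64 when it exists; falling back to "lib" is an assumption.
    lib64 = os.path.join(clblas_root, "lib64")
    lib_dir = lib64 if os.path.isdir(lib64) else os.path.join(clblas_root, "lib")
    return {
        "include_dirs": [opencl_include, os.path.join(clblas_root, "include")],
        "library_dirs": [lib_dir],
        "libraries": ["clBLAS"],  # links against libclBLAS
    }


# Example with placeholder paths; substitute your own.
kwargs = clblas_extension_kwargs("/path/to/opencl/include", "/path/to/clBLAS")
print(kwargs)
```

The resulting dictionary can be passed as keyword arguments when constructing the extension, so the paths live in one place.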
Then, build the project:
python setup.py build_ext --inplace
It should compile without errors, and create
(as well as a corresponding
You can now install the project:
python setup.py install --user
or do a "developer" install:
python setup.py develop --user
With the latter, changes to this source directory take effect
the next time you import the package (making it easy to develop).
You do not need the
--user flag if installing to a virtualenv
(they're great; check them out!).
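If you are unsure whether a virtualenv is active, the interpreter can tell you. A small sketch (the function name `in_virtualenv` is made up here), relying on the fact that `sys.prefix` and `sys.base_prefix` differ inside a virtual environment:

```python
import sys


def in_virtualenv():
    """Return True when this interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix instead of base_prefix.
    if getattr(sys, "real_prefix", None) is not None:
        return True
    # In a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("drop --user" if in_virtualenv() else "use --user")
```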
The basic usage is to start up PyOpenCL as usual, create some PyOpenCL Arrays, and pass them to the BLAS functions.
    import numpy as np
    import pyopencl
    import pyopencl.array
    import pyopencl_blas

    pyopencl_blas.setup()  # initialize the library

    ctx = pyopencl.create_some_context()
    queue = pyopencl.CommandQueue(ctx)

    dtype = 'float32'  # also supports 'float64', 'complex64' and 'complex128'
    x = np.array([1, 2, 3, 4], dtype=dtype)
    y = np.array([4, 3, 2, 1], dtype=dtype)

    clx = pyopencl.array.to_device(queue, x)
    cly = pyopencl.array.to_device(queue, y)

    # call a BLAS function on the arrays
    pyopencl_blas.axpy(queue, clx, cly, alpha=0.8)
    print("Expected: %s" % (0.8 * x + y))
    print("Actual: %s" % (cly.get()))
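For reference, AXPY computes `alpha * x + y` and stores the result in `y`, which is why the example above reads the answer back from `cly`. The plain-Python sketch below reproduces the same arithmetic without a GPU; `axpy_reference` is a hypothetical helper, not part of this package, and it works in double precision, so the float32 result from the device may differ in the last digits.

```python
def axpy_reference(alpha, x, y):
    """Return alpha * x + y elementwise (the values AXPY leaves in y)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [alpha * xi + yi for xi, yi in zip(x, y)]


# Same inputs as the usage example above.
result = axpy_reference(0.8, [1, 2, 3, 4], [4, 3, 2, 1])
print(result)  # approximately [4.8, 4.6, 4.4, 4.2]
```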
See the examples folder for more examples.