CUDA backend is added!
As of v1.0.0, VexCL provides two backends: OpenCL and CUDA. To choose one, the user has to define either the VEXCL_BACKEND_OPENCL or the VEXCL_BACKEND_CUDA macro. If neither is defined, the OpenCL backend is chosen by default. One also has to link to either the OpenCL library (OpenCL.dll for Windows users) or the CUDA driver library.
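The selection rule can be sketched with plain preprocessor logic. This is only an illustration of the default-to-OpenCL behaviour, not VexCL's actual header: SKETCH_BACKEND_NAME and selected_backend() are hypothetical names, and giving CUDA precedence when both macros are defined is an assumption, since the notes do not cover that case.

```cpp
#include <cassert>
#include <string>

// CUDA is picked only when VEXCL_BACKEND_CUDA is defined.
// Otherwise OpenCL is used, whether requested explicitly or by default.
#if defined(VEXCL_BACKEND_CUDA)
#  define SKETCH_BACKEND_NAME "CUDA"
#else
#  ifndef VEXCL_BACKEND_OPENCL
#    define VEXCL_BACKEND_OPENCL   // neither macro given: fall back to OpenCL
#  endif
#  define SKETCH_BACKEND_NAME "OpenCL"
#endif

// Hypothetical helper (not part of VexCL) reporting the compile-time choice.
inline std::string selected_backend() {
    return SKETCH_BACKEND_NAME;
}
```

Compiled without either macro, selected_backend() returns "OpenCL"; compiling with -DVEXCL_BACKEND_CUDA would switch it to "CUDA". In a real program the macro has to be defined before the VexCL headers are included, either on the compiler command line or with a #define at the top of the source file.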
For the CUDA backend to work, the CUDA Toolkit has to be installed, and the NVIDIA CUDA compiler driver nvcc has to be in the executable PATH and usable at runtime.
Benchmarks show that the CUDA backend is a few percent more efficient than the OpenCL backend, except for matrix-vector multiplication on multiple devices (there are some issues with asynchronous memory transfer with the CUDA driver API). Note that the first run of a program will take longer than usual, because nvcc is invoked once for each compute kernel used in the program. Subsequent runs use the offline kernel cache and complete faster.
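The first-run cost can be pictured as a compile-on-miss cache. The sketch below is not VexCL's implementation: KernelCache is a hypothetical type, an in-memory map stands in for the on-disk cache, and compile() merely tags the source where a real program would invoke nvcc.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical kernel cache keyed by kernel source text.
struct KernelCache {
    std::map<std::string, std::string> binaries;  // source -> compiled "binary"
    int compile_calls = 0;                        // number of expensive compiler runs

    // Stand-in for a compiler invocation (assumption: just tags the source).
    std::string compile(const std::string &source) {
        ++compile_calls;
        return "binary(" + source + ")";
    }

    // Return the cached binary, compiling only when the source is not cached yet.
    const std::string& get(const std::string &source) {
        auto it = binaries.find(source);
        if (it == binaries.end())
            it = binaries.emplace(source, compile(source)).first;
        return it->second;
    }
};
```

Requesting the same kernel twice triggers one compilation, and only a kernel that has not been seen before triggers another. This is why a program using several distinct kernels pays the compilation cost once per kernel on its first run.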
- vex::Filter::General: modifiable container for device filters.
- Vector views (reduction, permutation) are all working with vector expressions.
- vex::reshape() function for reshaping of multidimensional expressions.
- vex::cast() function for changing deduced type of an expression.
- vex::Filter::GLSharing filters for the OpenCL backend (thanks, @johneih!)
- VEXCL_SPLIT_MULTIEXPRESSIONS macro allows componentwise splitting of multiexpressions.
- Various bug fixes.
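
The vex::Filter::General entry in the list above describes a modifiable container for device filters. As a rough illustration of the idea, assuming a filter is simply a predicate over devices, the sketch below builds one step by step from a runtime option. Device, Filter, both(), select() and make_filter() are all hypothetical names for this sketch, not VexCL's API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical device description; real devices carry far more properties.
struct Device {
    std::string name;
    bool double_precision;
};

// A filter is a predicate over devices. Storing it in std::function makes it
// modifiable: any other predicate can be assigned to the same variable later.
using Filter = std::function<bool(const Device&)>;

// Combine two filters; a device has to pass both.
inline Filter both(Filter a, Filter b) {
    return [a, b](const Device &d) { return a(d) && b(d); };
}

// Return the devices accepted by the filter.
inline std::vector<Device> select(const std::vector<Device> &all, const Filter &f) {
    std::vector<Device> out;
    for (const Device &d : all)
        if (f(d)) out.push_back(d);
    return out;
}

// Build a filter incrementally from a runtime option.
inline Filter make_filter(bool need_double) {
    Filter f = [](const Device&) { return true; };  // start by accepting everything
    if (need_double)
        f = both(f, [](const Device &d) { return d.double_precision; });
    return f;
}
```

The benefit of such a container is that the final filter does not have to be written as a single expression: conditions can be appended depending on command-line options or other runtime state.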