CUDA-based NumPy

CUDArray is a CUDA-accelerated subset of the NumPy library. The goal of CUDArray is to combine the ease of development of NumPy with the computational power of Nvidia GPUs in a lightweight and extensible framework.

CUDArray currently imposes many limitations in order to span a manageable subset of the NumPy library. Nonetheless, it supports a neural network pipeline as demonstrated in the project deeppy.


  • Drop-in replacement for NumPy (limitations apply).
  • Fast array operations based on cuBLAS, cuRAND and cuDNN.
  • (somewhat) Simple C++/CUDA wrapper based on Cython.
  • Extends NumPy with specialized functions for neural networks.
  • CPU fall-back when CUDA is not available.
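
The drop-in claim suggests a simple pattern for scripts that should run whether or not cudarray is installed. The sketch below is a user-side illustration, not CUDArray's own CPU fall-back mechanism: `pick_backend` is a hypothetical helper that returns the first importable module name. It also assumes that `array`, `dot`, and `sum` are among the NumPy functions CUDArray covers; since only a subset of NumPy is supported, shared code has to stay within that subset.

```python
import importlib
import importlib.util


def pick_backend(candidates=("cudarray", "numpy")):
    """Return the name of the first importable module in candidates, or None."""
    for name in candidates:
        # find_spec locates a top-level module without importing it.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    backend = pick_backend()
    if backend is None:
        print("Neither cudarray nor numpy is installed.")
    else:
        xp = importlib.import_module(backend)
        # Assumption: array, dot, and sum are available in both back-ends.
        a = xp.array([[1.0, 2.0], [3.0, 4.0]])
        print(backend, xp.sum(xp.dot(a, a)))
```

Resolving the module once and binding it to a single name (`xp` here) keeps the rest of the script independent of which library was found.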


With CUDA back-end

First, you should consider specifying the following environment variables.

  • INSTALL_PREFIX (default: /usr/local). Path where to install libcudarray. For the Anaconda Python distribution this should be /path/to/anaconda.
  • CUDA_PREFIX (default: /usr/local/cuda). Path to the CUDA SDK organized in bin/, lib/, include/ folders.
  • CUDNN_ENABLED. Set CUDNN_ENABLED to 1 to include cuDNN operations in libcudarray.
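
Before building, it can help to check which values the variables above will take. The following is a minimal sketch, not part of CUDArray: `resolve_build_settings` and `missing_cuda_dirs` are hypothetical helpers that apply the documented defaults and check that CUDA_PREFIX contains the bin/, lib/, and include/ folders. Treating any CUDNN_ENABLED value other than 1 as disabled is an assumption.

```python
import os
from pathlib import Path

# Defaults as documented in the list above.
DEFAULTS = {
    "INSTALL_PREFIX": "/usr/local",
    "CUDA_PREFIX": "/usr/local/cuda",
}


def resolve_build_settings(env=None):
    """Return the build settings in effect for the given environment mapping."""
    env = os.environ if env is None else env
    return {
        "INSTALL_PREFIX": env.get("INSTALL_PREFIX", DEFAULTS["INSTALL_PREFIX"]),
        "CUDA_PREFIX": env.get("CUDA_PREFIX", DEFAULTS["CUDA_PREFIX"]),
        # Assumption: only the value "1" enables cuDNN; anything else is off.
        "CUDNN_ENABLED": env.get("CUDNN_ENABLED") == "1",
    }


def missing_cuda_dirs(cuda_prefix):
    """List the expected sub-folders that are absent from cuda_prefix."""
    root = Path(cuda_prefix)
    return [name for name in ("bin", "lib", "include") if not (root / name).is_dir()]


if __name__ == "__main__":
    settings = resolve_build_settings()
    for key, value in settings.items():
        print(f"{key} = {value}")
    missing = missing_cuda_dirs(settings["CUDA_PREFIX"])
    if missing:
        print("CUDA_PREFIX is missing:", ", ".join(missing))
```

Running the script prints the effective settings and flags any missing CUDA folder, so that a wrong prefix is caught before the build starts.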

Then build and install libcudarray with

make install

Finally, install the cudarray Python package:

python setup.py install

Without CUDA back-end

Install the cudarray Python package:

python setup.py --without-cuda install


Please consult the technical report for now. Proper documentation is on the TODO list.


Feel free to open an issue for feature requests and bug reports.

For a more informal chat, visit #cudarray on the freenode IRC network.


If you use CUDArray for research, please cite the technical report:

  author = "Larsen, Anders Boesen Lindbo",
  title = "{CUDArray}: {CUDA}-based {NumPy}",
  institution = "Department of Applied Mathematics and Computer Science, Technical University of Denmark",
  year = "2014",
  number = "DTU Compute 2014-21",


  • Proper transpose support.
  • Add functionality for copying from NumPy array to existing CUDArray array.
  • FFT module based on cuFFT.
  • Unit tests!
  • Add documentation to wiki.
  • Windows/OS X support.


Thanks to the following projects for inspiration.