Goal of compyte
Make a common GPU ndarray (matrix/tensor of n dimensions) that can be reused by all projects.
- Development/user mailing list: http://lists.tiker.net/listinfo/gpundarray
- Announce mailing list(low volume): http://lists.tiker.net/listinfo/gpundarray-announce
- Current development, which includes a real C back-end and support for OpenCL, takes place in this branch: https://github.com/abergeron/compyte/tree/reorg
- Currently there are at least 6 different GPU arrays in Python
- CudaNdarray(Theano), GPUArray(pycuda), CUDAMatrix(cudamat), GPUArray(pyopencl), Clyther, Copperhead, GPUmat(Matlab), ...
- There are even more if we include other languages: C/C++, scala gpu(optiml, opencl), ...
- They are incompatible
- None have the same properties and interface
- All of them are a subset of numpy.ndarray on the gpu!
Lack of Standard Creates Problems:
- Duplicates work
- Writing GPU code that is both correct and fast is harder and slower than writing CPU/Python code
- Harder to port/reuse code
- Harder to find/distribute code
- Divides development work
Pitfalls to Avoid
- Start alone
- We need different people/groups to "adopt" the new GpuNdArray
- Too simple - other projects won't adopt
- Too general - other projects will implement "light" versions... and not adopt
- Having an easy way to check conditions and convert, as numpy does, could alleviate this.
The preferred option is a general version with easy checks and conversions, so that a project can choose to support only a subset!
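As an illustration of what an "easy check" could look like, here is a minimal pure-Python sketch that decides whether a (shape, strides) pair describes a C-contiguous layout. The function names `c_contiguous_strides` and `is_c_contiguous` are hypothetical and are not part of compyte; numpy exposes the analogous information through `ndarray.flags` and offers `numpy.ascontiguousarray` for the conversion side.

```python
def c_contiguous_strides(shape, itemsize):
    """Byte strides that a C-contiguous (row-major) array of this shape would have."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def is_c_contiguous(shape, strides, itemsize):
    """Return True if (shape, strides) describes a C-contiguous layout.

    Hypothetical helper for illustration only, not compyte's API.
    """
    if 0 in shape:
        # An empty array has no elements to lay out, so any strides are acceptable.
        return True
    expected = c_contiguous_strides(shape, itemsize)
    # A dimension of length 1 is never stepped over, so its stride does not matter.
    return all(dim == 1 or s == e for dim, s, e in zip(shape, strides, expected))
```

A project that only handles contiguous data could run such a check on its inputs and request a contiguous copy when the check fails, instead of implementing every memory layout itself.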
- Make it VERY similar to numpy.ndarray
- Easier to attract other people from the Python community
- Have the base object in C to allow collaboration with more projects.
- We want people from C, C++, Ruby, R, ... to all use the same base Gpu ndarray.
- Be compatible with CUDA and OpenCL
Current behavior not wanted
- No CPU code generated from the python interface (for PyOpenCL and PyCUDA). Gpu code is OK.
All of the basic C code is done. We are currently working on elementwise functionality in anticipation of a PyOpenCL/PyCUDA integration.
Sketch of the file structure and the reasoning behind it
This section details the file structure and gives you a hint of what to expect if you intend to ship a project that integrates this code. It applies to the code in the reorg branch, which will become the mainline soon. It is located here: http://github.com/abergeron/compyte/tree/reorg
Some of these files are not in the repository yet, which means that this functionality is being worked on.
The main files are:
- Defines the base compyte_buffer object
- Also defines the structure for GpuArray and GpuKernel
- Implements the CUDA version of the compyte_buffer API
- Implements the OpenCL version of the compyte_buffer API
- Defines a Cython wrapper that exposes the GpuArray object and a couple of functions to mimic the interface of numpy.ndarray
- Support running arbitrary elementwise kernels on GpuArray of arbitrary memory layout (python-only).
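To make "arbitrary memory layout" concrete, the sketch below shows the index arithmetic an elementwise operation needs: a flat element index is turned into a storage offset using the shape and per-dimension strides. This is a CPU-side, pure-Python illustration under simplifying assumptions (strides counted in elements rather than bytes, a Python list standing in for device memory); the names `element_offset` and `elemwise_unary` are hypothetical and this is not the project's kernel generator.

```python
def element_offset(index, shape, strides, base_offset=0):
    """Map a flat C-order element index to a storage offset via per-dimension strides."""
    offset = base_offset
    # Peel off the position along each dimension, starting with the fastest-varying one.
    for dim, stride in zip(reversed(shape), reversed(strides)):
        index, pos = divmod(index, dim)
        offset += pos * stride
    return offset


def elemwise_unary(op, src, src_strides, dst, dst_strides, shape):
    """Apply op to every element of a strided source, writing into a strided destination."""
    total = 1
    for dim in shape:
        total *= dim
    for i in range(total):
        value = src[element_offset(i, shape, src_strides)]
        dst[element_offset(i, shape, dst_strides)] = op(value)


# A 2x3 C-order buffer viewed as its 3x2 transpose: shape (3, 2), strides (1, 3).
src = [0, 1, 2, 3, 4, 5]
dst = [None] * 6
elemwise_unary(lambda x: x * 10, src, (1, 3), dst, (2, 1), (3, 2))
```

Because the input and output each carry their own strides, the same loop handles transposed, sliced, or contiguous arrays without copying them first.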
These files serve as support for the functionality above:
- generated by ndarray/gen_types.py
- serve as a type table for operations that need to know some information about types involved
- some generally useful functions that don't really fit anywhere else.
- Builds the python module implemented in pygpu_ndarray.pyx along with all the supporting code
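The type table mentioned in this list can be pictured as a lookup from a type name to the facts an operation needs, such as the C type name and the element size. The sketch below is only an assumption about what such a table might contain; `TypeInfo`, `TYPE_TABLE`, and `buffer_nbytes` are hypothetical names, and the table actually generated by ndarray/gen_types.py may be organized differently.

```python
import struct
from collections import namedtuple

# Hypothetical record layout; the real generated table may store different fields.
TypeInfo = namedtuple("TypeInfo", ["c_name", "size"])

# name -> (C type name on common platforms, struct format code).
_SPECS = {
    "int8": ("signed char", "b"),
    "int16": ("short", "h"),
    "int32": ("int", "i"),
    "int64": ("long long", "q"),
    "float32": ("float", "f"),
    "float64": ("double", "d"),
}

# The "=" prefix selects struct's standard, platform-independent sizes.
TYPE_TABLE = {
    name: TypeInfo(c_name, struct.calcsize("=" + code))
    for name, (c_name, code) in _SPECS.items()
}


def buffer_nbytes(type_name, shape):
    """Number of bytes needed to store an array of the given type and shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * TYPE_TABLE[type_name].size
```

An allocation routine, for example, could call `buffer_nbytes("float32", (128, 64))` to size a buffer without hard-coding element sizes.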
These files serve for portability (mainly to support Windows):
Some tests for the python interface (that also test the underlying C code):
- ndarray/test_gpu_ndarray.py (test basic functionality: init, copy, indexing, ...)
- ndarray/test_gpu_elemwise.py (test elemwise functionality; CUDA only at the moment)
Some random stuff:
- We have the updateifcopy flag, as numpy does, but it is always False and the code expects it to be False.