NOTE: this fork is not actively maintained anymore.
As far as the functionality and tests in CUDA.jl go, this is a drop-in replacement for the official CUDA driver.
Short overview of changes compared to upstream:
- Fix compilation on a modern system (
- Support for Clang
- Add/extend/fix certain API calls (
- Small bugfixes
Not that this fork happened before gpuocelot was available on GitHub, so there might be some differences between both code bases.
- CUDA toolkit <= 5.0 (you only need the toolkit, not the actual driver)
- Appropriate GCC for the CUDA toolkit you picked
- LLVM 3.5
Make sure you do a recursive check-out, so the hydrazine submodule is checked out too.
cd $(CHECKOUT_DIR) CUDA_BIN_PATH=/opt/cuda-5.0/bin CUDA_LIB_PATH=/opt/cuda-5.0/lib CUDA_INC_PATH=/opt/cuda-5.0/include \ CC=clang CXX=clang++ python2 build.py \ --install -p $(PREFIX) -j$(JOBS)
Note: if your main
gcc binary is not compatible with your CUDA toolkit
version (for example, CUDA 5.0 requires gcc <= 4.6), you will need to edit
ocelot/scripts/build_environment.py and change the two
nvcc invocations to
-ccbin=gcc-4.6 (or something similar).
Note: due to restrictions of the build system, make sure you only use
absolute paths when passing information through environment variables, and
always build with
Note: LLVM needs to be at version 3.5. If your distribution provides some
other version, install and compile LLVM from source and point the build system
to that installation's
llvm-config binary by setting the
environment variable before invoking
Compile and install gpuocelot into a directory you can load libraries from.
Next, you either rename
libcuda.so, or you use something
which knows about gpuocelot (like CUDA.jl does). After that, you can use the
available symbols just as it were the official NVIDIA implementation.