
Build NVIDIA® CUDA™ code for OpenCL™ 1.2 devices


iamqk/coriander

 
 


Coriander

Build applications written in NVIDIA® CUDA™ code for OpenCL™ 1.2 devices.

Concept

  • leave applications in NVIDIA® CUDA™
  • compile into OpenCL 1.2
  • run on any OpenCL 1.2 GPU

How to use

  • Write an NVIDIA® CUDA™ source code file, or find an existing one
  • Let's use cuda_sample.cu
  • Compile, using cocl:
$ cocl cuda_sample.cu
   ...
   ... (bunch of compily stuff) ...
   ...

    ./cuda_sample.cu compiled into ./cuda_sample

Run:

$ ./cuda_sample
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics 5500 BroadWell U-Processor GT2
hostFloats[2] 123
hostFloats[2] 222
hostFloats[2] 444

Advanced usage

What Coriander provides

  • compiler for host-side code, including memory allocation, copy, streams, kernel launches
  • compiler for device-side code, handling templated C++ code, converting it into bog-standard OpenCL 1.2 code
  • cuBLAS API implementations for GEMM, GEMV, SCAL, SAXPY (using Cedric Nugteren's CLBlast)
  • cuDNN API implementations for: convolutions (using the im2col algorithm over Cedric Nugteren's CLBlast), pooling, ReLU, tanh, and sigmoid
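
To make the im2col approach mentioned in the list above more concrete, here is a minimal CPU sketch in plain C++. It is not Coriander's implementation: the function names `im2col`, `gemm`, and `conv2d` are hypothetical, the naive `gemm` only stands in for the matrix multiply that a BLAS library such as CLBlast would perform, and the sketch assumes a single channel, stride 1, no padding, and no kernel flip.

```cpp
#include <cstddef>
#include <vector>

// Unroll a single-channel H x W image (row-major) into a matrix with one row
// per filter element and one column per output position.
// Result is row-major with shape (K*K) x (outH*outW).
std::vector<float> im2col(const std::vector<float>& image, int H, int W, int K) {
    const int outH = H - K + 1;
    const int outW = W - K + 1;
    std::vector<float> cols(static_cast<std::size_t>(K) * K * outH * outW);
    for (int ky = 0; ky < K; ++ky) {
        for (int kx = 0; kx < K; ++kx) {
            const int row = ky * K + kx;
            for (int oy = 0; oy < outH; ++oy) {
                for (int ox = 0; ox < outW; ++ox) {
                    const int col = oy * outW + ox;
                    cols[row * outH * outW + col] = image[(oy + ky) * W + (ox + kx)];
                }
            }
        }
    }
    return cols;
}

// Naive row-major matrix multiply: C[M x N] = A[M x Kdim] * B[Kdim x N].
// Stand-in for the GEMM call that a BLAS library would provide.
std::vector<float> gemm(const std::vector<float>& A, const std::vector<float>& B,
                        int M, int Kdim, int N) {
    std::vector<float> C(static_cast<std::size_t>(M) * N, 0.0f);
    for (int m = 0; m < M; ++m) {
        for (int k = 0; k < Kdim; ++k) {
            for (int n = 0; n < N; ++n) {
                C[m * N + n] += A[m * Kdim + k] * B[k * N + n];
            }
        }
    }
    return C;
}

// Convolution of one K x K filter over the image, expressed as
// (1 x K*K filter row) * (K*K x outH*outW unrolled image).
std::vector<float> conv2d(const std::vector<float>& image, int H, int W,
                          const std::vector<float>& filter, int K) {
    const int outH = H - K + 1;
    const int outW = W - K + 1;
    return gemm(filter, im2col(image, H, W, K), 1, K * K, outH * outW);
}
```

For example, calling `conv2d` on a 3x3 image holding the values 1 to 9 with the 2x2 filter `{1, 0, 0, 1}` yields a 2x2 output in which each element is the sum of an image value and its lower-right diagonal neighbour. The appeal of this formulation is that the convolution reduces to a single large matrix multiply, which is the operation GEMM libraries are tuned for.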

How Coriander works

Kernel compilation proceeds in two steps:

Slides are available on the IWOCL website.

Installation

Coriander development is carried out using the following platforms:

  • Ubuntu 16.04, with:
    • NVIDIA K80 GPU
  • Mac Sierra, with:
    • Intel HD Graphics 530
    • Radeon Pro 450

Other systems should also work. At a minimum, you need one OpenCL-enabled GPU with appropriate OpenCL drivers installed. Both Linux and Mac systems stand a reasonable chance of working.

For installation, please see installation.md

Testing

See testing.md

Simplifications made by Coriander

Coriander makes the following relaxations/simplifications:

  • integers are generally assumed to be no wider than 32 bits, and are mostly truncated to 32 bits
  • floating-point values are assumed to be single precision: doubles in the original kernels are converted to floats in the OpenCL code
  • buffer offsets are generally taken to be 32-bit integers (int32) for now. This might change in the future
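
The practical effect of these simplifications can be sketched on the CPU in plain C++. The helpers below are hypothetical illustrations and are not part of Coriander: they mimic narrowing a double to a float, keeping only the low 32 bits of a wider integer, and checking whether a buffer offset fits in an int32.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Mimics a double in the original kernel being handled as a single-precision float.
inline float narrow_to_single(double value) {
    return static_cast<float>(value);
}

// Mimics a 64-bit integer being truncated to 32 bits: only the low 32 bits survive.
inline std::int32_t truncate_to_int32(std::int64_t value) {
    const std::uint32_t low = static_cast<std::uint32_t>(value);  // reduction modulo 2^32
    std::int32_t result;
    std::memcpy(&result, &low, sizeof(result));  // reinterpret the bits as signed
    return result;
}

// Reports whether an offset can be represented as a non-negative int32.
inline bool offset_fits_int32(std::uint64_t offset) {
    return offset <= static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}
```

Under these assumptions, a value such as 16777217.0 (2^24 + 1) cannot be represented exactly once narrowed to a float, and an integer such as 5000000000 does not survive truncation to 32 bits. Kernels that rely on double precision, 64-bit integer arithmetic, or offsets beyond the int32 range may therefore behave differently after conversion.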

Related projects

License

Apache 2.0

News

  • May 20:
    • renamed to Coriander
  • May 18:
    • Presented Coriander at this year's IWOCL :-) Full IWOCL program here, and there is a link to my own slides
  • May 5:
  • May 1:
    • dnn tests pass on Radeon Pro 450, on Mac Sierra now
    • fix crash bugs in pooling forward/backward, on Mac Sierra
    • thanks to my employer ASAPP for giving me use of a nice MacBook Pro 4th Generation, with Radeon Pro 450, unit tests now pass on said hardware :-)
  • April 29:
    • Updated to the latest EasyCL. This lets you use the environment variable CL_GPUOFFSET to choose a different GPU, e.g. set it to 1 to use the second GPU, to 2 to use the third GPU, etc.
  • Older news

