OpenCL

The OpenCL block allows a user to interface with an OpenCL-compatible device, such as a GPU. This block handles most of the complications of using the OpenCL API. All the user has to do is feed the block a .cl file with the kernel source and click run! This block makes use of GRAS's special buffer model, so memory allocated by OpenCL can be written directly by upstream blocks and read directly by downstream blocks.

Setup and install

OpenCL development environment

The first step is to install an OpenCL development environment. This part is specific to the hardware or GPU in question, so please refer to your vendor's OpenCL or SDK installation instructions. I personally found this step very easy on an Ubuntu machine with an Nvidia GPU: I simply had to install the nvidia-opencl-dev package and everything was taken care of.

Configure and build GRAS

After installing the OpenCL development environment, install GRAS according to the build instructions here:

https://github.com/guruofquality/gras/wiki/Build

During the cmake configuration step, you should see verbose output similar to this:

Found OpenCL: /usr/lib/libOpenCL.so

If the cmake configuration cannot find the OpenCL development files, the development directories for OpenCL headers and libraries can also be manually set via the following variables in cmake:

OPENCL_LIBRARIES
OPENCL_INCLUDE_DIRS

Using OpenCL block

Using

The OpenCL block can be used in C++, Python, or GNU Radio Companion environments. The user needs to know surprisingly little about the OpenCL API: the parts that revolve around buffer allocation, kernel compilation, device detection, and so on are all wrapped by the OpenCL block. The user's only concern is implementing a kernel in a .cl file.
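To give a feel for what such a kernel might look like, here is a minimal sketch. The kernel name (scale), its argument order, and the file name scale.cl are all illustrative assumptions, not the signature the OpenCL block requires, so check the block's own documentation before relying on them. The kernel source is held in a Python string exactly as it would appear in the .cl file, and a plain CPU function shows the value each work item would write.

```python
# Hypothetical kernel source, as it might be saved in a file such as "scale.cl".
# The kernel name and argument layout are assumptions made for illustration.
KERNEL_SOURCE = """
__kernel void scale(__global const float* input, __global float* output)
{
    const size_t i = get_global_id(0);  /* index of this work item */
    output[i] = 2.0f * input[i];        /* one sample per work item */
}
"""


def scale_reference(samples):
    """CPU reference: the value work item i would write to output[i]."""
    return [2.0 * x for x in samples]


if __name__ == "__main__":
    # Compare the block's output against this reference when debugging a kernel.
    print(scale_reference([0.5, 1.0, -3.0]))
```

A CPU reference like this is handy for checking the output of a flow graph against known-good values while the kernel is being developed.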

Notes

Firstly, I would like to note that the OpenCL API encompasses a great deal, and it would be impossible for this block to cover all of it. So far, this block handles linear arrays of data in and out, and exposes hooks to control linear work groups and work dimensions. I think this makes sense for GNU Radio applications, which are often based on processing linear buffers of samples.
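As background on linear work sizes: a one-dimensional OpenCL launch is described by a global work size (the total number of work items) and, optionally, a local work size (the work items per work group). In OpenCL 1.x, when a local size is given, the global size must be a multiple of it. The sketch below shows the usual round-up arithmetic. The function name padded_global_size is hypothetical and is not part of the OpenCL block's API; how the block's hooks expect these sizes to be passed is not covered here.

```python
def padded_global_size(num_items, local_size):
    """Round num_items up to the next multiple of local_size.

    Hypothetical helper for illustration only; not part of the OpenCL block.
    """
    if num_items < 0:
        raise ValueError("num_items must be non-negative")
    if local_size <= 0:
        raise ValueError("local_size must be positive")
    # Integer ceiling division, then scale back up to a whole number of groups.
    num_groups = (num_items + local_size - 1) // local_size
    return num_groups * local_size


if __name__ == "__main__":
    # 1000 samples with work groups of 64 items need 16 groups.
    print(padded_global_size(1000, 64))
```

When the global size is padded this way, a kernel would typically also need a bounds check so the extra work items do not touch memory past the end of the buffer.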

Implementation notes

The OpenCL block makes use of GRAS's advanced buffering model. Using the GRAS API, the OpenCL block swaps out its input and output buffer queues, and replaces these with a custom queue that uses OpenCL's buffer allocators. Therefore, blocks upstream of the OpenCL block write into memory allocated by OpenCL, and blocks downstream of the OpenCL block read from memory allocated by OpenCL.

Specifically, the OpenCL buffers are allocated with the CL_MEM_ALLOC_HOST_PTR flag. On a PCI Express graphics card, buffers may be allocated in pinned memory and DMA'd over the PCIe interface. DMAs are executed via the enqueueMapBuffer and enqueueUnmapMemObject OpenCL API calls. Nvidia's documentation recommends this method for best performance:

http://www.nvidia.com/content/cudazone/CUDABrowser/downloads/papers/NVIDIA_OpenCL_BestPracticesGuide.pdf
