OpenCL bindings for Julia

Julia interface for the OpenCL parallel computation API

This package aims to be a complete solution for OpenCL programming in Julia, similar in scope to PyOpenCL for Python. It provides a high-level API for OpenCL that makes programming GPUs and multicore CPUs much less onerous.

OpenCL.jl provides access to OpenCL API versions 1.0, 1.1, 1.2 and 2.0.

This package is based off the work of others:

OpenCL.jl has had contributions from many developers.

Currently supported Julia versions

  • Julia v"0.4.x" is supported on the release-0.4 branch and the OpenCL.jl versions v"0.4.x". Only bug-fixes will be applied.
  • Julia v"0.5.x" is supported on the master branch and the OpenCL.jl versions v"0.5.x".
  • Julia v"0.6.x" is experimentally supported on the master branch and the OpenCL.jl versions v"0.5.x".

Discontinued support

  • Julia v"0.3.x" was supported on OpenCL.jl versions v"0.3.x". It should still be installable and work.


Installation

  1. Install an OpenCL driver. On OS X, OpenCL is already available.
  2. Check out the package from the Julia REPL: Pkg.add("OpenCL")
  3. OpenCL will be installed in your .julia directory.
  4. cd into your .julia directory to run the tests and try out the examples.
  5. To update to the latest development version, from the Julia REPL: Pkg.checkout("OpenCL")

IJulia Notebooks

Quick Example

using LinearAlgebra
using OpenCL

const sum_kernel = "
   __kernel void sum(__global const float *a,
                     __global const float *b,
                     __global float *c)
    {
      int gid = get_global_id(0);
      c[gid] = a[gid] + b[gid];
    }
"
a = rand(Float32, 50_000)
b = rand(Float32, 50_000)

device, ctx, queue = cl.create_compute_context()

a_buff = cl.Buffer(Float32, ctx, (:r, :copy), hostbuf=a)
b_buff = cl.Buffer(Float32, ctx, (:r, :copy), hostbuf=b)
c_buff = cl.Buffer(Float32, ctx, :w, length(a))

p = cl.Program(ctx, source=sum_kernel) |> cl.build!
k = cl.Kernel(p, "sum")

queue(k, size(a), nothing, a_buff, b_buff, c_buff)

r = cl.read(queue, c_buff)

if isapprox(norm(r - (a+b)), zero(Float32))
    @info "Success!"
else
    @error "Norm should be 0.0f"
end


Here's a rough translation from the OpenCL C API to this Julia version. Optional arguments are indicated by [name?] (see clCreateBuffer, for example). For a quick reference to the C version, see the Khronos quick reference card.

Platform and Devices

| C | Julia | Notes |
|---|---|---|
| clGetPlatformIDs | cl.platforms() | |
| clGetPlatformInfo | cl.info(platform, :symbol) | Platform info: :profile, :version, :name, :vendor, :extensions |
| clGetDeviceIDs | cl.devices(), cl.devices(platform), cl.devices(:type) | Device types: :all, :cpu, :gpu, :accelerator, :custom, :default |
| clGetDeviceInfo | cl.info(device, :symbol) | Device info: :driver_version, :version, :profile, :extensions, :platform, :name, :device_type, :has_image_support, :queue_properties, :has_queue_out_of_order_exec, :has_queue_profiling, :has_native_kernel, :vendor_id, :max_compute_units, :max_work_item_size, :max_clock_frequency, :address_bits, :max_read_image_args, :max_write_image_args, :global_mem_size, :max_mem_alloc_size, :max_const_buffer_size, :local_mem_size, :has_local_mem, :host_unified_memory, :available, :compiler_available, :max_work_group_size, :max_work_item_dims, :max_parameter_size, :profiling_timer_resolution, :max_image2d_shape, :max_image3d_shape |
| clCreateContext | cl.context(queue), cl.context(CLMemObject), cl.context(CLArray) | |
| clReleaseContext | cl.release! | |
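
As a minimal sketch of how the platform and device queries above fit together, the loop below lists every platform and its devices. It assumes an OpenCL driver is installed and that cl.info accepts the info symbols listed in the table; the printed fields are just an illustrative selection.

```julia
using OpenCL

# Walk over every OpenCL platform the driver exposes.
for platform in cl.platforms()
    println("Platform: ", cl.info(platform, :name),
            " (", cl.info(platform, :version), ")")

    # List the devices that belong to this platform.
    for device in cl.devices(platform)
        println("  Device: ", cl.info(device, :name),
                ", compute units: ", cl.info(device, :max_compute_units))
    end
end
```

On a machine without any OpenCL driver the outer loop may simply have nothing to iterate over, or the platform query may raise an error, depending on the installation.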


Buffers

| C | Julia | Notes |
|---|---|---|
| clCreateBuffer | cl.Buffer(type, context, [length?]; [hostbuf?]), cl.Buffer(type, context, flags, [length?]; [hostbuf?]) | Memory flags: :rw, :r, :w, :use, :alloc, :copy |
| clEnqueueCopyBuffer | cl.copy!(queue, destination, source) | |
| clEnqueueFillBuffer | cl.enqueue_fill_buffer(queue, buffer, pattern, offset, nbytes, wait_for) | |
| clEnqueueReadBuffer | cl.enqueue_read_buffer(queue, buffer, hostbuf, dev_offset, wait_for, is_blocking) | |
| clEnqueueWriteBuffer | cl.enqueue_write_buffer(queue, buffer, hostbuf, byte_count, offset, wait_for, is_blocking) | |

Program Objects

| C | Julia | Notes |
|---|---|---|
| clCreateProgramWithSource | cl.Program(ctx; source) | |
| clCreateProgramWithBinaries | cl.Program(ctx; binaries) | |
| clReleaseProgram | cl.release! | |
| clBuildProgram | cl.build!(program, options) | |
| clGetProgramInfo | cl.info(program, :symbol) | Program info: :reference_count, :devices, :context, :num_devices, :source, :binaries, :build_log, :build_status |

Kernel and Event Objects

| C | Julia | Notes |
|---|---|---|
| clCreateKernel | cl.Kernel(program, "kernel_name") | |
| clGetKernelInfo | cl.info(kernel, :symbol) | Kernel info: :name, :num_args, :reference_count, :program, :attributes |
| clEnqueueNDRangeKernel | cl.enqueue_kernel(queue, kernel, global_work_size), cl.enqueue_kernel(queue, kernel, global_work_size, local_work_size; global_work_offset, wait_on) | |
| clSetKernelArg | cl.set_arg!(kernel, idx, arg) | idx starts at 1 |
| clCreateUserEvent | cl.UserEvent(ctx; retain) | |
| clGetEventInfo | cl.info(event, :symbol) | Event info: :context, :command_queue, :reference_count, :command_type, :status, :profile_start, :profile_end, :profile_queued, :profile_submit, :profile_duration |
| clWaitForEvents | cl.wait(event), cl.wait(events) | |
| clEnqueueMarkerWithWaitList | cl.enqueue_marker_with_wait_list(queue, wait_for) | |
| clEnqueueBarrierWithWaitList | cl.enqueue_barrier_with_wait_list(queue, wait_for) | |
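
The Quick Example launches its kernel by calling the queue directly. As a rough sketch of the lower-level route through the table entries above, the snippet below sets each argument with cl.set_arg!, enqueues the kernel with cl.enqueue_kernel, and blocks on the returned event with cl.wait. The kernel scale, the names scale_kernel and x_buff, and the assumption that cl.enqueue_kernel returns an event that cl.wait accepts are illustrative and may need adjusting for your OpenCL.jl version.

```julia
using OpenCL

# Hypothetical kernel for illustration: multiply every element by a scalar.
const scale_kernel = "
   __kernel void scale(__global float *x,
                       const float factor)
    {
      int gid = get_global_id(0);
      x[gid] = factor * x[gid];
    }
"

x = rand(Float32, 1_000)

device, ctx, queue = cl.create_compute_context()

# Read-write device buffer initialised with a copy of the host array.
x_buff = cl.Buffer(Float32, ctx, (:rw, :copy), hostbuf=x)

p = cl.Program(ctx, source=scale_kernel) |> cl.build!
k = cl.Kernel(p, "scale")

# Argument indices start at 1, as noted in the table.
cl.set_arg!(k, 1, x_buff)
cl.set_arg!(k, 2, 2f0)

# Assumed to return an event for the enqueued launch.
evt = cl.enqueue_kernel(queue, k, size(x))
cl.wait(evt)

r = cl.read(queue, x_buff)
```

Setting arguments one at a time is mostly useful when a kernel is launched repeatedly and only some of its arguments change between launches.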