OpenCL bindings for Chicken Scheme
These bindings target OpenCL version 1.1 for wide support. It should be fairly easy to add support to newer OpenCL APIs.
Hosted here.
Most Scheme procedures map 1-1 to OpenCL C procedures, with some
naming conventions changed. All procedures start with the type of
object they work on. For example, clCreateBuffer becomes
buffer-create.
All OpenCL objects are wrapped in a record structure. All finalizer
keyword arguments default to using set-finalizer! to call the
appropriate *-release! procedure when the object is freed by the
CHICKEN GC. This automatic memory management should be sufficient for
many use cases.
For a large number of OpenCL objects it may be a good idea to call the
*-release! procedures explicitly. All the *-release! procedures
are idempotent, so it is even safe to use them on objects that have
finalizers attached. Also, (gc #t) can sometimes be used to free
ObjecCL objects no longer needed by the application, freeing OpenCL
resources sooner.
If any of the OpenCL C procedures return an error code other than
CL_SUCCESS, an error is signaled, hopefully with a meaningful error
message.
You must have a working OpenCL installed on your system for this egg
to work. clinfo is a useful tool to diagnose and verify
this. Environment variables like OCL_ICD_VENDORS may be useful to
out-source the device-picking process:
~> csi -R opencl -P '(platforms)'
(#<cl_platform "Intel(R) OpenCL HD Graphics">
#<cl_platform "Clover">
#<cl_platform "Portable Computing Language">)
~> env OCL_ICD_VENDORS=intel.icd csi -R opencl -P '(platforms)'
(#<cl_platform "Intel(R) OpenCL HD Graphics">)There is unfortunately a bit boilerplate to get started, but luckily it's less than the equivalent C code. This code snippet might be useful:
(import srfi-4 opencl)
(define device (car (flatten (map platform-devices (platforms)))))
(define context (context-create device))
(define cq (command-queue-create context device))
(define buffer (buffer-create (f32vector -1 -1 -1 -1 -1) cq))
(define kernel (kernel-create (program-build (program-create context "
__kernel void foo(__global float *out) {
out[get_global_id(0)] = get_global_id(0) * 10;
}
") device) "foo"))
(kernel-arg-set! kernel 0 buffer)
(kernel-enqueue kernel cq (list 4))
(print (buffer-read buffer cq))When run, it should output #f32(0.0 10.0 20.0 30.0 -1.0). The last
-1 is leftovers from buffer creation. More examples can be found in
the examples folder.
[procedure] (platforms) => (list-of cl_platform)
List all the OpenCL platforms on the system.
[procedure] (platform-extensions platform) => string
[procedure] (platform-name platform) => string
[procedure] (platform-profile platform) => string
[procedure] (platform-vendor platform) => string
[procedure] (platform-version platform) => string
Get various information for a given platform. See clGetPlatformIDs
[procedure] (platform-devices platform #!optional device-type) => (list-of cl_device)
List the available devices for platform. device-type can
optionally be supplied as a filter, and symbol among cpu gpu all
accelerator default.
[procedure] (device-info device) => alist
[procedure] (device-address-bits device) => number
[procedure] (device-name device) => string
[procedure] (device-vendor device) => string
[procedure] (device-version device) => string
Get various information of device. See this
page.
[procedure] (context-create devices #!key (finalizer #t)) => cl_context
[procedure] (context-release!) => void
Create or release an OpenCL context. See this
page. Unfurtunately,
currently devices must be a single cl_device. In the future, a
list of cl_devices might be implemented.
[procedure] (command-queue-create context device #!key out-of-order profile (finalizer #t)) => cl_command_queue
[procedure] (command-queue-release! command_queue) => void
Create or release a cl_command-queue. out-of-order is a boolean
flag and is #f by default. See this
page.
[procedure] (command-queue-context context) => cl_context
Get back the cl_context passed into command-queue-create.
[procedure] (buffer-create s c #!key (type #f) (flags 0) (finalizer #t)) => cl_mem
[procedure] (buffer-release! buffer) => void
Create or release an OpenCL buffer object, a memory chunk that is usually stored on the target device. See this page.
If c is a cl_context, s (s for source) must be an integer
representing the buffer size in bytes.
If c is a cl_command-queue, s must be a srfi-4 vector or a
blob. The content of s is copied to the newly created buffer
buffer using buffer-write. The size in bytes of this buffer and
the size in bytes of s will be equal.
type defaults to the type of s for srfi-4 vectors, or blob if
s is an integer. If s is an integer, type can be explicitly
supplied. See buffer-type for valid values.
Support for flags is currently limited, and must be a number
corresponding to the C API. The default is CL_MEM_READ_WRITE.
[procedure] (buffer-write buffer cq src #!key (offset 0)) => cl_mem
Copy the content of src to buffer. See this
page. Only
blocking writes are supported.
cq must be a valid cl_command-queue. src must be a srfi-4 vector
or a blob. offset is the offset in bytes in the buffer object to
write to.
[procedure] (buffer-read buffer cq #!key type (dst #f) (byte-offset 0) (bytes #f)) => srfi-4-vector
Copy the content of buffer into host memory. The returned object is
dst if supplied, otherwise a newly allocated srfi-4 vector or
blob. Which vector type depends on type, which defaults to
(buffer-type buffer). If supplied, dst must be a srfi-4 vector or
blob and its content will be overwritten with data from buffer.
If supplied, bytes must be the number of bytes to copy. If supplied,
byte-offset must be the number of bytes to skip in buffer.
[procedure] (buffer-size buffer) => integer
Return the size of buffer in bytes.
[procedure] (buffer-type buffer) => symbol
Return the type tag symbol. This is stored in the cl_mem Scheme
record, not in the OpenCL cl_mem object. It is used by buffer-read
to construct a suitable srfi-4 vector. Valid types are u8 s8 u16
s16 u32 s32 u64 s64 f32 f64.
[procedure] (program-create context source #!key (finalizer #t)) => cl_program
[procedure] (program-release! program) => void
Create or release a OpenCL with clCreateProgramWithSource.
context must be a valid cl_context. source must be OpenCL C
source code as a string. Note that the returned program must be
built before it can be used to create kernels.
[procedure] (program-build program devices #!key options) => program
devices must be a valid cl_device. Unfortunately, only a single
device is currently supported. If the build process fails, an error is
signaled with the content of (program-build-log program) which will
hopefully contain useful compiler errors.
If supplied, options must be a string as specified in
clBuildProgram.
[procedure] (program-build-log program) => string
[procedure] (program-build-options program) => string
Retrieve various information about program, see this
page.
[procedure] (kernel-create program name #!key (finalizer #t)) => cl_kernel
[procedure] (kernel-release! kernel) => void
Create or release a cl_kernel. See this page.
name must be a string, the name of a __kernel function inside
program.
[procedure] (kernel-arg-set! kernel idx value) => void
Calls
clSetKernelArg. idx
must be an integer, where 0 is the first kernel argument. value must
be a srfi-4 vector for by-value argument types, or cl_buffer for
arguments pointers.
The type of the kernel argument (ie. float2) must match the srfi-4
vector type (ie. (f32vector 1 1)). It is unfortunately not possible
to enforce checks for this. The srfi-4 vector's length must match the
kernel argument. For example, the arguments to __kernel foo(long4 a, float *result) could be initialized like this:
(kernel-arg-set! kernel 0 (s64vector 1 2 3 4))
(kernel-arg-set! kernel 1 (buffer-create (* 4 1000) ctx type: 'f32))I don't know if you must call this for all arguments each time.
[procedure] (kernel-enqueue kernel cq global-work-sizes #!key event wait global-work-offsets local-work-sizes) => (or cl_event #f)
Enqueues kernel for execution on cq as described
here.
global-work-sizes must be a list of integers, each representing a
work size for each dimension. On most platforms, the maximum number of
dimensions (and thus the length of global-work-sizes) is 3. The
minimum is always 1.
The numbers supplied here will be reflected inside kernel as
get_global_size(n). Note that total number of worker-items executed should be (foldl * 1 global-work-sizes).
Unlike most of the other procedures, kernel-enqueue is non-blocking
and may return before kernel is finished executing. If event is
supplied and is #t or a cl_event, the returned event can be used
to query the kernel execution status. If event is not supplied,
kernel-enqueue returns #f.
The event API is a bit experimental and difficult to test.
[procedure] (event-create context) => cl_event
[procedure] (event-release!) => void
Create or release a user-event cl_event, see this
page.
[procedure] (event-complete! event #!optional (status 'complete)) => cl_event
Set the execution status of event, see
clSetUserEventStatus.
[procedure] (event-status event) => symbol
Retrieve event status, one of: queued submitted running
complete.
[procedure] (cl_command-queue? x) => bool
[procedure] (cl_context? x) => bool
[procedure] (cl_device? x) => bool
[procedure] (cl_event? x) => bool
[procedure] (cl_kernel? x) => bool
[procedure] (cl_mem? x) => bool
[procedure] (cl_platform? x) => bool
[procedure] (cl_program? x) => bool
Predicates for record types. buffer object are cl_mem? records.
- automatic memory management of OpenCL objects
- list platforms and devices
- create context and command queue
- create, compile and run programs from source-code
- create, compile and run programs from IR/Binary
- build options (in
clBuildProgram) - launch kernels (global-size, local-size and global-offset)
- kernel name
-
clCreateKernelsInProgram - create, read and write buffer objects
- set read-only and write-only buffer flags
- create, read and write image objects
- copy buffers
- copy images
- samplers
- sub-buffers
- non-blocking API
- event support for kernel and user (test fails on my Clover platform)
- event support for buffer commands
- event callbacks
- event blocking (
clWaitForEvents) - OpenCL <-> OpenGL interop
So far everything is blocking. Implementing the non-blocking API would complicate things as the Chicken GC might get in the way.