Pre-release

dmed256 released this Mar 23, 2018 · 443 commits to master since this release

Porting from v0.2 -> v1.0

Properties

The biggest change is the introduction of occa::properties.
We use occa::properties as a way to generalize the OCCA API to support current backends as well as future ones.

It accepts JSON, along with a few JavaScript shorthand notations. For example:

JSON:
  {
    "1": [1],
    "2": 2,
    "3": "3"
  }
Shorthand:
  {
    1: [1,],
    2: 2,
    3: '3',
  }

The shorthand differs from strict JSON in three ways:

  • Supports trailing commas
  • Doesn't require quotes for object keys
  • Can use single quotes for strings

Since occa::properties is a key-value store, the enclosing {} can be omitted.

v1.0

// Adjacent string literals are concatenated into a single argument
occa::properties props("1: [1,],"
                       "2: 2,"
                       "3: '3',");

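The shorthand rules above can be sketched as a small normalizer that turns the shorthand into strict JSON. This is only an illustration of the rules, not the parser OCCA uses: toStrictJson and its helpers are hypothetical names, and the sketch assumes a top-level object and copies escape sequences verbatim.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Illustrative only: NOT the parser OCCA uses.

// Index of the first non-whitespace character at or after pos
static std::size_t skipSpace(const std::string &s, std::size_t pos) {
  while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) {
    ++pos;
  }
  return pos;
}

// Characters allowed in an unquoted key, number, or literal
static bool isBareChar(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) ||
         c == '_' || c == '.' || c == '-' || c == '+';
}

std::string toStrictJson(const std::string &in) {
  std::string out;
  std::size_t i = 0;
  while (i < in.size()) {
    const char c = in[i];
    if (c == '\'' || c == '"') {
      // Single-quoted strings become double-quoted strings
      out += '"';
      for (++i; i < in.size() && in[i] != c; ++i) {
        if (in[i] == '\\' && i + 1 < in.size()) {
          out += in[i++];  // Copy the backslash, then the escaped character
        } else if (in[i] == '"') {
          out += '\\';     // A bare " inside '...' must be escaped
        }
        out += in[i];
      }
      out += '"';
      ++i;  // Skip the closing quote
    } else if (isBareChar(c)) {
      std::size_t end = i;
      while (end < in.size() && isBareChar(in[end])) ++end;
      const std::string token = in.substr(i, end - i);
      const std::size_t next = skipSpace(in, end);
      const bool isKey = (next < in.size()) && (in[next] == ':');
      // Unquoted object keys get quotes; numbers and literals pass through
      out += isKey ? ('"' + token + '"') : token;
      i = end;
    } else if (c == ',') {
      const std::size_t next = skipSpace(in, i + 1);
      const bool trailing =
          (next == in.size()) || in[next] == '}' || in[next] == ']';
      if (!trailing) out += c;  // Trailing commas are dropped
      ++i;
    } else {
      out += c;
      ++i;
    }
  }
  // The enclosing {} is optional in the shorthand, so add it when missing
  const std::size_t first = skipSpace(out, 0);
  if (first == out.size() || out[first] != '{') out = "{" + out + "}";
  return out;
}
```

With this sketch, toStrictJson("1: [1,], 2: 2, 3: '3',") yields {"1": [1], "2": 2, "3": "3"}, the same content as the JSON form shown earlier.
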
Device

The occa::device constructor expects an occa::properties object, which can be built automatically from a string.

v0.2

occa::device device("mode = OpenCL, deviceID = 0");

v1.0

occa::device device("mode: 'OpenCL', deviceID: 0");
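
For codebases with many v0.2 device strings, the conversion can be sketched as a small helper. portDeviceString is a hypothetical name, not an OCCA function, and it assumes a flat comma-separated list of key = value pairs whose values contain no commas or quotes. Integer values are left bare and everything else is single-quoted.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical porting helpers; none of these are part of OCCA.

// Removes leading and trailing whitespace
static std::string trim(const std::string &s) {
  const char *ws = " \t\n\r";
  const std::size_t begin = s.find_first_not_of(ws);
  if (begin == std::string::npos) return "";
  const std::size_t end = s.find_last_not_of(ws);
  return s.substr(begin, end - begin + 1);
}

// True for values such as "0" or "-12"
static bool isInteger(const std::string &s) {
  std::size_t i = (!s.empty() && s[0] == '-') ? 1 : 0;
  if (i == s.size()) return false;
  for (; i < s.size(); ++i) {
    if (!std::isdigit(static_cast<unsigned char>(s[i]))) return false;
  }
  return true;
}

// Converts a v0.2 string such as "mode = OpenCL, deviceID = 0"
// into the v1.0 form "mode: 'OpenCL', deviceID: 0"
std::string portDeviceString(const std::string &v02) {
  std::istringstream pairs(v02);
  std::string pair, out;
  while (std::getline(pairs, pair, ',')) {
    const std::size_t eq = pair.find('=');
    if (eq == std::string::npos) continue;  // Skip entries without '='
    const std::string key   = trim(pair.substr(0, eq));
    const std::string value = trim(pair.substr(eq + 1));
    if (!out.empty()) out += ", ";
    out += key + ": ";
    out += isInteger(value) ? value : ("'" + value + "'");
  }
  return out;
}
```

The result can be passed wherever v1.0 expects a properties string, but it is worth reviewing by hand since property names may also have changed between versions.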

Kernel

The occa::kernelInfo object was replaced by occa::properties

v0.2

occa::kernelInfo kernelInfo;

kernelInfo.addDefine("p_foo", foo);

kernelInfo.addInclude("#include \"myfile.okl\"");
kernelInfo.addIncludeDefine("myfile.okl");

kernelInfo.addCompilerFlag("-O3");
kernelInfo.addCompilerFlag(" -DN=10");

occa::kernel addVectors = device.buildKernelFromSource("addVectors.okl",
                                                       "addVectors",
                                                       kernelInfo);

occa::kernel addVectors = device.buildKernelFromString("kernel void ...",
                                                       "addVectors",
                                                       kernelInfo);

v1.0

occa::properties props = occa::getKernelProperties();

// Both forms below set the same define
props["defines/p_foo"]    = foo;
props["defines"]["p_foo"] = foo;

props["header"]   += "#include \"myfile.okl\"";
props["includes"] += "myfile.okl";

props["compilerFlag"]  = "-O3";
props["compilerFlag"] += " -DN=10"; // Note the space since we're creating a string

occa::kernel addVectors = device.buildKernel("addVectors.okl",
                                            "addVectors",
                                            props);

occa::kernel addVectors = device.buildKernelFromString("kernel void ...",
                                                       "addVectors",
                                                       props);
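
The two ways of writing the p_foo define in the v1.0 snippet address the same entry: a key containing / acts as a path into nested objects. A toy model of that lookup is sketched below. Node is a hypothetical stand-in that stores only strings, and nothing here reflects how occa::properties is actually implemented.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Toy stand-in for a nested key-value store; NOT occa::properties.
struct Node {
  std::string value;                                      // Leaf payload
  std::map<std::string, std::unique_ptr<Node>> children;  // Nested objects

  // Splits the key on '/' and descends one level per segment,
  // creating missing nodes along the way
  Node& operator[](const std::string &path) {
    Node *node = this;
    std::istringstream segments(path);
    std::string segment;
    while (std::getline(segments, segment, '/')) {
      if (segment.empty()) continue;  // Ignore empty segments such as "a//b"
      std::unique_ptr<Node> &child = node->children[segment];
      if (!child) child.reset(new Node());
      node = child.get();
    }
    return *node;
  }

  // Stores a leaf value
  Node& operator=(const std::string &newValue) {
    value = newValue;
    return *this;
  }
};
```

In this model, props["defines/p_foo"] and props["defines"]["p_foo"] return a reference to the same node, so a value written through one form is visible through the other.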

Memory

We can pass occa::properties to device::malloc to expose additional backend features.

v0.2

occa::memory o_a = device.malloc(2 * sizeof(int));
occa::memory o_b = device.malloc(2 * sizeof(int), ptr);

v1.0

occa::memory o_a = device.malloc(2 * sizeof(int));
occa::memory o_b = device.malloc(2 * sizeof(int), ptr);

// Use CUDA managed memory
occa::memory o_c = device.malloc(2 * sizeof(int), "managed: true");
occa::memory o_d = device.malloc(2 * sizeof(int), ptr, "managed: true");

When copying data between host and device, async must now be passed as a property.

v0.2

o_a.copyTo(a);
o_a.asyncCopyTo(a);

v1.0

o_a.copyTo(a);
o_a.copyTo(a, "async: true");

To avoid parsing the string on every call, the properties object can be created once and reused.

v1.0

occa::properties async = "async: true";
o_a.copyTo(a, async);

Stream

v0.2

occa::streamTag tag;

tag.clEvent;
tag.cuEvent;

v1.0

occa::streamTag tag;

cl_event *clEvent = (cl_event*) tag.handle; // OpenCL mode
CUevent *cuEvent  = (CUevent*) tag.handle;  // CUDA mode

Timer

occa::timer was removed since it seemed out of scope for the project.
An occa::kernel profiler might be added later, though.

OKL

A new parser is being developed to give user-friendly error output and to avoid memory leaks.
Additionally, it is designed so that future backends outside of the OCCA repository can extend it.
For that reason, custom qualifiers such as shared and exclusive needed to be generalized.

Before the release of 1.0, the following keywords will be changed to attributes (e.g. @attr)

shared           -> @shared
exclusive        -> @exclusive
kernel           -> @kernel
barrier          -> @barrier
restrict         -> @restrict
outer, outer0... -> @outer, @outer(0), ...
tile(256)        -> @tile(256)

Attributes are similar to #pragmas: they can be ignored, and each backend handles them in its own way.
For example, @shared doesn't mean anything in CPU modes.
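
The keyword table above can also be applied mechanically while porting. The sketch below is a hypothetical porting aid built on std::regex. portOklKeywords is not part of OCCA, and it covers only the keywords listed above. It is purely textual, so it would also rewrite matches inside comments and string literals.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical porting aid; not part of OCCA.
// Prefixes the v0.2 OKL keywords with '@' and turns outerN into @outer(N).
std::string portOklKeywords(const std::string &source) {
  // Group 1 captures the character before the keyword. Excluding '@' and
  // identifier characters leaves existing attributes and longer names alone.
  static const std::regex numberedOuter(
      "(^|[^@A-Za-z0-9_])outer([0-9])\\b");
  static const std::regex keyword(
      "(^|[^@A-Za-z0-9_])"
      "(shared|exclusive|kernel|barrier|restrict|outer|tile)\\b");

  // outer0 -> @outer(0) runs first so the plain "outer" rule cannot split it
  const std::string numbered =
      std::regex_replace(source, numberedOuter, "$1@outer($2)");
  return std::regex_replace(numbered, keyword, "$1@$2");
}
```

Running it twice is harmless, since names already preceded by @ are skipped. The output should still be reviewed by hand.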

Missing 1.0 Features

The C++ API is stable, but the following still need updates:

  • C API
  • OKL Parser
  • Backends
    • Serial
    • OpenMP
    • OpenCL
    • CUDA
    • Threads
  • Documentation
  • API Testing