enhancements metadefinitions

Glen Fletcher edited this page Mar 23, 2016 · 19 revisions

CEP 528: Support for Metadefinitions (mpdef), like cdef and cpdef.

  • Status: Open
  • Implementation status: Not started

Motivation

Support for generating OpenCL code has previously been suggested in CEP 525 (OpenCL Support). This concept would provide that feature, while also making it possible to implement similar features such as CUDA, OpenACC, or OpenHMPP in a suitable way.

Requirements

The main idea is that these tools all involve creating a kernel, which is compiled and loaded onto another device; data can then be copied to the device and a function evaluated on the dataset. These kernels are built from C-style code and are function based. Cython could add new function definition types alongside the existing cdef and cpdef, i.e. mpdef and mdef, for defining metafunctions (see Syntax Options).

These metafunctions would be compiled to different languages. In theory this isn't limited to C-style languages, but supporting others would take more work.

These languages would be defined by extensions, e.g. PyOpenCL could define a Cython extension implementing OpenCL, while PyCUDA could define a Cython extension implementing CUDA. All Cython provides is the language construct and an interface for defining the code generation, possibly using codepy, for ease of extension creation.

The extensions would also define a Cython import module, declaring any special functions required for the language, like OpenCL's get_global_id().
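The division of labour between Cython and an extension could be sketched in plain Python roughly as follows. This is only an illustration under stated assumptions: MetaLanguageExtension, register_extension, and get_extension are hypothetical names that do not exist in Cython, pyopencl, or pycuda, and the generator is a toy.

```python
# Hypothetical sketch: none of these names exist in Cython today.
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class MetaLanguageExtension:
    name: str                            # e.g. "opencl" or "cuda"
    file_suffix: str                     # suffix of the generated kernel file
    generate: Callable[[str, str], str]  # (function name, body) -> kernel source
    builtins: Tuple[str, ...] = ()       # special functions, e.g. get_global_id


_EXTENSIONS: Dict[str, MetaLanguageExtension] = {}


def register_extension(ext):
    """Called by a package such as pyopencl to plug itself in."""
    if ext.name in _EXTENSIONS:
        raise ValueError(f"metalanguage {ext.name!r} is already registered")
    _EXTENSIONS[ext.name] = ext


def get_extension(name):
    """Called by the compiler when it meets a metafunction of this type."""
    try:
        return _EXTENSIONS[name]
    except KeyError:
        raise LookupError(f"no extension registered for metalanguage {name!r}") from None


# What an OpenCL extension might register (the generator is a toy).
def _opencl_generate(func_name, body):
    return f"__kernel void {func_name}() {{\n  {body}\n}}\n"


register_extension(MetaLanguageExtension(
    name="opencl",
    file_suffix=".opencl",
    generate=_opencl_generate,
    builtins=("get_global_id",),
))
```

In this sketch Cython would own only the registry and the lookup; everything language specific (file suffix, generator, special functions) would come from the extension.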

The final task of the extensions would be to create simple Python function wrappers, for example:

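def create_meta_wrapper(code):
    # parse_args and call stand for helpers provided by the extension
    def wrapper(*args, **kwargs):
        # Validate and convert the Python arguments for the device
        safe_args = parse_args(code, args, kwargs)
        # Run the compiled code on the device and hand back its result
        return call(code, safe_args)
    return wrapper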

Of course, the construct would be much more complicated than this basic example. It would need to:

  • Set up the correct function name and argument information
  • Check for valid arguments
  • Check that the kernel is set up and, if not, set it up
  • Possibly clone memory and copy it to remote memory
  • Call the function on the remote device
  • Retrieve and return the result
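The steps listed above can be sketched without any GPU library. In the following minimal sketch, FakeDevice is a hypothetical stand-in for a real OpenCL or CUDA backend, and the extra parameters of create_meta_wrapper (name, nargs, device) are assumptions made for illustration.

```python
# Hypothetical sketch: FakeDevice stands in for a real OpenCL/CUDA backend.
class FakeDevice:
    """Pretends to be a remote device; 'device memory' is just a list copy."""

    def __init__(self):
        self.kernel_ready = False

    def build_kernel(self):
        self.kernel_ready = True

    def upload(self, host_data):
        return list(host_data)      # copy host data to "device" memory

    def download(self, device_data):
        return list(device_data)    # copy the result back to the host

    def run(self, kernel, *buffers):
        # Apply the kernel element-wise, like one work item per index
        return [kernel(*items) for items in zip(*buffers)]


def create_meta_wrapper(kernel, name, nargs, device):
    def wrapper(*args):
        # Check for valid arguments
        if len(args) != nargs:
            raise TypeError(f"{name}() takes {nargs} arguments, got {len(args)}")
        if len({len(a) for a in args}) > 1:
            raise ValueError(f"{name}(): arguments differ in length")
        # Set the kernel up on first use
        if not device.kernel_ready:
            device.build_kernel()
        # Copy the inputs to remote memory
        buffers = [device.upload(a) for a in args]
        # Call the function on the remote device
        result = device.run(kernel, *buffers)
        # Retrieve and return the result
        return device.download(result)

    # Set the function name (a real system would also record argument information)
    wrapper.__name__ = name
    return wrapper


device = FakeDevice()
gpu_do = create_meta_wrapper(lambda a, b: a * b - a + b, "gpu_do", 2, device)
print(gpu_do([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # [7.0, 13.0, 21.0]
```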

How Code Generation Would Work

Cython already generates a file filename.cpp; under this system it would also generate filename.opencl, filename.cuda, etc. for any metalanguages used. The basic Python module should contain code to compile and load the kernel onto the appropriate device. For example, using pyopencl:

meta.pyx

import numpy as np
cimport numpy as np
import pyopencl as cl
cimport pyopencl as cl

@metalanguage("opencl")
@cl.shape(a.shape)
cpdef gpu_do(np.ndarray a, np.ndarray b):
    cl.sizecheck(a,b,'same')
    cdef np.ndarray c = cl.like_array(a)
    cdef int i = cl.get_global_id(0)
    c[i] = a[i]*b[i] - a[i] + b[i]
    return c

The cl functions pass special information to pyopencl so that it can construct the following:

kernel

__kernel void gpu_do(__global const float *a_g, __global const float *b_g, __global float *c_g) {
  int i = get_global_id(0);
  c_g[i] = a_g[i] * b_g[i] - a_g[i] + b_g[i];
}
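As a rough illustration of the code-generation step, an extension might assemble kernel text like the above from the argument names and the element-wise expression. The function render_opencl_kernel and its parameters are hypothetical; this is a sketch, not the API of pyopencl or codepy.

```python
# Hypothetical sketch of the code-generation step; not pyopencl's or codepy's API.
import re


def render_opencl_kernel(name, inputs, output, expression, ctype="float"):
    """Return OpenCL C source for an element-wise kernel.

    inputs and output are argument names as written in the metafunction;
    expression is the right-hand side, written in terms of name[i].
    """
    names = list(inputs) + [output]

    def to_global(text):
        # Append _g to each array name, following the naming in the kernel above
        pattern = r"\b(" + "|".join(map(re.escape, names)) + r")\["
        return re.sub(pattern, lambda m: m.group(1) + "_g[", text)

    params = [f"__global const {ctype} *{n}_g" for n in inputs]
    params.append(f"__global {ctype} *{output}_g")
    return (
        f"__kernel void {name}({', '.join(params)}) {{\n"
        f"  int i = get_global_id(0);\n"
        f"  {to_global(output + '[i]')} = {to_global(expression)};\n"
        f"}}\n"
    )


source = render_opencl_kernel("gpu_do", ["a", "b"], "c", "a[i] * b[i] - a[i] + b[i]")
print(source)
```

A real generator would work from Cython's syntax tree rather than from strings, but the output it has to produce is the same kind of text.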

wrapper (part of meta python module)

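def gpu_do(a, b):
    # Generated from cl.sizecheck(a, b, 'same'); same_size would be supplied by the extension
    if not cl.same_size(a, b):
        raise ValueError('a, b size mismatch')
    # Build and load the kernel on first use
    if not opencl_kernel_exist:
        setup_opencl()
    # Module-level state created by setup_opencl (attribute names are illustrative)
    ctx, queue = opencl_state.ctx, opencl_state.queue
    mf = cl.mem_flags
    # Copy the inputs to device memory
    a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
    # Allocate the output on the host and on the device
    c = np.empty_like(a)
    c_g = cl.Buffer(ctx, mf.WRITE_ONLY, c.nbytes)
    # Run one work item per element, then copy the result back to the host
    opencl_prg.gpu_do(queue, a.shape, None, a_g, b_g, c_g)
    cl.enqueue_copy(queue, c, c_g)
    return c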

program.py

import pyximport; pyximport.install()
import pyopencl as cl
import numpy as np
import matplotlib.pyplot as plt
import meta

# The full form is not required: if no context is supplied, setup_opencl
# should call pyopencl.create_some_context() to obtain ctx.
ctx = cl.create_some_context()
meta.setup_opencl(ctx)

# float32 matches the float buffers declared in the kernel
a = np.linspace(0, 100, 10**8, dtype=np.float32)
b = np.linspace(5, 8, 10**8, dtype=np.float32)

# If OpenCL has not been set up, gpu_do should call meta.setup_opencl itself
# (without ctx), so this call should also work without any explicit setup.
c = meta.gpu_do(a, b)
plt.plot(c)
plt.show()

An extension could also impose requirements such as pure C, i.e. basically the same restrictions as running with the GIL off. This would obviously be important for most metalanguages, as the code will run on a remote device, so only the raw data that is passed in and the functions defined in the kernel are available.
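One way an extension might enforce such a restriction is to reject anything the device cannot run before generating code. The sketch below uses Python's ast module on Python-like source purely for illustration; check_pure and the ALLOWED_CALLS list are hypothetical, and a real implementation would inspect Cython's own syntax tree.

```python
# Hypothetical sketch: the allowed-call list is invented for illustration.
import ast

ALLOWED_CALLS = {"get_global_id", "get_local_id"}  # functions defined in the kernel


def check_pure(source):
    """Return a list of problems that would stop source running on a device."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append(f"line {node.lineno}: imports are not available on the device")
        elif isinstance(node, ast.Call):
            func = node.func
            # Use the bare name, or the attribute name for calls like cl.get_global_id
            if isinstance(func, ast.Name):
                called = func.id
            elif isinstance(func, ast.Attribute):
                called = func.attr
            else:
                called = "<expression>"
            if called not in ALLOWED_CALLS:
                problems.append(
                    f"line {node.lineno}: call to {called}() is not defined in the kernel"
                )
    return problems


good = "i = get_global_id(0)\nc[i] = a[i] * b[i] - a[i] + b[i]\n"
bad = "i = get_global_id(0)\nprint(a[i])\n"
print(check_pure(good))  # []
print(check_pure(bad))   # ['line 2: call to print() is not defined in the kernel']
```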

Syntax Options:

Here type is a name identifying which metalanguage to use, e.g. OpenCL, CUDA, OpenACC, or OpenHMPP; there are likely others as well. In theory this could be a Python variable, supporting runtime code generation.

The first form (cpdef or mpdef) creates a Python wrapper function, so that it can be called with suitable Python values.

The second form (cdef or mdef) is an internal-only function. In theory it could still be called using pyopencl or pycuda, but the functions may be hard to find.

Option 1:

@metalanguage(type)
cpdef function(arglist):

@metalanguage(type)
cdef function(arglist):

More Pythonic and uses existing constructs; however, it is less obvious that this isn't a standard C-style function and that it will be executed in a special manner.
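To show how the decorator form reads, and how type could be an ordinary variable resolved at runtime, here is a plain-Python analogue. It is only a sketch: the backend registry, the "host" backend, and the metalanguage attribute are hypothetical, and real support would have to live in the Cython compiler.

```python
# Hypothetical sketch in plain Python; real support would live in the Cython compiler.
_BACKENDS = {}


def backend(name):
    """Register a callable that turns a Python function into a device function."""
    def register(build):
        _BACKENDS[name] = build
        return build
    return register


def metalanguage(lang_type):
    """Decorator factory mirroring Option 1; lang_type is looked up when it runs."""
    def decorate(func):
        try:
            build = _BACKENDS[lang_type]
        except KeyError:
            raise LookupError(f"unknown metalanguage {lang_type!r}") from None
        wrapped = build(func)
        wrapped.metalanguage = lang_type  # mark the function as non-standard
        return wrapped
    return decorate


@backend("host")
def _build_on_host(func):
    # Toy backend: run on the host, element by element
    def run(a, b):
        return [func(x, y) for x, y in zip(a, b)]
    run.__name__ = func.__name__
    return run


target = "host"  # could be chosen at runtime


@metalanguage(target)
def gpu_do(a, b):
    return a * b - a + b


print(gpu_do([1.0, 2.0], [3.0, 4.0]), gpu_do.metalanguage)  # [5.0, 10.0] host
```

The attribute set by the decorator is one possible answer to the concern above: tooling could at least detect that the function is executed in a special manner, even though the definition looks ordinary.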

Option 2:

mpdef(type) function(arglist):

mdef(type) function(arglist):

Uses mpdef in a way similar to a Python function call.

Option 3:

mpdef<type> function(arglist):

mdef<type> function(arglist):

Uses mpdef in a way similar to a C++ template construct. Clearer than Option 1, but less Pythonic, as Python doesn't use such syntax.

Option 4:

mpdef[type] function(arglist):

mdef[type] function(arglist):

Uses mpdef in a way similar to a Cython template construct.