DagSverreSeljebotn edited this page May 13, 2008 · 26 revisions
Clone this wiki locally

Note: This document is heavily outdated! The goals of this document remains, however the overall approach is superseded by a combination of many other specs.


This represents Dag Sverre Seljebotn's suggestions for a plugin architecture for Cython capable of more easily providing syntax candy and special optimization for what I'll denote "Cython types" (which is basically the behaviour of C types after being wrapped in Cython). In particular, this might provide much C++ support without any extra explicit support.

If this (well, a heavily modified version) gets support from Cython developers, I might be able to spend some time implementing it during the summer of 2008, but cannot promise anything at this stage.

(Note: I am not a developer of Cython, just a random user of it that became interested in hacking on the source -- so this document does in no way reflect any "official" direction Cython development will go in.)

Update 2008-03-08

This is already a bit outdated, read enhancements/parsetreetransforms before this document.

Update 2008-03-07

I've spent quite a few hours reading Cython source now and a real approach to implementation is beginning to take form. Still estimate about a week for basic functionality.

If you are new to this then start reading the section at the bottom ("The idea") -- I keep the most finished, polished part to the top but they assume having read the unpolished stuff at the bottom.

Suggested changes as seperate specifications

As I work myself through the source code I get a better idea of how to specify it...

These features can be regarded seperate and can be added to get the desired result.

1. Parse tree transform system

See enhancements/parsetreetransforms

2. Type arguments

See enhancements/typeparameters

3. Behaviour system

(Cannot seem to find a good name for this...) Basically a more friendly interface to the parse tree transforms -- wraps a user-friendly plugin-interface for a more specific scenario and implements a tree transform on their behalf.

This is in order for such hooks to be able to live comfortably in other libraries (for instance NumPy might support Cython rather than the other way around), and then a new layer of indirection and simplification is added. This is not a pure duplication, because the Cython parse tree works on the level of source code tokens, while this system works on higher-level primitives by specifying the behaviour of certain variables, types (and perhaps eventually functions).

  • is added to Cython core, and specifies a default Behaviour outlining the possible actions to hook into. The instantiated object can be replaced at run-time by plugin modules (and the replaced object should make sure to forward the non-handled calls back to the default implementation).
   class Behaviour:
     def typeof_indexed_var(self, var): return None
     def typeof_allowed_indices(self, var): return None
     def getexpr_indexed_var(self, scope, var, indices): return None

Currently the implementation will return None which will trigger the Cython builtin, however the Cython core may be refactored to this class later if it turns out that the layer of indirection is useful.
* A TypeBehaviourFactory is provided, which can filter Behaviour events and if the variable has the correct type, one VariableBehaviour descendant is instantiated per unique var object and events forwarded to the right var object.
* A VariableBehaviour descendant gets the same calls but without the var parameter.
* The gettype_XXX calls should return a Cython type object. The getexpr_XXX should return an instance of Behaviour.ExprResult like this:
# within gettype_XXX function...:

# Need to do temporary calculation, store it in temporary variable which
# is the expression, and finally clean up...

result = scope.expr_result()
tmpvar = result.tmp_c_var("int[]")
result.c_pre("%s = new int[10];" % tmpvar)
result.c_post("delete[] %s;" % tmpvar)
return result

Steps to implement

  • Modify to parse the arguments
  • Add parametrized subtype support to
  • Make the parametrized subtype act like the original type everywhere.


cdef myfunc(cpp.vector(unsigned int) vec):
  cpp.vector(float) vec2

ctypedef numpy.ndarray(numpy.uint8, 2) grayscale_bitmap

Key observations

  • The main strategy of Cython source is to a) create a parse tree of the Python source, b) assume that it pretty much maps 1:1 to C source, except for type handling which is done by a hard-coded strategy of inserting coercion nodes, c) run methods on the parse tree to generate source.
  • What this architecture must do is to insert a new step that allows transforms to the tree (new kinds of transforms can be added as needed), in order to remove the 1:1 mapping.
Take the Numpy example as usual (below in "The idea"). The end result of the code written there should end up in a transform that:
  • Acts on all IndexNodes in the tree which has a base of the type of ndarray.
  • Replaces the node by a node which generates a numpy lookup operation.
  • Only a part of the tree is replaced -- the "index" child is in many cases a TupleNode and that will probably be unpacked so that individual arguments can be considered by numpy code generating code -- but below that level the tree must be kept to evaluate the index expressions.

Older notes on implementation strategy (a bit obsolete)

(These are mostly mental dumps for myself -- in case I end up implementing it. Most of this is also very applicable for implementing NumPy support directly as well.) I'll call the class above (VarGenerator) etc. for a "varhandler".

  • The plugins should have direct access to and use it as the type engine.
  • should include a field for specifying optional varhandler class for types.
  • The symbol table in should be changed to include an optional field for one varhandler instantiation

per symbol. One can either hack Symtab to instantiate the varhandler (if it exists for the type) or figure out where Symtab is called and see if it fits there. * Change lots of the classes in to first attempt to delegate to a varhandler if it exists for a variable -- example rewriting of IndexNode.analyse_types (, line ~1210):

       def analyse_types(self, env):

           varhandler = env.lookup_var_extension(  # dagss
           set_by_varhandler = False
           if varhandler != None:
               # Delegate call to extension
                 self.type = varhandler.type_when_indexed()
                 if self.type != None: set_by_varhandler = True
               except CompileError, e:
                 error(self.pos, e.message)

               self.py_index = CloneNode(self.index)

           if not set_by_varhandler: # original code follows...

* A problem here is that self.index.analyse_types will coerce all slice arguments etc. to Python objects, which may not be what we want. So the varhandler must
somehow control how the coercion of the indices happens...perhaps by a declarative strategy like this:
   class NumPyVarHandler....:
       def get_legal_indexes(self):
           return [int_type] * self.nd # Allow exactly nd ints in []-notation

Other specs could be:
   # First index can be string or int, second only int
   spec = [MultiTypes(str_type, int_type), int_type]

   # Must have two slices with int increments
   spec = [(int_type, int_type, int_type)] * 2

The result could be passed to analyse_types as an optional "accepts" argument (where None means "just do what Pyrex does")

The Idea

I'll introduce my ideas by example, by implementing NumPy array support simply assuming that Cython has been developed to the stage that it includes my underlying ideas. None of the code below will run now of course.

While the stuff below is very targeted to one class (numpy.ndarray), keep in mind that most plugins can be much more generic and cover more than one class. For instance CPPContainerVarGenerator should be able to tackle all C++ (formerly STL) container classes, and it will inherit from CPPTemplateVarGenerator which deals with the generic problem of handling C++ templates.

The code I expect to be able to write in Cython using NumPy is this:

def negative_grayscale_image(numpy.ndarray(numpy.uint8, 2) arr):
  cdef int i, j
  for i in range(0, arr.shape[0]):
    for j in range(0, arr.shape[1]):
      arr[i, j] = 255 - arr[i, j]

Also, this would be nice:

def my_func(numpy.ndarray(numpy.float, 2) arr):
  for i, j in arr.coords():
    arr[i, j] = i * j * j

def my_sum(numpy.ndarray(uint8, ANY) arr):
  # By not declaring number of dimensions I've yielded my right
  # to a [] operator, but I can still do:
  cdef int total_sum = 0
  for value in arr: total_sum += value
  return total_sum

In order to declare numpy.ndarray, one might say (note the added cextend and generator, syntax is just to get the idea across):

cextend cython_numpy

ctypedef extern class numpy.ndarray [object PyArrayObject, generator NumPyNdArrayVarGenerator]:
  cdef char *data
  cdef int nd
  cdef intp *dimensions
  cdef intp *strides
  cdef int flags

Now, the following is It is a bit crude - of course, care could be taken to implement more generic plugins that can be used for more than one type, an hierarchy and so on.

from cython.plugin import *

# The following is instantiated by Cython as the Cython code is
# parsed and C code generated. It is instantiated once per variable
# in the source code that has the type numpy.ndarray
# ctx is a "code generator" that represents the function that is
# currently being written to C source
class NumPyNdArrayVarGenerator(VarGenerator):
  def __init__(self, type, numdimensions):
    self.datatype = type
    self.nd = numdimensions

  def enter_scope(self, ctx, varname, is_function_argument):
    # This is called right after __init__

    # Sanity checking. This is easiest to write in Cython, and let ctx
    # spawn a sub-conversion to make C out of it
    ctx.prepend_statement_cython("if %s.nd != %d: raise ValueError(\"Wrong
numer of dimensions...\")" % (varname, self.nd))

    # For numpy: Cache stride variables in local variables
    self.varname = varname
    self.stridevars = [ ctx.new_temp_var("int") for x in range(self.nd) ]
    for stridevar, idx in zip(self.stridvars, range(self.nd)):
      # prepend a statement using C directly
      ctx.prepend_statement_c("%s = %s.strides[%d]" % (stridevar, varname, idx));

  def exit_scope(self, ctx): pass

  def get(self, ctx, type):
    # Called when we are the b variable in "a = b". Should generate
    # code for returning an expression (C or Python) for making it
    # happen. Code can be inserted through ctx in order to assign the
    # value to a temporary variable...

    # If we were an int, we might return ctx.expr_c(self.varname)
    # If we were a magic int that truncated if above 1000 when assigned to a short, we might
    # do:
    # if type == "short":
    #   tmpvar = ctx.new_temp_var("int")
    #   ctx.append_statement("%s = (%s > 1000) ? 1000 : %s" % (tmpvar, self.varname, self.varname))
    #   return ctx.c_expr(tmpvar)
    # Also, to return cython code instead, you would do return ctx.cython_expr(...)

    return None # cannot assign ndarray to anything

  def assign(self, ctx, value_expr):
    # Called whenever a = b where a is ourselves. b is in value_expr and
    # is an object that is a descendant of VarGenerator as well
    raise CompileError("Cannot assign directly to ndarray")
    # Or we might have code to copy entire buffer...

# Comment: assign and get cannot both convert to C, should probably
# drop either get or assign...well, it's a scetch anyway... (DagSverreSeljebotn)

  def get_slice(self, ctx, indices):
    # Called when we are the a of a[i, j:k] = b...
    # Indices is a list of tuples, in the example from last line would
    # be [(i, None), (j, k)], where i, j, k are VarGenerators like usual
    for f, t in indices:
      if not t is None: raise CompilerError("Slice notation not supported for typed numpy arrays.")
    if len(indices) != self.nd:
      raise CompilerError("Variabled %s allocated with %d dimensions, but indexed with %d indices" % ....)
    if value.get_type() != self.type: raise CompilerError(...)

    idxexprs = []
    for f, t, idx in zip(indices, len(indices)):
      # Use the get method of the index arguments... these will
      # allocate their own temporaries and insert code for calculating themselves
      # if needed
      e = f.get(ctx, "int")
      if e == None: raise CompilerError("Expression %s cannot be converted to integer index" % ...)
      idxexprs.append("%s * %s" % (e, self.slicevars[idx]))

    idxstring = " + ".join(idxexprs)
    return ctx.expr_c("%s[%s]" % (self.varname, idxstring))

  ... other ways of using a variable...

  def iterated_in_for_loop(self, ctx, itervars, body):
    # Called when we are the a of: for i, j, k in a: ....
    # Itervars is a list of VarGenerators
    # The way to process body is TODO
    # Basically, this should output an optimized loop, and is free
    # to use whatever means necesarry to raise performance (duplicate body in if-branch
    # depending on dimension ordering and so on)
    raise CompilerError("No NumPy direct iteration yet...")

Arguments in favor of the approach

  • Python allows simple syntax like a[2,3:4,:] to mean arbitrarily complex things because of operator overloading, getattr overloading etc.
  • It would be nice to not have a limit for how efficient C code Cython can generate
  • Thus, something like this is needed

(Of course, if applied everywhere this way of thinking becomes ridicilous, it simply then means that all lines of Python code must be able to mean one thing to Python and another to Cython, without there being any other link than what the user "intends" the code to do.

However NumPy is an excellent demonstration of an area were "reimplementing" just a few features like basic item access as a C-compiler rather than Python interpreter can have dramatic result. But we would rather not treat NumPy as a special case... and so...

Problems with the approach

A possible problem with this approach is that it is very little declarative, using instead an imperative approach to the wrapping. My thinking is that as long as this is available as a emegerency fallback, one can always polish the declarative language (ie "cdef class ...") one area after another so that using this framework becomes necesarry in fewer and fewer areas - however, currently a lot of basic support is "missing" or difficult to add (NumPy syntax candy, C++ templates) and this might provide a shortcut to getting there at all.