Skip to content
This repository


Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

## page was renamed from enhancements/transforms

Enhancement proposal: Metaprogramming support

. The thoughtful use of macros leads to programs which are marvels of clarity and elegance.
. — Paul Graham, On Lisp
  • Author: MartinCMartin
  • Status: Basic version implemented, many details to be worked out. It is still unclear if such structures should be added to Cython (distancing it from Python) or as part of a hypothetical Cython+.


The rise of C++ template metaprogramming has brought widespread acceptance of something Lisp programmers have known for a long time: the usefulness of arbitrary computation at compile time that generates code. However, C++ template metaprogramming is a pure functional programming language with syntax and semantics vastly different from C++ itself. In contrast, this proposal allows arbitrary Python code to be run at compile time that produces Cython code.


This proposal adds a new type of function, called a transform, using the keyword deftrans. This function is executed at compile time. Rather than receiving the values of its arguments, as a normal function does, it receives the parse trees of the arguments. The return value should also be a parse tree, which the compiler inserts in place of the call to the transform.


The current proof-of-concept implementation can properly handle these examples.

1. Symbolic Differentiation

The following is a transform implements the sum and product rules of calculus, along with two base cases:

deftrans deriv (myexpr, var):
   if isinstance(myexpr, ExprNodes.AddNode):
      # (f + g)' = f' + g'
      return ExprNodes.AddNode(myexpr.pos,
        operator = '+',
        operand1 = deriv(myexpr.operand1, var),
        operand2 = deriv(myexpr.operand2, var))
   elif isinstance(myexpr, ExprNodes.MulNode):
      #(f * g)' = f * g' + f' * g
      return ExprNodes.AddNode(myexpr.pos,
         operator = '+',
         operand1 = ExprNodes.MulNode(myexpr.pos,
            operand1 = myexpr.operand1,
            operand2 = deriv(myexpr.operand2, var)),
         operand2 = ExprNodes.MulNode(myexpr.pos,
            operator = '*',
            operand1 = deriv(myexpr.operand1, var),
            operand2 = myexpr.operand2))
   elif isinstance(myexpr, ExprNodes.NameNode):
      if ==
         return ExprNodes.IntNode(myexpr.pos, value = '1')
         return ExprNodes.IntNode(myexpr.pos, value = '0')
   elif isinstance(myexpr, ExprNodes.ConstNode):
      return ExprNodes.IntNode(myexpr.pos, value = '0')

When used like this:

def eggs(a, b):
   return deriv(5*a+b, a)

it is translated at compile time to:

def eggs(a, b):
   return (((5 * 1) + (0 * a)) + 0)

2. Symbolic Differentiation with Templates

Generating nodes by explicitly calling the constructor of Node subclasses is verbose and tedious, and therefore error prone. All existing metaprogramming facilities that I'm aware of (C++, Common Lisp, Scheme, Dylan, R, Boo) allow the programmer to write a template where expressions can be substituted.

The proof-of-concept implementation has a simple template mechanism: each node is represented as a list, where the first element represents the node type, and the remaining ones the children. A better idea is the one proposed by Boo, but the one here is easier to implement, so its used in the proof-of-concept implementation. Using this mechanism, the above transform can be written:

deftrans deriv2 (myexpr, var):
   import Cython.Compiler.ExprNodes as ExprNodes
   if isinstance(myexpr, ExprNodes.AddNode):
      # (f + g)' = f' + g'
      return ['+', deriv2(myexpr.operand1, var), deriv2(myexpr.operand2, var)]
   elif isinstance(myexpr, ExprNodes.MulNode):
      #(f * g)' = f * g' + f' * g
      return ['+',
         ['*', myexpr.operand1, deriv2(myexpr.operand2, var)],
         ['*', deriv2(myexpr.operand1, var), myexpr.operand2]]
   elif isinstance(myexpr, ExprNodes.NameNode):
      if ==
         return '1'
         return '0'
   elif isinstance(myexpr, ExprNodes.ConstNode):
      return '0'

3. Newton's Method

Given a way to compute a function and its derivative, Newton's method is a way to numerically approximate a root. We can use the above symbolic differentiator to help us.

First, we define the core Newton's Method function. This is a standard Cython function that doesn't use any metaprogramming. It takes a function and an initial estimate of the root. The function that's passed in must return a tuple of length 2: the value of the function, and the value of its derivative.

def newtons(f, x):
   while True:
      (val, deriv) = f(x)
      print "f(", x, ") =", val, ", f'(", x, ") =", deriv
      if abs(val) < 1e-10:
         return x
      x -= val / deriv

To produce a suitable function, we can use our earlier deftrans, with added support for subtraction and exponentiation by a constant (not shown):

deftrans exprAndDeriv(expr, var):
   return ['tuple', expr, deriv2(expr, var)]

We can then define a function:

def myfunct(x):
   return exprAndDeriv(5*x**3 - 10, x)

which is transformed to:

def myfunct(x):
   return (((5 * (x ** 3)) - 10), (((5 * ((3 * (x ** (3 - 1))) * 1)) + (0 * (x ** 3))) - 0))

If we Cython this and load it into a Python interpreter, we get:

>>> import TestTrans
>>> TestTrans.myfunct(10.0)
(4990.0, 1500.0)
>>> TestTrans.newtons(TestTrans.myfunct, 10.0)
f( 10.0 ) = 4990.0 , f'( 10.0 ) = 1500.0
f( 6.67333333333 ) = 1475.93037185 , f'( 6.67333333333 ) = 668.000666667
f( 4.46385893383 ) = 434.735082042 , f'( 4.46385893383 ) = 298.890548717
f( 3.00936301916 ) = 126.267956666 , f'( 3.00936301916 ) = 135.843986716
f( 2.07985587115 ) = 34.9852072625 , f'( 2.07985587115 ) = 64.8870066716
f( 1.54068464016 ) = 8.28568621842 , f'( 1.54068464016 ) = 35.6056374064
f( 1.3079774954 ) = 1.18847303518 , f'( 1.3079774954 ) = 25.6620769269
f( 1.26166506953 ) = 0.0415843883547 , f'( 1.26166506953 ) = 23.8769812152
f( 1.25992345957 ) = 5.73769235945e-05 , f'( 1.25992345957 ) = 23.8111068596
f( 1.2599210499 ) = 1.09734443754e-10 , f'( 1.2599210499 ) = 23.8110157797
f( 1.25992104989 ) = 0.0 , f'( 1.25992104989 ) = 23.8110157795

4. Indexing an arbitrary-dimension array

Metaprogramming has many applications to speeding execution, by doing computation at compile time that would normally be done at run time. This example performs compile time loop unrolling for a hypothetical arbitrary-dimension array:

deftrans at(array, *indexes):
   expr = indexes[0]
   for i in xrange(1, len(indexes)):
      expr = ['+', ['*', expr, ['strides', array, str(i)]], indexes[i]]
   return ['array_get', array, expr]


def array_example(myarray, i, j):
    return at(myarray, i, j, 55)

is transformed to

def array_example(myarray, i, j):
   return array_get(myarray, ((((i * strides(myarray, 1)) + j) * strides(myarray, 2)) + 55))

Use Cases

. Complex programs may be written as a series of layers, each one acting as a programming language for the one above.
. — Paul Graham, On Lisp

The various Lisp communities have developed a number of use cases. No doubt, if this proposal is accepted, many more will develop that we can't currently forsee. What follows is a list of existing use cases drawn from the Paul Graham book On Lisp, the standard libraries of the programming language R., as well as from the 400,000 line Lisp program QPX by ITA Software.

1. Parsing expressions

  • Symbolic differentiation, as in examples 1 and 2.
  • Expressions as SQL queries. It should be possible to have an object representing a table in a database, and have a "select" method, like this:

. > '1969-1-1' and firstname == lastname) . and have a transform turn it into a SQL query. Groovy has been able to do this since Groovy 1.0. * Regression formulas. In R, the formula y ~ A + B + C means "estimate y as a linear function of A, B, and C, including an intercept." The formula y ~ (A + B + C)^2 means "estimate y as a linear function of A, B, C, A*B, A*C and B*C including an intercept." There are many variations.

2. Alternate representations of classes

  • In C & C++, class fields are laid out in the order they appear in the source code. However, most programmers lay out their source code in whatever order they consider clearest, e.g. listing all public ones then all private ones. This can potentially waste a lot of space in padding when different sized fields are mixed. It might be useful to have a transform, call it "class_pack," which simply reorders the fields to something more compact.
  • C's bit fields are one of the least specified parts of the language. For example,

. {{{#!cplusplus numbers=disable

struct Bits {
int a:3; int b:8;


. may put 'a' in the most significant or least significant byte, depending on the compiler. Even if Cython were to support this syntax, it might be useful to have a transform, call it "class_bits", which takes a normal class definition and transforms it to one that packs and unpacks the bits from ints itself.

3. Moving computation from runtime to compile time

  • Indexing into a multi-dimension array, as in example 4.
  • Creating a custom hash function given a list of arguments and their sizes in bits.
  • Given a set of values that can be computed, the functions that compute them, and the values needed by those functions, a transform could write a function which takes a value and calls just the needed functions in the necessary order to compute them. QPX's XML output works that way: at runtime, it looks at the requested output elements, and does the minimum amount of computation needed to provide it.

4. Simplifying syntax

If there's some idiom you find yourself using, rather than typing it out over and over again, you can create a transform to insert it for you.
  • An iterator which binds isFirst, isLast and index as well as the value.
  • when_debugging: a transform which takes a string and a block of code, and only compiles the code when the string is in a given list. Used to not even compile debugging code when its not being used. For example:

. {{{#!python numbers=no

@compiletime names_to_ids = {}

# Exists during compile time, then final value is used as initializer # in the C source, so it's available at runtime. @compileAndRuntime num_ids = 0

# Only exists at runtime, so initializer runs on program startup. beingDebugged = [False] * num_ids # Ideally this would be a C array.

def getid(name):
if name in names_to_ids:
return names_to_ids[name]
id = num_ids names_to_ids[name] = id num_ids += 1 return id
deftrans when_debugging(name, *args):
id = getid(name) return "if beingDebugged[$id]: when_debugging_internal($id, $name, *$args)"
deftrans start_debugging(name):
return "beingDebugged[$getid(name)] = True"
deftrans stop_debugging(name):
return "beingDebugged[$getid(name)] = False"

This assumes some features I haven't described yet, like the ability for compile time variables to be saved then restored at runtime, and the ability to return statements when a transform is called as the top level of a statement expression.

5. Other optimization

  • A transform that's given a number of different ways to compute the same thing. When compiled in one mode, it inserts code to collect statistics. When compiled in a different mode, it uses those statistics to pick the fastest method of computation.

To be continued.

Design Issues

1. Template format

I think the best way to go is the way proposed by Boo.

2. How to make this as natural as possible by avoiding explicit trees as much as possible

Related to template format, and perhaps the solution to "evaluating arguments more than once," if we go for "using the arg inserts its value; only if we do unevaluated(arg) do we get the tree" or something like that.

3. How to implement

  • Parsing (want to capture transforms as "raw" strings; may want the ability to have new identifiers parsed as class definers, function definers, etc.)

  • When, where and in what context to evaluate the transform definitions.

  • Keep expanding transforms until there are no more transform calls left.

  • Objects created at compile time: how do we make them available at run time?
    • Have an explicit "variables-to-restore"?

I'm sure more will come up as I read through On Lisp.

4. How will this work with method?

The examples are global functions. How does compile time dispatch work with methods in a dynamic language, where the type of the object isn't known until runtime? Do we only allow this for typed variables?

5. Whether/how to unify compiler macros and SBCL's transforms with this enhancement

6. When you want to insert statements inside an expression

7. Defining new statements, function definitions, class definitions, etc.

8. Variable capture and hygienic macros

9. Evaluating arguments more than once

Related Systems

Sources of inspiration for design decisions can come from the macro systems in various Lisps, namely Common Lisp, Scheme, and Dylan. Scheme and Dylan macros, like C++ templates, are purely substitution based. Thus there are no loops; loops must be implemented using recursion.

Common Lisp has two "gotchas" that catch both those new to macros and experts alike. The first is variable capture. The second is evaluating arguments more than once.

Scheme solves the variable capture problem with what it calls "hygienic macros." Identifiers that are passed in as part of the arguments are resolved in the lexical environment where the macro is called & expanded. Variables generated in the macro itself are resolved in the lexical environment where the macro was defined.

The R programming language doesn't have macros per say, but all functions get their arguments unevaluated.

Something went wrong with that request. Please try again.