This tutorial describes one particular intermediate representation used by the Devito Compiler: the Iteration/Expression Tree (IET), a special type of Abstract Syntax Tree (AST).

# Part I - Top Down

First, let's describe a _domain_, and a _function_ that will allow us to specify how such a domain gets modified. In particular, we will look at Functions that change through _time_. 

Thus we need a `Function` object with which we can build a timestepping scheme. For this purpose Devito provides so-called `TimeFunction` objects that encapsulate Functions that are differentiable in space and time, which are derived from basic _SymPy_ Functions. 

In [2]:
from devito import Eq, Grid, TimeFunction, Operator

grid = Grid(shape=(3, 3))
u = TimeFunction(name='u', grid=grid)
u.data

Data([[[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]],

      [[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]]], dtype=float32)

Here, we have just declared a two-dimensional domain with three points along each dimension (_x_ and _y_). The associated `TimeFunction` holds a real value for each point of this 3-by-3 discrete space.

As we can see, the values of the Function are always accessible. At this point, no modifications have been made to it yet. `u.data` gives us quick access to the values held at each node of the domain. `u.data[0, :, :]` holds the values in the grid at the "current" time step, whereas `u.data[1, :, :]` holds the values of `u` at the "current+1" time step.

We can now create an `Operator` that will modify our Function according to differential equations, by means of a computational stencil.
This means that the differential equations are translated into finite differences, which are then used to update the values at each spatial and temporal coordinate.
Such finite differences, or _expressions_, are applied over specific ranges of _iterations_ over the domain.

In [3]:
eq = Eq(u.forward, u+1)
op = Operator(eq)
op.args['expressions']

Eq(u(t + dt, x, y), u(t, x, y) + 1)

For instance, the particular `Eq` object above allows us to say that, at each time step, 1 will be added to every position of the domain.

Let's take a look at the _kernel_ that will be used to compute how this equation alters the domain.

In [4]:
print(op)

#define _POSIX_C_SOURCE 200809L
#include "stdlib.h"
#include "math.h"
#include "sys/time.h"
#include "xmmintrin.h"
#include "pmmintrin.h"

struct profiler
{
  double section0;
} ;


int Kernel(float *restrict u_vec, const int time_M, const int time_m, struct profiler* timers, const int x_M, const int x_m, const int x_size, const int y_M, const int y_m, const int y_size)
{
  float (*restrict u)[x_size + 1 + 1][y_size + 1 + 1] __attribute__((aligned(64))) = (float (*)[x_size + 1 + 1][y_size + 1 + 1]) u_vec;
  /* Flush denormal numbers to zero in hardware */
  _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
  _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
  for (int time = time_m, t0 = (time)%(2), t1 = (time + 1)%(2); time <= time_M; time += 1, t0 = (time)%(2), t1 = (time + 1)%(2))
  {
    struct timeval start_section0, end_section0;
    gettimeofday(&start_section0, NULL);
    for (int x = x_m; x <= x_M; x += 1)
    {
      #pragma omp simd
      for (int y = y_m; y <= y_M; y += 1)
    

Now that we have an `Operator` set up, we are ready to update our Function through the `apply` method. Unless additional parameters are specified, the `Operator` runs on the same data objects used to build it. It is important to stress that the maximum iteration point along the time dimension must be explicitly specified (otherwise, the `Operator` wouldn't know how many iterations to run).

Notice that no modifications have been made to the Function so far. To verify that, query `u.data`.

In [5]:
op.apply(time=2)
u.data

Operator `Kernel` run in 0.00 s


Data([[[2., 2., 2.],
       [2., 2., 2.],
       [2., 2., 2.]],

      [[3., 3., 3.],
       [3., 3., 3.],
       [3., 3., 3.]]], dtype=float32)

This is the first time that we invoke the `apply` method of the `Operator`. Therefore, the kernel that we saw is written to a `.c` file and compiled into a library with the corresponding extension (`.so`, `.dylib`, ...) if `DEVITO_BACKEND` is set to `core`.

Then, as no other key-value parameters are specified, the `Operator` runs with its default arguments, namely `u=u, x_m=0, x_M=2, y_m=0, y_M=2,` and `time_m=0`. The suffixes `m` and `M` stand for the minimum and the maximum iteration points along a given Dimension, respectively. Passing `time=2` thus sets `time_M` to `2` here.
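The effect of the generated loop nest can be reproduced in plain Python. The sketch below is only an illustration, not Devito code: `emulate_kernel` is a hypothetical helper that mirrors the `time`, `x` and `y` loops of the kernel printed above, and it ignores the one-point halo (the `+ 1` offsets in the C code) for simplicity.

```python
def emulate_kernel(u, time_m, time_M, x_m, x_M, y_m, y_M):
    """Pure-Python stand-in for the generated C loop nest (halo omitted)."""
    for time in range(time_m, time_M + 1):   # bounds are inclusive, as in the C code
        t0 = time % 2                        # buffer read at this step
        t1 = (time + 1) % 2                  # buffer written at this step
        for x in range(x_m, x_M + 1):
            for y in range(y_m, y_M + 1):
                u[t1][x][y] = u[t0][x][y] + 1
    return u


# Two time buffers over a 3x3 grid, initialised to zero like `u.data`.
u = [[[0.0] * 3 for _ in range(3)] for _ in range(2)]
emulate_kernel(u, time_m=0, time_M=2, x_m=0, x_M=2, y_m=0, y_M=2)
print(u[0][0], u[1][0])   # first row of each time buffer
```

With `time_M=2` the loop body runs three times, alternating between the two buffers, so buffer 0 ends at 2 and buffer 1 at 3, in agreement with the output above.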

At this point, the same `Operator` can be used for a completely different run. To see that, let's create another `TimeFunction` with the same dimensions as the first one.

In [6]:
u2 = TimeFunction(name='u', grid=grid)
u2.data

Data([[[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]],

      [[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]]], dtype=float32)

Any of the `Operator`'s default arguments may be overridden by passing suitable key-value parameters.


In [7]:
op.apply(u=u2, x_m=1, x_M=2, y_m=0, y_M=1, time_M=3)
u2.data

Operator `Kernel` run in 0.00 s


Data([[[0., 0., 0.],
       [4., 4., 0.],
       [4., 4., 0.]],

      [[0., 0., 0.],
       [3., 3., 0.],
       [3., 3., 0.]]], dtype=float32)

Note, however, that there is no need for recompilation. Just-in-time (JIT) compilation occurs only once, triggered by the first execution.
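The pattern of compiling lazily and then reusing the result can be sketched in a few lines. `ToyOperator` below is a hypothetical class written for this illustration and is not part of Devito; it only counts how often its stand-in "compile" step runs and shows how keyword arguments override the defaults.

```python
class ToyOperator:
    """Hypothetical stand-in illustrating compile-once and argument overrides."""

    def __init__(self, **defaults):
        self.defaults = dict(defaults)
        self.compile_count = 0
        self._kernel = None

    def _compile(self):
        # Stand-in for code generation followed by JIT compilation.
        self.compile_count += 1
        return lambda args: dict(args)   # the "kernel" simply echoes its arguments

    def apply(self, **overrides):
        if self._kernel is None:         # only the first call compiles
            self._kernel = self._compile()
        args = {**self.defaults, **overrides}   # overrides win over defaults
        return self._kernel(args)


op_toy = ToyOperator(x_m=0, x_M=2, y_m=0, y_M=2, time_m=0)
first = op_toy.apply(time_M=2)
second = op_toy.apply(x_m=1, y_M=1, time_M=3)
print(op_toy.compile_count, second["x_m"], second["x_M"])
```

The second call changes only the arguments it names; `x_M` keeps its default, and the compile counter stays at one.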

In summary, an `Operator` takes as input an ordered sequence of `SymPy` equations and performs _indexification_, _substitution_ and _domain-alignment_ to build _lowered equations_. The lowered equations are analyzed to collect information relevant for the `Operator` construction and execution, resulting in `Clusters` of ordered sequences of equations having the same _iteration space_ (`ISpace`). Those `Clusters` go through _symbolic optimization_ by the Devito Symbolic Engine (DSE), which consists of a series of passes, ranging from standard common sub-expression elimination (CSE) to more advanced rewrite procedures. The result is a new ordered sequence of `Clusters` that are lowered to an Iteration/Expression Tree (IET).

An IET is basically an *abstract syntax tree* in which `Iterations` and `Expressions` – two special node types – are the main actors. Equations are wrapped within `Expressions`. Loop nests embedding such expressions are constructed by suitably nesting `Iterations`. Each `Cluster` is eventually placed in its own loop (`Iteration`) nest, although some (outer) loops may be shared by multiple `Clusters`.

In the IET construction pass, two main tasks are carried out: (i) cluster scheduling, namely the translation of a sequence of `Clusters` into an IET and (ii) the analysis of the constructed Iterations, to detect properties such as parallelism and vectorizability.

The constructed IET is analyzed to determine `Iteration` properties such as _sequential_, _parallel_, and _vectorizable_. These properties are attached directly to the nodes in the IET. In particular, the IET is rebuilt with decorated `Iteration` nodes – there is no global state in any of the intermediate representations used in Devito.
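The idea of rebuilding the tree with decorated nodes, rather than mutating shared state, can be illustrated with immutable Python objects. The `ToyIteration` class and the `decorate` helper below are hypothetical and far simpler than Devito's actual IET nodes; the property names are borrowed from the text above.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ToyIteration:
    """Hypothetical immutable loop node: a dimension, properties, and children."""
    dim: str
    properties: Tuple[str, ...] = ()
    nodes: Tuple["ToyIteration", ...] = ()


def decorate(node, found):
    """Return a new tree in which each loop carries the properties listed in `found`."""
    children = tuple(decorate(child, found) for child in node.nodes)
    return replace(node, properties=found.get(node.dim, ()), nodes=children)


plain = ToyIteration("time", nodes=(ToyIteration("x", nodes=(ToyIteration("y"),)),))
analysis = {"time": ("sequential",), "x": ("parallel",), "y": ("parallel", "vector-dim")}
decorated = decorate(plain, analysis)

print(plain.properties, decorated.properties)
print(decorated.nodes[0].nodes[0].properties)
```

The original `plain` tree is left untouched; all information produced by the analysis lives in the new `decorated` tree.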

Here is another way to see the `Operator` defined above in a fashion closer to its IET structure.

In [8]:
from devito import pprint
pprint(op)

<Callable Kernel>
  <List (0, 2, 0)>

    <ArrayCast>
    <List (0, 2, 0)>

      <DenormalsMacro>

        <Element /* Flush denormal numbers to zero in hardware */>
        <Element _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);>
        <Element _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);>

      <List (0, 1, 0)>

        <[affine,sequential,wrappable] Iteration time::time::(time_m, time_M, 1)::(0, 0)>
          <TimedList (2, 1, 2)>
            <C.Statement struct timeval start_section0, end_section0;>
            <C.Statement gettimeofday(&start_section0, NULL);>
            <Section (1)>

              <[affine,parallel] Iteration x::x::(x_m, x_M, 1)::(0, 0)>
                <[affine,parallel,vector-dim] Iteration y::y::(y_m, y_M, 1)::(0, 0)>
                  <ExpressionBundle (1)>

                    <Expression u[t1, x + 1, y + 1] = u[t0, x + 1, y + 1] + 1>


            <C.Statement gettimeofday(&end_section0, NULL);>
            <C.Statement timers->section0 += (double

The `op` object is expressed as the `root` node of a tree. Walking through such a data structure allows us to inspect specific parts of it.

Thus, taking the above kernel as an example, `op` is represented as a `<Callable Kernel>` composed of `_headers`, `_includes` and a `body` (which is a `<List>`, in this example).

In [9]:
op._headers

['#define _POSIX_C_SOURCE 200809L']

In [10]:
op._includes

['stdlib.h', 'math.h', 'sys/time.h', 'xmmintrin.h', 'pmmintrin.h']

In [11]:
op.body

(<List (0, 2, 0)>,)

We can access this `<List>` to observe that it is composed of an `<ArrayCast>` and another `<List>` object. We will discuss these data structures in more detail later on. For now, we can focus on going down through the IET, in order to find the fully discretized expression that Devito generates from the input `SymPy` equation.

The first element of the `<List>` is an `<ArrayCast>`:

In [12]:
print(op.body[0].body[0])

float (*restrict u)[x_size + 1 + 1][y_size + 1 + 1] __attribute__((aligned(64))) = (float (*)[x_size + 1 + 1][y_size + 1 + 1]) u_vec;


The second element of the `<List>` is another `<List>`:

In [13]:
print(op.body[0].body[1])

/* Flush denormal numbers to zero in hardware */
_MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
for (int time = time_m, t0 = (time)%(2), t1 = (time + 1)%(2); time <= time_M; time += 1, t0 = (time)%(2), t1 = (time + 1)%(2))
{
  struct timeval start_section0, end_section0;
  gettimeofday(&start_section0, NULL);
  for (int x = x_m; x <= x_M; x += 1)
  {
    #pragma omp simd
    for (int y = y_m; y <= y_M; y += 1)
    {
      u[t1][x + 1][y + 1] = u[t0][x + 1][y + 1] + 1;
    }
  }
  gettimeofday(&end_section0, NULL);
  timers->section0 += (double)(end_section0.tv_sec-start_section0.tv_sec)+(double)(end_section0.tv_usec-start_section0.tv_usec)/1000000;
}


Now, let's retrieve the specific loop corresponding to the `time` dimension of our domain.

In [14]:
t_iter = op.body[0].body[1].body[1].body[0]
t_iter

<WithProperties[affine,sequential,wrappable]::Iteration time[t0,t1]; (time_m, time_M, 1)>

This is an `Iteration` structure.

In [15]:
print(t_iter)

for (int time = time_m, t0 = (time)%(2), t1 = (time + 1)%(2); time <= time_M; time += 1, t0 = (time)%(2), t1 = (time + 1)%(2))
{
  struct timeval start_section0, end_section0;
  gettimeofday(&start_section0, NULL);
  for (int x = x_m; x <= x_M; x += 1)
  {
    #pragma omp simd
    for (int y = y_m; y <= y_M; y += 1)
    {
      u[t1][x + 1][y + 1] = u[t0][x + 1][y + 1] + 1;
    }
  }
  gettimeofday(&end_section0, NULL);
  timers->section0 += (double)(end_section0.tv_sec-start_section0.tv_sec)+(double)(end_section0.tv_usec-start_section0.tv_usec)/1000000;
}


We can inspect it further, for instance by querying its limits.

In [16]:
t_iter.limits

(time_m, time_M, 1)
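Because the generated loops use an inclusive upper bound (`time <= time_M`), the number of iterations follows directly from the limits. The helper below, `trip_count`, is a hypothetical name used only for this illustration, and it takes plain integers rather than Devito symbols.

```python
def trip_count(lower, upper, step=1):
    """Number of iterations of `for (i = lower; i <= upper; i += step)`."""
    if step <= 0:
        raise ValueError("step must be positive")
    if upper < lower:
        return 0
    return (upper - lower) // step + 1


print(trip_count(0, 2))   # limits of the first run: time_m=0, time_M=2
print(trip_count(0, 3))   # limits of the second run: time_m=0, time_M=3
```

The two runs shown earlier therefore executed three and four time iterations, respectively.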

As we keep descending through the IET, we reach the expression enclosed by the loop nest.

In [17]:
expr = t_iter.nodes[0].body[0].body[0].nodes[0].nodes[0].body[0]
expr.view

'<Expression u[t1, x + 1, y + 1] = u[t0, x + 1, y + 1] + 1>'

The compiler provides several IET visitors. One of these, `FindNodes`, can be used to
retrieve all nodes of a particular type within a given subtree. For example, we could retrieve
all `Expression` objects within an `Operator`.

In [18]:
from devito.ir.iet import Expression, FindNodes
exprs = FindNodes(Expression).visit(op)
print(exprs[0].view)

<Expression u[t1, x + 1, y + 1] = u[t0, x + 1, y + 1] + 1>
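To give a feel for how such a visitor works, here is a minimal sketch of a type-based search over a toy tree. The classes `Node`, `Loop` and `Stmt` and the function `find_nodes` are hypothetical and unrelated to Devito's implementation; the traversal is a plain depth-first walk.

```python
class Node:
    """Hypothetical tree node holding a tuple of children."""
    def __init__(self, *children):
        self.children = children


class Loop(Node):
    def __init__(self, dim, *children):
        super().__init__(*children)
        self.dim = dim


class Stmt(Node):
    def __init__(self, text):
        super().__init__()
        self.text = text


def find_nodes(root, node_type):
    """Collect, in depth-first order, every node that is an instance of `node_type`."""
    found = [root] if isinstance(root, node_type) else []
    for child in root.children:
        found.extend(find_nodes(child, node_type))
    return found


tree = Node(Stmt("cast"), Loop("time", Loop("x", Loop("y", Stmt("u[t1] = u[t0] + 1")))))
print([loop.dim for loop in find_nodes(tree, Loop)])
print([stmt.text for stmt in find_nodes(tree, Stmt)])
```

Compared with the long chain of `.body[...]` and `.nodes[...]` accesses used earlier, a search by type does not depend on the exact shape of the tree.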
