## Tutorial 1: Introduction

The objective of this first tutorial is to introduce some basic Devito instructions that are often used when solving a problem implicitly. Most implicit methods require solving an equation iteratively, where an approximate solution is refined at each iteration.

### Breaking Loops

One example of an iterative method is the Newton-Raphson method, a root-finding algorithm that can be used, for example, to approximate the square root of a number $N$ to arbitrary precision. For $N > 0$, the equation $f(x) = x^2 - N = 0$ has two real roots, namely $\sqrt{N}$ and $-\sqrt{N}$. Given an initial approximation $x_0 \approx \sqrt{N}$, Newton's method produces a better approximation through the equation $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$. One way of knowing when to stop iterating is to use the stopping criterion $|x_{n+1} - x_n| < \epsilon$, with $\epsilon$ being a sufficiently small value. This criterion is not sufficient to guarantee that the current approximation is inside an interval of size $\epsilon$ around the real solution, but it is enough for our purposes.
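Before moving to Devito, the iteration and its stopping test can be sketched in plain Python. This is only an illustration of the method described above; the function name `newton_sqrt` and its parameter `max_iter` are hypothetical and do not come from Devito.

```python
def newton_sqrt(N, x0=1.0, epsilon=1e-4, max_iter=50):
    """Approximate sqrt(N) with Newton's method applied to f(x) = x**2 - N.

    Returns the list of all approximations, starting with x0.
    """
    x = float(x0)
    history = [x]
    for _ in range(max_iter):
        # Newton update: x_{n+1} = x_n - f(x_n) / f'(x_n), with f'(x) = 2x
        x_next = x - (x * x - N) / (2.0 * x)
        history.append(x_next)
        # Stop once two successive approximations are closer than epsilon
        if abs(x_next - x) < epsilon:
            break
        x = x_next
    return history


approximations = newton_sqrt(2)
print(approximations)
```

With $N = 2$ and $x_0 = 1$, the loop stops after four updates, far fewer than the iteration cap, which is the behaviour the Devito version reproduces with a loop break.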

The first step is to define the Newton-method parameters.

In [1]:
N = 2 # The number that will have its square root approximated
x0 = 1 # Initial guess
epsilon = 0.0001 # A constant that is used to know when to stop iterating
n_M = 50 # Maximum number of iterations (used to avoid infinite loops)

The next step is to define the Devito variables that will store the approximation at each iteration. A __Dimension__ is defined to represent the iteration index '$n$' in $x(n)$, along with a __TimeFunction__, which is the symbolic representation of the variable $x$ inside Devito. A residual variable is also created to store a value that should represent how far the method is from the solution.

The Devito class used to break loops is called __ConditionalDimension__. It gives Devito a condition that must hold for an equation depending on that dimension to execute. In our case, we want the iteration equation to execute only while the stopping criterion has not been met; in other words, the iteration should run only while $|x_{n+1} - x_n| \geq \epsilon$. Using the __brk__ parameter, we can tell Devito to break out of the iteration as soon as that condition stops being met.

In [2]:
from devito import TimeFunction, Dimension, ConditionalDimension, Eq

n = Dimension(name = 'n')

residual = TimeFunction(name = 'r', shape = (n_M,), dimensions = (n,))
residual.data[0] = 2 * epsilon # Initial residual must be bigger than epsilon for the first iteration to execute

cn = ConditionalDimension(name = 'cn', parent = n, condition = (residual >= epsilon), brk = True)

x = TimeFunction(name = 'x', shape = (n_M,), dimensions = (cn,))
x.data[0] = x0 # Set the initial approximation

Finally, we can define the two equations that represent our problem:

1. The Newton equation: $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$
2. The stopping-criterion residual: $r_{n+1} = |x_{n+1} - x_n|$

In [3]:
f = x ** 2 - N
df = 2 * x

newton = Eq(x.forward, x - f / df)
stop_criteria = Eq(residual.forward, abs(x.forward - x))

print("%s = %s" % (newton.lhs, newton.rhs))

x(cn + h_n) = -(x(cn)**2 - 2)/(2*x(cn)) + x(cn)
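For $f(x) = x^2 - N$, the printed right-hand side simplifies algebraically to the average $\frac{1}{2}\left(x_n + \frac{N}{x_n}\right)$. The sketch below checks this with exact rational arithmetic from Python's standard library. It does not use Devito, and the helper names `newton_update` and `averaged_update` are hypothetical.

```python
from fractions import Fraction


def newton_update(x, N):
    # Same form as the printed expression: x - (x**2 - N) / (2*x)
    return x - (x ** 2 - N) / (2 * x)


def averaged_update(x, N):
    # Simplified form: the average of x and N / x
    return (x + N / x) / 2


N_exact = Fraction(2)
x_exact = Fraction(1)
iterates = [x_exact]
for _ in range(3):
    # Both forms must agree exactly when evaluated on rationals
    assert newton_update(x_exact, N_exact) == averaged_update(x_exact, N_exact)
    x_exact = newton_update(x_exact, N_exact)
    iterates.append(x_exact)

print(iterates)
```

The exact iterates are $1$, $3/2$, $17/12$ and $577/408$, which is a convenient way to cross-check the first few floating-point values produced by the operator.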


The final step is to create and run the __Operator__ responsible for executing the iterations.

In [4]:
from devito import Operator

op = Operator([newton, stop_criteria])

op() # Run the operator

print(f"The approximations for sqrt({N}) are {x.data}")
print(f"The residuals for sqrt({N}) are {residual.data}")

Operator `Kernel` ran in 0.01 s


The approximations for sqrt(2) are [1.        1.5       1.4166666 1.4142157 1.4142135 0.        0.
 0.        0.        0.        0.        0.        0.        0.
 0.        0.        0.        0.        0.        0.        0.
 0.        0.        0.        0.        0.        0.        0.
 0.        0.        0.        0.        0.        0.        0.
 0.        0.        0.        0.        0.        0.        0.
 0.        0.        0.        0.        0.        0.        0.
 0.       ]
The residuals for sqrt(2) are [1.9999999e-04 5.0000000e-01 8.3333336e-02 2.4509407e-03 2.1215624e-06
 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00
 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00
 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00
 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00
 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00
 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000
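The trailing zeros in both arrays are slots that were never written because the loop stopped early. The plain-Python sketch below mimics that behaviour under the assumption that the condition is checked at the start of each iteration. It is not the code Devito generates, and the names `x_vals` and `r_vals` are illustrative. It uses double-precision floats, so the trailing digits differ slightly from the single-precision output above.

```python
N = 2
x0 = 1
epsilon = 0.0001
n_M = 50

# Preallocated storage, zero-initialised like the data arrays shown above
x_vals = [0.0] * n_M
r_vals = [0.0] * n_M
x_vals[0] = float(x0)
r_vals[0] = 2 * epsilon  # Larger than epsilon so the first iteration runs

for n in range(n_M - 1):
    # Assumed placement of the check: leave the loop once the condition fails
    if not (r_vals[n] >= epsilon):
        break
    x_vals[n + 1] = x_vals[n] - (x_vals[n] ** 2 - N) / (2 * x_vals[n])
    r_vals[n + 1] = abs(x_vals[n + 1] - x_vals[n])

# Count the slots that were actually filled before the break
written = sum(1 for v in x_vals if v != 0.0)
print(written, x_vals[written - 1])
```

Only the first five slots are filled; the last nonzero entry is the converged approximation, and everything after it keeps its initial value of zero.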

Looking at the C code generated by the __Operator__, we can see a 'break' instruction inside the iteration loop.

In [5]:
print(op.ccode) # Shows the kernel code

#define _POSIX_C_SOURCE 200809L
#define START_TIMER(S) struct timeval start_ ## S , end_ ## S ; gettimeofday(&start_ ## S , NULL);
#define STOP_TIMER(S,T) gettimeofday(&end_ ## S, NULL); T->S += (double)(end_ ## S .tv_sec-start_ ## S.tv_sec)+(double)(end_ ## S .tv_usec-start_ ## S .tv_usec)/1000000;

#include "stdlib.h"
#include "math.h"
#include "sys/time.h"
#include "xmmintrin.h"
#include "pmmintrin.h"

struct dataobj
{
  void *restrict data;
  int * size;
  int * npsize;
  int * dsize;
  int * hsize;
  int * hofs;
  int * oofs;
} ;

struct profiler
{
  double section0;
} ;


int Kernel(struct dataobj *restrict r_vec, struct dataobj *restrict x_vec, const int n_M, const int n_m, struct profiler * timers)
{
  float (*restrict r) __attribute__ ((aligned (64))) = (float (*)) r_vec->data;
  float (*restrict x) __attribute__ ((aligned (64))) = (float (*)) x_vec->data;

  /* Flush denormal numbers to zero in hardware */
  _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
  _MM_SET_FLUSH_