# Embedding a sequence of CNN layers in one Operator

In this example, we will create a simple 2D convolutional neural network (CNN) and execute a forward pass through it using Devito.

First, let's import the prerequisites:

In [1]:
import devito.ml
from devito import Operator
from sympy import Max

The CNN will have the following layers:

1. Max pooling on a 4x4 input with a 2x2 kernel, 1x1 stride, no padding and no bias (producing a 3x3 output)
2. Convolution on the 3x3 pooling output with a 2x2 kernel, 1x1 stride, no padding and no bias (producing a 2x2 output)
3. Flattening (this turns the 2x2 matrix into a 4-element vector)
4. Fully connected layer on the 4-element vector with a 2x4 weight matrix and softmax as the activation function
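
The sizes in this list follow from the usual no-padding rule, `output = (input - kernel) // stride + 1` per dimension. The snippet below is a minimal plain-Python sketch of that arithmetic; `valid_output_size` is a hypothetical helper written for this illustration and is not part of Devito.

```python
def valid_output_size(input_size, kernel_size, stride=(1, 1)):
    """Output shape of a sliding-window layer with no padding."""
    return tuple((i - k) // s + 1
                 for i, k, s in zip(input_size, kernel_size, stride))

pooled = valid_output_size((4, 4), (2, 2))     # layer 1: max pooling
convolved = valid_output_size(pooled, (2, 2))  # layer 2: convolution
flat_length = convolved[0] * convolved[1]      # layer 3: flattening
weight_size = (2, flat_length)                 # layer 4: one weight row per output

print(pooled, convolved, flat_length, weight_size)
# prints: (3, 3) (2, 2) 4 (2, 4)
```

Each layer's `input_size` in the declarations that follow is the output size of the layer before it.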

Every layer is standalone and can be used in isolation (we could have one `Operator` per layer if we wished). Since we want a single `Operator` for the whole network, we'll pass `generate_code=False` when declaring the layers so that Devito does not generate any C code at this stage.

In [2]:
layer1 = devito.ml.Subsampling(kernel_size=(2, 2), input_size=(4, 4), function=lambda l: Max(*l),
                               generate_code=False)
layer2 = devito.ml.Conv(kernel_size=(2, 2), input_size=(3, 3), generate_code=False)
layer3 = devito.ml.Flat(input_size=(2, 2), generate_code=False)
layer4 = devito.ml.FullyConnectedSoftmax(weight_size=(2, 4), input_size=4, generate_code=False)

Every layer has an `equations()` method which returns a list of equations that can be supplied to an `Operator` in Devito. The method accepts an `input_function` argument, which lets a layer read its input from the `result` of the previous layer, so the equation lists of several layers can be concatenated into one chain. We'll use it to create **one** `Operator` running a forward pass through our CNN.

In [3]:
equations = (layer1.equations()
             + layer2.equations(input_function=layer1.result)
             + layer3.equations(input_function=layer2.result)
             + layer4.equations(input_function=layer3.result))
op = Operator(equations)
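
For longer networks, the same chaining pattern can be written as a loop. The sketch below is one possible way to do it: `chain_equations` is a hypothetical helper, not a Devito function, and it assumes only what the cell above shows, namely that each layer exposes `equations(input_function=...)` returning a list and a `result` attribute.

```python
def chain_equations(layers):
    """Concatenate the equations of `layers`, feeding each layer the
    `result` of the layer before it."""
    if not layers:
        return []
    # The first layer reads from its own `input`, so no argument is passed.
    equations = list(layers[0].equations())
    for previous, layer in zip(layers, layers[1:]):
        equations += layer.equations(input_function=previous.result)
    return equations
```

Under those assumptions, `Operator(chain_equations([layer1, layer2, layer3, layer4]))` would be an alternative to the manual concatenation, used in its place rather than in addition to it.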

Now, let's load sample data into the layers through their `input` and `kernel` properties.

* `layer1.input` represents input data for the CNN.
* `layer2.kernel` represents a convolutional filter.
* `layer4.kernel` represents a weight matrix.

In [4]:
layer1.input.data[:] = [[5, 7, 8, 0],
                        [-1, -2, -3, 10],
                        [1, 2, 3, 4],
                        [11, 12, 9, 9]]
layer2.kernel.data[:] = [[1, -1],
                         [-1, 1]]
layer4.kernel.data[:] = [[1, 1, 1, 0.5],
                         [1, 1, 1, 0]]

Once all the data are added, we're ready to run the `Operator`.

In [5]:
op.apply()

Operator `Kernel` run in 0.01 s


PerformanceSummary([(PerfKey(name='section0', rank=None),
                     PerfEntry(time=1e-06, gflopss=0.0, gpointss=0.0, oi=0.0, ops=0, itershapes=[])),
                    (PerfKey(name='section1', rank=None),
                     PerfEntry(time=1e-06, gflopss=0.0, gpointss=0.0, oi=0.0, ops=0, itershapes=[])),
                    (PerfKey(name='section2', rank=None),
                     PerfEntry(time=1e-06, gflopss=0.0, gpointss=0.0, oi=0.0, ops=0, itershapes=[])),
                    (PerfKey(name='section3', rank=None),
                     PerfEntry(time=2e-06, gflopss=0.0, gpointss=0.0, oi=0.0, ops=0, itershapes=[])),
                    (PerfKey(name='section4', rank=None),
                     PerfEntry(time=2e-06, gflopss=0.0, gpointss=0.0, oi=0.0, ops=0, itershapes=[]))])

The results can be obtained through the `result` property of the final layer.

In [6]:
print(layer4.result.data)

[0.00669285 0.9933071 ]
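
As a sanity check, the same forward pass can be reproduced by hand. The following is a plain-Python sketch, independent of Devito, that assumes zero biases, a stride of 1 with no padding, and a fully connected layer that multiplies the weight matrix by the flattened vector before applying softmax. The helper names are illustrative only. Because the sample kernel is unchanged by a 180-degree rotation, this check cannot tell whether the convolution layer flips its kernel.

```python
import math

def max_pool(image, size=2):
    """Max pooling with a size x size window, stride 1, no padding."""
    rows = len(image) - size + 1
    cols = len(image[0]) - size + 1
    return [[max(image[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(cols)] for r in range(rows)]

def correlate(image, kernel):
    """Sliding-window sum of products, stride 1, no padding, no kernel flip."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(cols)] for r in range(rows)]

def softmax(values):
    """Softmax, shifted by the maximum for numerical stability."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

data = [[5, 7, 8, 0], [-1, -2, -3, 10], [1, 2, 3, 4], [11, 12, 9, 9]]
conv_kernel = [[1, -1], [-1, 1]]
weights = [[1, 1, 1, 0.5], [1, 1, 1, 0]]

pooled = max_pool(data)                       # [[7, 8, 10], [2, 3, 10], [12, 12, 9]]
convolved = correlate(pooled, conv_kernel)    # [[0, 5], [-1, -10]]
flat = [v for row in convolved for v in row]  # [0, 5, -1, -10]
scores = [sum(w * x for w, x in zip(row, flat)) for row in weights]  # [-1.0, 4]
probabilities = softmax(scores)

print([round(p, 6) for p in probabilities])
# prints: [0.006693, 0.993307]
```

These values agree with the output of the `Operator` above up to single-precision rounding.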


For reference purposes, here's the C code generated by our `Operator`:

In [7]:
print(op)

#define _POSIX_C_SOURCE 200809L
#include "stdlib.h"
#include "math.h"
#include "sys/time.h"
#include "xmmintrin.h"
#include "pmmintrin.h"

struct dataobj
{
  void *restrict data;
  int * size;
  int * npsize;
  int * dsize;
  int * hsize;
  int * hofs;
  int * oofs;
} ;

struct profiler
{
  double section0;
  double section1;
  double section2;
  double section3;
  double section4;
} ;


int Kernel(const float f0, struct dataobj *restrict f1_vec, struct dataobj *restrict f10_vec, struct dataobj *restrict f12_vec, struct dataobj *restrict f13_vec, struct dataobj *restrict f2_vec, const float f3, struct dataobj *restrict f4_vec, struct dataobj *restrict f6_vec, struct dataobj *restrict f8_vec, const float f9, const int d0_M, const int d0_m, const int d13_M, const int d13_m, const int d14_M, const int d14_m, const int d1_M, const int d1_m, const int d2_M, const int d2_m, const int d3_M, const int d3_m, const int d8_M, const int d8_m, const int d9_M, const int d9_m, struct profiler * timer