# XCore Optimisation Guide
   
xformer, the XMOS TFLite compiler, converts regular TFLite files to run on xcore devices. It optimises models as far as it can for both runtime and disk space, but some optimisations only apply if the model meets certain requirements when it is created. If a layer does not meet them, xformer cannot fully optimise it.

In [None]:
import keras
import tensorflow as tf
from keras import layers

## Convolutions

Convolutions can be implemented by different kernels, each with its own runtime and space complexity.

Xformer currently supports four kernels. They are (from slowest to fastest):
1. Reference kernel (default)
2. Padded indirect
3. Valid indirect
4. Valid direct

Each kernel has requirements on the shape of the convolution, such as the number of input channels and the number of filters. Consider padding or rounding these values so that a layer meets the requirements of a faster kernel.
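As a minimal sketch of the rounding step, the helper below rounds a channel or filter count up to the next multiple required by a kernel. The name `round_up_to_multiple` is hypothetical and is not part of xformer or Keras; it only illustrates the arithmetic.

```python
def round_up_to_multiple(value: int, multiple: int) -> int:
    """Return the smallest multiple of `multiple` that is >= `value`.

    Hypothetical helper for illustration; not an xformer or Keras API.
    """
    if value <= 0 or multiple <= 0:
        raise ValueError("value and multiple must be positive integers")
    # Integer ceiling division, then scale back up to a multiple.
    return ((value + multiple - 1) // multiple) * multiple


# A layer planned with 15 filters can be widened to the next multiple of 4.
padded_filters = round_up_to_multiple(15, 4)
```

Rounding up adds extra filters or channels, which changes the model, so the adjustment has to be made before training rather than after conversion.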

### Conv2D

`Conv2D` applies one or more filters across all channels of an input. For each filter, the convolutions of the individual input channels are summed together to produce one output channel. `Conv2D` can therefore map any *n* input channels to any *m* output channels.

#### REQUIREMENTS:
**No Optimisation (reference):**

None

**Padded Indirect:**
(for padding=same)

* Number of input channels is a multiple of 4
* Number of filters is a multiple of 4

**Valid Indirect:**
(for padding=valid)

* Number of input channels is a multiple of 4
* Number of filters is a multiple of 4

**Valid Direct:**
(for padding=valid)

* Number of input channels is a multiple of 32
* Number of filters is a multiple of 16
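The rules above can be written as a small check. This is only a sketch that encodes the requirements as listed in this guide; the function name `fastest_conv2d_kernel` is hypothetical, and whether xformer chooses kernels in exactly this way (or applies further conditions) is an assumption.

```python
def fastest_conv2d_kernel(padding: str, in_channels: int, filters: int) -> str:
    """Name the fastest Conv2D kernel whose listed requirements are met.

    Hypothetical helper: it mirrors the requirement lists in this guide,
    not xformer's actual selection logic.
    """
    padding = padding.lower()
    if padding not in ("valid", "same"):
        raise ValueError("padding must be 'valid' or 'same'")

    if padding == "valid":
        # Check the fastest kernel first, then fall back to the slower one.
        if in_channels % 32 == 0 and filters % 16 == 0:
            return "valid direct"
        if in_channels % 4 == 0 and filters % 4 == 0:
            return "valid indirect"
    elif in_channels % 4 == 0 and filters % 4 == 0:
        return "padded indirect"

    # No optimised kernel applies, so the reference kernel is used.
    return "reference"


kernel = fastest_conv2d_kernel("valid", in_channels=4, filters=16)
```

Checking a planned layer this way before training makes it easier to see which count (input channels or filters) needs to be rounded.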

#### EXAMPLE:


In [None]:
# Unoptimisable
inputs = keras.Input(shape=(28, 28, 4), name="img")
x = layers.MaxPool2D(4, 4)(inputs)
# 15 filters is not a multiple of 4, so only the reference kernel applies
x = layers.Conv2D(filters=15, kernel_size=4, activation="relu")(x)
outputs = layers.GlobalMaxPooling2D()(x)
model = keras.Model(inputs, outputs, name="Unoptimised")
model.summary()

# Optimisable
inputs = keras.Input(shape=(28, 28, 4), name="img")
x = layers.MaxPool2D(4, 4)(inputs)
# Rounding up to 16 filters meets the Valid Indirect requirements
# (4 input channels and 16 filters are both multiples of 4).
# Valid Direct would also need the input channels to be a multiple of 32.
x = layers.Conv2D(filters=16, kernel_size=4, activation="relu")(x)
outputs = layers.GlobalMaxPooling2D()(x)
model = keras.Model(inputs, outputs, name="Optimised")
model.summary()

### DepthwiseConv2D

`DepthwiseConv2D` applies a filter to each channel of an input. Each channel has its own filter and the convolution is calculated independently of other channels. It is therefore *n* input channels to *n* output channels.

*(NB: `DepthwiseConv2D` has a `depth_multiplier` argument, which means that the true number of output channels is n * depth_multiplier.)*
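The channel arithmetic in the note can be sketched as follows. The function name `depthwise_output_channels` is hypothetical and exists only to illustrate the calculation.

```python
def depthwise_output_channels(in_channels: int, depth_multiplier: int = 1) -> int:
    """Number of output channels of a depthwise convolution.

    Hypothetical helper: each input channel produces `depth_multiplier`
    output channels, so the total is their product.
    """
    if in_channels < 1 or depth_multiplier < 1:
        raise ValueError("in_channels and depth_multiplier must be >= 1")
    return in_channels * depth_multiplier


# 4 input channels with depth_multiplier=2 give 8 output channels.
out_channels = depthwise_output_channels(4, depth_multiplier=2)
```

With the default `depth_multiplier` of 1, the output has the same number of channels as the input.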

#### REQUIREMENTS:

The requirements for each optimisation are the same as for a regular `Conv2D`, but because a `DepthwiseConv2D` maps *n* input channels to *dn* output channels (where *d* is the integer value of `depth_multiplier`), the only factor that affects them is the number of input channels.

**No Optimisation (reference):**

None

**Padded Indirect:**
(for padding=same)

* Number of input channels is a multiple of 4

**Valid Direct:**
(for padding=valid)

* Number of input channels is a multiple of 4