# Model V27

After further revisions, we settled on a model that uses the same innovations as Xception and MobileNet, both published by Google in 2017, to improve model performance while using fewer parameters.

The author of the Xception paper, François Chollet, describes some of these innovations in his book [*Deep Learning with Python, Second Edition*](https://books.google.ca/books/about/Deep_Learning_with_Python_Second_Edition.html?id=XHpKEAAAQBAJ&redir_esc=y). The two key features that were useful in our model were **batch normalization** and **separable convolution layers**.

Batch normalization is a layer that normalizes the output of the previous layer using the mean and variance of the current batch. Because these statistics are recomputed for every batch, the inputs seen by the following layers stay in a consistent range even as the weights of the earlier layers change. From a practical standpoint, batch normalization helps backpropagation by keeping weight adjustments well scaled after each batch and preventing the gradient from vanishing as it propagates through the neural network {cite:p}`chollet2021deep`.
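
The normalization step can be illustrated with a minimal sketch in plain Python. It handles a single feature over one batch in training mode; the names `batch_norm`, `gamma`, `beta`, and `eps` are illustrative assumptions and are not taken from our model code. The Keras `BatchNormalization` layer additionally learns the scale and shift per channel and uses moving averages of the statistics at inference time.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-3):
    """Normalize one feature across a batch, then scale and shift it."""
    mean = sum(batch) / len(batch)
    # Biased (population) variance of the batch
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    # eps avoids division by zero when the batch has no spread
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

activations = [2.0, 4.0, 6.0, 8.0]  # hypothetical activations of one neuron
normalized = batch_norm(activations)

norm_mean = sum(normalized) / len(normalized)
norm_var = sum((x - norm_mean) ** 2 for x in normalized) / len(normalized)
print(normalized)
print(norm_mean, norm_var)
```

With the default `gamma` and `beta`, the output has a mean of approximately zero and a variance just below one, regardless of the scale of the incoming activations.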

Separable convolution layers are a special kind of layer that separates the channels of the input data and performs an independent spatial convolution on each channel, then mixes the results across channels with a pointwise (1×1) convolution. Chollet explains why separable convolution layers are effective in his book:

> In much the same way that convolution relies on the assumption that the patterns in images are not tied to specific locations, depthwise separable convolution relies on the assumption that spatial locations in intermediate activations are highly correlated, but different channels are highly independent. Because this assumption is generally true for the image representations learned by deep neural networks, it serves as a useful prior that helps the model make more efficient use of its training data {cite:yearpar}`chollet2021deep`.

Through this stronger divide-and-conquer approach to convolution, a separable convolution layer needs fewer parameters than a regular convolution layer with the same kernel size and number of filters, which makes the model more efficient.
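
As a rough illustration of the savings, the sketch below counts the weights of both layer types without bias terms. The helper names are hypothetical, and the separable count assumes one depthwise filter per input channel (a depth multiplier of 1). The example numbers correspond to a 3×3 layer that maps 8 input channels to 16 filters.

```python
def regular_conv_params(kernel, in_channels, filters):
    """Weights in a regular 2D convolution without bias."""
    return kernel * kernel * in_channels * filters

def separable_conv_params(kernel, in_channels, filters):
    """Weights in a depthwise separable 2D convolution without bias."""
    depthwise = kernel * kernel * in_channels  # one spatial filter per channel
    pointwise = in_channels * filters          # 1x1 convolution mixing channels
    return depthwise + pointwise

regular = regular_conv_params(3, 8, 16)
separable = separable_conv_params(3, 8, 16)
print(regular, separable)  # 1152 200
```

In this case the separable layer uses fewer than a fifth of the weights of the regular layer, and the gap widens as the number of channels grows.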

In [None]:
from keras import layers, models, optimizers

conv_params = {'kernel_size': (3,3), 'activation': 'relu', 'padding': 'same',
               'use_bias': False}

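def conv_block(filters, thick, conv_params):
    """Return `thick` separable convolutions, then batch norm and 2x2 max pooling."""
    layer_list = []
    # Stack `thick` separable convolutions with the same number of filters
    for _ in range(thick):
        layer_list.append(layers.SeparableConv2D(filters, **conv_params))
    # Normalize the activations, then halve the spatial resolution
    layer_list.append(layers.BatchNormalization())
    layer_list.append(layers.MaxPool2D((2,2)))
    return layer_list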

model = models.Sequential([
    layers.Input(shape=(66, 100, 1)),

    *conv_block(8, 2, conv_params),
    *conv_block(16, 2, conv_params),
    *conv_block(32, 2, conv_params),
    *conv_block(64, 2, conv_params),
    *conv_block(128, 2, conv_params),
    
    layers.Dropout(0.5),
    layers.GlobalAvgPool2D(),

    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(16, activation='relu'),
    layers.Dense(8, activation='relu'),
    
    layers.Dense(1, activation='linear')  # Output raw angle
])

optimizer = optimizers.Adam(0.002)
model.compile(optimizer=optimizer, loss='mse')
model.summary()

Training runs for 150 epochs and again uses Adam, but with a learning rate of 0.002, together with the `ReduceLROnPlateau` callback, which reduces the learning rate when the validation loss stagnates.
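
The behaviour of such a plateau-based schedule can be sketched in plain Python. This is a simplified illustration of the idea, not the Keras implementation: the `factor`, `patience`, and `min_lr` values and the validation losses are assumptions chosen for the example, since the settings used in training are not stated here.

```python
def reduce_lr_on_plateau(val_losses, lr=0.002, factor=0.5, patience=3, min_lr=1e-5):
    """Return the learning rate in effect after each epoch's validation loss."""
    best = float("inf")
    wait = 0  # epochs since the validation loss last improved
    schedule = []
    for loss in val_losses:
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                # Loss has stagnated: shrink the learning rate, respecting the floor
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule

losses = [0.50, 0.40, 0.35, 0.36, 0.35, 0.37, 0.30]  # hypothetical validation losses
schedule = reduce_lr_on_plateau(losses)
print(schedule)
```

In this run the loss stops improving after the third epoch, so the learning rate is halved from 0.002 to 0.001 once three epochs have passed without a new best value.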