# Basic imports and utils

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import layer
from neurons import RBF
from typing import List
import loss

## Constants

# RBF


We choose a Gaussian activation function for an RBF neuron, together with its derivatives with respect to the center $c$ and the input $x$: $$y(x)=e^{-\frac{||x - c||^2}{2\sigma^2}}$$
$$\frac d{dc}y(x)=\frac{x - c}{\sigma^2}e^{-\frac{||x - c||^2}{2\sigma^2}}$$
$$\frac d{dx}y(x)=-\frac{x - c}{\sigma^2}e^{-\frac{||x - c||^2}{2\sigma^2}}$$

The standard deviation $\sigma$ is treated as a fixed hyperparameter.
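
As a sanity check on the activation and its derivatives, the following minimal sketch evaluates them with NumPy and compares the analytic gradients against central finite differences. Differentiating the exponent gives the factor $(x - c)/\sigma^2$ for the center and its negative for the input, and the numeric check confirms those signs. The function names `rbf`, `rbf_grad_c`, `rbf_grad_x` and `numeric_grad` are hypothetical and are not taken from the `neurons` module.

```python
import numpy as np


def rbf(x, c, sigma):
    """Gaussian RBF activation y(x) = exp(-||x - c||^2 / (2 sigma^2))."""
    diff = x - c
    return np.exp(-np.sum(diff ** 2) / (2.0 * sigma ** 2))


def rbf_grad_c(x, c, sigma):
    """Analytic gradient of y with respect to the center c."""
    return (x - c) / sigma ** 2 * rbf(x, c, sigma)


def rbf_grad_x(x, c, sigma):
    """Analytic gradient of y with respect to the input x."""
    return -(x - c) / sigma ** 2 * rbf(x, c, sigma)


def numeric_grad(f, v, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at v."""
    grad = np.zeros_like(v)
    for idx in np.ndindex(v.shape):
        step = np.zeros_like(v)
        step[idx] = eps
        grad[idx] = (f(v + step) - f(v - step)) / (2.0 * eps)
    return grad


# Illustrative values only.
x = np.array([0.2, 0.7, 0.1])
c = np.array([0.5, 0.5, 0.5])
sigma = 0.8

num_c = numeric_grad(lambda v: rbf(x, v, sigma), c)
num_x = numeric_grad(lambda v: rbf(v, c, sigma), x)

print(np.allclose(rbf_grad_c(x, c, sigma), num_c, atol=1e-6))
print(np.allclose(rbf_grad_x(x, c, sigma), num_x, atol=1e-6))
```

The two gradients differ only in sign, so moving the center toward the input has the same effect on the activation as moving the input toward the center.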
## Definition of an RBF in this module
To use NumPy's broadcasting as much as possible, we design the functions so that they can update multiple neurons at once. To that end, the data structures for our neurons are laid out in a data-oriented way. The only restriction this places on the neurons is that all of them must have the same input dimensions.

Since we are going to work on images, and an image is a $(w, h, 3)$ float or integer array, an RBF neuron needs to define the following:

c: A $(k_w, k_h, k_c)$ array representing the center of this neuron.

x: A subimage with the same dimensions as c, supplied by the calling function.
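
A minimal sketch of this data-oriented layout, assuming the centers of all neurons are stacked along a leading axis into one `(n, k_w, k_h, k_c)` array so that a single broadcast evaluates every neuron on the same subimage. The name `rbf_batch` and the stacking convention are assumptions for illustration, not the layout used by the `neurons` module.

```python
import numpy as np


def rbf_batch(x, centers, sigma):
    """Evaluate n RBF neurons on one subimage.

    x:       (k_w, k_h, k_c) subimage
    centers: (n, k_w, k_h, k_c) stacked centers, one per neuron
    returns: (n,) activations
    """
    # Broadcasting subtracts the same subimage from every center.
    diff = x[np.newaxis, ...] - centers
    # Squared Euclidean distance per neuron, summed over all kernel axes.
    sq_dist = np.sum(diff ** 2, axis=(1, 2, 3))
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


rng = np.random.default_rng(0)
centers = rng.uniform(0.0, 1.0, size=(4, 3, 3, 3))

# Use the center of neuron 2 as the input, so that neuron responds maximally.
patch = centers[2].copy()
activations = rbf_batch(patch, centers, sigma=1.0)
print(activations.shape)
```

Because every neuron shares the input shape, no Python loop over neurons is needed; this is the reason for requiring identical input dimensions.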

# Layer

In a classic convolutional layer, it is common for kernels to overlap.

Since we are learning using gradient descent, we need to propagate the error back through the overlapping regions and take into account the error contributed by each kernel position.

In our case this means that
$$ y_{i,j}=e^{-\frac{\sum_k\sum_l (x_{i+k,j+l} - c_{k,l})^2}{2\sigma^2}} $$
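
The formula above can be sketched for a single kernel as follows, under several assumptions: stride 1, no padding, and the squared differences also summed over the channel axis. The backward helper shows how the gradient with respect to the input is accumulated across overlapping kernel positions. The names `rbf_conv_forward` and `rbf_conv_input_grad` are hypothetical; this is not the implementation of `layer.ConvolutionalLayer`. `sliding_window_view` requires NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rbf_conv_forward(image, center, sigma):
    """y[i, j] = exp(-sum((image[i:i+k_w, j:j+k_h, :] - center)^2) / (2 sigma^2))."""
    # Shape (out_w, out_h, 1, k_w, k_h, k_c); drop the singleton channel-window axis.
    windows = sliding_window_view(image, center.shape)[:, :, 0]
    sq_dist = np.sum((windows - center) ** 2, axis=(2, 3, 4))
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def rbf_conv_input_grad(image, center, sigma, grad_out):
    """Accumulate dL/d(image) from every kernel position (overlaps add up)."""
    y = rbf_conv_forward(image, center, sigma)
    k_w, k_h, _ = center.shape
    grad_in = np.zeros_like(image)
    for i in range(y.shape[0]):
        for j in range(y.shape[1]):
            patch = image[i:i + k_w, j:j + k_h, :]
            # Local derivative dy[i, j] / d(patch) for this kernel position.
            local = -(patch - center) / sigma ** 2 * y[i, j]
            # Pixels covered by several positions receive several contributions.
            grad_in[i:i + k_w, j:j + k_h, :] += grad_out[i, j] * local
    return grad_in


rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, size=(6, 6, 3))
center = rng.uniform(0.0, 1.0, size=(3, 3, 3))
sigma = 2.0

y = rbf_conv_forward(image, center, sigma)
grad_out = np.ones_like(y)  # stand-in for the upstream derivative dL/dy
grad_in = rbf_conv_input_grad(image, center, sigma, grad_out)
print(y.shape, grad_in.shape)
```

With a 6×6 image and a 3×3 kernel, interior pixels are covered by up to nine kernel positions, so their gradient is the sum of up to nine terms, while a corner pixel receives exactly one.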

# Experiments


In [None]:
image = np.random.uniform(0, 1, size=[6, 6, 3])
plt.figure()
plt.imshow(image)
l = layer.ConvolutionalLayer(image.shape, 1, 1, (3,3,3), RBF)
l.propagate(image)

In [3]:
TRAIN_RATE:float = 0.1

layers:List[layer.Layer] = []
layers.append(layer.DenseLayer(2, 2, RBF))
layers.append(layer.DenseLayer(2, 1, RBF))

inputs = np.array([[0.0,0.0],[1.0,0.0],[0.0,1.0],[1.0,1.0]])
expected = np.array([0.0,1.0,1.0,0.0])

# Train for 100 epochs, visiting the four XOR samples in random order.
for _ in range(100):
    order = np.arange(4)
    np.random.shuffle(order)
    for i in range(4):
        # Forward pass: feed one sample through every layer.
        current_input = inputs[order[i]].reshape((1, -1))
        for l in layers:
            current_input = l.propagate(current_input)

        loss_i, deriv = loss.cross_entropy_loss(current_input[0], expected[order[i]])

        # Backward pass: propagate the loss derivative from the last layer to the first.
        deriv = layers[1].back_propagate(deriv, TRAIN_RATE)
        layers[0].back_propagate(deriv, TRAIN_RATE)

# Run each sample through the trained network.
for i in range(4):
    current_input = inputs[i].reshape((1, -1))
    for l in layers:
        current_input = l.propagate(current_input)




ValueError: operands could not be broadcast together with shapes (1,2) (1,4) 