# Backprogagation for two Output Neurons with one Input

The input of the two Neurons N_1 and N_2 is x and the bias, b, and the associated weights are called (w1, w2) and (w3, w4), respectively. The output then defines if the input value x is of group A or group B, depending on which value is larger.

First, we define the learning parameter, initalise randomly the weights and declare necessary functions: the sigmoid, the loss function as well as their derivations. Here we use the square error loss.

In [236]:
import numpy as np

n = 0.01 # Learning Parameter
w = np.random.uniform(-1, 1, (4,)) # Random initialisation of the weights w_1 and w_2

def sigmoid(x):
    return 1 / (1 + np.exp(-x))
def sigmoid_deri(x):
    val = sigmoid(x)
    return val / (1 - val)
def loss(y_hat, y):
    sum_y = np.power((y[0]-y_hat[0]), 2) + np.power((y[1]-y_hat[1]), 2)
    return sum_y
def loss_deri(y_hat, y):
    return 2.0*(y_hat-y)
    return 1.0

Using these elements, we can calculate a forward pass through the network for the input x and the label y

In [246]:
def forward_pass(x, y):
    if y==1:
        y=np.array([0, 1])
    else:
        y=np.array([1, 0])
    #z_1 = np.sum(w[0]*x, w[1]) # Sum in Neuron
    z_1 = w[0]*x + w[1]
    z_2 = w[2]*x + w[3]
    y_hat = np.array([sigmoid(z_1), sigmoid(z_2)]) # Output of activation function of Neuron
    loss_val = loss(y_hat, y) # Final value of the loss function
    return z_1, y_hat, loss_val, y
    

Next, we can define the backward pass which gives us the weight changes.

In [238]:
def backward_pass(y_hat, y, z_1, x):
    val1 = loss_deri(y_hat[0], y[0]) * sigmoid_deri(x)
    val2 = loss_deri(y_hat[1], y[1]) * sigmoid_deri(x)

    return np.array([val1*x, val1, val2*x, val2])

The update_weights function then updates the weights accordingly.

In [247]:
def update_weights(w, dw):
    w = w-n*dw # Pointwise actions

## Training

First, wegenerate some data that should be classified. In this case, we draw examples from two Gaussians, one with mean 0 and one with mean 10. Group A has the label "1", and group B the label "-1"

In [248]:
num_data_points = 2000
data_g1 = np.transpose(np.array([np.random.randn(num_data_points), -1*np.ones(num_data_points)]))
data_g2 = np.transpose(np.array([np.random.randn(num_data_points)+10, np.ones(num_data_points)]))

data = np.concatenate((data_g1, data_g2))
np.random.shuffle(data)

We can now split the data set into a training and validation set

In [241]:
ind = int(num_data_points*2*0.75)
data_train = data[:ind][:]
data_eval = data[ind:][:]

We can now start our training epoch. We use the classical "Batch Gradient Descent" which sweeps through the data, and averages the weight changes for each data point before updating the weight.
Once the sweep is over, the weights are updated accordingly, and the next epoch is started. The 200 epochs is chosen randomly, we just want to make sure that the best possible value is reached.

In [249]:
for i in range(200):
    dw = 0
    for data_point in data_train:
        z_1, y_hat, loss_val, y = forward_pass(data_point[0], data_point[1])
        dw += backward_pass(y_hat, y, z_1, data_point[0])
    dw /= len(data_train)
    update_weights(w, dw)
    

Finally, we can calculate the accuracy of the algorithm. We devide the number of correct assignments by all assignments to get a percentage of how well the algorithm was able to devide the two grou

In [243]:
correctly_pre = 0
for data_point in data_eval:
    if sigmoid(w[0]*data_point[0]+w[1]) < sigmoid(w[2]*data_point[0]+w[2]):
        y_pred = 1
    else:
        y_pred = -1
    if(y_pred == data_point[1]):
        correctly_pre+=1
        
acc = correctly_pre / len(data_eval)
print("Accuracy:", acc)

Accuracy: 0.724
