# Exercises week 43 and 44
### The OR, AND, and XOR gates

We have two input values $x_1$ and $x_2$ which decide the output of the three gates. Since each input value can be either 0 or 1, we can write all input combinations as a design matrix $X$, where the first and second columns represent $x_1$ and $x_2$, respectively:
$$X = \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{bmatrix}$$

We can write the output $y$ for the different gates as the vectors $y^T=[0, 1, 1, 1]$ for the OR gate, $y^T=[0, 0, 0, 1]$ for the AND gate, and $y^T=[0, 1, 1, 0]$ for the XOR gate. We set up this matrix and these vectors:


In [2]:
import numpy as np

# Set up design matrix and output vectors
X = np.asarray([
        [0, 0],
        [0, 1],
        [1, 0],
        [1, 1]
])
yOR = np.asarray([0, 1, 1, 1])
yAND = np.asarray([0, 0, 0, 1])
yXOR = np.asarray([0, 1, 1, 0])
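
As a quick sanity check, the target vectors can also be derived from the columns of $X$ instead of being typed by hand. The sketch below uses NumPy's element-wise logical functions; the `*_check` names are introduced here for illustration only.

```python
import numpy as np

# Same design matrix as above: one row per input combination
X = np.asarray([[0, 0], [0, 1], [1, 0], [1, 1]])
x1, x2 = X[:, 0], X[:, 1]

# Derive each gate's output element-wise from the two input columns
yOR_check = np.logical_or(x1, x2).astype(int)
yAND_check = np.logical_and(x1, x2).astype(int)
yXOR_check = np.logical_xor(x1, x2).astype(int)

print("OR: ", yOR_check)   # [0 1 1 1]
print("AND:", yAND_check)  # [0 0 0 1]
print("XOR:", yXOR_check)  # [0 1 1 0]
```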

We create our NN architecture:

In [68]:
# Parameters
n_hidden_layers = 1  # hidden layers
n_hidden_nodes = 2  # hidden nodes
n_categories = 2  # output values, for gates we only find 0 or 1
n_inputs, n_features = X.shape  # 4 inputs (rows), 2 features (columns)

# Activation function
def sigmoid(x):
    return 1/(1 + np.exp(-x))

# Initialize random number generator with seed
rng = np.random.default_rng(2023)

# Weights and bias in the hidden layer
hidden_weights = rng.standard_normal((n_features, n_hidden_nodes))  # weights normally distributed
hidden_bias = np.zeros(n_hidden_nodes) + 0.01

# Weights and bias in the output layer
output_weights = rng.standard_normal((n_hidden_nodes, n_categories))  # weights normally distributed
output_bias = np.zeros(n_categories) + 0.01  # one bias per output category

print(hidden_weights)
print(output_weights)

[[ 0.60172129  1.15161897]
 [-1.35946236  0.22205533]]
[[-0.77586755  0.8087058 ]
 [-0.19862826 -1.57869386]]
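
Because the hidden layer and the output layer both have two nodes here, every weight matrix is 2 x 2 and it is easy to lose track of which dimension is which. The sketch below repeats the setup with a hypothetical hidden layer of three nodes (`n_hidden_nodes_demo`, not the value used in this exercise) so that the shapes become distinguishable. The names `W_h`, `b_h`, `W_o` and `b_o` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2023)

X = np.asarray([[0, 0], [0, 1], [1, 0], [1, 1]])
n_inputs, n_features = X.shape  # 4 rows (inputs), 2 columns (features)
n_hidden_nodes_demo = 3         # hypothetical width, chosen only to expose the shapes
n_categories = 2

# Hidden layer: (features x hidden nodes) weights, one bias per hidden node
W_h = rng.standard_normal((n_features, n_hidden_nodes_demo))
b_h = np.zeros(n_hidden_nodes_demo) + 0.01

# Output layer: (hidden nodes x categories) weights, one bias per category
W_o = rng.standard_normal((n_hidden_nodes_demo, n_categories))
b_o = np.zeros(n_categories) + 0.01

z_h = X @ W_h + b_h             # bias is broadcast over the 4 rows
a_h = 1 / (1 + np.exp(-z_h))    # sigmoid keeps the shape unchanged
z_o = a_h @ W_o + b_o

print(z_h.shape)  # (4, 3)
print(z_o.shape)  # (4, 2)
```

The bias vectors have one entry per node in their own layer, which is what allows NumPy to broadcast them across all inputs.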


Then we set up the feed forward algorithm and compare one pass with the target vectors $y^T$:

In [84]:
def feed_forward(X):
    """Feed forward algorithm."""
    # weighted sum of inputs to the hidden layer
    z_h = X @ hidden_weights + hidden_bias
    
    # activation in the hidden layer
    a_h = sigmoid(z_h)
    
    # weighted sum of inputs to the output layer
    z_o = a_h @ output_weights + output_bias
    
    # sigmoid output (applied element-wise, so rows are not normalized to sum to 1)
    # axis 0 holds each input and axis 1 the activation of each category
    probabilities = sigmoid(z_o)
    
    return probabilities


def predict(X):
    """Get neural network prediction using the feed forward algorithm."""
    probabilities = feed_forward(X)
    return np.argmax(probabilities, axis=1)


# Make prediction and compare with gate target y_vectors
predictions  = predict(X)

print("Targets:")
print("yOR =", yOR)
print("yAND =", yAND)
print("yXOR =", yXOR)

print("\nPrediction:")
print(predictions) 

Targets:
yOR = [0 1 1 1]
yAND = [0 0 0 1]
yXOR = [0 1 1 0]

Prediction:
[1 0 0 0]
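
To quantify how far this single pass is from each gate, one option is to compute the fraction of matching entries. The sketch below copies the prediction printed above into an array and compares it with the three targets; the helper name `accuracy` is an assumption made for this illustration.

```python
import numpy as np

# Prediction copied from the output above, and the three gate targets
predictions = np.asarray([1, 0, 0, 0])
targets = {
    "OR": np.asarray([0, 1, 1, 1]),
    "AND": np.asarray([0, 0, 0, 1]),
    "XOR": np.asarray([0, 1, 1, 0]),
}

def accuracy(y_true, y_pred):
    """Fraction of entries where the prediction equals the target."""
    return np.mean(y_true == y_pred)

scores = {gate: accuracy(y, predictions) for gate, y in targets.items()}
for gate, score in scores.items():
    print(f"{gate}: {score:.2f}")
# OR: 0.00
# AND: 0.50
# XOR: 0.25
```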


We see that this prediction does not match any of the targets. This is expected, since we only did one forward pass and that was with random starting weights. Now we set up the cost function and the back propagation algorithm.

For the cost function we use the cross entropy for the binary case:

In [None]:
def cost_function(theta):
    ...
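
The body of `cost_function` is left open above. As a point of reference, the binary cross entropy for targets $y_i \in \{0, 1\}$ and predicted probabilities $p_i$ is commonly written as $C = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i \ln p_i + (1 - y_i)\ln(1 - p_i)\right]$. The sketch below is one possible way to evaluate it and is not the exercise's solution: the name `binary_cross_entropy`, its signature, the averaging over samples, and the clipping constant `eps` are all assumptions.

```python
import numpy as np

def binary_cross_entropy(y_true, p, eps=1e-12):
    """Mean binary cross entropy between targets in {0, 1} and probabilities p."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 to avoid log(0)
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

yXOR = np.asarray([0, 1, 1, 0])

# An uninformative model that always outputs 0.5 costs ln(2) per sample
cost_uninformed = binary_cross_entropy(yXOR, np.full(4, 0.5))

# Probabilities close to the targets give a much smaller cost
cost_confident = binary_cross_entropy(yXOR, np.asarray([0.01, 0.99, 0.99, 0.01]))

print(cost_uninformed, np.log(2))
print(cost_confident)
```

The cost drops as the predicted probabilities approach the targets, which is the quantity back propagation is meant to minimize.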