### **Simple Neural Network Demo**

This demo builds a very simple neural network, with one hidden layer of two units, using only NumPy.


Let's get started!

In [None]:
import numpy as np
import matplotlib.pyplot as plt

First, we create a dataset with two features, the height and weight of a person; the output is the person's gender.

We can generate this dataset randomly with the `np.random` module. To keep things simple, the output label '0' denotes female and '1' denotes male.

In [None]:
# Create the data
np.random.seed(0) # fix the random seed so the data is reproducible
height = np.random.randint(150, 200, 100)
weight = np.random.randint(50, 100, 100)
gender = np.random.randint(0, 2, 100)

# Stack the features to create our training data
X = np.column_stack((height, weight))
y = gender.reshape(-1, 1)

print(X.shape)
print(y.shape)

In [None]:
# We can plot the data to see how it looks
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X[:, 0], X[:, 1], y, c=y, cmap='viridis')
ax.set_xlabel('Height')
ax.set_ylabel('Weight')
ax.set_zlabel('Gender')
plt.show()

### **Steps to build a neural network**

1. Define our objective
2. Forward Propagate
3. Compute the loss
4. Update the parameters

<img src="SimpleNN.png" width="800" height="600">

In [None]:
# Build a Sigmoid Function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
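
The gradient derivation in this notebook relies on the identity sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)). The following is a minimal sketch that checks the identity numerically with a central finite difference. The helper name `sigmoid_grad` and the step size `eps` are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    # Hypothetical helper: analytic derivative sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid(x)
    return s * (1 - s)

x = np.linspace(-5, 5, 11)
eps = 1e-6  # assumed finite-difference step size

# Central difference approximation of the derivative
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
max_abs_diff = np.max(np.abs(numeric - sigmoid_grad(x)))
print(max_abs_diff)
```

If the identity holds, the printed difference should be tiny compared with the derivative values themselves.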

In [None]:
# Initialize the weights
W1 = 0.01 * np.random.randn(2, 2)
W2 = 0.01 * np.random.randn(2, 1)

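# Forward pass
def forward(X, W1, W2):
    hidden = sigmoid(np.dot(X, W1))       # hidden activations, shape (n_samples, 2)
    output = sigmoid(np.dot(hidden, W2))  # predicted probability, shape (n_samples, 1)
    output = np.round(output)             # threshold at 0.5 to get a hard 0/1 prediction
    return hidden, output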
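
To make the array shapes in the forward pass concrete, here is a small sketch on five made-up samples. It uses `forward_proba`, a hypothetical variant of `forward` that returns the unrounded probability, so the thresholding step can be seen separately. The `_demo` variables are illustrative only.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward_proba(X, W1, W2):
    # Hypothetical variant of forward(): same layers, but no rounding
    hidden = sigmoid(np.dot(X, W1))      # (n_samples, 2)
    proba = sigmoid(np.dot(hidden, W2))  # (n_samples, 1)
    return hidden, proba

rng = np.random.default_rng(0)
# Five made-up samples with the same value ranges as the notebook's data
X_demo = np.column_stack((rng.integers(150, 200, 5), rng.integers(50, 100, 5)))
W1_demo = 0.01 * rng.standard_normal((2, 2))
W2_demo = 0.01 * rng.standard_normal((2, 1))

hidden_demo, proba_demo = forward_proba(X_demo, W1_demo, W2_demo)
labels_demo = np.round(proba_demo)  # hard 0/1 predictions

print(hidden_demo.shape, proba_demo.shape)
print(proba_demo.ravel())
```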

The loss function used for this binary classification problem keeps only the positive-class term of the binary cross-entropy:

$$
L(y, \hat{y}) = -\sum_{i} y_i \log \hat{y}_i
$$
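
As a quick illustration, the loss formula above can be evaluated in NumPy on three hand-picked predictions. This is a sketch only: the function name `loss_fn` and the `eps` clipping (a safeguard against `log(0)`) are assumptions added here, not part of the formula.

```python
import numpy as np

def loss_fn(y, y_hat, eps=1e-12):
    # Loss from the formula above: -sum(y * log(y_hat)).
    # Clipping by eps is an added safeguard against log(0).
    return -np.sum(y * np.log(np.clip(y_hat, eps, 1.0)))

y_demo = np.array([[1], [0], [1]])
y_hat_demo = np.array([[0.9], [0.2], [0.5]])

loss_demo = loss_fn(y_demo, y_hat_demo)
print(loss_demo)
```

Only the first and third samples contribute here, so the result equals `-(log(0.9) + log(0.5))`.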

The partial derivative of $L$ with respect to a weight $w$ follows from the chain rule. Consider a single sample, and write $z$ for the output unit's pre-activation so that it is not confused with the label $y$:
$$
\frac{\partial L}{\partial w} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z} \cdot \frac{\partial z}{\partial w}
$$
$$
\frac{\partial L}{\partial \hat{y}} = -y \cdot \frac{1}{\hat{y}}
$$
$$
\hat{y} = \text{sigmoid}(z)
$$
$$
\frac{\partial \hat{y}}{\partial z} = \hat{y} \cdot (1-\hat{y})
$$
$$
z = w_3 \cdot h_1 + w_4 \cdot h_2
$$
$$
\frac{\partial z}{\partial w_3} = h_1
$$
$$
\frac{\partial z}{\partial w_4} = h_2
$$

**We can compute the gradient of $W_2$**
$$
\frac{\partial L}{\partial w_3} = -y(1-\hat{y}) \cdot h_1
$$
$$
\frac{\partial L}{\partial w_4} = -y(1-\hat{y}) \cdot h_2
$$
**We can rewrite the two formulas in matrix form**
$$
\frac{\partial L}{\partial W_2} = \text{Hidden}^\top \cdot \left(-y(1-\hat{y})\right)
$$

**We can further compute the gradient of $w_{11}$**
$$
z = w_3 \cdot h_1 + w_4 \cdot h_2
$$
$$
h_1 = \text{sigmoid}(x_1 \cdot w_{11} + x_2 \cdot w_{21})
$$
$$
\frac{\partial L}{\partial w_{11}} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z} \cdot \frac{\partial z}{\partial h_1} \cdot \frac{\partial h_1}{\partial w_{11}}
$$
$$
\frac{\partial z}{\partial h_1} = w_3
$$
$$
\frac{\partial h_1}{\partial w_{11}} = h_1(1-h_1) \cdot x_1
$$
$$
\frac{\partial L}{\partial w_{11}} = -y(1-\hat{y}) \cdot w_3 \cdot h_1(1-h_1) \cdot x_1
$$
**The same holds for $w_{21}$**
$$
\frac{\partial L}{\partial w_{21}} = -y(1-\hat{y}) \cdot w_3 \cdot h_1(1-h_1) \cdot x_2
$$
**Hence, we can write the gradient of $W_1$ in matrix form, where $\odot$ denotes element-wise multiplication**
$$
\frac{\partial L}{\partial W_1} = X^\top \cdot \left[\left(-y(1-\hat{y}) \cdot W_2^\top\right) \odot \text{Hidden} \odot (1-\text{Hidden})\right]
$$
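
One way to sanity-check the derivation above is a finite-difference gradient check. The sketch below is not part of the original notebook: it assumes small standard-normal toy inputs, and the helpers `loss_fn`, `analytic_grads`, and `numeric_grad` are hypothetical names introduced for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def loss_fn(X, y, W1, W2):
    # Loss used in this notebook: -sum(y * log(y_hat))
    hidden = sigmoid(X @ W1)
    y_hat = sigmoid(hidden @ W2)
    return -np.sum(y * np.log(y_hat))

def analytic_grads(X, y, W1, W2):
    # Gradients from the chain-rule derivation, in matrix form
    hidden = sigmoid(X @ W1)
    y_hat = sigmoid(hidden @ W2)
    delta = -y * (1 - y_hat)                              # shape (n, 1)
    dW2 = hidden.T @ delta                                # shape (2, 1)
    dW1 = X.T @ ((delta @ W2.T) * hidden * (1 - hidden))  # shape (2, 2)
    return dW1, dW2

def numeric_grad(f, W, eps=1e-6):
    # Central differences; W is perturbed in place and then restored
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + eps
        plus = f()
        W[idx] = old - eps
        minus = f()
        W[idx] = old
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X_chk = rng.standard_normal((6, 2))               # assumed toy inputs
y_chk = rng.integers(0, 2, (6, 1)).astype(float)  # assumed toy labels
W1_chk = rng.standard_normal((2, 2))
W2_chk = rng.standard_normal((2, 1))

dW1_a, dW2_a = analytic_grads(X_chk, y_chk, W1_chk, W2_chk)
dW1_n = numeric_grad(lambda: loss_fn(X_chk, y_chk, W1_chk, W2_chk), W1_chk)
dW2_n = numeric_grad(lambda: loss_fn(X_chk, y_chk, W1_chk, W2_chk), W2_chk)

err_W1 = np.max(np.abs(dW1_a - dW1_n))
err_W2 = np.max(np.abs(dW2_a - dW2_n))
print(err_W1, err_W2)
```

Both printed errors should be close to zero if the analytic gradients match the loss.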

In [None]:
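# Backward pass of Cross Entropy Loss
def backward(y, hidden, output, W1, W2, lr=0.01):
    # Gradient of the loss w.r.t. the output unit's pre-activation
    delta = -y * (1 - output)
    dW2 = np.dot(hidden.T, delta)
    # Note: this uses the global training inputs X
    dW1 = np.dot(X.T, np.dot(delta, W2.T) * hidden * (1 - hidden))
    # Gradient-descent step (updates both arrays in place)
    W1 -= lr * dW1
    W2 -= lr * dW2
    return W1, W2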
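
To illustrate the update rule, the sketch below applies a single gradient-descent step to a toy problem and compares the loss before and after. It is an assumption-laden example rather than the notebook's own code: the inputs are standard-normal, the labels alternate between 0 and 1, and the hypothetical `step` function returns new arrays instead of updating in place.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def loss_fn(X, y, W1, W2):
    # Loss used in this notebook: -sum(y * log(y_hat))
    hidden = sigmoid(X @ W1)
    y_hat = sigmoid(hidden @ W2)
    return -np.sum(y * np.log(y_hat))

def step(X, y, W1, W2, lr):
    # Hypothetical helper: one gradient-descent step, returning new arrays
    hidden = sigmoid(X @ W1)
    y_hat = sigmoid(hidden @ W2)
    delta = -y * (1 - y_hat)
    dW2 = hidden.T @ delta
    dW1 = X.T @ ((delta @ W2.T) * hidden * (1 - hidden))
    return W1 - lr * dW1, W2 - lr * dW2

rng = np.random.default_rng(0)
X_s = rng.standard_normal((20, 2))                   # assumed toy inputs
y_s = (np.arange(20) % 2).reshape(-1, 1).astype(float)  # alternating 0/1 labels
W1_s = 0.01 * rng.standard_normal((2, 2))
W2_s = 0.01 * rng.standard_normal((2, 1))

loss_before = loss_fn(X_s, y_s, W1_s, W2_s)
W1_new, W2_new = step(X_s, y_s, W1_s, W2_s, lr=0.01)
loss_after = loss_fn(X_s, y_s, W1_new, W2_new)
print(loss_before, loss_after)
```

With a small learning rate, the loss after the step should be lower than before.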

In [None]:
# Train the model
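for i in range(1000):
    hidden, _ = forward(X, W1, W2)
    # forward() returns a rounded prediction; the gradient needs the unrounded probability
    proba = sigmoid(np.dot(hidden, W2))
    W1, W2 = backward(y, hidden, proba, W1, W2)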

# Test the model
hidden, output = forward(X, W1, W2)
print(np.mean(output == y))
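
When reading the printed accuracy, it helps to keep in mind that the labels in this dataset are drawn independently of height and weight. A useful reference point is therefore the accuracy of always predicting the more common label. The sketch below recreates the labels with the same seed and draw order as the data cell; `frac_male` and `baseline_acc` are names introduced here for illustration.

```python
import numpy as np

# Recreate the labels exactly as in the data cell (same seed, same draw order)
np.random.seed(0)
height = np.random.randint(150, 200, 100)
weight = np.random.randint(50, 100, 100)
gender = np.random.randint(0, 2, 100)
y = gender.reshape(-1, 1)

frac_male = np.mean(y == 1)
# Accuracy of always predicting the more common label
baseline_acc = max(frac_male, 1 - frac_male)
print(baseline_acc)
```

An accuracy close to this baseline suggests the network has learned little beyond the label frequencies.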