# Implementation of neural network
## Introduction
In this task, we use NumPy to implement a basic neural network.
1. We build a simple neuron.
2. We build a simple neural network.
3. We build a simple neural network with a backward pass.
4. We train and evaluate the simple neural network.

All operations are run on the CPU.

Reference: https://victorzhou.com/blog/intro-to-neural-networks/

## Import packages

In [1]:
import numpy as np

## Build a simple neuron
<img src="Images/n.png" width="25%">
A neuron is the basic unit of a neural network. The figure above shows a simple neuron, which computes:
\begin{equation}
y = \sigma(x_1w_1 + x_2w_2 + b).
\label{eq:neuron}
\end{equation}
Here, $[x_1, x_2]$ is the input vector, $[w_1, w_2]$ and $b$ are the weights and bias of the neuron, respectively, and $\sigma(t) = 1 / (1 + e^{-t})$ is the sigmoid function, which serves as the neuron's activation function.

In [2]:
def sigmoid(x):
    # Our activation function: f(x) = 1 / (1 + e^(-x))
    return 1 / (1 + np.exp(-x))

class Neuron:
    def __init__(self, w, b, name=''):
        self.name = name
        self.w = w
        self.b = b
    
    def forward(self, x):
        # Weight inputs, add bias, then use the activation function
        t = np.dot(self.w, x) + self.b
        return sigmoid(t)

init_w = np.array([0, 1])    # w1 = 0, w2 = 1
init_b = 0    # b = 0
n = Neuron(init_w, init_b)

x = np.array([2, 3])    # x1 = 2, x2 = 3
print(n.forward(x))    # 0.9525741268224334

0.9525741268224334
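
As a quick check of the printed value, substituting $x = [2, 3]$, $w = [0, 1]$ and $b = 0$ into the neuron equation gives:
\begin{equation}
x_1w_1 + x_2w_2 + b = 2 \times 0 + 3 \times 1 + 0 = 3, \qquad y = \sigma(3) = \frac{1}{1 + e^{-3}} \approx 0.9526.
\end{equation}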


## Build a simple neural network
<img src="Images/nn.png" width="35%">
The figure above shows a simple neural network, which contains three neurons.

In [3]:
class NeuralNetwork:
    '''
    A neural network with:
    - 2 inputs
    - a hidden layer with 2 neurons (h1, h2)
    - an output layer with 1 neuron (o1)
    Each neuron has the same weights and bias:
    - w = [0, 1]
    - b = 0
    '''
    def __init__(self):
        init_w = np.array([0, 1])
        init_b = 0

        # The Neuron class here is from the previous section
        self.h1 = Neuron(init_w, init_b, 'h1')
        self.h2 = Neuron(init_w, init_b, 'h2')
        self.o1 = Neuron(init_w, init_b, 'o1')

    def forward(self, x):
        out_h1 = self.h1.forward(x)
        out_h2 = self.h2.forward(x)

        # The inputs for o1 are the outputs from h1 and h2
        out_o1 = self.o1.forward(np.array([out_h1, out_h2]))

        return out_o1

network = NeuralNetwork()
x = np.array([2, 3])
print(network.forward(x))    # 0.7216325609518421

0.7216325609518421
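
The same forward pass can also be written with one weight matrix for the hidden layer. This is only a sketch of an equivalent formulation, not how the class above is implemented; the names `W_hidden`, `b_hidden`, `w_out` and `b_out` are hypothetical and do not appear elsewhere in this notebook.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical names: each row of W_hidden holds one hidden neuron's weights
W_hidden = np.array([[0, 1],    # h1
                     [0, 1]])   # h2
b_hidden = np.array([0, 0])
w_out = np.array([0, 1])        # o1
b_out = 0

x = np.array([2, 3])
hidden = sigmoid(W_hidden @ x + b_hidden)      # outputs of h1 and h2
out = sigmoid(np.dot(w_out, hidden) + b_out)   # output of o1
print(out)    # approximately 0.72163256, matching network.forward(x) above
```

Stacking the weights this way computes both hidden neurons in one matrix-vector product, which is how larger networks are usually written.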


## Build a simple neuron with backward
Let's consider a simple neuron:
\begin{equation}
y = \sigma(x_1w_1 + x_2w_2 + b).
\label{eq:neuron_backward}
\end{equation}
We set:
\begin{equation}
t = x_1w_1 + x_2w_2 + b. 
\end{equation}
Then we get:
\begin{equation}
y = \sigma(t)
\label{eq:t}
\end{equation}
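Then, by the chain rule, and writing $\sigma'(t) = \sigma(t)(1 - \sigma(t))$ for the derivative of the sigmoid: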
\begin{equation}
\frac{\partial y}{\partial w_1} = \frac{\partial y}{\partial t} \times \frac{\partial t}{\partial w_1} = \sigma'(t) \times x_1;
\end{equation}
\begin{equation}
\frac{\partial y}{\partial w_2} = \frac{\partial y}{\partial t} \times \frac{\partial t}{\partial w_2} = \sigma'(t) \times x_2;
\end{equation}
\begin{equation}
\frac{\partial y}{\partial b} = \frac{\partial y}{\partial t} \times \frac{\partial t}{\partial b} = \sigma'(t);
\end{equation}
\begin{equation}
\frac{\partial y}{\partial x_1} = \frac{\partial y}{\partial t} \times \frac{\partial t}{\partial x_1} = \sigma'(t) \times w_1;
\end{equation}
\begin{equation}
\frac{\partial y}{\partial x_2} = \frac{\partial y}{\partial t} \times \frac{\partial t}{\partial x_2} = \sigma'(t) \times w_2.
\end{equation}
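
One way to gain confidence in these derivatives is to compare them with central finite differences. The sketch below does this for $w_1$, $w_2$ and $b$ at the example point $x = [2, 3]$, $w = [0, 1]$, $b = 0$; the helper `neuron_output` and the step size `eps` are assumptions made for this check only.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def neuron_output(w, b, x):
    # Hypothetical helper: y = sigma(w . x + b)
    return sigmoid(np.dot(w, x) + b)

w = np.array([0.0, 1.0])
b = 0.0
x = np.array([2.0, 3.0])

# Analytic gradients from the equations above
s = sigmoid(np.dot(w, x) + b)
d_sigma = s * (1 - s)           # sigma'(t)
analytic_d_w = d_sigma * x      # [dy/dw1, dy/dw2]
analytic_d_b = d_sigma          # dy/db

# Numerical gradients via central differences
eps = 1e-6
numeric_d_w = np.zeros_like(w)
for i in range(len(w)):
    step = np.zeros_like(w)
    step[i] = eps
    numeric_d_w[i] = (neuron_output(w + step, b, x)
                      - neuron_output(w - step, b, x)) / (2 * eps)
numeric_d_b = (neuron_output(w, b + eps, x)
               - neuron_output(w, b - eps, x)) / (2 * eps)

print(analytic_d_w, numeric_d_w)    # the two arrays should agree closely
print(analytic_d_b, numeric_d_b)
```

If the analytic and numerical values disagree by more than a small tolerance, the derivation or its implementation likely contains a mistake.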


In [4]:
def d_sigmoid(x):
    # Derivative of sigmoid: f'(x) = f(x) * (1 - f(x))
    fx = sigmoid(x)
    return fx * (1 - fx)

class Neuron_with_backward:
    def __init__(self, w, b, name=''):
        self.name = name
        self.w = w
        self.b = b
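        # Cached by forward() for use in backward()
        self.x = None
        self.t = None
        # Gradients of the output with respect to w, b and x, set by backward()
        self.d_w = None
        self.d_b = None
        self.d_x = None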
    
    def forward(self, x):
        # Weight inputs, add bias, then use the activation function
        t = np.dot(self.w, x) + self.b
        self.x = x
        self.t = t
        return sigmoid(t)
    
    def backward(self, d_back):
        # d_back is the upstream gradient dL/dy from the next layer or the loss.
        # The training loop below passes it pre-multiplied by the learning rate.
        d_y_d_t = d_sigmoid(self.t)
        d_t_d_w = self.x
        d_t_d_x = self.w
        self.d_w = d_y_d_t * d_t_d_w
        self.d_b = d_y_d_t
        # d_x uses the weights from before the update below
        self.d_x = d_y_d_t * d_t_d_x
        # SGD update
        self.w = self.w - self.d_w * d_back
        self.b = self.b - self.d_b * d_back
        return self.d_x * d_back
        
        
init_w = np.array([0, 1])    # w1 = 0, w2 = 1
init_b = 0    # b = 0
n = Neuron_with_backward(init_w, init_b)

x = np.array([2, 3])    # x1 = 2, x2 = 3
print(n.forward(x))    # 0.9525741268224334
print('Before backward w:', n.w, 'b:', n.b)
n.backward(1)
print('After backward w:', n.w, 'b:', n.b)

0.9525741268224334
Before backward w: [0 1] b: 0
After backward w: [-0.09035332  0.86447002] b: -0.045176659730912


## Build a simple neural network with backward
<img src="Images/back.png" width="30%">
The figure above shows the backward pass through $h_1$.\
The detailed process can be found in the `backward` method of `NeuralNetwork_with_backward`.

In [5]:
class NeuralNetwork_with_backward:
    '''
    A neural network with:
    - 2 inputs
    - a hidden layer with 2 neurons (h1, h2)
    - an output layer with 1 neuron (o1)
    Each neuron starts with the same weights and bias:
    - w = [0, 1]
    - b = 0
    '''
    def __init__(self):
        init_w = np.array([0, 1])
        init_b = 0

        # The Neuron_with_backward class here is from the previous section
        self.h1 = Neuron_with_backward(init_w, init_b, 'h1')
        self.h2 = Neuron_with_backward(init_w, init_b, 'h2')
        self.o1 = Neuron_with_backward(init_w, init_b, 'o1')

    def forward(self, x):
        out_h1 = self.h1.forward(x)
        out_h2 = self.h2.forward(x)

        # The inputs for o1 are the outputs from h1 and h2
        out_o1 = self.o1.forward(np.array([out_h1, out_h2]))

        return out_o1
    
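    def backward(self, d_back):
        # d_back is the gradient of the loss with respect to the network output
        # (the training loop passes it pre-multiplied by the learning rate).
        # o1 returns the gradients flowing back to its two inputs, out_h1 and out_h2.
        d_outo1_d_outh1, d_outo1_d_outh2 = self.o1.backward(d_back)
        self.h1.backward(d_outo1_d_outh1)
        self.h2.backward(d_outo1_d_outh2)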

nn = NeuralNetwork_with_backward()
x = np.array([2, 3])
print(nn.forward(x))    # 0.7216325609518421
print('Before backward.')
print('h1.w:', nn.h1.w, 'h1.b:', nn.h1.b)
print('h2.w:', nn.h2.w, 'h2.b:', nn.h2.b)
print('o1.w:', nn.o1.w, 'o1.b:', nn.o1.b)
nn.backward(1)
print('After backward.')
print('h1.w:', nn.h1.w, 'h1.b:', nn.h1.b)
print('h2.w:', nn.h2.w, 'h2.b:', nn.h2.b)
print('o1.w:', nn.o1.w, 'o1.b:', nn.o1.b)

0.7216325609518421
Before backward.
h1.w: [0 1] h1.b: 0
h2.w: [0 1] h2.b: 0
o1.w: [0 1] o1.b: 0
After backward.
h1.w: [0. 1.] h1.b: 0.0
h2.w: [-0.01815009  0.97277487] h2.b: -0.009075042588152822
o1.w: [-0.19135215  0.80864785] o1.b: -0.20087900792592797
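
In the output above, h1 is unchanged after the backward pass while h2 is updated. A likely explanation is that o1 starts with $w_1 = 0$, so the gradient it sends back to its first input is $\sigma'(t) \times w_1 = 0$. The sketch below recomputes the gradients that o1 passes to h1 and h2 for `d_back = 1`; the variable names are chosen for this illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = np.array([2, 3])
w = np.array([0, 1])    # initial weights shared by h1, h2 and o1
b = 0

out_h = sigmoid(np.dot(w, x) + b)              # h1 and h2 produce the same output
t_o1 = np.dot(w, np.array([out_h, out_h])) + b
out_o1 = sigmoid(t_o1)

d_o1 = out_o1 * (1 - out_o1)    # sigma'(t) at o1
grad_to_hidden = d_o1 * w       # gradients sent back to h1 and h2
print(grad_to_hidden)           # first entry is exactly 0, so h1 receives no update
```

The second entry is the same value that appears as the change in `o1.b`, because both equal $\sigma'(t)$ at o1 when `d_back = 1`.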


## Build the mean squared error (MSE) loss

In [6]:
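def mse_loss(label, pred):
    # Per-sample squared error; call .mean() on the result to get the MSE
    return (label - pred) ** 2

def d_mse_loss(label, pred):
    # Derivative of the squared error with respect to pred
    return -2 * (label - pred)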

y_true = np.array([1, 0, 0, 1])
y_pred = np.array([0, 0, 0, 0])

print(mse_loss(y_true, y_pred).mean()) # 0.5
print(d_mse_loss(0, 1))

0.5
2
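
The derivative can be sanity-checked against a central finite difference. This is a minimal sketch; the sample values `label = 1.0`, `pred = 0.3` and the step size `eps` are arbitrary choices for illustration.

```python
def mse_loss(label, pred):
    return (label - pred) ** 2

def d_mse_loss(label, pred):
    return -2 * (label - pred)

label, pred = 1.0, 0.3    # arbitrary example values
eps = 1e-6

analytic = d_mse_loss(label, pred)
numeric = (mse_loss(label, pred + eps) - mse_loss(label, pred - eps)) / (2 * eps)
print(analytic, numeric)    # both should be close to -1.4
```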


## Prepare training set
Name | Weight (lb) | Height (in) | Gender
- | :-: | :-: | :-
Alice | 133 | 65 | F
Bob | 160 | 72 | M
Charlie | 152 | 70 | M
Diana | 120 | 60 | F

The table above is the training set; in the code below, F is encoded as 1 and M as 0.\
Let's train our network to predict someone’s gender from their weight and height:
<img src="Images/app.png" width="50%">


In [7]:
data = np.array([
  [133, 65],    # Alice
  [160, 72],    # Bob
  [152, 70],    # Charlie
  [120, 60],    # Diana
])
labels = np.array([
  1,    # Alice
  0,    # Bob
  0,    # Charlie
  1,    # Diana
])

## Data normalization

In [8]:
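# Per-feature (column-wise) statistics of the training set
data_mean = np.mean(data, axis=0)
data_std = np.std(data, axis=0)
def norm(data):
    # Standardize each feature using the training-set mean and std
    return (data - data_mean) / data_std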

data = norm(data)
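
After this step each feature should have mean 0 and standard deviation 1 over the training set. The sketch below verifies that on the same four samples; it uses the names `raw_data` and `normalized` (not used elsewhere in this notebook) so that the original array is not overwritten.

```python
import numpy as np

raw_data = np.array([
    [133, 65],    # Alice
    [160, 72],    # Bob
    [152, 70],    # Charlie
    [120, 60],    # Diana
])

data_mean = np.mean(raw_data, axis=0)    # column means: 141.25 and 66.75
data_std = np.std(raw_data, axis=0)      # population std (NumPy default, ddof=0)
normalized = (raw_data - data_mean) / data_std

print(normalized.mean(axis=0))    # approximately [0, 0]
print(normalized.std(axis=0))     # approximately [1, 1]
```

New samples should be normalized with the same training-set statistics, which is what calling `norm` on them does.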

## SGD
Stochastic gradient descent (SGD) is an optimization algorithm that tells us how to change our weights and biases to minimize the loss. It boils down to this update equation, shown here for $w_1$:
\begin{equation}
w_1 \leftarrow w_1 - \eta \frac{\partial L}{\partial w_1}.
\end{equation}
Here, $L$ is the loss function, and $\eta$ is a constant called the learning rate that controls how fast we train.
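
To see the update rule in isolation, the sketch below applies it to a toy one-parameter loss $L(w) = (w - 3)^2$, whose minimum is at $w = 3$. The toy loss, the starting point and the number of steps are assumptions for illustration. Because the toy loss involves no data samples, this is plain gradient descent; in this notebook the "stochastic" part comes from updating after each training sample.

```python
# Toy loss L(w) = (w - 3)^2 with gradient dL/dw = 2 * (w - 3)
w = 0.0      # arbitrary starting point
eta = 0.1    # learning rate

for step in range(100):
    grad = 2 * (w - 3.0)
    w = w - eta * grad    # w <- w - eta * dL/dw

print(w)    # close to 3.0
```

A larger `eta` moves faster but can overshoot the minimum, while a smaller one converges more slowly.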

## Train a simple neural network

In [9]:
# build a simple neural network with backward
nn = NeuralNetwork_with_backward()

# learning rate
learn_rate = 0.1

# number of times to loop through the entire dataset
epochs = 1000

for epoch in range(epochs):
    for x, label in zip(data, labels):
        pred = nn.forward(x)
        d_L_d_ypred = d_mse_loss(label, pred)
        # Fold the learning rate into the upstream gradient; backward() applies the update
        nn.backward(learn_rate * d_L_d_ypred)
    
    if (epoch + 1) % 50 == 0:
        y_preds = np.apply_along_axis(nn.forward, 1, data)
        loss = mse_loss(labels, y_preds).mean()
        print("Epoch %d loss: %.6f" % (epoch + 1, loss))

Epoch 50 loss: 0.160155
Epoch 100 loss: 0.066590
Epoch 150 loss: 0.033055
Epoch 200 loss: 0.020086
Epoch 250 loss: 0.013858
Epoch 300 loss: 0.010362
Epoch 350 loss: 0.008177
Epoch 400 loss: 0.006702
Epoch 450 loss: 0.005650
Epoch 500 loss: 0.004866
Epoch 550 loss: 0.004262
Epoch 600 loss: 0.003785
Epoch 650 loss: 0.003398
Epoch 700 loss: 0.003079
Epoch 750 loss: 0.002813
Epoch 800 loss: 0.002586
Epoch 850 loss: 0.002392
Epoch 900 loss: 0.002224
Epoch 950 loss: 0.002077
Epoch 1000 loss: 0.001947


## Evaluate a simple neural network

In [10]:
emily = norm(np.array([[128, 63]]))    # 128 pounds, 63 inches
frank = norm(np.array([[155, 68]]))    # 155 pounds, 68 inches

print("Emily: %.6f" % nn.forward(emily[0])) # 0.968267 - F
print("Frank: %.6f" % nn.forward(frank[0])) # 0.070288 - M

Emily: 0.968267
Frank: 0.070288
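
The network outputs a value between 0 and 1 rather than a class. One simple way to turn it into a label is to threshold it, as sketched below. The helper `to_gender` and the threshold of 0.5 are assumptions for illustration; the inputs are the two values printed above.

```python
def to_gender(prob, threshold=0.5):
    # Hypothetical helper: labels were encoded as F = 1 and M = 0
    return 'F' if prob >= threshold else 'M'

print("Emily:", to_gender(0.968267))    # F
print("Frank:", to_gender(0.070288))    # M
```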
