# Multilayer Perceptron

The linear function in the previous section was a very simple function. It could be fit with only 2 layers (input and output). What about non-linear functions? The 2-layer architecture is no longer sufficient, because a network with no hidden layer and a linear output can only represent a linear function.

So for non-linear regression we need a neural network with at least 3 layers (input, hidden and output), commonly called a Multilayer Perceptron (MLP) or fully-connected network, with a non-linear activation function on all neurons in the hidden layer.
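
To see why the hidden layer needs a non-linear activation, the sketch below stacks two purely linear layers and checks that they collapse into a single linear layer. The weights are random illustrative values, not the ones used in this tutorial, and the names `W1`, `b1`, `W2`, `b2`, `W_eq` and `b_eq` are assumptions made only for this example.

```python
import numpy as np

# Illustrative random weights; these are NOT the tutorial's trained values
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(1, 8)), rng.normal(size=8)  # "hidden" layer, linear activation
W2, b2 = rng.normal(size=(8, 1)), rng.normal(size=1)  # output layer, linear activation

x = np.array([[-2.0], [0.0], [2.0]])

# Two stacked linear layers: (x . W1 + b1) . W2 + b2
two_layers = np.dot(np.dot(x, W1) + b1, W2) + b2

# The equivalent single linear layer: x . (W1 . W2) + (b1 . W2 + b2)
W_eq = np.dot(W1, W2)        # shape (1, 1)
b_eq = np.dot(b1, W2) + b2   # shape (1,)
one_layer = np.dot(x, W_eq) + b_eq

print(np.allclose(two_layers, one_layer))  # True
```

Because the two results match, adding layers without a non-linear activation does not make the model any more expressive than a single linear layer.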

In this section, we implement the forward pass using Python and NumPy, without a framework, to make each step clearer. In later parts we will try TensorFlow and Keras.

## Problem

As an example, we will do regression on data generated by the following non-linear function:

$f(x) = \sqrt{2x^{2} + 1}$
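
As a quick reference, the target function can be written in NumPy as below. The function name `target` is a hypothetical helper introduced for this sketch; it is not part of the original notebook.

```python
import numpy as np

def target(x):
    # f(x) = sqrt(2x^2 + 1)
    return np.sqrt(2 * x**2 + 1)

x = np.array([-2.0, 0.0, 2.0])
print(target(x))  # [3. 1. 3.]
```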

## Forward Propagation

The forward pass function used in the previous section is slightly modified by adding a new argument that selects the activation function.

In [1]:
import numpy as np

def forward_pass(inputs, weight, bias, activation='linear'):
    # Weighted sum of the inputs plus the bias
    w_sum = np.dot(inputs, weight) + bias

    if activation == 'relu':
        # ReLU activation: f(x) = max(0, x)
        act = np.maximum(w_sum, 0)
    else:
        # Linear activation: f(x) = x
        act = w_sum

    return act

The neural network has already been trained, so all that remains is to run a **Forward Pass** using the weights and biases obtained during training.

In [2]:
# Pre-trained weights & biases of the hidden layer
W_H = np.array([[0.00192761, -0.78845304, 0.30310717, 0.44131625, 
                 0.32792646, -0.02451803, 1.43445349, -1.12972116]])

b_H = np.array([-0.02657719, -1.15885878, -0.79183501, -0.33550513, 
                -0.23438406, -0.25078532, 0.22305705, 0.80253315])

# Pre-trained weights & biases of the output layer
W_O = np.array([[-0.77540326], [ 0.5030424 ], [ 0.37374797], [-0.20287184], 
                [-0.35956827], [-0.54576212], [ 1.04326093], [ 0.8857621 ]])

b_O = np.array([ 0.04351173])

The neural network architecture consists of:
- 1 node in the input layer
- 8 nodes in the hidden layer (ReLU)
- 1 node in the output layer (Linear)
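
The shapes implied by this architecture can be sketched as follows. The arrays here are zero-filled placeholders used only to illustrate shapes, and the names `x_demo`, `W_hidden`, `b_hidden`, `W_output` and `b_output` are hypothetical, chosen so they do not overwrite the trained parameters defined above.

```python
import numpy as np

# Placeholder arrays with the same shapes as the network's parameters
# (zeros illustrate shapes only; they are not real weights)
x_demo = np.zeros((3, 1))     # 3 samples, 1 input node
W_hidden = np.zeros((1, 8))   # input -> hidden (8 nodes)
b_hidden = np.zeros(8)
W_output = np.zeros((8, 1))   # hidden -> output (1 node)
b_output = np.zeros(1)

h = np.maximum(np.dot(x_demo, W_hidden) + b_hidden, 0)  # ReLU hidden layer
o = np.dot(h, W_output) + b_output                      # linear output layer

print(h.shape)  # (3, 8)
print(o.shape)  # (3, 1)
```

Each row of the input is one sample, so passing 3 samples yields 3 rows in both the hidden and the output activations.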

In [3]:
# Initialize Input Data
inputs = np.array([[-2], [0], [2]])

# Output of Hidden Layer
h_out = forward_pass(inputs, W_H, b_H, 'relu')

print('Hidden Layer Output (ReLU)')
print('==========================')
print(h_out, "\n")

# Output of Output Layer
o_out = forward_pass(h_out, W_O, b_O, 'linear')

print('\nOutput Layer Output (Linear)')
print('============================')
print(o_out, "\n")

Hidden Layer Output (ReLU)
==========================
[[0.         0.4180473  0.         0.         0.         0.
  0.         3.06197547]
 [0.         0.         0.         0.         0.         0.
  0.22305705 0.80253315]
 [0.         0.         0.         0.54712737 0.42146886 0.
  3.09196403 0.        ]] 


Output Layer Output (Linear)
============================
[[2.96598907]
 [0.98707188]
 [3.00669343]] 



In this non-linear regression experiment we evaluate the network at the inputs -2, 0 and 2. The expected outputs are 3, 1 and 3, and the predicted results are 2.96598907, 0.98707188 and 3.00669343. There is still a small error (about 0.034 at most), but the results show that an MLP can fit this non-linear function quite well.
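
The size of the error can be checked directly. The sketch below copies the three predictions from the output above and compares them with the exact values of $f(x)$. The variable names and the choice of mean squared error as a summary metric are assumptions for illustration.

```python
import numpy as np

x = np.array([-2.0, 0.0, 2.0])
targets = np.sqrt(2 * x**2 + 1)  # exact values of f(x): 3, 1, 3

# Predictions copied from the output layer result above
predictions = np.array([2.96598907, 0.98707188, 3.00669343])

abs_error = np.abs(predictions - targets)    # per-sample absolute error
mse = np.mean((predictions - targets) ** 2)  # mean squared error

print(abs_error)
print(mse)
```

Mean squared error is used here only because it is a common regression metric; the loss used to train the network is not stated in this section.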