# Lesson 3: Multi-Layered Perceptrons

## Introduction

Biological neurons are non-linear and form complex, intricate networks. In machine learning, a single linear neuron can only represent linear functions and linear decision boundaries. A single linear neuron can't solve the XOR problem, but a multilayer perceptron (MLP) with one hidden layer can. In fact, an MLP with one hidden layer can approximate practically any continuous function, given enough neurons in the hidden layer.
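As a minimal sketch of the XOR claim, the network below uses one hidden layer with two ReLU neurons (ReLU, $\max(0, x)$, is covered later in this lesson). The weights are picked by hand for illustration rather than learned, and the name `xor_mlp` is hypothetical.

```python
def relu(x):
    return max(0.0, x)

def xor_mlp(x1, x2):
    # Hidden layer: two ReLU neurons with hand-picked (not learned) weights
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    # Output layer: a plain linear combination of the hidden activations
    return 1.0 * h1 - 2.0 * h2

# Print the full truth table
for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))
```

No single linear neuron can produce this truth table, because the two inputs that map to 1 cannot be separated from the two that map to 0 by one straight line.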

## Universal Function Approximation Theorem

The Universal Function Approximation Theorem states that a feedforward neural network with a single hidden layer containing a finite number of neurons can approximate any continuous function on a compact input domain to arbitrary accuracy, given a sufficiently large number of hidden neurons and an appropriate choice of activation function.
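In symbols, one common way to write this is the following. The notation is introduced here for illustration and is not from the lesson: $N$ is the number of hidden neurons, $w_i$ and $b_i$ are the weights and bias of hidden neuron $i$, $v_i$ is its output weight, $K$ is the compact input domain, and $\sigma$ is the activation function. For every continuous $f$ on $K$ and every $\varepsilon > 0$, there exist $N$ and parameters such that

$$
\hat{f}(x) = \sum_{i=1}^{N} v_i \, \sigma\left(w_i^\top x + b_i\right), \qquad \sup_{x \in K} \left| f(x) - \hat{f}(x) \right| < \varepsilon .
$$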

This raises the question of how many neurons to use. The theorem does not tell us how many hidden neurons we need; it only says that a suitable network exists. It also doesn't tell us how to find the weights of that network. There's no guarantee that we'll be able to find them from a finite set of example input-output pairs.

To gain some more intuition behind it, let's look at the ReLU function. The rectified linear unit (ReLU), defined as $\max(0, x)$, is an activation function that introduces non-linearity into a deep learning model.

In [None]:
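import matplotlib.pyplot as plt

def relu(x):
    # ReLU: pass positive inputs through, clamp negative inputs to zero
    return max(0.0, x)

# Evaluate ReLU at the integers -2, -1, 0, 1
reluBase = []
for i in range(-2, 2):
    reluBase.append(relu(i))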

The ReLU function takes the following shape:

In [None]:
# Pass the x values explicitly so the axis shows the inputs, not list indices
plt.plot(range(-2, 2), reluBase)

We create some modified ReLU functions:

In [None]:
def relu_add_1(x):
    # ReLU shifted one unit to the left
    return relu(x + 1)

def relu_x_minus_two(x):
    # ReLU scaled by -2
    return -2 * relu(x)

def relu_minus_1(x):
    # ReLU shifted one unit to the right
    return relu(x - 1)

relu1 = []
relu2 = []
relu3 = []

Relu_Range = list(range(-3, 3))

for i in Relu_Range:
    relu1.append(relu_add_1(i))
    relu2.append(relu_x_minus_two(i))
    relu3.append(relu_minus_1(i))

# One figure with three stacked panels
plt.figure()

plt.subplot(3, 1, 1)
plt.plot(Relu_Range, relu1)
plt.title('relu(x + 1)')

plt.subplot(3, 1, 2)
plt.plot(Relu_Range, relu2)
plt.title('-2 * relu(x)')

plt.subplot(3, 1, 3)
plt.plot(Relu_Range, relu3)
plt.title('relu(x - 1)')

plt.tight_layout()  # keep the panel titles from overlapping
plt.show()

When we add these three functions together, we get:

In [None]:
def comb_relu(x):
    # Sum of the three modified ReLUs defined above
    return relu_add_1(x) + relu_x_minus_two(x) + relu_minus_1(x)

relu4 = []

for i in Relu_Range:
    relu4.append(comb_relu(i))

plt.plot(Relu_Range, relu4)

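Here you can see that combining a few ReLUs produces a triangular bump: the sum is zero for $x \le -1$, rises to 1 at $x = 0$, and falls back to zero for $x \ge 1$. By adding more shifted and scaled ReLUs in the same way, we can build piecewise-linear approximations of sine functions and entire waves using only ReLUs.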
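To make the sine claim concrete, here is a rough sketch that sums scaled triangular bumps, each built from three ReLUs, to approximate $\sin(x)$ on $[0, 2\pi]$. The helper names `hat` and `relu_sin` and the choice of 33 evenly spaced bump centers are assumptions made for this illustration. The bump heights are set by hand from the known sine values, so nothing is being learned here.

```python
import math

def relu(x):
    return max(0.0, x)

def hat(x, center, width):
    # Triangular bump: height 1 at `center`, zero outside center +/- width
    return (relu(x - (center - width))
            - 2.0 * relu(x - center)
            + relu(x - (center + width))) / width

def relu_sin(x, n_knots=33):
    # Piecewise-linear approximation of sin on [0, 2*pi]:
    # one bump per knot, scaled by the sine value at that knot
    width = 2.0 * math.pi / (n_knots - 1)
    total = 0.0
    for k in range(n_knots):
        center = k * width
        total += math.sin(center) * hat(x, center, width)
    return total

# Measure the worst-case error on a grid of 201 points
xs = [i * 2.0 * math.pi / 200 for i in range(201)]
max_err = max(abs(relu_sin(x) - math.sin(x)) for x in xs)
print(max_err)
```

Increasing `n_knots` shrinks the error, which mirrors the theorem's point that more hidden neurons allow a closer approximation.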

## Building MLPs

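A multilayer perceptron is a stack of simple linear layers. Each layer is followed by an activation function (a non-linearity), denoted by $\sigma$. Without these non-linearities, the stacked linear layers would collapse into a single linear layer.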
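The stacking can be sketched in plain Python as follows. The names `linear` and `mlp_forward` are hypothetical, the weights are hand-picked for illustration, and leaving the final layer without an activation is an assumption made here (a common choice, not something the lesson specifies).

```python
def relu(x):
    return max(0.0, x)

def linear(inputs, weights, biases):
    # One linear layer: weights[j] holds the input weights of output neuron j
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def mlp_forward(inputs, layers, activation=relu):
    # Apply each (weights, biases) layer in order; the activation is applied
    # after every layer except the last (an assumption of this sketch)
    h = inputs
    for i, (weights, biases) in enumerate(layers):
        h = linear(h, weights, biases)
        if i < len(layers) - 1:
            h = [activation(v) for v in h]
    return h

# Two inputs -> two hidden neurons -> one output, with hand-picked weights
layers = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer
    ([[1.0, -2.0]], [0.0]),                   # output layer
]
print(mlp_forward([1.0, 0.0], layers))
print(mlp_forward([1.0, 1.0], layers))
```

With these particular weights the network computes XOR of its two inputs, which a single linear layer cannot do.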