# Feedforward neural network. Backpropagation.

## Feedforward neural network

Observing processes in the human brain, scientists created a mathematical model in the 1940s that mimics the behavior of systems of neurons. It proved so successful that it is now used in both biology and computer science, largely because it is very good at pattern recognition, a task that is hard to solve with conventional, explicitly programmed algorithms.
Let's start our journey into the world of neural networks with the neuron, the simplest 'cell' that every network has.

### Neuron

Let a neuron have $n$ inputs and one output (the axon). Each input has a **magnitude** $x_i$, and each connection has a **weight** $w_i$. The output of the neuron is then $$y=\sigma\Bigg(\sum_{i=1}^n x_iw_i \Bigg)$$ where $\sigma$ denotes the **activation function** (activation functions are discussed in more detail later in this notebook). Inputs usually come from other neurons or from the initial input. Outputs usually go to other neurons or to the final output.
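As a minimal sketch of this formula, the snippet below computes the output of a single neuron with three inputs. The function name `neuron_output`, the sample values, and the choice of the sigmoid as $\sigma$ are assumptions made only for illustration.

```python
import math

def neuron_output(x, w):
    """Return sigma(sum_i x_i * w_i) for one neuron, with sigma = sigmoid (an assumption)."""
    # weighted sum of the inputs
    s = sum(x_i * w_i for x_i, w_i in zip(x, w))
    # activation function applied to the weighted sum
    return 1 / (1 + math.exp(-s))

x = [1.0, 0.5, -1.0]   # input magnitudes (illustrative values)
w = [0.2, -0.4, 0.1]   # connection weights (illustrative values)
y = neuron_output(x, w)
print(y)
```

Here the weighted sum is slightly negative, so the printed output lies a little below 0.5, the value the sigmoid takes at zero.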

### Layer

Neurons are usually grouped into **layers**. There are different types of connections between layers, and they define the type of the neural network. The simplest type, and the one considered here, is the fully connected **feedforward** neural network, in which each neuron of one layer is connected to every neuron of the next layer.
There are 3 types of layers in a feedforward neural network: the input layer (each neuron has one input from an external source), hidden layers, and the output layer (the neurons' output values leave the network). As a result, a neural network has one input layer, $n$ hidden layers ($n\geqslant 0$), and one output layer. This structure can be treated as a black box whose input is connected to the input layer and whose output is connected to the output layer, as shown in the picture.
![Wikipedia](https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Colored_neural_network.svg/296px-Colored_neural_network.svg.png)

If we treat the neural network as a function with several inputs and outputs, we can denote the input values as a list of numbers **x** and the output values as a list **y**. The basic algorithm for calculating **y** is then:
1. Set the outputs of the input layer neurons to the values of **x**.
2. If the neural network has a hidden layer, compute the outputs of its neurons from the outputs of the previous layer. If there is more than one hidden layer, repeat this step for each of them in order.
3. Compute **y**, which is equal to the outputs of the output layer neurons.

This algorithm will be implemented in the second part of this lecture.
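Until then, the three steps can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation from the second part: the names `layer_output` and `feedforward` are hypothetical, the sigmoid is assumed as the activation function in every layer, and `weights[i][j]` is assumed to be the weight between neuron `i` of one layer and neuron `j` of the next.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def layer_output(inputs, weights):
    """Outputs of the next layer; weights[i][j] links neuron i of this layer to neuron j of the next."""
    n_next = len(weights[0])
    return [sigmoid(sum(inputs[i] * weights[i][j] for i in range(len(inputs))))
            for j in range(n_next)]

def feedforward(x, weight_tables):
    # step 1: the outputs of the input layer are the x values themselves
    values = list(x)
    # steps 2 and 3: propagate through every hidden layer, then the output layer
    for weights in weight_tables:
        values = layer_output(values, weights)
    return values

# illustrative 2-3-1 network with hand-picked weights
wa = [[0.1, -0.2, 0.3],
      [0.4, 0.0, -0.1]]
wb = [[0.5], [-0.5], [0.2]]
y = feedforward([1.0, 2.0], [wa, wb])
print(y)
```

Passing the weight tables as a list keeps the sketch independent of the number of hidden layers: each extra table is one more repetition of step 2.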

## Implementation of NN in Python

### Classes in Python

Although we are already familiar with objects like numbers and lists in Python, sometimes we need to create our own objects with the properties we want. This is possible with a special Python construct called a class. A class usually has the following structure:
```python
class Name_of_class:
    def __init__(self,initial_parameters):
        self.parameters=initial_parameters
        
    def function1(self,input_values):
        # do something with input_values (placeholder: pass them through unchanged)
        output_values = input_values
        return output_values
```
The most important elements are `self`, which refers to the object itself (an instance of the class), and the `__init__` function, which initializes a new object and its *attributes*.

If you still don't understand how this works, don't worry! We are now going to create a class called "Triangle", which creates a "Triangle" object from the coordinates of its vertices (these are its initial parameters). This "Triangle" can then report the lengths of its sides and its area, and plot itself:

In [None]:
class Triangle:
    def __init__(self,x1,y1,x2,y2,x3,y3):
        #writing coordinates
        self.x_coordinates=[x1,x2,x3]
        self.y_coordinates=[y1,y2,y3]        
        
    def length_of_sides(self):
        #calculating sides' lengths
        a=((self.x_coordinates[0]-self.x_coordinates[1])**2+(self.y_coordinates[0]-self.y_coordinates[1])**2)**(1/2)
        b=((self.x_coordinates[1]-self.x_coordinates[2])**2+(self.y_coordinates[1]-self.y_coordinates[2])**2)**(1/2)
        c=((self.x_coordinates[2]-self.x_coordinates[0])**2+(self.y_coordinates[2]-self.y_coordinates[0])**2)**(1/2)
        return a,b,c
    
    def area(self):
        #getting sides' length
        a,b,c=self.length_of_sides()
        #calculating area from Heron's formula
        p=(a+b+c)/2
        s=(p*(p-a)*(p-b)*(p-c))**(1/2)
        return s
    
    def plot(self):
        import matplotlib.pyplot as plt
        x_data=[self.x_coordinates[0],self.x_coordinates[1],self.x_coordinates[2],self.x_coordinates[0]]
        y_data=[self.y_coordinates[0],self.y_coordinates[1],self.y_coordinates[2],self.y_coordinates[0]]
        plt.plot(x_data,y_data)
        plt.show()

After declaring the class, we can create objects and call their methods:

In [None]:
t1=Triangle(0,0,1,2,3,0)
print(t1.length_of_sides())
print(t1.area())
t1.plot()

We can also access any attribute that was stored on `self`, for example:

In [None]:
print(t1.x_coordinates)

### Class "NeuralNet"

A neural network is a separate entity that is described by several factors (weights, architecture, and activation function) and serves several purposes (producing the output y-labels, and training, which will be discussed in the second part of this lecture), so it is reasonable to create a new class called "NeuralNet". This class will represent a feedforward neural network with one input layer, one hidden layer, and one output layer.

#### Weights

Weights represent the 'brain' of the neural net, because by changing them we can completely change the net's behavior. We can work out how to implement the weights in Python by looking at the picture below.
Each neuron in the input layer has as many outgoing connections as there are neurons in the hidden layer. This means that the total number of weights in the input-hidden connection is equal to the product of the numbers of neurons in the two layers (for the net in the picture, $3\cdot 4=12$). The most natural way to represent these weights is therefore a list of lists in Python, or a table (matrix) in mathematics. Let the first index of a weight be the index of an input layer neuron and the second one the index of a hidden layer neuron. For example, the first weight matrix of the neural net from the picture below can be written as:
$$
\begin{bmatrix}
    w_{11}       & w_{12} & w_{13} & w_{14} \\
    w_{21}       & w_{22} & w_{23} & w_{24} \\    
    w_{31}       & w_{32} & w_{33} & w_{34} \\  
\end{bmatrix}
$$
By analogy, in the second weight table let the first index of a weight be the index of a hidden layer neuron and the second one the index of an output layer neuron:
$$
\begin{bmatrix}
    w_{11}       & w_{12}\\
    w_{21}       & w_{22}\\    
    w_{31}       & w_{32}\\  
    w_{41}       & w_{42}\\  
\end{bmatrix}
$$

![Wikipedia](https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Colored_neural_network.svg/296px-Colored_neural_network.svg.png)

It is common to initialize weights as random values drawn from an interval $(-\epsilon,+\epsilon)$, where $\epsilon$ is a fairly small number (around 0.01-0.1). Using that knowledge, we can implement the weights in Python as follows:

In [None]:
import random
weights=[]
n_input=3
n_hidden=4
epsilon=0.1
for i in range(n_input):
    weights_row=[] #creating empty row of weights from one input neuron
    for j in range(n_hidden):
        weights_row.append(random.uniform(-epsilon,epsilon)) #appending random weights
    weights.append(weights_row) #appending row to a table of weights    

print(weights) #printing weights
for i in range(n_input): #printing weights in more table-like form
    print(weights[i])
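The same table can be built more compactly with a nested list comprehension. The sketch below wraps it in a hypothetical helper `init_weights` (not part of the original notebook) and checks that the table has `n_input` rows and that the total number of weights is the product of the two layer sizes.

```python
import random

def init_weights(n_rows, n_cols, epsilon):
    """Return an n_rows x n_cols table of random weights drawn from [-epsilon, epsilon]."""
    return [[random.uniform(-epsilon, epsilon) for _ in range(n_cols)]
            for _ in range(n_rows)]

n_input, n_hidden, epsilon = 3, 4, 0.1
weights = init_weights(n_input, n_hidden, epsilon)

# weights[i][j] is the weight between input neuron i and hidden neuron j (0-based indices)
print(weights[0][3])

# the total number of weights equals n_input * n_hidden
n_weights = sum(len(row) for row in weights)
print(n_weights)
```

Note that Python indices start at 0, so the weight written as $w_{14}$ in the matrix above corresponds to `weights[0][3]`.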

#### Activation function

There are many different activation functions for different purposes, but they all have one thing in common: they must be nonlinear (otherwise any stack of layers would collapse into a single linear transformation). Let's list the 3 most common ones with their derivatives and then plot their graphs.
1. Sigmoid function $$f(x)=\frac{1}{1+e^{-x}}$$ $$f'(x)=f(x)(1-f(x))$$
2. TanH (hyperbolic tangent) $$f(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}}$$ $$f'(x)=1-f^2(x)$$
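3. Rectified linear unit (ReLU); strictly speaking, its derivative is undefined at $x=0$, and the formula below simply assigns it the value 1 there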
$$
f(x)=
\begin{cases} 
      0 & x<0 \\
      x & x\geqslant0
\end{cases}
$$
$$
f'(x)=
\begin{cases} 
      0 & x<0 \\
      1 & x\geqslant0
\end{cases}
$$

In [None]:
import matplotlib.pyplot as plt
import numpy as np

x_data=np.arange(-5,5,0.01)
y1_data=1/(1+np.exp(-x_data))
y2_data=(np.exp(x_data)-np.exp(-x_data))/(np.exp(x_data)+np.exp(-x_data))
y3_data=x_data*(x_data>0)

fig=plt.figure(figsize=(12,4))
ax1=fig.add_subplot(131)
ax1.set_title('Sigmoid')
ax1.plot(x_data,y1_data)

ax2=fig.add_subplot(132)
ax2.set_title('TanH')
ax2.plot(x_data,y2_data)

ax3=fig.add_subplot(133)
ax3.set_title('ReLU')
ax3.plot(x_data,y3_data)

plt.show()
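The derivative formulas listed above can be checked numerically by comparing them with a central difference approximation. The sketch below is only a sanity check under stated assumptions: the helper `numeric_derivative`, the step size `h`, and the sample points are illustrative choices, and it uses the standard `math` module rather than NumPy.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def relu(x):
    return x if x > 0 else 0.0

def numeric_derivative(f, x, h=1e-6):
    # central difference approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

# (function, analytic derivative) pairs built from the formulas above
checks = {
    "sigmoid": (sigmoid, lambda x: sigmoid(x) * (1 - sigmoid(x))),
    "tanh": (math.tanh, lambda x: 1 - math.tanh(x) ** 2),
    "relu": (relu, lambda x: 1.0 if x >= 0 else 0.0),
}

points = [-2.0, -0.5, 0.5, 2.0]  # x = 0 is skipped because ReLU has a kink there
max_errors = {}
for name, (f, df) in checks.items():
    # largest gap between the numeric and the analytic derivative over the sample points
    max_errors[name] = max(abs(numeric_derivative(f, x) - df(x)) for x in points)
    print(name, max_errors[name])
```

The printed errors should be tiny compared with the derivative values themselves, which indicates that the analytic formulas and the numeric slopes agree.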

We are going to use the sigmoid function, which can be implemented using only the standard `math` module:

In [None]:
import math

def sigmoid(x):
    return 1/(1+math.exp(-x))

In [None]:
print(sigmoid(1))
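One caveat, offered as a side note rather than as part of the lecture: `math.exp(-x)` raises `OverflowError` when `-x` is very large (for example at `x = -1000`), so the function above fails on strongly negative inputs. A common workaround, sketched here under the hypothetical name `stable_sigmoid`, is to switch to an algebraically equivalent form for negative `x`.

```python
import math

def stable_sigmoid(x):
    if x >= 0:
        # exp(-x) <= 1 here, so it cannot overflow
        return 1 / (1 + math.exp(-x))
    # for x < 0 use the equivalent form e^x / (1 + e^x); exp(x) < 1 here
    e = math.exp(x)
    return e / (1 + e)

print(stable_sigmoid(1))
print(stable_sigmoid(-1000))
```

With the small weights and inputs used in this notebook the plain version is sufficient; the variant only matters when the weighted sum can become very negative.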

### Neural net's shell

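Now we are going to implement the shell of the neural net: the class stores the architecture, the weights, and the activation function, but does not yet calculate the output or train the weights: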

In [None]:
#math is needed by sigmoid, random by the weight initialization
import math
import random

class NeuralNet:
    def __init__(self,inp,hid,out,epsilon):
        #initial parameters are numbers of neurons in input (inp), hidden (hid), and output (out) layers 
        #plus epsilon value for weight initialization
        #firstly, we write down numbers of neurons in each layer 
        self.inp=inp
        self.hid=hid
        self.out=out
        #secondly, we create weights
        #wa - weights of connection between input and hidden layers
        #wb - weights of connection between hidden and output layers
        self.wa=[]
        self.wb=[]
        
        for i in range(self.inp):
            weights_row=[] 
            for j in range(self.hid):
                weights_row.append(random.uniform(-epsilon,epsilon)) 
            self.wa.append(weights_row)  
            
        for i in range(self.hid):
            weights_row=[] 
            for j in range(self.out):
                weights_row.append(random.uniform(-epsilon,epsilon)) 
            self.wb.append(weights_row) 
            
    #define activation function
    def sigmoid(self,x):
        return 1/(1+math.exp(-x))

In [None]:
nn=NeuralNet(3,4,2,0.1) #create the neural net from the picture: 3 input, 4 hidden, 2 output neurons
print(nn.wa)