# Artificial Neural Network (ANN) - Part I

## 1. ANN Definition

Artificial Neural Networks (ANNs) are models that try to mimic some behaviors of the human brain, specifically how it acquires and stores knowledge. Through a process called training, an ANN can extract relationships and patterns from a set of historical samples and generalize them to samples it has not seen before. For example, one can train an ANN to model the patterns that characterize a chronic disease from a set of past medical exams. Once the model is trained, it can predict the chance that a new exam comes from an ill patient. Besides classification, ANNs have been successfully applied to several other tasks, such as regression, clustering, and time-series forecasting. More recently, ANNs have evolved into Deep Learning, the technology behind applications such as self-driving cars and image classification.

## 1. Perceptron

The Perceptron is the cornerstone for anyone who wants to go deeper into Artificial Neural Networks and Deep Learning. It is classified as a single-layer feedforward network and is based on the McCulloch & Pitts (MCP) neuron model (1943). Figure 1 depicts the MCP neuron:

<img src="images/perceptron01.png" width="55%" title="MCP Neuron">

The MCP neuron consists of the following main parts:

**a) Input**: $[x_1, x_2,..., x_n]$

Signals or values coming from the environment (the application). For example, in an application that predicts house prices, $x_1$ could be the size of the house in $m^2$, while $x_2$ could be the number of bedrooms.

**b) Weights**: $[w_1, w_2,..., w_n]$

Values that weight the importance of each input variable to the neuron output. For example, in the house price application, $w_1$ and $w_2$ would express how significant the size and the number of bedrooms (respectively) are for predicting the price.

**c) Linear combination**: $\sum$

Sums the weighted inputs.

**d) Bias (activation threshold)**: $-\theta$

The threshold that the linear combination must reach for the activation potential to become non-negative.

**e) Activation potential**: $u$

The difference between the linear combination and the threshold $\theta$.

**f) Activation function**: $g(u)$

This function maps the activation potential to an output within a range that is appropriate for the application.

**g) Output**: $y$

The final value generated by the neuron.
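
The parts listed above can be tied together in a few lines. The following is a minimal sketch, assuming a bipolar step function as the activation $g(u)$; the function name `mcp_neuron` and the numeric values are illustrative only and are not part of the original material.

```python
import numpy as np

def mcp_neuron(x, w, theta):
    """Forward pass of a single MCP neuron (illustrative helper)."""
    linear_combination = np.dot(w, x)   # c) weighted sum of the inputs
    u = linear_combination - theta      # e) activation potential
    y = 1 if u >= 0 else -1             # f) bipolar step activation (assumed)
    return u, y                         # g) y is the neuron output

x = np.array([2.0, 1.0])    # a) inputs (arbitrary values)
w = np.array([0.5, -1.0])   # b) weights (arbitrary values)
theta = 0.5                 # d) threshold (arbitrary value)

u, y = mcp_neuron(x, w, theta)
print(u, y)
```

Here the weighted sum is $0$, which is below the threshold $\theta = 0.5$, so the activation potential is negative and the neuron outputs $-1$.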

## 2. Perceptron code

First, we have to compute the activation potential, which is given by $u = \sum_{i=1}^n x_i \cdot w_i - \theta$.

Since the inputs and weights can be written as vectors, we can simply compute their product:

$\Sigma = \sum_{i=1}^n x_i \cdot w_i = [w_1, w_2, ..., w_n] \cdot [x_1, x_2, ..., x_n]^T$

After that, we subtract $\theta$ from $\Sigma$:

$u = \Sigma - \theta$

To simplify, we can prepend $-1$ to the input vector, $[-1, x_1, x_2,..., x_n]$, and $\theta$ to the weight vector, $[\theta, w_1, w_2,..., w_n]$, that is, $x_0 = -1$ and $w_0 = \theta$. In the end, we have:

$u = \sum_{i=0}^n x_i \cdot w_i$
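
As a quick sanity check, the sketch below compares the explicit form $u = \sum_{i=1}^n x_i \cdot w_i - \theta$ with the augmented form that starts at $i = 0$. The numeric values and variable names are arbitrary and chosen only for illustration.

```python
import numpy as np

x = np.array([0.2, 0.7])     # inputs (arbitrary values)
w = np.array([0.5, -0.3])    # weights (arbitrary values)
theta = 0.1                  # threshold (arbitrary value)

# Explicit form: weighted sum minus the threshold
u_explicit = np.dot(x, w) - theta

# Augmented form: x_0 = -1 and w_0 = theta
x_aug = np.append(-1, x)
w_aug = np.append(theta, w)
u_augmented = np.dot(x_aug, w_aug)

print(np.isclose(u_explicit, u_augmented))
```

Both forms give the same activation potential, which is why the rest of the code only needs a single dot product.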

### 2.1. Numpy library

In Python, the NumPy library gives us a powerful set of tools for linear algebra operations. For example, consider the input $x = [0.1, 0.9, 0.5]$, the weights $[0.4, 0.3, 0.2]$, and $\theta = 1.5$. The basic operation of our neuron then becomes simple:

In [64]:
import numpy as np

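# x_0 = -1 is prepended to the inputs and theta = 1.5 to the weights
x = np.array([-1, 0.1, 0.9, 0.5])
weights = np.array([1.5, 0.4, 0.3, 0.2])

# Activation potential: dot product of the augmented vectors
u = np.matmul(x, weights)

print(round(u, 2))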

-1.09
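
The same call also works for several samples at once. The sketch below is an illustration under assumptions: the second row of `X` is a made-up sample, and stacking samples as rows is a layout choice made here, not something prescribed by the text.

```python
import numpy as np

# Each row is one sample with x_0 = -1 prepended (second row is made up)
X = np.array([[-1, 0.1, 0.9, 0.5],
              [-1, 0.8, 0.2, 0.4]])
weights = np.array([1.5, 0.4, 0.3, 0.2])

# Matrix-vector product: one activation potential per row of X
u = np.matmul(X, weights)

print(np.round(u, 2))
```

With a 2-D array on the left and a 1-D array on the right, `np.matmul` returns a 1-D array with one value per sample.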


### 2.2. Version 1 of our Perceptron class

In [65]:
import numpy as np

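class Perceptron:

    def __init__(self, input_size):
        # One extra weight holds the threshold theta (paired with x_0 = -1)
        self.weights = np.random.normal(size=input_size+1)
        self.inputs = []

    def bi_step(self, u):
        # Bipolar step activation: 1 if u >= 0, otherwise -1
        return 1 if u >= 0 else -1

    def output(self, inputs):
        # Prepend x_0 = -1 so the threshold is handled by the dot product
        inputs = np.append(-1, inputs)
        u = np.matmul(self.weights, inputs)
        return self.bi_step(u)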



sample = np.array([0.1, 0.9, 0.5])
perceptron = Perceptron(3)
y = perceptron.output(sample)
print(y)



1


### 2.3. Training our model

As mentioned earlier, training a model requires an algorithm that adjusts the weights based on a set of samples. In this tutorial, we will cover a basic algorithm called the Hebb rule.
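
A single weight update, in the form used by the complete code in the next section, can be sketched as follows. The names `eta` (learning rate) and `d` (desired output) and all numeric values are assumptions made for illustration.

```python
import numpy as np

eta = 0.05                             # learning rate (illustrative)
weights = np.array([0.5, 0.2, -0.4])   # [theta, w_1, w_2] (arbitrary values)
sample = np.array([-1, 0.3, 0.7])      # input with x_0 = -1 prepended
d = 1                                  # desired output for this sample

# Current prediction with the bipolar step activation
u = np.dot(sample, weights)
y = 1 if u >= 0 else -1

# Adjust the weights only when the prediction is wrong
if y != d:
    weights = weights + eta * (d - y) * sample

print(y, weights)
```

In this example the neuron predicts $-1$ while the desired output is $1$, so each weight is shifted by $\eta \cdot (d - y) \cdot x_i$, which raises the activation potential for this sample on the next pass.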

### 2.4. The complete code

In [66]:
import numpy as np

class Perceptron:
    
    def __init__(self, input_size, learning_rate = 0.05):
        self.input_size = input_size
        self.weights = np.random.normal(size=self.input_size+1)
        self.inputs = []
        self.epoch = 0
        self.learning_rate = learning_rate
        self.samples = np.array([])
        self.outputs = np.array([])
    
    def bi_step(self, u):
        return 1 if u >= 0 else -1
    
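    def train(self, max_epochs=10000):
        if len(self.samples) == 0 or len(self.outputs) == 0:
            raise ValueError('You must provide a dataset for training')

        # Restart from random weights; index 0 holds the threshold theta
        self.weights = np.random.normal(size=self.input_size+1)
        error = True
        self.epoch = 0

        # Repeat until an epoch has no misclassified sample. The max_epochs
        # guard stops the loop if the data are not linearly separable.
        while error and self.epoch < max_epochs:
            error = False

            for i, sample in enumerate(self.samples):
                # Prepend x_0 = -1 so the threshold is part of the dot product
                sample = np.append(-1, sample)
                u = np.matmul(sample, self.weights)
                y = self.bi_step(u)

                if y != self.outputs[i]:
                    error = True
                    # Move the weights toward the desired output
                    self.weights = self.weights + self.learning_rate * (self.outputs[i] - y) * sample

            self.epoch += 1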
    
    def output(self, inputs):
        inputs = np.append(-1, inputs)
        u = np.matmul(self.weights, inputs)
        return self.bi_step(u)

    
    
samples = np.array([[0.1, 0.4], 
                    [0.3, 0.7],
                    [0.6, 0.9],
                    [0.5, 0.7]])   

outputs = np.array([1, -1, -1, 1])

perceptron = Perceptron(len(samples[0]))

perceptron.samples = samples
perceptron.outputs = outputs

perceptron.train()
print(perceptron.weights)

y = perceptron.output(np.array([0.3, 0.7]))
print(y)
    

[-0.42958134  1.11889307 -1.24308107]
-1
