# Perceptron
The perceptron consists of an input layer with $p$ neurons (or units), each one associated with an input variable.
These neurons transfer their input values to the next layer.
In addition to these $p$ neurons, we add a bias unit whose output is always 1.
Any vector $\vec{x} = (x_1, x_2, \ldots, x_p)$ is replaced by $\vec{x} = (1, x_1, x_2, \ldots, x_p)$.

The first and only layer of the perceptron consists of a single neuron to which all the units of the input layer are connected. This neuron calculates the linear combination $o(\vec{x})=w_0+\sum_{j=1}^{p} w_jx_j$ of the input signals $x_1, x_2, \ldots, x_p$. It then applies an activation function $a$ to this linear combination and transmits the result. This output implements the decision function of the perceptron.
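
The bias unit means that $w_0$ can be treated as one more weight, attached to a constant input of 1. A minimal sketch of this view follows; the helper name `lin_comb_augmented` and the packing of all weights into one list are assumptions made for illustration.

```python
def lin_comb_augmented(x, weights):
    """Linear combination using the augmented input (1, x1, ..., xp).

    weights = [w0, w1, ..., wp] (hypothetical packing, bias weight first).
    """
    x_aug = [1] + list(x)  # prepend the constant output of the bias unit
    return sum(wj * xj for wj, xj in zip(weights, x_aug))

x = [1, 1, 1]
weights = [4, 2, 3, 4]  # w0 = 4, then w1..w3
print(lin_comb_augmented(x, weights))  # 4*1 + 2*1 + 3*1 + 4*1 = 13
```

Written this way, the bias needs no special case: it is simply the weight of the first component of the augmented vector.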

- ### Linear combination
$o(\vec{x})=w_0+\sum_{j=1}^{p} w_jx_j$ of the input signals $x_1, x_2, \ldots, x_p$

In [4]:
# linear combination: o(x) = w0 + sum_j w_j * x_j
def lin_comb(x, w, w0):
    total = 0  # named "total" to avoid shadowing the built-in sum()
    for i in range(len(x)):
        total += x[i] * w[i]
    return w0 + total

x = [1,1,1]
w = [2,3,4]
w0 = 4

print(f"o(x) = {lin_comb(x,w,w0)}")


o(x) = 13


### A - Binary Classification

- #### Threshold activation function or step function
    $f:\vec{x} \mapsto \begin{cases} 1, & \text{if } o(\vec{x}) > 0 \\ 0, & \text{otherwise} \end{cases}$

In [5]:
# activation functions
import math

# binary classification

# threshold activation function: 1 if o(x) > 0, else 0
def a_threshold(ox):
    return 0 if ox <= 0 else 1


- ### Logistic activation function 

    $f:\vec{x} \mapsto \frac{1}{1 + e^{-o(\vec{x})}}$
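
Evaluating this formula directly with `math.exp(-ox)` raises an `OverflowError` when `ox` is a large negative number such as -1000. One possible workaround, sketched below, is to pick an algebraically equivalent form depending on the sign of `ox`; the name `a_logistic_stable` is hypothetical and not part of the notebook.

```python
import math

def a_logistic_stable(ox):
    """Logistic function evaluated without overflowing math.exp."""
    if ox >= 0:
        # exp(-ox) <= 1 here, so it cannot overflow
        return 1.0 / (1.0 + math.exp(-ox))
    # for ox < 0, use the equivalent form e^ox / (1 + e^ox), where exp(ox) < 1
    e = math.exp(ox)
    return e / (1.0 + e)

print(a_logistic_stable(0.0))      # 0.5
print(a_logistic_stable(-1000.0))  # 0.0 (exp underflows to 0, no exception)
```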

In [6]:

#logistic activation function
def a_logistic(ox):
    return 1/(1+math.exp(-ox))

- ### A Perceptron
$f(\vec{x}) = a(o(\vec{x})) = a \left(w_0+\sum_{j=1}^{p} w_jx_j \right)$

In [11]:
# perceptron
# w: connection weight vector
# w0: bias weight
# a: activation function
class Neuron:
    def __init__(self, init_con_weight, init_bias_weight, a):
        self.w = init_con_weight
        self.w0 = init_bias_weight
        self.a = a

    def output(self, x):
        # x: input vector; use this neuron's own weights rather than globals
        ox = lin_comb(x, self.w, self.w0)
        print(f"ox = {ox}")
        return self.a(ox)


In [15]:
#test
input_values = [2,5,8]
w = [-1.4, -4.5, 3]
w0 = 2

bin_perc_th = Neuron(w, w0, a_threshold)
print(f"bin_perc_th = {bin_perc_th.output(input_values)}\n")

bin_perc_log = Neuron(w, w0, a_logistic)
# logistic output rounded to 4 decimals
print(f"bin_perc_log = {bin_perc_log.output(input_values):.4f}\n")


ox = 0.6999999999999993
bin_perc_th = 1

ox = 0.6999999999999993
bin_perc_log = 0.6682



### B - Multi-class classification

For multi-class classification, the architecture of the perceptron consists of $C$ neurons in the output layer, where $C$ is the number of classes. Each of the $p+1$ neurons in the input layer is connected to each of the output neurons. Therefore, we have $(p+1)C$ connection weights, denoted $w^c_j$, where $c$ is the class index and $j$ is the input neuron index.
For this perceptron, we use the softmax function, also known as the normalized exponential function, as the activation function.
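
To make the weight count concrete, here is a small sketch that stores the weights as one row per class and computes the $C$ linear combinations. The row layout `[w0, w1, ..., wp]`, the name `class_scores`, and the numeric values are illustrative assumptions.

```python
# Hypothetical layout: one row per class c, holding [w0^c, w1^c, ..., wp^c].
W = [
    [2.0, -1.4, -4.5, 3.0],
    [4.3, -1.8, -5.5, 4.9],
    [3.0, -5.2, -9.5, 2.7],
]
C = len(W)                               # number of classes
p = len(W[0]) - 1                        # number of input variables
n_weights = sum(len(row) for row in W)   # equals (p + 1) * C

def class_scores(x, W):
    """Return the linear combination o_c for every class c."""
    x_aug = [1.0] + list(x)              # add the bias input
    return [sum(wj * xj for wj, xj in zip(row, x_aug)) for row in W]

scores = class_scores([2, 5, 8], W)
print(n_weights, scores)
```

With $p = 3$ and $C = 3$, this gives $(3+1) \times 3 = 12$ weights.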

- #### Softmax activation function 

The output of the $c$-th neuron of the softmax layer is given by:

$\sigma(o_1, o_2, \dots, o_C )_c = \frac{e^{o_c}}{\sum_{k=1}^{C} e^{o_k}}$
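
A direct evaluation of the exponentials can overflow when a linear combination is large. A common remedy, sketched here as one possible approach, subtracts the largest value before exponentiating, which leaves the result unchanged because the factor $e^{-\max}$ cancels between numerator and denominator. The function name `softmax` and the input values are illustrative.

```python
import math

def softmax(o):
    """Softmax of a list of linear combinations o_1..o_C, shifted for stability."""
    m = max(o)
    exps = [math.exp(ok - m) for ok in o]  # every exponent is <= 0, so no overflow
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([0.7, 12.4, -33.3])  # illustrative values
print(probs)
print(softmax([1000, 1000]))  # [0.5, 0.5]; a naive math.exp(1000) would overflow
```

The outputs are non-negative and sum to 1, so they can be read as class probabilities.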


In [59]:
# Multi-class classification perceptron
# w: connection weight vector
# w0: bias weight
class Neuron_mc:
    def __init__(self, init_con_weight, init_bias_weight):
        self.w = init_con_weight
        self.w0 = init_bias_weight
    def __str__(self):
        info = f"w0 = {self.w0}\n"
        j = 1
        for wj in self.w:
            info += f"w{j} = {wj}\n"
            j += 1
        return info     

class Layer:
    def __init__(self, *args):
        self.neurons = []
        for neuron in args:
            self.neurons.append(neuron)
        
    def update_softmax(self, x):
        self.oc_list = [lin_comb(x, neuron.w, neuron.w0) for neuron in self.neurons]
        self.exp_oc_list = list(map(lambda ok : math.exp(ok), self.oc_list))
    
    def output(self, x):
        self.update_softmax(x)
        sum_exp_ok = sum(self.exp_oc_list)
        return [ok/sum_exp_ok for ok in self.exp_oc_list]
        
    def __str__(self):
        info = ""
        j = 1
        for neuron in self.neurons:
            info += f"\nNeurone{j}\n{str(neuron)}"
            j += 1
        return info
            


In [62]:
#test
input_values = [2,5,8]
w = [-1.4, -4.5, 3]
w0 = 2

perc1 = Neuron_mc(w, w0)
perc2 = Neuron_mc([-1.8, -5.5, 4.9], 4.3)
perc3 = Neuron_mc([-5.2, -9.5, 2.7], 3)

layer = Layer(perc1, perc2, perc3)
print(layer)

print(layer.output(input_values))

    


Neurone1
w0 = 2
w1 = -1.4
w2 = -4.5
w3 = 3

Neurone2
w0 = 4.3
w1 = -1.8
w2 = -5.5
w3 = 4.9

Neurone3
w0 = 3
w1 = -5.2
w2 = -9.5
w3 = 2.7

[8.293750373891576e-06, 0.9999917062496261, 1.4214728694917544e-20]


### Batch learning vs Online learning

- Batch learning: the learning algorithm is run on a single, fixed dataset of $n$ examples that are all available at once.
- Online learning: the learning algorithm performs one or more update operations each time a new observation arrives.
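
The distinction can be illustrated on a deliberately simple toy problem, estimating a mean, which stands in for a real learning algorithm. The names `batch_mean` and `OnlineMean` are hypothetical.

```python
def batch_mean(data):
    """Batch: needs the whole dataset before computing anything."""
    return sum(data) / len(data)

class OnlineMean:
    """Online: keeps a running estimate, updated once per new observation."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n  # incremental mean update
        return self.mean

data = [2.0, 4.0, 6.0, 8.0]
online = OnlineMean()
for x in data:          # observations arrive one at a time
    online.update(x)

print(batch_mean(data), online.mean)  # 5.0 5.0
```

Both reach the same estimate here, but the online version never stores the past observations.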

### Training
To train a perceptron, we aim to minimize the empirical risk, i.e. the average loss of the perceptron's predictions over the training dataset.

We suppose that the observations $(\vec x^i, y^i )$ are not available simultaneously but are observed sequentially.
In this case we use online learning.
To minimize the empirical risk iteratively, we use the gradient descent algorithm.
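
A minimal sketch of such an iterative scheme for the binary perceptron with logistic activation, updating the weights after each observation. It assumes the cross-entropy loss (the text above does not fix a loss function), a hypothetical learning rate `eta`, and a toy data stream (logical OR). The names `sigmoid`, `gradient_step` and `stream` are illustrative.

```python
import math

def sigmoid(o):
    """Logistic activation, written to avoid overflow in math.exp."""
    if o >= 0:
        return 1.0 / (1.0 + math.exp(-o))
    e = math.exp(o)
    return e / (1.0 + e)

def gradient_step(w, w0, x, y, eta=0.5):
    """One gradient step on the cross-entropy loss of a single pair (x, y)."""
    o = w0 + sum(wj * xj for wj, xj in zip(w, x))
    err = sigmoid(o) - y  # derivative of the loss with respect to o
    new_w = [wj - eta * err * xj for wj, xj in zip(w, x)]
    new_w0 = w0 - eta * err  # the bias input is the constant 1
    return new_w, new_w0

# Assumed toy stream: logical OR of two binary inputs.
stream = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

w, w0 = [0.0, 0.0], 0.0
for _ in range(200):  # replay the stream several times
    for x, y in stream:
        w, w0 = gradient_step(w, w0, x, y)

predictions = [
    1 if sigmoid(w0 + sum(wj * xj for wj, xj in zip(w, x))) > 0.5 else 0
    for x, _ in stream
]
print(predictions)
```

Each step moves the weights against the gradient of the loss on the current observation only, so no observation needs to be stored once it has been processed.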