# The Perceptron

The perceptron was invented by Frank Rosenblatt in 1958. It builds on the artificial neuron proposed by Warren McCulloch and Walter Pitts in 1943. The perceptron is a mathematical model of a single neuron that uses the step function as its activation function. It is a binary classification algorithm whose predictions come from a linear predictor function that combines a vector of weights with the feature vector.

The goal of the perceptron learning algorithm is to find a weight vector $w$ that classifies every training input correctly, which is possible only when the data is linearly separable. The algorithm starts with random weights $(w_1, w_2, \dots, w_n)$ and incrementally adjusts them so that the decision boundary shifts until misclassified points fall on the correct side. It stops when all training examples are correctly classified; at that point a hyperplane divides the $n$-dimensional input space into the two classes.
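The update loop described above can be sketched as follows. This is a minimal illustration, not the notebook's own code: the function name `train_perceptron`, the `learning_rate`, `max_epochs`, and `seed` parameters, and the choice of the OR truth table as training data are all assumptions made for the example.

```python
import numpy as np

def train_perceptron(X, y, learning_rate=1.0, max_epochs=100, seed=0):
    """Perceptron learning rule with a step activation (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    weights = rng.uniform(-1.0, 1.0, size=X.shape[1])  # random initial weights
    bias = 0.0

    for _ in range(max_epochs):
        errors = 0
        for x_i, target in zip(X, y):
            prediction = 1 if np.dot(weights, x_i) + bias > 0 else 0
            update = learning_rate * (target - prediction)
            if update != 0:
                # shift the boundary toward classifying x_i correctly
                weights += update * x_i
                bias += update
                errors += 1
        if errors == 0:  # every training example is classified correctly
            break
    return weights, bias

# assumed training set: the OR truth table, which is linearly separable
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 1]

weights, bias = train_perceptron(X, y)
predictions = [1 if np.dot(weights, x) + bias > 0 else 0 for x in X]
print(predictions)
```

The loop only terminates early when the data is linearly separable; for data such as XOR it would run until `max_epochs` without reaching zero errors.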

The figure below shows a neural network with two inputs $x_1$, $x_2$. The inputs $x_1$, $x_2$ are connected to the output by weighted edges $w_1$, $w_2$. The output unit multiplies the inputs by their weights and adds the bias, represented by $w_0$.


The activation function is a threshold applied to the weighted sum computed by the output unit. The step function is one of the simplest activation functions: $g(x) = 1$ if $x > 0$, else $0$. The output is therefore $1$ when the weighted sum is greater than $0$, and $0$ otherwise.
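Written out for the two-input case, with $z$ introduced here as a name for the weighted sum, the output $y$ of the unit is:

$$
z = w_1 x_1 + w_2 x_2 + w_0, \qquad
y = g(z) =
\begin{cases}
1 & \text{if } z > 0 \\
0 & \text{otherwise}
\end{cases}
$$

As an illustration with assumed values $w_1 = 2$, $w_2 = 2$, $w_0 = -1$ and input $(x_1, x_2) = (0, 1)$: $z = 2 \cdot 0 + 2 \cdot 1 - 1 = 1 > 0$, so $y = 1$.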

![alt text](perceptrons/Perceptron.png)


In [239]:
import numpy as np

def perceptron(inputs, weights, bias):
    return step(np.dot(np.array(weights), np.array(inputs)) + bias)

#step function

def step(x):
    return 1 if x > 0 else 0

# evaluation: run the classifier on every input, print each prediction and the accuracy

def evaluate(perceptron, inputs, labels, function):
    # `perceptron` is the classifier under test; `function` is the name printed as a heading
    print(f"{function}\n")

    predictions = len(inputs)
    correct_predictions = 0

    for i in range(predictions):
        prediction = perceptron(inputs[i])

        if labels[i] == prediction:
            correct_predictions += 1

        print(f"[{inputs[i][0]}, {inputs[i][1]}] = {prediction}")

    print(f"\nAccuracy {(correct_predictions/predictions) * 100}% ")

## OR perceptron

The function $g(x) = 2x_1 + 2x_2 - 1$ defines the decision boundary for the OR perceptron. When $x_1$ and $x_2$ are both $0$, the weighted sum is $-1$ and the perceptron outputs $0$. When $x_1$ or $x_2$ is $1$, the weighted sum is greater than $0$ and the perceptron outputs $1$.

![alt text](perceptrons/OR.png)

In [240]:
# or perceptron

def or_perceptron(x):
    weights = [2,2]
    bias = -1

    return perceptron(x, weights, bias)

inputs = [[0,0], [0,1], [1,0], [1,1]]
labels = [0,1,1,1]

evaluate(or_perceptron, inputs, labels, "OR perceptron")

OR perceptron

[0, 0] = 0
[0, 1] = 1
[1, 0] = 1
[1, 1] = 1

Accuracy 100.0% 


## NOR perceptron

The function $g(x) = -2x_1 - 2x_2 + 1$ defines the decision boundary for the NOR perceptron. When $x_1$ and $x_2$ are both $0$, the weighted sum is $1$ and the perceptron outputs $1$. When $x_1$ or $x_2$ is $1$, the weighted sum is less than $0$ and the perceptron outputs $0$.

![alt text](perceptrons/NOR.png)

In [241]:
#nor perceptron

def nor_perceptron(x):
    weights = [-2,-2]
    bias = 1

    return perceptron(x, weights, bias)

inputs = [[0,0], [0,1], [1,0], [1,1]]
labels = [1,0,0,0]

evaluate(nor_perceptron, inputs, labels, "NOR perceptron")

NOR perceptron

[0, 0] = 1
[0, 1] = 0
[1, 0] = 0
[1, 1] = 0

Accuracy 100.0% 



## AND perceptron

The function $g(x) = 2x_1 + 2x_2 - 3$ defines the decision boundary for the AND perceptron. When $x_1$ or $x_2$ is $0$, the weighted sum is less than $0$ and the perceptron outputs $0$. When $x_1$ and $x_2$ are both $1$, the weighted sum is $1$ and the perceptron outputs $1$.

![alt text](perceptrons/AND.png)

In [242]:
#AND perceptron

def and_perceptron(x):
    weights = [2,2]
    bias = -3

    return perceptron(x, weights, bias)

inputs = [[0,0], [0,1], [1,0], [1,1]]
labels = [0,0,0,1]

evaluate(and_perceptron, inputs, labels, "AND perceptron")

AND perceptron

[0, 0] = 0
[0, 1] = 0
[1, 0] = 0
[1, 1] = 1

Accuracy 100.0% 


## NAND perceptron

The function $g(x) = -2x_1 - 2x_2 + 3$ defines the decision boundary for the NAND perceptron. When $x_1$ or $x_2$ is $0$, the weighted sum is greater than $0$ and the perceptron outputs $1$. When $x_1$ and $x_2$ are both $1$, the weighted sum is $-1$ and the perceptron outputs $0$.

![alt text](perceptrons/NAND.png)

In [243]:
#NAND perceptron

def nand_perceptron(x):
    weights = [-2,-2]
    bias = 3

    return perceptron(x, weights, bias)

inputs = [[0,0], [0,1], [1,0], [1,1]]
labels = [1,1,1,0]

evaluate(nand_perceptron, inputs, labels, "NAND perceptron")

NAND perceptron

[0, 0] = 1
[0, 1] = 1
[1, 0] = 1
[1, 1] = 0

Accuracy 100.0% 


## XOR perceptron

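The XOR function is not linearly separable: no single straight line separates the inputs that map to $1$ from those that map to $0$. A single-layer perceptron therefore cannot learn a decision boundary that classifies all four inputs.

![alt text](perceptrons/XOR.png)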
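One way to see why no single perceptron works, sketched with weights $w_1$, $w_2$, bias $w_0$, and the step activation: a perceptron computing XOR would have to satisfy all four constraints

$$
\begin{aligned}
(0,0) \mapsto 0 &: \quad w_0 \le 0 \\
(0,1) \mapsto 1 &: \quad w_2 + w_0 > 0 \\
(1,0) \mapsto 1 &: \quad w_1 + w_0 > 0 \\
(1,1) \mapsto 0 &: \quad w_1 + w_2 + w_0 \le 0
\end{aligned}
$$

Adding the two middle inequalities gives $w_1 + w_2 + 2w_0 > 0$, so $w_1 + w_2 + w_0 > -w_0 \ge 0$, which contradicts the last constraint.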

### XOR perceptron implemented with OR, NAND, AND

The XOR function is defined as $x_1 \oplus x_2 = (x_1 \land \neg x_2) \lor (\neg x_1 \land x_2)$. Applying the distributive law and De Morgan's laws, it can be rewritten as $x_1 \oplus x_2 = (x_1 \lor x_2) \land \neg(x_1 \land x_2)$, that is, the AND of an OR and a NAND. Combining the OR, NAND, and AND perceptrons in this way shows how perceptrons can be composed to compute functions that are not linearly separable.
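The equivalence of the two forms can be checked by enumerating the truth table. The snippet below is a small sketch using plain Python booleans; the function names are chosen here for illustration only.

```python
from itertools import product

def xor_sum_of_products(x1, x2):
    # (x1 AND NOT x2) OR (NOT x1 AND x2)
    return (x1 and not x2) or (not x1 and x2)

def xor_or_and_nand(x1, x2):
    # (x1 OR x2) AND NOT (x1 AND x2)
    return (x1 or x2) and not (x1 and x2)

rows = []
for x1, x2 in product([False, True], repeat=2):
    rows.append((int(x1), int(x2),
                 int(xor_sum_of_products(x1, x2)),
                 int(xor_or_and_nand(x1, x2))))

# the two forms agree on every row of the truth table
forms_agree = all(a == b for _, _, a, b in rows)

for row in rows:
    print(row)
print(forms_agree)
```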


In [244]:
def xor_perceptron_or_nand_and(x):
    return and_perceptron([or_perceptron(x), nand_perceptron(x)])

inputs= [[0,0], [0,1], [1,0], [1,1]]
labels = [0,1,1,0]

evaluate(xor_perceptron_or_nand_and, inputs, labels, "XOR perceptron implemented with OR, NAND, and AND perceptrons")


XOR perceptron implemented with OR, NAND, and AND perceptrons

[0, 0] = 0
[0, 1] = 1
[1, 0] = 1
[1, 1] = 0

Accuracy 100.0% 



### XOR perceptron implemented with NAND perceptrons

The NAND gate is universal: any Boolean function can be built from NAND gates alone. In particular, an XOR gate can be implemented with NAND gates.

A network of NAND perceptrons can therefore model any logic circuit, from the XOR gate to adders and more complex circuits.

![alt text](<perceptrons/XOR implemented with nand gates.png>)
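As a rough illustration of universality, the sketch below builds NOT, AND, OR, XOR, and a half adder from a single NAND unit. The `nand` function reuses the NAND perceptron's weights and bias from this notebook; the other function names and the half-adder example are assumptions added here.

```python
def nand(a, b):
    # NAND perceptron: weights [-2, -2], bias 3, step activation
    return 1 if -2 * a - 2 * b + 3 > 0 else 0

def not_gate(a):
    return nand(a, a)

def and_gate(a, b):
    return not_gate(nand(a, b))

def or_gate(a, b):
    # De Morgan: a OR b = NOT(NOT a AND NOT b)
    return nand(not_gate(a), not_gate(b))

def xor_gate(a, b):
    # (a OR b) AND NAND(a, b), with every gate built from NAND
    return and_gate(or_gate(a, b), nand(a, b))

def half_adder(a, b):
    # returns (sum bit, carry bit)
    return xor_gate(a, b), and_gate(a, b)

half_adder_table = {(a, b): half_adder(a, b) for a in (0, 1) for b in (0, 1)}
for bits, result in half_adder_table.items():
    print(bits, result)
```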

In [245]:
def xor_perceptron_nand(x):
    # four-gate NAND construction of XOR
    n1 = nand_perceptron(x)             # NAND(x1, x2)
    n2 = nand_perceptron([x[0], n1])    # NAND(x1, n1)
    n3 = nand_perceptron([x[1], n1])    # NAND(x2, n1)
    return nand_perceptron([n2, n3])

inputs = [[0,0], [0,1], [1,0], [1,1]]
labels = [0,1,1,0]

evaluate(xor_perceptron_nand, inputs, labels, "XOR perceptron implemented with NAND perceptrons")


XOR perceptron implemented with NAND perceptrons

[0, 0] = 0
[0, 1] = 1
[1, 0] = 1
[1, 1] = 0

Accuracy 100.0% 


## Non-Linear Classifier with multiple layers

A multilayer perceptron can classify data that is not linearly separable. A multilayer perceptron is a simple neural network.

![alt text](<two-layer-perceptions/Network of perceptrons I-A.png>)

A two-layer perceptron with three neurons can classify the data above by defining two decision boundaries. The region where $y = 1$ is given by the intersection of the two inequalities.

![alt text](<two-layer-perceptions/Network of perceptrons I-B.png>)

The output layer is an AND perceptron whose inputs are the outputs of the two perceptrons that define the decision boundaries.

![alt text](<two-layer-perceptions/Network of perceptrons I-C.png>)
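The same network can be written as two matrix operations. The sketch below is a vectorized restatement, assuming the hidden-layer weights and biases used in the exercise 1 code cell (`[4, -6]` with bias `3`, and `[-1, -1]` with bias `2`) and the AND perceptron's weights for the output layer; the names `W_hidden`, `b_hidden`, `w_out`, `b_out`, and `two_layer_forward` are introduced here for illustration.

```python
import numpy as np

def step(z):
    # element-wise step activation: 1 where z > 0, else 0
    return (np.asarray(z) > 0).astype(int)

# hidden layer: one row of weights per decision boundary
W_hidden = np.array([[4.0, -6.0],
                     [-1.0, -1.0]])
b_hidden = np.array([3.0, 2.0])

# output layer: AND perceptron (weights [2, 2], bias -3)
w_out = np.array([2.0, 2.0])
b_out = -3.0

def two_layer_forward(points):
    hidden = step(points @ W_hidden.T + b_hidden)  # shape (n_points, 2)
    return step(hidden @ w_out + b_out)            # shape (n_points,)

points = np.array([[0.0, 0.0], [0.5, 0.75], [1.0, 1.0], [1.5, 0.5]])
outputs = two_layer_forward(points)
print(outputs)
```

A point is labelled $1$ only when both hidden units fire, which is what makes the positive region the intersection of the two half-planes.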


In [246]:
# test data for two-layer neural networks

# exercise 1: testing non-linearly separable data

inputs = [[0.0, 0.0],[0.0, 0.25],[0.0, 0.5],[0.0, 0.75],[0.0, 1.0],[0.0, 1.5],
          [0.25, 0.0],[0.25, 0.25],[0.25, 0.5],[0.25, 0.75],[0.25, 1.0],[0.25, 1.5],
          [0.5, 0.0],[0.5, 0.25],[0.5, 0.5],[0.5, 0.75],[0.5, 1.0],[0.5, 1.5],
          [0.75, 0.0],[0.75, 0.25],[0.75, 0.5],[0.75, 0.75],[0.75, 1.0],[0.75, 1.5],
          [1.0, 0.0],[1.0, 0.25],[1.0, 0.5],[1.0, 0.75],[1.0, 1.0],[1.0, 1.5],
          [1.5, 0.0],[1.5, 0.25],[1.5, 0.5],[1.5, 0.75],[1.5, 1.0],[1.5, 1.5]]

labels = [1, 1, 0, 0, 0, 0,
          1, 1, 1, 0, 0, 0,
          1, 1, 1, 1, 0, 0,
          1, 1, 1, 1, 0, 0,
          1, 1, 1, 1, 0, 0,
          1, 1, 0, 0, 0, 0]

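def example1(z):
    # hidden layer: each call is perceptron(inputs, weights, bias)
    y = perceptron(z, [4, -6], 3)     # boundary 4*x1 - 6*x2 + 3 = 0
    x = perceptron(z, [-1, -1], 2)    # boundary -x1 - x2 + 2 = 0

    # output layer: AND of the two hidden outputs
    return and_perceptron([x, y])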

evaluate(example1, inputs, labels, "exercise 1")



exercise 1

[0.0, 0.0] = 1
[0.0, 0.25] = 1
[0.0, 0.5] = 0
[0.0, 0.75] = 0
[0.0, 1.0] = 0
[0.0, 1.5] = 0
[0.25, 0.0] = 1
[0.25, 0.25] = 1
[0.25, 0.5] = 1
[0.25, 0.75] = 0
[0.25, 1.0] = 0
[0.25, 1.5] = 0
[0.5, 0.0] = 1
[0.5, 0.25] = 1
[0.5, 0.5] = 1
[0.5, 0.75] = 1
[0.5, 1.0] = 0
[0.5, 1.5] = 0
[0.75, 0.0] = 1
[0.75, 0.25] = 1
[0.75, 0.5] = 1
[0.75, 0.75] = 1
[0.75, 1.0] = 0
[0.75, 1.5] = 0
[1.0, 0.0] = 1
[1.0, 0.25] = 1
[1.0, 0.5] = 1
[1.0, 0.75] = 1
[1.0, 1.0] = 0
[1.0, 1.5] = 0
[1.5, 0.0] = 1
[1.5, 0.25] = 1
[1.5, 0.5] = 0
[1.5, 0.75] = 0
[1.5, 1.0] = 0
[1.5, 1.5] = 0

Accuracy 100.0% 


## Non-Linear Classifier with an OR output layer

The same two-layer architecture can classify a second data set that is not linearly separable.

![alt text](<two-layer-perceptions/Network of perceptrons II-A.png>)

A two-layer perceptron with three neurons can classify the data above by defining two decision boundaries. The region where $y = 1$ is given by the union of the regions defined by the two inequalities.

![alt text](<two-layer-perceptions/Network of perceptrons II-B.png>)

The output layer is an OR perceptron whose inputs are the outputs of the two perceptrons that define the decision boundaries.

![alt text](<two-layer-perceptions/Network of perceptrons II-C.png>)
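Using the hidden-layer weights from the exercise 2 code cell, the network's output can be summarized as:

$$
y = 1 \iff (-4x_1 + 6x_2 - 3 > 0) \ \lor\ (x_1 + x_2 - 2 > 0)
$$

For example, at $(x_1, x_2) = (0.5, 1.0)$ the first inequality gives $-2 + 6 - 3 = 1 > 0$, so $y = 1$. At $(1.0, 1.0)$ the two weighted sums are $-1$ and $0$, neither of which is greater than $0$, so $y = 0$.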

In [247]:
# exercise 2: testing non-linearly separable data


inputs = [[0.0, 0.0],[0.0, 0.25],[0.0, 0.5],[0.0, 0.75],[0.0, 1.0],[0.0, 1.5],
          [0.25, 0.0],[0.25, 0.25],[0.25, 0.5],[0.25, 0.75],[0.25, 1.0],[0.25, 1.5],
          [0.5, 0.0],[0.5, 0.25],[0.5, 0.5],[0.5, 0.75],[0.5, 1.0],[0.5, 1.5],
          [0.75, 0.0],[0.75, 0.25],[0.75, 0.5],[0.75, 0.75],[0.75, 1.0],[0.75, 1.5],
          [1.0, 0.0],[1.0, 0.25],[1.0, 0.5],[1.0, 0.75],[1.0, 1.0],[1.0, 1.5],
          [1.5, 0.0],[1.5, 0.25],[1.5, 0.5],[1.5, 0.75],[1.5, 1.0],[1.5, 1.5]]

labels = [0, 0, 0, 1, 1, 1,
          0, 0, 0, 1, 1, 1,
          0, 0, 0, 0, 1, 1,
          0, 0, 0, 0, 0, 1,
          0, 0, 0, 0, 0, 1,
          0, 0, 0, 1, 1, 1]


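def example2(z):
    # hidden layer: each call is perceptron(inputs, weights, bias)
    y = perceptron(z, [-4, 6], -3)    # boundary -4*x1 + 6*x2 - 3 = 0
    x = perceptron(z, [1, 1], -2)     # boundary x1 + x2 - 2 = 0

    # output layer: OR of the two hidden outputs
    return or_perceptron([x, y])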

evaluate(example2, inputs, labels, "exercise 2")


exercise 2

[0.0, 0.0] = 0
[0.0, 0.25] = 0
[0.0, 0.5] = 0
[0.0, 0.75] = 1
[0.0, 1.0] = 1
[0.0, 1.5] = 1
[0.25, 0.0] = 0
[0.25, 0.25] = 0
[0.25, 0.5] = 0
[0.25, 0.75] = 1
[0.25, 1.0] = 1
[0.25, 1.5] = 1
[0.5, 0.0] = 0
[0.5, 0.25] = 0
[0.5, 0.5] = 0
[0.5, 0.75] = 0
[0.5, 1.0] = 1
[0.5, 1.5] = 1
[0.75, 0.0] = 0
[0.75, 0.25] = 0
[0.75, 0.5] = 0
[0.75, 0.75] = 0
[0.75, 1.0] = 0
[0.75, 1.5] = 1
[1.0, 0.0] = 0
[1.0, 0.25] = 0
[1.0, 0.5] = 0
[1.0, 0.75] = 0
[1.0, 1.0] = 0
[1.0, 1.5] = 1
[1.5, 0.0] = 0
[1.5, 0.25] = 0
[1.5, 0.5] = 0
[1.5, 0.75] = 1
[1.5, 1.0] = 1
[1.5, 1.5] = 1

Accuracy 100.0% 
