# Neural Networks with Perceptron

## Fundamental concepts

An input layer -> Weights and bias -> A weighted sum -> An activation function -> Linear separability 

## Perceptron 

A perceptron is a neural network with just one layer. It's a linear classifier that outputs a binary response variable. Consequently, the algorithm is called "linear binary classifier".

## Linear Separability

- Data is said to have "linear separability" if it can be cleanly classified into one of two classes. 
- Your data must be linearly separable in order for perceptron to operate properly 

## Four essential elements of the perceptron

1. An input layer
2. Weights and bias 
3. A weighted sum 
4. An activation function

## Single-Layer Neutral Network

|Input layer | | Weighted Sum | Activation function | |
| :--- | :--- | :--- | :--- | :--- |
|Input Feature 1 | Weight 1 |                     | Simple   |
|Input Feature 2 | Weight 2 | Weighted Sum / bias | Linear   | Output layer (y^)
|Input Feature 3 | Weight 3 |                     | Function |

## Activation Function

An activation function is a mathemetical function that is deployed on each unit in a neural network. All units in the shared layer deploy the same activation function. The purpose of activation functions is to enable neural networks to model complex, nonlinear phenomenon.

- Linear activation: tf.matmul() - single layer perceptron
- Logistic sigmoid: use these often in the final output layer, useful with binary input features
- Threshold function: useful with binary features
- ReLU (rectified linear unit)
- SoftMax

In [18]:
import numpy as np 
import pandas as pd 
import sklearn

from pandas import Series, DataFrame
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report

In [19]:
from sklearn.linear_model import Perceptron

In [20]:
iris = datasets.load_iris()

x = iris.data
y = iris.target

In [21]:
# Split data into train and test sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=.2)

In [22]:
# Normalize the data
standardize = StandardScaler()

standardize_x_test = standardize.fit_transform(x_test)
standardize_x_train = standardize.fit_transform(x_train)

In [23]:
standardize_x_test[0:5]

array([[-0.08196892, -0.76255128,  0.27241258,  0.22358746],
       [ 0.43572953, -0.76255128,  0.61530954,  0.09206542],
       [ 1.21227721,  0.41665173,  1.18680446,  1.27576373],
       [-0.8585166 ,  1.59585474, -1.15632473, -1.22315491],
       [ 0.82400337,  0.18081113,  0.50101055,  0.48663152]])

In [24]:
# Create a perceptron
perceptron = Perceptron(max_iter=50, eta0=.15, tol=1e-3, random_state=15)
## train model
perceptron.fit(standardize_x_train, y_train.ravel())

Perceptron(eta0=0.15, max_iter=50, random_state=15)

In [25]:
 # Make a prediction using created model
y_pred = perceptron.predict(standardize_x_test)

In [26]:
print(y_test)
print(y_pred)

[1 1 2 0 1 2 0 1 1 2 0 0 0 2 0 1 1 2 0 2 1 0 1 2 1 2 0 0 2 0]
[1 1 2 0 2 2 0 2 1 2 0 1 1 2 0 1 1 2 1 1 2 0 2 2 1 2 1 0 2 1]


In [27]:
# Check the accuracy of the prediction
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       1.00      0.55      0.71        11
           1       0.50      0.60      0.55        10
           2       0.67      0.89      0.76         9

    accuracy                           0.67        30
   macro avg       0.72      0.68      0.67        30
weighted avg       0.73      0.67      0.67        30



**Precision** is a measure of the model's relevancy  
**Recall** is a measure of model's completeness  

In our case for precision:

0: For all the points predicted to have 0 label, 100% of the retrieved instances were relevant  
1: For all the points predicted to have 1 label, 50% of the retrieved instances were relevant  

For recall:

2: For all your points that were labeled 2, 89% of the results that were returned were trully relevant  