<a href="https://colab.research.google.com/github/Kevinlodaya/Perceptron_ANN/blob/main/ANN_BasicUnderstandingWithPerceptron.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

An Artificial Neural Network (ANN) is a Machine Learning model inspired by the networks of biological neurons found in our brains.

**Biological Neurons**


> Biological neurons send and receive signals from the brain. The main components of a neuron and their functions are: Dendrite: receives signals from other neurons; Soma: processes the information; Axon: transmits the output of this neuron; Synapse: point of connection to other neurons.



<center>
<img src="https://upload.wikimedia.org/wikipedia/commons/1/10/Blausen_0657_MultipolarNeuron.png" width= 500 px/>
</center>




> Individual biological neurons are organized in a vast network of billions, with each neuron typically connected to thousands of other neurons. Highly complex computations can be performed by a network of fairly simple neurons.





**Artificial Neurons**


> Modeled after human brain activity, artificial neurons are digital constructs that simulate the behavior of biological neurons in some ways. The first computational model of an (artificial) neuron was proposed by Warren McCulloch (neuroscientist) and Walter Pitts (logician) in 1943.

> As shown below, it may be divided into 2 parts. The first part, g, takes the inputs and aggregates them; based on the aggregated value, the second part, f, makes a decision.

<br><br>
<center>
<img src="https://miro.medium.com/max/369/1*fDHlg9iNo0LLK4czQqqO9A.png" width= 320px/>
</center>

<br><br>
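The two-part structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the original 1943 formulation: it assumes that the aggregation `g` is a plain sum of binary inputs and that the decision `f` compares that sum with a `threshold` parameter. The names `mp_neuron` and `threshold` are introduced here for illustration only.

```python
def g(inputs):
    """Aggregation step: add up the binary inputs (assumed to be 0 or 1)."""
    return sum(inputs)


def f(aggregate, threshold):
    """Decision step: output 1 when the aggregate reaches the threshold, else 0."""
    return 1 if aggregate >= threshold else 0


def mp_neuron(inputs, threshold):
    """Chain the two parts: aggregate first, then decide."""
    return f(g(inputs), threshold)


# With two binary inputs, threshold 2 behaves like AND and threshold 1 like OR.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND:", mp_neuron([x1, x2], 2), "OR:", mp_neuron([x1, x2], 1))
```

Changing only the threshold changes which logical function the same neuron computes, which is why the decision part is kept separate from the aggregation part.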




**Perceptron**
> The Perceptron is one of the simplest ANN architectures, invented in 1957 by Frank Rosenblatt. It is based on a slightly different artificial neuron (shown in the figure below) called a **threshold logic unit (TLU)**. The inputs and the output are numbers (instead of binary on/off values), and each input connection is associated with a weight. The TLU computes a weighted sum of its inputs $$(z = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n = x^\top w)$$, then applies a step function to that sum and outputs the result: $$h_w(x) = \text{step}(z)$$, where $z = x^\top w$.


<br><br>
<center>
<img src="https://www.oreilly.com/api/v2/epubs/9781492037354/files/assets/mlst_1004.png" width= 400px/>
</center>

$\hspace{10cm} \text {Threshold logic unit}$
<br><br>

The most common step function used in Perceptrons is the Heaviside step function. Sometimes the sign function is used instead.

$$
\text{heaviside}(z) =
\begin{cases}
0 & \text{if } z < 0 \\
1 & \text{if } z \ge 0
\end{cases}
$$

$$
\text{sgn}(z) =
\begin{cases}
-1 & \text{if } z < 0 \\
0 & \text{if } z = 0 \\
1 & \text{if } z > 0
\end{cases}
$$
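The two step functions and the TLU computation translate directly into Python. The sketch below is only an illustration of the definitions above; the function names `heaviside`, `sgn`, and `tlu`, and the sample inputs and weights, are chosen here and are not part of any library.

```python
def heaviside(z):
    """Return 0 for z < 0 and 1 for z >= 0."""
    return 1 if z >= 0 else 0


def sgn(z):
    """Return -1, 0, or 1 according to the sign of z."""
    if z < 0:
        return -1
    if z > 0:
        return 1
    return 0


def tlu(x, w, step=heaviside):
    """Threshold logic unit: apply a step function to the weighted sum of x and w."""
    z = sum(x_i * w_i for x_i, w_i in zip(x, w))
    return step(z)


print(heaviside(-0.5), heaviside(0.0), heaviside(2.0))
print(sgn(-0.5), sgn(0.0), sgn(2.0))
# Here z = 1.0 * 0.5 + 2.0 * (-1.0) = -1.5, which is negative.
print(tlu([1.0, 2.0], [0.5, -1.0]), tlu([1.0, 2.0], [0.5, -1.0], step=sgn))
```

The only difference between the two functions is the output for negative and zero sums, so swapping one for the other changes the label encoding (0/1 versus -1/0/1) but not the position of the decision boundary.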

A single TLU can be used for simple linear binary classification. It computes a linear combination of the inputs, and if the result exceeds a threshold, it outputs the positive class. Otherwise, it outputs the negative class.



The decision boundary of each output neuron is linear, so Perceptrons are incapable of learning complex patterns (just like Logistic Regression classifiers). However, if the training instances are linearly separable, Rosenblatt demonstrated that the Perceptron training algorithm converges to a solution. This is called the Perceptron convergence theorem.
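The text above does not spell out the training algorithm, so the following is a sketch of the standard Perceptron learning rule under a few assumptions: a bias term `b` is added to the weighted sum, the Heaviside step is used for predictions, and the names `train_perceptron`, `lr`, and `max_epochs` are hypothetical. Each misclassified sample nudges the weights by `lr * (target - prediction) * x`. The toy data (logical OR and XOR) is chosen to contrast a linearly separable problem with one that is not.

```python
def predict(x, w, b):
    """Heaviside step applied to the weighted sum plus bias."""
    z = sum(x_i * w_i for x_i, w_i in zip(x, w)) + b
    return 1 if z >= 0 else 0


def train_perceptron(X, y, lr=1.0, max_epochs=100):
    """Return (weights, bias, epoch of first error-free pass or None)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, target in zip(X, y):
            error = target - predict(x, w, b)
            if error != 0:
                # Move the boundary towards the misclassified sample.
                w = [w_i + lr * error * x_i for w_i, x_i in zip(w, x)]
                b += lr * error
                errors += 1
        if errors == 0:
            return w, b, epoch
    return w, b, None


X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_or = [0, 1, 1, 1]    # linearly separable
y_xor = [0, 1, 1, 0]   # not linearly separable

w, b, epochs = train_perceptron(X, y_or)
print("OR :", [predict(x, w, b) for x in X], "error-free epoch:", epochs)

w_x, b_x, epochs_x = train_perceptron(X, y_xor)
print("XOR:", [predict(x, w_x, b_x) for x in X], "error-free epoch:", epochs_x)
```

On the OR data the loop reaches a pass with no errors, as the convergence theorem predicts. On the XOR data no linear boundary exists, so the function exhausts `max_epochs` and reports `None`.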

**Domain Information:**



Iris plants are flowering plants with showy flowers. They are very popular among movie directors because they make an excellent background.

They are predominantly found in dry, semi-desert, or colder rocky mountainous areas in Europe and Asia. They have long, erect flowering stems and can produce white, yellow, orange, pink, purple, lavender, blue, or brown colored flowers. There are 260 to 300 types of iris.

![alt text](https://cdn-images-1.medium.com/max/1275/1*7bnLKsChXq94QjtAiRn40w.png)

As you can see, the flowers have 3 sepals and 3 petals. The sepals usually spread or droop downwards, and the petals stand upright, partly behind the sepal bases. However, the length and width of the sepals and petals vary for each type.

**Description:**

The Iris dataset consists of 150 data instances. There are 3 classes (Iris Versicolor, Iris Setosa, and Iris Virginica), each with 50 instances.

For each flower, we have the following data attributes:

*   petal length in cm
*   sepal width in cm
*   sepal length in cm
*   petal width in cm









In [2]:
# Import the packages used in this notebook
import pandas as pd
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score

In [6]:
# Load data using Pandas
iris = pd.read_csv("Iris.csv")
iris.head()

Unnamed: 0,Id,sepal_length,sepal_width,petal_length,petal_width,species
0,1,5.1,3.5,1.4,0.2,Iris-setosa
1,2,4.9,3.0,1.4,0.2,Iris-setosa
2,3,4.7,3.2,1.3,0.2,Iris-setosa
3,4,4.6,3.1,1.5,0.2,Iris-setosa
4,5,5.0,3.6,1.4,0.2,Iris-setosa


In [8]:
iris.shape  # Determine the shape of the Iris dataset

(150, 6)

In [9]:
# The data is in sequential order, so let's shuffle it using .sample()
df = iris.sample(frac = 1)
df.head()

Unnamed: 0,Id,sepal_length,sepal_width,petal_length,petal_width,species
132,133,6.4,2.8,5.6,2.2,Iris-virginica
90,91,5.5,2.6,4.4,1.2,Iris-versicolor
20,21,5.4,3.4,1.7,0.2,Iris-setosa
101,102,5.8,2.7,5.1,1.9,Iris-virginica
141,142,6.9,3.1,5.1,2.3,Iris-virginica


### Data pre-processing

1. **species** is our target variable. Drop Iris-virginica from the species column and keep Iris-setosa and Iris-versicolor as the target classes.

2. Convert the categorical values in species ('Iris-setosa' and 'Iris-versicolor') to the numerical values 1 and 0.


In [10]:
# The Id column is only a row counter, so drop it before modelling
df.drop('Id', axis=1, inplace=True)
df.head()

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
132,6.4,2.8,5.6,2.2,Iris-virginica
90,5.5,2.6,4.4,1.2,Iris-versicolor
20,5.4,3.4,1.7,0.2,Iris-setosa
101,5.8,2.7,5.1,1.9,Iris-virginica
141,6.9,3.1,5.1,2.3,Iris-virginica


From the above data, select only 2 species: 'Iris-setosa' and 'Iris-versicolor'.

In [12]:
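# .copy() makes df an independent DataFrame, so later column assignments do not act on a slice of the original
df = df[df['species'] != 'Iris-virginica'].copy()
df.head()  # Returns the top n rows; by default n is 5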

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
90,5.5,2.6,4.4,1.2,Iris-versicolor
20,5.4,3.4,1.7,0.2,Iris-setosa
52,6.9,3.1,4.9,1.5,Iris-versicolor
41,4.5,2.3,1.3,0.3,Iris-setosa
8,4.4,2.9,1.4,0.2,Iris-setosa


In [13]:
# Only setosa and versicolor remain after the filter above; encode them as 1 and 0
df['species'] = df['species'].map({'Iris-setosa': 1, 'Iris-versicolor': 0})

In [14]:
df.loc[[5, 60, 90]]  # Select few samples to see the DataFrame

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
5,5.4,3.9,1.7,0.4,1
60,5.0,2.0,3.5,1.0,0
90,5.5,2.6,4.4,1.2,0


### Single Layer Perceptron

A node, also called a neuron or Perceptron, is a computational unit that has one or more weighted input connections, a transfer function that combines the inputs, and an output connection. Nodes are then organized into layers to form a network.

Let us see how the output is computed at each neuron:

![alt text](https://cdn.talentsprint.com/aiml/Experiment_related_data/artificial_neuron.png)

In [15]:
# Understanding the Perceptron using a mathematical approach
# Take the sample with index 5 as input x1, x2, x3, x4
one_neuron_inputs = [5.4, 3.9, 1.7, 0.4]
weights = [0.7, 0.6, -1.0, -1.0]  # Arbitrary weights w1, w2, w3, w4; try other values

In [16]:
# This function returns the sum of the products of inputs and weights
def weighted_sum(inputs, weights):
    total = 0
    for input_value, weight in zip(inputs, weights):
        total += input_value * weight
    return total

In [17]:
# Weighted sum of the inputs to the neuron (before the activation function is applied)
Node_input = weighted_sum(one_neuron_inputs, weights)
print(Node_input)

4.019999999999999
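
Written out term by term for the inputs and weights above, the weighted sum is:

$$
z = 5.4 \cdot 0.7 + 3.9 \cdot 0.6 + 1.7 \cdot (-1.0) + 0.4 \cdot (-1.0) = 3.78 + 2.34 - 1.7 - 0.4 = 4.02
$$

The printed value 4.019999999999999 differs from 4.02 only because of floating-point rounding.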


**Activation Function**


> A step function is one of the simplest activation functions used in neural networks. It produces a binary output.

> The output is a certain value, $A_1$, if the input sum is above a certain threshold and $A_0$ if the input sum is below a certain threshold. The values used by the Perceptron were $A_1$ = 1 and $A_0$ = 0.

![alt text](https://cdn.talentsprint.com/aiml/Experiment_related_data/step_function.png)


These kinds of step activation functions are useful for binary classification schemes. In other words, when we want to classify an input pattern into one of two groups, we can use a binary classifier with a step activation function. Another use for this would be to create a set of small feature identifiers. Each identifier would be a small network that would output a 1 if a particular input feature is present, and a 0 otherwise. Combining multiple feature detectors into a single network would allow a very complicated clustering or classification problem to be solved.
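
As a small illustration of combining feature detectors, the sketch below wires three step units into a network that computes XOR, a function that no single step unit can represent. The weights and thresholds are hand-picked for this example, and the names `unit` and `xor_network` are hypothetical.

```python
def step(z, threshold):
    """Binary step: 1 when z reaches the threshold, else 0."""
    return 1 if z >= threshold else 0


def unit(inputs, weights, threshold):
    """One neuron: weighted sum of the inputs followed by the step function."""
    z = sum(i * w for i, w in zip(inputs, weights))
    return step(z, threshold)


def xor_network(x1, x2):
    # First layer: two simple feature detectors on the raw inputs.
    or_detector = unit([x1, x2], [1, 1], threshold=1)   # at least one input is on
    and_detector = unit([x1, x2], [1, 1], threshold=2)  # both inputs are on
    # Second layer: fire when OR is on but AND is off.
    return unit([or_detector, and_detector], [1, -1], threshold=1)


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_network(x1, x2))
```

Each first-layer unit answers one yes/no question about the input, and the second-layer unit combines those answers, which is the idea behind stacking neurons into layers.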

In [18]:
def Step_Fun(number):
    if number >= 2: # Threshold value is 2
        return 1
    else:
        return 0

In [19]:
# Apply the step function where it returns one if node input value is greater than or equal to 2, otherwise returns zero
print(Step_Fun(1))
print(Step_Fun(2))
print(Step_Fun(3))

0
1
1


#### Performing for all samples

Compute the weighted sum and apply the step function to every sample using hand-picked weights, then compare the predictions with the actual labels to measure accuracy.

In [20]:
samples = df.iloc[:, :4].values
predicted = []
weights = [1, 0.1, 0.1, 1]  # Change the weights (e.g. to [0.7, 0.6, -1.0, -1.0]) and observe the change in accuracy
for sample in samples:
    Node_input = weighted_sum(sample, weights)
    Node_output = Step_Fun(Node_input)
    predicted.append(Node_output)

actual = df.iloc[:, 4].values
# print actual and predicted values
print("Actual labels from the data", actual)
print("\n Predicted labels", predicted)

# calculate accuracy score
acc = accuracy_score(actual, predicted)
print("Accuracy of the Perceptron using mathematical approach", acc)

Actual labels from the data [0 1 0 1 1 0 1 0 1 1 1 0 1 1 0 1 1 1 1 1 1 0 1 1 0 0 0 0 1 1 1 1 0 1 1 0 0
 0 0 0 1 0 0 0 0 0 0 1 1 0 1 1 0 0 0 1 0 1 1 0 1 1 1 1 1 0 1 1 0 1 0 0 1 1
 0 0 1 0 0 0 0 0 0 1 1 1 0 1 1 0 0 1 0 0 0 1 0 0 0 1]

 Predicted labels [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Accuracy of the Perceptron using mathematical approach 0.5
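
Every prediction is 1 because, with weights `[1, 0.1, 0.1, 1]`, the weighted sum is well above the threshold of 2 for each sample, so the neuron always fires and is correct only for the half of the samples labelled 1. The sketch below repeats the calculation on just the seven rows displayed earlier in this notebook (indices 5, 8, 20, 41, 52, 60, 90) and compares the two weight vectors mentioned in the code comment. It says nothing about the remaining rows of the dataset, and the helper names are chosen here for illustration.

```python
# (sepal_length, sepal_width, petal_length, petal_width, species)
# species: 1 = Iris-setosa, 0 = Iris-versicolor
rows = [
    (5.4, 3.9, 1.7, 0.4, 1),  # index 5
    (4.4, 2.9, 1.4, 0.2, 1),  # index 8
    (5.4, 3.4, 1.7, 0.2, 1),  # index 20
    (4.5, 2.3, 1.3, 0.3, 1),  # index 41
    (6.9, 3.1, 4.9, 1.5, 0),  # index 52
    (5.0, 2.0, 3.5, 1.0, 0),  # index 60
    (5.5, 2.6, 4.4, 1.2, 0),  # index 90
]


def weighted_sum(inputs, weights):
    return sum(i * w for i, w in zip(inputs, weights))


def step_fun(number, threshold=2):
    return 1 if number >= threshold else 0


def accuracy_on_rows(weights):
    """Fraction of the rows above whose step output matches the label."""
    correct = 0
    for *features, label in rows:
        correct += step_fun(weighted_sum(features, weights)) == label
    return correct / len(rows)


for candidate in ([1, 0.1, 0.1, 1], [0.7, 0.6, -1.0, -1.0]):
    sums = [round(weighted_sum(r[:4], candidate), 2) for r in rows]
    print(candidate, sums, accuracy_on_rows(candidate))
```

With the second weight vector, the negative weights on the petal measurements pull the sum below 2 for the versicolor rows, whose petals are larger, while the setosa rows stay above it.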


In [21]:
# Node_input still holds the weighted sum of the last sample from the loop above;
# the step function returns one if it is greater than or equal to 2, otherwise zero
Node_output = Step_Fun(Node_input)
print(Node_output)

1


### Part B: Perceptron from Sklearn

1. From the given data, select features and labels
2. Split the data into train and test sets
3. Train using the Perceptron classifier

In [22]:
labels = df['species']
features = df.drop('species', axis = 1)

In [23]:
# split the data into train and test
from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(features, labels, test_size = 0.2)
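
Conceptually, a hold-out split with `test_size = 0.2` shuffles the rows and sets aside one fifth of them for evaluation. The pure-Python sketch below illustrates that idea only; it is not scikit-learn's implementation, and the function name `simple_train_test_split`, the `seed` parameter, and the toy data are hypothetical.

```python
import random


def simple_train_test_split(features, labels, test_size=0.2, seed=0):
    """Shuffle row indices, then slice them into a test part and a train part."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    xtrain = [features[i] for i in train_idx]
    xtest = [features[i] for i in test_idx]
    ytrain = [labels[i] for i in train_idx]
    ytest = [labels[i] for i in test_idx]
    return xtrain, xtest, ytrain, ytest


# Toy data with 100 rows, the same size as the filtered Iris data.
features = [[float(i), float(i) * 2] for i in range(100)]
labels = [i % 2 for i in range(100)]

xtrain, xtest, ytrain, ytest = simple_train_test_split(features, labels)
print(len(xtrain), len(xtest))
```

Shuffling the features and labels through the same index list keeps each row paired with its own label, which is the property that matters when the split is later used for scoring.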

In [24]:
from sklearn.linear_model import Perceptron
model = Perceptron()
# Fit the model to the training data
model.fit(xtrain, ytrain)
# Compute the mean accuracy on the test data
accuracy = model.score(xtest, ytest)
print("Accuracy of the Perceptron using sklearn", accuracy)

Accuracy of the Perceptron using sklearn 1.0


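The Perceptron trained with sklearn reaches an accuracy of 1.0 on the test set, compared with 0.5 for the hand-picked weights used earlier.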

Thank you :)