# Title : Multilayer Perceptron

### - Type of content : Documentation 

### - Domain  : Machine Learning

### - Module :  Dimensionality Reduction & Neural Networks

Perceptrons were introduced as a model inspired by biological neurons. A perceptron is an artificial neuron, the basic building block of an artificial neural network. Perceptrons are not just named after their biological counterparts; they are also modeled on how biological neurons work.

![image.png](attachment:image.png)

The field of artificial neural networks is often just called neural networks or multi-layer perceptrons after perhaps the most useful type of neural network. A perceptron is a single neuron model that was a precursor to larger neural networks.

It is a field that investigates how simple models of biological brains can be used to solve difficult computational tasks, such as the predictive modeling tasks we see in machine learning. The aim is not to build a brain-like network, but to borrow the idea to develop models that solve complex tasks.

A neural network's ability to make predictions comes from learning the relationship between the training inputs and the output variables. Its predictive capability comes from the hierarchical, multi-layered structure of the network.

**Biological neurons have dendrites to receive signals, a cell body to process them, and an axon/axon terminal to transfer signals out to other neurons. Similarly, an artificial neuron has multiple input channels that accept a training sample represented as a vector, and a processing stage that combines the inputs using weights (w). During training, the weights are adjusted so that the output error (actual vs. predicted) is minimized. The result is then fed into an activation function to produce the output, for example a classification label. For a classification problem, a threshold cutoff (the standard is 0.5) is applied to the activation: above it the class is 1, otherwise 0.**
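The flow described above (weighted sum, activation, cutoff) can be sketched in a few lines. This is only an illustration: the function name `neuron_output`, the example inputs, weights and bias are made up for the sketch, and the sigmoid is assumed as the activation.

```python
import math

def neuron_output(inputs, weights, bias):
    """Return the activation and class label of a single artificial neuron."""
    # Processing stage: weighted sum of the inputs plus the bias term
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation function (sigmoid assumed here): squashes z into (0, 1)
    activation = 1.0 / (1.0 + math.exp(-z))
    # Threshold cutoff at 0.5 turns the activation into a class label
    label = 1 if activation > 0.5 else 0
    return activation, label

# Hypothetical sample with two features
activation, label = neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.1)
print(activation, label)
```

Training would adjust `weights` and `bias`; here they are fixed so that only the forward computation is visible.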

Multilayered perceptrons are also called feed-forward neural networks. It is necessary to understand the terminology and processes used in the field of multi-layer perceptron artificial neural networks. In this documentation, you'll learn about:
- The building blocks of neural networks including neurons, weights and activation functions.

Multilayer perceptrons were proposed to address the drawback of single perceptrons, which can only learn linear decision boundaries. A multilayer perceptron is a composition of multiple perceptrons connected in different ways and operating on distinct activation functions to enable improved learning mechanisms.

The training sample propagates forward through the network, and the output error is propagated back through it. The error is minimized using the gradient descent method, which needs the gradient of the loss function with respect to every weight in the network.
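A minimal NumPy sketch of one forward pass, one backward pass and a gradient descent update is shown here, repeated in a loop. Everything specific in it is an assumption made for illustration: the XOR toy data, one hidden layer of 4 sigmoid neurons, the cross-entropy loss, the learning rate and the number of steps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset (XOR), chosen only for illustration
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_hidden = 4
W1 = rng.normal(size=(2, n_hidden))   # input -> hidden weights
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, 1))   # hidden -> output weights
b2 = np.zeros(1)

learning_rate = 0.5
losses = []
for step in range(5000):
    # Forward pass: propagate the samples through the network
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    p_safe = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    loss = -np.mean(y * np.log(p_safe) + (1 - y) * np.log(1 - p_safe))
    losses.append(loss)

    # Backward pass: gradient of the loss w.r.t. each weight and bias
    d_out = (p - y) / len(X)                   # dLoss/d(output pre-activation)
    grad_W2 = h.T @ d_out
    grad_b2 = d_out.sum(axis=0)
    d_hidden = (d_out @ W2.T) * h * (1 - h)    # error pushed back to hidden layer
    grad_W1 = X.T @ d_hidden
    grad_b1 = d_hidden.sum(axis=0)

    # Gradient descent update: step against the gradient
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2

print("first loss:", losses[0], "last loss:", losses[-1])
```

Libraries such as scikit-learn perform these steps internally; the sketch only makes the order of operations explicit.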

![image.png](attachment:image.png)

A multilayered neural network can have many hidden layers, where the network holds its internal abstract representation of the training sample. Each later layer builds new abstractions on top of the previous layers, so adding hidden layers can help the network learn a complex dataset better.

The input layer's neuron count equals the total number of features, plus, in some libraries, an additional neuron for the intercept/bias. These neurons are represented as nodes. The output layer has a single neuron for regression models and binary classifiers; for multiclass classification models it has one neuron per class label.
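The sizing rule above can be written down as a small helper. The function `layer_sizes` and its parameters are hypothetical names introduced for this sketch, and it ignores the optional bias neuron that some libraries add.

```python
def layer_sizes(n_features, hidden_sizes, task, n_classes=None):
    """Return the neuron count of every layer, from input to output."""
    if task in ("regression", "binary"):
        # A single output neuron is enough for these two cases
        n_outputs = 1
    elif task == "multiclass":
        if n_classes is None:
            raise ValueError("n_classes is required for multiclass tasks")
        # One output neuron per class label
        n_outputs = n_classes
    else:
        raise ValueError(f"unknown task: {task}")
    # Input layer size equals the number of features
    return [n_features, *hidden_sizes, n_outputs]

print(layer_sizes(4, [8, 8], "multiclass", n_classes=3))
print(layer_sizes(4, [8], "binary"))
```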

![image.png](attachment:image.png)

*Inputs are combined in a weighted sum and, if the weighted sum exceeds a predefined threshold, the neuron fires and produces an output. Threshold T represents the activation function. In the simplest case the threshold is zero: if the weighted sum of the inputs is greater than zero the neuron outputs the value 1, otherwise the output value is zero.*

![image.png](attachment:image.png)

*The perceptron can be used as a binary classification model, defining a linear decision boundary. It finds the separating hyperplane that minimizes the distance between misclassified points and the decision boundary.*

***To minimize this distance, the Perceptron uses Stochastic Gradient Descent as the optimization algorithm.
If the data is linearly separable, Stochastic Gradient Descent is guaranteed to converge in a finite number of steps.***
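One common way to realize this is the classic perceptron learning rule, which updates the weights one sample at a time and only when a sample is misclassified. The sketch below assumes that rule, a zero threshold, and the logical AND function as a small linearly separable dataset; the function names and defaults are illustrative.

```python
def predict(weights, bias, x):
    """Step activation: fire (1) if the weighted sum is above zero."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0

def train_perceptron(samples, targets, learning_rate=1.0, max_epochs=100):
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(samples, targets):
            error = target - predict(weights, bias, x)
            if error != 0:
                # Misclassified point: move the boundary toward it
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
                errors += 1
        if errors == 0:
            # A full pass without mistakes: the data is separated
            break
    return weights, bias

# Logical AND is linearly separable, so training stops after finitely many updates
and_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_targets = [0, 0, 0, 1]
weights, bias = train_perceptron(and_inputs, and_targets)
predictions = [predict(weights, bias, x) for x in and_inputs]
print(weights, bias, predictions)
```

On data that is not linearly separable the loop would simply run until `max_epochs`, which is why the cap is there.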

The last piece that the Perceptron needs is the activation function, the function that determines whether the neuron will fire or not.
Early neural network models commonly used the sigmoid function, and just by looking at its shape, it makes a lot of sense!
The sigmoid function maps any real input to a value between 0 and 1, and encodes a non-linear function.

![image.png](attachment:image.png) Sigmoid function
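
As a reference for the curve pictured above, the sigmoid is commonly written as follows, where $z$ stands for the neuron's weighted sum of inputs (the symbol is chosen here for illustration):

$$
\sigma(z) = \frac{1}{1 + e^{-z}}
$$

Its output lies strictly between 0 and 1, with $\sigma(0) = 0.5$; it approaches 1 for large positive $z$ and 0 for large negative $z$.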

**While in the Perceptron the neuron must have an activation function that imposes a threshold, like ReLU or sigmoid, neurons in a Multilayer Perceptron can use any arbitrary activation function.**

Using too few neurons for a complex dataset can result in an under-fitted model, because it might fail to learn the patterns in the data. However, using too many neurons can result in an over-fitted model, as it has the capacity to capture patterns that might be noise or specific to the given training dataset. So to build an efficient multilayered neural network, the fundamental questions to answer about hidden layers during implementation are:
**1) what is the ideal number of hidden layers? and 2) what should be the number of neurons in the hidden layers?**
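
There is no universal answer to these two questions, so a common approach is to try a few configurations and compare them with cross-validation. The sketch below does this with scikit-learn's `GridSearchCV` and `MLPClassifier`; the synthetic `make_moons` data, the candidate layer sizes and the remaining settings are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic two-class data, used here only as a stand-in dataset
X, y = make_moons(n_samples=200, noise=0.25, random_state=0)

# Candidate architectures: each tuple lists the neurons per hidden layer
param_grid = {"hidden_layer_sizes": [(2,), (8,), (8, 8)]}

search = GridSearchCV(
    MLPClassifier(max_iter=2000, random_state=0),
    param_grid,
    cv=3,  # 3-fold cross-validation for each candidate
)
search.fit(X, y)

print("Best hidden layers:", search.best_params_["hidden_layer_sizes"])
print("Cross-validated accuracy:", round(search.best_score_, 2))
```

The candidate with the best cross-validated score is a reasonable starting point, not a guarantee; a larger grid or different data may favor another architecture.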

# Python code to run a perceptron on a corpus 

In [None]:
import numpy as np
from sklearn import metrics
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ["We enjoyed our stay so much. The weather was not great, but everything else was perfect",
"Going to think twice before staying here again. The wifi was spotty and the rooms smaller than advertised",
"The perfect place to relax and recharge.",
"Never had such a relaxing vacation.",
"The pictures were misleading, so I was expecting the common areas to be bigger. But the service was good.",
"There were no clean linens when I got to my room and the breakfast options were not that many.",
"Was expecting it to be a bit far from historical downtown, but it was almost impossible to drive through those narrow roads",
"I thought that waking up with the chickens was fun, but I was wrong.",
"Great place for a quick getaway from the city. Everyone is friendly and polite.",
"Unfortunately it was raining during our stay, and there weren't many options for indoors activities. Everything was great, but there was literally no other options besides being in the rain..",
"The town festival was postponed, so the area was a complete ghost town. We were the only guests. Not the experience I was looking for.",
"We had a lovely time. It's a fantastic place to go with the children, they loved all the animals.",
"A little bit off the beaten track, but completely worth it. You can hear the birds sing in the morning and then you are greeted with the biggest, sincerest smiles from the owners. Loved it!!",
"It was good to be outside in the country, visiting old town. Everything was prepared to the upmost detail",
"Staff was friendly, Going to come back for sure.",
"They didn't have enough stuff for the amount of guests. It took some time to get our breakfast and we had to wait 20 minutes to get more information about the old town",
"The pictures looked way different..",
"Best weekend in the countryside I've ever had.",
"Terrible, Slow staff, slow town. Only good thing was being surrounded by nature.",
"Not as clean as advertised. Found some cobwebs in the corner of the room.",
"It was a peaceful getaway in the countryside. Everyone was nice. Had a good time.",
"The kids loved running around in nature, we loved the old town. Definitely going back.",
"Had worse experiences..",
"Surprised this was much different than what was on the website..",
"Not that mindblowing."
]


# 0: negative sentiment, 1: positive sentiment
targets = [1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0]

#Splitting the dataset

train_features, test_features, train_targets, test_targets = train_test_split(corpus, targets, test_size=0.1, random_state=123)

#Turning the corpus into a tf-idf array

vectorizer = TfidfVectorizer(stop_words='english', lowercase=True, norm="l1")
train_features = vectorizer.fit_transform(train_features)

test_features = vectorizer.transform(test_features)

# Build the perceptron and fit the data

classifier = Perceptron(random_state=457)
classifier.fit(train_features, train_targets)

# Predict the sentiment of the held-out reviews
predictions = classifier.predict(test_features)

# Share of test reviews classified correctly, rounded to two decimals
score = np.round(metrics.accuracy_score(test_targets, predictions), 2)

print("Mean accuracy of predictions: " + str(score))

# References :

1. https://towardsdatascience.com/multilayer-perceptron-explained-with-a-real-life-example-and-python-code-sentiment-analysis-cb408ee93141
2. https://machinelearningmastery.com/neural-networks-crash-course/
3. Book - Mastering Machine Learning with Python in Six Steps - A Practical Implementation Guide to Predictive Data Analytics Using Python — Manohar Swamynathan