# Machine learning Workshop 
This document contains the content of the machine learning Workshop that we're about to do ;p. 

### Table of contents
1. Machine learning 
    1. What's machine learning ? 
    2. What are the steps to create a machine learning algorithm ? 
    3. Machine learning Terminology
2. Deep learning  
    1. The Human Nervous System
    2. Perceptrons
    3. Neural Network OR MultiLayer Perceptron
    4. FeedForward 
    5. Backpropagation 
3. Implementing Neural Network with Scikit-Learn
    1. Datasets 
    2. Help me find the steps to create our classifier 
    3. Evaluating the algorithm 

![](Images/start.JPG)
    


# Machine learning 
## What's Machine learning ? 
Humans can identify patterns in the information available to them with an astonishingly high degree of accuracy. Whenever you see a car or a bicycle, you immediately recognize what it is. This is because we have learned over time what a car and a bicycle look like and what their distinguishing features are. But how can we give those abilities to a computer ?

![](Images/ml.gif)

## What are the steps to create an ML APP ? 
![](Images/steps.JPG)

## Machine learning Terminology 
### Some examples of machine learning 
![](Images/fb.JPG)


### Let's see what you already know 

#### HeartSound 
A machine learning algorithm uses heart sounds from different patients to find differences and classify them.

#### Pedestrian on the road 
An AI that is learning to identify pedestrians on a street is trained with 2 million short videos of street scenes from self-driving cars. Some of the videos contain no pedestrians at all while others have up to 25. A variety of learning algorithms are trained on the data with each having access to the correct answers. Each algorithm develops a variety of models to identify pedestrians in fast moving scenes. The algorithms are then tested against another set of data to evaluate accuracy and precision.

#### Let's simplify ... Dogs vs Cats ? 

#### Generalization 
> The goal of ML is never to make “perfect” guesses, because ML deals in domains where there is no such thing. The goal is to make guesses that are good enough to be useful.


# Deep learning 
## The human nervous system 
The human nervous system consists of billions of neurons. These neurons collectively receive input from the sensory organs, process the information, and decide what to do in reaction to the input. A typical neuron in the human nervous system has three main parts: dendrites, a nucleus, and an axon. The information passed to a neuron is received by the dendrites. The nucleus is responsible for processing this information. The output of a neuron is passed to other neurons via the axon, which is connected to the dendrites of other neurons further down the network.

![](Images/dendrides.png) 

`What the frog’s eye tells the frog’s brain`
## Perceptron 
Artificial neural networks are inspired by the human neural network architecture. The simplest neural network consists of only one neuron and is called a perceptron, as shown in the figure below:
![](Images/perceptron.JPG)

A perceptron has one input layer and one neuron. The input layer acts as the dendrites and is responsible for receiving the inputs. The number of nodes in the input layer is equal to the number of features in the input dataset. Each input is multiplied by a weight (which is typically initialized with some random value) and the results are added together. The sum is then passed through an activation function. The activation function of a perceptron plays the role of the nucleus of a neuron in the human nervous system: it processes the information and yields an output. In the case of a perceptron, this output is the final outcome. However, in the case of multilayer perceptrons, the output from the neurons in the previous layer serves as the input to the neurons of the next layer.
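
The computation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the step activation, the bias term, and the hand-picked input and weight values are all assumptions made for the example.

```python
import numpy as np

def step(z):
    """Step activation (assumed here): 1 if the weighted sum is non-negative, else 0."""
    return np.where(z >= 0, 1, 0)

def perceptron_output(x, weights, bias):
    """Multiply each input by its weight, add the results, then apply the activation."""
    weighted_sum = np.dot(x, weights) + bias
    return step(weighted_sum)

# Hypothetical example: 3 input features with hand-picked weights
x = np.array([1.0, 0.5, -1.0])
weights = np.array([0.4, -0.2, 0.1])
bias = 0.0

output = perceptron_output(x, weights, bias)
print(output)
```

Here the weighted sum is positive, so the neuron "fires". Flipping the sign of the weights would push the sum below zero and the output would become 0.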

## MultiLayer Perceptron

Now that we know what a single layer perceptron is, we can extend this discussion to multilayer perceptrons, more commonly known as artificial neural networks. A single layer perceptron can solve simple problems where the data is linearly separable in 'n' dimensions, where 'n' is the number of features in the dataset. However, with non-linearly separable data, the accuracy of a single layer perceptron decreases significantly. Multilayer perceptrons, on the other hand, can work efficiently with non-linearly separable data.
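
To make the limitation concrete, here is a small sketch that trains scikit-learn's `Perceptron` on two toy problems: logical AND (linearly separable) and XOR (not linearly separable). The toy data and the parameter choices (`max_iter`, `tol`, `random_state`) are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Perceptron

# The four possible combinations of two binary inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_and = np.array([0, 0, 0, 1])  # linearly separable
y_xor = np.array([0, 1, 1, 0])  # not linearly separable

# tol=None disables early stopping so training runs for all max_iter passes
and_model = Perceptron(max_iter=1000, tol=None, random_state=0).fit(X, y_and)
xor_model = Perceptron(max_iter=1000, tol=None, random_state=0).fit(X, y_xor)

and_accuracy = and_model.score(X, y_and)
xor_accuracy = xor_model.score(X, y_xor)
print("AND accuracy:", and_accuracy)
print("XOR accuracy:", xor_accuracy)
```

No straight line can classify all four XOR points correctly, so a single perceptron gets at most three of them right, however long it trains.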

### In search of non linearity 
#### Feature crosses 

![](Images/LinearProblem2.png)
However, if we use a model that is too complicated, such as one with too many crosses, we give it the opportunity to fit the noise in the training data, often at the cost of making the model perform badly on test data.
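
As a rough illustration of what a feature cross does, the sketch below builds a new feature by multiplying two inputs. The four data points and their labels are hypothetical. They form an XOR-like pattern that no straight line in the original two features can separate.

```python
import numpy as np

# Hypothetical points: the label is 1 when x1 and x2 have the same sign
X = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
y = np.array([1, 0, 0, 1])

# Feature cross: a synthetic feature made by multiplying two input features
x1_x2 = X[:, 0] * X[:, 1]
X_crossed = np.column_stack([X, x1_x2])

# On the crossed feature, a single linear threshold separates the classes
predictions = (X_crossed[:, 2] > 0).astype(int)
print(predictions)
```

One cross is enough here. Adding many crosses by hand is what opens the door to fitting noise.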

#### Neural networks 
Neural networks are a more sophisticated version of feature crosses. In essence, neural networks learn the appropriate feature crosses for you.
![](Images/Deep-learning-ai-machine-matrix2.gif)
A neural network executes in two phases: Feed-Forward and Back Propagation.

##### Activation functions 
###### Sigmoid 
![](Images/sigmoid.jpg)
![](Images/SigmoidFunction.png)
###### Relu 
![](Images/relu.jpg)
![](Images/relu2.jpg)
###### Tanh
![](Images/1200px-Hyperbolic_Tangent.svg.png)
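
For reference, the three activation functions pictured above can be written in NumPy as follows. This is a minimal sketch and the sample input values are arbitrary.

```python
import numpy as np

def sigmoid(z):
    """Squashes any real value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Keeps positive values unchanged and replaces negative values with 0."""
    return np.maximum(0.0, z)

def tanh(z):
    """Squashes any real value into the range (-1, 1)."""
    return np.tanh(z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))
print(relu(z))
print(tanh(z))
```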

## FeedForward 
Following are the steps performed during the feed-forward phase:

1. The values received in the input layer are multiplied by the weights and summed. A bias is then added to this weighted sum, so that a neuron can still produce a non-zero value when all of its inputs are zero.
2. Each neuron in the first hidden layer receives different values from the input layer depending upon the weights and bias. Neurons have an activation function that operates upon the value received from the input layer. The activation function can be of many types, like a step function, sigmoid function, relu function, or tanh function. As a rule of thumb, the relu function is used in the hidden layer neurons and the sigmoid function is used for the output layer neuron in binary classification.
3. The outputs from the first hidden layer neurons are multiplied by the weights of the second hidden layer; the results are summed together and passed to the neurons of the following layers. This process continues until the output layer is reached. The values calculated at the output layer are the actual outputs of the algorithm.
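
The three steps above can be sketched with NumPy for a network with one hidden layer. The layer sizes, the random weights, and the function name `feed_forward` are assumptions chosen for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(x, W1, b1, W2, b2):
    """One forward pass: input -> hidden layer (relu) -> output layer (sigmoid)."""
    hidden = relu(x @ W1 + b1)          # weighted sum plus bias, then activation
    output = sigmoid(hidden @ W2 + b2)  # same operation on the hidden layer's outputs
    return output

# Hypothetical sizes: 4 input features, 5 hidden neurons, 1 output neuron
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 5))
b1 = np.zeros(5)
W2 = rng.normal(size=(5, 1))
b2 = np.zeros(1)

x = np.array([5.1, 3.5, 1.4, 0.2])  # one sample with four features
y_hat = feed_forward(x, W1, b1, W2, b2)
print(y_hat)
```

Because the weights are random, the output is just an untrained guess between 0 and 1.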

The feed-forward phase consists of these three steps. However, the predicted output is not necessarily correct right away; it can be wrong, and we need to correct it. The purpose of a learning algorithm is to make predictions that are as accurate as possible. To improve these predicted results, a neural network will then go through a back propagation phase. During back propagation, the weights of different neurons are updated in a way that the difference between the desired and predicted output is as small as possible.

## BackPropagation 
The back propagation phase consists of the following steps:

1. The error is calculated by quantifying the difference between the predicted output and the desired output. This difference is called the "loss" and the function used to calculate it is called the "loss function". Loss functions can be of different types, e.g. mean squared error or cross entropy. Remember, neural networks are supervised learning algorithms that need the desired outputs for a given set of inputs, which is what allows them to learn from the data.
2. Once the error is calculated, the next step is to minimize that error. To do so, the partial derivative of the error function is calculated with respect to all the weights and biases, and the weights are adjusted in the direction that lowers the error. This is called gradient descent. The derivatives give the slope of the error function. If the slope is positive, the value of the weight is reduced; if the slope is negative, the value of the weight is increased. This reduces the overall error. The function that is used to reduce this error is called the optimization function.
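
The update rule in step 2 can be illustrated on the smallest possible model: a single weight trained with mean squared error. This is a sketch of gradient descent only, not of full backpropagation through several layers. The toy data, learning rate, and number of iterations are assumptions.

```python
import numpy as np

# Hypothetical toy data generated from y = 2 * x, so the ideal weight is 2.0
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

w = 0.0               # initial weight
learning_rate = 0.01
losses = []

for epoch in range(200):
    y_pred = w * x                      # feed-forward
    error = y_pred - y
    loss = np.mean(error ** 2)          # loss function: mean squared error
    gradient = np.mean(2 * error * x)   # slope of the loss with respect to w
    w -= learning_rate * gradient       # positive slope lowers w, negative slope raises it
    losses.append(loss)

print(w, losses[0], losses[-1])
```

The loss shrinks at every iteration and the weight settles close to 2.0, the value used to generate the data.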

One cycle of feed-forward and back propagation over the whole training set is called one "epoch". This process continues until a reasonable accuracy is achieved. There is no standard for reasonable accuracy. Ideally you'd strive for 100% accuracy, but this is extremely difficult to achieve for any non-trivial dataset. In many cases 90%+ accuracy is considered acceptable, but it really depends on your use-case.

In [2]:
import numpy as np 
import matplotlib.pyplot as plt
import pandas as pd

# Location of dataset
url = "Datasets/Iris.data"

# Assign column names to the dataset
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Target']

# Read dataset to pandas dataframe
irisdata = pd.read_csv(url, names=names) 
print(type(irisdata))
irisdata.head()

<class 'pandas.core.frame.DataFrame'>


|   | sepal-length | sepal-width | petal-length | petal-width | Target |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |


In [3]:
print("Iris data set dimensions : {}".format(irisdata.shape))

Iris data set dimensions : (150, 5)


In [4]:
print(irisdata.groupby('Target').size())

Target
Iris-setosa        50
Iris-versicolor    50
Iris-virginica     50
dtype: int64


In [5]:
import seaborn as sns
# Pass the column by keyword; recent seaborn versions no longer accept it positionally
sns.countplot(x='Target', data=irisdata)

<matplotlib.axes._subplots.AxesSubplot at 0x1464f4b9e8>

In [6]:
irisdata.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
sepal-length    150 non-null float64
sepal-width     150 non-null float64
petal-length    150 non-null float64
petal-width     150 non-null float64
Target          150 non-null object
dtypes: float64(4), object(1)
memory usage: 5.9+ KB


In [7]:
# isnull() is an alias of isna(), so one call is enough to count missing values per column
irisdata.isna().sum()

sepal-length    0
sepal-width     0
petal-length    0
petal-width     0
Target          0
dtype: int64

In [4]:
# Assign data from first four columns to X variable
X = irisdata.iloc[:, 0:4]

# Assign data from the fifth column (Target, the only object column) to y variable
y = irisdata.select_dtypes(include=[object])  
y.head()

|   | Target |
|---|---|
| 0 | Iris-setosa |
| 1 | Iris-setosa |
| 2 | Iris-setosa |
| 3 | Iris-setosa |
| 4 | Iris-setosa |


In [5]:
a = y.Target.unique()  
print(a[0])

Iris-setosa


In [5]:
from sklearn import preprocessing  
le = preprocessing.LabelEncoder()

y = y.apply(le.fit_transform) 
y.Target.unique()

array([0, 1, 2], dtype=int64)
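
To see which integer stands for which species, the encoder can be inspected and reversed. The sketch below uses a short hand-written list of labels instead of the workshop dataframe, so it runs on its own.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the Target column
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)

# classes_ lists the original labels in sorted order; position i is encoded as integer i
print(encoder.classes_)
print(encoded)

# inverse_transform maps the integers back to the original labels
decoded = encoder.inverse_transform(encoded)
print(decoded)
```

Keeping the fitted encoder around is useful later, when the integer predictions of a classifier need to be turned back into species names.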

In [7]:
from sklearn.model_selection import train_test_split  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20)
print(y_train)

              Target
114   Iris-virginica
42       Iris-setosa
63   Iris-versicolor
27       Iris-setosa
25       Iris-setosa
6        Iris-setosa
17       Iris-setosa
100   Iris-virginica
65   Iris-versicolor
124   Iris-virginica
129   Iris-virginica
19       Iris-setosa
60   Iris-versicolor
96   Iris-versicolor
33       Iris-setosa
23       Iris-setosa
98   Iris-versicolor
38       Iris-setosa
80   Iris-versicolor
125   Iris-virginica
93   Iris-versicolor
9        Iris-setosa
77   Iris-versicolor
45       Iris-setosa
73   Iris-versicolor
88   Iris-versicolor
21       Iris-setosa
36       Iris-setosa
126   Iris-virginica
69   Iris-versicolor
..               ...
47       Iris-setosa
86   Iris-versicolor
140   Iris-virginica
143   Iris-virginica
127   Iris-virginica
107   Iris-virginica
46       Iris-setosa
50   Iris-versicolor
43       Iris-setosa
95   Iris-versicolor
54   Iris-versicolor
142   Iris-virginica
99   Iris-versicolor
48       Iris-setosa
3        Iris-setosa
81   Iris-ver
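
The split above is random, so the rows in each part change on every run. A possible variation, sketched here on scikit-learn's bundled copy of the Iris data (an assumption, used so the snippet is self-contained), fixes the seed with `random_state` and keeps the class proportions equal in both parts with `stratify`.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Bundled Iris data: 150 samples, 4 features, labels already encoded as 0, 1, 2
X, y = load_iris(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.20,    # 20% of the rows go to the test set
    random_state=42,   # fixed seed: the same split on every run
    stratify=y,        # same class proportions in train and test
)

print(X_train.shape, X_test.shape)
print(np.bincount(y_train), np.bincount(y_test))
```

With 50 samples per class, stratifying gives 40 training and 10 test samples for each species.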

In [7]:
from sklearn.preprocessing import StandardScaler  
scaler = StandardScaler()  
scaler.fit(X_train)

X_train = scaler.transform(X_train)  
X_test = scaler.transform(X_test) 
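
The effect of the scaling step can be checked with a short sketch. It again uses scikit-learn's bundled Iris data as a self-contained stand-in for the workshop file. The scaler is fitted on the training set only, so no information from the test set leaks into training.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean and std from the training set
X_test_scaled = scaler.transform(X_test)        # reuse the same mean and std on the test set

train_mean = X_train_scaled.mean(axis=0)
train_std = X_train_scaled.std(axis=0)
print(train_mean.round(6))
print(train_std.round(6))
```

After scaling, every training feature has mean 0 and standard deviation 1. The test features come out close to those values but not exactly, because they were scaled with the training statistics.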