## Mathematical model of a Neuron
<img src="./images/1.1/NN_2.png" width="80%">
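
A single artificial neuron is commonly modelled as a weighted sum of its inputs plus a bias, passed through an activation function. The sketch below is a minimal illustration under that assumption. The sigmoid activation, the function name `neuron_output`, and the sample weights are illustrative choices and are not taken from the figure.

```python
import math

def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of inputs plus bias, passed through a sigmoid activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Hypothetical inputs and weights, chosen only for illustration
print(neuron_output([5.1, 3.5, 1.4, 0.2], [0.1, -0.2, 0.3, 0.4], bias=0.0))
```

A Multi-Layer Perceptron stacks many such units in layers, and training adjusts the weights and biases.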

In the following example of Neural Networks, we will train a `Multi-Layer Perceptron` model to classify and predict the type of flowers from the Iris Dataset. Since we have only one dataset, we will train the model on 80% of the data, and test for accuracy on the remaining 20%.

We have to make sure that the model is trained on data drawn from the same distribution as the test data. If we train on a sample from Population 1 and test the model on a sample from Population 2, the model may generalize poorly, and the resulting gap between training and test accuracy can look like a `high variance problem`.

* `MLPClassifier` is the scikit-learn class for the Neural Network we are going to train
* `train_test_split` divides the given dataset as per the specified `test_size`. Here, 0.2 means 20%

In [1]:
import pandas as pd
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

<img src="./images/1.1/iris.png" width="80%">

In [2]:
df = pd.read_csv("./datasets/iris.data.txt", sep = ",", names = ['sepal_l', 'sepal_w', 'petal_l', 'petal_w', 'class'])
df.head()

| | sepal_l | sepal_w | petal_l | petal_w | class |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |


The three classes of Iris available in the dataset are as follows:

In [3]:
df['class'].unique()

array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'], dtype=object)

Since we are trying to carry out a classification task, it would be easier for us to map the classes to a unique integer ID. Here, we have created a dictionary and used the `replace` function of Pandas.

In [4]:
iris_class = {
    'Iris-setosa' : 0,
    'Iris-versicolor' : 1,
    'Iris-virginica' : 2
}

df = df.replace({"class": iris_class})
df.head()

| | sepal_l | sepal_w | petal_l | petal_w | class |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | 0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | 0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | 0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | 0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | 0 |
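
Once the classes are encoded as integers, it is often handy to go the other way and turn an integer ID back into a readable class name. The following is a small sketch of that idea. The names `id_to_name` and `decode` are hypothetical helpers introduced here, not part of the notebook.

```python
iris_class = {
    'Iris-setosa': 0,
    'Iris-versicolor': 1,
    'Iris-virginica': 2
}

# Invert the dictionary: integer ID -> class name
id_to_name = {class_id: name for name, class_id in iris_class.items()}

def decode(predictions):
    """Convert numeric predictions (possibly floats such as 1.0) to class names."""
    return [id_to_name[int(p)] for p in predictions]

print(decode([0.0, 2.0, 1.0]))
```

The `int(...)` cast is there because labels stored in a floating-point array come back as values like `1.0` rather than `1`.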


## Dividing the dataset in a Train:Test ratio of 80:20
<img src="./images/1.1/split.png" width="70%">

In [5]:
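# Make sure every column is numeric, then convert the DataFrame to a NumPy array
df = df.apply(pd.to_numeric)
df_array = df.values
# The first four columns are the features, the fifth is the class label; 20% is held out for testing
X_train, X_test, y_train, y_test = train_test_split(df_array[:,:4], df_array[:,4], test_size=0.2)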

In [6]:
print("X_train has {0} observations with {1} features".format(X_train.shape[0], X_train.shape[1]))
print("y_train has {0} class values".format(y_train.shape[0]))
print("X_test has {0} observations with {1} features".format(X_test.shape[0], X_test.shape[1]))
print("y_test has {0} class values".format(y_test.shape[0]))

X_train has 120 observations with 4 features
y_train has 120 class values
X_test has 30 observations with 4 features
y_test has 30 class values
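
A plain random split changes on every run and does not guarantee that each class is equally represented in both subsets. One possible refinement, sketched below, passes `stratify` and `random_state` to `train_test_split`. This is an optional variation rather than what the notebook does: it assumes scikit-learn's bundled copy of Iris (`load_iris`) in place of the CSV file, and the seed value 42 is arbitrary.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: scikit-learn's bundled Iris copy stands in for the CSV file
X, y = load_iris(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    stratify=y,        # keep class proportions equal in both subsets
    random_state=42    # arbitrary seed so the split is repeatable
)

# Count how many samples of each class landed in each subset
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
print("train:", train_counts, "test:", test_counts)
```

Stratifying keeps the class balance of the training and test subsets aligned, which is one practical way to keep both samples close to the same distribution.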


## Building the Neural Network has never been easier!!
<img src="./images/1.1/NN_3.png">

In [7]:
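# One hidden layer with 10 neurons; (10,) is a one-element tuple, whereas (10) is just the integer 10
mlp = MLPClassifier(hidden_layer_sizes = (10,),
                    solver = 'sgd',
                    learning_rate_init = 0.01,
                    max_iter = 200)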

In [8]:
mlp.fit(X_train, y_train)
print("Accuracy: {0}%".format(round(mlp.score(X_test,y_test), 2) * 100))

Accuracy: 97.0%
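
For a classifier, `score` returns the mean accuracy, that is, the fraction of test samples whose predicted class matches the true class. The sketch below recomputes that quantity by hand on made-up labels. The `accuracy` helper and the label lists are hypothetical and only illustrate the arithmetic.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical labels: 30 test samples with exactly one mistake
y_true = [0] * 10 + [1] * 10 + [2] * 10
y_pred = list(y_true)
y_pred[15] = 2  # one misclassified sample

acc = accuracy(y_true, y_pred)
print("Accuracy: {0}%".format(round(acc, 2) * 100))
```

With only 30 test samples, every misclassification moves the accuracy by roughly 3.3 percentage points, so a figure near 97% corresponds to a single error. Because the split is random, the exact number can vary from run to run.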




## Testing out a few examples

In [9]:
data = np.array(
    [
        [5.8, 4.0, 1.2, 0.2],
        [3.2, 3.1, 3.2, 1.2],
        [5.4, 2.0, 2.0, 2.3],
        [6.4, 3.2, 4.5, 1.5],
        [2.3, 4.1, 4.6, 3.2]
    ]
)

for i in data:
    print(i, " -> ", mlp.predict(i.reshape(1, -1)))

[5.8 4.  1.2 0.2]  ->  [0.]
[3.2 3.1 3.2 1.2]  ->  [1.]
[5.4 2.  2.  2.3]  ->  [1.]
[6.4 3.2 4.5 1.5]  ->  [1.]
[2.3 4.1 4.6 3.2]  ->  [2.]
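
`predict` also accepts a whole 2-D array at once, and `predict_proba` reports how the model distributes probability across the three classes for each row. The following is a sketch under stated assumptions: it retrains a comparable model on scikit-learn's bundled Iris copy with a fixed `random_state`, so the numbers it prints will not necessarily match the outputs shown above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier

# Assumption: the bundled Iris copy and a fixed seed stand in for the notebook's model
X, y = load_iris(return_X_y=True)
mlp = MLPClassifier(hidden_layer_sizes=(10,), solver='sgd',
                    learning_rate_init=0.01, max_iter=200, random_state=0)
mlp.fit(X, y)

data = np.array([
    [5.8, 4.0, 1.2, 0.2],
    [6.4, 3.2, 4.5, 1.5],
])

labels = mlp.predict(data)        # one call handles every row
proba = mlp.predict_proba(data)   # shape: (n_samples, n_classes)

for row, label, p in zip(data, labels, proba):
    print(row, " -> ", label, np.round(p, 2))
```

A model always returns some class, even for measurements far outside the range seen during training, so looking at the probabilities alongside the label gives a fuller picture than the label alone.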
