## <font color='green'>Python Programming: Neural Networks</font>

### Import Libraries

In [9]:
# Importing libraries

import pandas as pd
from sklearn.model_selection import train_test_split
# Import a standardization library
from sklearn.preprocessing import StandardScaler
# Import a Multi-Layer Perceptron Classifier model estimator from Scikit-Learn's neural_network library
from sklearn.neural_network import MLPClassifier

from sklearn.metrics import classification_report,confusion_matrix


### Example 1: Wine Classification

In this example we are going to use neural networks to classify wines grown in the same region of Italy into three possible cultivars, based on various chemical features.

**Load Data**

In [10]:
# Loading data
wine = pd.read_csv('http://bit.ly/wine_classification_data', names =["Cultivator", "Alchol", "Malic_Acid", "Ash", "Alcalinity_of_Ash", "Magnesium", "Total_phenols", "Falvanoids", "Nonflavanoid_phenols", "Proanthocyanins", "Color_intensity", "Hue", "OD280", "Proline"])
wine.head()
# wine.shape

| | Cultivator | Alchol | Malic_Acid | Ash | Alcalinity_of_Ash | Magnesium | Total_phenols | Falvanoids | Nonflavanoid_phenols | Proanthocyanins | Color_intensity | Hue | OD280 | Proline |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 14.23 | 1.71 | 2.43 | 15.6 | 127 | 2.8 | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065 |
| 1 | 1 | 13.2 | 1.78 | 2.14 | 11.2 | 100 | 2.65 | 2.76 | 0.26 | 1.28 | 4.38 | 1.05 | 3.4 | 1050 |
| 2 | 1 | 13.16 | 2.36 | 2.67 | 18.6 | 101 | 2.8 | 3.24 | 0.3 | 2.81 | 5.68 | 1.03 | 3.17 | 1185 |
| 3 | 1 | 14.37 | 1.95 | 2.5 | 16.8 | 113 | 3.85 | 3.49 | 0.24 | 2.18 | 7.8 | 0.86 | 3.45 | 1480 |
| 4 | 1 | 13.24 | 2.59 | 2.87 | 21.0 | 118 | 2.8 | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735 |


In [11]:
# Setting up our labels and features
X =  wine.drop('Cultivator', axis = 1)
y = wine["Cultivator"]


**Split Data**

In [12]:
# Splitting the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 10)


**Normalization of data**

The Multi-Layer Perceptron is very sensitive to feature scaling, so it is good practice to scale our data.

However, the scaler is fitted on the training data only, and the fitted transformation is then applied to both the training and the test data. The test set stands in for unseen real-world data, so its statistics must not influence the preprocessing. Fitting the scaler on the test data would leak information into training and make the evaluation look better than it really is.

In [13]:
# Initialize the scaler
scaler = StandardScaler()

# Fitting the scaler
scaler.fit(X_train)

# Applying the transformation to the data
X_train = scaler.transform(X_train)

X_test = scaler.transform(X_test)
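To make the "fit on train, transform both" pattern concrete, here is a minimal sketch on synthetic data. The array `X_demo` and the other names are hypothetical stand-ins chosen for illustration, not part of the wine dataset. It checks that the scaler's stored mean comes from the training rows alone, and that the test rows are scaled with those same statistics, so they are only approximately centred.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a feature matrix (hypothetical data, not the wine dataset)
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=50.0, scale=10.0, size=(200, 3))

X_tr, X_te = train_test_split(X_demo, test_size=0.2, random_state=10)

# The scaler learns its mean and standard deviation from the training rows only
demo_scaler = StandardScaler().fit(X_tr)
X_tr_scaled = demo_scaler.transform(X_tr)
# The same training statistics are reused to transform the test rows
X_te_scaled = demo_scaler.transform(X_te)

print("scaler mean equals train mean:", np.allclose(demo_scaler.mean_, X_tr.mean(axis=0)))
print("train means after scaling:", X_tr_scaled.mean(axis=0).round(6))
print("test means after scaling: ", X_te_scaled.mean(axis=0).round(3))
```

The training columns end up with mean 0 and standard deviation 1 by construction, while the test columns land close to, but not exactly at, those values.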


**Training the Model**

In [14]:
# Creating an instance of the model
# The MLPClassifier takes a number of arguments, but for now we only set hidden_layer_sizes and max_iter (the maximum number of training iterations). We will explore the rest of the arguments in the next session, after we have looked at optimization.
# For hidden_layer_sizes we pass in a tuple: the length of the tuple is the number of hidden layers, and the ith number in the tuple is the number of neurons in the ith hidden layer.
# Here we choose 3 hidden layers with 13 neurons each
# You can read more on the MLPClassifier here: https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier

mlp = MLPClassifier(hidden_layer_sizes = (13, 13,13), max_iter = 500)

# fitting the data
mlp.fit(X_train,y_train)

# By default the activation is set to the ReLU function, but you can change it to suit your needs. The other available options are listed in the documentation.

MLPClassifier(hidden_layer_sizes=(13, 13, 13), max_iter=500)
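As a rough illustration of how the `hidden_layer_sizes` tuple maps to the network's depth, the sketch below fits a few small networks on toy data and reads back the fitted `n_layers_` attribute, which counts the input and output layers as well as the hidden ones. The toy data from `make_classification` and the variable names are assumptions made for this example. The short `max_iter` is only there to keep it fast, so convergence warnings are silenced.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

# Hypothetical toy data: 13 features and 3 classes, mirroring the wine setup
X_toy, y_toy = make_classification(
    n_samples=150, n_features=13, n_informative=6, n_classes=3, random_state=0
)

layer_counts = {}
for sizes in [(13, 13, 13), (20,), (8, 4)]:
    model = MLPClassifier(hidden_layer_sizes=sizes, max_iter=50, random_state=0)
    with warnings.catch_warnings():
        # Training is cut short on purpose; we only care about the architecture
        warnings.simplefilter("ignore", ConvergenceWarning)
        model.fit(X_toy, y_toy)
    # n_layers_ = input layer + hidden layers + output layer
    layer_counts[sizes] = model.n_layers_
    print(sizes, "->", len(sizes), "hidden layers,", model.n_layers_, "layers in total")
```

So `(13, 13, 13)` gives three hidden layers (five layers in total), while `(20,)` gives a single hidden layer of 20 neurons.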

**Prediction and Evaluation**

In [15]:
# Now that we have our model in place, let's do the prediction

pred = mlp.predict(X_test)


# Evaluating the performance of our model
print(confusion_matrix(y_test, pred))

print('-----------------------------------------------')

print(classification_report(y_test,pred))

[[10  0  0]
 [ 2 12  4]
 [ 0  0  8]]
-----------------------------------------------
              precision    recall  f1-score   support

           1       0.83      1.00      0.91        10
           2       1.00      0.67      0.80        18
           3       0.67      1.00      0.80         8

    accuracy                           0.83        36
   macro avg       0.83      0.89      0.84        36
weighted avg       0.88      0.83      0.83        36



**Conclusion**

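From the confusion matrix we can see that the model misclassified six of the 36 bottles of wine in our test data, all of them from cultivar 2: two were predicted as cultivar 1 and four as cultivar 3.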
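The figures in the classification report can be checked by hand from the confusion matrix. The sketch below copies the matrix from the output above and recomputes the number of misclassified bottles, the accuracy, and the per-class precision and recall with NumPy. The variable names are chosen for illustration.

```python
import numpy as np

# Confusion matrix copied from the output above
# (rows = true cultivar, columns = predicted cultivar)
cm = np.array([[10, 0, 0],
               [2, 12, 4],
               [0, 0, 8]])

correct = np.trace(cm)           # diagonal entries are correct predictions
total = cm.sum()
misclassified = total - correct  # off-diagonal entries are errors

accuracy = correct / total
recall = np.diag(cm) / cm.sum(axis=1)     # correct / all bottles truly in the class
precision = np.diag(cm) / cm.sum(axis=0)  # correct / all bottles predicted as the class

print("misclassified:", misclassified)
print("accuracy:", round(accuracy, 2))
print("recall per class:", recall.round(2))
print("precision per class:", precision.round(2))
```

The results agree with the report: 6 misclassified bottles, accuracy 0.83, recall of 1.00, 0.67 and 1.00, and precision of 0.83, 1.00 and 0.67 for cultivars 1 to 3.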

One downside of using a Multi-Layer Perceptron model is that the model itself is difficult to interpret. The weights and biases do not tell us in any straightforward way which features are important to the model.

However, we can still extract the weights and biases after training our model.



**PS**: Try using different activation functions and see which one gives you the best results.
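One possible starting point for this experiment is sketched below. It assumes scikit-learn's bundled `load_wine` dataset as a stand-in for the CSV used above (it also has 13 features and three classes, but labels them 0 to 2), and it fixes `random_state` so runs are repeatable. Both choices are assumptions made for this sketch. The four strings are the values that `MLPClassifier` accepts for `activation`.

```python
import warnings

from sklearn.datasets import load_wine
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Stand-in data: scikit-learn's bundled wine dataset (assumed similar to the CSV)
X_w, y_w = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_w, y_w, test_size=0.2, random_state=10)

# Fit the scaler on the training split only, then transform both splits
act_scaler = StandardScaler().fit(X_tr)
X_tr, X_te = act_scaler.transform(X_tr), act_scaler.transform(X_te)

scores = {}
for act in ["identity", "logistic", "tanh", "relu"]:
    clf = MLPClassifier(hidden_layer_sizes=(13, 13, 13), activation=act,
                        max_iter=500, random_state=0)
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", ConvergenceWarning)
        clf.fit(X_tr, y_tr)
    # score() returns the accuracy on the held-out test split
    scores[act] = clf.score(X_te, y_te)

for act, acc in scores.items():
    print(f"{act:>8}: test accuracy = {acc:.2f}")
```

With a test set this small, differences between activations can come down to one or two bottles, so it is worth repeating the comparison with several values of `random_state` before drawing conclusions.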

In [16]:
# Extracting the weights and bias vectors

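# Checking the number of weight matrices (one per pair of consecutive layers)
len(mlp.coefs_)

# Checking the number of bias vectors (one per hidden layer, plus one for the output layer)
len(mlp.intercepts_)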

4
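The count of 4 reflects the three hidden layers plus the output layer. To see what each entry looks like, the sketch below refits the same architecture and prints the shape of every weight matrix and bias vector. It assumes scikit-learn's bundled `load_wine` data as a stand-in for the CSV used above, and the variable names are illustrative.

```python
import warnings

from sklearn.datasets import load_wine
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Stand-in data: 13 features and 3 classes, like the wine CSV used above (an assumption)
X_w, y_w = load_wine(return_X_y=True)
X_w_scaled = StandardScaler().fit_transform(X_w)

net = MLPClassifier(hidden_layer_sizes=(13, 13, 13), max_iter=500, random_state=0)
with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    net.fit(X_w_scaled, y_w)

# coefs_[i] has shape (units in layer i, units in layer i + 1)
weight_shapes = [w.shape for w in net.coefs_]
# intercepts_[i] holds one bias per unit in layer i + 1
bias_shapes = [b.shape for b in net.intercepts_]

print("weight matrices:", weight_shapes)
print("bias vectors:   ", bias_shapes)
```

The first three matrices are 13 by 13 (13 inputs into 13 neurons, then hidden layer to hidden layer), and the last one is 13 by 3, mapping the final hidden layer to the three cultivar outputs.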

## <font color='green'>Challenge 1</font>

In [18]:
# Apply neural network technique to the Iris dataset we used in SVM to classify the different classes of flowers. Compare the performance of SVM to that of neural networks and see which is better
#  Dataset Url ----> http://bit.ly/Iris_flower_data

# Your code goes here

## <font color='green'>Challenge 2</font>

In [None]:
# Use the following dataset to classify whether a patient has diabetes or not

# Dataset Url -------> http://bit.ly/diabetes_data