<a href="https://colab.research.google.com/github/a-nagar/vistra-intermediate/blob/main/Neural_Nets.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Neural Networks

Neural networks are powerful machine learning models that can learn from many kinds of data, including tabular data, text, images, and video. They are among the most popular models and form the foundation of many deep learning models and techniques.

Neural networks are built from perceptrons (also called neurons). Perceptrons are stacked in layers to form a multi-layer perceptron (MLP), or neural network. Each layer can contain multiple perceptrons, and the layers are connected in sequence from the input layer all the way to the output layer.
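To make the idea of stacking perceptrons concrete, here is a minimal sketch in plain Python. The weights, biases, and the sigmoid activation are made-up values chosen for illustration (the models later in this notebook use ReLU in the hidden layers), so treat it as a toy forward pass rather than how scikit-learn implements it.

```python
import math

def perceptron(inputs, weights, bias):
    """One neuron: weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all receive the same inputs."""
    return [perceptron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Illustrative values only: 2 input features -> 2 hidden neurons -> 1 output neuron
x = [1.0, 2.0]
hidden = layer(x, [[0.5, -0.25], [0.0, 0.0]], [0.0, 0.0])
output = layer(hidden, [[1.0, 1.0]], [-1.0])

print(hidden, output)
```

The output of each layer becomes the input of the next, which is what "connected in sequence" means in practice.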

The number of neurons in the input layer equals the number of input features in the data, and for classification the number of neurons in the output layer generally equals the number of classes in the dataset. For the special case of binary classification, you can get by with a single output neuron.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import sklearn
from sklearn.neural_network import MLPClassifier
from sklearn.neural_network import MLPRegressor

# Import necessary modules
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from math import sqrt
from sklearn.metrics import r2_score

In [None]:
import pandas as pd

col_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']
# load dataset
pima = pd.read_csv("https://an-utd-python.s3.us-west-1.amazonaws.com/pima-indians-diabetes.csv", header=None, names=col_names)



In [None]:
pima.head()

Unnamed: 0,pregnant,glucose,bp,skin,insulin,bmi,pedigree,age,label
0,6,148,72,35,0,33.6,0.627,50,1
1,1,85,66,29,0,26.6,0.351,31,0
2,8,183,64,0,0,23.3,0.672,32,1
3,1,89,66,23,94,28.1,0.167,21,0
4,0,137,40,35,168,43.1,2.288,33,1


In [None]:
pima.describe()

Unnamed: 0,pregnant,glucose,bp,skin,insulin,bmi,pedigree,age,label
count,768.0,768.0,768.0,768.0,768.0,768.0,768.0,768.0,768.0
mean,3.845052,120.894531,69.105469,20.536458,79.799479,31.992578,0.471876,33.240885,0.348958
std,3.369578,31.972618,19.355807,15.952218,115.244002,7.88416,0.331329,11.760232,0.476951
min,0.0,0.0,0.0,0.0,0.0,0.0,0.078,21.0,0.0
25%,1.0,99.0,62.0,0.0,0.0,27.3,0.24375,24.0,0.0
50%,3.0,117.0,72.0,23.0,30.5,32.0,0.3725,29.0,0.0
75%,6.0,140.25,80.0,32.0,127.25,36.6,0.62625,41.0,1.0
max,17.0,199.0,122.0,99.0,846.0,67.1,2.42,81.0,1.0


In [None]:
feature_cols = ['pregnant', 'insulin', 'bmi', 'age','glucose','bp','pedigree']
X = pima[feature_cols] # Features
y = pima.label # Target variable

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=40)


In [None]:
X_train.shape

(614, 7)

In [None]:
X_test.shape

(154, 7)

In [None]:
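from sklearn.neural_network import MLPClassifier

# Two hidden layers of 8 neurons each, ReLU activations, Adam optimizer
mlp = MLPClassifier(hidden_layer_sizes=(8, 8), activation='relu', solver='adam', max_iter=1000)
# Train the network on the training split
mlp.fit(X_train, y_train)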


Can you figure out how many connections there are in this neural network?

Each connection is given a weight that is optimized by the training algorithm.
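One way to work out the answer is to multiply the sizes of each pair of adjacent layers and add the results. The sketch below assumes a fully connected network with 7 input features, two hidden layers of 8 neurons, and a single output neuron for binary classification; the helper name `count_connections` is hypothetical.

```python
def count_connections(layer_sizes):
    """Count weights in a fully connected network: sum of products of adjacent layer sizes."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# Assumed architecture: 7 inputs -> 8 hidden -> 8 hidden -> 1 output
layer_sizes = [7, 8, 8, 1]

n_weights = count_connections(layer_sizes)  # 7*8 + 8*8 + 8*1
n_biases = sum(layer_sizes[1:])             # one bias per non-input neuron

print(n_weights, n_biases)
```

On the fitted model, the same count should be available as `sum(w.size for w in mlp.coefs_)`, since `coefs_` holds one weight matrix per pair of adjacent layers.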

In [None]:
y_predict_train = mlp.predict(X_train)
y_predict_test = mlp.predict(X_test)

In [None]:
y_predict_test

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
       0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [None]:
from sklearn.metrics import classification_report,confusion_matrix
print(classification_report(y_test,y_predict_test))

              precision    recall  f1-score   support

           0       0.64      0.97      0.77        95
           1       0.70      0.12      0.20        59

    accuracy                           0.64       154
   macro avg       0.67      0.54      0.49       154
weighted avg       0.66      0.64      0.55       154



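Are you satisfied with the accuracy values? Overall accuracy is 0.64, but recall for class 1 is only 0.12, so the model misses most of the positive cases. Do you think overfitting has happened? One way to check is to compare performance on the training predictions (`y_predict_train`) with the test results above. Try changing the values of the hyperparameters and see if it makes a difference.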
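As a rough sketch of such an experiment, the loop below trains a few network sizes and compares training and test accuracy. It uses synthetic data from `make_classification` as a stand-in so that it runs on its own; the variable names and the candidate layer sizes are illustrative assumptions, not recommended settings. A large gap between training and test accuracy is a common sign of overfitting.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data with 7 features (not the diabetes dataset)
X_demo, y_demo = make_classification(
    n_samples=500, n_features=7, n_informative=4, random_state=0
)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.20, random_state=40)

results = {}
for sizes in [(4,), (8, 8), (32, 32)]:
    clf = MLPClassifier(hidden_layer_sizes=sizes, max_iter=1000, random_state=0)
    clf.fit(Xtr, ytr)
    train_acc = accuracy_score(ytr, clf.predict(Xtr))
    test_acc = accuracy_score(yte, clf.predict(Xte))
    results[sizes] = (train_acc, test_acc)
    print(sizes, "train:", round(train_acc, 3), "test:", round(test_acc, 3))
```

To run the same comparison on the diabetes data, replace the synthetic split with the `X_train`, `X_test`, `y_train`, and `y_test` created earlier.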

## Using Neural Networks for Regression

Let's apply a neural network to a regression dataset. The model used here is called the *MLPRegressor*. Let's use the famous California Housing dataset.

In [None]:
housing = pd.read_csv("https://raw.githubusercontent.com/a-nagar/datasets/main/california_housing.csv")

In [None]:
housing.head()

Unnamed: 0,longitude,latitude,housing_median_age,total_rooms,total_bedrooms,population,households,median_income,median_house_value
0,-122.23,37.88,41,880,129.0,322,126,8.3252,452600
1,-122.22,37.86,21,7099,1106.0,2401,1138,8.3014,358500
2,-122.24,37.85,52,1467,190.0,496,177,7.2574,352100
3,-122.25,37.85,52,1274,235.0,558,219,5.6431,341300
4,-122.25,37.85,52,1627,280.0,565,259,3.8462,342200


In [None]:
housing.dropna(inplace=True)

In [None]:
feature_cols = ['housing_median_age', 'total_rooms', 'total_bedrooms', 'population','households','median_income']
X = housing[feature_cols] # Features
y = housing['median_house_value'] # Target variable

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=40)


In [None]:
from sklearn.neural_network import MLPRegressor

mlp = MLPRegressor(hidden_layer_sizes=(8, 8), activation='relu', solver='adam', max_iter=1000)
mlp.fit(X_train,y_train)


In [None]:
y_predict_test = mlp.predict(X_test)

In [None]:
from sklearn.metrics import mean_absolute_error, mean_squared_error

In [None]:
mean_absolute_error(y_test, y_predict_test)

55654.9516104566
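Mean absolute error is only one way to summarize regression quality. The sketch below computes MAE, RMSE, and R² by hand on a tiny made-up example so the formulas are visible; the helper `regression_metrics` and the demo values are hypothetical and are not results from the housing model.

```python
from math import sqrt

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2

# Made-up values for illustration only
y_true_demo = [100.0, 200.0, 300.0]
y_pred_demo = [110.0, 190.0, 300.0]

mae, rmse, r2 = regression_metrics(y_true_demo, y_pred_demo)
print(mae, rmse, r2)
```

In the notebook itself, the functions imported at the top give the same quantities: `sqrt(mean_squared_error(y_test, y_predict_test))` for RMSE and `r2_score(y_test, y_predict_test)` for R².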

## Working With Another Dataset

Your task is to load the famous heart disease dataset and try to build a predictive model for it using MLPClassifier.

In [None]:
heart_disease = pd.read_csv("https://raw.githubusercontent.com/a-nagar/datasets/main/heart_disease.csv")

In [None]:
heart_disease.head()

Unnamed: 0,age,sex,chest pain type,resting bp s,cholesterol,fasting blood sugar,resting ecg,max heart rate,exercise angina,oldpeak,ST slope,target
0,40,1,2,140,289,0,0,172,0,0.0,1,0
1,49,0,3,160,180,0,0,156,0,1.0,2,1
2,37,1,2,130,283,0,1,98,0,0.0,1,0
3,48,0,4,138,214,0,0,108,1,1.5,2,1
4,54,1,3,150,195,0,0,122,0,0.0,1,0


In [None]:
heart_disease.describe()

Unnamed: 0,age,sex,chest pain type,resting bp s,cholesterol,fasting blood sugar,resting ecg,max heart rate,exercise angina,oldpeak,ST slope,target
count,1190.0,1190.0,1190.0,1190.0,1190.0,1190.0,1190.0,1190.0,1190.0,1190.0,1190.0,1190.0
mean,53.720168,0.763866,3.232773,132.153782,210.363866,0.213445,0.698319,139.732773,0.387395,0.922773,1.62437,0.528571
std,9.358203,0.424884,0.93548,18.368823,101.420489,0.409912,0.870359,25.517636,0.48736,1.086337,0.610459,0.499393
min,28.0,0.0,1.0,0.0,0.0,0.0,0.0,60.0,0.0,-2.6,0.0,0.0
25%,47.0,1.0,3.0,120.0,188.0,0.0,0.0,121.0,0.0,0.0,1.0,0.0
50%,54.0,1.0,4.0,130.0,229.0,0.0,0.0,140.5,0.0,0.6,2.0,1.0
75%,60.0,1.0,4.0,140.0,269.75,0.0,2.0,160.0,1.0,1.6,2.0,1.0
max,77.0,1.0,4.0,200.0,603.0,1.0,2.0,202.0,1.0,6.2,3.0,1.0
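One possible starting point for this task is sketched below. It assumes the label column is named `target` (as in the output above), treats every other column as a numeric feature (a simplification for categorical codes such as `chest pain type`), and standardizes the features before training. The helper `fit_heart_mlp` and the small random frame used to exercise it are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def fit_heart_mlp(df, target_col="target"):
    """Split df, scale the features, fit an MLPClassifier, and return model plus test split."""
    X = df.drop(columns=[target_col])
    y = df[target_col]
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.20, random_state=40, stratify=y
    )
    model = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(8, 8), max_iter=1000, random_state=0),
    )
    model.fit(X_tr, y_tr)
    return model, X_te, y_te

# Random stand-in frame so the sketch runs on its own (not real patient data)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "age": rng.integers(28, 78, 200),
    "cholesterol": rng.integers(100, 400, 200),
    "target": rng.integers(0, 2, 200),
})
model_hd, X_test_hd, y_test_hd = fit_heart_mlp(demo)
pred_hd = model_hd.predict(X_test_hd)
```

With the real data, call `fit_heart_mlp(heart_disease)` and pass the returned test split to `classification_report` to evaluate the model.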


## Image Classification

More advanced neural networks can be used for image identification and classification. Here is an example from the Keras documentation:
https://keras.io/examples/vision/image_classification_from_scratch/