
**Problem Statement**

The aim of this guide is to build a classification model that detects diabetes. We will use the diabetes dataset, which contains 768 observations and 9 variables, as described below:

pregnancies - Number of times pregnant.
glucose - Plasma glucose concentration.
diastolic - Diastolic blood pressure (mm Hg).
triceps - Triceps skinfold thickness (mm).
insulin - 2-Hour serum insulin (mu U/ml).
bmi - Body mass index (weight in kg/(height in m)^2).
dpf - Diabetes pedigree function.
age - Age in years.
diabetes - “1” represents the presence of diabetes while “0” represents the absence of it. This is the target variable.

**Evaluation Metric**

We will evaluate the performance of the model using accuracy, which represents the percentage of cases correctly classified.

Mathematically, for a binary classifier, it's represented as accuracy = (TP+TN)/(TP+TN+FP+FN), where:

True Positive, or TP, are cases with positive labels which have been correctly classified as positive.
True Negative, or TN, are cases with negative labels which have been correctly classified as negative.
False Positive, or FP, are cases with negative labels which have been incorrectly classified as positive.
False Negative, or FN, are cases with positive labels which have been incorrectly classified as negative.
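
The accuracy formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not part of the original notebook: the helper names `confusion_counts` and `accuracy` and the toy labels are assumptions made up for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary classifier."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # positive label, predicted positive
            else:
                fp += 1  # negative label, predicted positive
        else:
            if actual == positive:
                fn += 1  # positive label, predicted negative
            else:
                tn += 1  # negative label, predicted negative
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """accuracy = (TP + TN) / (TP + TN + FP + FN)"""
    return (tp + tn) / (tp + tn + fp + fn)


# Toy labels, made up for illustration (not taken from the diabetes dataset)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)             # 3 3 1 1
print(accuracy(tp, tn, fp, fn))   # 0.75
```

Six of the eight toy cases are classified correctly, so the accuracy is 6/8 = 0.75.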

https://www.pluralsight.com/guides/machine-learning-neural-networks-scikit-learn

**Steps**

In this guide, we will follow these steps:

Step 1 - Loading the required libraries and modules.

Step 2 - Reading the data and performing basic data checks.

Step 3 - Creating arrays for the features and the response variable.

Step 4 - Creating the training and test datasets.

Step 5 - Building, predicting, and evaluating the neural network model.

The following sections cover these steps.

**Step 1 - Loading the Required Libraries and Modules**

In [1]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import sklearn
from sklearn.neural_network import MLPClassifier

# Import necessary modules
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix

**Step 2 - Reading the Data and Performing Basic Data Checks**

In [2]:
df = pd.read_csv('diabetes.csv')
print(df.shape)
df.describe().transpose()


**Step 3 - Creating Arrays for the Features and the Response Variable**

In [3]:
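target_column = ['diabetes']
# A list comprehension keeps the original column order (a set difference does not guarantee it)
predictors = [col for col in df.columns if col not in target_column]
# Scale each predictor by its column maximum
df[predictors] = df[predictors]/df[predictors].max()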
df.describe().transpose()
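
The scaling in the cell above divides every predictor by the largest value in its column, so non-negative columns end up in the range 0 to 1. A minimal plain-Python sketch of the same idea follows; the helper `scale_by_column_max` and the sample rows are hypothetical and only illustrate the arithmetic.

```python
def scale_by_column_max(rows):
    """Divide every value by the maximum of its column."""
    col_max = [max(col) for col in zip(*rows)]
    return [[value / m for value, m in zip(row, col_max)] for row in rows]


# Hypothetical (glucose, age) rows, made up for illustration
rows = [[100.0, 25.0], [150.0, 50.0], [200.0, 40.0]]
scaled = scale_by_column_max(rows)
print(scaled)  # [[0.5, 0.5], [0.75, 1.0], [1.0, 0.8]]
```

After scaling, each column's maximum is exactly 1.0, which keeps features with large raw values from dominating the network's inputs.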


**Step 4 - Creating the Training and Test Datasets**

In [4]:
X = df[predictors].values
# Flatten y to 1-D; scikit-learn expects shape (n_samples,) for a single target
y = df[target_column].values.ravel()

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=40)
print(X_train.shape)
print(X_test.shape)
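
As a quick sanity check on the shapes printed above, the split sizes can be worked out by hand. The sketch below assumes that a fractional `test_size` is rounded up to a whole number of test rows and that the remaining rows go to training; the helper `split_sizes` is hypothetical and is not a scikit-learn function.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Assumption: the test count is rounded up and the rest is used for training.
    """
    n_test = math.ceil(test_size * n_samples)
    n_train = n_samples - n_test
    return n_train, n_test


print(split_sizes(768, 0.30))  # (537, 231)
```

With 768 observations and 8 predictors, the training array would therefore be expected to have shape (537, 8) and the test array (231, 8).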


**Step 5 - Building, Predicting, and Evaluating the Neural Network Model**

In [5]:
from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(hidden_layer_sizes=(8,8,8), activation='relu', solver='adam', max_iter=500)
mlp.fit(X_train,y_train)

predict_train = mlp.predict(X_train)
predict_test = mlp.predict(X_test)


In [6]:
from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(y_train,predict_train))
print(classification_report(y_train,predict_train))


In [7]:
print(confusion_matrix(y_test,predict_test))
print(classification_report(y_test,predict_test))
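
Since accuracy is the evaluation metric for this guide, it can also be read directly off a confusion matrix. For labels 0 and 1, scikit-learn lays the matrix out as [[TN, FP], [FN, TP]], with rows for true labels and columns for predicted labels. The sketch below uses a hypothetical matrix with made-up counts, not results from this model, and the helper `accuracy_from_confusion` is an assumption for illustration.

```python
def accuracy_from_confusion(matrix):
    """Accuracy = correct predictions (the diagonal) / all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical 2x2 matrix laid out as [[TN, FP], [FN, TP]]; counts are made up
cm = [[50, 10],
      [15, 25]]
print(accuracy_from_confusion(cm))  # 0.75
```

The array returned by `confusion_matrix` can be passed to this helper after calling `.tolist()` on it; `sklearn.metrics.accuracy_score(y_test, predict_test)` computes the same quantity without building the matrix first.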
