# Predicting the Onset of Diabetes in Pima Indians

* Dataset link: https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv
* Dataset details: https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.names

The Pima Indians Onset of Diabetes dataset contains medical record data for Pima Indian patients and records whether each patient had an onset of diabetes within five years.

This is a binary classification problem (1 for onset of diabetes, 0 otherwise). All of the input variables that describe each patient are numerical. This makes the data easy to use directly with neural networks, which expect numerical input and output values, and ideal for our first neural network in Keras.

## Problem Statement

The objective is to predict whether a patient has diabetes, based on the diagnostic measurements included in the Pima Indians diabetes dataset. All the patients are female and of Pima Indian heritage.

## Step 0: Import Required Libraries

In [None]:
# Importing necessary libraries
import numpy as np
import pandas as pd
from numpy import loadtxt
import keras
from keras.models import Sequential
from keras.layers import Dense

## Step 1: Load Data

In [None]:
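# Load with pandas to inspect the data. The CSV has no header row, so by default
# read_csv uses the first record as column names (pass header=None to avoid this).
dataset = pd.read_csv('./dataset/pima-indians-diabetes.data.csv')
# Load the dataset as a NumPy array for training
pimadataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')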


In [None]:
dataset.head()

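|   | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
|---|---|-----|----|----|---|------|-------|----|---|
| 0 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 1 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 2 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 3 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |
| 4 | 5 | 116 | 74 | 0 | 0 | 25.6 | 0.201 | 30 | 0 |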
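The file has no header row, which is why the first record appears as the column names in the frame above. A minimal sketch of loading it with explicit names follows. The column names are hypothetical short labels chosen for this illustration (they are not stored in the CSV), and the first five records are inlined so the sketch runs without the file.

```python
import io

import pandas as pd

# First five records of the dataset, inlined so the sketch runs without the file.
sample_csv = """6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
8,183,64,0,0,23.3,0.672,32,1
1,89,66,23,94,28.1,0.167,21,0
0,137,40,35,168,43.1,2.288,33,1
"""

# Hypothetical column labels (an assumption for readability, not part of the CSV).
column_names = [
    "pregnancies", "glucose", "blood_pressure", "skin_thickness",
    "insulin", "bmi", "pedigree", "age", "outcome",
]

# header=None tells pandas that the first line is data, not column names.
df = pd.read_csv(io.StringIO(sample_csv), header=None, names=column_names)
print(df.head())
```

With the real file, the same `header=None, names=...` arguments could be passed alongside the file path.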


In [None]:
pimadataset

array([[  6.   , 148.   ,  72.   , ...,   0.627,  50.   ,   1.   ],
       [  1.   ,  85.   ,  66.   , ...,   0.351,  31.   ,   0.   ],
       [  8.   , 183.   ,  64.   , ...,   0.672,  32.   ,   1.   ],
       ...,
       [  5.   , 121.   ,  72.   , ...,   0.245,  30.   ,   0.   ],
       [  1.   , 126.   ,  60.   , ...,   0.349,  47.   ,   1.   ],
       [  1.   ,  93.   ,  70.   , ...,   0.315,  23.   ,   0.   ]])

In [None]:
# Split into input (X) and output(y) variables
X=pimadataset[:,0:8]
y=pimadataset[:,8]
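
The slicing above takes the first eight columns as inputs and the ninth as the label. A small sketch of the same pattern on three inlined records, useful as a sanity check on the resulting shapes (the names `X_demo` and `y_demo` are illustrative only):

```python
import io

import numpy as np

# Three records from the dataset, inlined so the sketch runs without the file.
sample_csv = """6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
8,183,64,0,0,23.3,0.672,32,1
"""

data = np.loadtxt(io.StringIO(sample_csv), delimiter=",")

# Columns 0..7 are the input features; column 8 is the class label.
X_demo = data[:, 0:8]
y_demo = data[:, 8]

print(X_demo.shape, y_demo.shape)  # (3, 8) (3,)
```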

In [None]:
X

array([[  6.   , 148.   ,  72.   , ...,  33.6  ,   0.627,  50.   ],
       [  1.   ,  85.   ,  66.   , ...,  26.6  ,   0.351,  31.   ],
       [  8.   , 183.   ,  64.   , ...,  23.3  ,   0.672,  32.   ],
       ...,
       [  5.   , 121.   ,  72.   , ...,  26.2  ,   0.245,  30.   ],
       [  1.   , 126.   ,  60.   , ...,  30.1  ,   0.349,  47.   ],
       [  1.   ,  93.   ,  70.   , ...,  30.4  ,   0.315,  23.   ]])

In [None]:
y

array([1., 0., 1., 0., 1., 0., 1., 0., 1., 1., 0., 1., 0., 1., 1., 1., 1.,
       1., 0., 1., 0., 0., 1., 1., 1., 1., 1., 0., 0., 0., 0., 1., 0., 0.,
       0., 0., 0., 1., 1., 1., 0., 0., 0., 1., 0., 1., 0., 0., 1., 0., 0.,
       0., 0., 1., 0., 0., 1., 0., 0., 0., 0., 1., 0., 0., 1., 0., 1., 0.,
       0., 0., 1., 0., 1., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 1.,
       0., 0., 0., 1., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 1., 1., 0.,
       0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 0., 0., 1., 1., 1., 0., 0.,
       0., 1., 0., 0., 0., 1., 1., 0., 0., 1., 1., 1., 1., 1., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 1.,
       0., 1., 1., 0., 0., 0., 1., 0., 0., 0., 0., 1., 1., 0., 0., 0., 0.,
       1., 1., 0., 0., 0., 1., 0., 1., 0., 1., 0., 0., 0., 0., 0., 1., 1.,
       1., 1., 1., 0., 0., 1., 1., 0., 1., 0., 1., 1., 1., 0., 0., 0., 0.,
       0., 0., 1., 1., 0., 1., 0., 0., 0., 1., 1., 1., 1., 0., 1., 1., 1.,
       1., 0., 0., 0., 0.

## Step 2: Define Keras Model

In Keras, models are defined as a sequence of layers. `Sequential` groups a linear stack of layers into a `tf.keras.Model` and provides training and inference features on that model.

We create a Sequential model and add layers one at a time until we are satisfied with the network architecture. The input layer must be given the right number of input features, which is 8 for this dataset.

* In this work, we use a fully-connected network structure with three layers, each defined using the `Dense` class.
* The first argument to `Dense` is the number of neurons (nodes) in the layer, and the `activation` argument specifies the activation function.
* **We use the rectified linear unit (ReLU) activation function on the first two layers and the sigmoid function on the output layer.**

In [None]:
# let's define our keras model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
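
To make the layer stack concrete, here is a rough NumPy sketch of the computation an 8 → 12 → 8 → 1 fully-connected network performs. The random weights and the `forward` helper are assumptions for illustration only. Keras uses its own initializers and learns the weights during training.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    # Rectified linear unit: negative values become zero.
    return np.maximum(z, 0.0)


def sigmoid(z):
    # Squashes any real value into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))


# Layer widths: 8 inputs, two hidden layers (12 and 8 units), 1 output.
layer_sizes = [8, 12, 8, 1]

# Illustrative random weights and zero biases (not Keras's initialization).
weights = [
    rng.normal(scale=0.1, size=(n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]


def forward(x):
    # Each dense layer computes activation(x @ W + b).
    h = relu(x @ weights[0] + biases[0])
    h = relu(h @ weights[1] + biases[1])
    return sigmoid(h @ weights[2] + biases[2])


batch = rng.normal(size=(4, 8))  # four fake patients, eight features each
probs = forward(batch)

# Trainable parameters: (8*12 + 12) + (12*8 + 8) + (8*1 + 1)
n_params = sum(w.size + b.size for w, b in zip(weights, biases))
print(probs.shape, n_params)  # (4, 1) 221
```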

## Step 3: Compile Keras Model

In [None]:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
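
The `binary_crossentropy` loss named in the compile call penalizes confident wrong predictions much more heavily than confident correct ones. A minimal NumPy sketch of the formula, where the `eps` clipping value is an assumption made here to avoid `log(0)`:

```python
import numpy as np


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples.
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logarithms stay finite.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(losses))


# Same labels, predictions on the right side vs. the wrong side of 0.5.
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```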


## Step 4: Fit Keras Model

In [None]:
model.fit(X, y, epochs=150, batch_size=10, verbose=0)

<keras.callbacks.History at 0x1b267add4f0>
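
As a rough guide to what `epochs=150` and `batch_size=10` imply, the arithmetic below counts the weight updates performed during fitting. It assumes the full dataset of 768 rows is passed to `fit`, with the last, smaller batch of each epoch still counted as one update.

```python
import math

n_samples = 768   # assumed number of rows in the full dataset
batch_size = 10
epochs = 150

# One weight update per batch; the final partial batch counts as well.
updates_per_epoch = math.ceil(n_samples / batch_size)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch, total_updates)  # 77 11550
```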

## Step 5: Evaluate Keras Model

In [None]:
# evaluate the keras model (on the same data it was trained on)
_, accuracy = model.evaluate(X, y)
print('accuracy: %.2f' % (accuracy * 100))


accuracy: 76.95


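The accuracy of our model is 76.95%, which is a reasonable result for a first model on this dataset. Note that it is measured on the same data the model was trained on, so it probably overstates how well the model would predict diabetes for unseen patients.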
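
Because the model is evaluated on the rows it was trained on, a held-out test set would give a more honest estimate. A minimal sketch of a random index split using only NumPy follows. The helper `train_test_split_indices`, the 20% test fraction, and the row count of 768 are all assumptions for illustration.

```python
import numpy as np


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    # Shuffle the row indices, then cut off the first chunk as the test set.
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]


train_idx, test_idx = train_test_split_indices(768)
print(len(train_idx), len(test_idx))  # 614 154

# With the arrays from this notebook, the split could then be used like this:
# model.fit(X[train_idx], y[train_idx], epochs=150, batch_size=10, verbose=0)
# _, test_accuracy = model.evaluate(X[test_idx], y[test_idx])
```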

## Step 6: Make Predictions
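
The sigmoid output layer produces a probability between 0 and 1 for each patient rather than a class label. A small sketch of turning probabilities into 0/1 labels, using the first five values from the prediction output in this section and a 0.5 cut-off (the threshold is an assumption, though a common default):

```python
import numpy as np

# First five predicted probabilities from this notebook's output.
probabilities = np.array([
    [0.771898925],
    [0.124491811],
    [0.818665147],
    [0.173713893],
    [0.715293705],
])

# Probabilities above the threshold are labelled 1 (onset), the rest 0.
threshold = 0.5
labels = (probabilities > threshold).astype(int).ravel()

print(labels)  # [1 0 1 0 1]
```

For these five patients the thresholded labels agree with the first five entries of `y` shown earlier.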

In [None]:
# making predictions with the model
predictions = model.predict(X)
predictions

array([[7.71898925e-01],
       [1.24491811e-01],
       [8.18665147e-01],
       [1.73713893e-01],
       [7.15293705e-01],
       [2.50023991e-01],
       [2.34928548e-01],
       [7.70999789e-01],
       [9.20341134e-01],
       [1.96945637e-01],
       [1.70325249e-01],
       [8.84334207e-01],
       [4.64490801e-01],
       [8.96733522e-01],
       [8.19182038e-01],
       [5.28581858e-01],
       [4.30480957e-01],
       [3.34824264e-01],
       [3.96735758e-01],
       [2.91186213e-01],
       [1.35381669e-01],
       [3.87094676e-01],
       [8.48600388e-01],
       [3.27564627e-01],
       [5.08689344e-01],
       [4.77263808e-01],
       [7.54703283e-01],
       [1.91666782e-02],
       [5.85355699e-01],
       [2.22092420e-01],
       [2.96959698e-01],
       [6.06836855e-01],
       [1.02520525e-01],
       [6.50897920e-02],
       [5.08126259e-01],
       [5.47470570e-01],
       [7.01071560e-01],
       [4.62204278e-01],
       [2.19129175e-01],
       [6.18293166e-01],
