In this project, I use Keras to train a neural network classifier on the Pima Indians diabetes dataset. Batch size, number of training epochs, learning rate, activation function, network weight initialisation and number of neurons in the hidden layers are tuned using GridSearchCV; dropout rate is explored alongside the learning rate. The best parameters obtained from tuning are used in the final model for prediction and model evaluation.

Import libraries.

In [127]:
import pandas as pd
import numpy as np
import sklearn
import keras

Read the dataset and assign column names.

In [117]:
names = ['n_pregnant', 'glucose_concentration', 'blood_pressure (mm Hg)', 'skin_thickness (mm)', 'serum_insulin (mu U/ml)',
        'BMI', 'pedigree_function', 'age', 'class']
df = pd.read_csv("pimaindiansdiabetes.csv",names=names)
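As a minimal sketch of what the names argument does, the snippet below reads a small in-memory CSV that has no header row. The three column names and the two data rows are made up for illustration and are not the real dataset.

```python
import io
import pandas as pd

# Hypothetical two-row CSV without a header line (not the real dataset).
raw = io.StringIO("6,148,1\n1,85,0\n")

# Without `names`, pandas would treat the first data row as the header.
toy_names = ['n_pregnant', 'glucose_concentration', 'class']
toy = pd.read_csv(raw, names=toy_names)

print(toy.shape)           # both rows are kept as data
print(list(toy.columns))
```

Passing names is what keeps the first record of a header-less file from being consumed as column labels.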

In [118]:
df.head()

,n_pregnant,glucose_concentration,blood_pressure (mm Hg),skin_thickness (mm),serum_insulin (mu U/ml),BMI,pedigree_function,age,class
0,6,148,72,35,0,33.6,0.627,50,1
1,1,85,66,29,0,26.6,0.351,31,0
2,8,183,64,0,0,23.3,0.672,32,1
3,1,89,66,23,94,28.1,0.167,21,0
4,0,137,40,35,168,43.1,2.288,33,1


In [119]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 768 entries, 0 to 767
Data columns (total 9 columns):
n_pregnant                 768 non-null int64
glucose_concentration      768 non-null int64
blood_pressure (mm Hg)     768 non-null int64
skin_thickness (mm)        768 non-null int64
serum_insulin (mu U/ml)    768 non-null int64
BMI                        768 non-null float64
pedigree_function          768 non-null float64
age                        768 non-null int64
class                      768 non-null int64
dtypes: float64(2), int64(7)
memory usage: 54.1 KB


Glucose concentration, blood pressure, skin thickness, serum insulin and BMI have minimum values of zero. A zero is not a plausible measurement for any of these variables, so the zeros are treated as missing values.

In [120]:
df.describe()

,n_pregnant,glucose_concentration,blood_pressure (mm Hg),skin_thickness (mm),serum_insulin (mu U/ml),BMI,pedigree_function,age,class
count,768.0,768.0,768.0,768.0,768.0,768.0,768.0,768.0,768.0
mean,3.845052,120.894531,69.105469,20.536458,79.799479,31.992578,0.471876,33.240885,0.348958
std,3.369578,31.972618,19.355807,15.952218,115.244002,7.88416,0.331329,11.760232,0.476951
min,0.0,0.0,0.0,0.0,0.0,0.0,0.078,21.0,0.0
25%,1.0,99.0,62.0,0.0,0.0,27.3,0.24375,24.0,0.0
50%,3.0,117.0,72.0,23.0,30.5,32.0,0.3725,29.0,0.0
75%,6.0,140.25,80.0,32.0,127.25,36.6,0.62625,41.0,1.0
max,17.0,199.0,122.0,99.0,846.0,67.1,2.42,81.0,1.0


Replace zero values with NA values.

In [121]:
columns = ['glucose_concentration', 'blood_pressure (mm Hg)', 'skin_thickness (mm)', 'serum_insulin (mu U/ml)',
        'BMI']

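# Assign the result back rather than calling replace(..., inplace=True) on a
# column selection, which may act on a copy in recent pandas versions.
df[columns] = df[columns].replace(0, np.nan)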

The minimum values of these five columns are no longer zero, and their counts have dropped below 768.

In [122]:
df.describe()

,n_pregnant,glucose_concentration,blood_pressure (mm Hg),skin_thickness (mm),serum_insulin (mu U/ml),BMI,pedigree_function,age,class
count,768.0,763.0,733.0,541.0,394.0,757.0,768.0,768.0,768.0
mean,3.845052,121.686763,72.405184,29.15342,155.548223,32.457464,0.471876,33.240885,0.348958
std,3.369578,30.535641,12.382158,10.476982,118.775855,6.924988,0.331329,11.760232,0.476951
min,0.0,44.0,24.0,7.0,14.0,18.2,0.078,21.0,0.0
25%,1.0,99.0,64.0,22.0,76.25,27.5,0.24375,24.0,0.0
50%,3.0,117.0,72.0,29.0,125.0,32.3,0.3725,29.0,0.0
75%,6.0,141.0,80.0,36.0,190.0,36.6,0.62625,41.0,1.0
max,17.0,199.0,122.0,99.0,846.0,67.1,2.42,81.0,1.0


Drop rows that contain NA values. This leaves 392 rows of data out of the initial 768 rows.

In [123]:
df.dropna(inplace=True)
df.describe()

,n_pregnant,glucose_concentration,blood_pressure (mm Hg),skin_thickness (mm),serum_insulin (mu U/ml),BMI,pedigree_function,age,class
count,392.0,392.0,392.0,392.0,392.0,392.0,392.0,392.0,392.0
mean,3.30102,122.627551,70.663265,29.145408,156.056122,33.086224,0.523046,30.864796,0.331633
std,3.211424,30.860781,12.496092,10.516424,118.84169,7.027659,0.345488,10.200777,0.471401
min,0.0,56.0,24.0,7.0,14.0,18.2,0.085,21.0,0.0
25%,1.0,99.0,62.0,21.0,76.75,28.4,0.26975,23.0,0.0
50%,2.0,119.0,70.0,29.0,125.5,33.2,0.4495,27.0,0.0
75%,5.0,143.0,78.0,37.0,190.0,37.1,0.687,36.0,1.0
max,17.0,198.0,110.0,63.0,846.0,67.1,2.42,81.0,1.0
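
The drop from 768 to 392 rows happens because dropna removes every row with at least one missing value, and serum insulin alone has only 394 non-missing entries. A minimal sketch on a made-up frame (the column names and values are illustrative, not the real data) shows how to count the missing values per column before dropping:

```python
import numpy as np
import pandas as pd

# Toy frame (made-up values) with zeros standing in for missing measurements.
toy = pd.DataFrame({
    'glucose': [148, 0, 183, 89],
    'insulin': [0, 0, 94, 168],
    'age': [50, 31, 32, 21],
})

# Treat zeros as missing only in the columns where zero is implausible.
cols = ['glucose', 'insulin']
toy[cols] = toy[cols].replace(0, np.nan)

# Missing values per column, then the rows that survive dropna().
missing_per_column = toy.isna().sum()
complete_rows = toy.dropna()

print(missing_per_column)
print(len(complete_rows))
```

Inspecting the per-column counts first makes it clear which variable is responsible for most of the lost rows.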


Get dimensions of the dataset.

In [124]:
dataset = df.values
print(dataset.shape)

(392, 9)


Separate the columns into independent variables and the target variable. The target variable is cast to integer type. Class 1 means the person has diabetes; class 0 means the person does not.

In [125]:
X = dataset[:,0:8]
Y = dataset[:, 8].astype(int)
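
Because the frame mixes integer and float columns, df.values returns a float array, which is why the label column is cast back to int. A small sketch of the same slicing on a made-up array with two feature columns instead of eight:

```python
import numpy as np

# Toy array: 3 rows, 2 feature columns plus a label column stored as floats.
toy = np.array([[6.0, 148.0, 1.0],
                [1.0, 85.0, 0.0],
                [8.0, 183.0, 1.0]])

features = toy[:, 0:2]           # every row, columns 0 and 1
labels = toy[:, 2].astype(int)   # last column, cast from float to int

print(features.shape, labels)
```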

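Standardise each feature to have mean 0 and standard deviation 1. This rescales the data but does not change the shape of its distribution, so the features are not made normally distributed.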

In [128]:
from sklearn.preprocessing import StandardScaler

In [129]:
scaler = StandardScaler().fit(X)
print(scaler)

StandardScaler(copy=True, with_mean=True, with_std=True)


In [130]:
X_standardised = scaler.transform(X)
data = pd.DataFrame(X_standardised)
data.describe()

,0,1,2,3,4,5,6,7
count,392.0,392.0,392.0,392.0,392.0,392.0,392.0,392.0
mean,-9.063045e-18,1.132881e-17,-4.531523e-16,1.087565e-16,1.064908e-16,1.631348e-16,1.8126090000000003e-17,1.110223e-16
std,1.001278,1.001278,1.001278,1.001278,1.001278,1.001278,1.001278,1.001278
min,-1.029213,-2.161731,-3.739001,-2.108484,-1.196867,-2.120941,-1.269525,-0.9682991
25%,-0.7174265,-0.7665958,-0.694164,-0.7755315,-0.6681786,-0.667678,-0.7340909,-0.771985
50%,-0.4056403,-0.1176959,-0.05314565,-0.01384444,-0.2574448,0.01621036,-0.2131475,-0.3793569
75%,0.5297185,0.6609841,0.5878727,0.7478426,0.2859877,0.5718696,0.4751644,0.5040564
max,4.271153,2.445459,3.151946,3.223325,5.81299,4.846172,5.497667,4.921123
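
The std row shows 1.001278 rather than exactly 1 because StandardScaler divides by the population standard deviation (ddof=0), while describe() reports the sample standard deviation (ddof=1). The sketch below reproduces the figure on a synthetic column of 392 values; the random data is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 392
x = rng.normal(loc=120.0, scale=30.0, size=n)   # synthetic column, not real data

# Standardise with the population standard deviation (ddof=0),
# which is what StandardScaler uses.
z = (x - x.mean()) / x.std(ddof=0)

# pandas' describe() reports the sample standard deviation (ddof=1).
sample_std = z.std(ddof=1)
expected = np.sqrt(n / (n - 1))

print(round(sample_std, 6), round(expected, 6))
```

The ratio sqrt(n / (n - 1)) depends only on the number of rows, which is why every column shows the same value.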


Define a function to create the model. I start off with two hidden layers of 8 and 4 neurons respectively, and a learning rate of 0.01.

In [132]:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

In [133]:
def create_model():
    
    # Create the model.
    model = Sequential()
    model.add(Dense(8, input_dim = 8, kernel_initializer='normal', activation='relu'))
    model.add(Dense(4, kernel_initializer='normal', activation='relu'))  # input size is inferred from the previous layer
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile the model.
    adam = Adam(lr = 0.01)
    model.compile(loss = 'binary_crossentropy', optimizer = adam, metrics = ['accuracy'])
    
    return model

model = create_model()
print(model.summary())

Model: "sequential_116"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_335 (Dense)            (None, 8)                 72        
_________________________________________________________________
dense_336 (Dense)            (None, 4)                 36        
_________________________________________________________________
dense_337 (Dense)            (None, 1)                 5         
Total params: 113
Trainable params: 113
Non-trainable params: 0
_________________________________________________________________
None
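
The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. A short sketch of that arithmetic (the helper name dense_params is made up for this example):

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

layer_sizes = [8, 8, 4, 1]   # input features, hidden 1, hidden 2, output
counts = [dense_params(i, o) for i, o in zip(layer_sizes[:-1], layer_sizes[1:])]

print(counts, sum(counts))
```

This gives 72, 36 and 5 parameters per layer, 113 in total, matching the summary above.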


Tune batch size and number of training epochs. The best model (based on the highest mean cross-validated accuracy) has a batch size of 10 and 50 training epochs. A batch size of 40 with 50 epochs scores almost identically.

In [135]:
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV, KFold

In [136]:
seed = 9876
np.random.seed(seed)

def create_model():
    
    # Create the model.
    model = Sequential()
    model.add(Dense(8, input_dim = 8, kernel_initializer='normal', activation='relu'))
    model.add(Dense(4, kernel_initializer='normal', activation='relu'))  # input size is inferred from the previous layer
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile the model.
    adam = Adam(lr = 0.01)
    model.compile(loss = 'binary_crossentropy', optimizer = adam, metrics = ['accuracy'])
    
    return model

model = KerasClassifier(build_fn = create_model, verbose = 1)

batch_size = [10, 40]
epochs = [10, 50]

param_grid = dict(batch_size=batch_size, epochs=epochs)

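# KFold only uses random_state when shuffle=True (newer scikit-learn raises an error otherwise); n_splits=3 matches the 3 folds reported below.
grid = GridSearchCV(estimator = model, param_grid = param_grid, cv = KFold(n_splits=3), verbose = 10)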
grid_results = grid.fit(X_standardised, Y)

print("Best: {0}, using {1}".format(grid_results.best_score_, grid_results.best_params_))
means = grid_results.cv_results_['mean_test_score']
stds = grid_results.cv_results_['std_test_score']
params = grid_results.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print('{0} ({1}) with: {2}'.format(mean, stdev, param))

[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.

Fitting 3 folds for each of 4 candidates, totalling 12 fits
[CV] ............ batch_size=10, epochs=10, score=0.756, total=   2.4s
[CV] ............ batch_size=10, epochs=10, score=0.786, total=   3.2s
[CV] ............ batch_size=10, epochs=10, score=0.792, total=   2.7s
[CV] ............ batch_size=10, epochs=50, score=0.771, total=   5.5s
[CV] ............ batch_size=10, epochs=50, score=0.786, total=   5.0s
[CV] ............ batch_size=10, epochs=50, score=0.831, total=   4.6s
[CV] ............ batch_size=40, epochs=10, score=0.710, total=   2.9s
[CV] ............ batch_size=40, epochs=10, score=0.748, total=   2.6s
[CV] ............ batch_size=40, epochs=10, score=0.823, total=   2.9s
[CV] ............ batch_size=40, epochs=50, score=0.763, total=   5.0s
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:   45.1s finished

Best: 0.7959183580717262, using {'batch_size': 10, 'epochs': 50}
0.7780612200802687 (0.016015409295565528) with: {'batch_size': 10, 'epochs': 10}
0.7959183580717262 (0.025329886804161168) with: {'batch_size': 10, 'epochs': 50}
0.7602040738779672 (0.04695557493229343) with: {'batch_size': 40, 'epochs': 10}
0.7959183487965136 (0.04092362400694938) with: {'batch_size': 40, 'epochs': 50}
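
The figures in the log follow from how the grid is built: every combination of the two lists is one candidate, and each candidate is fitted once per fold. The sketch below recomputes the candidate and fit counts and, as a rough check, averages the three rounded fold scores printed in the log for batch_size=10, epochs=10. The printed scores are rounded, so the agreement with the reported mean is only to about three decimals.

```python
from itertools import product
import numpy as np

batch_size = [10, 40]
epochs = [10, 50]
n_folds = 3

# Every combination of the two lists is one candidate.
candidates = list(product(batch_size, epochs))
total_fits = len(candidates) * n_folds

# Fold scores printed in the log for batch_size=10, epochs=10.
fold_scores = np.array([0.756, 0.786, 0.792])

print(len(candidates), total_fits)
print(fold_scores.mean())
```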


Tune learning rate and dropout rate. Hard code the batch size and number of training epochs obtained previously. The best model has a learning rate of 0.01 and a dropout rate of 0.1.

In [137]:
from keras.layers import Dropout

In [138]:
seed = 9876
np.random.seed(seed)

def create_model(learn_rate, dropout_rate):
    
    # Create the model.
    model = Sequential()
    model.add(Dense(8, input_dim = 8, kernel_initializer='normal', activation='relu'))
    model.add(Dropout(dropout_rate))
    model.add(Dense(4, kernel_initializer='normal', activation='relu'))  # input size is inferred from the previous layer
    model.add(Dropout(dropout_rate)) 
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile the model.
    adam = Adam(lr = learn_rate)
    model.compile(loss = 'binary_crossentropy', optimizer = adam, metrics = ['accuracy'])
    
    return model

model = KerasClassifier(build_fn = create_model, batch_size = 10, epochs = 50, verbose = 0)

learn_rate = [0.001, 0.01, 0.1]
dropout_rate = [0.0, 0.1, 0.2]

param_grid = dict(learn_rate=learn_rate, dropout_rate=dropout_rate)

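# KFold only uses random_state when shuffle=True (newer scikit-learn raises an error otherwise); n_splits=3 matches the 3 folds reported below.
grid = GridSearchCV(estimator = model, param_grid = param_grid, cv = KFold(n_splits=3), verbose = 10)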
grid_results = grid.fit(X_standardised, Y)

print("Best: {0}, using {1}".format(grid_results.best_score_, grid_results.best_params_))
means = grid_results.cv_results_['mean_test_score']
stds = grid_results.cv_results_['std_test_score']
params = grid_results.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print('{0} ({1}) with: {2}'.format(mean, stdev, param))

[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


Fitting 3 folds for each of 9 candidates, totalling 27 fits
[CV] dropout_rate=0.0, learn_rate=0.001 ..............................
[CV] .. dropout_rate=0.0, learn_rate=0.001, score=0.733, total=   3.7s
[CV] dropout_rate=0.0, learn_rate=0.001 ..............................


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    3.6s remaining:    0.0s


[CV] .. dropout_rate=0.0, learn_rate=0.001, score=0.763, total=   3.3s
[CV] dropout_rate=0.0, learn_rate=0.001 ..............................


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    6.8s remaining:    0.0s


[CV] .. dropout_rate=0.0, learn_rate=0.001, score=0.808, total=   2.8s
[CV] dropout_rate=0.0, learn_rate=0.01 ...............................


[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    9.7s remaining:    0.0s


[CV] ... dropout_rate=0.0, learn_rate=0.01, score=0.756, total=   3.5s
[CV] dropout_rate=0.0, learn_rate=0.01 ...............................


[Parallel(n_jobs=1)]: Done   4 out of   4 | elapsed:   13.1s remaining:    0.0s


[CV] ... dropout_rate=0.0, learn_rate=0.01, score=0.771, total=   2.8s
[CV] dropout_rate=0.0, learn_rate=0.01 ...............................


[Parallel(n_jobs=1)]: Done   5 out of   5 | elapsed:   15.9s remaining:    0.0s


[CV] ... dropout_rate=0.0, learn_rate=0.01, score=0.831, total=   3.7s
[CV] dropout_rate=0.0, learn_rate=0.1 ................................


[Parallel(n_jobs=1)]: Done   6 out of   6 | elapsed:   19.6s remaining:    0.0s


[CV] .... dropout_rate=0.0, learn_rate=0.1, score=0.771, total=   3.0s
[CV] dropout_rate=0.0, learn_rate=0.1 ................................


[Parallel(n_jobs=1)]: Done   7 out of   7 | elapsed:   22.6s remaining:    0.0s


[CV] .... dropout_rate=0.0, learn_rate=0.1, score=0.733, total=   4.1s
[CV] dropout_rate=0.0, learn_rate=0.1 ................................


[Parallel(n_jobs=1)]: Done   8 out of   8 | elapsed:   26.8s remaining:    0.0s


[CV] .... dropout_rate=0.0, learn_rate=0.1, score=0.831, total=   2.9s
[CV] dropout_rate=0.1, learn_rate=0.001 ..............................


[Parallel(n_jobs=1)]: Done   9 out of   9 | elapsed:   29.6s remaining:    0.0s


[CV] .. dropout_rate=0.1, learn_rate=0.001, score=0.725, total=   3.5s
[CV] dropout_rate=0.1, learn_rate=0.001 ..............................
[CV] .. dropout_rate=0.1, learn_rate=0.001, score=0.763, total=   4.0s
[CV] dropout_rate=0.1, learn_rate=0.001 ..............................
[CV] .. dropout_rate=0.1, learn_rate=0.001, score=0.815, total=   3.4s
[CV] dropout_rate=0.1, learn_rate=0.01 ...............................
[CV] ... dropout_rate=0.1, learn_rate=0.01, score=0.733, total=   3.6s
[CV] dropout_rate=0.1, learn_rate=0.01 ...............................
[CV] ... dropout_rate=0.1, learn_rate=0.01, score=0.809, total=   3.4s
[CV] dropout_rate=0.1, learn_rate=0.01 ...............................
[CV] ... dropout_rate=0.1, learn_rate=0.01, score=0.838, total=   3.9s
[CV] dropout_rate=0.1, learn_rate=0.1 ................................
[CV] .... dropout_rate=0.1, learn_rate=0.1, score=0.733, total=   3.5s
[CV] dropout_rate=0.1, learn_rate=0.1 ................................
[CV] .

[Parallel(n_jobs=1)]: Done  27 out of  27 | elapsed:  1.6min finished


Best: 0.7933673416169322, using {'dropout_rate': 0.1, 'learn_rate': 0.01}
0.7678571402722475 (0.030710750122669134) with: {'dropout_rate': 0.0, 'learn_rate': 0.001}
0.7857142844978644 (0.032344602514417495) with: {'dropout_rate': 0.0, 'learn_rate': 0.01}
0.7780612243377433 (0.0402725879136073) with: {'dropout_rate': 0.0, 'learn_rate': 0.1}
0.7678571370791416 (0.03693545277935656) with: {'dropout_rate': 0.1, 'learn_rate': 0.001}
0.7933673416169322 (0.04452698200410441) with: {'dropout_rate': 0.1, 'learn_rate': 0.01}
0.7525510304436391 (0.05726352155661545) with: {'dropout_rate': 0.1, 'learn_rate': 0.1}
0.7576530635052797 (0.041729797560808295) with: {'dropout_rate': 0.2, 'learn_rate': 0.001}
0.7704081704117813 (0.03229035213501209) with: {'dropout_rate': 0.2, 'learn_rate': 0.01}
0.7270408099403187 (0.02004392236020585) with: {'dropout_rate': 0.2, 'learn_rate': 0.1}
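
To make the dropout rate concrete, here is a minimal NumPy sketch of what a dropout layer does during training: it zeroes a random fraction of the activations and rescales the rest by 1 / (1 - rate) so the expected value is unchanged. The function name dropout_forward is made up for this illustration and is not part of Keras; at prediction time the layer passes its input through unchanged.

```python
import numpy as np

def dropout_forward(x, rate, rng):
    """Inverted dropout: zero a random fraction `rate`, rescale the rest."""
    if rate == 0.0:
        return x.copy()
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(9876)
activations = np.ones((1000, 8))

dropped = dropout_forward(activations, rate=0.2, rng=rng)
print(np.unique(dropped))    # only zeros and the rescaled value 1 / (1 - 0.2)
print(dropped.mean())        # close to 1.0 on average
```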


Tune the activation function and network weight initialisation. Hard code the batch size, number of training epochs and learning rate obtained previously; the dropout layers from the previous step are not included. The best model uses softmax activation with uniform network weight initialisation.

In [139]:
seed = 9876
np.random.seed(seed)

def create_model(activation, init):
    
    # Create the model.
    model = Sequential()
    model.add(Dense(8, input_dim = 8, kernel_initializer=init, activation=activation))
    model.add(Dense(4, kernel_initializer=init, activation=activation))  # input size is inferred from the previous layer
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile the model.
    adam = Adam(lr = 0.01)
    model.compile(loss = 'binary_crossentropy', optimizer = adam, metrics = ['accuracy'])
    
    return model

model = KerasClassifier(build_fn = create_model, batch_size = 10, epochs = 50, verbose = 0)

activation = ['softmax', 'relu', 'tanh', 'linear']
init = ['uniform', 'normal', 'zero']

param_grid = dict(activation=activation, init=init)

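# KFold only uses random_state when shuffle=True (newer scikit-learn raises an error otherwise); n_splits=3 matches the 3 folds reported below.
grid = GridSearchCV(estimator = model, param_grid = param_grid, cv = KFold(n_splits=3), verbose = 10)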
grid_results = grid.fit(X_standardised, Y)

print("Best: {0}, using {1}".format(grid_results.best_score_, grid_results.best_params_))
means = grid_results.cv_results_['mean_test_score']
stds = grid_results.cv_results_['std_test_score']
params = grid_results.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print('{0} ({1}) with: {2}'.format(mean, stdev, param))

[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


Fitting 3 folds for each of 12 candidates, totalling 36 fits
[CV] activation=softmax, init=uniform ................................
[CV] .... activation=softmax, init=uniform, score=0.740, total=   2.1s
[CV] activation=softmax, init=uniform ................................


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    2.0s remaining:    0.0s


[CV] .... activation=softmax, init=uniform, score=0.779, total=   2.5s
[CV] activation=softmax, init=uniform ................................


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    4.6s remaining:    0.0s


[CV] .... activation=softmax, init=uniform, score=0.869, total=   2.3s
[CV] activation=softmax, init=normal .................................


[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    6.8s remaining:    0.0s


[CV] ..... activation=softmax, init=normal, score=0.725, total=   3.5s
[CV] activation=softmax, init=normal .................................


[Parallel(n_jobs=1)]: Done   4 out of   4 | elapsed:   10.3s remaining:    0.0s


[CV] ..... activation=softmax, init=normal, score=0.771, total=   2.9s
[CV] activation=softmax, init=normal .................................


[Parallel(n_jobs=1)]: Done   5 out of   5 | elapsed:   13.2s remaining:    0.0s


[CV] ..... activation=softmax, init=normal, score=0.862, total=   2.6s
[CV] activation=softmax, init=zero ...................................


[Parallel(n_jobs=1)]: Done   6 out of   6 | elapsed:   15.8s remaining:    0.0s


[CV] ....... activation=softmax, init=zero, score=0.611, total=   2.1s
[CV] activation=softmax, init=zero ...................................


[Parallel(n_jobs=1)]: Done   7 out of   7 | elapsed:   18.0s remaining:    0.0s


[CV] ....... activation=softmax, init=zero, score=0.695, total=   2.5s
[CV] activation=softmax, init=zero ...................................


[Parallel(n_jobs=1)]: Done   8 out of   8 | elapsed:   20.4s remaining:    0.0s


[CV] ....... activation=softmax, init=zero, score=0.700, total=   2.1s
[CV] activation=relu, init=uniform ...................................


[Parallel(n_jobs=1)]: Done   9 out of   9 | elapsed:   22.5s remaining:    0.0s


[CV] ....... activation=relu, init=uniform, score=0.725, total=   2.9s
[CV] activation=relu, init=uniform ...................................
[CV] ....... activation=relu, init=uniform, score=0.779, total=   3.1s
[CV] activation=relu, init=uniform ...................................
[CV] ....... activation=relu, init=uniform, score=0.808, total=   2.5s
[CV] activation=relu, init=normal ....................................
[CV] ........ activation=relu, init=normal, score=0.718, total=   2.6s
[CV] activation=relu, init=normal ....................................
[CV] ........ activation=relu, init=normal, score=0.779, total=   2.9s
[CV] activation=relu, init=normal ....................................
[CV] ........ activation=relu, init=normal, score=0.831, total=   2.9s
[CV] activation=relu, init=zero ......................................
[CV] .......... activation=relu, init=zero, score=0.611, total=   3.0s
[CV] activation=relu, init=zero ......................................
[CV] .

[Parallel(n_jobs=1)]: Done  36 out of  36 | elapsed:  1.5min finished


Best: 0.7959183624812535, using {'activation': 'softmax', 'init': 'uniform'}
0.7959183624812535 (0.05394679691822502) with: {'activation': 'softmax', 'init': 'uniform'}
0.7857142720295458 (0.05659710545962977) with: {'activation': 'softmax', 'init': 'normal'}
0.6683673420730902 (0.04092231924913523) with: {'activation': 'softmax', 'init': 'zero'}
0.7704081536859883 (0.034159148828236695) with: {'activation': 'relu', 'init': 'uniform'}
0.7755102109240026 (0.04624149261633471) with: {'activation': 'relu', 'init': 'normal'}
0.6683673420730902 (0.04092231924913523) with: {'activation': 'relu', 'init': 'zero'}
0.7908163082842924 (0.03370616130132414) with: {'activation': 'tanh', 'init': 'uniform'}
0.7755101941982094 (0.03478923012909107) with: {'activation': 'tanh', 'init': 'normal'}
0.6683673420730902 (0.04092231924913523) with: {'activation': 'tanh', 'init': 'zero'}
0.793367321698033 (0.03176447518730299) with: {'activation': 'linear', 'init': 'uniform'}
0.7857142844978644 (0.031736834261
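
All three zero-initialised candidates score the same 0.668, which is consistent with a network that predicts class 0 for every sample. A small check, assuming the class counts of 262 and 130 taken from the classification report later in this notebook:

```python
import numpy as np

# Class counts taken from the classification report: 262 negatives, 130 positives.
labels = np.array([0] * 262 + [1] * 130)

# Accuracy of a model that always predicts the most frequent class.
majority_class = np.bincount(labels).argmax()
baseline_accuracy = np.mean(labels == majority_class)

print(majority_class, round(baseline_accuracy, 7))
```

The majority-class baseline is a useful reference point: the other candidates improve on it by roughly 10 to 13 percentage points.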

Tune the number of neurons in the hidden layers. Hard code the batch size, number of training epochs, learning rate, activation function and network weight initialisation obtained previously. The best model has 8 neurons in the 1st hidden layer and 6 neurons in the 2nd hidden layer.

In [140]:
seed = 9876
np.random.seed(seed)

def create_model(neuron1, neuron2):
    
    # Create the model.
    model = Sequential()
    model.add(Dense(neuron1, input_dim = 8, kernel_initializer='uniform', activation='softmax'))
    model.add(Dense(neuron2, kernel_initializer='uniform', activation='softmax'))  # input size is inferred from the previous layer
    model.add(Dense(1, activation='sigmoid')) # sigmoid is used in binary classification
    
    # Compile the model.
    adam = Adam(lr = 0.01)
    model.compile(loss = 'binary_crossentropy', optimizer = adam, metrics = ['accuracy'])
    
    return model

model = KerasClassifier(build_fn = create_model, batch_size = 10, epochs = 50, verbose = 0)

neuron1 = [4, 8, 12]
neuron2 = [2, 4, 6]

param_grid = dict(neuron1 = neuron1, neuron2 = neuron2)

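# KFold only uses random_state when shuffle=True (newer scikit-learn raises an error otherwise); n_splits=3 matches the 3 folds reported below.
grid = GridSearchCV(estimator = model, param_grid = param_grid, cv = KFold(n_splits=3), refit = True, verbose = 10)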
grid_results = grid.fit(X_standardised, Y)

print("Best: {0}, using {1}".format(grid_results.best_score_, grid_results.best_params_))
means = grid_results.cv_results_['mean_test_score']
stds = grid_results.cv_results_['std_test_score']
params = grid_results.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print('{0} ({1}) with: {2}'.format(mean, stdev, param))

[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


Fitting 3 folds for each of 9 candidates, totalling 27 fits
[CV] neuron1=4, neuron2=2 ............................................
[CV] ................ neuron1=4, neuron2=2, score=0.702, total=   2.3s
[CV] neuron1=4, neuron2=2 ............................................


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    2.2s remaining:    0.0s


[CV] ................ neuron1=4, neuron2=2, score=0.779, total=   3.0s
[CV] neuron1=4, neuron2=2 ............................................


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    5.2s remaining:    0.0s


[CV] ................ neuron1=4, neuron2=2, score=0.838, total=   2.5s
[CV] neuron1=4, neuron2=4 ............................................


[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    7.7s remaining:    0.0s


[CV] ................ neuron1=4, neuron2=4, score=0.748, total=   2.6s
[CV] neuron1=4, neuron2=4 ............................................


[Parallel(n_jobs=1)]: Done   4 out of   4 | elapsed:   10.2s remaining:    0.0s


[CV] ................ neuron1=4, neuron2=4, score=0.779, total=   2.8s
[CV] neuron1=4, neuron2=4 ............................................


[Parallel(n_jobs=1)]: Done   5 out of   5 | elapsed:   13.0s remaining:    0.0s


[CV] ................ neuron1=4, neuron2=4, score=0.808, total=   2.3s
[CV] neuron1=4, neuron2=6 ............................................


[Parallel(n_jobs=1)]: Done   6 out of   6 | elapsed:   15.3s remaining:    0.0s


[CV] ................ neuron1=4, neuron2=6, score=0.733, total=   2.5s
[CV] neuron1=4, neuron2=6 ............................................


[Parallel(n_jobs=1)]: Done   7 out of   7 | elapsed:   17.8s remaining:    0.0s


[CV] ................ neuron1=4, neuron2=6, score=0.794, total=   3.0s
[CV] neuron1=4, neuron2=6 ............................................


[Parallel(n_jobs=1)]: Done   8 out of   8 | elapsed:   20.9s remaining:    0.0s


[CV] ................ neuron1=4, neuron2=6, score=0.823, total=   2.5s
[CV] neuron1=8, neuron2=2 ............................................


[Parallel(n_jobs=1)]: Done   9 out of   9 | elapsed:   23.3s remaining:    0.0s


[CV] ................ neuron1=8, neuron2=2, score=0.718, total=   2.4s
[CV] neuron1=8, neuron2=2 ............................................
[CV] ................ neuron1=8, neuron2=2, score=0.786, total=   2.4s
[CV] neuron1=8, neuron2=2 ............................................
[CV] ................ neuron1=8, neuron2=2, score=0.846, total=   3.3s
[CV] neuron1=8, neuron2=4 ............................................
[CV] ................ neuron1=8, neuron2=4, score=0.740, total=   2.3s
[CV] neuron1=8, neuron2=4 ............................................
[CV] ................ neuron1=8, neuron2=4, score=0.779, total=   3.0s
[CV] neuron1=8, neuron2=4 ............................................
[CV] ................ neuron1=8, neuron2=4, score=0.862, total=   2.1s
[CV] neuron1=8, neuron2=6 ............................................
[CV] ................ neuron1=8, neuron2=6, score=0.779, total=   2.6s
[CV] neuron1=8, neuron2=6 ............................................
[CV] .

[Parallel(n_jobs=1)]: Done  27 out of  27 | elapsed:  1.1min finished


Best: 0.8010204017770534, using {'neuron1': 8, 'neuron2': 6}
0.7729591745503095 (0.055700763011101016) with: {'neuron1': 4, 'neuron2': 2}
0.7780612138461094 (0.02431947857888487) with: {'neuron1': 4, 'neuron2': 4}
0.7831632742772296 (0.03759812018049345) with: {'neuron1': 4, 'neuron2': 6}
0.7831632648499645 (0.052511303001968164) with: {'neuron1': 8, 'neuron2': 2}
0.7933673521085661 (0.05049090032359488) with: {'neuron1': 8, 'neuron2': 4}
0.8010204017770534 (0.026557574151598942) with: {'neuron1': 8, 'neuron2': 6}
0.7857142677720712 (0.02650267600187418) with: {'neuron1': 12, 'neuron2': 2}
0.7857142782637051 (0.044365707126585066) with: {'neuron1': 12, 'neuron2': 4}
0.7857142981826043 (0.046508593883445554) with: {'neuron1': 12, 'neuron2': 6}


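Use the best model with tuned parameters to make predictions. Because refit = True, grid.predict uses the best estimator refitted on the full dataset, so the predictions here are made on the same data the model was trained on.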

In [142]:
y_pred = grid.predict(X_standardised)
print(y_pred[:7].reshape(1, -1))

[[0 1 0 1 1 1 1]]


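Calculate model performance metrics. The support column shows that the data contains 262 persons without diabetes and 130 persons with diabetes; these are counts of true labels, not of predictions. The accuracy of 0.87 is measured on the training data, so it is higher than the cross-validated accuracy of about 0.80 obtained during tuning.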

In [143]:
from sklearn.metrics import classification_report, accuracy_score

In [144]:
print(accuracy_score(Y, y_pred))
print(classification_report(Y, y_pred))

0.8698979591836735
              precision    recall  f1-score   support

           0       0.92      0.88      0.90       262
           1       0.78      0.85      0.81       130

    accuracy                           0.87       392
   macro avg       0.85      0.87      0.86       392
weighted avg       0.87      0.87      0.87       392
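
To show how the columns of the report are derived, the sketch below computes precision, recall and support for class 1 on eight made-up labels and predictions (not the real data). It also shows that support counts true labels and can differ from the number of predicted positives, and that the reported accuracy corresponds to a whole number of correct predictions out of 392.

```python
import numpy as np

# Made-up labels and predictions for illustration only.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0])
y_hat = np.array([0, 0, 1, 0, 1, 0, 0, 0])

tp = np.sum((y_true == 1) & (y_hat == 1))
fp = np.sum((y_true == 0) & (y_hat == 1))
fn = np.sum((y_true == 1) & (y_hat == 0))

precision = tp / (tp + fp)
recall = tp / (tp + fn)
support = np.sum(y_true == 1)            # counts true labels, not predictions
predicted_positive = np.sum(y_hat == 1)

print(precision, recall, support, predicted_positive)

# Accuracy times sample count recovers the number of correct predictions.
correct = round(0.8698979591836735 * 392)
print(correct)
```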



The 3rd data point of the cleaned dataset (original row index 6) is actually class 1, but is predicted as class 0 by the model. This is a false negative.

In [145]:
example = df.iloc[2]
print(example)

n_pregnant                  3.000
glucose_concentration      78.000
blood_pressure (mm Hg)     50.000
skin_thickness (mm)        32.000
serum_insulin (mu U/ml)    88.000
BMI                        31.000
pedigree_function           0.248
age                        26.000
class                       1.000
Name: 6, dtype: float64


In [146]:
prediction = grid.predict(X_standardised[2].reshape(1, -1))
print(prediction)

[[0]]
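
The reshape(1, -1) call is needed because predict expects a 2-D array of samples, while indexing a single row returns a 1-D array. A minimal sketch with a made-up array of the same width:

```python
import numpy as np

X_toy = np.arange(24, dtype=float).reshape(3, 8)   # 3 made-up samples, 8 features

single = X_toy[2]                 # 1-D array of shape (8,)
as_batch = single.reshape(1, -1)  # 2-D array of shape (1, 8): one sample, 8 features

print(single.shape, as_batch.shape)
```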
