# Data Preparation and Model

## About Dataset
Link to the dataset: [Pima Indians Diabetes Database](https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database)

### `Context`
This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.

### `Content`
The dataset consists of several medical predictor variables and one target variable, Outcome. Predictor variables include the number of pregnancies the patient has had, their BMI, insulin level, age, and so on.

### `Acknowledgements`
Smith, J.W., Everhart, J.E., Dickson, W.C., Knowler, W.C., & Johannes, R.S. (1988). Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care (pp. 261–265). IEEE Computer Society Press.


In [1]:
import pandas as pd
import numpy as np

In [5]:
# set seed for reproducibility
SEED = 20
np.random.seed(SEED)

In [6]:
# Loading Data
df = pd.read_csv('diabetes.csv')
df.head()

,Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
0,6,148,72,35,0,33.6,0.627,50,1
1,1,85,66,29,0,26.6,0.351,31,0
2,8,183,64,0,0,23.3,0.672,32,1
3,1,89,66,23,94,28.1,0.167,21,0
4,0,137,40,35,168,43.1,2.288,33,1


In [7]:
# Replace 0 with NaN in the columns where 0 marks a missing measurement
def replace_zero(df):
    df_nan=df.copy(deep=True)
    cols = ["Glucose","BloodPressure","SkinThickness","Insulin","BMI"]
    df_nan[cols] = df_nan[cols].replace({0:np.nan})
    return df_nan
df_nan=replace_zero(df)
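
Before replacing zeros it can help to see how many there are per column. The sketch below uses a small made-up frame (`toy` is not the real dataset) to count zeros and then apply the same replacement as `replace_zero`, leaving columns where 0 is a valid value untouched.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for the real data (illustration only).
toy = pd.DataFrame({
    "Glucose": [148, 0, 183],
    "Insulin": [0, 0, 94],
    "Pregnancies": [6, 0, 8],  # 0 is a legitimate value here
})
cols = ["Glucose", "Insulin"]

# Number of zero readings per column before the replacement.
zero_counts = (toy[cols] == 0).sum()
print(zero_counts)

# Same replacement as replace_zero, restricted to the chosen columns.
toy_nan = toy.copy()
toy_nan[cols] = toy_nan[cols].replace(0, np.nan)
print(toy_nan.isnull().sum())
```

The null counts after the replacement should match the zero counts before it, which is a quick sanity check that only the intended columns were changed.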

In [8]:
# Functions copied from the previous notebook
def find_median(frame,var):
    # Per-Outcome median of var, computed over the non-null rows only
    temp = frame[frame[var].notnull()]
    temp = temp[[var,'Outcome']].groupby('Outcome')[[var]].median().reset_index()
    return temp

In [9]:
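# Function copied from the previous notebook
def replace_null(frame,var):
    # Fill missing values of var in place with the median of the row's Outcome group
    median_df=find_median(frame,var)
    var_0=median_df[var].iloc[0]  # median for Outcome == 0
    var_1=median_df[var].iloc[1]  # median for Outcome == 1
    frame.loc[(frame['Outcome'] == 0) & (frame[var].isnull()), var] = var_0
    frame.loc[(frame['Outcome'] == 1) & (frame[var].isnull()), var] = var_1
    # Number of nulls left in var (0 when the imputation succeeded)
    return frame[var].isnull().sum()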

In [10]:
print(str(replace_null(df_nan,'Glucose'))+ ' Nulls for Glucose')
print(str(replace_null(df_nan,'SkinThickness'))+ ' Nulls for SkinThickness')
print(str(replace_null(df_nan,'Insulin'))+ ' Nulls for Insulin')
print(str(replace_null(df_nan,'BMI'))+ ' Nulls for BMI')
print(str(replace_null(df_nan,'BloodPressure'))+ ' Nulls for BloodPressure')
# We have successfully handled Nulls

0 Nulls for Glucose
0 Nulls for SkinThickness
0 Nulls for Insulin
0 Nulls for BMI
0 Nulls for BloodPressure
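
The per-class median imputation above can also be written with `groupby(...).transform`. This is a minimal sketch on a made-up frame (`toy` and its values are assumptions for illustration, not rows from the dataset), offered as an alternative formulation rather than the notebook's own method.

```python
import numpy as np
import pandas as pd

# Made-up frame standing in for df_nan (illustration only).
toy = pd.DataFrame({
    "Glucose": [100.0, np.nan, 120.0, 150.0, np.nan, 170.0],
    "Outcome": [0, 0, 0, 1, 1, 1],
})

# Median of each Outcome group, aligned back to the original row order.
group_median = toy.groupby("Outcome")["Glucose"].transform("median")

# Fill only the missing entries with the median of their own group.
toy["Glucose"] = toy["Glucose"].fillna(group_median)
print(toy)
```

Because the fill value depends on `Outcome`, this kind of imputation uses the label. That is worth keeping in mind when interpreting the test scores later on.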


In [11]:
df_nan.isnull().sum()
# Just a confirmation
# Everything looks good

Pregnancies                 0
Glucose                     0
BloodPressure               0
SkinThickness               0
Insulin                     0
BMI                         0
DiabetesPedigreeFunction    0
Age                         0
Outcome                     0
dtype: int64

In [12]:
# Standardize the features to zero mean and unit variance.
from sklearn.preprocessing import StandardScaler
def std_scalar(df):
    features = df.drop(["Outcome"], axis=1)
    std_X = StandardScaler()
    # Reuse the frame's own column names instead of hard-coding them
    x = pd.DataFrame(std_X.fit_transform(features), columns=features.columns)
    y = df["Outcome"]
    return x, y


In [13]:
X,Y=std_scalar(df_nan)
X.describe()
# Scaled data looks fine

,Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age
count,768.0,768.0,768.0,768.0,768.0,768.0,768.0,768.0
mean,2.5442610000000002e-17,1.604619e-16,-3.685926e-16,-3.9284260000000004e-17,-8.601337e-18,1.054567e-16,2.398978e-16,1.8576e-16
std,1.000652,1.000652,1.000652,1.000652,1.000652,1.000652,1.000652,1.000652
min,-1.141852,-2.551447,-3.999727,-2.486187,-1.434747,-2.070186,-1.189553,-1.041549
25%,-0.8448851,-0.7202356,-0.6934382,-0.4603073,-0.440843,-0.717659,-0.6889685,-0.7862862
50%,-0.2509521,-0.1536274,-0.03218035,-0.1226607,-0.440843,-0.0559387,-0.3001282,-0.3608474
75%,0.6399473,0.6100618,0.6290775,0.3275348,0.3116039,0.6057816,0.4662269,0.6602056
max,3.906578,2.539814,4.100681,7.868309,7.909072,5.041489,5.883565,4.063716


In [14]:
Y.head()

0    1
1    0
2    1
3    0
4    1
Name: Outcome, dtype: int64

In [15]:
# Keep 80% of the data for training; stratify so both splits keep the class ratio
from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test = train_test_split(X,Y,test_size=0.2,random_state=20, stratify=Y)
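
In this notebook the scaler is fit on the full dataset before the split, so the test rows contribute to the mean and standard deviation. A common alternative is to split first and fit the scaler on the training rows only. The sketch below shows that ordering on synthetic data; `features`, `labels`, and the other names are placeholders, not the notebook's variables.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature matrix and labels (not the real data).
rng = np.random.default_rng(20)
features = pd.DataFrame(rng.normal(loc=50, scale=10, size=(100, 3)),
                        columns=["a", "b", "c"])
labels = pd.Series(np.tile([0, 1], 50), name="Outcome")

# Split first ...
X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.2, random_state=20, stratify=labels)

# ... then learn the scaling statistics from the training rows only.
scaler = StandardScaler().fit(X_tr)
X_tr_scaled = pd.DataFrame(scaler.transform(X_tr),
                           columns=X_tr.columns, index=X_tr.index)
X_te_scaled = pd.DataFrame(scaler.transform(X_te),
                           columns=X_te.columns, index=X_te.index)

print(X_tr_scaled.mean().round(6))
```

With this ordering the training columns have mean 0 exactly, while the test columns are only approximately centered, which is expected.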


In [16]:
# The data is ready for baseline models.
# Start with KNN, trying k = 5 through 14.
from sklearn.neighbors import KNeighborsClassifier
test_scores = []
train_scores = []
for i in range(5,15):
    neigh = KNeighborsClassifier(n_neighbors=i)
    neigh.fit(X_train, Y_train)
    train_scores.append(neigh.score(X_train,Y_train))
    test_scores.append(neigh.score(X_test,Y_test))

In [17]:
print('Max train_scores is ' + str(max(train_scores)*100) + ' for k = '+ 
      str(train_scores.index(max(train_scores))+5))

Max train_scores is 85.66775244299674 for k = 5


In [18]:
print('Max test_scores is ' + str(max(test_scores)*100) + ' for k = '+ 
      str(test_scores.index(max(test_scores))+5))
# k=13 gives the highest test accuracy (about 87%).

Max test_scores is 87.01298701298701 for k = 13
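
The loop above chooses k by comparing test accuracy, which tends to make the reported test score optimistic. One way to choose k without touching the test set is cross-validation on the training data. This is a sketch on synthetic data generated with `make_classification`; `X_demo`, `y_demo`, and `cv_means` are illustrative names, and 5 folds is an arbitrary choice.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for X_train / Y_train (not the real data).
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=20)

# Mean 5-fold cross-validated accuracy for each candidate k.
cv_means = {}
for k in range(5, 15):
    knn = KNeighborsClassifier(n_neighbors=k)
    cv_means[k] = cross_val_score(knn, X_demo, y_demo, cv=5).mean()

# Pick the k with the best cross-validated accuracy.
best_k = max(cv_means, key=cv_means.get)
print(best_k, round(cv_means[best_k], 3))
```

The test set is then used once, at the end, to score the model refit with `best_k`.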


In [19]:
# Let's try logistic regression now
from sklearn.linear_model import LogisticRegression
log_model = LogisticRegression(random_state=20, penalty='l2').fit(X_train, Y_train)
log_pred=log_model.predict(X_test)
log_model.score(X_test, Y_test)

0.8311688311688312

In [20]:
# Support Vector Machines
from sklearn import svm
svm_model = svm.SVC().fit(X_train, Y_train)
svm_pred=svm_model.predict(X_test)
svm_model.score(X_test, Y_test)
# Almost 89% Accuracy

0.8896103896103896

In [21]:
# Mark each prediction as correct (1) or incorrect (0)
def model_perf(pred,Y_test):
    return [1 if i==j else 0 for i,j in zip(pred,Y_test)]


In [22]:
cmp_list=model_perf(svm_pred,Y_test)

In [23]:
print('Model Accuracy Confirmation :'+ str(cmp_list.count(1)/len(Y_test)))

Model Accuracy Confirmation :0.8896103896103896
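
The same check is available in `sklearn.metrics`, which also gives a confusion matrix showing where the errors fall. The labels below are made up for illustration; in the notebook the inputs would be `Y_test` and a prediction array such as `svm_pred`.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels and predictions (illustration only).
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 1, 0, 1]

# Fraction of predictions that match the true label.
acc = accuracy_score(y_true, y_pred)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

print(acc)  # 0.75
print(cm)
```

For a medical screening task the off-diagonal cells matter separately: a missed diabetic case and a false alarm have different costs, which plain accuracy does not show.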


In [24]:
# Random Forest
from sklearn.ensemble import RandomForestClassifier
rf_model = RandomForestClassifier(max_depth=2, random_state=20).fit(X_train, Y_train)
rf_pred=rf_model.predict(X_test)
rf_model.score(X_test, Y_test)
# Almost 86% Accuracy


0.8571428571428571

In [37]:
import tensorflow as tf
def build_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation='relu', input_shape=[len(X_train.keys())]),
        tf.keras.layers.Dense(4, activation='relu'),
        tf.keras.layers.Dense(2, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])

    optimizer = tf.keras.optimizers.Adam(learning_rate=0.01, beta_1=0.9, beta_2=0.999, epsilon=1e-07)

    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

neural_model = build_model()

In [38]:
neural_model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_4 (Dense)              (None, 8)                 72        
_________________________________________________________________
dense_5 (Dense)              (None, 4)                 36        
_________________________________________________________________
dense_6 (Dense)              (None, 2)                 10        
_________________________________________________________________
dense_7 (Dense)              (None, 1)                 3         
Total params: 121
Trainable params: 121
Non-trainable params: 0
_________________________________________________________________
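
The parameter counts in the summary follow from the layer widths: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. A short check, assuming the 8 input features used above:

```python
# Input width followed by the number of units in each Dense layer.
layer_sizes = [8, 8, 4, 2, 1]

# Weights (n_in * n_out) plus one bias per unit, for each consecutive pair.
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

print(params_per_layer)       # [72, 36, 10, 3]
print(sum(params_per_layer))  # 121
```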


In [39]:
# Keep EPOCHS high since the dataset is small.
EPOCHS = 1000
neural_pred = neural_model.fit(X_train, Y_train,epochs=EPOCHS, validation_split=0.1, verbose=2)

Epoch 1/1000
18/18 - 0s - loss: 0.6310 - accuracy: 0.6630 - val_loss: 0.4663 - val_accuracy: 0.7903
Epoch 2/1000
18/18 - 0s - loss: 0.5312 - accuracy: 0.7355 - val_loss: 0.3885 - val_accuracy: 0.8065
Epoch 3/1000
18/18 - 0s - loss: 0.4882 - accuracy: 0.7518 - val_loss: 0.3451 - val_accuracy: 0.8387
Epoch 4/1000
18/18 - 0s - loss: 0.4569 - accuracy: 0.7754 - val_loss: 0.3376 - val_accuracy: 0.8387
Epoch 5/1000
18/18 - 0s - loss: 0.4323 - accuracy: 0.8025 - val_loss: 0.3182 - val_accuracy: 0.8226
Epoch 6/1000
18/18 - 0s - loss: 0.4194 - accuracy: 0.8062 - val_loss: 0.3084 - val_accuracy: 0.8226
Epoch 7/1000
18/18 - 0s - loss: 0.4084 - accuracy: 0.8116 - val_loss: 0.3027 - val_accuracy: 0.8387
Epoch 8/1000
18/18 - 0s - loss: 0.3904 - accuracy: 0.8134 - val_loss: 0.2999 - val_accuracy: 0.8710
Epoch 9/1000
18/18 - 0s - loss: 0.3767 - accuracy: 0.8279 - val_loss: 0.2913 - val_accuracy: 0.8871
Epoch 10/1000
18/18 - 0s - loss: 0.3663 - accuracy: 0.8370 - val_loss: 0.2860 - val_accuracy: 0.9032

Epoch 81/1000
18/18 - 0s - loss: 0.2162 - accuracy: 0.9076 - val_loss: 0.4606 - val_accuracy: 0.8548
Epoch 82/1000
18/18 - 0s - loss: 0.2087 - accuracy: 0.9040 - val_loss: 0.4844 - val_accuracy: 0.8710
Epoch 83/1000
18/18 - 0s - loss: 0.2066 - accuracy: 0.9004 - val_loss: 0.4891 - val_accuracy: 0.8548
Epoch 84/1000
18/18 - 0s - loss: 0.1992 - accuracy: 0.9112 - val_loss: 0.4612 - val_accuracy: 0.8387
Epoch 85/1000
18/18 - 0s - loss: 0.2019 - accuracy: 0.9167 - val_loss: 0.5086 - val_accuracy: 0.8710
Epoch 86/1000
18/18 - 0s - loss: 0.2004 - accuracy: 0.9185 - val_loss: 0.4804 - val_accuracy: 0.8548
Epoch 87/1000
18/18 - 0s - loss: 0.2047 - accuracy: 0.9167 - val_loss: 0.4026 - val_accuracy: 0.8548
Epoch 88/1000
18/18 - 0s - loss: 0.1929 - accuracy: 0.9239 - val_loss: 0.4893 - val_accuracy: 0.8710
Epoch 89/1000
18/18 - 0s - loss: 0.2123 - accuracy: 0.9130 - val_loss: 0.4542 - val_accuracy: 0.8387
Epoch 90/1000
18/18 - 0s - loss: 0.2061 - accuracy: 0.9130 - val_loss: 0.4205 - val_accurac

Epoch 162/1000
18/18 - 0s - loss: 0.1546 - accuracy: 0.9402 - val_loss: 0.6746 - val_accuracy: 0.8226
Epoch 163/1000
18/18 - 0s - loss: 0.1643 - accuracy: 0.9366 - val_loss: 0.6466 - val_accuracy: 0.8548
Epoch 164/1000
18/18 - 0s - loss: 0.1704 - accuracy: 0.9420 - val_loss: 0.8484 - val_accuracy: 0.8387
Epoch 165/1000
18/18 - 0s - loss: 0.1715 - accuracy: 0.9293 - val_loss: 0.6468 - val_accuracy: 0.8387
Epoch 166/1000
18/18 - 0s - loss: 0.1820 - accuracy: 0.9275 - val_loss: 0.7417 - val_accuracy: 0.8548
Epoch 167/1000
18/18 - 0s - loss: 0.1622 - accuracy: 0.9366 - val_loss: 0.7838 - val_accuracy: 0.8387
Epoch 168/1000
18/18 - 0s - loss: 0.1626 - accuracy: 0.9330 - val_loss: 0.6916 - val_accuracy: 0.8387
Epoch 169/1000
18/18 - 0s - loss: 0.1609 - accuracy: 0.9330 - val_loss: 0.6679 - val_accuracy: 0.8710
Epoch 170/1000
18/18 - 0s - loss: 0.1597 - accuracy: 0.9366 - val_loss: 0.7282 - val_accuracy: 0.8710
Epoch 171/1000
18/18 - 0s - loss: 0.1632 - accuracy: 0.9312 - val_loss: 0.6905 - v

Epoch 243/1000
18/18 - 0s - loss: 0.1869 - accuracy: 0.9312 - val_loss: 0.9388 - val_accuracy: 0.8065
Epoch 244/1000
18/18 - 0s - loss: 0.1782 - accuracy: 0.9293 - val_loss: 1.0579 - val_accuracy: 0.8065
Epoch 245/1000
18/18 - 0s - loss: 0.1574 - accuracy: 0.9366 - val_loss: 1.0090 - val_accuracy: 0.8065
Epoch 246/1000
18/18 - 0s - loss: 0.1570 - accuracy: 0.9402 - val_loss: 1.1958 - val_accuracy: 0.8065
Epoch 247/1000
18/18 - 0s - loss: 0.1459 - accuracy: 0.9457 - val_loss: 1.1842 - val_accuracy: 0.8065
Epoch 248/1000
18/18 - 0s - loss: 0.1417 - accuracy: 0.9475 - val_loss: 1.1340 - val_accuracy: 0.7903
Epoch 249/1000
18/18 - 0s - loss: 0.1397 - accuracy: 0.9457 - val_loss: 1.1348 - val_accuracy: 0.7742
Epoch 250/1000
18/18 - 0s - loss: 0.1427 - accuracy: 0.9457 - val_loss: 1.1611 - val_accuracy: 0.7742
Epoch 251/1000
18/18 - 0s - loss: 0.1400 - accuracy: 0.9493 - val_loss: 0.9437 - val_accuracy: 0.7903
Epoch 252/1000
18/18 - 0s - loss: 0.1447 - accuracy: 0.9420 - val_loss: 0.9617 - v

Epoch 324/1000
18/18 - 0s - loss: 0.1330 - accuracy: 0.9529 - val_loss: 1.8462 - val_accuracy: 0.7581
Epoch 325/1000
18/18 - 0s - loss: 0.1350 - accuracy: 0.9511 - val_loss: 1.7675 - val_accuracy: 0.7419
Epoch 326/1000
18/18 - 0s - loss: 0.1328 - accuracy: 0.9547 - val_loss: 1.7814 - val_accuracy: 0.7581
Epoch 327/1000
18/18 - 0s - loss: 0.1344 - accuracy: 0.9511 - val_loss: 1.7645 - val_accuracy: 0.7419
Epoch 328/1000
18/18 - 0s - loss: 0.1313 - accuracy: 0.9529 - val_loss: 1.7715 - val_accuracy: 0.7903
Epoch 329/1000
18/18 - 0s - loss: 0.1307 - accuracy: 0.9511 - val_loss: 1.7164 - val_accuracy: 0.7581
Epoch 330/1000
18/18 - 0s - loss: 0.1307 - accuracy: 0.9493 - val_loss: 1.6959 - val_accuracy: 0.7581
Epoch 331/1000
18/18 - 0s - loss: 0.1358 - accuracy: 0.9457 - val_loss: 1.6976 - val_accuracy: 0.7419
Epoch 332/1000
18/18 - 0s - loss: 0.1388 - accuracy: 0.9493 - val_loss: 1.9238 - val_accuracy: 0.7581
Epoch 333/1000
18/18 - 0s - loss: 0.1351 - accuracy: 0.9511 - val_loss: 1.9649 - v

Epoch 405/1000
18/18 - 0s - loss: 0.1692 - accuracy: 0.9384 - val_loss: 2.3149 - val_accuracy: 0.7903
Epoch 406/1000
18/18 - 0s - loss: 0.1532 - accuracy: 0.9493 - val_loss: 2.3191 - val_accuracy: 0.7742
Epoch 407/1000
18/18 - 0s - loss: 0.1486 - accuracy: 0.9493 - val_loss: 2.3408 - val_accuracy: 0.7742
Epoch 408/1000
18/18 - 0s - loss: 0.1483 - accuracy: 0.9475 - val_loss: 2.4295 - val_accuracy: 0.7742
Epoch 409/1000
18/18 - 0s - loss: 0.1495 - accuracy: 0.9457 - val_loss: 2.4620 - val_accuracy: 0.7742
Epoch 410/1000
18/18 - 0s - loss: 0.1514 - accuracy: 0.9475 - val_loss: 2.4864 - val_accuracy: 0.7742
Epoch 411/1000
18/18 - 0s - loss: 0.1473 - accuracy: 0.9511 - val_loss: 2.4692 - val_accuracy: 0.7903
Epoch 412/1000
18/18 - 0s - loss: 0.1466 - accuracy: 0.9493 - val_loss: 2.4734 - val_accuracy: 0.8065
Epoch 413/1000
18/18 - 0s - loss: 0.1453 - accuracy: 0.9529 - val_loss: 2.6734 - val_accuracy: 0.7742
Epoch 414/1000
18/18 - 0s - loss: 0.1431 - accuracy: 0.9511 - val_loss: 2.5947 - v

Epoch 486/1000
18/18 - 0s - loss: 0.1529 - accuracy: 0.9438 - val_loss: 1.9096 - val_accuracy: 0.7903
Epoch 487/1000
18/18 - 0s - loss: 0.1524 - accuracy: 0.9457 - val_loss: 2.0670 - val_accuracy: 0.7581
Epoch 488/1000
18/18 - 0s - loss: 0.1695 - accuracy: 0.9348 - val_loss: 2.1616 - val_accuracy: 0.7258
Epoch 489/1000
18/18 - 0s - loss: 0.1609 - accuracy: 0.9420 - val_loss: 1.7220 - val_accuracy: 0.7258
Epoch 490/1000
18/18 - 0s - loss: 0.1836 - accuracy: 0.9348 - val_loss: 2.1609 - val_accuracy: 0.7581
Epoch 491/1000
18/18 - 0s - loss: 0.1706 - accuracy: 0.9293 - val_loss: 1.9595 - val_accuracy: 0.7581
Epoch 492/1000
18/18 - 0s - loss: 0.1565 - accuracy: 0.9438 - val_loss: 1.8803 - val_accuracy: 0.7419
Epoch 493/1000
18/18 - 0s - loss: 0.1548 - accuracy: 0.9475 - val_loss: 1.9846 - val_accuracy: 0.7419
Epoch 494/1000
18/18 - 0s - loss: 0.1553 - accuracy: 0.9438 - val_loss: 1.9196 - val_accuracy: 0.7258
Epoch 495/1000
18/18 - 0s - loss: 0.1533 - accuracy: 0.9457 - val_loss: 1.8853 - v

Epoch 567/1000
18/18 - 0s - loss: 0.1550 - accuracy: 0.9402 - val_loss: 2.0253 - val_accuracy: 0.7419
Epoch 568/1000
18/18 - 0s - loss: 0.1542 - accuracy: 0.9402 - val_loss: 1.9874 - val_accuracy: 0.7419
Epoch 569/1000
18/18 - 0s - loss: 0.1628 - accuracy: 0.9348 - val_loss: 2.0175 - val_accuracy: 0.7742
Epoch 570/1000
18/18 - 0s - loss: 0.1758 - accuracy: 0.9293 - val_loss: 1.9141 - val_accuracy: 0.7258
Epoch 571/1000
18/18 - 0s - loss: 0.1704 - accuracy: 0.9330 - val_loss: 1.8309 - val_accuracy: 0.7581
Epoch 572/1000
18/18 - 0s - loss: 0.1642 - accuracy: 0.9348 - val_loss: 1.8654 - val_accuracy: 0.8065
Epoch 573/1000
18/18 - 0s - loss: 0.1663 - accuracy: 0.9366 - val_loss: 1.7528 - val_accuracy: 0.7419
Epoch 574/1000
18/18 - 0s - loss: 0.1648 - accuracy: 0.9348 - val_loss: 1.7098 - val_accuracy: 0.7419
Epoch 575/1000
18/18 - 0s - loss: 0.1636 - accuracy: 0.9330 - val_loss: 1.7743 - val_accuracy: 0.7419
Epoch 576/1000
18/18 - 0s - loss: 0.1596 - accuracy: 0.9384 - val_loss: 1.8088 - v

Epoch 648/1000
18/18 - 0s - loss: 0.1613 - accuracy: 0.9366 - val_loss: 2.1236 - val_accuracy: 0.7742
Epoch 649/1000
18/18 - 0s - loss: 0.1698 - accuracy: 0.9275 - val_loss: 2.1033 - val_accuracy: 0.7742
Epoch 650/1000
18/18 - 0s - loss: 0.1590 - accuracy: 0.9402 - val_loss: 2.2183 - val_accuracy: 0.7581
Epoch 651/1000
18/18 - 0s - loss: 0.1684 - accuracy: 0.9330 - val_loss: 2.1937 - val_accuracy: 0.7581
Epoch 652/1000
18/18 - 0s - loss: 0.1583 - accuracy: 0.9366 - val_loss: 2.2163 - val_accuracy: 0.7419
Epoch 653/1000
18/18 - 0s - loss: 0.1571 - accuracy: 0.9384 - val_loss: 2.1983 - val_accuracy: 0.7581
Epoch 654/1000
18/18 - 0s - loss: 0.1601 - accuracy: 0.9348 - val_loss: 2.1480 - val_accuracy: 0.7419
Epoch 655/1000
18/18 - 0s - loss: 0.1566 - accuracy: 0.9384 - val_loss: 2.1262 - val_accuracy: 0.7742
Epoch 656/1000
18/18 - 0s - loss: 0.1557 - accuracy: 0.9384 - val_loss: 2.1804 - val_accuracy: 0.7581
Epoch 657/1000
18/18 - 0s - loss: 0.1574 - accuracy: 0.9384 - val_loss: 2.1823 - v

Epoch 729/1000
18/18 - 0s - loss: 0.1926 - accuracy: 0.9221 - val_loss: 1.5107 - val_accuracy: 0.7742
Epoch 730/1000
18/18 - 0s - loss: 0.1935 - accuracy: 0.9185 - val_loss: 1.3274 - val_accuracy: 0.8226
Epoch 731/1000
18/18 - 0s - loss: 0.2017 - accuracy: 0.9239 - val_loss: 1.4079 - val_accuracy: 0.8065
Epoch 732/1000
18/18 - 0s - loss: 0.1972 - accuracy: 0.9221 - val_loss: 1.5250 - val_accuracy: 0.7742
Epoch 733/1000
18/18 - 0s - loss: 0.1843 - accuracy: 0.9293 - val_loss: 1.4300 - val_accuracy: 0.7903
Epoch 734/1000
18/18 - 0s - loss: 0.1840 - accuracy: 0.9275 - val_loss: 1.3870 - val_accuracy: 0.7903
Epoch 735/1000
18/18 - 0s - loss: 0.1826 - accuracy: 0.9275 - val_loss: 1.4215 - val_accuracy: 0.7742
Epoch 736/1000
18/18 - 0s - loss: 0.1792 - accuracy: 0.9293 - val_loss: 1.4145 - val_accuracy: 0.7742
Epoch 737/1000
18/18 - 0s - loss: 0.1800 - accuracy: 0.9312 - val_loss: 1.4950 - val_accuracy: 0.7742
Epoch 738/1000
18/18 - 0s - loss: 0.1824 - accuracy: 0.9275 - val_loss: 1.4604 - v

Epoch 810/1000
18/18 - 0s - loss: 0.2205 - accuracy: 0.8967 - val_loss: 1.7100 - val_accuracy: 0.7903
Epoch 811/1000
18/18 - 0s - loss: 0.2195 - accuracy: 0.8967 - val_loss: 1.6695 - val_accuracy: 0.7903
Epoch 812/1000
18/18 - 0s - loss: 0.2218 - accuracy: 0.8967 - val_loss: 1.6830 - val_accuracy: 0.7903
Epoch 813/1000
18/18 - 0s - loss: 0.2219 - accuracy: 0.8949 - val_loss: 1.6697 - val_accuracy: 0.7903
Epoch 814/1000
18/18 - 0s - loss: 0.2202 - accuracy: 0.8967 - val_loss: 1.7254 - val_accuracy: 0.7903
Epoch 815/1000
18/18 - 0s - loss: 0.2135 - accuracy: 0.9022 - val_loss: 1.7681 - val_accuracy: 0.7903
Epoch 816/1000
18/18 - 0s - loss: 0.2124 - accuracy: 0.9022 - val_loss: 1.7539 - val_accuracy: 0.7903
Epoch 817/1000
18/18 - 0s - loss: 0.2117 - accuracy: 0.9022 - val_loss: 1.7824 - val_accuracy: 0.7903
Epoch 818/1000
18/18 - 0s - loss: 0.2113 - accuracy: 0.9022 - val_loss: 1.7755 - val_accuracy: 0.8065
Epoch 819/1000
18/18 - 0s - loss: 0.2118 - accuracy: 0.9004 - val_loss: 1.7466 - v

Epoch 891/1000
18/18 - 0s - loss: 0.2330 - accuracy: 0.8913 - val_loss: 1.5452 - val_accuracy: 0.7903
Epoch 892/1000
18/18 - 0s - loss: 0.2602 - accuracy: 0.8841 - val_loss: 1.9910 - val_accuracy: 0.7258
Epoch 893/1000
18/18 - 0s - loss: 0.2593 - accuracy: 0.8714 - val_loss: 2.2228 - val_accuracy: 0.7097
Epoch 894/1000
18/18 - 0s - loss: 0.2410 - accuracy: 0.8750 - val_loss: 2.1868 - val_accuracy: 0.7097
Epoch 895/1000
18/18 - 0s - loss: 0.2312 - accuracy: 0.8804 - val_loss: 2.3077 - val_accuracy: 0.7097
Epoch 896/1000
18/18 - 0s - loss: 0.2461 - accuracy: 0.8895 - val_loss: 2.5729 - val_accuracy: 0.7419
Epoch 897/1000
18/18 - 0s - loss: 0.2970 - accuracy: 0.8931 - val_loss: 2.0163 - val_accuracy: 0.7258
Epoch 898/1000
18/18 - 0s - loss: 0.2295 - accuracy: 0.8949 - val_loss: 0.9634 - val_accuracy: 0.7742
Epoch 899/1000
18/18 - 0s - loss: 0.2305 - accuracy: 0.8895 - val_loss: 1.1486 - val_accuracy: 0.7581
Epoch 900/1000
18/18 - 0s - loss: 0.2269 - accuracy: 0.8877 - val_loss: 1.4929 - v

Epoch 972/1000
18/18 - 0s - loss: 0.2009 - accuracy: 0.9094 - val_loss: 1.7563 - val_accuracy: 0.7903
Epoch 973/1000
18/18 - 0s - loss: 0.1925 - accuracy: 0.9167 - val_loss: 1.7893 - val_accuracy: 0.7903
Epoch 974/1000
18/18 - 0s - loss: 0.2073 - accuracy: 0.9112 - val_loss: 1.8583 - val_accuracy: 0.7581
Epoch 975/1000
18/18 - 0s - loss: 0.2005 - accuracy: 0.9112 - val_loss: 1.8940 - val_accuracy: 0.7742
Epoch 976/1000
18/18 - 0s - loss: 0.1920 - accuracy: 0.9167 - val_loss: 1.8911 - val_accuracy: 0.7903
Epoch 977/1000
18/18 - 0s - loss: 0.1944 - accuracy: 0.9130 - val_loss: 1.8856 - val_accuracy: 0.7742
Epoch 978/1000
18/18 - 0s - loss: 0.1917 - accuracy: 0.9149 - val_loss: 1.9096 - val_accuracy: 0.7903
Epoch 979/1000
18/18 - 0s - loss: 0.1884 - accuracy: 0.9185 - val_loss: 1.9347 - val_accuracy: 0.7742
Epoch 980/1000
18/18 - 0s - loss: 0.1893 - accuracy: 0.9185 - val_loss: 1.9343 - val_accuracy: 0.7742
Epoch 981/1000
18/18 - 0s - loss: 0.1894 - accuracy: 0.9185 - val_loss: 1.9145 - v

In [40]:
# Let's measure final performance
hist = pd.DataFrame(neural_pred.history)
hist['epoch'] = neural_pred.epoch
hist.tail()
# About 91% accuracy on train, but only about 79% on the validation split

,loss,accuracy,val_loss,val_accuracy,epoch
995,0.203334,0.913043,1.789367,0.790323,995
996,0.209075,0.90942,1.883502,0.790323,996
997,0.197687,0.914855,1.790327,0.774194,997
998,0.199069,0.913043,1.727864,0.774194,998
999,0.220912,0.907609,1.568739,0.790323,999
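
In the visible parts of the log, validation loss is lowest around epoch 10 (about 0.29) and is well above 1.5 by the end, while training loss stays low. That pattern is typical of overfitting, and a patience-based stopping rule is one way to handle it. The sketch below implements the rule in plain Python on made-up loss values; `early_stop_epoch` and `demo_val_loss` are hypothetical names. Keras offers the same idea as the `tf.keras.callbacks.EarlyStopping` callback.

```python
def early_stop_epoch(val_losses, patience=5):
    """Return (stop_index, best_index) for a patience-based stopping rule.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_index = -1
    waited = 0
    for index, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best value: remember it and reset the patience counter.
            best_loss = loss
            best_index = index
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return index, best_index
    # Patience never ran out: training runs to the last epoch.
    return len(val_losses) - 1, best_index


# Made-up validation losses (illustration only, not taken from the log).
demo_val_loss = [0.47, 0.39, 0.35, 0.31, 0.29, 0.30, 0.33, 0.36, 0.40, 0.46, 0.50]
stop_at, best_at = early_stop_epoch(demo_val_loss, patience=3)
print(stop_at, best_at)  # 7 4
```

Here the best loss occurs at index 4 and training stops at index 7, after three epochs without improvement. Restoring the weights from the best epoch would then give the model to evaluate.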


In [41]:
neural_test=neural_model.predict(X_test)

In [42]:
neural_test_converted=[]
for i in neural_test:
    if i>0.5:
        neural_test_converted.append(1)
    else:
        neural_test_converted.append(0)
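
The thresholding loop can also be written as a single NumPy expression. The array below is a made-up stand-in for the `(n_samples, 1)` output of `predict`.

```python
import numpy as np

# Made-up sigmoid outputs, one row per sample (illustration only).
probs = np.array([[0.91], [0.12], [0.50], [0.73]])

# Flatten to 1-D, compare with the 0.5 threshold, and cast booleans to 0/1.
labels = (probs.ravel() > 0.5).astype(int)
print(labels.tolist())  # [1, 0, 0, 1]
```

As in the loop, a probability of exactly 0.5 maps to class 0 because the comparison is strict.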

In [43]:
cmp_list=model_perf(neural_test_converted,Y_test)

In [44]:
print('Test Accuracy :' + str(cmp_list.count(1)/len(Y_test)*100)+' %')
#~86% Accuracy.

Test Accuracy :85.71428571428571 %


In [21]:
import pickle
# Let's dump our SVM model; the with-block closes the file after writing
with open('svm_model.pkl','wb') as f:
    pickle.dump(svm_model, f)
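
A quick round-trip check confirms that a pickled model predicts the same as the original. This sketch trains a throwaway `SVC` on synthetic data and writes to a temporary directory; `X_demo`, `y_demo`, and `model` are placeholders, not the notebook's `svm_model`.

```python
import os
import pickle
import tempfile

from sklearn import svm
from sklearn.datasets import make_classification

# Synthetic stand-in data and a throwaway model (illustration only).
X_demo, y_demo = make_classification(n_samples=100, n_features=8, random_state=20)
model = svm.SVC().fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "svm_model.pkl")
    # Save the fitted model ...
    with open(path, "wb") as f:
        pickle.dump(model, f)
    # ... and load it back.
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions exactly.
same_predictions = bool((restored.predict(X_demo) == model.predict(X_demo)).all())
print(same_predictions)
```

Any preprocessing applied before training, such as the scaler, has to be saved and reapplied as well, since the model expects standardized inputs. Pickle files should only be loaded from trusted sources.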