# Naive Bayes Classifier on Fetal Heart Health Predictors  

***Karlie Schwartzwald  
DSC 630 Winter 2022  
Bellevue University***

**Change Control Log:**  

Change#: 1  
Change(s) Made:  Imported and cleaned data. Trained Naive Bayes and ran accuracy scores.  
Date of Change:  1/28/2023  
Author: Karlie Schwartzwald  
Change Approved by: Karlie Schwartzwald  
Date Moved to Production:   

Change#: 2  
Change(s) Made:  Tried Complement NB and got better accuracy score. Then did hyperparameter tuning.     
Date of Change:  1/29/2023  
Author: Karlie Schwartzwald  
Change Approved by: Karlie Schwartzwald  
Date Moved to Production:   

Change#: 3  
Change(s) Made:  Removed anything not directly relevant to Complement NB  
Date of Change:  2/10/2023  
Author: Karlie Schwartzwald  
Change Approved by: Karlie Schwartzwald  
Date Moved to Production:   

In [1]:
# Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import RobustScaler
from sklearn.naive_bayes import ComplementNB
from sklearn.metrics import accuracy_score, classification_report
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_validate

In [2]:
def get_tpr_fnr_fpr_tnr(cm):
    """
    Return class-wise TPR, FNR, FPR and TNR.
    cm: a 2-D array holding a multiclass confusion matrix
        where rows represent actual classes
        and columns represent predicted classes
    Returns a dictionary of class-wise rates keyed by class label ("1", "2", ...).
    """
    dict_metric = dict()
    n = len(cm[0])
    row_sums = cm.sum(axis=1)
    col_sums = cm.sum(axis=0)
    array_sum = cm.sum()
    #initialize a blank nested dictionary
    for i in range(1, n+1):
        keys = str(i)
        dict_metric[keys] = {"TPR":0, "FNR":0, "FPR":0, "TNR":0}
    # calculate and store class-wise TPR, FNR, FPR, TNR
    for i in range(n):
        for j in range(n):
            if i == j:
                keys = str(i+1)
                tp = cm[i, j]
                fn = row_sums[i] - cm[i, j]
                dict_metric[keys]["TPR"] = tp / (tp + fn)
                dict_metric[keys]["FNR"] = fn / (tp + fn)
                fp = col_sums[i] - cm[i, j]
                tn = array_sum - tp - fn - fp
                dict_metric[keys]["FPR"] = fp / (fp + tn)
                dict_metric[keys]["TNR"] = tn / (fp + tn)
    return dict_metric
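
The same per-class rates can also be computed without explicit loops. The following is a minimal sketch, not the function used in this notebook: `per_class_rates` and `toy_cm` are hypothetical names, the matrix values are made up for illustration, and rows are assumed to be actual classes with columns as predicted classes.

```python
import numpy as np

def per_class_rates(cm):
    """Return per-class TPR, FNR, FPR, TNR for a square confusion matrix.

    Assumes rows are actual classes and columns are predicted classes.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp      # actual class i, predicted as another class
    fp = cm.sum(axis=0) - tp      # predicted class i, actually another class
    tn = cm.sum() - tp - fn - fp  # everything not involving class i
    return {
        "TPR": tp / (tp + fn),
        "FNR": fn / (tp + fn),
        "FPR": fp / (fp + tn),
        "TNR": tn / (fp + tn),
    }

# Hypothetical 3-class confusion matrix, for illustration only
toy_cm = np.array([[8, 1, 1],
                   [2, 6, 2],
                   [0, 1, 9]])
rates = per_class_rates(toy_cm)
print(rates["TPR"])
print(rates["FPR"])
```

Each array holds one value per class, in the same order as the rows of the confusion matrix.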

## Null Model for Comparison

Below we calculate the accuracy of a null model that always predicts the most frequent class. In this case, the model predicts that every fetal heart health outcome is healthy (class 1.0).

In [3]:
df = pd.read_csv('fetal_health.csv')

In [4]:
# separate target feature
x = df.drop(['fetal_health'], axis=1)
y = df['fetal_health']

In [5]:
# split into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.20, random_state = 0)

In [6]:
# check class distribution in test set
y_test.value_counts()

1.0    326
2.0     58
3.0     42
Name: fetal_health, dtype: int64

In [7]:
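# check null accuracy score: share of the most frequent class in the test set
# (computed from y_test rather than hard-coded counts, which no longer matched the 80/20 split)
null_accuracy = y_test.value_counts().max() / len(y_test)
print('Null accuracy score: {0:0.4f}'.format(null_accuracy))

Null accuracy score: 0.7653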


In [8]:
# The tuned model needs to beat the null accuracy score above to be worth using
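
A null model of this kind can also be built with scikit-learn's `DummyClassifier`. The sketch below is an illustration under stated assumptions: the labels are rebuilt from the test-set class counts printed above, `X_dummy` is a placeholder feature matrix, and the baseline is fit on those same labels for brevity (in practice it would be fit on `y_train` and scored on `y_test`).

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

# Labels rebuilt from the test-set class counts shown above (326 / 58 / 42)
y_true = np.repeat([1.0, 2.0, 3.0], [326, 58, 42])
X_dummy = np.zeros((len(y_true), 1))  # placeholder; the baseline ignores features

# Always predict the most frequent class seen during fit
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_dummy, y_true)

baseline_acc = accuracy_score(y_true, baseline.predict(X_dummy))
print(f"Null accuracy score: {baseline_acc:.4f}")
```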

## Complement Naive Bayes

### Preparation

For the Naïve Bayes classifier, we chose Complement Naïve Bayes because it is designed to handle imbalanced classes in the target feature. We used an 80/20 train-test split because it yielded the highest accuracy of the ratios we tested without overfitting the model too much. We then applied a min-max scaler because the model does not accept negative values as input. Finally, we ran a grid search to tune the smoothing parameter `alpha`, the main hyperparameter of a Complement Naïve Bayes model, using accuracy as the scoring metric.

In [9]:
# Drop highly correlated variables
# 0.9 Threshold as described in Milestone 3

# Histogram mode and median highly correlate with histogram mean
df.drop(['histogram_median', 'histogram_mode'], axis=1, inplace=True)

In [10]:
# separate target feature
X = df.drop(['fetal_health'], axis=1)
y = df['fetal_health']

In [11]:
# Splitting the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20, random_state=0)

In [12]:
# Scaling
scaler = MinMaxScaler()
# MinMaxScaler is used here because ComplementNB raises an error on negative values
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
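
The negative-value restriction can be reproduced on a small example. This is a minimal sketch on hypothetical toy data (`X_toy`, `y_toy`), assuming that `ComplementNB.fit` raises a `ValueError` when the training features contain negative entries.

```python
import numpy as np
from sklearn.naive_bayes import ComplementNB
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data with one negative entry, for illustration only
X_toy = np.array([[-1.0, 2.0], [0.5, 0.0], [3.0, 1.0], [2.0, 4.0]])
y_toy = np.array([1, 1, 2, 2])

# Fitting on the raw features is expected to fail because of the negative value
try:
    ComplementNB().fit(X_toy, y_toy)
    raised = False
except ValueError as err:
    raised = True
    print("Unscaled fit failed:", err)

# Min-max scaling maps each column of the training data onto [0, 1]
X_scaled = MinMaxScaler().fit_transform(X_toy)
model = ComplementNB().fit(X_scaled, y_toy)
print(X_scaled.min(), X_scaled.max())
```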

### Hyperparameter Tuning for Model Training

In [13]:
# gridsearch values: 0.1 to 0.9 in steps of 0.1, then the integers 1 to 100
# (the hand-typed list listed 48 twice and skipped 58)
grid_vals = {"alpha": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
                      + list(range(1, 101))}

In [14]:
# gridsearch model
grid_lr = GridSearchCV(estimator=ComplementNB(), param_grid=grid_vals, scoring='accuracy', 
                       cv=6, refit=True, return_train_score=True) 

In [15]:
#Training and Prediction
tuned_classifier = grid_lr.fit(X_train, y_train)
tuned_preds = grid_lr.best_estimator_.predict(X_test)

In [16]:
# Print the tuned parameter
print("Tuned Complement NB Parameter: {}".format(grid_lr.best_params_))

Tuned Complement NB Parameter: {'alpha': 49}
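
One alternative worth noting is to put the scaler and the classifier in a single pipeline, so that the scaler is refit on the training portion of every cross-validation fold. The sketch below is not the approach used in this notebook: it runs on synthetic data from `make_classification` as a stand-in for the fetal health features, and the names `X_syn`, `y_syn`, `pipe`, and `search` are hypothetical, as is the short alpha grid.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import ComplementNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic imbalanced 3-class data standing in for the real features
X_syn, y_syn = make_classification(n_samples=600, n_features=10, n_informative=5,
                                   n_classes=3, weights=[0.78, 0.14, 0.08],
                                   random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.20,
                                          stratify=y_syn, random_state=0)

# make_pipeline names each step after its lowercased class name,
# so the smoothing parameter is addressed as "complementnb__alpha"
pipe = make_pipeline(MinMaxScaler(), ComplementNB())
param_grid = {"complementnb__alpha": [0.1, 1, 10, 50, 100]}

search = GridSearchCV(pipe, param_grid=param_grid, scoring="accuracy", cv=6)
search.fit(X_tr, y_tr)

print(search.best_params_)
print(search.score(X_te, y_te))
```

Because the scaler lives inside the pipeline, the held-out portion of each fold never influences the scaling range used to train on that fold.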


### Evaluation

In [17]:
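# cross validate (tuned_classifier is the fitted GridSearchCV object,
# so the grid search is re-run inside each of the 5 default folds)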
scores3 = cross_validate(tuned_classifier, X_train, y_train, return_train_score=True)
scores_df3 = pd.DataFrame(scores3)

In [18]:
scores_df3.mean()
# Train score is only slightly higher than test score, so any overfitting looks small; the model is performing better than null

fit_time       1.152596
score_time     0.000000
test_score     0.847059
train_score    0.850147
dtype: float64

In [19]:
# Print the Confusion Matrix
cm = confusion_matrix(y_test, tuned_preds)
conf_mat = pd.DataFrame(get_tpr_fnr_fpr_tnr(cm)).transpose()
conf_mat

| Class | TPR | FNR | FPR | TNR |
|---|---|---|---|---|
| 1 | 0.892638 | 0.107362 | 0.3 | 0.7 |
| 2 | 0.534483 | 0.465517 | 0.105978 | 0.894022 |
| 3 | 0.595238 | 0.404762 | 0.026042 | 0.973958 |


In evaluating the Complement Naïve Bayes model, we primarily used accuracy but also looked at the per-class true positive, false negative, false positive, and true negative rates. In cross-validation, the average accuracy was approximately 84.7% on the validation folds and approximately 85.0% on the training folds. The gap is small, so any overfitting appears slight. However, the per-class true positive rates raise serious concerns. The model identifies healthy outcomes (class 1) well, with a true positive rate of 89%, but it correctly identifies only 53% of class 2 cases and 60% of class 3 cases. In addition, the 30% false positive rate for class 1 means that 30% of the non-healthy cases in the test set were labeled healthy. Since detecting the non-healthy outcomes is the real purpose of this model, we conclude that it is not suitable for deployment.
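
To put a single number on the gap between overall accuracy and per-class performance, the reported rates can be combined into a balanced accuracy (the unweighted mean of the per-class true positive rates). The sketch below works only from figures already shown in this notebook: the TPR column of the table and the test-set class counts. The per-class correct counts are reconstructed by rounding, which is an assumption of this sketch.

```python
import numpy as np

# Per-class TPR from the evaluation table and class counts from y_test.value_counts()
tpr = np.array([0.892638, 0.534483, 0.595238])
support = np.array([326, 58, 42])

# Correct predictions per class, reconstructed by rounding TPR * support
true_positives = np.rint(tpr * support).astype(int)

# Overall accuracy weights each class by its size; balanced accuracy does not
overall_accuracy = true_positives.sum() / support.sum()
balanced_accuracy = tpr.mean()

print(f"Hold-out accuracy: {overall_accuracy:.4f}")
print(f"Balanced accuracy: {balanced_accuracy:.4f}")
```

A balanced accuracy well below the overall accuracy is another way of seeing that the majority class drives most of the headline score.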