# Fetal Health Classification

The purpose of this notebook is to clean and prepare the fetal health classification data so that fetal health outcomes can be predicted accurately from cardiotocogram (CTG) data, with the aim of preventing or reducing child and maternal mortality.

A decrease in the child mortality rate is one of the United Nations’ Sustainable Development Goals. Parallel to the notion of child mortality is of course maternal mortality, which accounts for 295 000 deaths during and following pregnancy and childbirth (as of 2017). The vast majority of these deaths (94%) occurred in low-resource settings, and most could have been prevented.

Cardiotocograms (CTGs) can be used as a simple and cost-effective method for assessing fetal health and taking action to prevent child and maternal mortality. CTGs work by sending ultrasound pulses and reading their response to gather information about the fetal heart rate (FHR), fetal movements, uterine contractions, and other factors.

The cardiotocogram (CTG) is the most widely used tool in routine clinical evaluation of the fetal state, and it has enabled clinical practitioners to detect signs of fetal compromise at an early stage. It provides information on uterine contractions and fetal heart rate, which can be used to determine whether the fetus is normal, suspect, or pathological.

Having a predictive model would be a valuable tool for the healthcare industry. This healthcare facility wants to make an impact by being able to predict fetal health outcomes which would in the end prevent or decrease child and maternal mortality. 

### Dataset:

The dataset is from Kaggle and contains 2,126 rows and 22 columns: 21 features extracted from cardiotocogram (CTG) exams plus the fetal_health label, which was assigned by three expert obstetricians as one of 3 classes: Normal, Suspect, or Pathological.

The data description is detailed below:

- baseline_value: Baseline Fetal Heart Rate (FHR)
- accelerations: Number of accelerations per second
- fetal_movement: Number of fetal movements per second
- uterine_contractions: Number of uterine contractions per second
- light_decelerations: Number of LDs per second
- severe_decelerations: Number of SDs per second
- prolongued_decelerations: Number of PDs per second
- abnormal_short_term_variability: Percentage of time with abnormal short term variability
- mean_value_of_short_term_variability: Mean value of short term variability
- percentage_of_time_with_abnormal_long_term_variability: Percentage of time with abnormal long term variability
- mean_value_of_long_term_variability: Mean value of long term variability
- histogram_width: Width of the histogram made using all values from a record
- histogram_min: Histogram minimum value
- histogram_max: Histogram maximum value
- histogram_number_of_peaks: Number of peaks in the exam histogram
- histogram_number_of_zeroes: Number of zeroes in the exam histogram
- histogram_mode: Histogram mode
- histogram_mean: Histogram mean
- histogram_median: Histogram median
- histogram_variance: Histogram variance
- histogram_tendency: Histogram trend
- fetal_health: Fetal health:
   - 1 - Normal
   - 2 - Suspect
   - 3 - Pathological

### Citation:
Ayres de Campos et al. (2000) SisPorto 2.0: A Program for Automated Analysis of Cardiotocograms. J Matern Fetal Med 5:311-318


## PART THREE

## Preprocessing and Training Data Development

Remember the question we are trying to answer: **To predict fetal health outcomes accurately using Cardiotocograms (CTGs) data to prevent or decrease child and maternal mortality.**

Preprocessing and Training Data Development Task:

1. Create dummy or indicator features for categorical variables. Use get_dummies().
2. Standardize the magnitude of numeric features using a scaler. Make a scaler object and fit it to the data.
3. Split the data into training and testing datasets.

Goal: Create a cleaned development dataset that can be used to complete the modeling step of the project. This step prepares the data for fitting models.
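
The columns shown in this notebook are all numeric, so step 1 has little to do here. For reference, the sketch below shows what get_dummies() would produce if the label were stored as text. The `labels` series is a hypothetical stand-in, not a column from the dataset.

```python
import pandas as pd

# Hypothetical string-coded version of the label, for illustration only
labels = pd.Series(["Normal", "Suspect", "Pathological", "Normal"], name="fetal_health")

# One indicator column per category; dtype=int gives 0/1 instead of booleans
dummies = pd.get_dummies(labels, dtype=int)
print(dummies)
```

Each row has exactly one indicator set to 1, and the columns come out in sorted category order.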

In [29]:
# Imports (duplicates removed, grouped by library)
import os
import glob
import random
from pprint import pprint

import numpy as np
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display
from colorama import Fore, Back, Style  # needs: pip install colorama

import scipy.stats as stats
from scipy.stats import spearmanr
import statsmodels.api as sm
import statsmodels.stats.weightstats as ws
from statsmodels.formula.api import ols
from statsmodels.stats.proportion import proportions_ztest

import sklearn.model_selection
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures, OneHotEncoder
from sklearn.preprocessing import OneHotEncoder as OHE
from sklearn.model_selection import train_test_split, KFold, StratifiedKFold, GridSearchCV
from sklearn.metrics import accuracy_score, f1_score, recall_score, make_scorer
from sklearn.metrics import classification_report, confusion_matrix, ConfusionMatrixDisplay
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans
from sklearn.svm import SVC

# fix the seed of Python's random module for reproducibility
random.seed(1)

This is the best place to put all the libraries needed for the project.

In [30]:
fhc_dataset2 = pd.read_csv('archive/fetal_health.csv')

In [31]:
fhc_dataset2 = fhc_dataset2.rename(columns={"baseline value":
                             "baseline_value"}) 

In [32]:
fhc_dataset2.head().T

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| baseline_value | 120.0 | 132.0 | 133.0 | 134.0 | 132.0 |
| accelerations | 0.0 | 0.006 | 0.003 | 0.003 | 0.007 |
| fetal_movement | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| uterine_contractions | 0.0 | 0.006 | 0.008 | 0.008 | 0.008 |
| light_decelerations | 0.0 | 0.003 | 0.003 | 0.003 | 0.0 |
| severe_decelerations | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| prolongued_decelerations | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| abnormal_short_term_variability | 73.0 | 17.0 | 16.0 | 16.0 | 16.0 |
| mean_value_of_short_term_variability | 0.5 | 2.1 | 2.1 | 2.4 | 2.4 |
| percentage_of_time_with_abnormal_long_term_variability | 43.0 | 0.0 | 0.0 | 0.0 | 0.0 |


In [33]:
fhc_dataset2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2126 entries, 0 to 2125
Data columns (total 22 columns):
 #   Column                                                  Non-Null Count  Dtype  
---  ------                                                  --------------  -----  
 0   baseline_value                                          2126 non-null   float64
 1   accelerations                                           2126 non-null   float64
 2   fetal_movement                                          2126 non-null   float64
 3   uterine_contractions                                    2126 non-null   float64
 4   light_decelerations                                     2126 non-null   float64
 5   severe_decelerations                                    2126 non-null   float64
 6   prolongued_decelerations                                2126 non-null   float64
 7   abnormal_short_term_variability                         2126 non-null   float64
 8   mean_value_of_short_term_variability  

fhc_dataset2 stores fetal_health as float64, but the column is categorical in meaning: the values 1.0, 2.0 and 3.0 stand for Normal, Suspect and Pathological.

In [34]:
fhc_dataset2.describe().T

| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| baseline_value | 2126.0 | 133.303857 | 9.840844 | 106.0 | 126.0 | 133.0 | 140.0 | 160.0 |
| accelerations | 2126.0 | 0.003178 | 0.003866 | 0.0 | 0.0 | 0.002 | 0.006 | 0.019 |
| fetal_movement | 2126.0 | 0.009481 | 0.046666 | 0.0 | 0.0 | 0.0 | 0.003 | 0.481 |
| uterine_contractions | 2126.0 | 0.004366 | 0.002946 | 0.0 | 0.002 | 0.004 | 0.007 | 0.015 |
| light_decelerations | 2126.0 | 0.001889 | 0.00296 | 0.0 | 0.0 | 0.0 | 0.003 | 0.015 |
| severe_decelerations | 2126.0 | 3e-06 | 5.7e-05 | 0.0 | 0.0 | 0.0 | 0.0 | 0.001 |
| prolongued_decelerations | 2126.0 | 0.000159 | 0.00059 | 0.0 | 0.0 | 0.0 | 0.0 | 0.005 |
| abnormal_short_term_variability | 2126.0 | 46.990122 | 17.192814 | 12.0 | 32.0 | 49.0 | 61.0 | 87.0 |
| mean_value_of_short_term_variability | 2126.0 | 1.332785 | 0.883241 | 0.2 | 0.7 | 1.2 | 1.7 | 7.0 |
| percentage_of_time_with_abnormal_long_term_variability | 2126.0 | 9.84666 | 18.39688 | 0.0 | 0.0 | 0.0 | 11.0 | 91.0 |


The features in fhc_dataset2 sit on very different scales: baseline_value ranges from 106 to 160, the percentage features run from 0 up to about 91, and the per-second counts stay well below 1. Min-max scaling maps every feature onto the range 0 to 1 so that the magnitudes of the numeric features are consistent.
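
As a quick illustration of what min-max scaling does, the sketch below applies the formula x' = (x - min) / (max - min) to the min, quartile and max values of baseline_value from the describe() table. The helper `min_max_scale` is a hypothetical name written for this example; the notebook itself uses scikit-learn's MinMaxScaler.

```python
def min_max_scale(values):
    """Map a list of numbers linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column is mapped to 0.0 to avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# min, 25%, 50%, 75% and max of baseline_value from the describe() table
baseline = [106.0, 126.0, 133.0, 140.0, 160.0]
scaled = min_max_scale(baseline)
print([round(s, 5) for s in scaled])  # [0.0, 0.37037, 0.5, 0.62963, 1.0]
```

These are the same values that appear in the baseline_value row of the scaled summary statistics.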


In [35]:
# Scaling the data; all the features get scaled except fetal_health (the last column)
cols = fhc_dataset2.columns[:-1]
# retrieve just the numeric input values for the features, excluding fetal_health
data = fhc_dataset2.values[:, :-1]
# perform a min-max transform, mapping each feature onto the range [0, 1]
trans = MinMaxScaler()
data = trans.fit_transform(data)
# convert the array back to a dataframe with the original column names
# (set_axis(..., inplace=True) no longer works in pandas 2.0 and later)
fhc_dataset3 = DataFrame(data, columns=cols)
fhc_dataset3.describe().T

| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| baseline_value | 2126.0 | 0.505627 | 0.182238 | 0.0 | 0.37037 | 0.5 | 0.62963 | 1.0 |
| accelerations | 2126.0 | 0.167277 | 0.203452 | 0.0 | 0.0 | 0.105263 | 0.315789 | 1.0 |
| fetal_movement | 2126.0 | 0.01971 | 0.097018 | 0.0 | 0.0 | 0.0 | 0.006237 | 1.0 |
| uterine_contractions | 2126.0 | 0.291094 | 0.196405 | 0.0 | 0.133333 | 0.266667 | 0.466667 | 1.0 |
| light_decelerations | 2126.0 | 0.125964 | 0.197347 | 0.0 | 0.0 | 0.0 | 0.2 | 1.0 |
| severe_decelerations | 2126.0 | 0.003293 | 0.0573 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| prolongued_decelerations | 2126.0 | 0.031703 | 0.11799 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| abnormal_short_term_variability | 2126.0 | 0.466535 | 0.229238 | 0.0 | 0.266667 | 0.493333 | 0.653333 | 1.0 |
| mean_value_of_short_term_variability | 2126.0 | 0.166586 | 0.129888 | 0.0 | 0.073529 | 0.147059 | 0.220588 | 1.0 |
| percentage_of_time_with_abnormal_long_term_variability | 2126.0 | 0.108205 | 0.202164 | 0.0 | 0.0 | 0.0 | 0.120879 | 1.0 |


Reminder: fhc_dataset3 is the same as fhc_dataset2 with the features scaled. The fetal_health column is added back unscaled in the next cell.

In [36]:
fhc_dataset3['fetal_health']= fhc_dataset2['fetal_health']
fhc_dataset3.describe(include='all').T

| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| baseline_value | 2126.0 | 0.505627 | 0.182238 | 0.0 | 0.37037 | 0.5 | 0.62963 | 1.0 |
| accelerations | 2126.0 | 0.167277 | 0.203452 | 0.0 | 0.0 | 0.105263 | 0.315789 | 1.0 |
| fetal_movement | 2126.0 | 0.01971 | 0.097018 | 0.0 | 0.0 | 0.0 | 0.006237 | 1.0 |
| uterine_contractions | 2126.0 | 0.291094 | 0.196405 | 0.0 | 0.133333 | 0.266667 | 0.466667 | 1.0 |
| light_decelerations | 2126.0 | 0.125964 | 0.197347 | 0.0 | 0.0 | 0.0 | 0.2 | 1.0 |
| severe_decelerations | 2126.0 | 0.003293 | 0.0573 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| prolongued_decelerations | 2126.0 | 0.031703 | 0.11799 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| abnormal_short_term_variability | 2126.0 | 0.466535 | 0.229238 | 0.0 | 0.266667 | 0.493333 | 0.653333 | 1.0 |
| mean_value_of_short_term_variability | 2126.0 | 0.166586 | 0.129888 | 0.0 | 0.073529 | 0.147059 | 0.220588 | 1.0 |
| percentage_of_time_with_abnormal_long_term_variability | 2126.0 | 0.108205 | 0.202164 | 0.0 | 0.0 | 0.0 | 0.120879 | 1.0 |


Splitting the data into training and testing datasets in preparation for modeling. Our y variable is fetal_health.

In [37]:
# doing a stratified split based on fetal_health
dfy = fhc_dataset3['fetal_health']
dfX = fhc_dataset3.copy()
dfX.drop('fetal_health', axis=1, inplace=True)
Xtrain, Xtest, ytrain, ytest = train_test_split(dfX, dfy, stratify=dfy,
                                                train_size = 0.80,
                                                random_state = 42)

In [38]:
#inspecting the shape of the split data
print(Xtrain.shape, Xtest.shape, ytrain.shape, ytest.shape)

(1700, 21) (426, 21) (1700,) (426,)


In [39]:
#inspecting the percentages for each of the target classes in the training data
ytrain.value_counts()/ytrain.shape[0] * 100.00

1.0    77.823529
2.0    13.882353
3.0     8.294118
Name: fetal_health, dtype: float64

In [40]:
#inspecting the percentages for each of the target classes in the test data. These values should closely match above
ytest.value_counts()/ytest.shape[0] * 100.00

1.0    77.934272
2.0    13.849765
3.0     8.215962
Name: fetal_health, dtype: float64
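
One caveat about the order of operations: the scaler here was fit on all 2,126 rows before the split, so the minimum and maximum of the test rows influence the scaled training features. A common alternative is to split first and fit the scaler on the training rows only. The sketch below shows that order on synthetic data. The arrays `X` and `y` are made-up stand-ins for the CTG features and labels, not the real dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(42)
# Synthetic stand-in: 200 rows, 3 columns on very different scales
X = rng.normal(loc=[130.0, 0.003, 45.0], scale=[10.0, 0.002, 15.0], size=(200, 3))
# Imbalanced three-class target, roughly mirroring the 78/14/8 class mix
y = rng.choice([1, 2, 3], size=200, p=[0.78, 0.14, 0.08])

# Split first, stratifying on the label
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, train_size=0.80, random_state=42
)

scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn min/max from training rows only
X_test_scaled = scaler.transform(X_test)        # reuse the training min/max

print(X_train_scaled.min(axis=0), X_train_scaled.max(axis=0))
```

With this order, scaled test values can fall slightly outside the 0 to 1 range, which is expected because the test rows played no part in choosing the bounds.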

### SUMMARY

The following tasks were performed:

We started by importing the libraries needed for preprocessing and loading the dataset ('archive/fetal_health.csv').

We reviewed the first five rows, the dtypes, and the summary statistics of the dataset. fetal_health is stored as float64 but is categorical in meaning: 1 is Normal, 2 is Suspect and 3 is Pathological.

The features sit on very different scales (for example, baseline_value ranges from 106 to 160 while the per-second counts stay well below 1), so the data needed min-max scaling to make the magnitudes of the numeric features consistent.

We scaled every feature with MinMaxScaler (fit_transform) before splitting the dataset. fetal_health was left unscaled.

We chose fetal_health as our dependent (response) variable and applied a stratified train_test_split() to create the training set that will be used for modeling.

A train/test split is a model validation procedure that simulates how a model would perform on new, unseen data. A 70/30 or 80/20 split is a common rule of thumb for holding out test data.

We used the 80/20 split, which gave 1,700 training rows and 426 test rows.

## PART THREE

## Modeling

In this section:

The goal of the modeling step is to develop a final model that effectively predicts the stated goal in the problem identification section.


Remember the question we are trying to answer: To predict fetal health outcomes accurately using Cardiotocograms (CTGs) data to prevent or decrease child and maternal mortality.

Modeling Tasks:

1. Fit the models with the training dataset.
   Try a number of different models and compare outputs in the model evaluation stage.
   Use hyperparameter tuning methods like cross validation.
2. Review model outcomes and iterate over additional models as needed.
   Use standard model evaluation metrics such as accuracy, recall, precision and F1.
3. Identify the final model that is the best model for the project.
   Hint: the most powerful model is not always the best one to use. 
   Consider computational complexity, scalability and maintenance cost.
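
The tuning step in task 1 can be sketched as a grid search with stratified cross-validation. This is only an illustration under stated assumptions: the data comes from make_classification rather than the CTG file, the grid over `C` is arbitrary, and macro-averaged recall is just one possible choice of scoring metric.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic three-class data with 21 features and an imbalanced class mix
X, y = make_classification(
    n_samples=300, n_features=21, n_informative=8, n_classes=3,
    weights=[0.78, 0.14, 0.08], random_state=42,
)

# Scaling lives inside the pipeline, so it is refit on each training fold
pipe = Pipeline([
    ("scale", MinMaxScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

grid = GridSearchCV(
    pipe,
    param_grid={"clf__C": [0.1, 1.0, 10.0]},  # illustrative grid only
    scoring="recall_macro",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

Swapping in another estimator only requires changing the `clf` step and its parameter grid.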

   

#### MODELING THE DATASET

In [41]:
#Import ML models:

from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import LogisticRegression
#from sklearn.grid_search import GridSearchCV
from sklearn.metrics import accuracy_score,recall_score,f1_score 
#from sklearn.learning_curve import validation_curve

My Dataset and Applying the Machine Learning models:

The fetal health classification dataset is a multiclass classification problem. In multi-class classification, the goal is to classify the input into one of several classes or categories.

Multiclass classification problems with an imbalanced dataset present a different challenge than a binary classification problem. The skewed distribution makes many conventional machine learning algorithms less effective, especially in predicting minority class examples. We therefore first look at the problem at hand and then discuss ways to address it.

This is a multiclass classification problem. We are interested in ACCURACY, which is the proportion of all predictions that were correct, and in RECALL, which is the proportion of the actual instances of a class that the model correctly identifies. The objective is to minimize false negatives and err on the side of caution.

We would like high recall for the suspect and pathological classes even if it gives us a large number of false positives. A false positive here means a healthy fetus is flagged as suspect or pathological, which can be resolved by additional medical testing. We prefer a high false positive rate to a high false negative rate, because a false negative means a suspect or pathological case goes undiagnosed.
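
To make the two metrics concrete, the sketch below computes accuracy and per-class recall by hand on a small set of made-up labels. The label lists and helper names are hypothetical and chosen only for illustration.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Share of all predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def recall_per_class(y_true, y_pred):
    """For each class: correctly predicted instances / actual instances."""
    actual = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / actual[label] for label in sorted(actual)}

# Made-up labels: 1 = Normal, 2 = Suspect, 3 = Pathological
y_true = [1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 1, 1, 1, 2, 2, 1, 3, 1]

print(accuracy(y_true, y_pred))          # 0.7
print(recall_per_class(y_true, y_pred))  # class 1: 5/6, class 2: 0.5, class 3: 0.5
```

An accuracy of 0.7 looks acceptable on its own, yet half of the pathological cases were missed, which is why recall on the minority classes matters for this problem.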

For a multiclass problem with an imbalanced dependent variable, Stratified K-Fold is the appropriate type of cross-validation. Stratified K-Fold is an enhanced version of K-Fold cross-validation that is mainly used for imbalanced datasets. Just as in K-Fold, the whole dataset is divided into K folds of equal size, but here each fold keeps the same ratio of target classes as the whole dataset.
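
A small sketch of that behaviour, using a made-up target of 80 Normal, 15 Suspect and 5 Pathological labels: every test fold produced by StratifiedKFold should hold the same class mix.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Made-up imbalanced target: 80 Normal, 15 Suspect, 5 Pathological
y = np.array([1] * 80 + [2] * 15 + [3] * 5)
X = np.zeros((len(y), 1))  # feature values do not affect how the folds are drawn

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_counts = []
for train_idx, test_idx in skf.split(X, y):
    labels, counts = np.unique(y[test_idx], return_counts=True)
    fold_counts.append(dict(zip(labels.tolist(), counts.tolist())))
    print(fold_counts[-1])
```

Each of the five test folds contains 16 Normal, 3 Suspect and 1 Pathological label, the same 80/15/5 ratio as the full target.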





#### Applying the Machine Learning models

Here are the following classification models I will be using:

    Logistic Regression
    K-Nearest Neighbor (KNN)
    Gradient Boost
    Support vector machine (SVM)
    Random Forest (entropy and gini)
    Naive Bayes
    

In [42]:
import numpy as np

from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.metrics import f1_score, make_scorer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans
from sklearn.svm import SVC

from pprint import pprint

In [43]:
# multi-class classification with Keras
import pandas
from keras.models import Sequential
from keras.layers import Dense
from scikeras.wrappers import KerasClassifier
from keras.utils import to_categorical
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline

dummy_y = to_categorical(dfy)
 

In [None]:
# define baseline model
def baseline_model():
    # create model: 21 input features; 4 output units because to_categorical()
    # on labels 1-3 produces columns 0-3 (column 0 is never used)
    model = Sequential()
    model.add(Dense(8, input_dim=21, activation='relu'))
    model.add(Dense(4, activation='softmax'))
    # compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# scikeras takes the build function through `model`; `build_fn` is deprecated
estimator = KerasClassifier(model=baseline_model, epochs=200, batch_size=5, verbose=0)
kfold = KFold(n_splits=10, shuffle=True)
results = cross_val_score(estimator, dfX, dummy_y, cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

  X, y = self._initialize(X, y)
  X, y = self._initialize(X, y)


In [44]:
import tensorflow as tf
print("TensorFlow version:", tf.__version__)

from tensorflow.keras.layers import Dense, Flatten, Conv2D
from tensorflow.keras import Model


TensorFlow version: 2.13.0


In [45]:
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, GRU

EMBEDDING_DIM = 100

print('Build model...')

model = Sequential()
#model.add(Embedding(EMBEDDING_DIM, input_length=len(Xtrain)))
# Note: no input shape is declared, so the model is not built until it sees
# data; calling summary() before that raises the ValueError shown below.
# A GRU layer also expects 3D input (samples, timesteps, features).
model.add(GRU(units=32, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(3, activation='softmax'))

# try using different optimizers and different optimizer configs
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

print('Summary of the build model...')
print(model.summary())



Build model...
Summary of the build model...


ValueError: This model has not yet been built. Build the model first by calling `build()` or by calling the model on a batch of data.

In [123]:
num_epochs = 10
batch_size = 128
# the keyword is batch_size (not batch_size_size)
history = model.fit(Xtrain, ytrain, batch_size=batch_size, epochs=num_epochs, verbose=2, validation_split=0.2)

NameError: name 'model' is not defined

In [124]:
#Evaluate the model
score, acc = model.evaluate(Xtest, ytest, batch_size=batch_size, verbose=2)

print('Test accuracy:', acc)

NameError: name 'model' is not defined

In [120]:
from sklearn.neighbors import KNeighborsClassifier
#from sklearn.metrics import plot_roc_curve

# Apply KNN model to training data:

knn = KNeighborsClassifier()
knn.fit(Xtrain,ytrain)

# Predict using model:

y_predict_knn=knn.predict(Xtest)

Accuracy_knn = accuracy_score(ytest,y_predict_knn)
print("Accuracy: "+str(Accuracy_knn))
# fetal_health has three classes, so recall and F1 need an explicit averaging
# mode; the default average='binary' raises a ValueError on a multiclass target.
# 'macro' (unweighted mean over the classes) is one reasonable choice here.
Recall_knn = recall_score(ytest,y_predict_knn, average='macro')
print("Recall: "+str(Recall_knn))
F1_knn = f1_score(ytest,y_predict_knn, average='macro')
print("F1: "+str(F1_knn))

#knn_disp= plot_roc_curve(knn,X_test,y_test)

Accuracy: 0.8826291079812206
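
recall_score and f1_score default to average='binary', which is why they raise a ValueError on a three-class target such as fetal_health. The sketch below shows how the averaging options differ, using made-up labels rather than the notebook's predictions.

```python
from sklearn.metrics import f1_score, recall_score

# Made-up labels: 6 Normal (1), 2 Suspect (2), 2 Pathological (3)
y_true = [1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 1, 1, 1, 1, 2, 1, 3, 1]

# average=None returns one recall value per class
per_class = recall_score(y_true, y_pred, average=None)
# 'macro' is the unweighted mean of the per-class values
macro_recall = recall_score(y_true, y_pred, average='macro')
# 'weighted' weights each class by its number of true instances
weighted_recall = recall_score(y_true, y_pred, average='weighted')
macro_f1 = f1_score(y_true, y_pred, average='macro')

print(per_class, macro_recall, weighted_recall, macro_f1)
```

Here the per-class recalls are 1.0, 0.5 and 0.5, so macro recall is about 0.67 while weighted recall is 0.8. On an imbalanced target the weighted average is dominated by the majority class, so macro averaging gives a more cautious picture of minority-class performance.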

In [48]:
from sklearn.ensemble import ExtraTreesClassifier, GradientBoostingClassifier
gbc = GradientBoostingClassifier(subsample=0.8, learning_rate=0.05 , n_estimators=160, random_state=5, max_depth=9, max_leaf_nodes=100)
gbc.fit(Xtrain, ytrain)

#Predict using the model:

y_predict_gbc = gbc.predict(Xtest)

Accuracy_gbc = accuracy_score(ytest,y_predict_gbc)
print("Accuracy: "+str(Accuracy_gbc))
# multiclass target: pass an explicit averaging mode ('macro' is one reasonable
# choice); the default average='binary' raises a ValueError here
Recall_gbc = recall_score(ytest,y_predict_gbc, average='macro')
print("Recall: "+str(Recall_gbc))
F1_gbc = f1_score(ytest,y_predict_gbc, average='macro')
print("F1: "+str(F1_gbc))

Accuracy: 0.9577464788732394

In [74]:
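from sklearn.naive_bayes import GaussianNB
nb = GaussianNB()
# Xtrain1/ytrain1 are not defined in this notebook, and a target scaled to
# 0, 0.5, 1 fails with "Unknown label type". This cell therefore uses the split
# from above, where ytrain holds the original labels 1, 2 and 3.
nb.fit(Xtrain,ytrain)

#Predict using the model:

y_predict_nb=nb.predict(Xtest)

Accuracy_nb = accuracy_score(ytest,y_predict_nb)
print("Accuracy: "+str(Accuracy_nb))
# multiclass target: an explicit averaging mode is required
Recall_nb = recall_score(ytest,y_predict_nb, average='macro')
print("Recall: "+str(Recall_nb))
F1_nb = f1_score(ytest,y_predict_nb, average='macro')
print("F1: "+str(F1_nb))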
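
scikit-learn classifiers reject a target holding values such as 0, 0.5 and 1 with "Unknown label type", which is what happens if fetal_health is passed through MinMaxScaler together with the features. The sketch below uses made-up arrays to show how scikit-learn interprets the two versions of the label. The final mapping back to integers assumes the original 1 to 3 coding.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Made-up labels: the target after min-max scaling vs. the original coding
scaled_labels = np.array([0.0, 0.5, 1.0, 0.0, 0.0, 0.0, 0.5, 0.0])
original_labels = np.array([1.0, 2.0, 3.0, 1.0, 1.0, 1.0, 2.0, 1.0])

print(type_of_target(scaled_labels))    # continuous
print(type_of_target(original_labels))  # multiclass

# Undo the scaling (assumes labels 1-3 were mapped linearly onto 0-1)
restored = np.rint(scaled_labels * 2 + 1).astype(int)
print(restored.tolist())                # [1, 2, 3, 1, 1, 1, 2, 1]
```

Leaving the label out of the scaling step altogether, as the scaling cell in this notebook does, avoids the problem in the first place.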

In [125]:
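# SMOTE comes from the imbalanced-learn package (pip install imbalanced-learn)
from imblearn.over_sampling import SMOTE
# named `smote` rather than `sm`, which would shadow the statsmodels.api alias imported earlier
smote = SMOTE(random_state=2)
Xtrain_res, ytrain_res = smote.fit_resample(Xtrain, ytrain)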


ModuleNotFoundError: No module named 'imblearn'

In [None]:
ytrain_res.value_counts()
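
When imbalanced-learn is not available, plain random oversampling is a simpler stand-in. Note that it is a different technique from SMOTE: it duplicates existing minority rows instead of synthesizing new ones by interpolation. The sketch below uses NumPy only. The function `random_oversample` and the toy arrays are hypothetical and written for this illustration.

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until every class matches the largest."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # draw the missing rows with replacement (zero extra rows for the largest class)
        extra = rng.choice(idx, size=target - count, replace=True)
        keep.append(np.concatenate([idx, extra]))
    order = np.concatenate(keep)
    return X[order], y[order]

# Toy data: 6 rows of class 1, 2 rows each of classes 2 and 3
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.array([1, 1, 1, 1, 1, 1, 2, 2, 3, 3])

X_res, y_res = random_oversample(X, y)
print(np.unique(y_res, return_counts=True))
```

As with SMOTE, resampling of this kind is normally applied to the training split only, so that the test set keeps the original class mix.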

### SUMMARY

This dataset is a multiclass classification problem with an imbalanced dependent variable. A multiclass classification has more than two classes, and each sample can be labeled with only one class.

A dataset is imbalanced when one class has overwhelmingly more samples than another, that is, when the classes are not represented equally. This imbalance causes two problems: training is inefficient because most samples are easy examples that contribute no useful learning signal, and the easy examples can overwhelm training and lead to degenerate models.




Which type of cross-validation is used for an imbalanced dataset?
Stratified K-Fold. It is an enhanced version of K-Fold cross-validation that is mainly used for imbalanced datasets. Just as in K-Fold, the whole dataset is divided into K folds of equal size, but each fold keeps the same ratio of target classes as the whole dataset.

What is the multi-class classification technique?
Multi-class classification is the classification technique that assigns each test sample to one of the several class labels present in the training data.

Which model is good for multiclass classification?
Decision trees and logistic regression can be used for multiclass classification. Other machine learning algorithms that handle this problem include neural networks, Naive Bayes, and SVM.

Can I use cross-entropy loss for multiclass classification?
Yes. Categorical cross-entropy is a softmax activation followed by a cross-entropy loss, and it is used for multiclass classification. With this loss, a neural network can be trained to output a probability over the N classes for each sample.