# Fetal Health Classification

## Motivation

Cardiotocograms (CTGs) are a simple and affordable way to assess fetal health. They allow healthcare professionals to act in time to prevent child and maternal mortality.

There were 295,000 deaths during and following pregnancy and childbirth in 2017. The vast majority of these deaths (94%) occurred in low-resource settings, and most could have been prevented.

We'll predict fetal_health from CTG data. The goal is to detect the risk of death early enough to respond to it.

## Target

We aim for good predictions of the "fetal_health" class.

This class is the result of a cardiotocogram exam, as classified by three expert obstetricians. Each label is one of the following three:

- Normal
- Suspect
- Pathological

## Evaluation

I'll evaluate my model with the F1 score.

$$
  F_1 = \frac{2}{\frac{1}{\mathrm{recall}} + \frac{1}{\mathrm{precision}}} = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}
$$

Here, TP is True Positive, TN is True Negative, FP is False Positive and FN is False Negative.

The highest possible value is 1. This indicates perfect precision and recall. The lowest possible value is 0. This indicates either the precision or the recall is zero.
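
As a worked example of the formula, the sketch below computes the per-class F1 from a small, made-up 3x3 confusion matrix and averages it across classes (the macro average used later in this notebook). The matrix values are invented for illustration and are not results from this dataset.

```python
import numpy as np

# Hypothetical confusion matrix: rows = true class, columns = predicted class.
cm = np.array([
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
])

tp = np.diag(cm)              # correctly predicted samples per class
fp = cm.sum(axis=0) - tp      # predicted as the class but actually another one
fn = cm.sum(axis=1) - tp      # belonging to the class but predicted as another one

f1_per_class = 2 * tp / (2 * tp + fp + fn)
macro_f1 = f1_per_class.mean()    # unweighted mean over the three classes

print(f1_per_class, macro_f1)
```

Macro averaging gives every class the same weight regardless of how many samples it has.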

I also check the AUC and ROC curves.

I use 67% of the dataset for training and 33% for testing.

# <div class="alert alert-block alert-info">Load data and library</div>

In [None]:
import gc
from itertools import cycle
import random

import matplotlib.pyplot as plt
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
from numpy import interp  # scipy.interp is deprecated and removed in newer SciPy releases; numpy.interp is equivalent
import seaborn as sns
from sklearn.metrics import f1_score
from sklearn.preprocessing import label_binarize
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import KFold
from sklearn.model_selection import train_test_split

from optuna.integration import lightgbm as lgb
#import lightgbm as lgb

In [None]:
def fix_seed(seed):
    # random
    random.seed(seed)
    # Numpy
    np.random.seed(seed)

SEED = 42
fix_seed(SEED)

In [None]:
!ls ../input/fetal-health-classification

In [None]:
fetal_health = pd.read_csv("../input/fetal-health-classification/fetal_health.csv")

# <div class="alert alert-block alert-info">Checking data overview</div>

In [None]:
fetal_health.head()

In [None]:
fetal_health.info()

There are no null values, and all columns are of type float64.
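
The same check can be made programmatically instead of reading the `info()` output. A minimal sketch on a small stand-in frame (the `demo` frame and its values are made up; in the notebook the same calls would apply to `fetal_health`):

```python
import pandas as pd

# Stand-in for the real data: two column names from the dataset, invented values.
demo = pd.DataFrame({
    "baseline value": [120.0, 132.0, 133.0],
    "fetal_health": [2.0, 1.0, 1.0],
})

n_missing = int(demo.isnull().sum().sum())             # total number of missing cells
all_float64 = bool((demo.dtypes == "float64").all())   # True if every column is float64

print(n_missing, all_float64)
```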

In [None]:
fetal_health.describe()

Scales differ from column to column. In this notebook I'll use LightGBM, so I won't normalize them, because I don't expect a tree-based model such as LightGBM to be affected much by feature scale.
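
The intuition is that a decision tree only compares a feature against thresholds, so a split depends on the ordering of the values and not on their scale. The sketch below uses a hand-written single-split search (a toy stand-in, not LightGBM's actual algorithm) to show that rescaling a feature leaves the chosen partition unchanged.

```python
import numpy as np

def best_split_mask(x, y):
    """Return the boolean mask (x <= threshold) of the split with the fewest errors.

    Each side predicts its majority class; labels are 0 or 1.
    """
    best_errors, best_mask = None, None
    values = np.unique(x)
    # Candidate thresholds: midpoints between consecutive distinct values.
    for threshold in (values[:-1] + values[1:]) / 2:
        mask = x <= threshold
        errors = 0
        for side in (y[mask], y[~mask]):
            # Misclassified samples when the side predicts its majority class.
            errors += min(np.sum(side == 0), np.sum(side == 1))
        if best_errors is None or errors < best_errors:
            best_errors, best_mask = errors, mask
    return best_mask

x = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.75])
y = np.array([0, 0, 0, 1, 1, 1])

mask_raw = best_split_mask(x, y)
mask_scaled = best_split_mask(1000 * x + 5, y)  # same feature on a different scale

print(np.array_equal(mask_raw, mask_scaled))
```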

In [None]:
def plot_with_seaborn(fetal_health):
    """Plot the distribution of every column on an 11x2 grid."""
    # Discrete columns are drawn as count plots, the others as histograms with a KDE.
    # sns.histplot replaces the deprecated sns.distplot.
    count_cols = {"severe_decelerations", "prolongued_decelerations",
                  "histogram_number_of_zeroes", "fetal_health"}
    colors = cycle(['orange', 'darkgoldenrod', 'darkkhaki', 'olive', 'lime',
                    'darkturquoise', 'blue', 'violet', 'darkmagenta'])
    fig, axes = plt.subplots(11, 2, figsize=(10, 40))
    fig.suptitle("Distributions of values of dataset")
    for col, ax in zip(fetal_health.columns, axes.ravel()):
        if col in count_cols:
            sns.countplot(x=fetal_health[col], ax=ax)
        else:
            sns.histplot(fetal_health[col], kde=True, color=next(colors), ax=ax)

In [None]:
plot_with_seaborn(fetal_health)

Some columns have skewed distributions. That worries me, but for now I'll use all columns as input.
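
One way to quantify the skew rather than judging it from the plots is the sample skewness, which is about 0 for a symmetric distribution and positive for a long right tail. A minimal sketch with made-up values (pandas offers the same idea through `DataFrame.skew()`, which additionally applies a bias correction):

```python
import numpy as np

def sample_skewness(x):
    """Third standardized moment: mean((x - mean)^3) / std^3."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return np.mean(centered ** 3) / np.std(x) ** 3

symmetric = np.arange(-5, 6)                      # mirror-symmetric around 0
right_tailed = np.array([0, 0, 0, 1, 1, 2, 10])   # one large value stretches the right tail

print(sample_skewness(symmetric), sample_skewness(right_tailed))
```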

The correlation matrix is shown below.

In [None]:
fig = plt.figure(figsize=(15, 15))
corr = fetal_health.corr()
sns.heatmap(corr, square=True, annot=True)

prolongued_decelerations, abnormal_short_term_variability and percentage_of_time_with_abnormal_long_term_variability are highly correlated with fetal_health.
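
Rather than reading the values off the heatmap, the features can be ranked by the absolute value of their correlation with the target. A minimal sketch on a made-up frame (`demo`, its column names and its values are invented; in the notebook, `corr` would come from `fetal_health.corr()` and the target column would be `fetal_health`):

```python
import pandas as pd

# Invented data: "strong" tracks the target, "weak" barely does.
demo = pd.DataFrame({
    "strong": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "weak":   [3.0, 1.0, 2.0, 3.0, 1.0, 2.5],
    "target": [1.0, 1.0, 2.0, 2.0, 3.0, 3.0],
})

corr = demo.corr()
# Absolute correlation of every feature with the target, strongest first.
ranking = corr["target"].drop("target").abs().sort_values(ascending=False)

# Split the features at the 0.30 threshold that this notebook uses for its pair plots.
high = ranking[ranking > 0.30].index.tolist()
low = ranking[ranking <= 0.30].index.tolist()

print(ranking)
print(high, low)
```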

The pair plots follow. A single plot would be too big, so I divide the columns into two groups: those highly correlated with fetal_health and those that are not.

In [None]:
# sns.pairplot creates its own figure, so a separate plt.figure() call is not needed.
sns.pairplot(fetal_health[corr[abs(corr["fetal_health"]) > 0.30].index])

In [None]:
# sns.pairplot creates its own figure, so a separate plt.figure() call is not needed.
sns.pairplot(fetal_health[corr[abs(corr["fetal_health"]) <= 0.30].index])

# <div class="alert alert-block alert-info">Training</div>

In this notebook, I choose LightGBM as the model and tune its hyperparameters with LightGBM Tuner.

LightGBM Tuner is very easy to use, yet it tunes the model effectively.

For more on LightGBM Tuner, see the following resources.

- https://medium.com/optuna/lightgbm-tuner-new-optuna-integration-for-hyperparameter-optimization-8b7095e99258

- https://optuna.readthedocs.io/en/stable/_modules/optuna/integration/lightgbm.html
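
According to the Optuna documentation, LightGBM Tuner optimizes a handful of hyperparameters (such as `feature_fraction` and `num_leaves`) in a stepwise manner, one after another, instead of searching all of them jointly. The toy sketch below illustrates only that stepwise idea: `toy_loss` is a made-up stand-in for a validation loss, the candidate grids are arbitrary, and none of this is Optuna's actual code.

```python
def toy_loss(params):
    """Hypothetical validation loss, minimal at num_leaves=31 and feature_fraction=0.8."""
    return (params["num_leaves"] - 31) ** 2 / 1000 + (params["feature_fraction"] - 0.8) ** 2

def stepwise_tune(params, search_space, loss_fn):
    """Tune one hyperparameter at a time, fixing the best value before moving on."""
    params = dict(params)
    for name, candidates in search_space.items():
        best_value = min(candidates, key=lambda v: loss_fn({**params, name: v}))
        params[name] = best_value
    return params

start = {"num_leaves": 40, "feature_fraction": 0.85}
search_space = {
    "feature_fraction": [0.4, 0.6, 0.8, 1.0],
    "num_leaves": [15, 31, 63, 127],
}

tuned = stepwise_tune(start, search_space, toy_loss)
print(tuned)
```

A stepwise search evaluates far fewer combinations than a full grid, at the cost of possibly missing interactions between parameters.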

First, I split the dataset into features and target, and then split both into training and test sets.

In [None]:
X = fetal_health.drop(columns=["fetal_health"])
y = fetal_health["fetal_health"]

In [None]:
X, X_test, y, y_test = train_test_split(X, y, test_size=0.33, random_state=SEED)
X = X.reset_index(drop=True)
y = y.reset_index(drop=True)
X_test = X_test.reset_index(drop=True) 
y_test = y_test.reset_index(drop=True)

# LightGBM's multiclass objective expects labels in [0, num_class), so shift 1..3 to 0..2.
y = y - 1
y_test = y_test - 1
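
LightGBM's multiclass objective expects class labels in the range `[0, num_class)`, which is why the labels 1 to 3 are shifted to 0 to 2. The sketch below shows the shift and the way back. The mapping of codes to names is an assumption that follows the order of the list in the Target section (1 = Normal, 2 = Suspect, 3 = Pathological).

```python
import numpy as np

# Assumed code-to-name mapping, following the order of the list in the Target section.
CLASS_NAMES = {1: "Normal", 2: "Suspect", 3: "Pathological"}

raw_labels = np.array([1.0, 3.0, 2.0, 1.0])   # labels as stored in the CSV (float64)
shifted = (raw_labels - 1).astype(int)        # 0-based labels for training

# After prediction, add 1 again to get back to the original coding.
restored = shifted + 1
names = [CLASS_NAMES[int(code)] for code in restored]

print(shifted, names)
```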

In [None]:
params = {
    "objective": "multiclass",
    "boosting": "gbdt",
    "num_leaves": 40,
    "learning_rate": 0.05,
    "feature_fraction": 0.85,
    "reg_lambda": 2,
    "metric": "multi_logloss",
    "num_class" : 3,
}
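
The `multi_logloss` metric is the multiclass cross-entropy: the average negative log of the probability that the model assigns to the true class. A small hand computation with invented probabilities:

```python
import numpy as np

def multi_logloss(y_true, proba, eps=1e-15):
    """Mean negative log-probability of the true class."""
    proba = np.clip(proba, eps, 1 - eps)                       # avoid log(0)
    true_class_proba = proba[np.arange(len(y_true)), y_true]   # pick one entry per row
    return -np.mean(np.log(true_class_proba))

y_true = np.array([0, 1, 2])
proba = np.array([
    [0.80, 0.10, 0.10],
    [0.20, 0.50, 0.30],
    [0.25, 0.25, 0.50],
])

loss = multi_logloss(y_true, proba)
print(loss)
```

The loss shrinks toward 0 as the probability of the true class approaches 1, so lower is better.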

In [None]:
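def calc_multiclass_auc(y_test, y_pred):
    """Compute per-class and micro-averaged ROC AUC from class labels (0, 1, 2)."""
    # One-hot encode true and predicted labels so each class gets a one-vs-rest curve.
    y_test = label_binarize(y_test, classes=[0, 1, 2])
    y_pred = label_binarize(y_pred, classes=[0, 1, 2])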
    
    fpr = dict()
    tpr = dict()
    roc_auc = dict()
    for i in range(3):
        fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_pred[:, i])
        roc_auc[i] = auc(fpr[i], tpr[i])

    # Compute micro-average ROC curve and ROC area
    fpr["micro"], tpr["micro"], _ = roc_curve(y_test.ravel(), y_pred.ravel())
    roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])
    
    return roc_auc, tpr, fpr
    

In [None]:
kf = KFold(n_splits=3)
models = []
f1 = 0
auc_vals = []
tpr_vals = []
fpr_vals = []

for train_index,val_index in kf.split(X):
    train_features = X.loc[train_index]
    train_target = y.loc[train_index]
    
    val_features = X.loc[val_index]
    val_target = y.loc[val_index]
    
    d_training = lgb.Dataset(train_features, label=train_target, free_raw_data=False)
    d_val = lgb.Dataset(val_features, label=val_target, free_raw_data=False)
    
    cls = lgb.train(params, train_set=d_training, num_boost_round=1000, valid_sets=[d_val], verbose_eval=25, early_stopping_rounds=50)

    models.append(cls)
    # Predict once and reuse the labels for both metrics.
    val_pred = np.argmax(cls.predict(val_features), axis=1)
    f1 += f1_score(val_target, val_pred, average='macro')

    roc_auc, tpr, fpr = calc_multiclass_auc(val_target, val_pred)
    auc_vals.append(roc_auc)
    tpr_vals.append(tpr)
    fpr_vals.append(fpr)
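
For reference, `KFold(n_splits=3)` without shuffling cuts the rows into three contiguous blocks, and each block serves once as the validation set. A sketch on nine dummy rows:

```python
import numpy as np
from sklearn.model_selection import KFold

rows = np.arange(9).reshape(-1, 1)   # nine dummy samples

folds = []
for train_index, val_index in KFold(n_splits=3).split(rows):
    folds.append((train_index.tolist(), val_index.tolist()))
    print("train:", train_index, "validation:", val_index)
```

Because `train_test_split` already shuffled the rows before the indices were reset, contiguous folds are effectively random here.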


In [None]:
def print_tuned_params(cls, fold):
    print("---------------------")
    print(f"Tuning result of fold {fold}.")
    print("params:", cls.params)
    print("best_iteration:", cls.best_iteration)
    print("best_score:", cls.best_score)    
    print("---------------------")

In [None]:
for fold, cls in enumerate(models, start=1):
    print_tuned_params(cls, fold)

# Mean macro F1 over the validation folds.
print("F1 score:", f1 / len(models))
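
The `best_iteration` printed above is a product of early stopping: training halts once the validation metric has not improved for `early_stopping_rounds` rounds, and the best round so far is kept. A simplified pure-Python sketch of that rule (the loss values are invented, and this is not LightGBM's implementation):

```python
def find_best_iteration(val_losses, patience):
    """Return (best_iteration, stopped_at) using 1-based iteration numbers."""
    best_loss = float("inf")
    best_iteration = 0
    for iteration, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_iteration = loss, iteration
        elif iteration - best_iteration >= patience:
            # No improvement for `patience` consecutive rounds: stop training.
            return best_iteration, iteration
    return best_iteration, len(val_losses)

losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.49]
best, stopped = find_best_iteration(losses, patience=3)
print(best, stopped)
```

With a patience of 3, the loop stops before it ever sees the last, lower loss. That is the trade-off of a small patience value.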

In [None]:
# Average each AUC over the folds.
for key in ["micro", 0, 1, 2]:
    mean_auc = np.mean([auc_val[key] for auc_val in auc_vals])
    name = "auc micro:" if key == "micro" else f"auc for label {key}:"
    print(name, mean_auc)

OK, we got a good F1 score. Training seems to be going well.

I also check the AUC and ROC curves.
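
One caveat about these AUC values: `calc_multiclass_auc` receives hard labels (the `argmax` of the predicted probabilities), so each ROC curve has only a single operating point. An AUC is usually computed from the predicted probabilities themselves, for example with scikit-learn's `roc_auc_score` in one-vs-rest mode. The sketch below compares the two on invented predictions; the numbers are not from this dataset.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 1, 2, 2])
# Invented class probabilities; each row sums to 1.
proba = np.array([
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],   # true class 0, but the argmax picks class 1
    [0.3, 0.6, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
])

# AUC from probabilities: uses the full ranking of the scores.
auc_from_proba = roc_auc_score(y_true, proba, multi_class="ovr", average="macro")

# AUC from hard labels: one-hot encode the argmax before scoring.
hard = np.eye(3)[proba.argmax(axis=1)]
auc_from_labels = roc_auc_score(y_true, hard, multi_class="ovr", average="macro")

print(auc_from_proba, auc_from_labels)
```

In this example the probabilities rank every class perfectly, while the hard labels lose that information and score lower.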

In [None]:
def plot_roc_curve(tprs, fprs, fold, n_classes, lw):
    """Plot ROC curves for the multiclass problem.

    See https://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html
    """
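    # For a validation fold, pick that fold's curves; for "test", use the dicts as given.
    # n_classes and lw come from the arguments and are no longer overwritten here.
    if fold != "test":
        tpr = tprs[fold]
        fpr = fprs[fold]
    else:
        tpr = tprs
        fpr = fprs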
    
    # First aggregate all false positive rates
    all_fpr = np.unique(np.concatenate([fpr[i] for i in range(n_classes)]))

    # Then interpolate all ROC curves at these points
    mean_tpr = np.zeros_like(all_fpr)
    for i in range(n_classes):
        mean_tpr += interp(all_fpr, fpr[i], tpr[i])

    # Finally average it and compute AUC
    mean_tpr /= n_classes

    fpr["macro"] = all_fpr
    tpr["macro"] = mean_tpr

    # Compute the AUCs from the curves being plotted instead of relying on the
    # global roc_auc, which only holds the values of the last fold.
    roc_auc = {key: auc(fpr[key], tpr[key]) for key in fpr}

    # Plot all ROC curves
    plt.figure()
    plt.plot(fpr["micro"], tpr["micro"],
         label='micro-average ROC curve (area = {0:0.2f})'
               ''.format(roc_auc["micro"]),
         color='deeppink', linestyle=':', linewidth=4)

    plt.plot(fpr["macro"], tpr["macro"],
         label='macro-average ROC curve (area = {0:0.2f})'
               ''.format(roc_auc["macro"]),
         color='navy', linestyle=':', linewidth=4)

    colors = cycle(['aqua', 'darkorange', 'cornflowerblue'])
    for i, color in zip(range(n_classes), colors):
        plt.plot(fpr[i], tpr[i], color=color, lw=lw,
             label='ROC curve of class {0} (area = {1:0.2f})'
             ''.format(i, roc_auc[i]))

    plt.plot([0, 1], [0, 1], 'k--', lw=lw)
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    title = 'ROC curve of the test set' if fold == "test" else f'ROC curve of fold {fold}'
    plt.title(title)
    plt.legend(loc="lower right")
    plt.show()

In [None]:
plot_roc_curve(tpr_vals, fpr_vals, 0,  3, 2)

In [None]:
plot_roc_curve(tpr_vals, fpr_vals, 1,  3, 2)

In [None]:
plot_roc_curve(tpr_vals, fpr_vals, 2,  3, 2)

# <div class="alert alert-block alert-info">Predict</div>

In [None]:
result = np.zeros((X_test.shape[0], 3))

In [None]:
for model in models:
    result += model.predict(X_test)
f1_test = f1_score(y_test, np.argmax(result,axis=1), average='macro')
auc_test, tpr_test, fpr_test = calc_multiclass_auc(y_test, np.argmax(result,axis=1))
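
Summing the probability matrices of the fold models and taking the `argmax` is a form of soft voting: the class with the highest total (equivalently, average) probability wins. A small sketch with invented probabilities from three hypothetical models for two samples:

```python
import numpy as np

# Invented predictions: one (n_samples, n_classes) matrix per model.
model_probas = [
    np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]),
    np.array([[0.4, 0.5, 0.1], [0.1, 0.4, 0.5]]),
    np.array([[0.5, 0.4, 0.1], [0.2, 0.3, 0.5]]),
]

summed = np.zeros((2, 3))
for proba in model_probas:
    summed += proba

labels_from_sum = np.argmax(summed, axis=1)
# Dividing by the number of models does not change the ranking, so the labels match.
labels_from_mean = np.argmax(summed / len(model_probas), axis=1)

print(labels_from_sum, labels_from_mean)
```

Note that the second model alone would pick class 1 for the first sample, but the ensemble settles on class 0.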

In [None]:
f1_test

In [None]:
plot_roc_curve(tpr_test, fpr_test, "test",  3, 2)

OK, we got a good F1 score on the test set, too.

In [None]:
print("Predicted fetal_health labels are:")
np.argmax(result,axis=1) + 1