# Fetal Health Prediction

**Author: Victor Mayowa (MB;BS, Ilorin)**

**Source: [Kaggle](https://www.kaggle.com/datasets/andrewmvd/fetal-health-classification)**

<ul>
<li><a href="#title">Title page</a></li>
<li><a href="#toc">Table of contents</a></li>
<li><a href="#abbreviation">List of abbreviations</a></li>
<li><a href="#abstract">Summary</a></li>
<li><a href="#background">Background for the study</a></li>
<li><a href="#aim">Aims</a></li>
<li><a href="#methodology">Proposed methodology</a></li>
<li><a href="#ethic">Ethical considerations</a></li>
<li><a href="#reference">List of references</a></li>
<li><a href="#appendix">Appendices</a></li>
</ul>

#### List of Abbreviations

#### Summary

#### Abstract
Classify fetal health in order to prevent child and maternal mortality.

#### Context

Reduction of child mortality is reflected in several of the United Nations' Sustainable Development Goals and is a key indicator of human progress.
The UN expects countries to end preventable deaths of newborns and children under 5 years of age by 2030, with all countries aiming to reduce under‑5 mortality to at least as low as 25 per 1,000 live births.

Closely related to child mortality is maternal mortality, which accounted for **295 000 deaths** during and following pregnancy and childbirth in 2017. The vast majority of these deaths **(94%)** occurred in low-resource settings, and most could have been prevented.

In light of the above, **cardiotocograms (CTGs)** are a simple and affordable way to assess fetal health, allowing healthcare professionals to act in time to prevent child and maternal mortality. The equipment works by sending ultrasound pulses and reading the response, thus shedding light on fetal heart rate (FHR), fetal movements, uterine contractions and more.



#### Data Summary

This dataset contains **2126 records** of features extracted from cardiotocogram exams, which were then classified by three expert obstetricians into **3 classes:**

* Normal
* Suspect
* Pathological
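
Before modelling, it can help to check how the records are spread across the three classes. The sketch below counts labels and reports each class's share. It assumes the `fetal_health` column codes the classes as 1.0 = Normal, 2.0 = Suspect, 3.0 = Pathological; `CLASS_NAMES` and `class_distribution` are illustrative names, and the toy labels are made up rather than taken from the dataset.

```python
import pandas as pd

# Assumed mapping from numeric label codes to class names
CLASS_NAMES = {1.0: "Normal", 2.0: "Suspect", 3.0: "Pathological"}


def class_distribution(labels: pd.Series) -> pd.DataFrame:
    """Return the count and share of each class, indexed by class name."""
    counts = labels.map(CLASS_NAMES).value_counts()
    return pd.DataFrame({"count": counts, "share": counts / counts.sum()})


# Toy labels for illustration only; these are not values from the dataset
toy_labels = pd.Series([1.0, 1.0, 1.0, 2.0, 3.0, 1.0, 2.0, 1.0])
distribution = class_distribution(toy_labels)
print(distribution)
```

On the real data, the same function could be called as `class_distribution(df["fetal_health"])` once the CSV is loaded and before the labels are re-coded.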

#### Data Loading

In [1]:
# install all required libraries
#!pip install -U dataprep

In [2]:
import pandas as pd
import numpy as np
import seaborn as sns
import scipy
import matplotlib.pyplot as plt
import warnings

warnings.filterwarnings("ignore")
pd.set_option('display.max_columns', None)

In [11]:
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.metrics import precision_recall_fscore_support, f1_score, confusion_matrix, RocCurveDisplay, PrecisionRecallDisplay

In [4]:
df = pd.read_csv('fetal_health.csv')

In [5]:
df.head(5)

Unnamed: 0,baseline value,accelerations,fetal_movement,uterine_contractions,light_decelerations,severe_decelerations,prolongued_decelerations,abnormal_short_term_variability,mean_value_of_short_term_variability,percentage_of_time_with_abnormal_long_term_variability,mean_value_of_long_term_variability,histogram_width,histogram_min,histogram_max,histogram_number_of_peaks,histogram_number_of_zeroes,histogram_mode,histogram_mean,histogram_median,histogram_variance,histogram_tendency,fetal_health
0,120.0,0.0,0.0,0.0,0.0,0.0,0.0,73.0,0.5,43.0,2.4,64.0,62.0,126.0,2.0,0.0,120.0,137.0,121.0,73.0,1.0,2.0
1,132.0,0.006,0.0,0.006,0.003,0.0,0.0,17.0,2.1,0.0,10.4,130.0,68.0,198.0,6.0,1.0,141.0,136.0,140.0,12.0,0.0,1.0
2,133.0,0.003,0.0,0.008,0.003,0.0,0.0,16.0,2.1,0.0,13.4,130.0,68.0,198.0,5.0,1.0,141.0,135.0,138.0,13.0,0.0,1.0
3,134.0,0.003,0.0,0.008,0.003,0.0,0.0,16.0,2.4,0.0,23.0,117.0,53.0,170.0,11.0,0.0,137.0,134.0,137.0,13.0,1.0,1.0
4,132.0,0.007,0.0,0.008,0.0,0.0,0.0,16.0,2.4,0.0,19.9,117.0,53.0,170.0,9.0,0.0,137.0,136.0,138.0,11.0,1.0,1.0


In [6]:
df.shape


(2126, 22)

import torch
import math

class HDCClassifier:
    def __init__(self, n_classes, n_features, D):
        self.n_classes = n_classes
        self.n_features = n_features
        self.D = D
        self.classes = torch.zeros(n_classes, D)
        self.base_vector = torch.randn(D, n_features)
        
    def create_base_vector(self):
        self.base_vector = torch.randn(self.D, self.n_features)
    
    def encode_data(self, data):
        encoded_data = torch.matmul(data, self.base_vector.T)
        return encoded_data
    
    def train(self, data, labels, epochs, learning_rate):
        for epoch in range(epochs):
            for i, encoded_sample in enumerate(data):
                predicted_class = self.predict(encoded_sample)
                true_class = labels[i]
                
                if predicted_class != true_class:
                    self.classes[true_class] += learning_rate * encoded_sample
                    self.classes[predicted_class] -= learning_rate * encoded_sample
                    
    def predict(self, encoded_sample):
        scores = torch.matmul(self.classes, encoded_sample)
        predicted_class = torch.argmax(scores)
        return predicted_class

In [41]:
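class HDCClassifier:
    """Linear classifier over randomly projected (hyperdimensional) features."""

    def __init__(self, n_features, D):
        self.n_features = n_features
        self.D = D  # dimensionality of the encoded space
        self.base_vector = torch.randn(D, n_features)  # random projection matrix
        self.classes = None  # class vectors, allocated in train()
    
    def create_base_vector(self):
        # Re-sample the random projection matrix
        self.base_vector = torch.randn(self.D, self.n_features)
    
    def encode_data(self, data):
        # Project (n_samples, n_features) data into the D-dimensional space
        encoded_data = torch.matmul(data, self.base_vector.T)
        return encoded_data
    
    def train(self, data, labels, epochs, learning_rate):
        # `data` must already be encoded with encode_data()
        # Determine the number of classes dynamically
        # (assumes labels are the contiguous integers 0..n_classes-1)
        n_classes = len(torch.unique(labels))
        self.classes = torch.zeros(n_classes, self.D)
        
        for epoch in range(epochs):
            for i, encoded_sample in enumerate(data):
                predicted_class = self.predict(encoded_sample)
                true_class = labels[i]
                
                # Perceptron-style update, applied only on misclassification
                if predicted_class != true_class:
                    self.classes[true_class] += learning_rate * encoded_sample
                    self.classes[predicted_class] -= learning_rate * encoded_sample

    def predict(self, encoded_sample):
        # Score one encoded sample against every class vector; return the best index
        scores = torch.matmul(self.classes, encoded_sample)
        predicted_class = torch.argmax(scores)
        return predicted_class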
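
The `predict` method above scores one encoded sample at a time. The same rule (take the class whose vector has the largest dot product with the sample) can be applied to a whole batch with a single matrix product. A minimal sketch, assuming class vectors of shape `(n_classes, D)` and encoded samples of shape `(n_samples, D)`; `predict_batch` is a hypothetical helper, not part of the class above, and the tensors are toy values.

```python
import torch


def predict_batch(class_vectors: torch.Tensor, encoded: torch.Tensor) -> torch.Tensor:
    """Return the index of the best-scoring class for every encoded sample.

    class_vectors: (n_classes, D), encoded: (n_samples, D).
    """
    scores = encoded @ class_vectors.T  # (n_samples, n_classes) dot products
    return scores.argmax(dim=1)


# Toy class vectors and samples in a 2-dimensional encoded space
class_vectors = torch.tensor([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
encoded = torch.tensor([[2.0, 0.5], [0.1, 3.0], [-1.0, -2.0]])
predictions = predict_batch(class_vectors, encoded)
print(predictions)
```

Returning one tensor of class indices also makes later comparison against a label tensor a single elementwise operation.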

In [8]:
#fetal=HDCClassifier(n_classes=3,n_features=21, D=1000)

In [7]:
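# Shift the labels down by one so they start at 0 and can index class vectors
# (run this cell only once; re-running it shifts the labels again)
df['fetal_health'] = df['fetal_health'].astype(int) - 1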

In [8]:
X = df.drop(columns=["fetal_health"])
y = df["fetal_health"]

In [9]:
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
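
If the three classes are unevenly represented, a purely random split can leave the test set with a different class mix from the training set. One option is to pass `stratify` to `train_test_split` so both parts keep the same proportions. The sketch below uses made-up toy data (`X_toy`, `y_toy`) rather than the CTG features, so the counts are illustrative only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 100 samples, 4 features, three deliberately imbalanced classes
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(100, 4))
y_toy = np.array([0] * 70 + [1] * 20 + [2] * 10)

# stratify=y_toy keeps the class proportions the same in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

train_counts = np.bincount(y_tr, minlength=3)
test_counts = np.bincount(y_te, minlength=3)
print("train:", train_counts, "test:", test_counts)
```

In the notebook, the equivalent change would be adding `stratify=y` to the existing call.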

In [12]:
# Convert float labels to integers
y_train_int = y_train.astype(int)
y_test_int = y_test.astype(int)

# Convert data to PyTorch tensors
X_train_tensor = torch.tensor(X_train.values).float()
X_test_tensor = torch.tensor(X_test.values).float()
y_train_tensor = torch.tensor(y_train_int.values).long()
y_test_tensor = torch.tensor(y_test_int.values).long()

# Initialize HDCClassifier
n_classes = len(y.unique())
n_features = X_train.shape[1]
D = 100  # Choose an appropriate hyperdimensional space size
learning_rate = 0.1  # Choose an appropriate learning rate
hdc_classifier = HDCClassifier(n_features, D)

In [None]:
y.unique()

In [None]:
n_classes

In [None]:
# Encoding data
encoded_X_train = hdc_classifier.encode_data(X_train_tensor)

In [None]:
# Training
epochs = 50  # Choose the number of training epochs
hdc_classifier.train(encoded_X_train, y_train_tensor, epochs, learning_rate)

In [None]:
# Encoding test data and predicting
encoded_X_test = hdc_classifier.encode_data(X_test_tensor)
predicted_labels = [hdc_classifier.predict(encoded_sample) for encoded_sample in encoded_X_test]

In [None]:
# Calculate accuracy
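# Collect the per-sample predictions into one tensor so the comparison is elementwise
predicted_tensor = torch.tensor([int(p) for p in predicted_labels])
accuracy = (predicted_tensor == y_test_tensor).float().mean().item()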
print("Accuracy:", accuracy)
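
Accuracy alone can hide poor performance on a less frequent class, which matters when the classes of interest are Suspect and Pathological. A confusion matrix and a macro-averaged F1 score give a per-class view. The sketch below uses `confusion_matrix` and `f1_score` from scikit-learn on small made-up label arrays; these are not results from the model above.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score

# Toy true and predicted labels for three classes (0, 1, 2)
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 0, 1, 1, 0, 2, 2])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])

# Macro averaging weights every class equally, regardless of its size
macro_f1 = f1_score(y_true, y_pred, average="macro")

print(cm)
print("Macro F1:", macro_f1)
```

With the notebook's variables, the inputs would be `y_test_tensor.numpy()` and the predicted labels converted to an integer array.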

#### Data Preprocessing

#### Exploratory Data Analysis

In [None]:
#!conda install dask 

In [None]:
from dataprep.eda import create_report, plot, plot_correlation, plot_missing

#### Model development

#### Model Evaluation

#### Model saving

#### Model Deployment