# Maternal Health Risk Assessment: A Comprehensive Model of Predictive Factors

#### Introduction

Maternal mortality is a major concern under the UN's Sustainable Development Goals, with 287,000 women dying in 2020 alone. Given the gravity of this issue, there is an urgent need for new approaches to mitigation. One promising avenue is a predictive model that identifies high-risk patients among pregnant women, so that healthcare providers can allocate resources where they most improve outcomes for mothers and their newborns.

In this report we therefore ask: can we accurately classify a pregnant woman's maternal health risk as low, medium, or high from key variables such as age, systolic blood pressure, diastolic blood pressure, blood glucose, body temperature, and heart rate?

To answer this question, the report builds on a dataset gathered from 1013 pregnant women. The data were collected from hospitals, clinics, and maternal health care centers in rural Bangladesh. Each observation records the variables of interest and forms the basis of our predictive model. The dataset contains seven columns: age, systolic blood pressure, diastolic blood pressure, blood glucose (BS), body temperature, heart rate, and the risk classification. Age is given in years, systolic and diastolic blood pressure in mmHg, blood glucose in mmol/L, body temperature in degrees Fahrenheit, and heart rate in bpm. Risk level is a categorical variable with three classes: low risk, medium risk, and high risk.

#### Methods & Results

Data analysis

We read the dataset from the UCI Machine Learning Repository directly from the web into R. In summary, we download the zip file, unzip it, assign the data to a new object, and display the first 6 rows.

Data selection:

Our data analysis shows that the temperature variable does not meaningfully help to distinguish between high-, medium-, and low-risk patients: temperature stays within the same range for all risk groups. Treating temperature as a crucial factor in determining a pregnant woman's risk level would therefore be misleading.
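The range comparison described above could be sketched as follows. This is a minimal illustration, not the analysis actually run for this report: `range_by_risk` is a hypothetical helper, and `maternal_head` contains only the six rows printed by `head()` in this report, which are far too few to support any conclusion on their own.

```r
# Illustrative data: only the six rows shown by head() in this report,
# not the full dataset.
maternal_head <- data.frame(
  Age         = c(25L, 35L, 29L, 30L, 35L, 23L),
  SystolicBP  = c(130L, 140L, 90L, 140L, 120L, 140L),
  DiastolicBP = c(80L, 90L, 70L, 85L, 60L, 80L),
  BS          = c(15.0, 13.0, 8.0, 7.0, 6.1, 7.01),
  BodyTemp    = c(98, 98, 100, 98, 98, 98),
  HeartRate   = c(86L, 70L, 80L, 70L, 76L, 70L),
  RiskLevel   = c("high risk", "high risk", "high risk",
                  "high risk", "low risk", "high risk")
)

# Hypothetical helper: min, median and max of one numeric column,
# computed separately for each risk level.
range_by_risk <- function(data, variable) {
  groups <- split(data[[variable]], data$RiskLevel)
  stats <- t(sapply(groups, function(x) {
    c(min = min(x), median = median(x), max = max(x))
  }))
  as.data.frame(stats)
}

temp_ranges <- range_by_risk(maternal_head, "BodyTemp")
print(temp_ranges)
```

On the full data, `range_by_risk(maternal_original_data, "BodyTemp")` would give one row per risk level, so heavily overlapping ranges can be seen at a glance.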
Our decision to exclude temperature as a significant variable in determining maternal health risk is supported by a complementary study by Marzia Ahmed (Department of Software Engineering, Daffodil International University) and Mohammod Abul Kashem (Department of Computer Science and Engineering, Dhaka University of Science and Technology). They used a significance ranking system to assess the influence of each factor (age, systolic blood pressure, diastolic blood pressure, blood glucose (BS), body temperature, and heart rate) on risk classification, as shown in Figure 1:


[Figure 1: Significance ranking of the factors for risk classification (Ahmed & Kashem, 2020)]


According to their findings, blood sugar (BS) emerged as the most influential risk factor in pregnancy: “Especially the mother affected by diabetes is considerably more responsible about three times higher than Blood Pressure and other factors” (Ahmed & Kashem, 2020). Their analysis, based on decision trees and entropy, also indicated that temperature had comparatively little influence on the risk classification.
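To make the entropy idea concrete, the sketch below computes Shannon entropy and the information gain of a single threshold split, which is the quantity an entropy-based decision tree uses to rank candidate splits. It is our own illustration and not Ahmed and Kashem's ranking procedure. The functions `entropy` and `information_gain`, the thresholds 6.5 and 99, and the toy vectors (taken from the six rows printed by `head()` in this report) are assumptions chosen for demonstration only.

```r
# Shannon entropy (in bits) of a vector of class labels.
entropy <- function(labels) {
  if (length(labels) == 0) return(0)
  p <- table(as.character(labels)) / length(labels)
  -sum(p * log2(p))
}

# Information gain from splitting the observations on `values <= threshold`.
information_gain <- function(values, labels, threshold) {
  left <- values <= threshold
  n <- length(labels)
  weighted <- (sum(left) / n) * entropy(labels[left]) +
    (sum(!left) / n) * entropy(labels[!left])
  entropy(labels) - weighted
}

# Toy data: the six rows shown by head(), far too few for real conclusions.
risk <- c("high risk", "high risk", "high risk",
          "high risk", "low risk", "high risk")
bs   <- c(15.0, 13.0, 8.0, 7.0, 6.1, 7.01)
temp <- c(98, 98, 100, 98, 98, 98)

gain_bs   <- information_gain(bs, risk, threshold = 6.5)   # illustrative threshold
gain_temp <- information_gain(temp, risk, threshold = 99)  # illustrative threshold
print(c(BS = gain_bs, BodyTemp = gain_temp))
```

On these six rows the blood sugar split separates the classes completely while the temperature split barely reduces entropy, which matches the direction of the published ranking. A real comparison would have to use the full dataset and search over thresholds.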
This corroborating evidence supports our decision to exclude temperature from our risk assessment model. In our dataset, temperature stays within the same range across all risk groups and therefore lacks the discriminatory power of other variables, such as blood sugar and blood pressure, highlighted in Ahmed and Kashem's analysis.
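The selection step itself could look like the sketch below. `prepare_maternal_data` is a hypothetical helper, not code from this report's pipeline, and the small data frame reuses rows 4 to 6 of the `head()` output purely for illustration.

```r
# Hypothetical helper: drop BodyTemp and store the risk label as a factor.
prepare_maternal_data <- function(data) {
  kept_columns <- setdiff(names(data), "BodyTemp")
  selected <- data[, kept_columns, drop = FALSE]
  # Classification functions in R generally expect a factor response.
  selected$RiskLevel <- as.factor(selected$RiskLevel)
  selected
}

# Illustrative input: rows 4 to 6 of the head() output in this report.
example_rows <- data.frame(
  Age         = c(30L, 35L, 23L),
  SystolicBP  = c(140L, 120L, 140L),
  DiastolicBP = c(85L, 60L, 80L),
  BS          = c(7.0, 6.1, 7.01),
  BodyTemp    = c(98, 98, 98),
  HeartRate   = c(70L, 76L, 70L),
  RiskLevel   = c("high risk", "low risk", "high risk")
)

prepared <- prepare_maternal_data(example_rows)
str(prepared)
```

Dropping the column by name rather than by position keeps the step valid even if the column order of the source file changes.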

In [1]:
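# Download the zipped dataset from the UCI Machine Learning Repository to a temporary file
temp <- tempfile()
download.file("https://archive.ics.uci.edu/static/public/863/maternal+health+risk.zip", temp, mode = "wb")
# Extract only the CSV into the session's temporary directory and read it into a data frame
maternal_original_data <- read.csv(unzip(temp, "Maternal Health Risk Data Set.csv", exdir = tempdir()))
# Remove the temporary zip file
unlink(temp)
# Display the first six rows
head(maternal_original_data)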

| | Age | SystolicBP | DiastolicBP | BS | BodyTemp | HeartRate | RiskLevel |
|---|---|---|---|---|---|---|---|
| | `<int>` | `<int>` | `<int>` | `<dbl>` | `<dbl>` | `<int>` | `<chr>` |
| 1 | 25 | 130 | 80 | 15.0 | 98 | 86 | high risk |
| 2 | 35 | 140 | 90 | 13.0 | 98 | 70 | high risk |
| 3 | 29 | 90 | 70 | 8.0 | 100 | 80 | high risk |
| 4 | 30 | 140 | 85 | 7.0 | 98 | 70 | high risk |
| 5 | 35 | 120 | 60 | 6.1 | 98 | 76 | low risk |
| 6 | 23 | 140 | 80 | 7.01 | 98 | 70 | high risk |
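Before any modelling, it may be worth confirming that the loaded data frame has the seven columns described in the introduction and that the six predictors are numeric. The check below is a minimal sketch; `check_maternal_columns` is a hypothetical helper, and the two-row data frame simply mirrors the first rows of the output above.

```r
# Column names as they appear in the head() output of this report.
expected_columns <- c("Age", "SystolicBP", "DiastolicBP", "BS",
                      "BodyTemp", "HeartRate", "RiskLevel")

# Hypothetical helper: stop with a message if a column is missing
# or a predictor is not numeric; otherwise return TRUE invisibly.
check_maternal_columns <- function(data) {
  missing_columns <- setdiff(expected_columns, names(data))
  if (length(missing_columns) > 0) {
    stop("Missing columns: ", paste(missing_columns, collapse = ", "))
  }
  predictors <- setdiff(expected_columns, "RiskLevel")
  is_num <- vapply(data[predictors], is.numeric, logical(1))
  if (!all(is_num)) {
    stop("Non-numeric predictors: ",
         paste(predictors[!is_num], collapse = ", "))
  }
  invisible(TRUE)
}

# Illustrative input: the first two rows of the output above.
first_rows <- data.frame(
  Age = c(25L, 35L), SystolicBP = c(130L, 140L), DiastolicBP = c(80L, 90L),
  BS = c(15.0, 13.0), BodyTemp = c(98, 98), HeartRate = c(86L, 70L),
  RiskLevel = c("high risk", "high risk")
)

check_result <- check_maternal_columns(first_rows)
```

Calling `check_maternal_columns(maternal_original_data)` right after loading would surface a renamed or missing column before it causes a harder-to-read error later in the analysis.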


#### Discussion

#### Reference