# Analysis of the Maternal Mortality Rate by Countries and Regions from 1985 to 2023 using the WHO API


### by Giulia Sepeda

[LinkedIn](https://www.linkedin.com/in/giuliasepeda/) | [GitHub repository](https://github.com/giuuusepeda/who-data-tools) | [Web Portfolio](https://giuliasepeda.carrd.co)

Data source: [WHO data](https://www.who.int/data/gho/data/indicators/indicator-details/GHO/maternal-mortality-ratio-(per-100-000-live-births))

Start date: July 13th, 2025 | Latest update: August 2nd, 2025

### Introduction

#### Maternal Mortality: A Global Health Crisis

> “Every two minutes, a woman dies from preventable causes related to pregnancy or childbirth.” — WHO, 2025

- In 2023, over **260,000 women** died from maternal causes — most of them **preventable**.
- **92%** of deaths occurred in **low- and lower-middle-income countries**.
- **Sub-Saharan Africa** alone accounted for **70%** of all maternal deaths.

Despite progress (MMR ↓ 40% since 2000), we're far from the **SDG target** of <70 deaths per 100k live births by 2030.



#### Why This Project Matters

This analysis explores **30+ years of global data** to:

- Highlight regional disparities in maternal mortality
- Reveal gaps in access to quality care
- Provide visual tools to support public health decisions

Source: [WHO Data](https://www.who.int/data/gho/data/indicators/indicator-details/GHO/maternal-mortality-ratio-(per-100-000-live-births))


### Goals

- Create a reusable ETL tool  
- Perform exploratory analysis of maternal mortality  
- Conduct spatial analysis of MMR  
- Merge MMR data with socioeconomic indicators from the WHO API  
- Build an interactive dashboard with R Shiny


### Steps

1. **Extract** the data using the WHO API  
2. **Transform** the data using Pandas  
3. **Feature engineering**  
4. **Exploratory analysis**  
5. **Spatial analysis**  
6. **R Shiny dashboard**

### 0. Set up

Create an autoreloading environment so that changes to the reusable functions are picked up in real time.

In [None]:
%load_ext autoreload
%autoreload 2

Import the helper functions from the `who_data_tools` package.

In [None]:
import sys
sys.path.append(r'C:\Users\giuli\Documents\GitHub')
from who_data_tools import get_who_data, clean_column_names2, save_to_csv, plot_line, check_data_quality, fix_column_types, drop_constant_columns

Load the required packages.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import who_data_tools 

### 1. Extract Data from the WHO API

The function `get_who_data(ID)` retrieves data from the WHO API based on the indicator ID.

It takes the indicator ID as its argument, loads the requested indicator (maternal mortality) into a DataFrame, and displays the first few rows.

The indicator ID can be found [here](https://ghoapi.azureedge.net/api/Indicator).

In [None]:
df = get_who_data('MDG_0000000026')
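
For readers without `who_data_tools`, here is a minimal sketch of what a retrieval helper of this kind could look like. It is not the implementation of `get_who_data()`. The names `fetch_indicator` and `payload_to_frame` are hypothetical, the base URL is taken from the indicator list linked above, and the code assumes the API returns JSON with the records under a `value` key.

```python
import json
from urllib.request import urlopen

import pandas as pd

# Base URL taken from the indicator list linked above.
GHO_BASE_URL = "https://ghoapi.azureedge.net/api"


def payload_to_frame(payload: dict) -> pd.DataFrame:
    """Convert an API payload into a DataFrame.

    Assumption: the records are stored as a list under the "value" key.
    """
    return pd.DataFrame(payload.get("value", []))


def fetch_indicator(indicator_id: str, timeout: float = 30.0) -> pd.DataFrame:
    """Download one indicator and return it as a DataFrame (hypothetical helper)."""
    url = f"{GHO_BASE_URL}/{indicator_id}"
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return payload_to_frame(payload)


# Offline demonstration with a hand-made payload (made-up values, not WHO data).
sample = {"value": [{"SpatialDim": "XXX", "TimeDim": 2000, "NumericValue": 1.0}]}
demo = payload_to_frame(sample)
print(demo.columns.tolist())
```

With network access, `fetch_indicator('MDG_0000000026')` would request the same indicator used in this notebook.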

Save the raw data as a CSV file.

In [None]:
df.to_csv('data/maternal_mortality_raw.csv', index=False, sep=',')

### 2. Transform Data

#### 2.1. Check the raw state of the dataset

This gives a starting point and shows what needs to be cleaned.

The `check_data_quality()` function displays information about the DataFrame:
- data types
- null values
- duplicates

In [None]:
check_data_quality(df)
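
The three checks listed above can be reproduced with plain pandas. The sketch below is an illustration on a toy DataFrame, not the code of `check_data_quality()`. The helper name `summarize_quality` and the toy values are hypothetical.

```python
import pandas as pd


def summarize_quality(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the dtype and the number of null values for each column."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "nulls": frame.isna().sum(),
    })


# Toy data with one repeated row and one missing value (made-up values).
toy = pd.DataFrame({
    "SpatialDim": ["AAA", "AAA", "BBB"],
    "NumericValue": [10.0, 10.0, None],
})

report = summarize_quality(toy)
# duplicated() flags every row that repeats an earlier row.
duplicate_rows = int(toy.duplicated().sum())

print(report)
print("duplicate rows:", duplicate_rows)
```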

#### 2.2. Correct the columns names

The function `clean_column_names2()` standardizes column names, converting them from CamelCase to lowercase snake_case and removing special characters.

In [None]:
df = clean_column_names2(df)
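
A CamelCase to snake_case conversion of this kind can be sketched with two regular expressions. This is one possible approach, not necessarily how `clean_column_names2()` works. The helper name `to_snake_case` and the third sample name are hypothetical.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a CamelCase column name to lowercase snake_case."""
    # Insert "_" wherever a lowercase letter or digit is followed by an uppercase letter.
    with_breaks = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Replace every run of non-alphanumeric characters with a single "_".
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", with_breaks)
    return cleaned.strip("_").lower()


for raw in ["TimeDimensionBegin", "ParentLocationCode", "Spatial Dim (code)"]:
    print(raw, "->", to_snake_case(raw))
```

Applied to a DataFrame, the same function could be passed to `df.rename(columns=to_snake_case)`.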

#### 2.3. Drop constant and empty columns

The function `drop_constant_columns()` removes columns in which every observation has the same value.

In [None]:
df = drop_constant_columns(df)
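
One way to detect constant and empty columns is to count distinct values per column. The sketch below assumes that approach; it is not the code of `drop_constant_columns()`. The helper name `drop_constant` and the toy column names are illustrative.

```python
import pandas as pd


def drop_constant(frame: pd.DataFrame) -> pd.DataFrame:
    """Keep only the columns that contain more than one distinct value."""
    # dropna=False counts NaN as a value, so an all-empty column
    # has exactly one distinct value and is dropped as well.
    varying = frame.nunique(dropna=False) > 1
    return frame.loc[:, varying]


# Toy data: one constant column, one empty column, one varying column.
toy = pd.DataFrame({
    "indicator_code": ["MDG_0000000026"] * 3,
    "comments": [None, None, None],
    "time_dim": [2000, 2001, 2002],
})

print(drop_constant(toy).columns.tolist())  # ['time_dim']
```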

#### 2.4. Drop columns that won't be used in the analysis

- `time_dimension_begin` and `time_dimension_end`: start and end timestamps of data generation
- `date`: date when the data was collected
- `time_dimension_value`: year (duplicates `time_dim`)

In [None]:
df.drop(columns=['time_dimension_begin', 'time_dimension_end', 'date', 'time_dimension_value'], inplace=True)

#### 2.5. Change the data format

In [None]:
fix_column_types(df)

#### 2.6. Check the results of the transformation and whether anything still needs to be done

In [None]:
check_data_quality(df)

#### 2.7. ~~Drop lines containing region-level information~~

(Commented out later, after realizing it was useful for analysis and the dashboard.)

The maternal mortality ratio (MMR), often loosely called the maternal mortality rate, is the number of maternal deaths per 100,000 live births.

That said, a regional MMR cannot be accurately rebuilt from country-level values. The median of the country MMRs may differ from the actual regional MMR because countries differ in population size, so they contribute unequally to the regional totals.
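
A small worked example of this point, using made-up figures for three hypothetical countries (not WHO data). It compares the median of the country MMRs with the MMR computed from pooled deaths and live births. Pooling is shown only as an illustration of how a ratio behaves under aggregation; it is not necessarily how WHO produces its regional estimates.

```python
from statistics import median

# Hypothetical countries: (maternal deaths, live births). Made-up figures.
countries = {
    "A": (50, 1_000_000),
    "B": (200, 100_000),
    "C": (4_000, 500_000),
}


def mmr(deaths: int, births: int) -> float:
    """Maternal deaths per 100,000 live births."""
    return deaths * 100_000 / births


country_mmr = {name: mmr(d, b) for name, (d, b) in countries.items()}
median_mmr = median(country_mmr.values())

# Pooled ratio: all deaths in the region divided by all live births.
total_deaths = sum(d for d, _ in countries.values())
total_births = sum(b for _, b in countries.values())
pooled_mmr = mmr(total_deaths, total_births)

print(country_mmr)   # {'A': 5.0, 'B': 200.0, 'C': 800.0}
print(median_mmr)    # 200.0
print(pooled_mmr)    # 265.625
```

The two summaries disagree because country A contributes most of the live births but very few deaths, which the median ignores.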

In [None]:
"""
missing_region = df[
    df['parent_location'].isna() | df['parent_location_code'].isna()
]

countries_missing_region = (
    missing_region['spatial_dim']
    .drop_duplicates()
    .reset_index(drop=True)
)
print("Countries without a region:", countries_missing_region.tolist())


df = df[~df['spatial_dim'].isin(['SEAR', 'EMR', 'EUR', 'AMR', 'WPR', 'AFR', 'GLOBAL'])]
df.reset_index(drop=True, inplace=True)
"""

#### 2.8. Export all changes

In [None]:
df.to_csv('data/maternal_mortality_clean.csv', index=False, sep=',')

### 3. Feature Engineering

### 4. Exploratory analysis

![MMR trend](plots/tableau_trend.jpg)

### 5. Spatial analysis

### 6. [R Shiny dashboard](https://giuliasepeda.shinyapps.io/maternal_mortality_who/)