# COVID-19 Mortality Rates

This notebook computes the COVID-19 mortality rates of Philippine provinces. Two rates are computed:
1. Crude Mortality Rates and
2. Age-Standardized Mortality Rates

Following [1], the mortality rate per 100,000 population is defined as $M = 100{,}000 \times \sum_{i=1}^n w_i \frac{m_i}{p_i}$, where $i = 1, \dots, n$ indexes the age groups, $w_i$ is the weight of age group $i$, $m_i$ is the number of deaths in age group $i$, and $p_i$ is the population of age group $i$.

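The crude and age-standardized mortality rates differ only in the weights. For the crude rate, the weight of age group $i$ is that group's share of the province's own population, $w_i = p_i / \sum_{j=1}^n p_j$, so the crude rate reduces to total deaths divided by total population. For the age-standardized rate, the weights are the age-group shares of a standard population, which makes provinces with different age structures comparable. In this study, the standard population is the 2020 Philippine national population.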
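To make the distinction concrete, here is a minimal sketch that applies the formula to a made-up population with three age groups. All numbers, including the standard population, are hypothetical and are not taken from the notebook's data. It also checks that, with weights taken from the population itself, the crude rate equals total deaths over total population.

```python
import numpy as np

# Hypothetical toy data for three age groups (illustrative only)
deaths = np.array([2.0, 10.0, 40.0])                       # m_i
population = np.array([50_000.0, 30_000.0, 20_000.0])      # p_i
standard_population = np.array([30_000.0, 40_000.0, 30_000.0])  # made-up standard

age_specific = deaths / population                         # m_i / p_i

# Crude rate: weights are the population's own age-group shares
crude_weights = population / population.sum()
crude = 100_000 * np.sum(crude_weights * age_specific)

# Age-standardized rate: weights are the standard population's shares
standard_weights = standard_population / standard_population.sum()
age_standardized = 100_000 * np.sum(standard_weights * age_specific)

# With its own weights, the crude rate is just total deaths / total population
crude_direct = 100_000 * deaths.sum() / population.sum()

print(crude, crude_direct, age_standardized)
```

In this toy case the standard population is older than the actual one, so the age-standardized rate comes out higher than the crude rate.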

Output:
1. "output/age_specific_mortality_rates.xlsx": Excel file with one sheet per province; in each sheet, columns = [mortality, population, m / p] and rows = age groups.
2. "output/mortality_rates.xlsx": Excel File with one sheet: columns = [crude per 100k, age-standardized per 100k] and rows = provinces.

In [1]:
import pandas as pd
import numpy as np

In [2]:
from google.colab import drive
drive.mount('/content/drive')
%cd /content/drive/My Drive/SMSL/COVID-19 Mortality Rates

Mounted at /content/drive
/content/drive/My Drive/SMSL/COVID-19 Mortality Rates


## Age-Specific Crude Mortality Rates

Output: Dictionary with keys = provinces; each value is a Pandas DataFrame with columns = [mortality, population, m / p] and rows = age groups.

In [3]:
# Population Data
population = pd.read_excel("data/Philippine Population 2020 by Province and Age Group.xlsx", index_col=0).T
population = population.drop("INTERIM PROVINCE", axis=1) # Removing INTERIM PROVINCE since it is not in the COVID-19 data
population.shape

(17, 84)
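Because the population table and the mortality workbook must use the same province names, a quick comparison of the two name sets can catch entries such as INTERIM PROVINCE before any rates are computed. The sketch below uses a hypothetical helper, `compare_province_names`, and made-up name lists. In the notebook, one would presumably pass `population.columns` and, once the mortality workbook is loaded, `mortality.sheet_names`.

```python
def compare_province_names(population_columns, mortality_sheets):
    """Return the names that appear in only one of the two sources."""
    pop = set(population_columns)
    mort = set(mortality_sheets)
    return {
        "only_in_population": sorted(pop - mort),
        "only_in_mortality": sorted(mort - pop),
    }

# Hypothetical name lists (illustrative only)
population_columns = ["ABRA", "BASILAN", "INTERIM PROVINCE", "NCR"]
mortality_sheets = ["ABRA", "BASILAN", "NCR"]

diff = compare_province_names(population_columns, mortality_sheets)
print(diff)
```

An empty list on both sides means every province can be matched between the two files.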

In [4]:
population.columns

Index(['BASILAN', 'COTABATO CITY', 'LANAO DEL SUR', 'MAGUINDANAO', 'SULU',
       'TAWI-TAWI', 'ABRA', 'APAYAO', 'BENGUET', 'IFUGAO', 'KALINGA',
       'MOUNTAIN PROVINCE', 'AGUSAN DEL NORTE', 'AGUSAN DEL SUR',
       'DINAGAT ISLANDS', 'SURIGAO DEL NORTE', 'SURIGAO DEL SUR', 'NCR',
       'ILOCOS NORTE', 'ILOCOS SUR', 'LA UNION', 'PANGASINAN', 'BATANES',
       'CAGAYAN', 'ISABELA', 'NUEVA VIZCAYA', 'QUIRINO', 'AURORA', 'BATAAN',
       'BULACAN', 'NUEVA ECIJA', 'PAMPANGA', 'TARLAC', 'ZAMBALES', 'BATANGAS',
       'CAVITE', 'LAGUNA', 'QUEZON', 'RIZAL', 'MARINDUQUE',
       'OCCIDENTAL MINDORO', 'ORIENTAL MINDORO', 'PALAWAN', 'ROMBLON',
       'CITY OF ISABELA', 'ZAMBOANGA DEL NORTE', 'ZAMBOANGA DEL SUR',
       'ZAMBOANGA SIBUGAY', 'ALBAY', 'CAMARINES NORTE', 'CAMARINES SUR',
       'CATANDUANES', 'MASBATE', 'SORSOGON', 'AKLAN', 'ANTIQUE', 'CAPIZ',
       'GUIMARAS', 'ILOILO', 'NEGROS OCCIDENTAL', 'BOHOL', 'CEBU',
       'NEGROS ORIENTAL', 'SIQUIJOR', 'BILIRAN', 'EASTERN SAMAR', 'LEYT

In [None]:
# Mortality Data: one sheet per province; the 'DIED' column holds the death counts
mortality = pd.ExcelFile("output/mortality.xlsx")
age_specific_rates = {}
for province in mortality.sheet_names:
    temp_df = pd.DataFrame()
    temp_df['mortality'] = mortality.parse(province, index_col=0)['DIED']
    temp_df['population'] = population[province]  # aligned on the age-group index

    # Age-specific mortality rate m_i / p_i
    temp_df['m / p'] = temp_df['mortality'] / temp_df['population']
    age_specific_rates[province] = temp_df
age_specific_rates
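The division above assumes every age group has a death count and a nonzero population. The following is a minimal sketch of a more defensive variant, using a hypothetical helper `age_specific_table` and made-up inputs. Treating an age group that is missing from the mortality data as zero deaths is an assumption that should be checked against how the mortality file was built.

```python
import pandas as pd

def age_specific_table(deaths, population):
    """Build a [mortality, population, m / p] table indexed by age group."""
    # Building from a dict of Series aligns both on the union of age groups
    table = pd.DataFrame({"mortality": deaths, "population": population})
    # Assumption: an age group absent from the mortality data had zero deaths
    table["mortality"] = table["mortality"].fillna(0)
    # Mask zero populations so the rate is NaN instead of inf
    safe_population = table["population"].where(table["population"] > 0)
    table["m / p"] = table["mortality"] / safe_population
    return table

# Hypothetical inputs (illustrative only)
deaths = pd.Series({"0 to 4": 1, "5 to 9": 3})
population = pd.Series({"0 to 4": 1000, "5 to 9": 0, "10 to 14": 2000})

table = age_specific_table(deaths, population)
print(table)
```

A NaN in the `m / p` column then flags an age group that needs attention rather than propagating an infinite rate.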

In [None]:
# Save file
with pd.ExcelWriter("output/age_specific_mortality_rates.xlsx") as writer:
    for province in age_specific_rates:
        age_specific_rates[province].to_excel(writer, sheet_name=province[:31]) # Excel sheets limit the sheet name length to 31
        print(f"Province {province} added.")
print("File saving completed.")
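Truncating sheet names to 31 characters is safe only if no two provinces share the same first 31 characters. Otherwise two provinces would map to the same sheet name. A small check along the following lines could be run before writing. `safe_sheet_names` is a hypothetical helper, not part of the notebook.

```python
def safe_sheet_names(names, limit=31):
    """Map each full name to its truncated sheet name; raise on collisions."""
    mapping = {}
    used = {}  # truncated name -> full name that claimed it
    for name in names:
        short = name[:limit]
        if short in used:
            raise ValueError(
                f"{name!r} and {used[short]!r} both truncate to {short!r}"
            )
        used[short] = name
        mapping[name] = short
    return mapping

# Names shorter than the limit pass through unchanged
sheet_names = safe_sheet_names(["ABRA", "MOUNTAIN PROVINCE", "NCR"])
print(sheet_names)
```

In the notebook this would presumably be called with the keys of `age_specific_rates`.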

## Crude Mortality Rates

Output: Pandas DataFrame with column = [crude per 100k] and rows = provinces. It is saved, together with the age-standardized rates, to "output/mortality_rates.xlsx".

In [41]:
def mortality_rate(weights, age_specific_mortality_rates, scale):
    # Weighted sum of age-specific rates, expressed per `scale` population
    return np.sum(weights * age_specific_mortality_rates) * scale

In [42]:
crude_rates = pd.DataFrame(columns=["crude per 100k"])
for province in age_specific_rates:
    weights = age_specific_rates[province]["population"] / age_specific_rates[province]["population"].sum()
    age_specific_mortality_rates = age_specific_rates[province]['m / p']
    crude_rates.loc[province] = mortality_rate(weights, age_specific_mortality_rates, 100000)
crude_rates

Province,crude per 100k
ABRA,62.322969
AGUSAN DEL NORTE,78.587174
AGUSAN DEL SUR,45.122501
AKLAN,41.110304
ALBAY,27.029981
...,...
TAWI-TAWI,1.368161
ZAMBALES,90.710054
ZAMBOANGA DEL NORTE,27.819816
ZAMBOANGA DEL SUR,53.375148


## Age-Standardized Mortality Rates

In [43]:
# Standard Population: 2020 Philippine National Population
ph_pop = pd.read_excel("data/PH_population.xlsx", index_col=0)
ph_pop

Age Group,Population
0 to 4,11066707
5 to 9,11266823
10 to 14,11080715
15 to 19,10459186
20 to 24,9969846
25 to 29,9172896
30 to 34,8120568
35 to 39,7179320
40 to 44,6491312
45 to 49,5571168


In [52]:
age_standardized_rates = pd.DataFrame(columns=["age-standardized per 100k"])
weights = ph_pop['Population'] / ph_pop['Population'].sum()
for province in age_specific_rates:
    age_specific_mortality_rates = age_specific_rates[province]['m / p']
    age_standardized_rates.loc[province] = mortality_rate(weights, age_specific_mortality_rates, 100000)
age_standardized_rates

Province,age-standardized per 100k
ABRA,47.368369
AGUSAN DEL NORTE,77.397950
AGUSAN DEL SUR,50.589968
AKLAN,33.988423
ALBAY,25.471226
...,...
TAWI-TAWI,2.221281
ZAMBALES,81.157044
ZAMBOANGA DEL NORTE,27.468587
ZAMBOANGA DEL SUR,57.170267
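The product `weights * age_specific_mortality_rates` is aligned by pandas on the age-group labels, so the standard-population table and the province tables must spell those labels identically. If they do not, the unmatched groups become NaN, and a sum that skips NaN (the pandas default) silently understates the rate. Here is a minimal sketch of the effect and of a guard against it. `require_same_age_groups` is a hypothetical helper and the labels are made up.

```python
import pandas as pd

def require_same_age_groups(weights, rates):
    """Raise if the two Series are not indexed by the same age-group labels."""
    only_in_weights = weights.index.difference(rates.index)
    only_in_rates = rates.index.difference(weights.index)
    if len(only_in_weights) or len(only_in_rates):
        raise ValueError(
            f"Age-group labels differ: {list(only_in_weights)} "
            f"vs {list(only_in_rates)}"
        )

# Hypothetical data: the second label is spelled differently in each Series
weights = pd.Series({"0 to 4": 0.5, "5 to 9": 0.5})
rates = pd.Series({"0 to 4": 0.001, "5-9": 0.002})

product = weights * rates       # aligned on labels; unmatched labels become NaN
silent_total = product.sum()    # NaN entries are skipped, so one group is lost

try:
    require_same_age_groups(weights, rates)
    labels_match = True
except ValueError:
    labels_match = False

print(product, silent_total, labels_match)
```

Calling such a guard once per province inside the loop would turn a silent mismatch into an explicit error.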


In [53]:
mortality_rates = pd.DataFrame()
mortality_rates['crude per 100k'] = crude_rates
mortality_rates['age-standardized per 100k'] = age_standardized_rates
mortality_rates

Province,crude per 100k,age-standardized per 100k
ABRA,62.322969,47.368369
AGUSAN DEL NORTE,78.587174,77.397950
AGUSAN DEL SUR,45.122501,50.589968
AKLAN,41.110304,33.988423
ALBAY,27.029981,25.471226
...,...,...
TAWI-TAWI,1.368161,2.221281
ZAMBALES,90.710054,81.157044
ZAMBOANGA DEL NORTE,27.819816,27.468587
ZAMBOANGA DEL SUR,53.375148,57.170267


In [54]:
# Save file
mortality_rates.to_excel("output/mortality_rates.xlsx")

References:
1. Hong, D., Lee, S., Choi, Y.-J., Moon, S., Jang, Y., Cho, Y.-M., Lee, H., Min, S., Park, H., Hahn, S., Choi, J.-Y., Shin, A., & Kang, D. (2021). The age-standardized incidence, mortality, and case fatality rates of COVID-19 in 79 countries: A cross-sectional comparison and their correlations with associated factors. Epidemiology and Health, 43, e2021061. https://doi.org/10.4178/epih.e2021061
