# **TR_2021/05 - Technical report: Rate ratio for cardiovascular deaths and extreme events**


|Technical Report ID  |2021/05|
|--|--|
| Title |Rate ratio for cardiovascular deaths and extreme events|
| Authors | Júlia De Lázari, Paula Dornhofer|
| Creation Date| 2021-03|


## Database descriptions

**inputs:** 

- obitos_circulatorio.csv: Dataframe of deaths due to cardiovascular diseases from 2001 to 2019 (only data up to 2018 was used, to match the climate data).

- EV_VCP.csv: Dataframe with the computed extreme events, derived from the Viracopos climate data.

## Analysis

This report presents an analysis of the _rate ratio_ for the [extreme climate events](https://github.com/climate-and-health-datasci-Unicamp/project-climatic-variations-cardiovascular-diseases/blob/main/notebooks/TR_2020_05_Extreme_climatic_events_for_Campinas.ipynb).

The analysis was conducted for the full dataset and for stratifications by sex, age, age and sex, and race.





## **Rate ratio**

The rate ratio is a relative difference measure that compares the incidence rates of events occurring at any given point in time. It is frequently used in epidemiology [CDC].

It is given by **RR = rate ratio = incidence rate 1/incidence rate 2**

with **incidence rate = number of events/population size**

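The 95% confidence interval is computed on the log scale, from **log(RR) - [1.96 x SE(log(RR))]** to **log(RR) + [1.96 x SE(log(RR))]**, and both limits are then exponentiated to obtain the interval for RR. SE is the abbreviation for standard error [SPH].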

In our case **RR = (number of deaths on days with extreme climatic events/number of days with extreme climatic events)/(number of deaths on control days/number of control days)**
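As a worked illustration of the formulas above, the sketch below computes the rate ratio and its 95% confidence interval from four counts. The function name `rate_ratio_with_ci` and the counts are invented for illustration; the standard error uses the same expression as the notebook's `rate_ratio` function, sqrt(1/deaths on event days + 1/deaths on control days).

```python
import math

def rate_ratio_with_ci(deaths_event, days_event, deaths_control, days_control, z=1.96):
    """Return (RR, lower, upper) comparing deaths per day on event vs. control days."""
    rate_event = deaths_event / days_event        # deaths per event day
    rate_control = deaths_control / days_control  # deaths per control day
    rr = rate_event / rate_control

    # Standard error of log(RR), based on the two death counts
    se_log_rr = math.sqrt(1 / deaths_event + 1 / deaths_control)

    # Interval is built on the log scale and then exponentiated
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical counts: 120 deaths over 100 event days, 1000 deaths over 1000 control days
rr, lower, upper = rate_ratio_with_ci(120, 100, 1000, 1000)
print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

With these made-up counts the daily death rate is 1.2 on event days and 1.0 on control days, so the rate ratio is 1.2.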

Its interpretation is similar to that of the _risk ratio_. A rate ratio of 1.0 indicates equal rates in the two groups. A rate ratio greater than 1.0 indicates an increased risk for the group in the numerator, and a rate ratio less than 1.0 indicates a decreased risk for that group.
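When reading the result tables, a common convention (not stated in this report, so treat it as an assumption) is to check whether the confidence interval contains 1.0. The helper below is a hypothetical sketch of that check; the three example intervals are the "All" rows reported later for `LHW`, `above_temp_dif` and `above_temp_range`.

```python
def interpret_rate_ratio(lower_ci, upper_ci):
    """Classify a rate ratio by where its confidence interval sits relative to 1.0."""
    if lower_ci > 1.0:
        return "higher rate on event days"
    if upper_ci < 1.0:
        return "lower rate on event days"
    return "interval includes 1.0"

# (lower CI, upper CI) taken from the "All" rows of three result tables
examples = {
    "LHW": (1.04, 1.15),
    "above_temp_dif": (0.82, 0.96),
    "above_temp_range": (0.98, 1.04),
}
labels = {event: interpret_rate_ratio(lo, up) for event, (lo, up) in examples.items()}
for event, label in labels.items():
    print(event, "->", label)
```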

## **Import libraries**

In [None]:
#-------------------------------------------------------------------#
#                       Import libraries                            #
#-------------------------------------------------------------------#
import pandas as pd
import numpy as np
import datetime
import more_itertools as mit
import statistics as stat
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
import scipy
import math
import seaborn as sns
import pylab
from datetime import timedelta
from calendar import isleap
from google.colab import drive
from google.colab import files

drive.mount('/content/drive')

pd.options.mode.chained_assignment = None

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## **Load and merge dataframes**

In [None]:
#-------------------------------------------------------------------#
#                      Cardiovascular deaths                        #
#-------------------------------------------------------------------#

df_obitos = pd.read_csv('obitos_circulatorio.csv')
df_obitos = df_obitos.drop(columns = {'Unnamed: 0','CODMUNRES','CODMUNOCOR','COMPLRES','Descrição CID',
                                      'CAUSABAS','LINHAA','LINHAB','LINHAC','LINHAD'}) # drop unneeded columns
df_obitos = df_obitos.rename(columns = {'DTOBITO':'DATE'})
df_obitos = df_obitos[df_obitos['DATE']<='2018-12-31'] # keep only the period covered by the climate data
leap_days = ['2000-02-29','2004-02-29','2008-02-29','2012-02-29','2016-02-29']
df_obitos = df_obitos[~df_obitos['DATE'].isin(leap_days)] # remove leap days (02-29)
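The date filtering in the cell above can also be written without listing each leap day by hand. The sketch below assumes that `DATE` holds ISO-formatted strings (`YYYY-MM-DD`), which the string comparison in the cell suggests but which is not confirmed here; the toy frame is invented for illustration.

```python
import pandas as pd

# Toy data standing in for the deaths dataframe (invented dates)
toy = pd.DataFrame({"DATE": ["2016-02-28", "2016-02-29", "2016-03-01", "2019-01-01"]})

dates = pd.to_datetime(toy["DATE"])
is_leap_day = (dates.dt.month == 2) & (dates.dt.day == 29)  # any February 29
in_period = dates <= pd.Timestamp("2018-12-31")             # period of the climate data

filtered = toy[in_period & ~is_leap_day]
print(filtered["DATE"].tolist())
```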

In [None]:
#-------------------------------------------------------------------#
#                  Extreme climatic variations                      #
#-------------------------------------------------------------------#
df_vir =  pd.read_csv('/content/drive/Shared drives/Clima&Saúde/Pesquisadores/DeLázari_Júlia/data/Climáticos_Viracopos/EV_VCP.csv')
df_vir = df_vir.drop(columns = {'Unnamed: 0'})

In [None]:
#-------------------------------------------------------------------#
#              Merge health and climate dataframes                  #
#-------------------------------------------------------------------#

df = pd.merge(df_vir,df_obitos, on='DATE', how='outer')
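To make the effect of the outer merge concrete, here is a minimal sketch on invented data. Each death contributes one row, and a day without deaths is kept as a single row whose health columns are missing, which is why the rate ratio function later tests `CID` for null values. The column values below are placeholders, not records from the real files.

```python
import pandas as pd

# Toy climate table: one row per day, with one extreme-event flag
climate = pd.DataFrame({
    "DATE": ["2018-01-01", "2018-01-02", "2018-01-03"],
    "LHW": [0, 1, 0],
})

# Toy deaths table: one row per death (two deaths on Jan 1, none on Jan 2)
deaths = pd.DataFrame({
    "DATE": ["2018-01-01", "2018-01-01", "2018-01-03"],
    "CID": ["I21", "I64", "I50"],
    "SEXO": ["F", "M", "M"],
})

merged = pd.merge(climate, deaths, on="DATE", how="outer")
print(merged)

# Days with no recorded death appear once, with CID missing
days_without_deaths = merged.loc[merged["CID"].isnull(), "DATE"].tolist()
print(days_without_deaths)
```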

## **Functions**

Helper functions that automate operations repeated throughout the notebook:
- stratify functions: different stratifications of the dataframe
- rate_ratio: compute the rate ratio for the desired stratification

###**Stratify functions**

In [None]:
# Stratify sex
def stratify_sex(database):
  women = database[database['SEXO']=='F']
  men = database[database['SEXO']=='M']

  dataframes = [database, women, men]
  df_names = ["All", "Women", "Men"]

  return dataframes, df_names

In [None]:
#Stratify age
def stratify_age(database):
  #less_20 = database[(database['IDADE'] < 20)]  
  between_20_40 = database[(database['IDADE'] >= 20) & (database['IDADE'] < 40)]
  between_40_65 = database[(database['IDADE'] >= 40) & (database['IDADE'] < 65)]   
  over_65 = database[(database['IDADE'] > 64)]   
  over_75 = database[(database['IDADE'] > 75)]

  dataframes  = [database, between_20_40, between_40_65, over_65, over_75] 
  df_names = ["All", "Between 20 and 40 years old","Between 40 and 65 years old","Above 65 years old","Above 75 years old"]

  return dataframes, df_names
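The boundaries of the age groups are easiest to check on a few edge cases. The sketch below applies the same conditions as `stratify_age` to an invented list of ages; it shows that the two oldest groups overlap and that age 75 itself falls outside "Above 75 years old".

```python
import pandas as pd

# Invented ages chosen to sit on the group boundaries
toy = pd.DataFrame({"IDADE": [19, 20, 39, 40, 64, 65, 75, 76]})
age = toy["IDADE"]

groups = {
    "Between 20 and 40 years old": toy[(age >= 20) & (age < 40)],
    "Between 40 and 65 years old": toy[(age >= 40) & (age < 65)],
    "Above 65 years old": toy[age > 64],
    "Above 75 years old": toy[age > 75],
}

group_ages = {name: subset["IDADE"].tolist() for name, subset in groups.items()}
for name, ages in group_ages.items():
    print(name, ages)
```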

In [None]:
# Stratify age sex
def stratify_age_sex(database): 
    between_20_65_F = database[(database['IDADE'] >= 20) & (database['IDADE'] < 65)  & (database['SEXO']=="F")]
    between_20_65_M = database[(database['IDADE'] >= 20) & (database['IDADE'] < 65) & (database['SEXO']=="M")]   
    over_65_F = database[(database['IDADE'] > 64) & (database['SEXO']=="F")]   
    over_65_M = database[(database['IDADE'] > 64) & (database['SEXO']=="M")]     

    dataframes = [database, between_20_65_F, between_20_65_M,over_65_F,over_65_M]
    df_names = ["All", "Women between 20 and 65 years old","Men between 20 and 65 years old","Women above 65 years old","Men above 65 years old"]

    return dataframes, df_names

In [None]:
# Stratify race
# 1:white, 2:black, 3:yellow, 4:brown, 5:indian

def stratify_race(database):
  white = database[database['RACACOR']==1.0]
  black = database[database['RACACOR']==2.0]
  yellow =  database[database['RACACOR']==3.0]
  brown =  database[database['RACACOR']==4.0]
  indian = database[database['RACACOR']==5.0]

  dataframes = [database, white, black, yellow, brown, indian]
  df_names = ["All", "White","Black","Yellow","Brown","Indian"]

  return dataframes, df_names

###**Rate ratio function**

In [None]:
def rate_ratio(db, stratify,event):
  database = db.copy()

  #subsets depending on the stratification
  if (stratify == 'age and sex'): 
    dataframes, df_names = stratify_age_sex(database)
  elif (stratify == 'sex'): 
    dataframes, df_names = stratify_sex(database)
  elif (stratify == 'age'): 
    dataframes, df_names = stratify_age(database)
  elif (stratify == 'race'): 
    dataframes, df_names = stratify_race(database)
  else:
    raise ValueError(f"Unknown stratification: {stratify!r}")
  
  #aux variable 
  list_rr = []
  list_up_ci = []
  list_lr_ci = []

  for df in dataframes:
    #column with the number of deaths on each day (0 when the day has no death record)
    df['N_deaths'] = np.where(df['CID'].isnull(),0,df.groupby(['DATE']).DATE.transform('count'))
    df = df.drop_duplicates('DATE',keep='first')
    df = df.sort_values('DATE')
      
    number_event = len(df[df[event] ==1]) # number of days with an extreme event
    number_control = len(df[df[event] ==0]) # number of days without an extreme event

    deaths_event = df.N_deaths[df[event] ==1].sum() # total number of deaths on extreme event days
    deaths_control = df.N_deaths[df[event] == 0].sum() # total number of deaths on control days

    # Rate ratio and confidence interval
    RR = round((deaths_event/number_event)/(deaths_control/number_control), 2) # compute rate ratio
    SE = math.sqrt(1/deaths_event + 1/deaths_control)

    upper_CI = round(np.exp(math.log(RR)+1.96*SE),2) #upper value
    lower_CI = round(np.exp(math.log(RR)-1.96*SE),2) #lower value

    # Append values in the list
    list_rr.append(RR)
    list_up_ci.append(upper_CI)
    list_lr_ci.append(lower_CI)

  #Create table
  table = pd.DataFrame()
  table['Group'] = df_names
  table['Rate ratio (RR)'] = list_rr
  table['Upper CI'] = list_up_ci
  table['Lower CI'] = list_lr_ci

  return table
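The per-day counting step inside `rate_ratio` can be followed on a small example. The sketch below repeats the same three operations (count rows per date, zero out days without a death record, keep one row per date) on an invented merged frame, then derives the four totals that enter the rate ratio.

```python
import numpy as np
import pandas as pd

# Invented merged frame: two deaths on Jan 1, none on Jan 2, one on Jan 3 (an event day)
merged = pd.DataFrame({
    "DATE": ["2018-01-01", "2018-01-01", "2018-01-02", "2018-01-03"],
    "LHW": [0, 0, 0, 1],
    "CID": ["I21", "I64", None, "I50"],
})

# Rows per date give the death count, except on days where CID is missing
merged["N_deaths"] = np.where(
    merged["CID"].isnull(), 0, merged.groupby("DATE")["DATE"].transform("count")
)
daily = merged.drop_duplicates("DATE", keep="first").sort_values("DATE")

days_event = int((daily["LHW"] == 1).sum())
days_control = int((daily["LHW"] == 0).sum())
deaths_event = int(daily.loc[daily["LHW"] == 1, "N_deaths"].sum())
deaths_control = int(daily.loc[daily["LHW"] == 0, "N_deaths"].sum())

print(daily[["DATE", "LHW", "N_deaths"]])
print(days_event, days_control, deaths_event, deaths_control)
```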

## **Temperature**

###**Extreme thermal range**

####**Sex**

In [None]:
rate_ratio(df, 'sex','above_temp_range')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.01,1.04,0.98
1,Women,0.99,1.03,0.95
2,Men,1.03,1.07,0.99


####**Age**

In [None]:
rate_ratio(df, 'age','above_temp_range')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.01,1.04,0.98
1,Between 20 and 40 years old,1.01,1.16,0.88
2,Between 40 and 65 years old,0.97,1.02,0.92
3,Above 65 years old,1.01,1.04,0.98
4,Above 75 years old,1.01,1.05,0.97


####**Age and sex**

In [None]:
rate_ratio(df, 'age and sex','above_temp_range')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.01,1.04,0.98
1,Women between 20 and 65 years old,0.98,1.06,0.9
2,Men between 20 and 65 years old,0.97,1.03,0.91
3,Women above 65 years old,0.99,1.04,0.94
4,Men above 65 years old,1.03,1.08,0.98


####**Race**

In [None]:
rate_ratio(df, 'race','above_temp_range')



Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.01,1.04,0.98
1,White,1.02,1.05,0.99
2,Black,0.95,1.06,0.85
3,Yellow,1.0,1.29,0.78
4,Brown,0.98,1.06,0.9
5,Indian,,,


###**Extreme temperature difference between days**

####**Sex**

In [None]:
rate_ratio(df, 'sex','above_temp_dif')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.89,0.96,0.82
1,Women,0.91,1.02,0.81
2,Men,0.84,0.94,0.75


####**Age**

In [None]:
rate_ratio(df, 'age','above_temp_dif')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.89,0.96,0.82
1,Between 20 and 40 years old,1.03,1.53,0.69
2,Between 40 and 65 years old,0.91,1.06,0.78
3,Above 65 years old,0.87,0.96,0.79
4,Above 75 years old,0.85,0.96,0.75


####**Age and sex**

In [None]:
rate_ratio(df, 'age and sex','above_temp_dif')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.89,0.96,0.82
1,Women between 20 and 65 years old,1.01,1.25,0.81
2,Men between 20 and 65 years old,0.89,1.07,0.74
3,Women above 65 years old,0.89,1.01,0.78
4,Men above 65 years old,0.85,0.98,0.74


####**Race**

In [None]:
rate_ratio(df, 'race','above_temp_dif')



Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.89,0.96,0.82
1,White,0.91,0.99,0.83
2,Black,0.87,1.23,0.61
3,Yellow,1.04,1.84,0.59
4,Brown,0.82,1.03,0.65
5,Indian,,,


## **Pressure**

###**Low pressure waves**

####**Sex**

In [None]:
rate_ratio(df, 'sex','LPW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.97,1.01,0.93
1,Women,0.97,1.03,0.91
2,Men,1.0,1.06,0.94


####**Age**

In [None]:
rate_ratio(df, 'age','LPW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.97,1.01,0.93
1,Between 20 and 40 years old,1.05,1.31,0.84
2,Between 40 and 65 years old,0.97,1.05,0.89
3,Above 65 years old,0.97,1.02,0.92
4,Above 75 years old,0.93,1.0,0.87


####**Age and sex**

In [None]:
rate_ratio(df, 'age and sex','LPW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.97,1.01,0.93
1,Women between 20 and 65 years old,1.01,1.15,0.89
2,Men between 20 and 65 years old,0.99,1.09,0.9
3,Women above 65 years old,0.93,1.0,0.86
4,Men above 65 years old,1.0,1.08,0.93


####**Race**

In [None]:
rate_ratio(df, 'race','LPW')



Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.97,1.01,0.93
1,White,0.97,1.02,0.92
2,Black,0.99,1.17,0.84
3,Yellow,0.95,1.42,0.64
4,Brown,1.03,1.16,0.91
5,Indian,,,


###**High pressure waves**

####**Sex**

In [None]:
rate_ratio(df, 'sex','HPW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.01,1.05,0.98
1,Women,1.0,1.05,0.95
2,Men,1.02,1.07,0.97


####**Age**

In [None]:
rate_ratio(df, 'age','HPW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.01,1.05,0.98
1,Between 20 and 40 years old,0.96,1.18,0.78
2,Between 40 and 65 years old,1.03,1.1,0.97
3,Above 65 years old,0.99,1.03,0.95
4,Above 75 years old,1.0,1.05,0.95


####**Age and sex**

In [None]:
rate_ratio(df, 'age and sex','HPW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.01,1.05,0.98
1,Women between 20 and 65 years old,0.97,1.08,0.87
2,Men between 20 and 65 years old,1.03,1.11,0.95
3,Women above 65 years old,0.98,1.04,0.92
4,Men above 65 years old,1.01,1.07,0.95


####**Race**

In [None]:
rate_ratio(df, 'race','HPW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.01,1.05,0.98
1,White,1.0,1.04,0.96
2,Black,1.03,1.18,0.9
3,Yellow,0.99,1.43,0.69
4,Brown,1.02,1.12,0.93
5,Indian,1.0,11.03,0.09


###**Extreme difference of pressure between days**

####**Sex**

In [None]:
rate_ratio(df, 'sex','above_pressure_dif')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.0,1.03,0.97
1,Women,1.0,1.04,0.96
2,Men,0.99,1.03,0.95


####**Age**

In [None]:
rate_ratio(df, 'age','above_pressure_dif')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.0,1.03,0.97
1,Between 20 and 40 years old,1.04,1.21,0.89
2,Between 40 and 65 years old,1.03,1.09,0.98
3,Above 65 years old,0.99,1.03,0.96
4,Above 75 years old,0.97,1.01,0.93


####**Age and sex**

In [None]:
rate_ratio(df, 'age and sex','above_pressure_dif')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.0,1.03,0.97
1,Women between 20 and 65 years old,1.03,1.12,0.95
2,Men between 20 and 65 years old,1.03,1.1,0.97
3,Women above 65 years old,0.99,1.04,0.94
4,Men above 65 years old,0.98,1.03,0.93


####**Race**

In [None]:
rate_ratio(df, 'race','above_pressure_dif')



Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.0,1.03,0.97
1,White,0.99,1.02,0.96
2,Black,1.0,1.12,0.9
3,Yellow,1.0,1.35,0.74
4,Brown,1.04,1.13,0.96
5,Indian,,,


## **Humidity**

###**Low humidity waves**

####**Sex**

In [None]:
rate_ratio(df, 'sex','LHW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.09,1.15,1.04
1,Women,1.07,1.15,0.99
2,Men,1.09,1.17,1.02


####**Age**

In [None]:
rate_ratio(df, 'age','LHW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.09,1.15,1.04
1,Between 20 and 40 years old,0.97,1.27,0.74
2,Between 40 and 65 years old,0.98,1.09,0.88
3,Above 65 years old,1.14,1.21,1.07
4,Above 75 years old,1.13,1.22,1.05


####**Age and sex**

In [None]:
rate_ratio(df, 'age and sex','LHW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.09,1.15,1.04
1,Women between 20 and 65 years old,0.97,1.14,0.83
2,Men between 20 and 65 years old,1.02,1.15,0.9
3,Women above 65 years old,1.08,1.17,0.99
4,Men above 65 years old,1.13,1.23,1.04


####**Race**

In [None]:
rate_ratio(df, 'race','LHW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.09,1.15,1.04
1,White,1.04,1.1,0.98
2,Black,1.06,1.29,0.87
3,Yellow,0.95,1.56,0.58
4,Brown,1.23,1.4,1.08
5,Indian,1.0,11.03,0.09


###**High humidity waves**

####**Sex**

In [None]:
rate_ratio(df, 'sex','HHW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.99,1.05,0.94
1,Women,1.03,1.12,0.95
2,Men,0.96,1.04,0.89


####**Age**

In [None]:
rate_ratio(df, 'age','HHW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.99,1.05,0.94
1,Between 20 and 40 years old,0.88,1.19,0.65
2,Between 40 and 65 years old,1.0,1.11,0.9
3,Above 65 years old,1.01,1.08,0.94
4,Above 75 years old,0.98,1.07,0.9


####**Age and sex**

In [None]:
rate_ratio(df, 'age and sex','HHW')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.99,1.05,0.94
1,Women between 20 and 65 years old,0.93,1.1,0.79
2,Men between 20 and 65 years old,0.98,1.11,0.86
3,Women above 65 years old,1.05,1.15,0.96
4,Men above 65 years old,0.92,1.02,0.83


####**Race**

In [None]:
rate_ratio(df, 'race','HHW')



Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.99,1.05,0.94
1,White,1.02,1.09,0.96
2,Black,0.94,1.18,0.75
3,Yellow,1.08,1.73,0.68
4,Brown,0.9,1.06,0.76
5,Indian,,,


###**Extreme humidity variation**

####**Sex**

In [None]:
rate_ratio(df, 'sex','above_humidity_range')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.02,1.04,1.0
1,Women,1.03,1.07,0.99
2,Men,1.02,1.05,0.99


####**Age**

In [None]:
rate_ratio(df, 'age','above_humidity_range')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.02,1.04,1.0
1,Between 20 and 40 years old,1.0,1.13,0.88
2,Between 40 and 65 years old,1.0,1.05,0.96
3,Above 65 years old,1.02,1.05,0.99
4,Above 75 years old,1.03,1.07,0.99


####**Age and sex**

In [None]:
rate_ratio(df, 'age and sex','above_humidity_range')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.02,1.04,1.0
1,Women between 20 and 65 years old,0.99,1.06,0.92
2,Men between 20 and 65 years old,0.99,1.04,0.94
3,Women above 65 years old,1.03,1.07,0.99
4,Men above 65 years old,1.01,1.05,0.97


####**Race**

In [None]:
rate_ratio(df, 'race','above_humidity_range')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,1.02,1.04,1.0
1,White,1.03,1.06,1.0
2,Black,1.02,1.12,0.93
3,Yellow,0.98,1.25,0.77
4,Brown,0.98,1.05,0.91
5,Indian,1.0,11.03,0.09


###**Extreme humidity difference between days**

####**Sex**

In [None]:
rate_ratio(df, 'sex','above_humidity_dif')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.99,1.07,0.92
1,Women,1.0,1.12,0.9
2,Men,0.94,1.05,0.84


####**Age**

In [None]:
rate_ratio(df, 'age','above_humidity_dif')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.99,1.07,0.92
1,Between 20 and 40 years old,0.91,1.43,0.58
2,Between 40 and 65 years old,0.98,1.13,0.85
3,Above 65 years old,0.98,1.08,0.89
4,Above 75 years old,0.96,1.08,0.85


####**Age and sex**

In [None]:
rate_ratio(df, 'age and sex','above_humidity_dif')

Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.99,1.07,0.92
1,Women between 20 and 65 years old,0.93,1.16,0.74
2,Men between 20 and 65 years old,1.0,1.19,0.84
3,Women above 65 years old,0.98,1.11,0.86
4,Men above 65 years old,0.98,1.13,0.85


####**Race**

In [None]:
rate_ratio(df, 'race','above_humidity_dif')



Unnamed: 0,Group,Rate ratio (RR),Upper CI,Lower CI
0,All,0.99,1.07,0.92
1,White,1.03,1.12,0.95
2,Black,0.99,1.38,0.71
3,Yellow,0.95,1.83,0.49
4,Brown,0.86,1.1,0.67
5,Indian,,,


## **References**

CENTERS FOR DISEASE CONTROL AND PREVENTION (CDC). Principles of Epidemiology in Public Health Practice, Third Edition: An Introduction to Applied Epidemiology and Biostatistics. Available at: <https://www.cdc.gov/csels/dsepd/ss1978/lesson3/section5.html>.


BOSTON UNIVERSITY SCHOOL OF PUBLIC HEALTH (SPH). Rate Ratios. Available at: <https://sphweb.bumc.bu.edu/otlt/MPH-Modules/PH717-QuantCore/PH717_ComparingFrequencies/PH717_ComparingFrequencies9.html>.


