In [9]:
import pandas as pd
from src.models.panel import FixedEffects
from src.models.linreg import LinReg
from src.displays.display_linear import display_models

In [18]:
witch = pd.read_csv('../data/witch_killing.csv')

witch['tot_m'] = witch['witch_murders'] + witch['oth_murders']
print(f"The dataset has {witch.shape[0]} rows and {witch.shape[1]} columns ")
witch.head()

The dataset has 736 rows and 10 columns 


Unnamed: 0,vid,year,witch_murders,oth_murders,any_rain,any_disease,famine,educat,norelig,tot_m
0,3192,1992,0,0,0,0,0,3.25,0.9,0
1,3153,1992,0,0,0,0,0,4.78125,0.78125,0
2,1041,1992,1,0,1,0,1,4.666666,0.625,1
3,1063,1992,0,0,0,0,0,4.75,0.6875,0
4,2092,1992,0,0,0,1,0,4.2,0.666667,0


This dataset was taken from Edward Miguel's very interesting paper, "Poverty and Witch Killing" (http://emiguel.econ.berkeley.edu/wordpress/wp-content/uploads/2021/03/Paper__Poverty_and_Witch_Killing.pdf).

The dataset is a panel containing information on witch killings in Tanzania from 1992 to 2002. It aggregates data at the village level and captures covariates of interest for each village. Specifically, it records witch murders in a year, other murders, an indicator for drought or flood, an indicator for the presence of disease, and an indicator for famine. It also records the average number of years of schooling in the population and the percent practicing traditional religions.

While we will not be replicating the full study here, we will use this dataset to explore a causal inference technique called "Fixed Effects". 
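
As a minimal sketch of what a fixed-effects estimator does under the hood (the `FixedEffects` class used in this notebook may be implemented differently), the within transformation subtracts each group's mean from the outcome and the regressor, then runs ordinary least squares on the demeaned values. The toy numbers and the helpers `group_means`, `within_slope` and `pooled_slope` are made up for illustration and are not taken from the witch-killing data.

```python
from collections import defaultdict


def group_means(values, groups):
    """Return a dict mapping each group label to the mean of its values."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return {g: totals[g] / counts[g] for g in totals}


def within_slope(y, x, groups):
    """One-regressor fixed-effects (within) estimate of the slope of y on x."""
    y_bar = group_means(y, groups)
    x_bar = group_means(x, groups)
    # Demean within each group; this removes any group-specific intercept.
    y_dm = [yi - y_bar[g] for yi, g in zip(y, groups)]
    x_dm = [xi - x_bar[g] for xi, g in zip(x, groups)]
    num = sum(a * b for a, b in zip(x_dm, y_dm))
    den = sum(a * a for a in x_dm)
    return num / den


def pooled_slope(y, x):
    """OLS slope with one common intercept (every row in the same group)."""
    return within_slope(y, x, [0] * len(y))


# Hypothetical toy panel: y = year effect + 2 * x, with a year effect of 0 in
# 1992 and 10 in 1993. x is more often 1 in 1993, so ignoring the year effect
# confounds the slope.
year = [1992, 1992, 1992, 1992, 1993, 1993, 1993, 1993]
x = [0, 1, 0, 1, 1, 1, 1, 0]
y = [0, 2, 0, 2, 12, 12, 12, 10]

ols_slope = pooled_slope(y, x)       # biased upward by the year effect
fe_slope = within_slope(y, x, year)  # recovers the true slope of 2
print(f"pooled OLS slope: {ols_slope:.3f}, year fixed-effects slope: {fe_slope:.3f}")
```

In this toy example the pooled estimate is pulled away from the true slope because the regressor is correlated with the year effect, while demeaning by year removes that effect entirely.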

In [19]:
"""
Let's start by considering the causes of total murders in a village. We will start with a naive model and compare it to one with year-level fixed effects and one with these effects and clustered standard errors.

"""

base = LinReg(df = witch,
              outcome='tot_m',
              independent=['any_rain'])
fe1 = FixedEffects(df = witch,
                   outcome='tot_m',
                   independent=['any_rain'],
                   fixed = ['year'])

fe1_robust = FixedEffects(df = witch,
                          outcome='tot_m',
                          independent=['any_rain'],
                          fixed = ['year'],
                          standard_error_type='clustered')

display_models([base, fe1, fe1_robust])

Looking at the table above, we don't see any statistically significant relationship between extreme rainfall (`any_rain`) and annual murders. Still, the specification matters: year fixed effects absorb shocks common to all villages in a given year, which guards against bias from those shocks, and cluster-robust standard errors remain valid when errors are correlated within clusters. Let's add some more covariates.
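
To make the difference between classical and cluster-robust standard errors concrete, here is a minimal sketch for a single regressor with no intercept (for example, after demeaning). It uses the basic sandwich formula without any small-sample correction, so its numbers would not match a library that applies one. The data, the cluster labels and the helper `slope_and_ses` are hypothetical; which variable `standard_error_type='clustered'` clusters on in this notebook is not shown here.

```python
import math
from collections import defaultdict


def slope_and_ses(y, x, clusters):
    """Return (slope, classical SE, cluster-robust SE) for y = beta * x + e."""
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    resid = [yi - beta * xi for xi, yi in zip(x, y)]

    # Classical SE: assumes independent errors with a common variance.
    n = len(y)
    sigma2 = sum(e * e for e in resid) / (n - 1)
    se_classical = math.sqrt(sigma2 / sxx)

    # Cluster-robust SE: sum x * residual within each cluster first, then
    # square, so correlation inside a cluster is not averaged away.
    score = defaultdict(float)
    for xi, e, g in zip(x, resid, clusters):
        score[g] += xi * e
    meat = sum(s * s for s in score.values())
    se_cluster = math.sqrt(meat) / sxx  # no small-sample correction

    return beta, se_classical, se_cluster


# Hypothetical data: four clusters of two rows each, with identical errors
# inside every cluster (perfect within-cluster correlation).
clusters = ["a", "a", "b", "b", "c", "c", "d", "d"]
x = [1, 1, -1, -1, 1, 1, -1, -1]
y = [2, 2, 0, 0, 0, 0, -2, -2]

beta, se_classical, se_cluster = slope_and_ses(y, x, clusters)
print(f"slope={beta:.3f}, classical SE={se_classical:.3f}, clustered SE={se_cluster:.3f}")
```

Because the errors here move together inside each cluster, the classical formula treats eight rows as eight independent pieces of information and understates the uncertainty; the clustered formula effectively counts four.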

In [23]:


fe2_robust = FixedEffects(df = witch,
                          outcome='tot_m',
                          independent=['any_rain', 
                                       "educat",
                                       "norelig"],
                          fixed = ['year'],
                          standard_error_type='clustered')

display_models([fe2_robust])

When we control for a village's average education and the share of its population practicing traditional religions, extreme rainfall (`any_rain`) is again not statistically significantly related to total murders. Interestingly, adherence to traditional religion is not statistically significantly related to a village's murders either, but education is.

In [24]:
"""Let's run the same analysis for the presence of disease."""

fe3_robust = FixedEffects(df = witch,
                          outcome='tot_m',
                          independent=['any_disease'],
                          fixed = ['year'],
                          standard_error_type='clustered')

fe4_robust = FixedEffects(df = witch,
                          outcome='tot_m',
                          independent=['any_disease',
                                       "educat",
                                       "norelig"],
                          fixed = ['year'],
                          standard_error_type='clustered')

display_models([fe2_robust, fe3_robust, fe4_robust])