In [None]:
library(data.table)
library(lmtest)
library(sandwich)
library(stargazer)
options(repr.plot.width=4, repr.plot.height=4)

# Teacher Incentives and Student Outcomes 
This data is from a really interesting paper written by [Karthik Muralidharan and Venkatesh Sundararaman](https://www.jstor.org/stable/10.1086/659655?seq=1#metadata_info_tab_contents) about how teachers respond to pay incentives in India. 

Here is the abstract from the paper: 

> We present results from a randomized evaluation of a teacher performance pay program implemented across a large representative sample of government-run rural primary schools in the Indian state of Andhra Pradesh. At the end of 2 years of the program, students in incentive schools performed significantly better than those in control schools by 0.27 and 0.17 standard deviations in math and language tests, respectively. We find no evidence of any adverse consequences of the program. The program was highly cost effective, and incentive schools performed significantly better than other randomly chosen schools that received additional schooling inputs of a similar value.

In [None]:
d = fread('./performance_pay_replication/data/dta/Incentives_JPE_HTEs.csv')

There is a lot of data in this data frame, and the authors have shared all the data they collected. Let's focus on the following variables only: 

- `cheaters_y2`: people who cheated in the study window;
  we don't want their data
- `y2_nts_level_mean`: the DV, a standardized test score
- `y0_nts`: the individual's national test score from the prior year
- `incentive`: a treatment indicator -- was the school in control or any treatment 
- `school_treatment`: a treatment indicator -- was the school in control, a group treatment, or an individual treatment
- `parent_literacy_index`: the 1-4 indicators for parental literacy
- `hh_affluence_index`: the 1-7 affluence index 
- `U_MC`: the mandal
- `apfschoolcode`: the school code (which we will eventually use
  to cluster the standard errors).
  
The rest of the data, as well as their codebook, and analysis files are available [here](link). 

In [None]:
d[1:10, .(cheaters_y2, y2_nts_level_mean, y0_nts, incentive, 
          parent_literacy_index, hh_affluence_index, U_MC, apfschoolcode)]

In [None]:
d[ , table(school_treatment, incentive)]

## Exploratory Data Analysis
Before beginning your analysis of the experiment, *per se*, check to see if there are any strange or outlying values in these data fields. At the same time, check that you understand the level of information that is coded in each data field (e.g. interval, ratio, categorical) and how it is distributed. 

In [None]:
d[ , table(cheaters_y2)]

In [None]:
histogram <- d[ , hist(y2_nts_level_mean, col = 'black')]
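
As a minimal sketch of these checks, the block below uses a small made-up table (`toy` is hypothetical, not the study data). It reports each column's storage class and number of distinct values, which hints at the measurement level, and flags values that sit far from the median. The three-MAD cutoff is an illustrative assumption, not a rule from the paper.

```r
library(data.table)

# Hypothetical toy data standing in for `d`; the values are made up.
toy <- data.table(
  y2_nts_level_mean     = c(-0.8, 0.1, 0.4, 1.9, 7.5),
  parent_literacy_index = c(1, 2, 4, 3, 2),
  incentive             = c(0, 1, 1, 0, 1)
)

# Storage class and number of distinct values for each column.
col_summary <- data.table(
  column     = names(toy),
  class      = sapply(toy, class),
  n_distinct = sapply(toy, uniqueN)
)
col_summary

# Flag outcome values more than 3 median absolute deviations from the median.
# The cutoff of 3 is an assumption chosen for illustration.
outlier <- toy[ , abs(y2_nts_level_mean - median(y2_nts_level_mean)) >
                  3 * mad(y2_nts_level_mean)]
toy[outlier]
```

A column with only a handful of distinct values, such as an index, is usually better tabulated than plotted as a histogram.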

Make a determination about what you want to do for missingness across the dataset. There is a _lot_ of itemwise missingness -- people who weren't at school the day of the tests, for example. 

Although it isn't generally a sound practice, for the concision of this code, **go ahead and drop these observations**. 

In [None]:
d_no_na <- na.omit(d, cols = c('y2_nts_level_mean', 'y0_nts', 'cheaters_y2'))
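
Before dropping rows, it can help to see how much is missing in each column. The sketch below does this on a small made-up table (`toy` is hypothetical, and the 0/1 coding of `cheaters_y2` is an assumption). It also shows that `na.omit` with `cols` removes only rows with an NA in the named columns; it does not remove rows based on their values.

```r
library(data.table)

# Hypothetical toy data standing in for `d`; the values are made up.
toy <- data.table(
  y2_nts_level_mean = c(0.3, NA, -0.1, 1.2, NA),
  y0_nts            = c(0.1, 0.4, NA, 0.9, 0.2),
  cheaters_y2       = c(0, 0, 0, 1, NA)   # assumed coding: 1 = flagged
)

# Count missing values in each column before dropping anything.
na_counts <- toy[ , lapply(.SD, function(x) sum(is.na(x)))]
na_counts

# Drop rows that have an NA in any of the named columns.
toy_no_na <- na.omit(toy, cols = c('y2_nts_level_mean', 'y0_nts', 'cheaters_y2'))

# Number of rows dropped.
nrow(toy) - nrow(toy_no_na)

# A row flagged as a cheater but with no missing values is still present.
toy_no_na[ , table(cheaters_y2)]
```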

# Analysis of Experiment 
We'll write the first regression for you, but the rest are up to you! 

Let's focus only on those individuals who didn't cheat in the second period. Among this set of people who fairly took the test, what is the causal effect of having a teacher who was a part of any incentive program? 

In [None]:
mod_overall <- d_no_na[ ,  lm(y2_nts_level_mean ~ incentive + y0_nts + factor(U_MC))]
test_results <- coeftest(mod_overall, vcovCL(mod_overall, d_no_na[ , apfschoolcode]))

test_results[1:4, ]

There is a clear effect of teachers being in any of the incentive conditions. Students whose teachers were in any incentive condition score 0.24 standard deviations higher than students whose teachers were not in an incentive condition. 

**What is happening in that call?**
- First, we're working with `d_no_na`, which includes only rows that don't have an NA in the outcome, the prior score, or the cheating flag. Note that `na.omit` drops missing values only; as written, the call does not subset on the value of `cheaters_y2`. 
- Second, we're transforming the column space by a *regression transform* -- we can do anything to the columns, including a regression! 
- Third, we're using the `coeftest` function from the `lmtest` package, which will pretty-print coefficients and standard errors for us; the standard errors that we are calculating (with `vcovCL` from the `sandwich` package) are clustered standard errors, since the treatment assignment happens at the *school level*. 
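
To see why clustering matters, the sketch below simulates data in which treatment is assigned by school and students in the same school share a common shock. Everything here is made up for illustration (`sim`, the effect sizes, and the number of schools are assumptions, not values from the study). It fits one regression and compares the default standard errors with school-clustered ones.

```r
library(lmtest)
library(sandwich)

set.seed(1)
n_schools  <- 40
n_students <- 25

# School identifiers, with treatment assigned at the school level.
school    <- rep(seq_len(n_schools), each = n_students)
incentive <- rep(rep(c(0, 1), length.out = n_schools), each = n_students)

# A shock shared by every student in a school induces within-school correlation.
school_shock <- rep(rnorm(n_schools, sd = 0.5), each = n_students)
y0 <- rnorm(n_schools * n_students)
y  <- 0.2 * incentive + 0.5 * y0 + school_shock + rnorm(n_schools * n_students)
sim <- data.frame(y, incentive, y0, school)

mod <- lm(y ~ incentive + y0, data = sim)

# Same point estimates, different standard errors.
classical <- coeftest(mod)
clustered <- coeftest(mod, vcov. = vcovCL(mod, cluster = sim$school))

classical
clustered
```

The coefficient estimates are identical in the two tables; only the uncertainty attached to them changes. When outcomes are correlated within schools, the default standard error on a school-level treatment tends to be too small.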

## Different Types of Treatment? 

Using a similar regression setup as above, are the effects of being in different *types* of treatment meaningfully different from one another? That is, use a regression on the variable `school_treatment` to measure the effects of the different types of treatment.  

In [None]:
mod_treatments <- d_no_na[ , lm(y2_nts_level_mean ~ school_treatment)]
# fill out the standard errors calculation and test
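
One way to compare two treatment arms is to test whether the difference between their coefficients is zero, using the clustered covariance matrix. The sketch below does this on simulated data. The arm labels, effect sizes, and the `sim` data frame are all assumptions made for illustration, the coding of `school_treatment` in the real data may differ, and the p-value uses a large-sample normal approximation.

```r
library(sandwich)

set.seed(2)
n_schools  <- 60
n_students <- 20

# Three arms assigned at the school level (hypothetical labels).
school <- rep(seq_len(n_schools), each = n_students)
arm_by_school <- rep(c('control', 'group', 'individual'), length.out = n_schools)
arm <- factor(rep(arm_by_school, each = n_students),
              levels = c('control', 'group', 'individual'))

school_shock <- rep(rnorm(n_schools, sd = 0.3), each = n_students)
y <- 0.15 * (arm == 'group') + 0.25 * (arm == 'individual') +
  school_shock + rnorm(length(school))
sim <- data.frame(y, arm, school)

mod <- lm(y ~ arm, data = sim)
V   <- vcovCL(mod, cluster = sim$school)

# Contrast vector picking out (individual - group).
contrast <- c('(Intercept)' = 0, armgroup = -1, armindividual = 1)

diff_est <- sum(contrast * coef(mod))
diff_se  <- sqrt(drop(t(contrast) %*% V %*% contrast))
z_stat   <- diff_est / diff_se
p_value  <- 2 * pnorm(-abs(z_stat))   # normal approximation

c(estimate = diff_est, se = diff_se, z = z_stat, p = p_value)
```

Comparing each arm to control answers a different question than comparing the arms to each other, so a contrast like this one is needed for the second comparison.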

## To the heart of the question: Parental Literacy

Using a similar regression setup as above, test whether receiving any incentives (`incentive`) operates differently when students' parents have different scores on `parent_literacy_index`. 

> The theory underlying this test is that when teachers are provided incentives to teach more, perhaps these incentives are particularly well-received when the students' home-lives also support learning. 

Because of the way that `parent_literacy_index` is coded, you should probably use this as a factor variable, and test appropriately for heterogeneity across this multi-level factor variable using an F-test first. 
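
The general shape of such a test is to fit a short model without the interaction and a long model with it, then compare the two. The sketch below shows this on simulated data (`sim`, `literacy`, and the effect sizes are hypothetical). It runs the classical F-test with `anova`, and also a cluster-robust version with `waldtest` from `lmtest`. The cluster-robust variant is a suggestion made here, not something the notebook prescribes.

```r
library(lmtest)
library(sandwich)

set.seed(3)
n_schools  <- 50
n_students <- 20

school    <- rep(seq_len(n_schools), each = n_students)
incentive <- rep(rep(c(0, 1), length.out = n_schools), each = n_students)

# A four-level index treated as a factor (hypothetical stand-in).
literacy <- factor(sample(1:4, length(school), replace = TRUE))

school_shock <- rep(rnorm(n_schools, sd = 0.3), each = n_students)
y <- 0.2 * incentive + school_shock + rnorm(length(school))
sim <- data.frame(y, incentive, literacy, school)

# Short model: common treatment effect. Long model: effect varies by level.
short <- lm(y ~ incentive + literacy, data = sim)
long  <- lm(y ~ incentive * literacy, data = sim)

# Classical F-test of the three interaction terms jointly.
classic_f <- anova(short, long, test = 'F')
classic_f

# The same joint test using the school-clustered covariance matrix.
robust_f <- waldtest(short, long,
                     vcov = function(m) vcovCL(m, cluster = sim$school),
                     test = 'F')
robust_f
```

Testing the interaction terms jointly first guards against picking out one significant level among several by chance.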

In [None]:
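mod_short <- ''  # placeholder: replace with the model without the interaction
mod_long <- ''   # placeholder: replace with the model that adds the interaction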

anova(mod_short, mod_long, test = 'F')

In [None]:
stargazer(mod_short, mod_long, type = 'text')

What do you conclude about the effectiveness of these treatments? Do they work differently when supported by parents who read well than parents who do not read well? 

## To the heart of the question: Household Affluence 

Using a similar regression setup as above, test whether receiving any incentives (`incentive`) operates differently when students' households have different scores on `hh_affluence_index`. 

> The theory underlying this test is that when teachers are provided incentives to teach more, perhaps these incentives are particularly well-received when the students' home-lives also support learning. 

Because of the way that `hh_affluence_index` is coded, you should probably use this as a factor variable, and test appropriately for heterogeneity across this multi-level factor variable using an F-test first. 

What do you conclude about the effectiveness of these treatments? Do they work differently when supported by households that are relatively wealthy rather than relatively poor? 