## Evaluating an RCT

Welcome to the final lab!

In today's lab, we'll replicate some results from a recent randomized controlled experiment.

In the Second Liberian Civil War, which lasted from 1999 to 2003, many children were recruited as soldiers.  By 2009, many of these ex-fighters were working in illegal industries or as mercenaries.

An organization called Action On Armed Violence wanted to know whether a program of training and monetary aid could help ex-fighters reintegrate into Liberian society.  Such programs have had mixed results in other contexts, but most of the available evidence was from observational studies, not RCTs.  The organization identified a candidate group of ex-fighters, randomly assigned them to treatment and control groups, and offered its assistance program to the treatment group.

Our goal is to determine whether the assistance program actually improved outcomes for the people in the experiment.

The data are available [here](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/11R0LX&version=2.1) and the paper [here](https://www.povertyactionlab.org/sites/default/files/publications/994_138_Can-Employment-Reduce-Lawlessness-and-Rebellion_June2015.pdf).  The data come straight from the first link; we haven't modified them at all.

You should also be aware that there are many additional questions that go into the design and analysis of an RCT.  For example, some of the people in the experimental group declined to participate, which could become a confounding factor in the analysis if not handled properly.  (Why?)  Many details like that are discussed in the paper.

In [75]:
# Run this cell to set up the notebook, but please don't change it.

# These lines import the Numpy and Datascience modules.
import numpy as np
from datascience import *

# These lines do some fancy plotting magic.
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
import warnings
warnings.simplefilter('ignore', FutureWarning)

# These lines load the tests.
from client.api.notebook import Notebook
ok = Notebook('lab15.ok')

Run the following cell to load the data in a table called `raw`.  **Do not attempt to look at the table.**  It has so many columns that your browser may crash!  If you want, you can use `raw.labels` to see a list of the columns in the table.  `sorted(raw.labels)` will show them sorted in alphabetical order.

In [76]:
# zipfile ships with Python's standard library, so nothing needs to be installed.
import zipfile
with zipfile.ZipFile("aoav.zip", 'r') as f:
    f.extractall(".")
    print("Successfully extracted dataset.")
raw = Table.read_table("analysis_final.tsv", sep='\t', low_memory=False)

## Cleaning the data
The `raw` table presents a few challenges.  We should clean it up a bit before we proceed.  Here are some problems:

1. There are thousands of columns in the table, which makes the table difficult to read.  For our analysis, we're going to look only at two of those columns:
  * `assigned_final`: Whether the person was in the treatment or the control group.  1 means treatment, and 0 means control.
  * `raiseanintfut_dum_end`: Whether the person expressed interest in raising animals at the end of the trial period.  1 means they expressed interest, and 0 means they didn't.  (`"raiseanintfut"` is short for "raise animals in the future."  `"dum"` is short for "dummy," which just means the variable is coded as 0 or 1.  `"end"` means the variable was measured at the end of the experiment.)

2. Some of the people didn't report their interest.  Their `raiseanintfut_dum_end` values are recorded as `nan`, which stands for "not a number."  This will cause a problem for any analysis of that column.  Because any comparison with `nan` is false, those rows can be removed by choosing only rows where `raiseanintfut_dum_end` is greater than or equal to 0.

3. The existing column names are terrible.  They should be called `"Treatment"` and `"Interested in raising animals"`.
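The filtering trick in item 2 works because `nan` fails every ordering comparison.  Here is a minimal sketch of that behavior using a plain NumPy array of made-up values (the array and variable names are hypothetical, not part of the dataset):

```python
import numpy as np

# Made-up responses: 1 = interested, 0 = not interested, nan = no answer.
responses = np.array([1.0, 0.0, np.nan, 1.0, np.nan, 0.0])

# nan >= 0 evaluates to False, so this mask is False exactly where data is missing.
answered = responses >= 0

# Keep only the entries where the mask is True.
cleaned = responses[answered]

print(cleaned)
print(np.isnan(cleaned).any())
```

The same idea carries over to a table: keeping rows where the column is at least 0 silently drops the rows with missing values.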

### Question 1
Produce a table called `farming` that is a cleaned-up version of `raw`.  You should address all three of the issues described above.

In [77]:
farming = (raw.select('assigned_final', 'raiseanintfut_dum_end') #SOLUTION
              .where(1, are.above_or_equal_to(0)) #SOLUTION
              .relabeled(0, "Treatment") #SOLUTION
              .relabeled(1, "Interested in raising animals")) #SOLUTION
farming

### Question 2
Find the proportion of people interested in raising animals in the treatment and control groups.  **Find a way to produce a table containing these values with a single call to the `group` method.**  Name your table `proportions`.

*Hint:* If two people are 1s and three are 0s, then .4 of the people are 1s.  That's also the *average* of the array [1, 1, 0, 0, 0].
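To see the idea behind the hint on a different, made-up array, the short sketch below computes the proportion of 1s two ways: by counting directly and by taking the mean.  The array and variable names are illustrative only.

```python
import numpy as np

# Made-up 0/1 answers: three people said yes (1), one said no (0).
answers = np.array([1, 0, 1, 1])

# Proportion of 1s, computed by counting.
proportion_of_ones = np.count_nonzero(answers) / len(answers)

# The mean of a 0/1 array adds up the 1s and divides by the length,
# which is the same calculation.
average = np.mean(answers)

print(proportion_of_ones, average)
```

This is why an averaging function can stand in for "proportion" whenever a column is coded as 0 or 1.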

In [78]:
proportions = farming.group("Treatment", np.mean) #SOLUTION
proportions

### Question 3
What pattern do you observe in the data?

**SOLUTION:** Almost 5 percentage points more of the treatment group than the control group are interested in raising animals.

## Is there really an effect?

Individuals were assigned randomly to the treatment and control groups.  It is therefore possible that the pattern you described is due to chance.

### Question 4
Define appropriate null and alternative hypotheses that you could test to guard against this possibility.

**SOLUTION:** Null hypothesis: The treatment has no effect, so the difference in proportions is due to random assignment.  Alternative hypothesis: The treatment has an effect.

### Question 5
You should be able to test your null hypothesis by repeatedly taking samples of a certain size from `farming`.  What size should that be?  Calculate it using Python code.

In [79]:
sample_size = farming.where("Treatment", are.equal_to(1)).num_rows #SOLUTION
sample_size

In [80]:
_ = ok.grade('q5')

Now that we've determined the sample size, how can we run many simulations under the null hypothesis? One approach is to create 1 sample at a time, and then compute the test statistic for that sample.

Running simulations is a multi-step process.  Sometimes when writing code that takes multiple steps, it's useful to work backwards from the end.  Write code *assuming* that you've already defined another function that's useful.  Then you can write that function.
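As a rough illustration of the "one sample at a time" approach, here is a sketch that uses plain NumPy and made-up data rather than the `farming` table.  The outcome array, the group size, and the function name `simulate_one_statistic` are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical 0/1 outcomes for everyone in a made-up experiment.
outcomes = np.array([1] * 30 + [0] * 70)
group_size = 40  # hypothetical size of the treatment group

def simulate_one_statistic(outcomes, group_size):
    """Draw one random group without replacement and return its proportion of 1s."""
    group = rng.choice(outcomes, size=group_size, replace=False)
    return np.mean(group)

# Repeat the single-sample step many times to build up a distribution.
simulated = np.array([simulate_one_statistic(outcomes, group_size)
                      for _ in range(1000)])

print(simulated[:5])
```

Each call mimics one random assignment of people to a group of the given size, so the collection of results shows how much the statistic varies by chance alone.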

### Question 6
Write the function `sample_once` below. *Assume* that you have previously defined a function called `test_statistic_function`, which is a function that computes the test statistic.  Your function `sample_once` takes a single argument, which is an integer that specifies how large we want our sample to be. 

Think about whether you want to sample the `farming` table with or without replacement!

*Note:* Because `test_statistic_function` isn't defined, calling your function will result in an error.  We'll fix that next.

In [81]:
def sample_once(sample_size): 
    return test_statistic_function(farming.sample(sample_size, with_replacement=False)) #SOLUTION

In [83]:
_ = ok.grade('q6')

### Question 7

Now, define `test_statistic_function` so that your `sample_once` function actually works! Think about what `test_statistic_function` should take in as an input.

In [89]:
# Write your test_statistic_function function here.
def test_statistic_function(tbl):
    return np.mean(tbl.column('Interested in raising animals')) #SOLUTION

In [90]:
np.round(test_statistic_function(farming), 3)

In [91]:
_ = ok.grade('q7')

### Question 8

Finally, define a function called `show_null_test_stats` that runs `num_simulations` simulations, each drawing a sample of size `sample_size`. It should then draw a histogram of the results, showing the distribution of test statistics simulated under the null hypothesis.

If you're having trouble with this, take a look at the previous lectures, or take a look at the extra section 8 worksheet.

In [95]:
def show_null_test_stats(num_simulations, sample_size):
    trys = Table().with_columns("Sample size", np.repeat(sample_size, num_simulations)) #SOLUTION
    simulations = trys.with_column("Proportion interested in raising animals", trys.apply(sample_once, "Sample size")) #SOLUTION
    simulations.hist("Proportion interested in raising animals", bins=35) #SOLUTION

### Question 9
Take a look at the histogram generated by running the cell below. What can you conclude? 

In [96]:
show_null_test_stats(10000, sample_size)
observed_test_stat = test_statistic_function(farming.where("Treatment", are.equal_to(1)))
plt.scatter(observed_test_stat, 0, color='red', s=30);

**SOLUTION:** The observed proportion interested in raising animals would be quite unlikely if the null hypothesis were true.  It seems reasonable to reject it.  It's not way out in the tail of the distribution, though, so we probably shouldn't claim that we're *very confident* in rejecting the null hypothesis.
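One way to put a number on "quite unlikely" is an empirical p-value: the fraction of simulated statistics at least as large as the observed one.  The sketch below uses made-up stand-ins for the simulated and observed values, not the lab's actual results, and it assumes a one-sided comparison; a two-sided version would also count statistics equally far below the center.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is repeatable

# Hypothetical stand-ins for the lab's values (NOT the real results).
simulated_stats = rng.normal(loc=0.30, scale=0.02, size=10_000)
observed_stat = 0.34

# Fraction of simulated statistics at least as extreme as the observed one.
at_least_as_large = np.count_nonzero(simulated_stats >= observed_stat)
empirical_p_value = at_least_as_large / len(simulated_stats)

print(empirical_p_value)
```

A small value means few simulations under the null hypothesis produced a statistic as large as the one observed.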

In [97]:
# For your convenience, you can run this cell to run all the tests at once.
_ = ok.grade_all()

In [98]:
# Run this cell to submit your work.
# You can submit as many times as you want.  If you want us to grade a
# submission other than your most recent one, you can choose which submission
# is graded at https://okpy.org/cal/data8r/su17/ .

_ = ok.submit()