# The Golden Standard

In the previous session, we saw why and how association differs from causation. We also saw what is required for association to be causation.

$
E[Y|T=1] - E[Y|T=0] = \underbrace{E[Y_1 - Y_0|T=1]}_{ATET} + \underbrace{\{ E[Y_0|T=1] - E[Y_0|T=0] \}}_{BIAS}
$


We saw that association is causation if there is no bias, and there is no bias if \\(E[Y_0|T=0]=E[Y_0|T=1]\\). In words, association is causation if the treated and control groups are equal, or comparable, except for the treatment they receive. Put another way, there is no bias when the outcome of the untreated equals the counterfactual outcome of the treated, that is, the outcome the treated would have had if they had not received the treatment.

I think we did a pretty good job explaining in math terms how to make association equal to causation. But that was only in theory. Now, we look at the first tool we have to make the bias vanish: **Randomised Experiments**. Randomised experiments consist of randomly assigning individuals in a population to a treatment group or to a control group. Note that the proportion that receives the treatment doesn't have to be 50%. You could have an experiment where only 10% of your sample gets the treatment.

Randomisation annihilates bias by making the potential outcomes independent of the treatment.

$
(Y_0, Y_1)\perp T
$

This can be confusing at first. If the outcome is independent of the treatment, doesn't it mean that the treatment has no effect? Well, yes! But notice I'm not talking about the outcomes. Rather, I'm talking about the **potential** outcomes. The potential outcomes describe what the outcome **would have been** under the treatment (\\(Y_1\\)) or under the control (\\(Y_0\\)). So, saying that the potential outcomes are independent of the treatment is saying that they would be, in expectation, the same in the treatment and the control group. In simpler terms, it means that treated and control are comparable. Consequently, \\((Y_0, Y_1)\perp T\\) means that the treatment is the only thing generating a difference between the outcome in the treated and in the control. To see this, notice that independence implies precisely that

$
E[Y_0|T=0]=E[Y_0|T=1]
$

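Which, as we've seen, removes the bias term. Independence also means that \\(E[Y_1 - Y_0|T=1] = E[Y_1 - Y_0]\\), so the ATET equals the ATE and we get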

$
E[Y|T=1] - E[Y|T=0] = E[Y_1 - Y_0]=ATE
$
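To make this less abstract, here is a minimal simulation sketch. Everything in it is made up for illustration: the `ability` variable, the constant treatment effect of -5, and the self-selection rule are assumptions, not part of the study discussed below. Because we simulate both potential outcomes, we can compute the gap in \\(E[Y_0]\\) between treated and control directly, something that is never possible with real data.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical potential outcomes: an unobserved "ability" drives the score.
ability = rng.normal(size=n)
y0 = 75 + 5 * ability + rng.normal(size=n)  # score without the treatment
y1 = y0 - 5.0                               # score with the treatment (true ATE = -5)


def compare(t):
    """Return (difference in observed means, gap in E[Y0] between treated and control)."""
    y = np.where(t == 1, y1, y0)  # only one potential outcome is observed per unit
    diff = y[t == 1].mean() - y[t == 0].mean()
    bias = y0[t == 1].mean() - y0[t == 0].mean()
    return diff, bias


# Self-selection (assumed rule): low-ability units are more likely to be treated.
t_selected = (ability + rng.normal(size=n) < -1.0).astype(int)

# Randomised assignment: about 10% of units are treated, regardless of ability.
t_random = rng.binomial(1, 0.1, size=n)

diff_sel, bias_sel = compare(t_selected)
diff_rand, bias_rand = compare(t_random)

print(f"self-selected: diff = {diff_sel:.2f}, bias = {bias_sel:.2f}")
print(f"randomised:    diff = {diff_rand:.2f}, bias = {bias_rand:.2f}")
```

Under self-selection, the difference in means mixes the true effect with a large bias. Under randomisation, the bias is close to zero and the difference in means lands near the true effect of -5, even though only about 10% of the units are treated.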

As an example, let's say you are responsible for the educational program of a school. You want to know the impact of switching from face-to-face classes to online classes. Fortunately, some economists ran a randomised experiment where some classes were randomly assigned to have face-to-face lectures, others to have only online lectures, and a third group to have a blended format of both online and face-to-face lectures. At the end of the semester, they collected data on an exam that is the same for all groups.

Here is what the data looks like:

In [3]:
import pandas as pd
import numpy as np

data = pd.read_csv("./data/online_classroom.csv")
print(data.shape)
data.head()

(323, 10)


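| | gender | asian | black | hawaiian | hispanic | unknown | white | format_ol | format_blended | falsexam |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0 | 0.0 | 63.29997 |
| 1 | 1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0 | 0.0 | 79.96 |
| 2 | 1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0 | 1.0 | 83.37 |
| 3 | 1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0 | 1.0 | 90.01994 |
| 4 | 1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1 | 0.0 | 83.3 |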


We can see that we have 323 samples. It's not exactly big data, but it is something we can work with. To estimate the causal effect, we can simply compute the mean score for each of the treatment groups.

In [4]:
(data
 .assign(class_format = np.select(
     [data["format_ol"].astype(bool), data["format_blended"].astype(bool)],
     ["online", "blended"],
     default="face_to_face"
 ))
 .groupby(["class_format"])
 .mean())

| class_format | gender | asian | black | hawaiian | hispanic | unknown | white | format_ol | format_blended | falsexam |
|---|---|---|---|---|---|---|---|---|---|---|
| blended | 0.550459 | 0.217949 | 0.102564 | 0.025641 | 0.012821 | 0.012821 | 0.628205 | 0.0 | 1.0 | 77.093731 |
| face_to_face | 0.633333 | 0.20202 | 0.070707 | 0.0 | 0.010101 | 0.0 | 0.717172 | 0.0 | 0.0 | 78.547485 |
| online | 0.542553 | 0.228571 | 0.028571 | 0.014286 | 0.028571 | 0.0 | 0.7 | 1.0 | 0.0 | 73.635263 |


Yup. It's that simple. We can see that face-to-face classes yield an average score of 78.55, while online classes yield an average score of 73.64. The \\(ATE\\) for online classes is thus -4.91. This means that online classes cause students to perform about 5 points lower, on average. That is it. You don't need to worry that online classes might have poorer students who can't afford face-to-face classes or, for that matter, that the students in the different treatments differ in any way other than the treatment they received. By design, the randomised experiment wipes out those differences.
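If you want to see the subtraction spelled out, here is a small sketch that takes the mean exam scores from the table of group means above and compares each format against face-to-face, which plays the role of the control group. The variable names are mine, chosen just for this illustration.

```python
import pandas as pd

# Mean exam scores (falsexam) copied from the table of group means above.
mean_score = pd.Series(
    {"blended": 77.093731, "face_to_face": 78.547485, "online": 73.635263},
    name="falsexam",
)

# Difference in means of each format against the face-to-face (control) group.
ate_vs_face_to_face = (mean_score - mean_score["face_to_face"]).drop("face_to_face")

print(ate_vs_face_to_face.round(2))
```

The online estimate comes out at roughly -4.91, as stated above, and the same subtraction for the blended format gives roughly -1.45.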

For this reason, a good sanity check that the randomisation was done right (or that you are looking at the right data) is to check whether the treated are equal to the control in pre-treatment variables. In our data, we have information on gender and ethnicity, so we can see if they are equal across groups. For the `gender`, `asian`, `hispanic` and `white` variables, we can say that they look pretty similar. The `black` variable, however, looks a little bit different. This draws attention to what happens with a small dataset. Even under randomisation, it could be that, by chance, one group is different from another. In large samples, this difference tends to disappear.
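To get a feel for how much imbalance pure chance can produce, here is a rough simulation sketch. It is not a reanalysis of the study: the 7% prevalence of the binary covariate, the three equally likely groups, and the helper name `max_imbalance` are all assumptions picked for illustration. It randomly assigns units to groups many times and records the largest gap in the covariate's share between any two groups.

```python
import numpy as np

rng = np.random.default_rng(0)


def max_imbalance(n, prevalence=0.07, n_groups=3, n_experiments=200):
    """Average, over simulated experiments, of the largest gap in a binary
    covariate's share between any two randomly assigned groups."""
    gaps = np.empty(n_experiments)
    for i in range(n_experiments):
        covariate = rng.binomial(1, prevalence, size=n)  # pre-treatment trait
        group = rng.integers(0, n_groups, size=n)        # random assignment
        shares = np.array([covariate[group == g].mean() for g in range(n_groups)])
        gaps[i] = shares.max() - shares.min()
    return gaps.mean()


small = max_imbalance(n=323)     # same sample size as our data
large = max_imbalance(n=32_300)  # a sample 100 times larger

print(f"typical largest gap with n=323:    {small:.3f}")
print(f"typical largest gap with n=32,300: {large:.3f}")
```

With a few hundred units, gaps of a few percentage points show up routinely even though the assignment is perfectly random. With a sample a hundred times larger, the typical gap shrinks by about a factor of ten.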

## The Ideal Experiment

Randomised experiments are the most reliable way to get causal effects. If we could, they would be all we would ever do to uncover causality. Unfortunately, they tend to be either very expensive or just plain unethical. Sometimes, we simply can't control the assignment mechanism.

Imagine yourself as a physician trying to estimate the effect of smoking during pregnancy on baby weight at birth. You can't simply force a random portion of moms to smoke during pregnancy. Or say you work for a big bank and you need to estimate the impact of the credit line on customer churn. It would be simply too expensive to give random credit lines to your customers. Or suppose you want to understand the impact of increasing the minimum wage on unemployment. You can't simply assign countries to have one or another minimum wage.

We will later see how to lower the randomisation cost by using conditional randomisation, but there is nothing we can do about unethical or unfeasible experiments. Still, whenever we deal with causal questions, it is worth thinking about the **ideal experiment**. Always ask yourself, if you could, **what would be the ideal experiment you would run to uncover this causal effect?** This tends to shed some light on how we can uncover the causal effect even without the ideal experiment.

## The Assignment Mechanism

In a randomised experiment, the mechanism that assigns units to one treatment or the other is, well, random. As we will see later, all causal inference techniques try, in some way, to identify the assignment mechanism of the treatment. When we know for sure how this mechanism behaves, causal inference is much more certain, even if the assignment mechanism isn't random.

Unfortunately, the assignment mechanism can't be discovered by simply looking at the data. For example, if you have a dataset where higher education correlates with wealth, you can't know for sure which one caused which just by looking at the data. You will have to use your knowledge about how the world works to argue in favor of a plausible assignment mechanism. Is it the case that schools educate people, making them more productive and hence leading them to higher paying jobs? Or, if you are pessimistic about education, you can say that schools do nothing to increase productivity and that this is just a spurious correlation, because only wealthy families can afford to have a kid getting a higher degree.

In causal questions, we can usually argue both ways: that X causes Y, or that a third variable Z causes both X and Y, so the correlation between X and Y is just spurious. It is for this reason that knowing the assignment mechanism leads to a much more convincing causal answer.
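The pessimistic story can be sketched in a few lines of simulation. This is a made-up world, not evidence about education: I assume that a third variable `z` (think family wealth) drives both `x` (education) and `y` (adult wealth), and that `x` has no effect on `y` at all. All coefficients are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical world: Z causes both X and Y, and X has NO effect on Y.
z = rng.normal(size=n)

# Observational data: families choose X based on Z.
x_observed = z + rng.normal(size=n)
y_observed = 2 * z + rng.normal(size=n)
corr_observed = np.corrcoef(x_observed, y_observed)[0, 1]

# Same world, but now X is assigned at random, so it no longer depends on Z.
x_random = rng.normal(size=n)
y_random = 2 * z + rng.normal(size=n)
corr_random = np.corrcoef(x_random, y_random)[0, 1]

print(f"correlation when Z drives X:       {corr_observed:.2f}")
print(f"correlation when X is randomised:  {corr_random:.2f}")
```

In the observational version, X and Y are strongly correlated even though X does nothing. Once X is randomised, the correlation vanishes, which is exactly what we would expect when the true effect is zero. The data alone can't tell you which world you are in; knowing the assignment mechanism can.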

## Key Ideas

We looked at how randomised experiments are the simplest and most effective way to uncover causal impact. They do this by making the treatment and control groups comparable. Unfortunately, we can't do randomised experiments all the time, but it is still useful to think about what ideal experiment we would run if we could.

Someone familiar with statistics might be protesting right now that I didn't look at the variance of my causal effect estimate. How can I know that the 4.91 point decrease in online classes is not due to chance? In other words, how can I know whether it is statistically significant? And they would be right. Don't worry. I intend to review some statistical concepts next.

## References

I like to think of this entire series as a tribute to Joshua Angrist, Alberto Abadie and Christopher Walters for their amazing Econometrics class. Most of the ideas here are taken from their classes at the American Economic Association. Watching them is what is keeping me sane during this tough year of 2020.
* [Cross-Section Econometrics](https://www.aeaweb.org/conference/cont-ed/2017-webcasts)
* [Mastering Mostly Harmless Econometrics](https://www.aeaweb.org/conference/cont-ed/2020-webcasts)

I'd also like to reference the amazing books from Angrist. They have shown me that Econometrics, or 'Metrics as they call it, is not only extremely useful but also profoundly fun.

* [Mostly Harmless Econometrics](https://www.mostlyharmlesseconometrics.com/)
* [Mastering 'Metrics](https://www.masteringmetrics.com/)

My final reference is Miguel Hernan and Jamie Robins' book. It has been my trustworthy companion in the thorniest causal questions I've had to answer.

* [Causal Inference Book](https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/)

The data used here is from a study by Alpert, William T., Kenneth A. Couch, and Oskar R. Harmon. 2016. ["A Randomized Assessment of Online Learning"](https://www.aeaweb.org/articles?id=10.1257/aer.p20161057). American Economic Review, 106 (5): 378-82.