# Causal Inference Homework 2
## Alex Pine (akp258@nyu.edu)

### Problem 1 part i

Imagine an experiment where a researcher is comparing the SAT scores of charter school students to those of traditional public school students. The researcher wants to determine whether charter schools cause their students to have higher SAT scores. In this experiment, students may choose to take part in a lottery in which the winners are assigned to a charter school. Students who do not enter the lottery, or who enter and do not win, are placed in a traditional public school. Let's also imagine that students who inherently prefer charter schools are more likely to sign up for the lottery than those who have no preference, and are more likely to do well on the SAT if they attend a charter school.

Let $Y^0$ be the SAT score of each student when they are not admitted to a charter school, and let $Y^1$ be the SAT score of each student when they are admitted to a charter school.

$ D = 1 $ if a student chooses to enter the lottery and wins, and $ D = 0 $ otherwise. 

$ Y^0 \perp D $, since knowing a student's potential public school SAT score does not tell you anything about whether they applied to, or were selected for, a charter school. $ Y^1 \perp D $ for corresponding reasons.

However, $ (Y^0, Y^1) \not\perp D $, since a positive difference $ Y^1 - Y^0 > 0 $ makes it more likely that the student inherently prefers charter schools, and therefore more likely that they enter the lottery. That is, $ Y^1 - Y^0 > 0 $ makes $ D = 1 $ more likely than it would otherwise be.


### Problem 1 part ii

In this scenario, we could use the fact that $ Y^0 \perp D $ and $ Y^1 \perp D $ to compute the ATE from the naive estimator $ E[Y^*|D=1] - E[Y^*|D=0] $, where $ Y^* $ is the observed outcome, since

\begin{align}
E[Y^*|D=1] - E[Y^*|D=0] &= E[Y^1|D=1] - E[Y^0|D=0] \\
&= E[Y^1] - E[Y^0] \\
&= E[\delta]
\end{align}

The same is true for the ATT $ E[\delta|D=1] = E[Y^1|D=1] - E[Y^0|D=1] $, since

\begin{align}
E[Y^*|D=1] - E[Y^*|D=0] &= E[Y^1|D=1] - E[Y^0|D=0] \\
&= E[Y^1|D=1] - E[Y^0|D=1] \\
&= E[\delta|D=1]
\end{align}

The same is also true for the ATC $ E[\delta|D=0] = E[Y^1|D=0] - E[Y^0|D=0] $, since

\begin{align}
E[Y^*|D=1] - E[Y^*|D=0] &= E[Y^1|D=1] - E[Y^0|D=0] \\
&= E[Y^1|D=0] - E[Y^0|D=0] \\
&= E[\delta|D=0]
\end{align}

### Problem 2

Let $Y = (Y^0, Y^1)$. Unconfoundedness ($Y \perp D | X$) gives us $p(Y,D|X) = p(Y|X)p(D|X)$.

Since $X \perp D$, $p(Y|X)p(D|X) = p(Y|X)p(D)$. 

By the definition of conditional probability, $p(Y,D|X) = \frac{p(Y,D,X)}{p(X)}$.

Equating this with the previous expression, we get $\frac{p(Y,D,X)}{p(X)} = p(Y|X)p(D)$.

Multiplying through by $p(X)$ gives $p(Y,D,X) = p(Y|X)p(X)p(D) = p(Y,X)p(D)$. Marginalizing over $X$ on both sides, we get $p(Y,D) = p(Y)p(D)$, which implies $Y \perp D$.

### Problem 3

#### (i) Compute the average treatment effect assuming full randomization
If we assume full randomization of the assignment mechanism, the naive estimator is an unbiased estimator of the average treatment effect.
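
Written out, the naive estimator used in this section is a difference of sample means. The symbols $N_1$ and $N_0$ (the numbers of treated and control individuals) and the index $i$ are notation introduced here for illustration; $Y_i$ is the outcome the code uses, namely the change in earnings from 1975 to 1978:

\begin{align}
\widehat{ATE} = \frac{1}{N_1} \sum_{i : D_i = 1} Y_i - \frac{1}{N_0} \sum_{i : D_i = 0} Y_i
\end{align}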

In [3]:
treatedData = "/Users/pinesol/causal/hw2/nsw_treated.csv"
controlData = "/Users/pinesol/causal/hw2/nsw_control.csv"

data1 = read.table(treatedData, header=TRUE, sep=',')
data0 = read.table(controlData, header=TRUE, sep=',')
# Both files share the same column layout, so define the names once
col_names = c("treatment indicator", "age", "education", "Black", "Hispanic", 
              "married", "nodegree", "earnings in 1975", "earnings in 1978")
names(data1) = col_names
names(data0) = col_names

In [4]:
# earnings diff for those who were treated
y1 = data1[["earnings in 1978"]] - data1[["earnings in 1975"]]
# earnings diff for those who were not treated
y0 = data0[["earnings in 1978"]] - data0[["earnings in 1975"]]
# take the difference between their averages
ate = mean(y1) - mean(y0)
print(paste("Average Treatment Effect: ", ate))

[1] "Average Treatment Effect:  846.88828668576"


The average treatment effect, assuming a fully random assignment mechanism, is 846.89.

#### Partition the data by marital status.

Code is below:

In [17]:
data1_married = data1[data1[, "married"] == 1,, drop=FALSE]
data1_unmarried = data1[data1[, "married"] == 0,, drop=FALSE]

data0_married = data0[data0[, "married"] == 1,, drop=FALSE]
data0_unmarried = data0[data0[, "married"] == 0,, drop=FALSE]

#### (ii) Argue why it may make sense to make this weaker assumption rather than the stronger assumption of full randomization.

Computing an unbiased ATE estimator does not require $Y^0$ and $Y^1$ to be jointly independent of $D$. All you need is for each of them to be individually independent of $D$; then the naive estimator is unbiased, as illustrated here:

Naive ATE = $E[Y|D=1] - E[Y|D=0] = E[Y^1|D=1] - E[Y^0|D=0] = E[Y^1] - E[Y^0] = $ ATE.

Given this, there is no need to make the stronger assumption for this data.

#### (iii) Compute the propensity score.

The propensity score is the probability of being in the treatment group conditional on the value of a covariate, here marital status. To estimate it, pool the treated and control data, select the individuals with a given marital status, and compute the fraction of them who were treated.
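
In symbols, writing $X$ for marital status and $N_{d,x}$ for the number of individuals with $D = d$ and $X = x$ (both are notation introduced here, not taken from the assignment), the empirical propensity score is roughly:

\begin{align}
e(x) = P(D = 1 | X = x) \approx \frac{N_{1,x}}{N_{1,x} + N_{0,x}}
\end{align}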

In [18]:
num_married_chosen = sum(data1_married["treatment indicator"] == 1)
num_married_notchosen = sum(data0_married["treatment indicator"] == 0)
prop_score_married = num_married_chosen / (num_married_chosen + num_married_notchosen)

num_unmarried_chosen = sum(data1_unmarried["treatment indicator"] == 1)
num_unmarried_notchosen = sum(data0_unmarried["treatment indicator"] == 0)
prop_score_unmarried = num_unmarried_chosen / (num_unmarried_chosen + num_unmarried_notchosen)

print(paste("Propensity score for married individuals: ", prop_score_married))
print(paste("Propensity score for unmarried individuals: ", prop_score_unmarried))

[1] "Propensity score for married individuals:  0.427350427350427"
[1] "Propensity score for unmarried individuals:  0.408264462809917"


As calculated above, the propensity score is 0.43 for married individuals, and 0.41 for unmarried individuals.

#### (iv) Compute the average treatment effect assuming unconfoundedness

We use $ATE = E[ATE(X)] = p(x = 1)ATE(x = 1) + p(x = 0)ATE(x = 0)$, where $x$ is marital status.
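
As a brief sketch of why this works: under unconfoundedness, each conditional effect is identified by the naive difference within its stratum, which is why the code in this section compares treated and control means separately for married and unmarried individuals. Here $Y$ denotes the observed outcome, as in part (ii):

\begin{align}
ATE(x) &= E[Y^1 - Y^0 | X = x] \\
&= E[Y^1 | D = 1, X = x] - E[Y^0 | D = 0, X = x] \\
&= E[Y | D = 1, X = x] - E[Y | D = 0, X = x]
\end{align}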


In [28]:
num_married = nrow(data1_married) + nrow(data0_married)
num_unmarried = nrow(data1_unmarried) + nrow(data0_unmarried)
p_married = num_married / (num_married + num_unmarried)
p_unmarried = 1 - p_married

y1_married = data1_married[["earnings in 1978"]] - data1_married[["earnings in 1975"]]
y0_married = data0_married[["earnings in 1978"]] - data0_married[["earnings in 1975"]]
ate_married = mean(y1_married) - mean(y0_married)
print(paste("Average Treatment Effect (married): ", ate_married))

y1_unmarried = data1_unmarried[["earnings in 1978"]] - data1_unmarried[["earnings in 1975"]]
y0_unmarried = data0_unmarried[["earnings in 1978"]] - data0_unmarried[["earnings in 1975"]]
ate_unmarried = mean(y1_unmarried) - mean(y0_unmarried)
print(paste("Average Treatment Effect (unmarried): ", ate_unmarried))

ate = p_married*ate_married + p_unmarried*ate_unmarried
print(paste("Average Treatment Effect: ", ate))

[1] "Average Treatment Effect (married):  3767.20496677612"
[1] "Average Treatment Effect (unmarried):  302.81488260964"
[1] "Average Treatment Effect:  864.218815916396"


As shown above, the average treatment effect, after conditioning on marital status, was 864.22.
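
The stratified calculation above could also be wrapped in a small reusable helper. The following is only a minimal sketch: the function name `stratified_ate`, the column names `d`, `x`, and `y`, and the toy data are all hypothetical and made up for illustration; they are not the NSW data or the code used for the results above.

```r
# Hypothetical helper: stratified ATE estimate over a discrete covariate.
# Assumes a data frame with columns:
#   d = treatment indicator (0/1), x = covariate, y = outcome.
stratified_ate <- function(df) {
  # One sub-data-frame per covariate value
  strata <- split(df, df$x)
  # Within-stratum naive estimate: mean(treated) - mean(control)
  effects <- sapply(strata, function(s) {
    mean(s$y[s$d == 1]) - mean(s$y[s$d == 0])
  })
  # Stratum weights: share of all units that fall in each stratum
  weights <- sapply(strata, nrow) / nrow(df)
  # Weighted average of the stratum-level effects
  sum(weights * effects)
}

# Toy data, invented purely for illustration
toy <- data.frame(
  d = c(1, 0, 1, 0, 1, 0, 1, 0),
  x = c(1, 1, 0, 0, 0, 0, 0, 0),
  y = c(10, 4, 3, 2, 5, 2, 4, 2)
)
print(stratified_ate(toy))
```

On the toy data, the `x = 1` stratum has weight 0.25 and effect 6, and the `x = 0` stratum has weight 0.75 and effect 2, so the weighted average is 3. Note that the helper returns `NaN` for any stratum that lacks either treated or control units, so it assumes every stratum contains both.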