## Lab 7: Heterogeneous treatment effects, Experiments

In [38]:
import numpy as np
from datascience import Table
%matplotlib inline

## Part 1. Donor Influence/Corruption, Continued

Let's continue to explore the idea of how money may influence political decisions in a slightly different way.

Imagine a company called InfluenceCo, which needs local governments to make favorable decisions for it to be able to run its business profitably. Perhaps it is a ride-sharing company that needs cities to allow it to operate.

We will consider the decisions made by mayors of a large number of cities where InfluenceCo wants to do business. As before, we will assume that mayors vary in their "pro-business" ideology, but also may make decisions based on political donations (or outright bribes).

InfluenceCo wants mayors to support a policy that would help their business. To simplify, let's assume the policy in question is entirely at the discretion of the mayor.

First suppose that the mayors are all dedicated public servants who will only support the policy if they think it is good for their city. Let's give them a pro-business ideology that ranges from 0 to 1, and put it in a table.

In [39]:
n_mayors = 2000
mayor_ideol = np.random.rand(n_mayors) 
mayor_dataA = Table().with_column("Ideology", mayor_ideol)

A natural assumption is that those with a higher ideology are more likely to support it. Here we will capture this by assuming that the utility for supporting the policy is:
$$
u_{support} = \text{ideology} - e
$$
where $e$ is a random number between 0 and 1 (use `np.random.rand` for this). The utility to not supporting is zero.

**Question 1.1. Write code to create a variable which corresponds to the mayor's utility for supporting the policy, and a variable equal to 1 if they choose to support it and 0 otherwise. Add the binary support variable to the `mayor_dataA` table with the name "Support".**

In [40]:
#Code for 1.1
usupport = mayor_ideol - np.random.rand(n_mayors)
mayor_dataA = mayor_dataA.with_column("Support", np.where(usupport > 0, 1, 0))
mayor_dataA

Ideology,Support
0.981585,1
0.549972,1
0.773637,1
0.468783,0
0.00981098,0
0.444716,0
0.25336,1
0.465499,0
0.128345,0
0.296152,0


Last week we considered a causal process where donations and politician behavior were confounded by their ideology. Now let's think of a process of reverse causation, where politician behavior causes donations.

Here is a data generating process which captures the idea that politicians are more likely to receive donations when they support policies that donors like. In particular, suppose the utility to donating is given by:
$$
u_{donate} = .5*\text{Support} + e - .8
$$
where Support$=1$ for mayors who support the policy and $e$ is again a random number between 0 and 1. The utility to not donating is zero.
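To see what this utility implies, note that a donation happens when $e > .8 - .5 \cdot \text{Support}$, so the donation probability should be about 0.7 for supporters and 0.2 for non-supporters. The sketch below checks this by simulation. It is only an illustration: the seeded `default_rng` generator, the large sample size, and the placeholder `support` array are assumptions, not part of the lab's data.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator (assumption; the lab uses np.random.rand)
n = 200_000                      # hypothetical large sample so the shares are stable

support = rng.integers(0, 2, size=n)   # placeholder 0/1 support decisions
e = rng.random(n)                      # uniform noise on [0, 1)
donate = (0.5 * support + e - 0.8 > 0).astype(int)

p_donate_support = donate[support == 1].mean()     # theory: P(e > 0.3) = 0.7
p_donate_nosupport = donate[support == 0].mean()   # theory: P(e > 0.8) = 0.2
print(p_donate_support, p_donate_nosupport)
```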

**Question 1.2. Write code to generate a variable which indicates whether a mayor gets a donation consistent with this assumption, and add it to `mayor_dataA` with the name "Donate".**

In [41]:
u_donate = .5*mayor_dataA.column("Support") + np.random.rand(n_mayors)
mayor_dataA = mayor_dataA.with_column("Donate", 1*(u_donate > .8))
mayor_dataA

Ideology,Support,Donate
0.981585,1,1
0.549972,1,0
0.773637,1,0
0.468783,0,0
0.00981098,0,0
0.444716,0,0
0.25336,1,1
0.465499,0,1
0.128345,0,0
0.296152,0,0


**Question 1.3. Give a short theoretical explanation for why politician support decisions might cause donations which is consistent with this code.**

InfluenceCo may want to reward politicians who vote the way it likes, or help them get re-elected. In this data generating process, donors get a higher utility from giving to mayors who support the policy they like, and so are more likely to donate. It is the politician's decision which causes the donation, not the other way around.

**Question 1.4. Compute the difference in the proportion of mayors who support the policy among those who received donations vs. not. Given this data generating process, does this difference reflect a causal effect of donations on voting behavior or selection bias (or both)? That is, do mayors who vote differently do so because of the donations, or are they just different for other reasons?**

In [42]:
#Code for 1.4
s1=np.mean(mayor_dataA.where("Donate",1).column("Support"))
s0=np.mean(mayor_dataA.where("Donate",0).column("Support"))
s1-s0

0.48369446134498406

Mayors who received a donation are about 48 percentage points more likely to support the policy. However, given the way we wrote the code this is not a causal effect but selection bias driven by reverse causation: the types who received donations were more likely to support the policy because support caused an increase in donations.

Throughout this lab we will be taking many differences of means, and the table notation for this is a bit repetitive. So let's make a function that computes differences of means more efficiently. This will work for any "treatment" variable that takes on values 0 and 1.
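As a rough check on the size of this gap, the expected difference can be worked out with Bayes' rule. The sketch below assumes the probabilities implied by the data generating process: half of mayors support the policy (ideology and $e$ are independent uniforms), supporters get a donation with probability 0.7, and non-supporters with probability 0.2. The variable names are illustrative only.

```python
p_support = 0.5   # P(ideology - e > 0) for two independent uniform draws
p_don_s1 = 0.7    # P(donation | support)    = P(e > 0.3)
p_don_s0 = 0.2    # P(donation | no support) = P(e > 0.8)

# Overall donation rate
p_donate = p_don_s1 * p_support + p_don_s0 * (1 - p_support)    # 0.45

# Bayes' rule: share of supporters among donation recipients and non-recipients
p_s_given_d1 = p_don_s1 * p_support / p_donate                  # 0.35 / 0.45
p_s_given_d0 = (1 - p_don_s1) * p_support / (1 - p_donate)      # 0.15 / 0.55

expected_gap = p_s_given_d1 - p_s_given_d0
print(round(expected_gap, 3))   # 0.505
```

The simulated difference of about 0.48 is in line with this expected value of roughly 0.505, even though donations have no causal effect on support in this process.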

In [43]:
def diffmean(outcome, treatment, data):
    """Mean of `outcome` where `treatment` is 1, minus the mean where it is 0."""
    mean1 = np.mean(data.where(treatment, 1).column(outcome))
    mean0 = np.mean(data.where(treatment, 0).column(outcome))
    return mean1 - mean0
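
For readers who want to try the same calculation without a `Table`, here is a minimal NumPy-only analogue. The name `diffmean_arrays` and the toy arrays are hypothetical and are not used elsewhere in the lab.

```python
import numpy as np

def diffmean_arrays(outcome, treatment):
    """Difference in mean outcome between treatment == 1 and treatment == 0."""
    outcome = np.asarray(outcome, dtype=float)
    treatment = np.asarray(treatment)
    return outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

# Toy data: three treated units (two with outcome 1), three untreated (one with outcome 1)
toy_outcome = [1, 0, 1, 1, 0, 0]
toy_treat = [1, 1, 1, 0, 0, 0]
toy_gap = diffmean_arrays(toy_outcome, toy_treat)   # 2/3 - 1/3
print(toy_gap)
```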

**Question 1.5. Compute the same difference of means from Question 1.4 using the `diffmean` function. (Recall that for a variable that takes on values of 0 or 1, the mean gives us the proportion of 1s. So in this case, the `diffmean` function will return the difference in proportions.)**

In [44]:
# Code for 1.5
diffmean("Support", "Donate", mayor_dataA)

0.48369446134498406

Something missing in this process is the idea that mayors probably want to receive donations, so anticipating a donation in return for supporting a policy could influence their decision. (If your answer to 1.3 assumed that the previous data generating process captured this, you might want to change it!)

Here is one way to capture this idea. Let's create a variable called `b_don=.3` which reflects how much mayors like getting donations. The mayor utility for supporting the policy without a donation is the same as above, but let's now call this `u_support0`. Their utility for supporting the policy if they do get a donation is the utility with no donation plus `b_don`. The utility to not supporting is 0. Here is code for this:

In [45]:
b_don = .3
u_support0 = mayor_ideol - np.random.rand(n_mayors)
u_support1 = u_support0 + b_don
u_nosupport = 0

We will store the data from this process in a table called `mayor_dataB`. Let's start by making a copy of the previous mayors table but without the support and donate variables, which will be different in this version.

In [46]:
mayor_dataB = mayor_dataA.drop(["Support", "Donate"])

Now we want to create variables which indicate whether the mayor would support the policy with and without a donation. Recall they will support the policy without a donation if `u_support0` is greater than 0, and will support the policy with a donation if `u_support1` is greater than 0.

**Question 1.6. Write code to create two variables and add them to `mayor_dataB`, the first of which indicates if the mayor prefers to support the policy if they don't get a donation (call this "Support0") and the second of which indicates if the mayor prefers to support the policy if they do get a donation (call this "Support1").**

In [47]:
# Code for 1.6
mayor_dataB = mayor_dataB.with_column("Support0", np.where(u_support0 > 0, 1, 0))
mayor_dataB = mayor_dataB.with_column("Support1", np.where(u_support1 > 0, 1, 0))
mayor_dataB

Ideology,Support0,Support1
0.981585,1,1
0.549972,1,1
0.773637,1,1
0.468783,0,1
0.00981098,0,0
0.444716,0,0
0.25336,0,1
0.465499,0,0
0.128345,1,1
0.296152,0,0


Now let's suppose the donors can anticipate who will support the policy with a donation, and want to reward politicians who do so (but also consider other factors too). The following code captures this idea (it assumes you named the variable correctly in the previous block!). Again, we are implicitly assuming the "cost" to donating is .8.

In [48]:
u_donate2 = .5*mayor_dataB.column("Support1") + np.random.rand(n_mayors)
mayor_dataB = mayor_dataB.with_column("Donate", 1*(u_donate2 > .8))
mayor_dataB

Ideology,Support0,Support1,Donate
0.981585,1,1,0
0.549972,1,1,1
0.773637,1,1,0
0.468783,0,1,0
0.00981098,0,0,0
0.444716,0,0,1
0.25336,0,1,1
0.465499,0,0,0
0.128345,1,1,1
0.296152,0,0,0


**Question 1.7. Use the `np.where` function to create a variable which indicates the realized support decision (i.e., "Support0" with no donation and "Support1" with a donation), and add it to the table with the name "Support".**

In [49]:
# Code for 1.7
mayor_dataB = mayor_dataB.with_column("Support", np.where(mayor_dataB.column("Donate")==1, 
                                           mayor_dataB.column("Support1"),
                                           mayor_dataB.column("Support0")))
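
The `np.where` call above is one way to write the usual "switching" rule for realized outcomes, $Y = D \cdot Y_1 + (1 - D) \cdot Y_0$. The sketch below checks that the two forms agree on a few made-up rows; the toy arrays are hypothetical and are not the lab's data.

```python
import numpy as np

# Hypothetical potential outcomes and donation status for four mayors
donate = np.array([0, 1, 1, 0])
support0 = np.array([1, 0, 0, 0])   # choice without a donation
support1 = np.array([1, 1, 0, 1])   # choice with a donation

via_where = np.where(donate == 1, support1, support0)
via_formula = donate * support1 + (1 - donate) * support0

print(via_where)     # [1 1 0 0]
print(via_formula)   # [1 1 0 0]
```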

What is the causal effect of the donation here? It is the difference between the support choice with a donation versus not, which will not necessarily be the same for each mayor.

**Question 1.8. Write code to create a variable equal to this causal effect for each mayor, and add it to `mayor_dataB` with the name "Causal".**

In [50]:
# Code for 1.8
mayor_dataB = mayor_dataB.with_column("Causal", mayor_dataB.column("Support1")-mayor_dataB.column("Support0"))
mayor_dataB

Ideology,Support0,Support1,Donate,Support,Causal
0.981585,1,1,0,1,0
0.549972,1,1,1,1,0
0.773637,1,1,0,1,0
0.468783,0,1,0,0,1
0.00981098,0,0,0,0,0
0.444716,0,0,1,0,0
0.25336,0,1,1,1,1
0.465499,0,0,0,0,0
0.128345,1,1,1,1,0
0.296152,0,0,0,0,0


**Question 1.9. What is the Average Treatment Effect of a donation here? It should be a number between 0 and 1. What does the Average Treatment Effect mean in words?**

In [51]:
# Code for 1.9
np.mean(mayor_dataB.column("Causal"))

0.25950000000000001

We can interpret this as meaning that, on average across all mayors, anticipating a donation makes a mayor about 26 percentage points more likely to support the policy.
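This number can be compared with what the data generating process implies. A donation changes a mayor's choice only when `u_support0` lies between `-b_don` and 0. Since `u_support0` is the difference of two independent uniform draws, it has a triangular distribution on $[-1, 1]$, and the probability of that interval is $b - b^2/2$. The sketch below, which assumes a seeded generator and a large hypothetical sample, compares this formula with a simulation.

```python
import numpy as np

b_don = 0.3

# u_support0 = ideology - e has density 1 + u on [-1, 0], so
# P(-b_don < u_support0 <= 0) = b_don - b_don**2 / 2
ate_theory = b_don - b_don**2 / 2

rng = np.random.default_rng(1)   # seeded generator (assumption)
n = 500_000                      # hypothetical large sample
u0 = rng.random(n) - rng.random(n)
u1 = u0 + b_don
ate_sim = np.mean((u1 > 0).astype(int) - (u0 > 0).astype(int))

print(round(ate_theory, 3), ate_sim)
```

With `b_don = .3` the formula gives 0.255, so a simulated value near 0.26 with 2,000 mayors is about what we should expect.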

**Question 1.10. What is the Average Treatment Effect on the Treated? What does this mean in words? (Hint: it should be somewhat different than your answer to 1.9.)**

In [52]:
# Code for 1.10
np.mean(mayor_dataB.where("Donate",1).column("Causal"))

0.31384350816852968

This indicates that among those who actually received a donation, mayors are about 31 percentage points more likely to support the policy if they anticipate receiving a donation.
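One way to see why the ATET exceeds the ATE is that donors favor mayors with "Support1" equal to 1, and every mayor whose choice is changed by a donation belongs to that group. The sketch below works out the implied ATET with Bayes' rule. It assumes the probabilities implied by the code above (0.7 and 0.2 donation rates, `b_don = .3`), and the variable names are illustrative.

```python
b_don = 0.3
p_changed = b_don - b_don**2 / 2   # P(Support0 = 0, Support1 = 1) = 0.255
p_always = 0.5                     # P(Support0 = 1): supports even without a donation
p_support1 = p_always + p_changed  # P(Support1 = 1) = 0.755

p_don_s1 = 0.7                     # P(donation | Support1 = 1)
p_don_s0 = 0.2                     # P(donation | Support1 = 0)
p_donate = p_don_s1 * p_support1 + p_don_s0 * (1 - p_support1)

# Share of donation recipients whose choice is changed by the donation
atet_theory = p_don_s1 * p_changed / p_donate
print(round(atet_theory, 3))   # 0.309
```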

**Question 1.11 Write code to confirm that the Difference of Means is equal to the ATET plus selection bias. Remember you can use the `diffmean` function for the difference of means, and for the selection bias you want to compute the difference in the "Support0" variable among those who receive a donation versus those who do not.**

In [54]:
#Code for 1.11
dm = diffmean("Support", "Donate", mayor_dataB)
atet = np.mean(mayor_dataB.where("Donate",1).column("Causal"))
sb = np.mean(mayor_dataB.where("Donate",1).column("Support0")) - np.mean(mayor_dataB.where("Donate",0).column("Support0"))
dm, sb + atet

(0.57389378394565194, 0.57389378394565194)
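
The two numbers match exactly because the decomposition is an algebraic identity. Writing $Y_1$ and $Y_0$ for "Support1" and "Support0" and $D$ for "Donate", the difference of means is $E[Y_1 \mid D=1] - E[Y_0 \mid D=0]$; adding and subtracting $E[Y_0 \mid D=1]$ splits it into the ATET plus the selection bias. The sketch below re-creates the process with plain NumPy arrays (an assumption; the lab uses `Table`) and a seeded generator to confirm this on fresh draws.

```python
import numpy as np

rng = np.random.default_rng(2)   # seeded generator (assumption)
n = 2000
ideology = rng.random(n)
u0 = ideology - rng.random(n)
y0 = (u0 > 0).astype(int)            # Support0
y1 = (u0 + 0.3 > 0).astype(int)      # Support1, with b_don = .3
d = (0.5 * y1 + rng.random(n) > 0.8).astype(int)   # Donate
y = np.where(d == 1, y1, y0)         # realized Support

dm = y[d == 1].mean() - y[d == 0].mean()       # difference of means
atet = (y1 - y0)[d == 1].mean()                # effect among the treated
sb = y0[d == 1].mean() - y0[d == 0].mean()     # selection bias

print(dm, atet + sb)
```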

## Part 2. A Fake Experiment

While it would probably be infeasible and unethical to run an experiment where we get donors to randomize which politicians they give money to, we can see how this would play out hypothetically with our simulated data. 

Let's again use the potential outcomes from the previous simulated data, putting them in a table called `mayor_dataR` (for "randomized").

In [25]:
mayor_dataR = mayor_dataB.select(["Ideology", "Support0", "Support1", "Causal"])
mayor_dataR

Ideology,Support0,Support1,Causal
0.304924,1,1,0
0.448708,0,0,0
0.880759,1,1,0
0.855725,1,1,0
0.402672,1,1,0
0.805248,1,1,0
0.139257,0,0,0
0.0283242,0,0,0
0.377752,0,1,1
0.717344,1,1,0


A simple way to randomly decide who gets a donation is to use the `np.random.rand` function, which again makes random numbers between 0 and 1. If we want to create 20 random variables that each have a 70% chance of being 1 and a 30% chance of being 0, we can generate 20 random numbers between 0 and 1, ask whether each is above 0.3, and then use `np.where` to report either 1 or 0:

In [26]:
np.where(np.random.rand(20)>.3, 1,0)

array([1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0])

If we change the argument to `np.random.rand` we can get more or fewer random numbers, and if we change the .3 we can affect the probability of being a 1.
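With only 20 draws the share of 1s can be noticeably different from 70%. The sketch below uses many more draws to show that the share settles near one minus the threshold. The seeded `default_rng` generator is an assumption made so the example is reproducible; the lab itself calls `np.random.rand`.

```python
import numpy as np

rng = np.random.default_rng(3)   # seeded generator (assumption)
threshold = 0.3
draws = rng.random(100_000)      # uniform numbers on [0, 1)
assigned = np.where(draws > threshold, 1, 0)

share_treated = assigned.mean()  # should be close to 1 - threshold = 0.7
print(share_treated)
```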

**Question 2.1. Write code which randomly says whether each mayor gets a donation or not (think of it as one coin flip per mayor), where each gets a donation with probability .5. Add this to the `mayor_dataR` table with the name "DonateR". (Hint: recall the number of mayors is `n_mayors`.)**

In [27]:
# Code for 2.1
donateR = np.where(np.random.rand(n_mayors)>.5, 1,0)
mayor_dataR = mayor_dataR.with_column("DonateR", donateR)

**Question 2.2 Make a variable for the realized support choice with the random donation, and add it to `mayor_dataR` with the name "Support". (Hint: look back to problem 1.7)**

In [28]:
# Code for 2.2
mayor_dataR = mayor_dataR.with_column("Support", np.where(mayor_dataR.column("DonateR")==1, 
                                           mayor_dataR.column("Support1"),
                                           mayor_dataR.column("Support0")))

**Question 2.3 Compute the difference in mean of "Support" among those who received a (random) donation versus not, and compare this to the Average Treatment Effect. Does the difference of means do a reasonable job of approximating the real average causal effect?**

In [29]:
# Code for 2.3
diffmean("Support", "DonateR", mayor_dataR)

0.25620495674945959

In [30]:
np.mean(mayor_dataR.column("Causal"))

0.24399999999999999

The difference of means when donations are random is quite close to the true Average Treatment Effect, off by only about 1 percentage point.
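A single random assignment could be close by luck, so it can help to repeat the randomization many times. The sketch below holds one set of potential outcomes fixed, re-randomizes donations 500 times, and compares the average difference of means with the true ATE. It uses plain NumPy arrays and a seeded generator, both of which are assumptions rather than the lab's own setup.

```python
import numpy as np

rng = np.random.default_rng(4)   # seeded generator (assumption)
n = 2000
u0 = rng.random(n) - rng.random(n)
y0 = (u0 > 0).astype(int)          # Support0
y1 = (u0 + 0.3 > 0).astype(int)    # Support1, with b_don = .3
ate = np.mean(y1 - y0)             # true average treatment effect

estimates = []
for _ in range(500):
    d = np.where(rng.random(n) > 0.5, 1, 0)   # fresh coin flips each round
    y = np.where(d == 1, y1, y0)              # realized support
    estimates.append(y[d == 1].mean() - y[d == 0].mean())
estimates = np.array(estimates)

print(ate, estimates.mean(), estimates.std())
```

The individual estimates scatter around the ATE, and their average lands very close to it, which is the sense in which randomization removes selection bias.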

## Part 3. A Real Experiment

Now let's see how these ideas play out with some real data. Unlike with the simulations, we can't check whether there is any selection bias driven by differences between the treatment and control groups, since in "real world" mode we don't get to know the potential outcomes. But with the confidence gained by our simulations, if the treatment is randomized, then we can assume that selection bias is likely to be small. We can also check that the treatment and control groups are otherwise similar.

The example we will use is from <a href="https://www.jstor.org/stable/26379536?seq=1#metadata_info_tab_contents">this</a> paper, which aimed to see if sending letters to political elites can encourage them to get women more involved in leadership positions.

Quoting directly from the article: "We ran a field experiment with the cooperation of a state Republican Party in a state with low levels of women's representation (Utah). Party leaders were concerned that although women comprised about half of the party activists who attended neighborhood caucus meetings, women typically accounted for only 20-25% of the delegates elected from these meetings to attend the state nominating convention. We randomly assigned over 2,000 precinct chairs to receive one of four letters from the state party chair prior to these neighborhood caucus meetings. The treatments were a neutral placebo control (Control), a request to recruit 2-3 women to run as state delegates (Supply), a request to read a letter at the precinct meeting encouraging attendees to elect more women as delegates (Demand), and a request to both recruit women and read the letter (Supply+Demand)."

Note that in this case, like in many real experiments, there is not just one treatment and one control. Since precinct chairs could get neither, one, or both, there are four "treatment statuses". We will see how these groups differ individually, as well as comparing "supply to no supply" and "demand to no demand".

The outcomes we will look at are the proportion of female delegates and whether there are any female delegates. Here are the variables in the data:
- `unique_id`: Precinct ID
- `treat`: Treatment variable, with four possibilities:
  1. `control`: control group
  2. `supply`: supply group; party chair instructed to recruit 2-3 women
  3. `demand`: demand group; party chair reads letter at precinct convention
  4. `both`: a fourth group getting both the supply and demand treatments; party chair instructed to read letter and to recruit 2-3 women
- `prop_sd_fem2014`: Outcome; proportion of 2014 elected state delegates from that precinct who were women
- `sd_onefem2014`: 1 if at least one woman was selected; 0 otherwise
- `county`: County name in Utah
- `pc_male`: 1 if precinct chair is male; 0 otherwise (the precinct chair is the person who runs the precinct meeting, would read the letter if assigned to do so, etc.)

In [88]:
utah = Table.read_table("electing_women.csv")
utah

unique_id,treat,prop_sd_fem2014,sd_onefem2014,county,pc_male
27215,supply,0.0,0,Grand,1
27386,control,0.0,0,Grand,0
27496,control,1.0,1,Grand,1
16202,demand,1.0,1,Daggett,1
16241,control,0.5,1,Daggett,1
26601,control,0.0,0,Emery,1
27551,demand,0.0,0,Grand,1
67237,control,0.0,0,San Juan,1
69699,supply,0.0,0,Sevier,1
86949,supply,0.0,0,Wasatch,1


Let's see how many observations are in each treatment group.

In [89]:
utah.group("treat")

treat,count
both,427
control,435
demand,426
supply,446


**Question 3.1. Compute the average proportion of female delegates for each of the four possible treatments. Interpret the results.**

In [91]:
# Code for 3.1
np.mean(utah.where("treat", "both").column("prop_sd_fem2014"))

0.28560276693208431

In [92]:
np.mean(utah.where("treat", "control").column("prop_sd_fem2014"))

0.24208812379310343

In [93]:
np.mean(utah.where("treat", "supply").column("prop_sd_fem2014"))

0.26158445585201795

In [94]:
np.mean(utah.where("treat", "demand").column("prop_sd_fem2014"))

0.26064162870892021

The highest proportion female is in the "both" group, and the lowest is in the control group. The averages for the treatments with only supply and only demand are in between.
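To put these averages on a common footing, each treatment group can be compared with the control group. The sketch below simply subtracts the control mean from the other three means reported above; the dictionary names are illustrative, and nothing here says whether the differences are statistically distinguishable from zero.

```python
# Group means of prop_sd_fem2014 copied from the outputs above
reported_means = {
    "both": 0.28560276693208431,
    "control": 0.24208812379310343,
    "supply": 0.26158445585201795,
    "demand": 0.26064162870892021,
}

# Difference between each treatment group and the control group
effects_vs_control = {
    arm: mean - reported_means["control"]
    for arm, mean in reported_means.items()
    if arm != "control"
}

for arm, gap in effects_vs_control.items():
    print(arm, round(gap, 4))
```

The gaps are roughly 4.4 percentage points for "both" and about 1.9 points for each of "supply" and "demand" alone.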

If we want to estimate the effect of the supply and demand treatments individually, we can create variables that indicate whether they received the supply treatment (setting aside the demand treatment) and vice versa. Here is one way to do this, which uses the fact that if we add two boolean variables it will return "True" if either one of them is True. 

In [103]:
supply = np.where((utah.column("treat")=="supply") + (utah.column("treat")=="both"), 1, 0)
demand = np.where((utah.column("treat")=="demand") + (utah.column("treat")=="both"), 1, 0)
utah = utah.with_columns("Supply", supply, "Demand", demand)
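
Here is a small standalone illustration of the boolean-addition trick on a made-up four-element array (not the Utah data). For NumPy boolean arrays, `+` behaves like the element-wise "or" operator `|`, which some readers may find more explicit.

```python
import numpy as np

treat = np.array(["control", "supply", "demand", "both"])   # toy example

is_supply = treat == "supply"
is_both = treat == "both"

added = is_supply + is_both   # boolean addition acts as logical OR
ored = is_supply | is_both    # explicit element-wise OR

supply_flag = np.where(added, 1, 0)
print(added)         # [False  True False  True]
print(supply_flag)   # [0 1 0 1]
```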

**Question 3.2 Compute the difference in means for the proportion of female delegates for those who did and did not receive the demand treatment, and those who did and did not receive the supply treatment. You can use the `diffmean` function since there is now a binary treatment equal to 0 or 1.**

In [101]:
#Code for 3.2
diffmean("prop_sd_fem2014", "Supply", utah)

0.022064337032580206

In [102]:
diffmean("prop_sd_fem2014", "Demand", utah)

0.021178825615980446

**Question 3.3. While we can't learn the potential outcomes of what delegates would have been sent with a different treatment, we can check to see if those who received different treatments did happen to be different on other dimensions. One that might matter is whether the precinct chair was a male or female (recall this is in the "pc_male" variable). Use the ``diffmean`` function to check if precincts with a male chair send a higher or lower proportion of female delegates.**

In [104]:
# Code for 3.3
diffmean("prop_sd_fem2014", "pc_male", utah)

-0.10616735299764207

**Question 3.5. Now use the `diffmean` function to check whether precincts that received the "supply" treatment were more likely to have a male precinct chair, then do the same for the "demand" treatment.**

In [105]:
# Code for 3.5
diffmean("pc_male", "Supply", utah)

-0.019796368803157827

In [106]:
diffmean("pc_male", "Demand", utah)

0.010821125413011168

**Question 3.6. If a reader of this study said "It's hard to say whether encouraging women to run is helpful, because male political leaders are more likely to do so and they also do other things which make women less likely to run", how could the authors respond? (Hint: think about the power of randomization in general, and what you found in the previous questions.)**

It is true that precincts with male chairs send a lower proportion of female delegates, by about 10 percentage points, but since the study randomized whether women were encouraged to run, there isn't a big difference in the proportion of male precinct chairs between the treatment and control groups. So we can be pretty confident that this is not a confounding variable.
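
A rough back-of-the-envelope way to size up this concern is to multiply the imbalance in male chairs between groups by the outcome gap associated with having a male chair. This is only a heuristic, and it is not taken from the paper. The sketch below uses the numbers reported in Questions 3.2, 3.3 and 3.5; the variable names are illustrative.

```python
# Values copied from the outputs above
chair_gap = -0.10616735299764207          # male-chair minus female-chair outcome gap (3.3)
imbalance_supply = -0.019796368803157827  # difference in share of male chairs, Supply (3.5)
imbalance_demand = 0.010821125413011168   # difference in share of male chairs, Demand (3.5)
effect_supply = 0.022064337032580206      # difference of means for Supply (3.2)
effect_demand = 0.021178825615980446      # difference of means for Demand (3.2)

# Heuristic: how much of each estimate the chair imbalance could account for
rough_bias_supply = imbalance_supply * chair_gap
rough_bias_demand = imbalance_demand * chair_gap

print(round(rough_bias_supply, 4), round(effect_supply, 4))
print(round(rough_bias_demand, 4), round(effect_demand, 4))
```

Under this heuristic the chair imbalance could account for only about 0.2 percentage points (supply) and about -0.1 percentage points (demand), which is small next to estimated effects of about 2 percentage points.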