In [5]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf

In [7]:
AB = pd.read_csv('charitable_giving.csv')
AB.head()

Unnamed: 0,donation_amount,donation_dummy,control,treatment,match_ratio,ratio1,ratio2,ratio3,red_state_dummy,months_since_last_donation,highest_previous_donation,prior_donations_num
0,0.0,0.0,0.0,1.0,1.0,1,0.0,0.0,1.0,19.0,500.0,32.0
1,0.0,0.0,1.0,0.0,0.0,0,0.0,0.0,1.0,29.0,300.0,22.0
2,0.0,0.0,1.0,0.0,0.0,0,0.0,0.0,1.0,3.0,500.0,22.0
3,0.0,0.0,0.0,1.0,3.0,0,0.0,1.0,0.0,4.0,250.0,29.0
4,0.0,0.0,0.0,1.0,2.0,0,1.0,0.0,0.0,8.0,50.0,17.0


# Part 1: Understanding Table 1

### Question 1 Answer 
We can run a regression of the form: 
$$Y_i = \beta_0 +\beta_1 X_i + \varepsilon_i,$$
where $Y_i$ is the number of months since the last donation and $X_i$ equals one for treated individuals and zero otherwise. Under this specification, $\beta_0$ is the average number of months since the last donation in the control group, and $\beta_1$ is the difference in that average between the treatment and control groups.

In [8]:
AB_month_reg = smf.ols(formula='months_since_last_donation ~ treatment', data=AB)
# Heteroskedasticity-robust alternative:
# AB_month_reg_result = AB_month_reg.fit(cov_type='HC0')
AB_month_reg_result = AB_month_reg.fit()
print(AB_month_reg_result.summary())

                                OLS Regression Results                                
Dep. Variable:     months_since_last_donation   R-squared:                       0.000
Model:                                    OLS   Adj. R-squared:                 -0.000
Method:                         Least Squares   F-statistic:                   0.01428
Date:                        Tue, 18 Jan 2022   Prob (F-statistic):              0.905
Time:                                23:22:37   Log-Likelihood:            -1.9585e+05
No. Observations:                       50082   AIC:                         3.917e+05
Df Residuals:                           50080   BIC:                         3.917e+05
Df Model:                                   1                                         
Covariance Type:                    nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
---------------------------------------------------

The intercept ({{AB_month_reg_result.params['Intercept']}}) is the average number of months since the last donation in the control group. The treatment coefficient ({{AB_month_reg_result.params['treatment']}}) is the difference between the treatment-group and control-group averages. Table 1 of the paper reports the average for each group rather than the difference.
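The interpretation of the two coefficients can be checked numerically. The sketch below uses synthetic data (not `charitable_giving.csv`) and plain NumPy least squares instead of `smf.ols`; the variable names and the Poisson outcome are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the experiment (NOT the real data): a binary
# treatment indicator and an outcome that is unrelated to it.
n = 1000
treatment = rng.integers(0, 2, size=n).astype(float)
months = rng.poisson(13, size=n).astype(float)

# Design matrix: a column of ones (intercept) and the treatment dummy.
X = np.column_stack([np.ones(n), treatment])
beta, *_ = np.linalg.lstsq(X, months, rcond=None)

control_mean = months[treatment == 0].mean()
treated_mean = months[treatment == 1].mean()

print(f"intercept   = {beta[0]:.4f} | control mean        = {control_mean:.4f}")
print(f"coefficient = {beta[1]:.4f} | difference in means = {treated_mean - control_mean:.4f}")
```

With a single binary regressor the two printed pairs coincide up to floating-point error, which is why the regression and a comparison of group means carry the same information.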


### Question 2 Answer 

The difference between the two groups is not statistically significant: the p-value (0.905) is well above 0.05. This is expected because treatment was randomly assigned.

### Question 3 Answer 

Table 1 serves as a randomization check. It shows that the variables listed in the table do not differ significantly between the treatment and control groups, as expected under random assignment.
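A randomization check of this kind can be sketched as a small balance table. The example below is an illustration under stated assumptions: the data are synthetic, the assignment share and the covariate distributions are arbitrary, and the helper `balance_row` is hypothetical. It uses an unequal-variance (Welch) t-statistic, which can differ slightly from the nonrobust OLS t-statistic reported above.

```python
import numpy as np

def balance_row(x, treated):
    """Return (control mean, treated mean, difference, Welch t-statistic)."""
    x0, x1 = x[treated == 0], x[treated == 1]
    diff = x1.mean() - x0.mean()
    se = np.sqrt(x0.var(ddof=1) / x0.size + x1.var(ddof=1) / x1.size)
    return x0.mean(), x1.mean(), diff, diff / se

rng = np.random.default_rng(1)
n = 5000
# Random assignment, independent of every covariate (share is arbitrary).
treated = (rng.random(n) < 2 / 3).astype(int)

# Synthetic covariates that borrow the column names of the data set.
covariates = {
    "months_since_last_donation": rng.poisson(13, n).astype(float),
    "highest_previous_donation": rng.gamma(2.0, 30.0, n),
    "prior_donations_num": rng.poisson(8, n).astype(float),
}

rows = {name: balance_row(x, treated) for name, x in covariates.items()}
for name, (m0, m1, d, t) in rows.items():
    print(f"{name:28s} control={m0:8.3f} treated={m1:8.3f} diff={d:7.3f} t={t:6.2f}")
```

Because assignment is independent of the covariates by construction, large t-statistics should be rare here; in real data, a large one would be a reason to inspect the randomization.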

# Part 2: Response Rate Regressions

### Question 1 Answer 

We run the following regression 
$$
\text{donation_dummy}_i = \beta_0 +\beta_1\text{treatment_dummy}_i +\varepsilon_i
$$

In [42]:
AB_dd_reg = smf.ols(formula='donation_dummy ~ treatment', data=AB)
# Heteroskedasticity-robust alternative:
# AB_dd_reg_result = AB_dd_reg.fit(cov_type='HC0')
AB_dd_reg_result = AB_dd_reg.fit()
print(AB_dd_reg_result.summary())

                            OLS Regression Results                            
Dep. Variable:         donation_dummy   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                  0.000
Method:                 Least Squares   F-statistic:                     9.618
Date:                Tue, 11 Jan 2022   Prob (F-statistic):            0.00193
Time:                        15:05:37   Log-Likelihood:                 26630.
No. Observations:               50083   AIC:                        -5.326e+04
Df Residuals:                   50081   BIC:                        -5.324e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0179      0.001     16.225      0.0

The intercept ({{AB_dd_reg_result.params['Intercept']}}) is the average response rate in the control group. The treatment coefficient ({{AB_dd_reg_result.params['treatment']}}) is the difference in response rates between the treatment and control groups, expressed as a proportion (multiply by 100 for percentage points). The first two columns of Table 2a report the average response rates in the treatment and control groups.
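Because the outcome is binary, the error variance differs across groups by construction, which is one reason to consider the heteroskedasticity-robust option that is commented out in the cell above. The sketch below computes classical and HC0 standard errors by hand on synthetic data; the response probabilities are made up and the code does not use the real data set or statsmodels.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000
treatment = rng.integers(0, 2, n).astype(float)
# Made-up response probabilities, chosen only for illustration.
p = np.where(treatment == 1, 0.03, 0.02)
donated = (rng.random(n) < p).astype(float)

X = np.column_stack([np.ones(n), treatment])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ donated
resid = donated - X @ beta

# Classical covariance: sigma^2 (X'X)^-1
sigma2 = resid @ resid / (n - X.shape[1])
se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

# HC0 sandwich: (X'X)^-1 X' diag(e^2) X (X'X)^-1
meat = (X * resid[:, None] ** 2).T @ X
se_hc0 = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print("coefficients :", beta)
print("classical SE :", se_classical)
print("HC0 SE       :", se_hc0)
```

With a single dummy regressor, the HC0 standard error of the treatment coefficient equals the square root of the sum of the two within-group variances, each divided by its group size, so it matches the usual unpooled standard error of a difference in proportions.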

### Question 2 Answer 

We run the following regression 
$$
\text{donation_dummy}_i = \beta_0 +\beta_1\text{ratio1}_i + \beta_2\text{ratio2}_i +\beta_3\text{ratio3}_i +\varepsilon_i
$$

In [43]:
AB_mratio_reg = smf.ols(formula='donation_dummy ~ ratio1 + ratio2 + ratio3', data=AB)
# Heteroskedasticity-robust alternative:
# AB_mratio_reg_result = AB_mratio_reg.fit(cov_type='HC0')
AB_mratio_reg_result = AB_mratio_reg.fit()
print(AB_mratio_reg_result.summary())

                            OLS Regression Results                            
Dep. Variable:         donation_dummy   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                  0.000
Method:                 Least Squares   F-statistic:                     3.665
Date:                Tue, 11 Jan 2022   Prob (F-statistic):             0.0118
Time:                        15:05:37   Log-Likelihood:                 26630.
No. Observations:               50083   AIC:                        -5.325e+04
Df Residuals:                   50079   BIC:                        -5.322e+04
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0179      0.001     16.225      0.0

The intercept is the average response rate in the control group. Each match-ratio dummy coefficient is the difference in response rate between that match-ratio group and the control group.

### Question 3 Answer 

The response rate difference between the 2:1 and 1:1 match ratio groups is {{AB_mratio_reg_result.params['ratio2'] - AB_mratio_reg_result.params['ratio1']}}, obtained by subtracting the 1:1 coefficient from the 2:1 coefficient in the previous regression.
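A point estimate of the difference says nothing about its precision. One way to attach a standard error is to treat the difference as a linear contrast of the coefficients. The sketch below does this by hand on synthetic data with made-up response rates; with a fitted statsmodels result, a call such as `t_test("ratio2 - ratio1 = 0")` should perform the equivalent test, though that is not run here.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30000
# Arm 0 is control; arms 1-3 stand in for the 1:1, 2:1 and 3:1 match groups.
arm = rng.integers(0, 4, n)
p = np.array([0.018, 0.021, 0.023, 0.023])[arm]  # made-up response rates
donated = (rng.random(n) < p).astype(float)

# Intercept plus one dummy per match ratio (control is the omitted group).
X = np.column_stack([np.ones(n)] + [(arm == k).astype(float) for k in (1, 2, 3)])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ donated
resid = donated - X @ beta
cov = (resid @ resid / (n - X.shape[1])) * XtX_inv  # classical covariance

c = np.array([0.0, -1.0, 1.0, 0.0])  # contrast that picks out beta_2 - beta_1
diff = c @ beta
se = np.sqrt(c @ cov @ c)

print(f"2:1 minus 1:1 = {diff:.5f}, SE = {se:.5f}, t = {diff / se:.2f}")
```

The contrast's variance accounts for the covariance between the two coefficients, which is non-zero because both are measured relative to the same control group.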

### Question 4 Answer 

Matching leads to more donations, but increasing the match ratio beyond 1:1 has relatively little impact: moving from 1:1 to 2:1 adds only a small effect, and moving from 2:1 to 3:1 adds almost none.

# Part 3: Response Rates in Red/Blue States

### Question 1 Answer 

In [44]:
AB_dd_blue_reg = smf.ols(formula="donation_dummy ~ treatment", data=AB[AB["red_state_dummy"]==0])
AB_dd_blue_reg_result = AB_dd_blue_reg.fit()
print(AB_dd_blue_reg_result.summary())

                            OLS Regression Results                            
Dep. Variable:         donation_dummy   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                 -0.000
Method:                 Least Squares   F-statistic:                    0.3567
Date:                Tue, 11 Jan 2022   Prob (F-statistic):              0.550
Time:                        15:05:39   Log-Likelihood:                 15783.
No. Observations:               29806   AIC:                        -3.156e+04
Df Residuals:                   29804   BIC:                        -3.155e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0200      0.001     14.085      0.0

In [45]:
AB_dd_red_reg = smf.ols(formula="donation_dummy ~ treatment", data=AB[AB["red_state_dummy"]==1])
AB_dd_red_reg_result = AB_dd_red_reg.fit()
print(AB_dd_red_reg_result.summary())

                            OLS Regression Results                            
Dep. Variable:         donation_dummy   R-squared:                       0.001
Model:                            OLS   Adj. R-squared:                  0.001
Method:                 Least Squares   F-statistic:                     17.24
Date:                Tue, 11 Jan 2022   Prob (F-statistic):           3.31e-05
Time:                        15:05:39   Log-Likelihood:                 10839.
No. Observations:               20242   AIC:                        -2.167e+04
Df Residuals:                   20240   BIC:                        -2.166e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0146      0.002      8.398      0.0

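Treatment is more effective in red states: the treatment coefficient is larger for red states ({{AB_dd_red_reg_result.params['treatment']}}) than for blue states ({{AB_dd_blue_reg_result.params['treatment']}}). The effect is statistically significant in red states (p < 0.001) but not in blue states (p = 0.550).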

### Question 2 Answer 



Yes, sub-sample treatment coefficients do have a causal interpretation because treatment is still random in each sub-sample. However, the difference in treatment effects cannot be causally attributed to the political leaning of states because political orientation is not randomly assigned.
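The two sub-sample regressions can also be combined into one pooled regression with an interaction term, which gives the difference in treatment effects directly (and, in a full regression output, a standard error for it). The sketch below checks the equivalence on synthetic data; the probabilities are made up, and the interaction coefficient remains descriptive for the reason given above.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 40000
treatment = rng.integers(0, 2, n).astype(float)
red = rng.integers(0, 2, n).astype(float)
# Made-up response probabilities with a larger treatment effect when red == 1.
p = 0.02 - 0.005 * red + 0.001 * treatment + 0.008 * treatment * red
donated = (rng.random(n) < p).astype(float)

def treatment_gap(mask):
    """Treated-minus-control response rate within the rows selected by mask."""
    return donated[mask & (treatment == 1)].mean() - donated[mask & (treatment == 0)].mean()

gap_blue = treatment_gap(red == 0)
gap_red = treatment_gap(red == 1)

# Pooled model: intercept, treatment, red, and treatment x red.
X = np.column_stack([np.ones(n), treatment, red, treatment * red])
beta, *_ = np.linalg.lstsq(X, donated, rcond=None)

print(f"blue-state effect = {gap_blue:.5f}, pooled treatment coef   = {beta[1]:.5f}")
print(f"red minus blue    = {gap_red - gap_blue:.5f}, pooled interaction coef = {beta[3]:.5f}")
```

In this fully interacted model the treatment coefficient is the blue-state effect and the interaction coefficient is the red-minus-blue gap in effects.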

# Part 4: Response Rates and Donation Amount

### Question 1 Answer 

We run the following regression 
$$
\text{donation_amount}_i = \beta_0 +\beta_1\text{treatment_dummy}_i +\varepsilon_i
$$

In [46]:
AB_amount_reg = smf.ols(formula="donation_amount ~ treatment", data=AB)
AB_amount_reg_result = AB_amount_reg.fit()
print(AB_amount_reg_result.summary())

                            OLS Regression Results                            
Dep. Variable:        donation_amount   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                  0.000
Method:                 Least Squares   F-statistic:                     3.461
Date:                Tue, 11 Jan 2022   Prob (F-statistic):             0.0628
Time:                        15:05:41   Log-Likelihood:            -1.7946e+05
No. Observations:               50083   AIC:                         3.589e+05
Df Residuals:                   50081   BIC:                         3.589e+05
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.8133      0.067     12.063      0.0

Treated donors, on average, donate $\$0.15$ more than control-group donors, who give $\$0.81$ on average. The difference is significant at the 10% level but not at the 5% level (p = 0.063). It has a causal interpretation because the regression uses the full sample, in which treatment is randomly assigned.

### Question 2 Answer 



In [47]:
AB_conditional_amount_reg = smf.ols(formula="donation_amount ~ treatment", data=AB[AB["donation_dummy"]==1])
AB_conditional_amount_reg_result = AB_conditional_amount_reg.fit()
print(AB_conditional_amount_reg_result.summary())

                            OLS Regression Results                            
Dep. Variable:        donation_amount   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                 -0.001
Method:                 Least Squares   F-statistic:                    0.3374
Date:                Tue, 11 Jan 2022   Prob (F-statistic):              0.561
Time:                        15:05:42   Log-Likelihood:                -5326.8
No. Observations:                1034   AIC:                         1.066e+04
Df Residuals:                    1032   BIC:                         1.067e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     45.5403      2.423     18.792      0.0

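Among those who donated, control-group donors give $\$45.54$ on average, and treated donors give $\$1.67$ less; the difference is not statistically significant (p = 0.561). The treatment coefficient does not have a causal interpretation here. Individuals choose whether to donate, and that choice is itself affected by treatment, so the treated and control donors in this sub-sample are no longer comparable groups created by random assignment.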
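The selection problem can be illustrated with a small simulation. In the sketch below, everything is an assumption made for illustration: a latent `generosity` trait, two arbitrary thresholds, and an arbitrary rule for the amount given. By construction, treatment only lowers the bar for giving and never reduces anyone's donation, yet the comparison among donors still comes out negative.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000
treatment = rng.integers(0, 2, n)
generosity = rng.normal(size=n)  # latent trait, unobserved in practice

# Treatment only lowers the threshold for giving; the amount a given person
# would give depends on generosity alone. All numbers here are made up.
threshold = np.where(treatment == 1, 1.8, 2.0)
donates = generosity > threshold
amount = np.where(donates, 20.0 + 10.0 * generosity, 0.0)

# Full sample: a valid experimental comparison.
uncond_gap = amount[treatment == 1].mean() - amount[treatment == 0].mean()

# Donors only: conditions on an outcome that treatment itself changes.
cond_gap = (amount[donates & (treatment == 1)].mean()
            - amount[donates & (treatment == 0)].mean())

print(f"full-sample difference : {uncond_gap:+.3f}")
print(f"donors-only difference : {cond_gap:+.3f}")
```

The donors-only gap is negative because treatment pulls in marginal donors who give smaller amounts, which changes the composition of the treated donor group rather than anyone's behavior.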