# Evaluating the impact of HISP: Instrumental Variables

In the design of HISP, there are two rounds of data on two groups of households: one group that enrolled in the program, and another that did not. As in the case of the enrolled and non-enrolled groups, <b>we realized that we cannot simply compare the average health expenditures of the two groups because of selection bias.</b> As we have data for two periods for each household in the sample, we can use those data to solve some of these challenges by comparing the change in health expenditures for the two groups.

# Set up

### Launching Stata from the Jupyter notebook

In [1]:
%%capture
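import stata_setup
import os
# Raw string so the backslashes in the Windows path are not read as escapes
os.chdir(r'C:\Program Files\Stata17\utilities')
from pystata import config
config.init('mp')  # start the Stata/MP edition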

### Initial set up of log file and load data

In [2]:
%%capture
%%stata

clear
set more off, perm

* change to the working directory
cd "C:\Users\USER\Desktop\Charlene\2022 Charlene at York\Evaluation of Health Policy\practical exercise"

* load data
use "evaluation.dta", clear


### Create (rename) variables for treatment effect evaluation

In [3]:
%%capture
%%stata

* create generic outcome (y) and treatment (d) variables
clonevar y=health_expenditures
label var y "out of pocket health expenditure pc/pa"
clonevar d=enrolled
label var d "Treatment"

* create global list of regressors
global xs "age_hh age_sp educ_hh educ_sp female_hh indigenous hhsize dirtfloor bathroom land hospital_distance"

# Instrumental Variables from the Textbook

In [4]:
%%stata
describe locality_identifier household_identifier promotion_locality enrolled enrolled_rp


Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
locality_iden~r float   %9.0g                 Locality identifier
household_ide~r float   %9.0g                 Unique household identifier
promotion_loc~y float   %9.0g                 Household is located in promoted
                                                community (0=no, 1=yes)
enrolled        float   %9.0g                 HH enrolled in HISP (0=no, 1=yes)
enrolled_rp     float   %9.0g                 Household enrolled in HISP under
                                                the random promotion scenario
                                                (0=no, 1=yes)


In [5]:
%%stata
ttest health_expenditures if round==0, by(promotion_locality)


Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
       0 |   4,831    17.23795    .0814034    5.657973    17.07836    17.39754
       1 |   5,082    17.18535    .0774503    5.521289    17.03352    17.33719
---------+--------------------------------------------------------------------
Combined |   9,913    17.21099    .0561257    5.588098    17.10097      17.321
---------+--------------------------------------------------------------------
    diff |            .0525987    .1122917                -.167516    .2727133
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =   0.4684
H0: diff = 0                                     Degrees of freedom =     9911

    Ha: dif

In [7]:
%%stata
ttest health_expenditures if round==1, by(promotion_locality)


Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
       0 |   4,831    18.84538    .1558312    10.83111    18.53988    19.15088
       1 |   5,083    14.97152     .175731    12.52877    14.62701    15.31603
---------+--------------------------------------------------------------------
Combined |   9,914    16.85922    .1194185    11.89039    16.62513     17.0933
---------+--------------------------------------------------------------------
    diff |             3.87386    .2357366                3.411769    4.335952
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =  16.4330
H0: diff = 0                                     Degrees of freedom =     9912

    Ha: dif

In [6]:
%%stata
ttest enrolled_rp if round==1, by(promotion_locality)


Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
       0 |   4,831    .0842476    .0039966    .2777875    .0764123    .0920828
       1 |   5,083    .4920323    .0070129    .4999857     .478284    .5057806
---------+--------------------------------------------------------------------
Combined |   9,914    .2933226    .0045728     .455308     .284359    .3022862
---------+--------------------------------------------------------------------
    diff |           -.4077847    .0081809                -.423821   -.3917484
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t = -49.8458
H0: diff = 0                                     Degrees of freedom =     9912

    Ha: dif
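
The three t-tests above contain what is needed for a simple Wald (ratio) estimate under random promotion: the follow-up difference in health expenditures between promoted and non-promoted localities, divided by the difference in enrollment rates. The sketch below redoes that arithmetic in Python with the numbers copied from the output; the function name `wald_estimate` is only illustrative. With a single binary instrument and no covariates, this ratio is what an IV regression of expenditures on `enrolled_rp` instrumented by `promotion_locality` should return.

```python
# Numbers copied from the round==1 t-tests above.
# Stata reports diff = mean(0) - mean(1), so the signs are flipped here
# to express everything as promoted minus non-promoted.
itt_outcome = -3.87386      # health expenditures: 14.97152 - 18.84538
itt_enrollment = 0.4077847  # enrollment share:    .4920323 - .0842476


def wald_estimate(reduced_form, first_stage):
    """Effect of the instrument on the outcome divided by its effect on take-up."""
    if first_stage == 0:
        raise ValueError("instrument does not move enrollment; ratio undefined")
    return reduced_form / first_stage


late = wald_estimate(itt_outcome, itt_enrollment)
print(f"Wald / IV estimate: {late:.2f}")
```

The result is roughly -9.5: promotion lowers average expenditures by about 3.87, but only about 41 percent of households change their enrollment status because of it, so the effect per household that enrolls because of the promotion is larger.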

In [9]:
%%stata
ivreg health_expenditures (enrolled_rp = promotion_locality) if round==1, first


First-stage regressions
-----------------------

      Source |       SS           df       MS      Number of obs   =     9,914
-------------+----------------------------------   F(1, 9912)      =   2484.60
       Model |  411.879408         1  411.879408   Prob > F        =    0.0000
    Residual |  1643.13855     9,912  .165772654   R-squared       =    0.2004
-------------+----------------------------------   Adj R-squared   =    0.2003
       Total |  2055.01795     9,913  .207305352   Root MSE        =    .40715

------------------------------------------------------------------------------
 enrolled_rp | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
promotion_~y |   .4077847   .0081809    49.85   0.000     .3917484     .423821
       _cons |   .0842476   .0058578    14.38   0.000      .072765    .0957301
------------------------------------------------------------------------------



In [17]:
%%stata
ivreg health_expenditures (enrolled_rp = treatment_locality) if round==1, first


First-stage regressions
-----------------------

      Source |       SS           df       MS      Number of obs   =     9,914
-------------+----------------------------------   F(1, 9912)      =   7019.16
       Model |  851.950212         1  851.950212   Prob > F        =    0.0000
    Residual |  1203.06774     9,912  .121374873   R-squared       =    0.4146
-------------+----------------------------------   Adj R-squared   =    0.4145
       Total |  2055.01795     9,913  .207305352   Root MSE        =    .34839

------------------------------------------------------------------------------
 enrolled_rp | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
treatment_~y |   .5862903   .0069979    83.78   0.000     .5725729    .6000077
       _cons |  -2.11e-15   .0049498    -0.00   1.000    -.0097026    .0097026
------------------------------------------------------------------------------



In [18]:
%%stata
ivreg health_expenditures $xs (enrolled_rp = treatment_locality) if round==1, first


First-stage regressions
-----------------------

      Source |       SS           df       MS      Number of obs   =     9,914
-------------+----------------------------------   F(12, 9901)     =    818.72
       Model |  1023.53142        12  85.2942854   Prob > F        =    0.0000
    Residual |  1031.48653     9,901  .104180035   R-squared       =    0.4981
-------------+----------------------------------   Adj R-squared   =    0.4975
       Total |  2055.01795     9,913  .207305352   Root MSE        =    .32277

------------------------------------------------------------------------------
 enrolled_rp | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      age_hh |  -.0035159   .0003963    -8.87   0.000    -.0042928    -.002739
      age_sp |  -.0021934   .0004545    -4.83   0.000    -.0030844   -.0013024
     educ_hh |  -.0074264   .0014646    -5.07   0.000    -.0102973   -.0045556
 

# Instrumental Variables 

In [7]:
%%stata
drop if round==0

(9,913 observations deleted)


In [8]:
%%stata
regress health_expenditures enrolled $xs, robust


Linear regression                               Number of obs     =      9,914
                                                F(12, 9901)       =     496.87
                                                Prob > F          =     0.0000
                                                R-squared         =     0.4118
                                                Root MSE          =     9.1246

------------------------------------------------------------------------------
             |               Robust
health_exp~s | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    enrolled |  -10.00504   .1758574   -56.89   0.000    -10.34976   -9.660324
      age_hh |   .0738534   .0134845     5.48   0.000     .0474209    .1002858
      age_sp |  -.0152703   .0153175    -1.00   0.319    -.0452958    .0147552
     educ_hh |   .0416857   .0419395     0.99   0.320    -.0405242    .1238956
     educ_sp |

In [9]:
%%stata
regress health_expenditures treatment_locality $xs, robust


Linear regression                               Number of obs     =      9,914
                                                F(12, 9901)       =     317.19
                                                Prob > F          =     0.0000
                                                R-squared         =     0.3443
                                                Root MSE          =     9.6343

------------------------------------------------------------------------------
             |               Robust
health_exp~s | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
treatment_~y |  -6.129552   .1937445   -31.64   0.000    -6.509331   -5.749773
      age_hh |   .1080116   .0140847     7.67   0.000     .0804027    .1356206
      age_sp |    .007992   .0161433     0.50   0.621    -.0236521    .0396361
     educ_hh |   .1126522   .0439669     2.56   0.010     .0264682    .1988363
     educ_sp |

In [10]:
%%stata 
regress enrolled treatment_locality $xs, robust


Linear regression                               Number of obs     =      9,914
                                                F(12, 9901)       =     987.32
                                                Prob > F          =     0.0000
                                                R-squared         =     0.5111
                                                Root MSE          =     .32036

------------------------------------------------------------------------------
             |               Robust
    enrolled | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
treatment_~y |   .5889406   .0064712    91.01   0.000     .5762557    .6016255
      age_hh |  -.0034289   .0004028    -8.51   0.000    -.0042184   -.0026393
      age_sp |  -.0023327   .0004669    -5.00   0.000    -.0032478   -.0014175
     educ_hh |  -.0070704   .0014737    -4.80   0.000    -.0099591   -.0041816
     educ_sp |

In [11]:
%%stata
ivregress 2sls health_expenditures $xs (enrolled=treatment_locality), vce(robust)


Instrumental variables 2SLS regression            Number of obs   =      9,914
                                                  Wald chi2(12)   =    3984.74
                                                  Prob > chi2     =     0.0000
                                                  R-squared       =     0.4116
                                                  Root MSE        =     9.1203

------------------------------------------------------------------------------
             |               Robust
health_exp~s | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
    enrolled |  -10.40776   .3110995   -33.45   0.000     -11.0175   -9.798016
      age_hh |   .0723248   .0135174     5.35   0.000     .0458311    .0988184
      age_sp |   -.016286   .0152964    -1.06   0.287    -.0462664    .0136945
     educ_hh |   .0390656   .0417476     0.94   0.349    -.0427581    .1208893
     educ_sp |
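
As a cross-check on the 2SLS result, the coefficient on `enrolled` can be recovered by hand. With one instrument and the same covariates in every equation, 2SLS equals the reduced-form coefficient on `treatment_locality` divided by its first-stage coefficient. The short Python sketch below uses the coefficients copied from the three regressions above; the variable names are illustrative only.

```python
# Coefficients copied from the regression output above.
reduced_form = -6.129552   # treatment_locality in the health_expenditures regression
first_stage = 0.5889406    # treatment_locality in the enrolled regression
tsls_reported = -10.40776  # enrolled in the ivregress 2sls output

# Indirect least squares: reduced form divided by first stage.
indirect_ls = reduced_form / first_stage

print(f"reduced form / first stage = {indirect_ls:.5f}")
print(f"reported 2SLS coefficient  = {tsls_reported:.5f}")
```

The two values should agree up to the rounding of the printed coefficients. The standard errors cannot be obtained this way, which is one reason to rely on `ivregress` rather than running the two stages manually.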

In [12]:
%%stata
estat endogenous


  Tests of endogeneity
  H0: Variables are exogenous

  Robust score chi2(1)            =   2.1799  (p = 0.1398)
  Robust regression F(1,9900)     =  2.18031  (p = 0.1398)


In [13]:
%%stata
estat firststage


  First-stage regression summary statistics
  --------------------------------------------------------------------------
               |            Adjusted      Partial       Robust
      Variable |   R-sq.       R-sq.        R-sq.     F(1,9901)   Prob > F
  -------------+------------------------------------------------------------
      enrolled |  0.5111      0.5105       0.4576       8282.64    0.0000
  --------------------------------------------------------------------------
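
With a single instrument, the first-stage F statistic is the square of the instrument's t statistic in the first-stage regression, so the reported values can be checked from the coefficients and standard errors shown earlier. The Python sketch below does this for both instruments used in this notebook. The threshold of 10 is a common rule of thumb for flagging weak instruments; it is an assumption added here, not something taken from the output.

```python
# (coefficient, standard error, F reported by Stata) copied from the output above.
first_stages = {
    "promotion_locality, no covariates": (0.4077847, 0.0081809, 2484.60),
    "treatment_locality, covariates, robust": (0.5889406, 0.0064712, 8282.64),
}

RULE_OF_THUMB_F = 10.0  # heuristic threshold, an assumption rather than a source value

implied_f = {}
for label, (coef, se, reported_f) in first_stages.items():
    # One instrument: F equals the squared t statistic of the instrument.
    implied_f[label] = (coef / se) ** 2
    verdict = "strong" if implied_f[label] > RULE_OF_THUMB_F else "possibly weak"
    print(f"{label}: implied F = {implied_f[label]:.1f}, "
          f"reported F = {reported_f:.2f}, instrument looks {verdict}")
```

Small gaps between the implied and reported F values come from the rounding of the printed standard errors.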

