### Background

The data set for this exercise comes from David Card and Alan Krueger,
“Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New
Jersey and Pennsylvania”, American Economic Review, vol. 84, no. 4 (September
1994). The paper can be downloaded from JSTOR at
http://links.jstor.org/sici?sici=0002-8282%28199409%2984%3A4%3C772%3AMWAEAC%3E2.0.CO%3B2-O

In [1]:
import pandas as pd
import ipystata
import matplotlib

In [5]:
%%stata
cd "C:\Users\aslop\STATA"

C:\Users\aslop\STATA



In [6]:
%%stata
use "C:\Users\aslop\STATA\DinD_ex.dta", clear

### QUESTION 1: Use differences in means to compute the difference-in-means estimate of the effect of the minimum wage change.

In [7]:
%%stata
codebook, compact


Variable   Obs Unique       Mean    Min  Max  Label
--------------------------------------------------------------------------
sheet      698    349   245.9456      1  522  sheet number (unique stor...
fte        698    106   17.78403      3   80  
nj         698      2   .8137536      0    1  
after      698      2         .5      0    1  
njafter    698      2   .4068768      0    1  
dfte       349    101  -.1511461  -43.5   26  
--------------------------------------------------------------------------



In [8]:
%%stata
browse

In [77]:
%%stata

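* Results matrix: rows = PA, NJ, NJ - PA; columns = before, after, after - before
mat T = J(3,3,.)

* Before period: mean FTE by state (ttest group 1 is nj == 0, i.e. PA)
use "C:\Users\aslop\STATA\DinD_ex.dta", clear
drop if after == 1 

ttest fte, by(nj)
mat T[1,1] = r(mu_1)
mat T[2,1] = r(mu_2)
mat T[3,1] = r(mu_2) - r(mu_1)

* After period: same comparison on the second survey wave
use "C:\Users\aslop\STATA\DinD_ex.dta", clear
drop if after == 0 

ttest fte, by(nj)
mat T[1,2] = r(mu_1)
mat T[2,2] = r(mu_2)
mat T[3,2] = r(mu_2) - r(mu_1)


* Third column: change over time (after minus before);
* T[3,3] is then the difference-in-differences estimate
mat T[1,3] = T[1,2] - T[1,1]
mat T[2,3] = T[2,2] - T[2,1]
mat T[3,3] = T[3,2] - T[3,1]


mat rownames T = PA NJ Difference

* Export the table to Word (frmttable is a user-written command)
frmttable using ttest.doc, statmat(T) varlabels replace ctitle("", before, after, Difference)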


(349 observations deleted)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |      65        20.3     1.50888    12.16498    17.28567    23.31433
       1 |     284    17.30106    .5267727    8.877331    16.26417    18.33795
---------+--------------------------------------------------------------------
combined |     349     17.8596    .5152968    9.626539    16.84611    18.87309
---------+--------------------------------------------------------------------
    diff |            2.998944    1.315724                .4111455    5.586742
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =   2.2793
Ho: diff = 0                                     degrees of fre

### An alternative way: the user-written `diff` command

In [54]:
%%stata
ssc install diff

checking diff consistency and verifying not already installed...
installing into c:\ado\plus\...
installation complete.



In [60]:
%%stata
use "C:\Users\aslop\STATA\DinD_ex.dta", clear
diff fte, t(nj) p(after) robust  


DIFFERENCE-IN-DIFFERENCES ESTIMATION RESULTS
Number of observations in the DIFF-IN-DIFF: 698
            Before         After    
   Control: 65             65          130
   Treated: 284            284         568
            349            349
--------------------------------------------------------
 Outcome var.   | fte     | S. Err. |   |t|   |  P>|t|
----------------+---------+---------+---------+---------
Before          |         |         |         | 
   Control      | 20.300  |         |         | 
   Treated      | 17.301  |         |         | 
   Diff (T-C)   | -2.999  | 1.591   | -1.88   | 0.060*
After           |         |         |         | 
   Control      | 18.254  |         |         | 
   Treated      | 17.584  |         |         | 
   Diff (T-C)   | -0.670  | 1.093   | 0.61    | 0.540
                |         |         |         | 
Diff-in-Diff    | 2.329   | 1.931   | 1.21    | 0.228
--------------------------------------------------------
R-square:    0.01
* 
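
As a cross-check, the difference-in-differences estimate can be rebuilt by hand from the four group means reported in the outputs above. The sketch below is plain Python; the dictionary layout and variable names are illustrative choices, and the means are the three-decimal values printed by `diff`, so the result agrees with Stata only up to rounding.

```python
# Mean FTE employment by state and period, copied from the `diff` output above
# (rounded to three decimals as printed there).
means = {
    ("PA", "before"): 20.300,
    ("PA", "after"): 18.254,
    ("NJ", "before"): 17.301,
    ("NJ", "after"): 17.584,
}

# Change over time within each state (after minus before).
change_pa = means[("PA", "after")] - means[("PA", "before")]
change_nj = means[("NJ", "after")] - means[("NJ", "before")]

# Difference-in-differences: the NJ change net of the PA change.
did = change_nj - change_pa

print(f"PA change: {change_pa:.3f}")
print(f"NJ change: {change_nj:.3f}")
print(f"DiD:       {did:.3f}")
```

The sign convention here (after minus before, treated minus control) is the one `diff` uses, which is why the estimate comes out positive.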

### QUESTION 2: Estimate the difference-in-differences using a regression model in differences.

In [86]:
%%stata
use "C:\Users\aslop\STATA\DinD_ex.dta", clear
reg dfte njafter nj after


note: nj omitted because of collinearity
note: after omitted because of collinearity

      Source |       SS           df       MS      Number of obs   =       349
-------------+----------------------------------   F(1, 347)       =      3.91
       Model |  286.841779         1  286.841779   Prob > F        =    0.0489
    Residual |  25485.8728       347  73.4463192   R-squared       =    0.0111
-------------+----------------------------------   Adj R-squared   =    0.0083
       Total |  25772.7145       348  74.0595245   Root MSE        =    8.5701

------------------------------------------------------------------------------
        dfte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     njafter |   2.328724   1.178371     1.98   0.049     .0110768    4.646372
          nj |          0  (omitted)
       after |          0  (omitted)
       _cons |  -2.046154   1.062988    -1.92   

In [87]:
%%stata
use "C:\Users\aslop\STATA\DinD_ex.dta", clear
reg dfte njafter nj after, robust


note: nj omitted because of collinearity
note: after omitted because of collinearity

Linear regression                               Number of obs     =        349
                                                F(1, 347)         =       2.51
                                                Prob > F          =     0.1142
                                                R-squared         =     0.0111
                                                Root MSE          =     8.5701

------------------------------------------------------------------------------
             |               Robust
        dfte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     njafter |   2.328724   1.470425     1.58   0.114    -.5633425    5.220791
          nj |          0  (omitted)
       after |          0  (omitted)
       _cons |  -2.046154   1.395098    -1.47   0.143    -4.790066    .6977583
------------

### QUESTION 3: Now estimate the following model in levels, i.e., with the left-hand-side variable in levels.
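
The model itself is not reproduced in this notebook. Judging from the `reg fte njafter nj after` command used in the cells of this section, it is presumably the standard two-group, two-period specification below. The Greek symbols are labels chosen here for illustration, not notation taken from the exercise, and `njafter` is read as the product of `nj` and `after`.

$$
fte_{it} = \beta_0 + \beta_1\, nj_i + \beta_2\, after_t + \delta\,(nj_i \times after_t) + \varepsilon_{it}
$$

Subtracting the before-period equation from the after-period equation for the same restaurant removes $\beta_0$ and $\beta_1\, nj_i$:

$$
\Delta fte_i = \beta_2 + \delta\, nj_i + \Delta\varepsilon_i
$$

Under this reading, the constant of the differenced regression in Question 2 (-2.046) estimates $\beta_2$, and its slope (2.329) estimates the difference-in-differences parameter $\delta$. It also suggests why Stata omitted `nj` and `after` there: on the 349 rows where `dfte` is non-missing, `after` appears to be constant and `nj` coincides with `njafter`.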

In [89]:
%%stata
use "C:\Users\aslop\STATA\DinD_ex.dta", clear
reg fte njafter nj after


      Source |       SS           df       MS      Number of obs   =       698
-------------+----------------------------------   F(3, 694)       =      2.09
       Model |  503.456802         3  167.818934   Prob > F        =    0.1004
    Residual |  55766.2976       694  80.3548957   R-squared       =    0.0089
-------------+----------------------------------   Adj R-squared   =    0.0047
       Total |  56269.7544       697  80.7313549   Root MSE        =    8.9641

------------------------------------------------------------------------------
         fte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     njafter |   2.328724   1.743083     1.34   0.182    -1.093624    5.751072
          nj |  -2.998944   1.232546    -2.43   0.015    -5.418909   -.5789781
       after |  -2.046154   1.572405    -1.30   0.194    -5.133396    1.041088
       _cons |       20.3   1.111858    18.26   0.

In [90]:
%%stata
use "C:\Users\aslop\STATA\DinD_ex.dta", clear
reg fte njafter nj after, robust


Linear regression                               Number of obs     =        698
                                                F(3, 694)         =       1.32
                                                Prob > F          =     0.2682
                                                R-squared         =     0.0089
                                                Root MSE          =     8.9641

------------------------------------------------------------------------------
             |               Robust
         fte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     njafter |   2.328724   1.930761     1.21   0.228     -1.46211    6.119558
          nj |  -2.998944   1.591452    -1.88   0.060    -6.123581    .1256939
       after |  -2.046154   1.788875    -1.14   0.253     -5.55841    1.466103
       _cons |       20.3   1.501537    13.52   0.000      17.3519     23.2481
--------------
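
The coefficient on `njafter` (2.328724) is the same number that the table of means produces. The sketch below illustrates why, using NumPy on a small synthetic panel; the data, sample sizes, and effect sizes are made up for illustration and are not the Card-Krueger sample. Because the regression has one parameter per state-period cell, OLS reproduces the four cell means exactly, so the interaction coefficient equals the difference-in-differences of those means.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-period panel (hypothetical data, NOT the Card-Krueger sample):
# 40 treated and 10 control stores, each observed before and after.
n_treated, n_control = 40, 10
nj_store = np.r_[np.ones(n_treated), np.zeros(n_control)]

# Stack the panel: first all "before" rows, then all "after" rows.
nj = np.r_[nj_store, nj_store]
after = np.r_[np.zeros(nj_store.size), np.ones(nj_store.size)]
njafter = nj * after

# Outcome with an arbitrary interaction effect of 2.0 plus noise.
fte = 20 - 3 * nj - 2 * after + 2.0 * njafter + rng.normal(0, 5, nj.size)


def cell_mean(state, period):
    """Mean outcome for one state (0/1) in one period (0/1)."""
    return fte[(nj == state) & (after == period)].mean()


# (a) Difference-in-differences from the four cell means.
did_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# (b) OLS in levels: fte on njafter, nj, after and a constant.
X = np.column_stack([njafter, nj, after, np.ones(nj.size)])
beta, *_ = np.linalg.lstsq(X, fte, rcond=None)

print(f"DiD from means:       {did_means:.6f}")
print(f"OLS coef. on njafter: {beta[0]:.6f}")
```

The two printed numbers agree up to floating-point error, and the OLS constant equals the control group's before-period mean, just as `_cons` equals 20.3 in the output above.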

### QUESTION 4: Now estimate the levels model from question 3 but cluster on sheet. How do the standard errors change?

In [7]:
%%stata
use "C:\Users\aslop\STATA\DinD_ex.dta", clear
reg fte njafter nj after, cluster(sheet)


Linear regression                               Number of obs     =       
                                                F(3, 348)         =       
                                                Prob > F          =     0.
                                                R-squared         =     0.
                                                Root MSE          =     8.
                                (Std. Err. adjusted for 349 clusters in sh
--------------------------------------------------------------------------
             |               Robust
         fte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Inter
-------------+------------------------------------------------------------
     njafter |   2.328724   1.471481     1.58   0.114    -.5653903    5.22
          nj |  -2.998944   1.592595    -1.88   0.061    -6.131266    .133
       after |  -2.046154     1.3961    -1.47   0.144    -4.792009    .699
       _cons |       20.3   1.502615    13.51   0.000     17.34

In [8]:
%%stata
use "C:\Users\aslop\STATA\DinD_ex.dta", clear
reg fte njafter nj after, cluster(sheet) robust


Linear regression                               Number of obs     =       
                                                F(3, 348)         =       
                                                Prob > F          =     0.
                                                R-squared         =     0.
                                                Root MSE          =     8.
                                (Std. Err. adjusted for 349 clusters in sh
--------------------------------------------------------------------------
             |               Robust
         fte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Inter
-------------+------------------------------------------------------------
     njafter |   2.328724   1.471481     1.58   0.114    -.5653903    5.22
          nj |  -2.998944   1.592595    -1.88   0.061    -6.131266    .133
       after |  -2.046154     1.3961    -1.47   0.144    -4.792009    .699
       _cons |       20.3   1.502615    13.51   0.000     17.34
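
The two cells above print identical results because `cluster(sheet)` already produces robust standard errors, so adding `robust` changes nothing. Compared with the unclustered robust results in Question 3, the standard error on `njafter` falls from 1.930761 to 1.471481, which is what one would expect if the two observations of a restaurant are positively correlated. The clustered value is also very close to the robust standard error from the differenced regression in Question 2 (1.470425).

The NumPy sketch below illustrates both points on synthetic data. Everything in it (sample sizes, effect sizes, variable names) is hypothetical, and the sandwich formulas are written without the finite-sample corrections Stata applies, which would account for the small gap between 1.471481 and 1.470425.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic balanced panel (hypothetical data): each store appears twice.
n_store = 60
store = np.r_[np.arange(n_store), np.arange(n_store)]
nj_store = (np.arange(n_store) < 45).astype(float)
nj = np.r_[nj_store, nj_store]
after = np.r_[np.zeros(n_store), np.ones(n_store)]

# A persistent store effect makes the two errors of a store correlated.
store_effect = rng.normal(0, 6, n_store)
fte = (20 - 3 * nj - 2 * after + 2 * nj * after
       + store_effect[store] + rng.normal(0, 3, 2 * n_store))

# Levels regression: njafter, nj, after, constant.
X = np.column_stack([nj * after, nj, after, np.ones(2 * n_store)])
XtX_inv = np.linalg.inv(X.T @ X)
resid = fte - X @ (XtX_inv @ X.T @ fte)

# Heteroskedasticity-robust "meat": every observation contributes on its own.
scores = X * resid[:, None]
meat_robust = scores.T @ scores

# Cluster-robust "meat": scores are summed within each store first.
cluster_scores = np.zeros((n_store, X.shape[1]))
np.add.at(cluster_scores, store, scores)
meat_cluster = cluster_scores.T @ cluster_scores

# Sandwich standard errors, no finite-sample correction.
se_robust = np.sqrt(np.diag(XtX_inv @ meat_robust @ XtX_inv))
se_cluster = np.sqrt(np.diag(XtX_inv @ meat_cluster @ XtX_inv))

# Robust SE of the slope in the differenced regression, same convention.
dfte = fte[n_store:] - fte[:n_store]
D = np.column_stack([nj_store, np.ones(n_store)])
DtD_inv = np.linalg.inv(D.T @ D)
d_resid = dfte - D @ (DtD_inv @ D.T @ dfte)
d_scores = D * d_resid[:, None]
se_diff = np.sqrt(np.diag(DtD_inv @ d_scores.T @ d_scores @ DtD_inv))

print("njafter SE, robust:   ", se_robust[0])
print("njafter SE, clustered:", se_cluster[0])
print("nj SE, differenced:   ", se_diff[0])
```

In this balanced two-period design the clustered standard error in levels and the robust standard error in differences coincide once the corrections are dropped, since both are built from the within-store change in the residual.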

### QUESTION 5: Now estimate the levels model using fixed effects (i.e. xtreg). Which variables get dropped and why?

We need fixed effects for each restaurant (`sheet`) and for each period (`after`).

In [103]:
%%stata
use "C:\Users\aslop\STATA\DinD_ex.dta", clear
xtreg fte nj njafter after, fe i(sheet)


note: nj omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =        698
Group variable: sheet                           Number of groups  =        349

R-sq:                                           Obs per group:
     within  = 0.0114                                         min =          2
     between = 0.0082                                         avg =        2.0
     overall = 0.0004                                         max =          2

                                                F(2,347)          =       2.01
corr(u_i, Xb)  = -0.1033                        Prob > F          =     0.1359

------------------------------------------------------------------------------
         fte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          nj |          0  (omitted)
     njafter |   2.328724   1.178371     1.98   0.049     .01107
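
The note `nj omitted because of collinearity` follows from the within transformation that `xtreg, fe` applies. As a sketch, write the levels model with a restaurant effect $\alpha_i$ (the symbols are labels chosen here for illustration, not notation from the exercise):

$$
fte_{it} = \alpha_i + \beta_1\, nj_i + \beta_2\, after_t + \delta\,(nj_i \times after_t) + \varepsilon_{it}
$$

The fixed-effects estimator subtracts each restaurant's own mean from every variable. With two periods per restaurant, the mean of $after_t$ is $\tfrac{1}{2}$ and the mean of $nj_i \times after_t$ is $\tfrac{1}{2}\, nj_i$, so

$$
fte_{it} - \overline{fte}_i = \beta_1\,(nj_i - nj_i) + \beta_2\left(after_t - \tfrac{1}{2}\right) + \delta\, nj_i\left(after_t - \tfrac{1}{2}\right) + (\varepsilon_{it} - \bar{\varepsilon}_i)
$$

The term in $\beta_1$ is identically zero: `nj` does not vary within a restaurant, so it is absorbed by $\alpha_i$ and cannot be estimated separately. `after` and `njafter` do vary within a restaurant, and the coefficient on `njafter` remains 2.328724.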