Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sunab not matching Stata Sun and Abraham implementation #287

Closed
kylebutts opened this issue Mar 29, 2022 · 3 comments
Closed

sunab not matching Stata Sun and Abraham implementation #287

kylebutts opened this issue Mar 29, 2022 · 3 comments

Comments

@kylebutts
Copy link
Contributor

Hi Laurent,

I've gotten a few reports from folks that the Sun and Abraham code produces wonky results that don't look like other estimators e.g.: kylebutts/did2s#11. I have a feeling there is something off with your code because Sun and Abraham with a never-treated group, the correct reference periods being dropped, and no covariates is numerically equivalent to Callaway and Sant'Anna. In the above example, all three things are true and they don't match.

Prof. Sun has Stata code now for her estimator: https://github.com/lsun20/EventStudyInteract. As you can see, did matches with the stata Sun and Abraham implementation.

Reprex:

R:

library(fixest)
library(did)
data(base_stagg)
head(base_stagg)
#>   id year year_treated time_to_treatment treated treatment_effect_true
#> 2 90    1            2                -1       1                     0
#> 3 89    1            3                -2       1                     0
#> 4 88    1            4                -3       1                     0
#> 5 87    1            5                -4       1                     0
#> 6 86    1            6                -5       1                     0
#> 7 85    1            7                -6       1                     0
#>           x1           y
#> 2 -1.0947021  0.01722971
#> 3 -3.7100676 -4.58084528
#> 4  2.5274402  2.73817174
#> 5 -0.7204263 -0.65103066
#> 6 -3.6711678 -5.33381664
#> 7 -0.3152137  0.49562631

table(base_stagg$year_treated)
#> 
#>     2     3     4     5     6     7     8     9    10 10000 
#>    50    50    50    50    50    50    50    50    50   500
table(base_stagg$time_to_treatment)
#> 
#> -1000    -9    -8    -7    -6    -5    -4    -3    -2    -1     0     1     2 
#>   500     5    10    15    20    25    30    35    40    45    45    40    35 
#>     3     4     5     6     7     8 
#>    30    25    20    15    10     5

res_sunab = feols(y ~ sunab(year_treated, year), base_stagg)
etable(res_sunab)
#>                          res_sunab
#> Dependent Var.:                  y
#>                                   
#> (Intercept)       -0.0827 (0.0997)
#> year = -9           0.6799 (1.046)
#> year = -8         -1.793* (0.7427)
#> year = -7         -1.033. (0.6091)
#> year = -6         -1.113* (0.5299)
#> year = -5        -1.284** (0.4760)
#> year = -4         -0.4538 (0.4365)
#> year = -3         -0.6591 (0.4058)
#> year = -2         -0.0349 (0.3813)
#> year = 0        -4.772*** (0.3610)
#> year = 1        -2.692*** (0.3813)
#> year = 2        -1.632*** (0.4058)
#> year = 3           0.6462 (0.4365)
#> year = 4         2.158*** (0.4760)
#> year = 5         3.272*** (0.5299)
#> year = 6         6.721*** (0.6091)
#> year = 7         5.998*** (0.7427)
#> year = 8          10.24*** (1.046)
#> _______________ __________________
#> S.E. type                      IID
#> Observations                   950
#> R2                         0.47477
#> Adj. R2                    0.42576
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

res_cs <- did::att_gt(
    yname = "y", tname = "year", idname = "id", gname = "year_treated",
    data = base_stagg
)
#> Warning in did::att_gt(yname = "y", tname = "year", idname = "id", gname
#> = "year_treated", : Not returning pre-test Wald statistic due to singular
#> covariance matrix
did::aggte(res_cs, type = "dynamic")
#> 
#> Call:
#> did::aggte(MP = res_cs, type = "dynamic")
#> 
#> Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 
#> 
#> 
#> Overall summary of ATT's based on event-study/dynamic aggregation:  
#>     ATT    Std. Error     [ 95%  Conf. Int.]  
#>  1.3312        0.5741      0.206      2.4563 *
#> 
#> 
#> Dynamic Effects:
#>  Event time Estimate Std. Error [95% Simult.  Conf. Band]  
#>          -8  -2.4592     2.0794       -8.0075      3.0890  
#>          -7   0.5836     0.8039       -1.5615      2.7286  
#>          -6  -0.3572     0.6123       -1.9910      1.2765  
#>          -5  -0.0968     0.6856       -1.9263      1.7326  
#>          -4   0.7133     0.5920       -0.8663      2.2929  
#>          -3  -0.1274     0.4051       -1.2083      0.9534  
#>          -2   0.3993     0.5080       -0.9561      1.7546  
#>          -1  -0.2413     0.4817       -1.5266      1.0440  
#>           0  -5.0415     0.6330       -6.7304     -3.3527 *
#>           1  -3.2209     0.6031       -4.8301     -1.6117 *
#>           2  -2.1232     0.6082       -3.7461     -0.5003 *
#>           3   0.0518     0.6448       -1.6687      1.7724  
#>           4   1.4291     0.8140       -0.7429      3.6011  
#>           5   2.4004     0.9287       -0.0777      4.8784  
#>           6   5.9454     1.0977        3.0166      8.8742 *
#>           7   4.5864     1.6481        0.1888      8.9840 *
#>           8   7.9529     0.9802        5.3376     10.5681 *
#> ---
#> Signif. codes: `*' confidence band does not cover 0
#> 
#> Control Group:  Never Treated,  Anticipation Periods:  0
#> Estimation Method:  Doubly Robust

Created on 2022-03-29 by the reprex package (v2.0.1)

Stata:

use "base_stagg.dta", clear

forvalues k = 9(-1)2 {
   gen g_`k' = time_to_treatment == -`k'
}
forvalues k = 0(1)8 {
	 gen g`k' = time_to_treatment == `k'
}

replace year_treated = . if year_treated == 10000
gen never_treated = (year_treated == .)

set matsize 800
eventstudyinteract y g_* g0-g8, cohort(year_treated) control_cohort(never_treated) ///
	absorb(i.id i.year) vce(cluster id)

(obs=450)

IW estimates for dynamic effects                Number of obs     =        950
Absorbing 2 HDFE groups                         F(  81,     94)   =          .
                                                Prob > F          =          .
                                                R-squared         =     0.5802
                                                Adj R-squared     =     0.4786
                                                Root MSE          =     2.2178
                                    (Std. Err. adjusted for 95 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         g_9 |   2.343297   1.122482     2.09   0.040     .1145821    4.572011
         g_8 |  -.8663914   1.210262    -0.72   0.476    -3.269394    1.536611
         g_7 |  -.3458365   .7748877    -0.45   0.656    -1.884394    1.192721
         g_6 |  -.4711133   .7479421    -0.63   0.530     -1.95617    1.013943
         g_5 |  -.7965442   .6118887    -1.30   0.196    -2.011464    .4183752
         g_4 |  -.2800326    .488554    -0.57   0.568    -1.250068    .6900028
         g_3 |  -.3157657   .5711382    -0.55   0.582    -1.449774    .8182425
         g_2 |   .2413252   .4907698     0.49   0.624    -.7331097     1.21576
          g0 |  -5.041515   .6164013    -8.18   0.000    -6.265394   -3.817636
          g1 |  -3.220936   .6027611    -5.34   0.000    -4.417732    -2.02414
          g2 |  -2.123172   .6242335    -3.40   0.001    -3.362602   -.8837416
          g3 |   .0518106   .6443656     0.08   0.936    -1.227592    1.331214
          g4 |    1.42912   .8115982     1.76   0.082    -.1823276    3.040567
          g5 |   2.400366   .9510404     2.52   0.013     .5120528    4.288679
          g6 |   5.945377   1.039949     5.72   0.000     3.880534     8.01022
          g7 |   4.586416   1.611715     2.85   0.005     1.386318    7.786515
          g8 |   7.952886   .9807291     8.11   0.000     6.005625    9.900146
------------------------------------------------------------------------------


@kylebutts kylebutts changed the title sunab not matching sunab not matching Stata Sun and Abraham implementation Mar 29, 2022
@lrberge
Copy link
Owner

lrberge commented Mar 30, 2022

Hi Kyle,

Nope, the code isn't off: it simply appears that you forgot the fixed-effects in the example:

res_sunab = feols(y ~ sunab(year_treated, year) | id + year, base_stagg)
res_sunab 
#> OLS estimation, Dep. Var.: y
#> Observations: 950 
#> Fixed-effects: id: 95,  year: 10
#> Standard-errors: Clustered (id) 
#>           Estimate Std. Error   t value Pr(>|t|)    
#> year::-9  2.343297   1.122482  2.087603 0.039539 *  
#> year::-8 -0.866391   1.186767 -0.730043 0.467179    
#> year::-7 -0.345837   0.713641 -0.484608 0.629082    
#> year::-6 -0.471113   0.731894 -0.643691 0.521343    
#> year::-5 -0.796544   0.527162 -1.511005 0.134142    
#> year::-4 -0.280033   0.429222 -0.652419 0.515723    
#> year::-3 -0.315766   0.528401 -0.597587 0.551552    
#> year::-2  0.241325   0.457825  0.527113 0.599357    
#> ... 9 coefficients remaining (display them with summary() or use argument n)
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 1.98883     Adj. R2: 0.479262
#>                 Within R2: 0.424684
etable(res_sunab)
#>                          res_sunab
#> Dependent Var.:                  y
#>                                   
#> year = -9           2.343* (1.122)
#> year = -8          -0.8664 (1.187)
#> year = -7         -0.3458 (0.7136)
#> year = -6         -0.4711 (0.7319)
#> year = -5         -0.7965 (0.5272)
#> year = -4         -0.2800 (0.4292)
#> year = -3         -0.3158 (0.5284)
#> year = -2          0.2413 (0.4578)
#> year = 0        -5.042*** (0.4155)
#> year = 1        -3.221*** (0.4579)
#> year = 2        -2.123*** (0.4976)
#> year = 3           0.0518 (0.4784)
#> year = 4           1.429* (0.6622)
#> year = 5          2.400** (0.8176)
#> year = 6          5.945*** (1.002)
#> year = 7           4.586** (1.610)
#> year = 8         7.953*** (0.9807)
#> Fixed-Effects:  ------------------
#> id                             Yes
#> year                           Yes
#> _______________ __________________
#> S.E.: Clustered             by: id
#> Observations                   950
#> R2                         0.58023
#> Within R2                  0.42468
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

It does not mean that there isn't possibly other problems, but at least in this example it works.
Is there any other example with big discrepancies?

@kylebutts
Copy link
Contributor Author

Oh! I'm embarrassed. I think I was looking at https://lrberge.github.io/fixest/reference/sunab.html and didn't realize that unit and time FEs were needed. I see them in the intro vignette. If you think it's a good idea, I can submit a pull request adding fixed effects to the sunab examples?

@lrberge
Copy link
Owner

lrberge commented Mar 31, 2022

Ooooch, that a major documentation bug then! I didn't notice it since I was only looking at the vignette.

I've fixed it thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants