[Request] Allowing custom leads and lags in interact() #23

Closed
SuperMayo opened this issue Jun 10, 2020 · 5 comments

SuperMayo commented Jun 10, 2020

Problem

The interact function is very handy for difference-in-differences setups and makes it easy to plot the estimated coefficients with coefplot. However, by default the function interacts every value of the fe parameter. This is problematic when one wants only some leads and lags.

Example

The basic diff in diff setup presented in the vignette is

data(base_did)
feols(y ~ x1 + treat::period(5) | id + period, base_did)

However, one often wants to set a custom number of leads and lags. For instance, one might want only two pre-treatment coefficients (i.e. 2 leads) and two post-treatment coefficients (i.e. 2 lags).

What I tried

I tried to create a third dummy (hereafter called window) that is equal to 1 if the observation is treated AND in the focus window, so that the interaction between window and period gives the right set of dummies. However, it creates interactions even for the periods where window is always equal to 0, and I end up with a collinearity error.

Code to reproduce:

library(dplyr)
library(fixest)
data(base_did)
# window = 1 only for treated units observed in periods 3 to 7
df <- mutate(base_did, window = ifelse(period > 2 & period < 8 & treat, 1, 0))
est <- feols(y ~ window::period | id + period, df)
#> ...
#> "Presence of collinearity, covariance not defined. Use function collinearity() to pinpoint the problems."
collinearity(est)
# [1] "Variables 'window:period::1', 'window:period::2', 'window:period::8', 'window:period::9' and 'window:period::10' are constant, thus collinear with the fixed-effects."
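
Here is a minimal base-R sketch of why those five variables are constant. It uses a made-up toy panel (two units, ten periods) rather than base_did, and builds the dummies by hand with model.matrix instead of calling interact, so the column names are illustrative only.

```r
# Hypothetical toy panel: 2 units x 10 periods, only unit 1 is treated
toy <- expand.grid(id = 1:2, period = 1:10)
toy$treat <- as.integer(toy$id == 1)
toy$window <- as.integer(toy$period > 2 & toy$period < 8 & toy$treat == 1)

# One dummy per period, each multiplied row-wise by window
inter <- model.matrix(~ 0 + factor(period), toy) * toy$window
colnames(inter) <- paste0("window_period_", 1:10)

# Columns that are zero everywhere carry no information
all_zero <- colnames(inter)[colSums(inter) == 0]
print(all_zero)
```

The all-zero columns are exactly the ones for periods 1, 2, 8, 9 and 10, where window is never equal to 1.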

My hacky solution

The solution is simply to filter out the all-zero dummies after interacting window and period.

library(dplyr)
library(magrittr)  # provides the %$% operator used below
library(fixest)
data("base_did")
df <- base_did %>%
  mutate(window = ifelse(period > 2 & period < 8 & treat, 1, 0)) %>%
  mutate(long_term = ifelse(period >= 8 & treat, 1, 0))
# Build every window x period dummy, then keep only the non-zero columns
coefs <- df %$% interact(window, period) %>% {.[, colSums(.) != 0]}
# Replace ":" so that the column names can be used in a formula
colnames(coefs) <- gsub("\\:", "_", colnames(coefs))
dummies <- colnames(coefs) %>%
  paste(collapse = "+")
df <- cbind(df, coefs)
frmla <- as.formula(paste("y ~ x1+", dummies, " + long_term | id + period"))
feols(frmla, df)

result:

OLS estimation, Dep. Var.: y
Observations: 1,080 
Fixed-effects: id: 108,  period: 10
Standard-errors: Clustered (id) 
                  Estimate Std. Error   z value  Pr(>|z|)    
x1                0.975347   0.046274 21.078000 < 2.2e-16 ***
window_period__3  1.049600   0.935761  1.121700  0.262288    
window_period__4 -0.473162   0.969770 -0.487912  0.625724    
window_period__5  1.324300   0.947332  1.397900  0.162461    
window_period__6  2.108000   0.874269  2.411200  0.016088 *  
window_period__7  4.921800   0.790574  6.225600  7.17e-10 ***
long_term         6.373900   0.696438  9.152200 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Log-likelihood: -2,988.30   Adj. R2: 0.48591 
                          R2-Within: 0.38541 

Proposed interface

  • Adding explicit lead and lag options (preferred solution).
    Maybe lead and lag are too economics-centered, and pre and post would be clearer.
feols(y ~ treat::period(ref=5, lead=2, lag=2) | id + period, df)

# Non numeric variable
all_months = c("aug", "sept", "oct", "nov", "dec", "jan",
                "feb", "mar", "apr", "may", "jun", "jul")
base_did$period_month = all_months[base_did$period]
feols(y ~ x1 + i(treat, period_month, "oct", lead=c("aug", "sept"), lag=c("nov", "dec")) | id+period, base_did)
  • Adding a range parameter to the interact function.
feols(y ~ treat::period(range=3:7) | id + period, df)
  • Auto guess and filtering of null interactions
feols(y ~ window::period | id + period, df)
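
To make the link between the first two proposals concrete, here is a small base-R sketch of how ref, lead and lag would translate into a range of periods. The variable names are only illustrative and none of this is existing fixest behaviour.

```r
# Proposed (hypothetical) arguments
ref  <- 5   # reference period
lead <- 2   # number of pre-treatment periods to keep
lag  <- 2   # number of post-treatment periods to keep

# Periods inside the window, reference included
window_periods <- (ref - lead):(ref + lag)
print(window_periods)  # 3 4 5 6 7

# Periods left out of the interaction
outside <- setdiff(1:10, window_periods)
print(outside)  # 1 2 8 9 10
```

With these values, lead=2 and lag=2 around ref=5 is the same request as range=3:7.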

Final words

I can try implementing the function with a little help from the developer.

lrberge (Owner) commented Jun 11, 2020

Thanks for your very clear request!

I've just released a new version of the package (0.5.0) so a few comments are in order:

  1. Collinear variables are now removed on the fly, so your solution with window should work (and it does).
  2. I've added the possibility of using multiple references in interactions, so you can do what you want directly there:
data(base_did)
df = base_did ; df$window = ifelse(df$period %in% 3:7 & df$treat, 1, 0)
est1 = feols(y ~ x1 + window::period | id + period, df)
#> Variables 'window:period::1', 'window:period::2' and 3 others have been removed because of collinearity (see $collin.var).
est2 = feols(y ~ x1 + treat::period(c(1:2, 8:10)) | id + period, base_did)
etable(est1, est2)
#>                                     est1                est2
#> x1                   0.9907*** (0.04843) 0.9907*** (0.04843)
#> window:period::3       -2.789** (0.8627)
#> window:period::4      -4.314*** (0.9298)
#> window:period::5       -2.502** (0.9101)
#> window:period::6        -1.725* (0.8379)
#> window:period::7          1.083 (0.7922)
#> treat:period::3                            -2.789** (0.8627)
#> treat:period::4                           -4.314*** (0.9298)
#> treat:period::5                            -2.502** (0.9101)
#> treat:period::6                             -1.725* (0.8379)
#> treat:period::7                               1.083 (0.7922)
#> Fixed-Effects:       ------------------- -------------------
#> id                                   Yes                 Yes
#> period                               Yes                 Yes
#> ___________________  ___________________ ___________________
#> Observations                       1,080               1,080
#> S.E. type: Clustered              by: id              by: id
#> R2                               0.50713             0.50713
#> Within R2                        0.33496             0.33496

The big problem is that these changes introduced bugs in coefplot (sigh), because the automatic reference was assumed to be a single value and because of other internal quirks.
In est2, the interaction plot works:

coefplot(est2)

image

But then all the references are represented as data points.
In est1, the interaction plot throws an error. A workaround is to do a "normal" coefplot:

coefplot(est1, only.inter = FALSE, drop = "x1")

image

You can also locate the coefficients where you want on the x-axis using the argument x (in case the factors are not consecutive numbers):

coefplot(est1, only.inter = FALSE, drop = "x1", x = c(0, 3.5, 4, 4.5, 6.5, 7), xlim.add = c(-0.1,0.1))

image

Note that you still have to provide a value in x for the dropped x1 coefficient (I'll try to fix that when I have the time).

On your suggestions

As you noticed, the factor to be interacted with can be of any type (logical, numeric, character, factor) and can represent anything, not only time periods. So the lead and lag arguments are off the table since they are not general enough.

I think I'll introduce the two arguments keep and drop on top of the ref argument. Why? Because the meaning of what keep and drop do is explicit, intuitive, and applies to any situation.
I will also keep the ref argument to ensure a synergy with coefplot. The difference between ref and drop would be:

  • coefficients removed by ref will appear in the output of coefplot
  • coefficients removed by drop will not appear in the output of coefplot
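
A rough sketch of the intended semantics, in plain R. The helper select_levels is purely hypothetical, written only for illustration; it is not the package code and the final arguments may behave differently.

```r
# Hypothetical helper: which levels get a coefficient, which get plotted
select_levels <- function(lv, ref = NULL, keep = NULL, drop = NULL) {
  # keep restricts the levels; drop removes levels entirely
  kept <- if (is.null(keep)) lv else lv[lv %in% keep]
  kept <- kept[!kept %in% drop]
  list(
    estimated = kept[!kept %in% ref],  # references get no coefficient
    plotted   = kept                   # but they still show up in the plot
  )
}

sel <- select_levels(1:10, ref = 5, keep = 3:7)
print(sel$estimated)  # 3 4 6 7
print(sel$plotted)    # 3 4 5 6 7
```

Here period 5 is a reference, so it has no coefficient but stays on the plot, while periods 1, 2, 8, 9 and 10 disappear from both.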

What do you think? Would these changes be OK?

In any case, thanks a lot for the effort, very appreciated!

SuperMayo (Author) commented Jun 15, 2020

Thank you for your fast answer. It now works as expected!

I agree with you about using keep and drop instead of the too specific lead and lag.

lrberge (Owner) commented Jun 16, 2020

Great then! :-)

I'll close the issue when I add the new arguments.

lrberge (Owner) commented Jul 7, 2020

That's done! Still haven't fixed the coefplot bug though, but I will soon.
Thanks again!

lrberge closed this Jul 7, 2020

lrberge (Owner) commented Jul 10, 2020

FYI: I just fixed the bug that occurred when interacted variables were removed because of collinearity.
