Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: VCV for multiple outcomes #424

Open
jonathandroth opened this issue Jun 23, 2023 · 8 comments
Open

Feature request: VCV for multiple outcomes #424

jonathandroth opened this issue Jun 23, 2023 · 8 comments

Comments

@jonathandroth
Copy link

Laurent -- thanks again for the fantastic package!

I think it would be great if fixest could return the joint variance-covariance matrix when estimating multiple regressions at once. I.e., I run

feols(c(y1,y2) ~ x,...)

Now I'd like to know the covariance between the coefficient on x when using y1 as an outcome and y2 as an outcome. (This is what's sometime called seeming unrelated regression.) There are some wrap-arounds available (e.g. here), but it would be great if this could easily be done natively.

@kylebutts
Copy link
Contributor

@jonathandroth, here's a basic version (no clustering). Two questions and I can make this more robust:

  1. What is the correct way to account for degrees of freedom? Is it sum of number of rows in each est - sum of number of covariates in each est?
  2. Would clustered standard errors be useful? I believe you would do sum $X_i' e_i$ for each g and premultiply by $(X'X)^{-1}$ to get a $N_g$ by $k$ matrix of influence functions (instead of N by k).
library(fixest)

#' Get joint variance-covariance matrix of seemingly unrelated regressions
#' 
#' @param ests List of `feols` estimates.
#' 
#' @return Variance-covariance matrix of the combined set of coefficients in
#' all models in `ests`.
#' 
vcovSUR <- function(ests) {
  if (inherits(ests, "fixest") | inherits(ests, "fixest_multi")) { 
    ests = list(ests)
  } 
  
  inf_list <- lapply(ests, get_influence_func)
  inf_funcs <- do.call("rbind", inf_list)
  
  vcov = tcrossprod(inf_funcs)
  return(vcov)
}

get_influence_func <- function(est) {
  # (X'X)^{-1} X'e
  if (inherits(est, "fixest_multi")) {
    
    inf_list = lapply(est, function(x) { 
      (x$cov.iid / x$sigma2) %*% t(x$scores)
    })
    do.call("rbind", inf_list)

  } else if (inherits(est, "fixest")) {
    
    (est$cov.iid / est$sigma2) %*% t(est$scores)
  
  } else { # code taken partially from `sandwich`
  
    X <- stats::model.matrix(est)
    if (any(alias <- is.na(stats::coef(est)))) X <- X[, !alias, drop = FALSE]
    if (!is.null(w <- stats::weights(est))) X <- X * sqrt(w)
    scores = X * stats::resid(est)
    solve(crossprod(X), t(scores))

  }
}

mtcars$w = runif(nrow(mtcars), 0.15, 0.85)
est1 <- feols(mpg ~ wt, mtcars)
est2 <- feols(mpg ~ wt + i(cyl), mtcars)
est3 <- lm(mpg ~ wt + i(cyl), mtcars)
est_w <- feols(mpg ~ wt + i(cyl), mtcars, weights = ~w)
est_multi <- feols(c(mpg, hp)  ~ wt + i(cyl), mtcars)

vcov(est1, "hc1", ssc = ssc(adj = FALSE, cluster.adj = FALSE))
#>             (Intercept)        wt
#> (Intercept)    4.516833 -1.305717
#> wt            -1.305717  0.401577
vcov(est2, "hc1", ssc = ssc(adj = FALSE, cluster.adj = FALSE))
#>             (Intercept)         wt     cyl::6     cyl::8
#> (Intercept)   3.2700821 -0.9321293 -0.3715946  0.3477177
#> wt           -0.9321293  0.3783737 -0.2442125 -0.5329499
#> cyl::6       -0.3715946 -0.2442125  1.2716154  1.3082462
#> cyl::8        0.3477177 -0.5329499  1.3082462  1.9913114
vcovSUR(list(est1, est2))
#>             (Intercept)          wt (Intercept)         wt     cyl::6
#> (Intercept)   4.5168333 -1.30571703   3.3911740 -0.8658345 -0.6364905
#> wt           -1.3057170  0.40157696  -0.9446715  0.2607989  0.1249679
#> (Intercept)   3.3911740 -0.94467154   3.2700821 -0.9321293 -0.3715946
#> wt           -0.8658345  0.26079887  -0.9321293  0.3783737 -0.2442125
#> cyl::6       -0.6364905  0.12496790  -0.3715946 -0.2442125  1.2716154
#> cyl::8       -0.2795933  0.06152524   0.3477177 -0.5329499  1.3082462
#>                  cyl::8
#> (Intercept) -0.27959331
#> wt           0.06152524
#> (Intercept)  0.34771768
#> wt          -0.53294991
#> cyl::6       1.30824624
#> cyl::8       1.99131136

# Works with weights
vcov(est_w, "hc1", ssc = ssc(adj = FALSE, cluster.adj = FALSE))
#>             (Intercept)          wt      cyl::6     cyl::8
#> (Intercept)   5.0756301 -1.12152509 -1.59657428 -0.4062194
#> wt           -1.1215251  0.33655783  0.07906278 -0.2880801
#> cyl::6       -1.5965743  0.07906278  1.47597744  1.2937823
#> cyl::8       -0.4062194 -0.28808008  1.29378231  1.9053256
vcovSUR(est_w)
#>             (Intercept)          wt      cyl::6     cyl::8
#> (Intercept)   5.0756301 -1.12152509 -1.59657428 -0.4062194
#> wt           -1.1215251  0.33655783  0.07906278 -0.2880801
#> cyl::6       -1.5965743  0.07906278  1.47597744  1.2937823
#> cyl::8       -0.4062194 -0.28808008  1.29378231  1.9053256

# Works with multi
lapply(est_multi, function(est) vcov(est, "hc1", ssc = ssc(adj = FALSE, cluster.adj = FALSE)))
#> $`lhs: mpg`
#>             (Intercept)         wt     cyl::6     cyl::8
#> (Intercept)   3.2700821 -0.9321293 -0.3715946  0.3477177
#> wt           -0.9321293  0.3783737 -0.2442125 -0.5329499
#> cyl::6       -0.3715946 -0.2442125  1.2716154  1.3082462
#> cyl::8        0.3477177 -0.5329499  1.3082462  1.9913114
#> 
#> $`lhs: hp`
#>             (Intercept)         wt    cyl::6    cyl::8
#> (Intercept)    481.0195 -189.14317 135.81828  479.2033
#> wt            -189.1432   80.30526 -73.10224 -221.1762
#> cyl::6         135.8183  -73.10224 174.73028  230.6635
#> cyl::8         479.2033 -221.17619 230.66346  730.4108
vcovSUR(est_multi)
#>             (Intercept)         wt     cyl::6      cyl::8  (Intercept)
#> (Intercept)   3.2700821 -0.9321293 -0.3715946   0.3477177   -3.6102659
#> wt           -0.9321293  0.3783737 -0.2442125  -0.5329499    0.2846401
#> cyl::6       -0.3715946 -0.2442125  1.2716154   1.3082462    2.1105066
#> cyl::8        0.3477177 -0.5329499  1.3082462   1.9913114   -1.6623714
#> (Intercept)  -3.6102659  0.2846401  2.1105066  -1.6623714  481.0194872
#> wt            0.2846401 -0.2300793  0.7005150   2.4442424 -189.1431698
#> cyl::6        2.1105066  0.7005150 -6.1116792  -6.4158361  135.8182788
#> cyl::8       -1.6623714  2.4442424 -6.4158361 -13.5626121  479.2033017
#>                       wt     cyl::6      cyl::8
#> (Intercept)    0.2846401   2.110507   -1.662371
#> wt            -0.2300793   0.700515    2.444242
#> cyl::6         0.7005150  -6.111679   -6.415836
#> cyl::8         2.4442424  -6.415836  -13.562612
#> (Intercept) -189.1431698 135.818279  479.203302
#> wt            80.3052563 -73.102243 -221.176193
#> cyl::6       -73.1022434 174.730281  230.663464
#> cyl::8      -221.1761929 230.663464  730.410817

Created on 2023-07-10 with reprex v2.0.2

@kylebutts
Copy link
Contributor

@grantmcdermott @vincentarelbundock, is there a way to call marginaleffects::hypothesis providing a vector of coefficients and the variance-covariance matrix?

@vincentarelbundock
Copy link
Contributor

@kylebutts maybe the FUN argument in hypotheses() can help. (Not sure.)

@jonathandroth
Copy link
Author

Thanks, @kylebutts !

Re d.o.f., Stata's sureg command has the following options:

image

Yes, I do think clustered SEs would be a nice add-on.

@kylebutts
Copy link
Contributor

@vincentarelbundock That worked with defining custom get_coef and set_coef methods! Thanks!

@vincentarelbundock
Copy link
Contributor

Cool! Is that too niche, or is it something that should be merged in the package, you think?

@kylebutts
Copy link
Contributor

@jonathandroth See https://github.com/kylebutts/vcovSUR :-)

✔️ Variance-covariance matrix estimation
✔️ Clustering (which I realized is necessary for clustering by observation id)
✔️ Support for fixest_multi objects
✔️ Hypothesis Tests from {marginaleffects}
✖︎ No small-sample degree of freedom correction (yet).

@jonathandroth
Copy link
Author

jonathandroth commented Jul 14, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants