Feature request: VCV for multiple outcomes #424

jonathandroth · 2023-06-23T17:08:31Z

Laurent -- thanks again for the fantastic package!

I think it would be great if fixest could return the joint variance-covariance matrix when estimating multiple regressions at once. I.e., I run

feols(c(y1,y2) ~ x,...)

Now I'd like to know the covariance between the coefficient on x when using y1 as an outcome and y2 as an outcome. (This is what's sometime called seeming unrelated regression.) There are some wrap-arounds available (e.g. here), but it would be great if this could easily be done natively.

The text was updated successfully, but these errors were encountered:

kylebutts · 2023-07-10T16:34:40Z

@jonathandroth, here's a basic version (no clustering). Two questions and I can make this more robust:

What is the correct way to account for degrees of freedom? Is it sum of number of rows in each est - sum of number of covariates in each est?
Would clustered standard errors be useful? I believe you would do sum $X_i' e_i$ for each g and premultiply by $(X'X)^{-1}$ to get a $N_g$ by $k$ matrix of influence functions (instead of N by k).

library(fixest)

#' Get joint variance-covariance matrix of seemingly unrelated regressions
#' 
#' @param ests List of `feols` estimates.
#' 
#' @return Variance-covariance matrix of the combined set of coefficients in
#' all models in `ests`.
#' 
vcovSUR <- function(ests) {
  if (inherits(ests, "fixest") | inherits(ests, "fixest_multi")) { 
    ests = list(ests)
  } 
  
  inf_list <- lapply(ests, get_influence_func)
  inf_funcs <- do.call("rbind", inf_list)
  
  vcov = tcrossprod(inf_funcs)
  return(vcov)
}

get_influence_func <- function(est) {
  # (X'X)^{-1} X'e
  if (inherits(est, "fixest_multi")) {
    
    inf_list = lapply(est, function(x) { 
      (x$cov.iid / x$sigma2) %*% t(x$scores)
    })
    do.call("rbind", inf_list)

  } else if (inherits(est, "fixest")) {
    
    (est$cov.iid / est$sigma2) %*% t(est$scores)
  
  } else { # code taken partially from `sandwich`
  
    X <- stats::model.matrix(est)
    if (any(alias <- is.na(stats::coef(est)))) X <- X[, !alias, drop = FALSE]
    if (!is.null(w <- stats::weights(est))) X <- X * sqrt(w)
    scores = X * stats::resid(est)
    solve(crossprod(X), t(scores))

  }
}

mtcars$w = runif(nrow(mtcars), 0.15, 0.85)
est1 <- feols(mpg ~ wt, mtcars)
est2 <- feols(mpg ~ wt + i(cyl), mtcars)
est3 <- lm(mpg ~ wt + i(cyl), mtcars)
est_w <- feols(mpg ~ wt + i(cyl), mtcars, weights = ~w)
est_multi <- feols(c(mpg, hp)  ~ wt + i(cyl), mtcars)

vcov(est1, "hc1", ssc = ssc(adj = FALSE, cluster.adj = FALSE))
#>             (Intercept)        wt
#> (Intercept)    4.516833 -1.305717
#> wt            -1.305717  0.401577
vcov(est2, "hc1", ssc = ssc(adj = FALSE, cluster.adj = FALSE))
#>             (Intercept)         wt     cyl::6     cyl::8
#> (Intercept)   3.2700821 -0.9321293 -0.3715946  0.3477177
#> wt           -0.9321293  0.3783737 -0.2442125 -0.5329499
#> cyl::6       -0.3715946 -0.2442125  1.2716154  1.3082462
#> cyl::8        0.3477177 -0.5329499  1.3082462  1.9913114
vcovSUR(list(est1, est2))
#>             (Intercept)          wt (Intercept)         wt     cyl::6
#> (Intercept)   4.5168333 -1.30571703   3.3911740 -0.8658345 -0.6364905
#> wt           -1.3057170  0.40157696  -0.9446715  0.2607989  0.1249679
#> (Intercept)   3.3911740 -0.94467154   3.2700821 -0.9321293 -0.3715946
#> wt           -0.8658345  0.26079887  -0.9321293  0.3783737 -0.2442125
#> cyl::6       -0.6364905  0.12496790  -0.3715946 -0.2442125  1.2716154
#> cyl::8       -0.2795933  0.06152524   0.3477177 -0.5329499  1.3082462
#>                  cyl::8
#> (Intercept) -0.27959331
#> wt           0.06152524
#> (Intercept)  0.34771768
#> wt          -0.53294991
#> cyl::6       1.30824624
#> cyl::8       1.99131136

# Works with weights
vcov(est_w, "hc1", ssc = ssc(adj = FALSE, cluster.adj = FALSE))
#>             (Intercept)          wt      cyl::6     cyl::8
#> (Intercept)   5.0756301 -1.12152509 -1.59657428 -0.4062194
#> wt           -1.1215251  0.33655783  0.07906278 -0.2880801
#> cyl::6       -1.5965743  0.07906278  1.47597744  1.2937823
#> cyl::8       -0.4062194 -0.28808008  1.29378231  1.9053256
vcovSUR(est_w)
#>             (Intercept)          wt      cyl::6     cyl::8
#> (Intercept)   5.0756301 -1.12152509 -1.59657428 -0.4062194
#> wt           -1.1215251  0.33655783  0.07906278 -0.2880801
#> cyl::6       -1.5965743  0.07906278  1.47597744  1.2937823
#> cyl::8       -0.4062194 -0.28808008  1.29378231  1.9053256

# Works with multi
lapply(est_multi, function(est) vcov(est, "hc1", ssc = ssc(adj = FALSE, cluster.adj = FALSE)))
#> $`lhs: mpg`
#>             (Intercept)         wt     cyl::6     cyl::8
#> (Intercept)   3.2700821 -0.9321293 -0.3715946  0.3477177
#> wt           -0.9321293  0.3783737 -0.2442125 -0.5329499
#> cyl::6       -0.3715946 -0.2442125  1.2716154  1.3082462
#> cyl::8        0.3477177 -0.5329499  1.3082462  1.9913114
#> 
#> $`lhs: hp`
#>             (Intercept)         wt    cyl::6    cyl::8
#> (Intercept)    481.0195 -189.14317 135.81828  479.2033
#> wt            -189.1432   80.30526 -73.10224 -221.1762
#> cyl::6         135.8183  -73.10224 174.73028  230.6635
#> cyl::8         479.2033 -221.17619 230.66346  730.4108
vcovSUR(est_multi)
#>             (Intercept)         wt     cyl::6      cyl::8  (Intercept)
#> (Intercept)   3.2700821 -0.9321293 -0.3715946   0.3477177   -3.6102659
#> wt           -0.9321293  0.3783737 -0.2442125  -0.5329499    0.2846401
#> cyl::6       -0.3715946 -0.2442125  1.2716154   1.3082462    2.1105066
#> cyl::8        0.3477177 -0.5329499  1.3082462   1.9913114   -1.6623714
#> (Intercept)  -3.6102659  0.2846401  2.1105066  -1.6623714  481.0194872
#> wt            0.2846401 -0.2300793  0.7005150   2.4442424 -189.1431698
#> cyl::6        2.1105066  0.7005150 -6.1116792  -6.4158361  135.8182788
#> cyl::8       -1.6623714  2.4442424 -6.4158361 -13.5626121  479.2033017
#>                       wt     cyl::6      cyl::8
#> (Intercept)    0.2846401   2.110507   -1.662371
#> wt            -0.2300793   0.700515    2.444242
#> cyl::6         0.7005150  -6.111679   -6.415836
#> cyl::8         2.4442424  -6.415836  -13.562612
#> (Intercept) -189.1431698 135.818279  479.203302
#> wt            80.3052563 -73.102243 -221.176193
#> cyl::6       -73.1022434 174.730281  230.663464
#> cyl::8      -221.1761929 230.663464  730.410817

^{Created on 2023-07-10 with reprex v2.0.2}

kylebutts · 2023-07-11T01:18:09Z

@grantmcdermott @vincentarelbundock, is there a way to call marginaleffects::hypothesis providing a vector of coefficients and the variance-covariance matrix?

vincentarelbundock · 2023-07-11T06:22:35Z

@kylebutts maybe the FUN argument in hypotheses() can help. (Not sure.)

jonathandroth · 2023-07-11T13:16:30Z

Thanks, @kylebutts !

Re d.o.f., Stata's sureg command has the following options:

Yes, I do think clustered SEs would be a nice add-on.

kylebutts · 2023-07-12T14:58:50Z

@vincentarelbundock That worked with defining custom get_coef and set_coef methods! Thanks!

vincentarelbundock · 2023-07-12T15:01:27Z

Cool! Is that too niche, or is it something that should be merged in the package, you think?

kylebutts · 2023-07-12T15:17:51Z

@jonathandroth See https://github.com/kylebutts/vcovSUR :-)

✔️ Variance-covariance matrix estimation
✔️ Clustering (which I realized is necessary for clustering by observation id)
✔️ Support for fixest_multi objects
✔️ Hypothesis Tests from {marginaleffects}
✖︎ No small-sample degree of freedom correction (yet).

jonathandroth · 2023-07-14T13:44:15Z

Awesome! My own 2 cents is it would be useful to have a native fixest feature for doing hypothesis tests when using multiple outcomes, but I'll leave it up to the package development experts here. Either way this will be a great resource

…

On Wed, Jul 12, 2023 at 11:18 AM Kyle F Butts ***@***.***> wrote: @jonathandroth <https://github.com/jonathandroth> See https://github.com/kylebutts/vcovSUR :-) ✔️ Variance-covariance matrix estimation ✔️ Clustering (which I realized is necessary for clustering by observation id) ✔️ Support for fixest_multi objects ✔️ Hypothesis Tests from {marginaleffects} ✖︎ No small-sample degree of freedom correction (yet). — Reply to this email directly, view it on GitHub <#424 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AE6EXFCGYE76O27U4ZRVGWDXP252VANCNFSM6AAAAAAZR2EXXE> . You are receiving this because you were mentioned.Message ID: ***@***.***>

kylebutts mentioned this issue Jul 12, 2023

Generic for hypotheses testing with coeftable and vcov vincentarelbundock/marginaleffects#838

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature request: VCV for multiple outcomes #424

Feature request: VCV for multiple outcomes #424

jonathandroth commented Jun 23, 2023

kylebutts commented Jul 10, 2023

kylebutts commented Jul 11, 2023

vincentarelbundock commented Jul 11, 2023

jonathandroth commented Jul 11, 2023

kylebutts commented Jul 12, 2023

vincentarelbundock commented Jul 12, 2023

kylebutts commented Jul 12, 2023

jonathandroth commented Jul 14, 2023 via email

Feature request: VCV for multiple outcomes #424

Feature request: VCV for multiple outcomes #424

Comments

jonathandroth commented Jun 23, 2023

kylebutts commented Jul 10, 2023

kylebutts commented Jul 11, 2023

vincentarelbundock commented Jul 11, 2023

jonathandroth commented Jul 11, 2023

kylebutts commented Jul 12, 2023

vincentarelbundock commented Jul 12, 2023

kylebutts commented Jul 12, 2023

jonathandroth commented Jul 14, 2023 via email