Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vi_model for glm should select z value instead of t value #74

Closed
RoelVerbelen opened this issue Oct 31, 2019 · 0 comments
Closed

vi_model for glm should select z value instead of t value #74

RoelVerbelen opened this issue Oct 31, 2019 · 0 comments

Comments

@RoelVerbelen
Copy link

@RoelVerbelen RoelVerbelen commented Oct 31, 2019

The function vip:::vi_model.lm is also used for glm type models, but always selects the "t value" column. The relevant line of code inside this function:

tib <- tibble::tibble(Variable = rownames(coefs), 
    Importance = abs(coefs[, "t value"]), 
    Sign = ifelse(sign(coefs[, "Estimate"]) ==1, yes = "POS", no = "NEG"))

When the family argument is not gaussian, the column name is "z value" instead of "t value", which causes vip:::vi_model.lm to break. Minimal example:

df <- data.frame(norm = rnorm(30), pois = rpois(30, 1), unif = runif(30))

gaussian_fit <- glm(norm ~ unif, family = "gaussian", data = df)
summary(gaussian_fit)$coefficients
#>               Estimate Std. Error    t value  Pr(>|t|)
#> (Intercept)  0.3453485  0.4902475  0.7044371 0.4869788
#> unif        -0.9664243  0.7295934 -1.3246067 0.1960119
vip::vi(gaussian_fit)
#> # A tibble: 1 x 3
#>   Variable Importance Sign 
#>   <chr>         <dbl> <chr>
#> 1 unif           1.32 NEG

poisson_fit <- glm(pois ~ unif, family = "poisson", data = df)
summary(poisson_fit)$coefficients
#>               Estimate Std. Error   z value  Pr(>|z|)
#> (Intercept)  0.6808461  0.3340880  2.037924 0.0415575
#> unif        -0.7225532  0.5352184 -1.350016 0.1770109
vip::vi(poisson_fit)
#> Error in coefs[, "t value"]: subscript out of bounds

Created on 2019-10-31 by the reprex package (v0.3.0)

The logic behind z/t value is in stats::summary.glm. The easiest fix is to select by location instead: the z/t value is always in column 3.

@bgreenwell bgreenwell closed this Nov 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.