Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

relative.influence gives different value when n.trees is given or not for multinomial distributions #31

Closed
see24 opened this issue Nov 2, 2018 · 4 comments

Comments

@see24
Copy link
Contributor

see24 commented Nov 2, 2018

I noticed that relative.influence and summary give different answers for the relative influence for multinomial distributions and when I looked a bit deeper I noticed that relative.influence gives a different result if it is called with n.trees specified or the default which is to take the object$n.trees

library(gbm)
gbm_iris <- gbm(Species~., data = iris, interaction.depth = 3)
# Distribution not specified, assuming multinomial ...

relative.influence(gbm_iris)
# n.trees not given. Using 100 trees.
# Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
# 11.07564     92.51308   2343.64249   1488.99965 

relative.influence(gbm_iris, n.trees = 100)
# Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
# 0.9626096   29.2052748  960.5071372  455.1289295

summary(gbm_iris, normalize = FALSE)
#                      var     rel.inf
# Petal.Length Petal.Length 960.5071372
# Petal.Width   Petal.Width 455.1289295
# Sepal.Width   Sepal.Width  29.2052748
# Sepal.Length Sepal.Length   0.9626096

The problem is that in relative.influence it multiplies n.trees by the number of classes but this multiplication is incorrectly located inside the if(missing(n.trees)) statement. Should be an easy fix but fairly important.

@bgreenwell
Copy link
Contributor

@see24 Good catch, and thank you! I'll take a look at the PR this weekend!

@cunningjames
Copy link
Collaborator

cunningjames commented Nov 6, 2018

@bgreenwell The pull request looks good to me, for what it's worth. I'm not imaginative enough to think of a use case in which you'd not want to perform that multiplication, so the pull request as written seems as reasonable to me as the alternative of keeping relative.influence as-is but multiplying within summary.gbm.

@bgreenwell
Copy link
Contributor

Thanks @cunningjames for the review.

@cunningjames
Copy link
Collaborator

@bgreenwell Hey, no problem. The gargantuan effort it took for me to validate a three-line pull request won't be soon forgotten, though!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants