Levels with zero weight are always assigned to the right child #45

Closed
pat-oreilly opened this issue Mar 16, 2015 · 1 comment

library(gbm)

# 100 observations, one per factor level; a single tree fit on a 10% bag
g <- gbm(y ~ x,
         distribution="gaussian",
         train.fraction=1,
         bag.fraction=0.1,
         data=data.frame(x=as.factor(1:100), y=rnorm(100)),
         n.trees=1,
         n.minobsinnode=1)

g$c.splits[[1]]
 [1]  1  1 -1  1  1  1  1 -1  1 -1  1  1 -1  1  1  1  1  1  1  1  1  1
[23]  1 -1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[45]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[67]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[89]  1  1  1  1  1  1  1  1  1  1  1  1

In this example, 5 levels are assigned to the left child (-1) and 95 to the right child (1). Of the 95 at the right child, 90 had zero weight during training, since bag.fraction=0.1 leaves only 10 of the 100 observations in the bag. It's an artificial example, but a node having zero weight for a particular level is quite possible when training a real model due to

  • correlation among predictors
  • bag.fraction < 1
  • predictors with high cardinality.
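To illustrate how a bag fraction below 1 combined with a high-cardinality factor produces zero-weight levels, here is a minimal base-R sketch. It does not call gbm; it assumes the bag is a simple random sample, without replacement, of bag.fraction * n rows, and the names in_bag and level_weight are made up for the illustration.

```r
# Sketch only: mimics bagging, does not use gbm internals.
# Assumption: the bag is a simple random sample without replacement.
set.seed(1)
n <- 100
bag.fraction <- 0.1
x <- factor(1:n)                               # one observation per level

# Rows that end up in the bag for this tree
in_bag <- sample(n, round(bag.fraction * n))

# Number of in-bag observations per factor level
level_weight <- tabulate(as.integer(x)[in_bag], nbins = nlevels(x))

# Levels the tree never saw while choosing the split
n_zero <- sum(level_weight == 0)
n_zero
```

With one observation per level, every level outside the bag has zero weight, so the count is always n minus the bag size regardless of the seed.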

I'm wondering if this could be an issue, since the tree is (in expectation) over-predicting for zero-weight levels. Granted, later trees will attempt to correct for any over-prediction, but

  1. the later trees may also have zero-weight levels at their nodes
  2. convergence may be improved if the need for correction is avoided.

Perhaps in these circumstances it is more reasonable to use the prediction at the parent node since there is no data to suggest whether the zero-weight levels should be assigned to either the left or right child?
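A rough sketch of what that fallback could look like for a single categorical split, in base R. This is not gbm code: predict_split, level_weight, and the toy numbers are hypothetical, and the parent prediction is assumed to be the weighted mean of the two child predictions (which holds for squared-error loss).

```r
# Hypothetical helper, not part of gbm: predict for one categorical split,
# falling back to the parent prediction for levels with zero weight.
predict_split <- function(level, split, level_weight,
                          left_pred, right_pred, parent_pred) {
  ifelse(level_weight[level] == 0,
         parent_pred,                                   # no data for this level
         ifelse(split[level] == -1, left_pred, right_pred))
}

split        <- c(-1, 1, 1, 1)   # -1 = left child, 1 = right child
level_weight <- c( 2, 3, 0, 0)   # levels 3 and 4 were never in the bag
left_pred    <- -0.5
right_pred   <-  1.0

# Assumed parent prediction: weighted mean of the children (weights 2 and 3)
parent_pred <- (2 * left_pred + 3 * right_pred) / 5

res <- predict_split(1:4, split, level_weight,
                     left_pred, right_pred, parent_pred)
res
```

Under the current behaviour, levels 3 and 4 would receive right_pred; with the fallback they receive parent_pred instead, while levels that carried weight are unaffected.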
