How are predictions calculated from the individual trees? #42

markseeto · 2019-06-29T01:00:32Z

How are the predictions from predict() calculated from the individual trees?

If I understand the documentation correctly, using single.tree=TRUE in predict() gives the prediction from an individual tree or trees. But I can't see how to combine the individual predictions. I thought they would be added together with shrinkage applied to each subsequent tree, but that doesn't appear to be correct.

Example:

library(gbm)

set.seed(1)

shr <- 0.1  # shrinkage value

gbm.iris <- gbm(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                data = iris, distribution = "multinomial",
                shrinkage = shr, bag.fraction = 1, n.trees = 10)

new.data <- data.frame(Sepal.Length = c(4.7, 4.9, 6.2, 7.1),
                       Sepal.Width = c(2.7, 2.3, 3.1, 3.2),
                       Petal.Length = c(2.5, 2.5, 3.5, 4.9),
                       Petal.Width = c(0.5, 0.8, 1.8, 1.6))

# How is predict(gbm.iris, newdata = new.data, n.trees = 2) calculated?

predict(gbm.iris, newdata = new.data, n.trees = 2)

## , , 2
##          setosa  versicolor  virginica
## [1,] -0.2903287  0.13823604 -0.2523327
## [2,] -0.2903287  0.13823604 -0.2523327
## [3,] -0.2903287 -0.06420611  0.5207189
## [4,] -0.2903287  0.13823604 -0.2523327

# Not the same:
predict(gbm.iris, newdata = new.data, n.trees = 1) +
  shr*predict(gbm.iris, newdata = new.data, n.trees = 2, single.tree=TRUE)

# Not the same:
predict(gbm.iris, newdata = new.data, n.trees = 1) +
  predict(gbm.iris, newdata = new.data, n.trees = 2, single.tree=TRUE)

I'm using gbm version 2.1.5.

Thanks.


cunningjames · 2019-09-27T14:03:43Z

For what it's worth, the shrinkage parameter appears to be a no-op from the perspective of gbm.fit. For this example:

library(gbm)

set.seed(1)

gbm.iris <- gbm(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                data = iris, distribution = "multinomial",
                # shrinkage = shr,
                bag.fraction = 1, n.trees = 10)

new.data <- data.frame(Sepal.Length = c(4.7, 4.9, 6.2, 7.1),
                       Sepal.Width = c(2.7, 2.3, 3.1, 3.2),
                       Petal.Length = c(2.5, 2.5, 3.5, 4.9),
                       Petal.Width = c(0.5, 0.8, 1.8, 1.6))

predict(gbm.iris, newdata = new.data, n.trees = 2)

# Same results:

## , , 2
##          setosa  versicolor  virginica
## [1,] -0.2903287  0.13823604 -0.2523327
## [2,] -0.2903287  0.13823604 -0.2523327
## [3,] -0.2903287 -0.06420611  0.5207189
## [4,] -0.2903287  0.13823604 -0.2523327

What I can't work out yet is why the following gives such different results when, from my reading of the prediction code in gbmentry.cpp, it should be identical as well:

predict(gbm.iris, newdata = new.data, n.trees = 1) +
  predict(gbm.iris, newdata = new.data, n.trees = 2, single.tree = TRUE)

## , , 1
##           setosa versicolor   virginica
## [1,] -0.01176396 -0.1773327 -0.40307460
## [2,] -0.01176396 -0.1773327 -0.40307460
## [3,] -0.21420611  0.5957189  0.01550818
## [4,] -0.01176396 -0.1773327 -0.40307460

I'll keep looking into this.

markseeto · 2019-09-27T19:39:58Z

Thanks for your reply @cunningjames. Not sure if I've understood you correctly, but I get different results if I change the value of shr.

bgreenwell · 2021-06-02T15:54:11Z

@markseeto and @cunningjames, sorry I'm super late to the party. A couple of things to note. First, the predictions obtained from predict(..., single.tree = TRUE) are already shrunk by the factor shr. Second, boosting starts from an initial value (e.g., the mean response for LS loss in regression, and something close to the logit of the mean response for binary outcomes), and this initial value also needs to be added to get the final prediction from the ensemble. The multinomial case is more complicated, and it's possible it's bugged in gbm (hence the new warning), but it's easy to see in the binary case:

library(gbm)

set.seed(1)

shr <- 0.1  # shrinkage value
iris$Species <- ifelse(iris$Species == "setosa", 1, 0)
gbm.iris <- gbm(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                data = iris, distribution = "bernoulli",
                shrinkage = shr, bag.fraction = 1, n.trees = 10)

new.data <- data.frame(Sepal.Length = c(4.7, 4.9, 6.2, 7.1),
                       Sepal.Width = c(2.7, 2.3, 3.1, 3.2),
                       Petal.Length = c(2.5, 2.5, 3.5, 4.9),
                       Petal.Width = c(0.5, 0.8, 1.8, 1.6))

# How is predict(gbm.iris, newdata = new.data, n.trees = 2) calculated?

predict(gbm.iris, newdata = new.data, n.trees = 2)
# [1] -0.9861826 -0.9861826 -0.9861826 -0.9861826

# Should be the same
p.setosa <- mean(iris$Species)
init <- log(p.setosa / (1 - p.setosa))  # boosting starts from the logit of P(Y = 1)

predict(gbm.iris, newdata = new.data, n.trees = 1, single.tree = TRUE) +
  predict(gbm.iris, newdata = new.data, n.trees = 2, single.tree = TRUE) +
  init
# [1] -0.9861826 -0.9861826 -0.9861826 -0.9861826
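
Written out, the check above amounts to the following. This is only a sketch in my own notation (the symbols f_0, nu, T_m and s_m are not from the gbm docs), and it assumes the usual additive form of a boosted ensemble: f_0 is the initial value, nu is the shrinkage (shr), T_m is the unshrunk output of tree m, and s_m is what predict(..., n.trees = m, single.tree = TRUE) returns.

```latex
% Assumed notation (not from the gbm documentation):
%   f_0      initial value the boosting starts from
%   \nu      shrinkage (shr in the code above)
%   T_m(x)   unshrunk output of the m-th tree
%   s_m(x)   value returned with single.tree = TRUE, already shrunk
\begin{align*}
  s_m(x) &= \nu \, T_m(x), \\
  \hat{f}_M(x) &= f_0 + \sum_{m=1}^{M} s_m(x)
                = f_0 + \nu \sum_{m=1}^{M} T_m(x), \\
  f_0 &= \log\frac{\bar{y}}{1 - \bar{y}}
  \quad \text{(Bernoulli case, as computed in \texttt{init} above).}
\end{align*}
```

In the example, 50 of the 150 iris rows are setosa, so the mean response is 1/3 and f_0 = log(1/2), about -0.693. The two single-tree terms make up the rest of the n.trees = 2 prediction on the link (log-odds) scale.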

Hope this helps clear up some confusion.

markseeto mentioned this issue Jun 29, 2019

classif.gbm multiclass mlr-org/mlr#2612

Closed

gregridgeway closed this as completed Jan 11, 2024
