predict.train fails with single observation in new data for gbms #274

Closed
jknowles opened this issue Oct 12, 2015 · 2 comments

@jknowles
Contributor

I have found an odd bug in both the current CRAN and dev versions of caret, related to zachmayer/caretEnsemble#171: predict.train(model, newdata, type = 'prob') fails for gbm models when newdata contains a single row. Here's an MWE using the iris dataset:

library(caret)
gbmMod <- train(iris[, 1:2], iris[, 5],  method = "gbm", 
     trControl=trainControl(method="cv", number=2, 
                     savePredictions=TRUE, classProbs=TRUE))

head(predict(gbmMod, type = "prob"))

This nicely produces:

     setosa   versicolor    virginica
1 0.9997469 1.541028e-04 9.896410e-05
2 0.9940483 3.459335e-03 2.492320e-03
3 0.9999370 3.382596e-05 2.913693e-05
4 0.9999131 5.922540e-05 2.762628e-05
5 0.9998697 7.918245e-05 5.111574e-05
6 0.9992466 5.095148e-04 2.438444e-04

And if we pass in newdata, that works too: predict(gbmMod, type = "prob", newdata = iris[100:105, c(1:2)]):

        setosa versicolor virginica
1 3.347258e-03 0.77843367 0.2182191
2 7.632194e-04 0.24274868 0.7564881
3 3.165440e-04 0.57012163 0.4295618
4 2.054595e-05 0.06004269 0.9399368
5 8.409241e-05 0.68628701 0.3136289
6 6.959093e-05 0.14138125 0.8585492

But if newdata has only one row, predict(gbmMod, type = "prob", newdata = iris[100, c(1:2)]) fails:

Error in `[.data.frame`(out, , obsLevels, drop = FALSE) : 
  undefined columns selected
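
One plausible mechanism for this error, sketched in base R only: indexing an array with a single row drops its dimensions unless `drop = FALSE` is used, so the class names end up as row names rather than column names. The thread does not confirm that this is what happens inside caret's gbm code; the `probs` array and its rows x classes x 1 shape are assumptions made for illustration.

```r
# Hypothetical stand-in for class probabilities of ONE observation,
# shaped rows x classes x 1 (an assumed shape, not taken from the thread).
classes <- c("setosa", "versicolor", "virginica")
probs <- array(c(0.1, 0.7, 0.2), dim = c(1, 3, 1),
               dimnames = list(NULL, classes, NULL))

# Default indexing drops every length-one dimension, leaving a named vector.
dropped <- probs[, , 1]

# Converting that vector gives a 3-row, 1-column data frame, so selecting
# columns by class name fails with "undefined columns selected".
bad <- as.data.frame(dropped)
res <- try(bad[, classes, drop = FALSE], silent = TRUE)

# Rebuilding a matrix with an explicit row count keeps one row per observation.
good <- as.data.frame(matrix(probs[, , 1, drop = FALSE],
                             nrow = dim(probs)[1],
                             dimnames = list(NULL, classes)))
good[, classes, drop = FALSE]
```

With more than one row the default indexing still returns a matrix, which would be consistent with the multi-row calls above succeeding.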
@topepo
Owner

topepo commented Oct 13, 2015

It is a bug. I've fixed the model file. Until the next release, you can source this file and use the new code via:

gbmMod <- train(iris[, 1:2], iris[, 5],  method = modelInfo, 
                trControl=trainControl(method="cv", number=2, 
                                       savePredictions=TRUE, classProbs=TRUE))
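
As an alternative stopgap that does not need the patched model file, a caller could pad a one-row newdata to two rows and keep only the first row of the result. This is a minimal sketch and not something proposed in the thread: `predict_prob_padded` is a hypothetical helper name, and it assumes the predict method returns a data frame or matrix with one row per observation, as the multi-row output above does.

```r
# Hypothetical helper (not part of caret): works around single-row failures
# by duplicating the row, predicting, and keeping only the first result row.
predict_prob_padded <- function(model, newdata, ...) {
  single <- nrow(newdata) == 1
  if (single) {
    # Duplicate the only row so the predict method sees two observations.
    newdata <- rbind(newdata, newdata)
  }
  out <- predict(model, newdata = newdata, ...)
  if (single) {
    # Drop the padding row but keep the data frame / matrix shape.
    out <- out[1, , drop = FALSE]
  }
  out
}

# Intended usage with the model from the MWE (requires caret and gbm):
# predict_prob_padded(gbmMod, iris[100, 1:2], type = "prob")
```

The padding costs one extra prediction per call, so it only makes sense as a temporary measure until the fixed release is installed.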

topepo closed this as completed Oct 13, 2015
@jknowles
Contributor Author

Awesome, thanks Max!
