Unnecessary call to predictionFunction when using predict.train with type="prob"? #105

Closed
matthuska opened this issue Jan 21, 2015 · 2 comments

Comments

@matthuska

I noticed that when predicting class probabilities, the classes themselves are predicted as well as the class probabilities (from extractProb.R, lines 98-105):

      tempUnkPred <- predictionFunction(models[[i]]$modelInfo,
                                        models[[i]]$finalModel,
                                        tempX,
                                        models[[i]]$preProcess)
      tempUnkProb <- probFunction(models[[i]]$modelInfo,
                                  models[[i]]$finalModel,
                                  tempX,
                                  models[[i]]$preProcess)

With the model I'm using, this makes prediction take about twice as long as necessary, because the model is effectively evaluated twice. Is this intentional, or is it a bug? Couldn't we get the same result by calling only probFunction and then applying a cutoff to determine class membership, saving a lot of runtime?

@topepo
Owner

topepo commented Jan 28, 2015

There is no functional reason to have the class predicted too; that's just how I wrote it so I wouldn't have to call the other function. I'll make a change for the next version that will add an option to estimate the class predictions.

Believe it or not, there are some models where the class predictions and the probability predictions disagree. At this point I've probably either error-trapped those cases or based the predictions on the probabilities, so I will just choose the class with the largest probability here instead of calling predictionFunction again (as you suggest).
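
For readers following along, here is a minimal sketch of what "choose the class with the largest probability" could look like in base R. It is not the code that was committed to caret: `classFromProb` is a hypothetical helper name, and the example assumes the probabilities arrive as a data frame or matrix with one column per class, named after the class levels.

```r
# Hypothetical helper (not part of caret): derive class labels from a table
# of class probabilities instead of making a second prediction call.
# Assumes one column per class, with column names equal to the class levels.
classFromProb <- function(prob, lev = colnames(prob)) {
  prob <- as.matrix(prob)
  # max.col() returns, for each row, the index of the largest column.
  # ties.method = "first" keeps the result deterministic when classes tie.
  idx <- max.col(prob, ties.method = "first")
  factor(lev[idx], levels = lev)
}

# Made-up probabilities for three samples and two classes
probs <- data.frame(A = c(0.7, 0.2, 0.5),
                    B = c(0.3, 0.8, 0.5))
pred <- classFromProb(probs)
print(pred)
```

For a two-class problem this is equivalent to a 0.5 cutoff on one class's probability, with ties going to the first column. A different cutoff would need an explicit comparison rather than `max.col`.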

topepo added a commit that referenced this issue Apr 2, 2015
@topepo
Owner

topepo commented Apr 2, 2015

Checked in and tested

@topepo topepo closed this as completed Apr 2, 2015