SVM with classProbs=TRUE leads to much worse accuracy #386
Comments
A reproducible example would be a good place to start the discussion, so that we can see how different the results are.
The results are very different. But since it seems to be more a problem with kernlab, I think you can close this issue. For everyone who wants to use the e1071 SVM with caret, here is my code (following the official caret tutorial on embedding other libraries): E1071 to Caret
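Since the linked code is not reproduced in this thread, here is an illustrative sketch of that kind of custom model definition, following caret's "using your own model" mechanism. The tuning grid, the `e1071Svm` name, and the fit/predict wrappers are my own illustration under those assumptions, not the original poster's code:

```r
library(caret)
library(e1071)

# Illustrative custom model spec wrapping e1071::svm for caret::train()
e1071Svm <- list(
  library = "e1071",
  type = "Classification",
  parameters = data.frame(parameter = c("cost", "gamma"),
                          class = c("numeric", "numeric"),
                          label = c("Cost", "Gamma")),
  grid = function(x, y, len = NULL, search = "grid")
    expand.grid(cost = 2^(-2:4), gamma = 10^(-3:0)),
  fit = function(x, y, wts, param, lev, last, weights, classProbs, ...)
    svm(as.matrix(x), y, cost = param$cost, gamma = param$gamma,
        probability = classProbs, ...),
  predict = function(modelFit, newdata, submodels = NULL)
    predict(modelFit, as.matrix(newdata)),
  prob = function(modelFit, newdata, submodels = NULL) {
    # e1071 attaches class probabilities as an attribute of predict()
    res <- attr(predict(modelFit, as.matrix(newdata), probability = TRUE),
                "probabilities")
    res[, modelFit$levels, drop = FALSE]  # align column order with class levels
  },
  sort = function(x) x[order(x$cost), ]
)

# Usage sketch:
# fit <- train(Species ~ ., data = iris, method = e1071Svm,
#              trControl = trainControl(method = "cv", classProbs = TRUE))
```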
That's not really the case; SVMs in general do not intrinsically estimate class probabilities, so Platt's method (which is what I described) is used pretty much everywhere.
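The mechanism under discussion can be seen directly in kernlab: with `prob.model = TRUE`, `ksvm` fits a secondary probability model (Platt scaling) on top of the decision values, and the labels implied by that model can disagree with the raw decision-based predictions. A minimal sketch (iris is just a stand-in dataset here):

```r
library(kernlab)

set.seed(1)
# prob.model = TRUE triggers the extra Platt-scaling fit
fit  <- ksvm(Species ~ ., data = iris, kernel = "rbfdot", prob.model = TRUE)
raw  <- predict(fit, iris)                          # decision-based labels
prob <- predict(fit, iris, type = "probabilities")  # Platt-scaled probabilities

# Agreement between raw labels and the labels implied by the probabilities;
# any disagreement here comes from the probability model, not the SVM itself
mean(raw == colnames(prob)[max.col(prob)])
```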
Then it would be interesting to know why using classProbs=TRUE with your kernlab SVM cuts down the accuracy that much (I found several topics on Stack Overflow where the same problem was encountered). Using classProbs=TRUE led to an accuracy reduction from over 80% to 45%, whereas with e1071 this is not the case and the accuracy stays nearly constant.
That's a bigger drop than I've seen. Again, a data set I can use to look into the matter would be great.
I'm going to close this; we don't have any other information to help investigate the issue.
Quick note: I'm experiencing this issue in 2019. A test using the iris dataset fails to reproduce the issue. However, using my dataset (y = factor w/ 10 levels)(unfortunately, confidential data), prediction accuracy drops from ~97% to ~61% when trained in caret, when the only difference is classProbs = TRUE versus classProbs = FALSE. Training models directly in kernlab produces similar results when probabilistic predictions are made, indicating that, as others have suggested, it is an issue with how Platt's method works. Using the same dataset with other classes of learners, class probabilities seem to work as expected. Strangely, even though svmRadialSigma/svmRadial/svmCost all exhibit this weird behaviour, svmRadialWeights performs well when producing raw predictions; but using the same model to make probabilistic predictions, performance again drops.
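The comparison described above can be sketched as follows, using iris as a stand-in since the original data is confidential. The resampling setup (5-fold CV) is illustrative; the only intended difference between the two runs is `classProbs`:

```r
library(caret)

set.seed(42)
ctrl_raw  <- trainControl(method = "cv", number = 5)
ctrl_prob <- trainControl(method = "cv", number = 5, classProbs = TRUE)

# Identical model and data; only classProbs differs between the two fits
fit_raw  <- train(Species ~ ., data = iris, method = "svmRadial",
                  trControl = ctrl_raw)
fit_prob <- train(Species ~ ., data = iris, method = "svmRadial",
                  trControl = ctrl_prob)

# Compare the best resampled accuracy with and without the probability model
max(fit_raw$results$Accuracy)
max(fit_prob$results$Accuracy)
```

On iris the two accuracies come out close (which matches the failure to reproduce); the report is that on the confidential 10-class data they diverge sharply.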
Hi, I am experiencing this issue using the caret package.
Why are the classification accuracies in certain cases much worse when using "svmRadial" together with classProbs=TRUE instead of classProbs=FALSE? Is there any way to use svmRadial with class probabilities without ruining my accuracy?