[QUESTION] SVC's default value for decision_function_shape #3

Closed
liganega opened this issue Apr 3, 2022 · 2 comments

Comments

@liganega
liganega commented Apr 3, 2022

According to the scikit-learn documentation, the default value of the decision_function_shape hyperparameter for the SVC model is "ovr", not "ovo". But in the book, "ovo" is mentioned and explained in the section about multiclass classification in Chapter 3.

On the other hand, it is written in the jupyter notebook as follows:

If you want decision_function() to return all 45 scores, you can set the decision_function_shape hyperparameter to "ovo". The default value is "ovr", but don't let this confuse you: SVC always uses OvO for training. This hyperparameter only affects whether or not the 45 scores get aggregated or not:

This should also be explained in the book.

@ageron
Owner

ageron commented Apr 4, 2022

Hi @liganega ,

Thanks for your feedback. Are you referring to the 2nd edition? Because there's already the following explanation in the 3rd edition:

This code actually made 45 predictions—one per pair of classes—and it selected the class that won the most duels. If you call the decision_function() method, you will see that it returns 10 scores per instance: one per class. Each class gets a score equal to the number of won duels plus or minus a small tweak (max ±0.33) to break ties, based on the classifier scores.

Indeed, under the hood, SVC always uses OvO; there's no way to change that. The decision_function_shape hyperparameter only affects the output of the decision_function() method: if it is set to "ovr" (which is the default), SVC still uses OvO under the hood, but it then aggregates the 45 scores into 10 (one per class), whereas if you set it to "ovo", it just returns the 45 scores directly.
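To make the difference concrete, here is a minimal sketch of how the two settings change only the shape of the decision_function() output. It assumes scikit-learn's small built-in digits dataset as a stand-in for MNIST (both have 10 classes); the dataset choice and the random_state value are illustrative assumptions, not what the book uses.

```python
from sklearn.datasets import load_digits
from sklearn.svm import SVC

# Assumption: the built-in digits dataset (10 classes) stands in for MNIST.
X, y = load_digits(return_X_y=True)

# decision_function_shape is left at its default, "ovr".
svm_clf = SVC(random_state=42)
svm_clf.fit(X, y)

# Default "ovr" shape: one aggregated score per class.
ovr_scores = svm_clf.decision_function(X[:5])
print(ovr_scores.shape)  # (5, 10)

# Same fitted model, no retraining: only the output format changes.
svm_clf.decision_function_shape = "ovo"
ovo_scores = svm_clf.decision_function(X[:5])
print(ovo_scores.shape)  # (5, 45), one score per pair of classes
```

The model is fitted only once here, which illustrates that the hyperparameter affects how the scores are reported rather than how the classifier is trained.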

I think this is a confusing part of Scikit-Learn's API. Many people were confused when first reading the docs (myself included): it really looks like the class uses OvR by default, but it doesn't. It still trains 45 models, and it still makes 45 predictions; only the way these results are presented changes. More details here.

Hope this helps.

@liganega
Author

liganega commented Apr 4, 2022

The confusion comes from the number "10".

The number "10" in the sentence below does NOT correspond to the 10 binary classification models that would be trained if the OvR method were really applied. It's just the number of classes. The 10 scores are aggregates of the 45 scores produced by the 45 binary classification models trained with the OvO method.

it returns 10 scores per instance: one per class
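The aggregation can be checked by hand. The sketch below recounts the duel wins from the 45 OvO scores and compares them with the 10 reported scores. It rests on two assumptions: the digits dataset again stands in for MNIST, and the OvO columns are taken to be ordered "0 vs 1", "0 vs 2", ..., "8 vs 9", with a positive score meaning the first class of the pair wins. The vote-counting loop is an illustration, not Scikit-Learn's own code.

```python
from itertools import combinations

import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import SVC

# Assumption: the built-in digits dataset (10 classes) stands in for MNIST.
X, y = load_digits(return_X_y=True)
svm_clf = SVC(decision_function_shape="ovo", random_state=42).fit(X, y)
ovo_scores = svm_clf.decision_function(X[:5])  # shape (5, 45)

# Count the duels won by each class.
# Assumption: column k holds the duel for the k-th pair (i, j) with i < j,
# and a non-negative score means class i wins.
n_classes = len(svm_clf.classes_)
votes = np.zeros((len(ovo_scores), n_classes))
for k, (i, j) in enumerate(combinations(range(n_classes), 2)):
    i_wins = ovo_scores[:, k] >= 0
    votes[i_wins, i] += 1
    votes[~i_wins, j] += 1

# Switch the same fitted model to the aggregated output.
svm_clf.decision_function_shape = "ovr"
ovr_scores = svm_clf.decision_function(X[:5])  # shape (5, 10)

# Every duel has exactly one winner, so each row of votes sums to 45.
print(votes.sum(axis=1))

# Largest gap between a reported score and its vote count; this is the
# tie-breaking tweak, which should stay below 1/3.
print(np.abs(ovr_scores - votes).max())
```

If the assumptions hold, each of the 10 scores is a vote count out of 9 possible duels for that class, shifted slightly by the tie-breaking term, which is why there are 10 scores even though no OvR models were trained.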

On the other hand, when the following hyperparameter is set, 45 scores are returned.

svm_clf.decision_function_shape = "ovo"

So IMHO, the confusion can easily arise.

Anyway, it is a subtle point, but well understood now. Thank you.

@liganega liganega closed this as completed Apr 4, 2022