ENH Allows plotting max class for multiclass in DecisionBoundaryDisplay #29797
ogrisel merged 62 commits into scikit-learn:main
|
I'm also going to try to amend the example "Logistic Regression 3-class Classifier" to show a plot using this enhancement. Though as we use the same |
|
I would almost argue that we should be removing the
I'm thinking that we could instead modify this one: https://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html#sphx-glr-auto-examples-classification-plot-classification-probability-py and add an additional column to the plot with the max. This way we could remove the OvR LR (it is deprecated) and instead use a Nystroem with LR. |
glemaitre left a comment:
Oops, I see I didn't post some review comments that I had. These should be partial comments.
Do you want me to do this in this PR or a separate PR (since removing OvR LR may technically not be considered part of this work)? |
We can do it in this PR and limit the change to this single example. |
|
Maybe @ogrisel could have a look since we discussed that feature some time ago. |
|
@ogrisel gentle ping about this 😬 Thank you! |
|
@lucyleeow I pushed two commits of changes I wanted to do while reviewing the PR:
I will do a full review tomorrow but this LGTM. |
    classifier,
    X,
    X_train,
    response_method="predict_proba",
This is a pitfall of this method. By default, it would use response_method="decision_function" which is much harder to interpret in my opinion (especially when comparing different model classes).
I think we should change the "auto" policy to favor predict_proba when available, but this should rather be done in a dedicated follow-up PR.
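To make the interpretability point concrete, here is a minimal sketch comparing the two response methods. The choice of the iris dataset, the two features, and the two estimators is purely illustrative and is not taken from the PR.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Illustrative setup (an assumption, not the PR's example): iris, first two features.
X, y = load_iris(return_X_y=True)
X = X[:, :2]

lr = LogisticRegression(max_iter=1000).fit(X, y)
svc = SVC().fit(X, y)

# decision_function returns unbounded, model-specific scores, one column per class.
lr_scores = lr.decision_function(X)
svc_scores = svc.decision_function(X)

# predict_proba returns values in [0, 1] that sum to 1 across classes.
lr_proba = lr.predict_proba(X)

print("LR decision_function range: ", lr_scores.min(), lr_scores.max())
print("SVC decision_function range:", svc_scores.min(), svc_scores.max())
print("LR predict_proba range:     ", lr_proba.min(), lr_proba.max())
```

The two decision_function ranges live on different, model-dependent scales, whereas probabilities share a common scale, which is why passing response_method="predict_proba" explicitly tends to give plots that are easier to compare.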
BTW, if we use response_method="predict_proba", maybe the vmin=0 and vmax=1 parameters could be set automatically to make the use of DecisionBoundaryDisplay terser.
It does indeed look like a good suggestion.
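A small sketch of why fixed colour limits matter for probabilities. The helper `to_unit_interval` is hypothetical; it only mimics the usual behaviour of colour normalisation, where missing limits default to the data range. It is not part of scikit-learn or matplotlib.

```python
import numpy as np

def to_unit_interval(values, vmin=None, vmax=None):
    """Map values to [0, 1] for a colormap lookup (hypothetical helper).

    When vmin or vmax is None, the limit falls back to the data range.
    """
    values = np.asarray(values, dtype=float)
    vmin = values.min() if vmin is None else vmin
    vmax = values.max() if vmax is None else vmax
    return (values - vmin) / (vmax - vmin)

# Made-up probabilities from a confident and a hesitant model.
confident = np.array([0.05, 0.50, 0.95])
hesitant = np.array([0.40, 0.50, 0.60])

# With data-driven limits, both models end up with identical colours.
auto_confident = to_unit_interval(confident)
auto_hesitant = to_unit_interval(hesitant)

# With vmin=0 and vmax=1, a colour always means the same probability.
fixed_hesitant = to_unit_interval(hesitant, vmin=0.0, vmax=1.0)
```

With automatic limits the hesitant model looks just as saturated as the confident one; fixing the limits to 0 and 1 keeps the colour scale comparable across plots.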
|
FYI @ogrisel I'm just going to fix the merge conflicts |
…oid raising warning
|
Thanks @lucyleeow! |
|
Thanks for finishing this one off @ogrisel ! |
|
Thanks @ogrisel |
|
@lucyleeow If you are looking for follow-up improvements, please consider those two comments: |
|
FYI this broke the CI when matplotlib is not installed, fix in #30971 |
Reference Issues/PRs
Towards #27462
What does this implement/fix? Explain your changes.
Allows DecisionBoundaryDisplay to represent all classes for multiclass decision_function and predict_proba by plotting the class with the max response at each point.
Have closely followed the code here: #27291 (comment)
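The idea of plotting the class with the max response at each point can be sketched with NumPy alone. This is not the PR's implementation; the 3-class linear model below (weights `W`, bias `b`) is made up for illustration.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax, shifted for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical 3-class linear model on 2 features (illustrative values).
W = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])
b = np.zeros(3)

def predict_proba(X):
    return softmax(X @ W.T + b)

# Evaluate the response on a regular grid covering the feature space.
xx, yy = np.meshgrid(np.linspace(-2, 2, 50), np.linspace(-2, 2, 50))
grid = np.c_[xx.ravel(), yy.ravel()]
proba = predict_proba(grid)  # shape (2500, 3): one column per class

# At each grid point keep the winning class and the strength of its response.
max_class = proba.argmax(axis=1).reshape(xx.shape)
max_response = proba.max(axis=1).reshape(xx.shape)
```

Here `max_class` would select which colour each region gets, and `max_response` could drive how intense that colour is.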
Not 100% sure what the colour API should be; open to change.
contour and contourf have a colors parameter, which can be used instead of cmap, BUT pcolormesh only has cmap. For simplicity, I've decided to only allow users to pass cmap. This probably makes more sense in this context than a list of colors to cycle through.
Any other comments?
WIP - need to add tests once we're happy with plot appearance and API