FIX Avoid setting legend when labels are None in RocCurveDisplay kwargs#29727
FIX Avoid setting legend when labels are None in RocCurveDisplay kwargs#29727OmarManzoor merged 7 commits intoscikit-learn:mainfrom
Conversation
sklearn/metrics/_plot/roc_curve.py
Outdated
| if "label" in line_kwargs or "label" in chance_level_line_kw: | ||
| if ( | ||
| line_kwargs.get("label") is not None | ||
| and chance_level_line_kw.get("label") is not None |
There was a problem hiding this comment.
Isn't it an or and instead of an and? We need one of the two not being None to show the legend?
It also means that we don't have a test right now.
There was a problem hiding this comment.
Yes, I did change the logic so that the legend is not set as soon as either of the labels is None.
The reasoning is that I don't think anyone would set plot_chance_lvl=True and override chance_level_kw with {"label": None}, so that part could be a bit more flexible. WDYT?
There was a problem hiding this comment.
For me this is counter-intuitive to have no legend for the chance level curve with the following code:
In [3]: import matplotlib.pyplot as plt
...: from sklearn.datasets import make_classification
...: from sklearn.metrics import RocCurveDisplay
...: from sklearn.model_selection import train_test_split
...: from sklearn.svm import SVC
...: X, y = make_classification(random_state=0)
...: X_train, X_test, y_train, y_test = train_test_split(
...: X, y, random_state=0)
...: clf = SVC(random_state=0).fit(X_train, y_train)
...: RocCurveDisplay.from_estimator(
...: clf, X_test, y_test, label=None, plot_chance_level=True
...: )
...: plt.show()If I really want to turn off the legend for the chance level then I can do it with the right params:
In [2]: import matplotlib.pyplot as plt
...: from sklearn.datasets import make_classification
...: from sklearn.metrics import RocCurveDisplay
...: from sklearn.model_selection import train_test_split
...: from sklearn.svm import SVC
...: X, y = make_classification(random_state=0)
...: X_train, X_test, y_train, y_test = train_test_split(
...: X, y, random_state=0)
...: clf = SVC(random_state=0).fit(X_train, y_train)
...: RocCurveDisplay.from_estimator(
...: clf, X_test, y_test, label=None, plot_chance_level=True, chance_level_kw={"label": None},
...: )
...: plt.show()So to have this last behaviour you need to have a or operator and not a and operator.
|
We will need an additional test to check that the behaviour is the one that we want. |
| assert display.chance_level_.get_linestyle() == "--" | ||
| assert display.chance_level_.get_label() == "Chance level (AUC = 0.5)" | ||
| elif plot_chance_level: | ||
| assert display.chance_level_.get_label() == chance_level_kw["label"] |
There was a problem hiding this comment.
I removed this test as it is redundant with the assert chance_level_kw["label"] in legend_labels added further below.
|
I addressed all your comments @glemaitre. Also I don't really think we need a changelog entry here, unless you think otherwise. |
glemaitre
left a comment
There was a problem hiding this comment.
I think that we need to have an entry in a changelog because we change the behaviour and this is actually a fix.
Apart from that. LGTM.
OmarManzoor
left a comment
There was a problem hiding this comment.
LGTM. Thanks @ArturoAmorQ
Reference Issues/PRs
What does this implement/fix? Explain your changes.
In #29617 and #29611 the RocCurveDisplay is used to visualize the roc curve across several cross-validation splits. As having the labels for each split is unnecessary, we rather set
label=None, but that rises aUserWarning: No artists with labels found to put in legenddue to the legend being set at a lower level inside the display.This PR fixes the issue.
Any other comments?
Notice that
"label" in chance_level_line_kwand"label" in line_kwargsare always True, as those keys are assigned earlier in the code. As "label" cannot be removed, only updated toNone, we rather query for it's value using thegetmethod.