DEP deprecate multi_class in LogisticRegression #28703
@scikit-learn/core-devs friendly ping for visibility.
I have not had time to look at this PR at all, but before deprecating, can we have tests to make sure the two approaches are equivalent in results and potentially in UX (e.g., would accessing fitted attributes still be possible)?
would be ok for me. I think it's ok to live with OneVsRestClassifier(LogisticRegression(..)) when using liblinear.
Just to make sure that I understand things correctly: the plan is that by default multi-class is supported via the multinomial loss, but if the solver is liblinear, multi-class raises an error and the user must use OvR? If so, that strategy is fine by me, but I think that the deprecation message should first suggest using solvers that support multinomial, or using OvR if liblinear is desired.
Yes, exactly. And yes, with an informative deprecation warning, and later an error advertising solvers that support the multinomial loss.
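For readers following the thread, here is a minimal sketch of the two multi-class strategies being discussed. It assumes a scikit-learn installation and uses the bundled iris dataset purely for illustration; the variable names are not from the PR.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Illustrative data only: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Multinomial loss: a single model fitted jointly over all classes.
# lbfgs is one of the solvers that supports the multinomial loss.
multinomial = LogisticRegression(solver="lbfgs", max_iter=1000).fit(X, y)

# One-vs-rest with liblinear: the meta-estimator fits one binary
# LogisticRegression per class and combines their decisions.
ovr = OneVsRestClassifier(LogisticRegression(solver="liblinear")).fit(X, y)

# Fitted attributes stay reachable, but they live in different places.
coef_shape = multinomial.coef_.shape        # (n_classes, n_features)
n_binary_models = len(ovr.estimators_)      # one fitted model per class
first_binary_coef_shape = ovr.estimators_[0].coef_.shape  # (1, n_features)
```

With the meta-estimator, per-class coefficients are read from ovr.estimators_[i].coef_ rather than from a single coef_ array, which is the kind of UX difference raised earlier in the thread.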
It also looks like solver="newton-cholesky" does not support ovr.
From a high-level API point of view, LinearSVC only does OVR with liblinear and LogisticRegression will no longer support it. In the future, should LinearSVC also force users to use OneVsRestClassifier for multi-class problems?
Over the last few years, we have slowly been making meta-estimators more necessary for certain tasks (e.g., the removal of normalize or this PR). It kind of goes against the history of "let's make estimators easy to use". For example, the classifiers encode string labels to "make these easy". This is my observation and I am undecided on the current path.
"'multi_class' was deprecated in version 1.5 and will be removed in"
" 1.7. Use OneVsRestClassifier(LogisticRegression(..)) instead."
If one specifically sets multi_class="multinomial", then this warning seems out of place.
Specifically, the problem is now that this warning message would be off when calling LogisticRegression(multi_class="auto").fit(X, y) on multiclass data.
Maybe we can just use a generic, all-encompassing message such as:
if self.multi_class != "deprecated":
    warnings.warn(
        (
            "'multi_class' was deprecated in version 1.5 and will be removed in"
            " 1.7. For solvers and penalties that support it, the multinomial"
            " scheme is used automatically when the data has more than two"
            " classes. The one-vs-rest scheme can be implemented with"
            " OneVsRestClassifier(LogisticRegression(...)) instead."
            " See the docstring for more details."
        ),
        FutureWarning,
    )
else:
    # Set to old default value.
    multi_class = "auto"
Why is it off? The multi_class parameter should no longer be used at all, not even with the value "auto".
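The behavior debated here, where any explicitly passed value (including "auto") triggers the warning, follows from using "deprecated" as a sentinel default. The sketch below illustrates that pattern with the standard library only. ToyEstimator, multi_class_ and count_future_warnings are hypothetical names for illustration, not scikit-learn code, and the message is shortened.

```python
import warnings


class ToyEstimator:
    """Hypothetical stand-in, not scikit-learn's LogisticRegression."""

    def __init__(self, multi_class="deprecated"):
        # "deprecated" is a sentinel meaning "the caller did not pass it".
        self.multi_class = multi_class

    def fit(self):
        if self.multi_class != "deprecated":
            # Any explicit value, including "auto", triggers the warning.
            warnings.warn(
                "'multi_class' was deprecated in version 1.5 and will be"
                " removed in 1.7.",
                FutureWarning,
            )
            multi_class = self.multi_class
        else:
            # Fall back to the old default value.
            multi_class = "auto"
        # Hypothetical attribute, stored only so the result can be inspected.
        self.multi_class_ = multi_class
        return self


def count_future_warnings(estimator):
    """Fit the estimator and count the FutureWarnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        estimator.fit()
    return sum(issubclass(w.category, FutureWarning) for w in caught)
```

Under this pattern, ToyEstimator() stays silent while ToyEstimator(multi_class="auto") warns, even though both end up with the same effective setting. That is why the wording of the message needs to make sense for every explicit value.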
> Over the last few years, we are slowly making meta-estimators more necessary for certain task. (i.e., the removal of normalize or this PR). It kind of goes against the history of "lets make estimators easy to use". For example, the classifiers encodes string labels to "make these easy". This is my observation and I am undecided on the current path.
Yes, I worry a lot about this trend.
Everybody I talk to greatly values the fact that it's easy to get things going with scikit-learn. It's the number one benefit that people mention. If we lose this, we lose what made scikit-learn.
You raise good points. For this particular case, I have 3 answers:
I interpret the conversation so far as a decision to deprecate. I have an open question: shall we raise an error for multiclass liblinear, or internally switch to OvR (and state this in the docs)?
I am fine for deprecating the
Ready for review.
Some feedback. The most important comments are related to the contents of the docstrings and the warning message to address Thomas' remark.
The other things are more minor and overall LGTM.
Thanks for the PR.
sklearn/linear_model/_logistic.py
),
FutureWarning,
)
elif self.multi_class != "deprecated":
elif self.multi_class != "deprecated":
elif self.multi_class != "ovr":
You mean == "ovr", right?
)
else:
# Set to old default value.
multi_class = "auto"
Similar comment about the warning messages, especially when multi_class="auto" is passed explicitly.
@@ -1598,6 +1564,9 @@ def test_LogisticRegressionCV_GridSearchCV_elastic_net(multi_class):
    assert gs.best_params_["C"] == lrcv.C_[0]


# TODO(1.7): remove filterwarnings after the deprecation of multi_class
# Maybe remove whole test after removal of the deprecated multi_class.
For the record, I am +1 for this suggestion.
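The TODO in the hunk above presumably refers to a pytest filterwarnings mark on the test. As a rough illustration of the same idea using only the standard library, the sketch below silences just the expected deprecation message while a test exercises the legacy argument. legacy_fit and the test name are hypothetical, not taken from the PR.

```python
import warnings


def legacy_fit(multi_class="deprecated"):
    """Hypothetical function that warns when the deprecated argument is used."""
    if multi_class != "deprecated":
        warnings.warn(
            "'multi_class' was deprecated in version 1.5", FutureWarning
        )
    return "fitted"


def test_legacy_multi_class_still_works():
    # Silence only the expected deprecation message; any other warning
    # raised inside the block is still subject to the outer filters.
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore",
            message=".*'multi_class' was deprecated.*",
            category=FutureWarning,
        )
        assert legacy_fit(multi_class="ovr") == "fitted"
```

Once the parameter is removed, the filter and possibly the whole test can go, which is what the TODO records.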
tol = 0.00001
max_iter = 40
tol = 1e-5
max_iter = 70
I assume this was needed because the previous max_iter value was too RNG sensitive?
Yes. I checked the new (and old) value with all global random seeds. SAG and SAGA work better for larger datasets,
Reference Issues/PRs
Towards #11865.
What does this implement/fix? Explain your changes.
This PR deprecates the multi_class parameter in LogisticRegression. Using that option is equivalent to OneVsRestClassifier(LogisticRegression()), so no functionality is lost and, once gone, it would simplify the code of logreg quite a bit and make it more maintainable.
Any other comments?
This PR starts very simple with only LogisticRegression. In case of positive feedback, I'll extend it to LogisticRegressionCV and adapt all the docstrings and so on.