[MRG] FIX Add l1_ratio to set regularization with elasticnet penalty in Perceptron #18622
Conversation
@@ -148,13 +153,14 @@ class Perceptron(BaseSGDClassifier):
     https://en.wikipedia.org/wiki/Perceptron and references therein.
     """
     @_deprecate_positional_args
-    def __init__(self, *, penalty=None, alpha=0.0001, fit_intercept=True,
+    def __init__(self, *, penalty=None, alpha=0.0001, l1_ratio=0.15,
Unfortunately, introducing `l1_ratio=0.15` might produce changes in the results of users who had set `penalty='elasticnet'`, since it used to be hardcoded to 0 before.
We can:
- set the default to 0, but that would be slightly weird since just setting `penalty='elasticnet'` would still be using L2. It would also be inconsistent with SGDClassifier.
- set the default to 'warn', which is equivalent to 0 but indicates that the default will change from 0 to 0.15 in version 0.26.
- not care about all this and consider that this is a bugfix. In this case we can just set the default as it is: 0.15.
I have a slight preference for option 3 since it's the simplest and it's likely that this won't affect many people.
Thoughts @rth @glemaitre @ogrisel @thomasjpfan?
I'm also in favor of option 3, as I tend to perceive it as a bugfix.
And why 0.15 as default, not, for instance, 0.5?
I also slightly prefer option 3 and consider this a bug fix.
Agreed, seems like a bugfix. And the default should be consistent with SGDClassifier, which has 0.15, probably for no real reason.
Hi @rickiepark, it seems to me that there is a consensus about your implementation. Do you mind adding a "Fix" entry in the what's new? Thanks!
Hi @cmarmo Ok, I'll add a Fix entry. Thanks! :)
Thanks @rickiepark , looks good.
We will also need the following entry at the top of the whatsnew, under the "changed models" section:
- |Fix| :class:`linear_model.Perceptron` when `penalty='elasticnet'`

Also, it'd be nice to have a non-regression test, e.g. a small test in `sklearn/linear_model/tests/test_perceptron.py` that makes sure that changing `l1_ratio` will lead to different predictions / scores.
Thanks!
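A sketch of the kind of non-regression test being suggested; the dataset choice is illustrative (the real test file uses its own fixtures):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

X, y = load_iris(return_X_y=True)

# Two elastic-net models differing only in l1_ratio should not produce
# identical fits once l1_ratio is actually wired through to the solver.
clf1 = Perceptron(l1_ratio=0, penalty="elasticnet").fit(X, y)
clf2 = Perceptron(l1_ratio=0.15, penalty="elasticnet").fit(X, y)

assert not np.allclose(clf1.coef_, clf2.coef_)
assert clf1.score(X, y) != clf2.score(X, y)
```

Before this fix, both fits would have been identical because the hardcoded `l1_ratio=0` ignored the parameter entirely.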
The Elastic Net mixing parameter, with 0 <= l1_ratio <= 1.
l1_ratio=0 corresponds to L2 penalty, l1_ratio=1 to L1.
Only used if `penalty` is 'elasticnet'.
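A minimal usage example of the documented behavior (parameter values are illustrative):

```python
from sklearn.linear_model import Perceptron

# l1_ratio only takes effect with penalty='elasticnet'; with penalty=None,
# 'l1' or 'l2' it is stored on the estimator but ignored during fitting.
clf = Perceptron(penalty="elasticnet", l1_ratio=0.5, alpha=1e-4)
```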
Let's add `.. versionadded:: 0.24` here.
I just added versionadded. Thanks :)
doc/whats_new/v0.24.rst
Outdated
- |Fix| Fixes bug in :class:`linear_model.Perceptron` where `__init__` method
  doesn't have `l1_ratio` parameter. :pr:`18622` by
I would suggest replacing:

- |Fix| Fixes bug in :class:`linear_model.Perceptron` where `__init__` method
  doesn't have `l1_ratio` parameter. :pr:`18622` by

with:

- |Fix| Added the missing `l1_ratio` parameter in :class:`linear_model.Perceptron`,
  to be used when `penalty='elasticnet'`. This changes the default from 0 to 0.15.
  :pr:`18622` by

(make sure to properly break lines)
Thanks for helping me on this PR.
I updated the fix entry. :)
Thanks @NicolasHug, I'll update this PR with a test.
@@ -67,3 +67,13 @@ def test_undefined_methods():
     clf = Perceptron(max_iter=100)
     for meth in ("predict_proba", "predict_log_proba"):
         assert_raises(AttributeError, lambda x: getattr(clf, x), meth)
+
+
+def test_perceptron_l1_ratio():
can we get a better test?
Also, I feel like the perceptron documentation could probably be improved. Our perceptron class doesn't really implement only a perceptron. Has anyone else implemented something like this?
Not that I'm an example to follow on this, but let's try to be more descriptive in our comments ;) What kind of test would you like to see @amueller?
Thanks @rickiepark , LGTM
clf2 = Perceptron(l1_ratio=0.15, penalty='elasticnet')
clf2.fit(X, y)

assert clf1.score(X, y) != clf2.score(X, y)
It would be better to assert that the scores are different with a significant difference (e.g. something like 0.1 or 0.05): `assert clf1.score(X, y) != clf2.score(X, y)` would pass even if the score difference were 1e-15, and that's not strict enough.
`clf1.score(X, y)` is 0.96 and `clf2.score(X, y)` is 0.906.
How about `assert clf1.score(X, y) - clf2.score(X, y) > 0.01`?
that sounds fine. Maybe check the absolute difference?
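Putting the two suggestions together, the assertion could look like the following sketch; the 0.01 threshold follows the scores quoted above and is illustrative, as is the dataset:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

X, y = load_iris(return_X_y=True)
clf1 = Perceptron(l1_ratio=0, penalty="elasticnet").fit(X, y)
clf2 = Perceptron(l1_ratio=0.15, penalty="elasticnet").fit(X, y)

# Require a material gap in absolute value, not mere inequality, so the
# test cannot pass on a 1e-15 floating-point difference.
assert abs(clf1.score(X, y) - clf2.score(X, y)) > 0.01
```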
`clf1` is overfitted more than `clf2`. Is it normal that `clf2.score(X, y)` is greater than `clf1.score(X, y)`?
Would it make sense to have `l1_ratio=0` and check with `penalty="l2"`, and `l1_ratio=1` and check with `penalty="l1"`?
Hi @glemaitre,
Do you mean something like
`assert Perceptron(l1_ratio=0, penalty='elasticnet').fit(X, y).score(X, y) != Perceptron(l1_ratio=0, penalty='l2').fit(X, y).score(X, y)`
and
`assert Perceptron(l1_ratio=1, penalty='elasticnet').fit(X, y).score(X, y) != Perceptron(l1_ratio=1, penalty='l1').fit(X, y).score(X, y)`?
I pushed a small test to check that this is passing across all platforms.
We should also be able to check that the coefficients match: `penalty='l1'` against `l1_ratio=1`, and `penalty='l2'` against `l1_ratio=0`.
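A sketch of that coefficient check, assuming the elastic-net extremes reduce exactly to the pure penalties in the SGD solver; the dataset choice is illustrative:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

X, y = load_iris(return_X_y=True)

# l1_ratio=1 should reduce elastic net to a pure L1 penalty ...
coef_en_l1 = Perceptron(penalty="elasticnet", l1_ratio=1).fit(X, y).coef_
coef_l1 = Perceptron(penalty="l1").fit(X, y).coef_
np.testing.assert_allclose(coef_en_l1, coef_l1)

# ... and l1_ratio=0 should reduce it to a pure L2 penalty.
coef_en_l2 = Perceptron(penalty="elasticnet", l1_ratio=0).fit(X, y).coef_
coef_l2 = Perceptron(penalty="l2").fit(X, y).coef_
np.testing.assert_allclose(coef_en_l2, coef_l2)
```

Unlike the score-inequality check, this version pins down the exact semantics of `l1_ratio` at both boundaries of its range.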
So the test is passing across all platforms. We are good to merge.
Thanks @rickiepark
Thanks @glemaitre :)
Reference Issues/PRs
Fixes #18620
What does this implement/fix? Explain your changes.
Add the `l1_ratio` parameter to the Perceptron class.
The default value is 0.15, like SGDClassifier.
Any other comments?
@NicolasHug Please check it out. :)