
[MRG] FIX Add l1_ratio to set regularization with elasticnet penalty in Perceptron #18622

Merged
merged 11 commits into scikit-learn:master on Nov 2, 2020

Conversation

rickiepark
Contributor

Reference Issues/PRs

Fixes #18620

What does this implement/fix? Explain your changes.

Adds the l1_ratio parameter to the Perceptron class.
The default value is 0.15, matching SGDClassifier.
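
For illustration, a minimal usage sketch (hypothetical data and alpha value; the parameter only has an effect with penalty='elasticnet'):

from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

X, y = load_iris(return_X_y=True)
# l1_ratio is only used when penalty='elasticnet'
clf = Perceptron(penalty='elasticnet', alpha=0.001, l1_ratio=0.15, random_state=0)
clf.fit(X, y)
print(clf.score(X, y))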

Any other comments?

@NicolasHug Please check it out. :)

@@ -148,13 +153,14 @@ class Perceptron(BaseSGDClassifier):
     https://en.wikipedia.org/wiki/Perceptron and references therein.
     """
     @_deprecate_positional_args
-    def __init__(self, *, penalty=None, alpha=0.0001, fit_intercept=True,
+    def __init__(self, *, penalty=None, alpha=0.0001, l1_ratio=0.15,
Member

Unfortunately introducing l1_ratio=.15 might produce changes in the results of users who had set penalty='elasticnet', since it used to be hardcoded to 0 before.

We can:

  • set the default to 0, but that would be slightly weird since just setting penalty='elasticnet' would still be using L2. It would also be inconsistent with SGDClassifier.
  • set the default to 'warn' which is equivalent to 0, but indicates that the default will change from 0 to .15 in version 0.26
  • not care about all this and consider that this is a bugfix. In this case we can just set the default as it is: 0.15

I have a slight preference for option 3 since it's the simplest and it's likely that this won't affect many people.

Thoughts @rth @glemaitre @ogrisel @thomasjpfan?
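
For concreteness, a minimal sketch of what option 2 could look like (a hypothetical helper, not what was merged, since option 3 was ultimately chosen):

import warnings

def _resolve_l1_ratio(l1_ratio):
    # Hypothetical handling of a 'warn' sentinel default: keep the old behaviour (0)
    # while announcing that the default will change to 0.15 in version 0.26.
    if l1_ratio == 'warn':
        warnings.warn(
            "The default value of l1_ratio will change from 0 to 0.15 in 0.26. "
            "Pass l1_ratio explicitly to silence this warning.",
            FutureWarning,
        )
        return 0.0
    return l1_ratio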

Member

I'm also in favor of option 3 as I tend to perceive it as a bugfix.

Member

And why 0.15 as default, not, for instance, 0.5?

Member

I also slightly prefer option 3 and consider this a bug fix.

Member

Agreed, seems like a bugfix. And the default should be consistent with SGDClassifier, which has 0.15, probably for no real reason.

@NicolasHug NicolasHug added this to the 0.24 milestone Oct 15, 2020
@cmarmo
Member

cmarmo commented Oct 27, 2020

Hi @rickiepark, it seems to me that there is a consensus about your implementation. Do you mind adding a "Fix" entry in the what's new? Thanks!

@rickiepark
Contributor Author

Hi @cmarmo

Ok, I'll add a Fix entry. Thanks! :)

Member

@NicolasHug NicolasHug left a comment

Thanks @rickiepark , looks good.

We will also need the following entry at the top of the whatsnew, under the "changed models" section:

  • |Fix| :class:`linear_model.Perceptron` when `penalty='elasticnet'`

Also, it'd be nice to have a non-regression test, e.g. a small test in sklearn/linear_model/tests/test_perceptron.py that makes sure that changing l1_ratio will lead to different predictions / scores

Thanks!
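
A sketch of the kind of non-regression test being requested (hypothetical data and l1_ratio values; the actual test added by this PR may differ):

from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

def test_perceptron_l1_ratio():
    # Changing l1_ratio with penalty='elasticnet' should change the fitted model,
    # and hence its predictions / score.
    X, y = load_iris(return_X_y=True)
    clf1 = Perceptron(l1_ratio=0.0, penalty='elasticnet', random_state=0).fit(X, y)
    clf2 = Perceptron(l1_ratio=0.15, penalty='elasticnet', random_state=0).fit(X, y)
    assert clf1.score(X, y) != clf2.score(X, y)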

The Elastic Net mixing parameter, with 0 <= l1_ratio <= 1.
l1_ratio=0 corresponds to L2 penalty, l1_ratio=1 to L1.
Only used if `penalty` is 'elasticnet'.

Member

Let's add .. versionadded:: 0.24 here

Contributor Author

I just added versionadded. Thanks :)
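
As background for the l1_ratio docstring above: with the scaling conventions described in the SGD user guide, the mixed regularization term is roughly

R(w) = \mathrm{l1\_ratio}\, \lVert w \rVert_1 + \frac{1 - \mathrm{l1\_ratio}}{2}\, \lVert w \rVert_2^2

(multiplied by alpha in the objective), so l1_ratio=0 recovers the pure L2 term and l1_ratio=1 the pure L1 term.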

- |Fix| Fixes bug in :class:`linear_model.Perceptron` where `__init__` method
doesn't have `l1_ratio` parameter. :pr:`18622` by
Member

I would suggest:

Suggested change

Old:
- |Fix| Fixes bug in :class:`linear_model.Perceptron` where `__init__` method
  doesn't have `l1_ratio` parameter. :pr:`18622` by

New:
- |Fix| Added the missing `l1_ratio` parameter in
  :class:`linear_model.Perceptron`, to be used when `penalty='elasticnet'`.
  This changes the default from 0 to 0.15. :pr:`18622` by

(make sure to properly break lines)

Contributor Author

Thanks for helping me on this PR.
I updated the fix entry. :)

@rickiepark
Contributor Author

Thanks @NicolasHug

I'll update this PR with a test.

@@ -67,3 +67,13 @@ def test_undefined_methods():
     clf = Perceptron(max_iter=100)
     for meth in ("predict_proba", "predict_log_proba"):
         assert_raises(AttributeError, lambda x: getattr(clf, x), meth)
+
+
+def test_perceptron_l1_ratio():
Member

can we get a better test?

Also, I feel like the perceptron documentation could probably be improved. Our perceptron class doesn't really implement only a perceptron. Has anyone else implemented something like this?

Member

Not that I'm an example to follow on this, but let's try to be more descriptive in our comments ;) What kind of test would you like to see @amueller?

Member

@NicolasHug NicolasHug left a comment

Thanks @rickiepark , LGTM

clf2 = Perceptron(l1_ratio=0.15, penalty='elasticnet')
clf2.fit(X, y)

assert clf1.score(X, y) != clf2.score(X, y)
Member

It would be better to assert that the scores are different with a significant difference (e.g. something like .1 or .05).

assert clf1.score(X, y) != clf2.score(X, y) would pass even if the score difference was 1e-15 and that's not strict enough

Contributor Author

clf1.score(X, y) is 0.96 and clf2.score(X, y) is 0.906.
How about assert clf1.score(X, y) - clf2.score(X, y) > 0.01 ?

Member

that sounds fine. Maybe check the absolute difference?
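
Combining the two suggestions, the assertion would look something like:

assert abs(clf1.score(X, y) - clf2.score(X, y)) > 0.01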

Contributor Author

clf1 is overfitted more than clf2. Is it normal that clf2.score(X, y) is greater than clf1.score(X, y)?

Contributor

Would it make sense to have l1_ratio=0 and check with penalty="l2" and l1_ratio=1 and check with penalty="l1"?

Contributor Author

Hi @glemaitre ,
Do you mean something like
assert Perceptron(l1_ratio=0, penalty='elasticnet').fit(X, y).score(X, y) != Perceptron(l1_ratio=0, penalty='l2').fit(X, y).score(X, y)
and
assert Perceptron(l1_ratio=1, penalty='elasticnet').fit(X, y).score(X, y) != Perceptron(l1_ratio=1, penalty='l1').fit(X, y).score(X, y)?

Contributor

@glemaitre glemaitre Nov 2, 2020

I pushed a small test to check that this is passing across all platforms.
We should be able to check that the coefficients obtained with l1_ratio=1 match the l1 penalty, and that those obtained with l1_ratio=0 match the l2 penalty.
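
For reference, a sketch of that coefficient check (hypothetical data; not necessarily the exact test that was pushed):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

X, y = load_iris(return_X_y=True)

# With l1_ratio=1 the elasticnet penalty should reduce to a pure L1 penalty...
en_l1 = Perceptron(penalty='elasticnet', l1_ratio=1.0, random_state=0).fit(X, y)
pure_l1 = Perceptron(penalty='l1', random_state=0).fit(X, y)
np.testing.assert_allclose(en_l1.coef_, pure_l1.coef_)

# ...and with l1_ratio=0 to a pure L2 penalty.
en_l2 = Perceptron(penalty='elasticnet', l1_ratio=0.0, random_state=0).fit(X, y)
pure_l2 = Perceptron(penalty='l2', random_state=0).fit(X, y)
np.testing.assert_allclose(en_l2.coef_, pure_l2.coef_)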

@glemaitre glemaitre changed the title from "[MRG] Add l1_ratio parameter in Perceptron" to "[MRG] FIX Add l1_ratio to set regularization with elasticnet penalty in Perceptron" on Oct 30, 2020
@glemaitre glemaitre self-requested a review November 2, 2020 09:33
Contributor

@glemaitre glemaitre left a comment

So the test is passing across all platforms. We are good to merge.

@glemaitre glemaitre merged commit 4880f1c into scikit-learn:master Nov 2, 2020
2 checks passed
@glemaitre
Contributor

Thanks @rickiepark

@rickiepark
Contributor Author

Thanks @glemaitre :)

Successfully merging this pull request may close these issues.

Mismatch between docs and l1_ratio parameter in Perceptron