
[MRG+1]: avoid FutureWarning in BaseSGD.set_params #9802


Merged
merged 4 commits into from
Sep 27, 2017

Conversation

vrishank97
Contributor

Reference Issue

Fixes #9752

What does this implement/fix? Explain your changes.

Adds a set_future_warning flag in BaseSGD to help avoid FutureWarnings
raised by __init__ or set_params.

@@ -50,7 +50,7 @@ def __init__(self, loss, penalty='l2', alpha=0.0001, C=1.0,
                  l1_ratio=0.15, fit_intercept=True, max_iter=None, tol=None,
                  shuffle=True, verbose=0, epsilon=0.1, random_state=None,
                  learning_rate="optimal", eta0=0.0, power_t=0.5,
-                 warm_start=False, average=False, n_iter=None):
+                 warm_start=False, average=False, n_iter=None, set_future_warning=True):
Member

I don't think we want this as a constructor parameter. It should be a parameter to _validate_params

Contributor Author

Thanks. I'll modify the code.

Member

@jnothman jnothman left a comment

Looks about right. Can you show the output of cross validation over an SGDRegressor() before and after this change please?

@vrishank97
Contributor Author

Before:
/usr/local/lib/python2.7/site-packages/sklearn/linear_model/stochastic_gradient.py:84: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
"and default tol will be 1e-3." % type(self), FutureWarning)

After:
/usr/local/lib/python2.7/site-packages/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
"and default tol will be 1e-3." % type(self), FutureWarning)

@vrishank97
Contributor Author

vrishank97 commented Sep 20, 2017

Script
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn import linear_model
n_samples, n_features = 10, 5
np.random.seed(0)
y = np.random.randn(n_samples)
X = np.random.randn(n_samples, n_features)
clf = linear_model.SGDRegressor()
scores = cross_val_score(clf, X, y, cv=5)
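To make a before/after comparison like this concrete, the number of FutureWarnings a call raises can be counted with the standard warnings module. This is a hedged sketch: `count_warnings` and `noisy_cv` are illustrative helpers, not sklearn code; `noisy_cv` merely stands in for a cross-validation run that warns once per clone.

```python
import warnings

def count_warnings(fn, category=FutureWarning):
    """Run fn and count how many warnings of `category` it raises."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # keep duplicates; don't let the registry swallow them
        fn()
    return sum(1 for w in caught if issubclass(w.category, category))

# Hypothetical stand-in for cross_val_score(SGDRegressor(), X, y, cv=5),
# where each of the 5 clones triggers the deprecation warning once.
def noisy_cv(n_splits=5):
    for _ in range(n_splits):
        warnings.warn("max_iter and tol defaults will change", FutureWarning)

print(count_warnings(noisy_cv))  # 5
```

Comparing this count at master against the branch gives a cleaner signal than eyeballing repeated stderr lines.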

@vrishank97
Contributor Author

@jnothman I've included the CV output. Can you have a look at it?

@jnothman
Member

At master I get:


/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)

which is what I was expecting.

In your branch, I get:


/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)

I now realise that this is identical because __init__ already does set_max_iter=False. I believe now that set_max_iter=False should also be the case for set_params, and that this PR is unnecessary if that change is made.
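The behaviour described above can be sketched with a toy class. This is a hypothetical simplification, not the real BaseSGD source: the idea is that `__init__` and `set_params` both call `_validate_params(set_max_iter=False)`, so only `fit` resolves the deprecated `max_iter` default and emits the FutureWarning.

```python
import warnings

class BaseSGDSketch:
    """Toy stand-in for BaseSGD; names are illustrative, not sklearn's exact API."""

    def __init__(self, max_iter=None, tol=None):
        self.max_iter = max_iter
        self.tol = tol
        # __init__ defers the max_iter resolution, so no warning here.
        self._validate_params(set_max_iter=False)

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        # The fix: set_params also skips the resolution (and its FutureWarning),
        # so clones configured during cross-validation stay quiet.
        self._validate_params(set_max_iter=False)
        return self

    def fit(self, X, y):
        # Only fit resolves the deprecated default, warning at most once per fit.
        self._validate_params(set_max_iter=True)
        return self

    def _validate_params(self, set_max_iter=True):
        if not set_max_iter:
            return
        if self.max_iter is None and self.tol is None:
            warnings.warn("max_iter and tol defaults will change in 0.21",
                          FutureWarning)
            self._max_iter = 5  # legacy default until the new one lands
```

With this arrangement, constructing and reconfiguring an estimator raises nothing; only fitting it does, which matches the post-merge behaviour the thread describes.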

@vrishank97
Contributor Author

vrishank97 commented Sep 25, 2017

Should I close the PR or add set_max_iter to set_params instead?

@jnothman
Member

jnothman commented Sep 25, 2017 via email

@vrishank97
Contributor Author

I've added set_max_iter as a parameter to set_params.

Member

@jnothman jnothman left a comment

LGTM. I assume this is MRG now...

@jnothman
Member

(We could require a test, but I'm not sure it's worth much...)

@jnothman jnothman changed the title [WIP]: Add set_future_warning flag in BaseSGD [MRG+1]: Add set_future_warning flag in BaseSGD Sep 25, 2017
@jnothman
Member

But just to be sure, check the output when doing GridSearchCV(SGDRegressor(), {'alpha': np.logspace(-5, 1, 10)}).fit(...)

@jnothman jnothman changed the title [MRG+1]: Add set_future_warning flag in BaseSGD [MRG+1]: avoid FutureWarning in BaseSGD.set_params Sep 26, 2017
@jnothman jnothman added this to the 0.19.1 milestone Sep 26, 2017
@vrishank97
Contributor Author

Output:
/usr/local/lib/python2.7/site-packages/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
"and default tol will be 1e-3." % type(self), FutureWarning)

Code:
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import SGDRegressor
n_samples, n_features = 10, 5
np.random.seed(0)
y = np.random.randn(n_samples)
X = np.random.randn(n_samples, n_features)
GridSearchCV(SGDRegressor(), {'alpha': np.logspace(-5, 1, 10)}).fit(X, y)

@jnothman
Member

jnothman commented Sep 26, 2017 via email

@vrishank97
Contributor Author

No, it isn't. Is it possible that something else is triggering the warnings?

@jnothman
Member

It is for me! At master I get 62 warnings; at this branch I get 31. It's definitely an improvement!

@jnothman jnothman merged commit 5f863aa into scikit-learn:master Sep 27, 2017
@jnothman
Member

Merging, thanks.

jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Oct 3, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
Successfully merging this pull request may close these issues.

BaseSGD should not warn quite so often
2 participants