
[MRG+1]: avoid FutureWarning in BaseSGD.set_params #9802


Merged
merged 4 commits into from
Sep 27, 2017

Conversation

vrishank97
Contributor

Reference Issue

Fixes #9752

What does this implement/fix? Explain your changes.

Adds a set_future_warning flag in BaseSGD to help avoid FutureWarnings
raised by __init__ or set_params.

@@ -50,7 +50,7 @@ def __init__(self, loss, penalty='l2', alpha=0.0001, C=1.0,
                  l1_ratio=0.15, fit_intercept=True, max_iter=None, tol=None,
                  shuffle=True, verbose=0, epsilon=0.1, random_state=None,
                  learning_rate="optimal", eta0=0.0, power_t=0.5,
-                 warm_start=False, average=False, n_iter=None):
+                 warm_start=False, average=False, n_iter=None, set_future_warning=True):
Member

I don't think we want this as a constructor parameter. It should be a parameter to _validate_params

Contributor Author

Thanks. I'll modify the code.

Member

@jnothman jnothman left a comment

Looks about right. Can you show the output of cross validation over an SGDRegressor() before and after this change please?

@vrishank97
Contributor Author

Before:
/usr/local/lib/python2.7/site-packages/sklearn/linear_model/stochastic_gradient.py:84: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
"and default tol will be 1e-3." % type(self), FutureWarning)

After:
/usr/local/lib/python2.7/site-packages/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
"and default tol will be 1e-3." % type(self), FutureWarning)

@vrishank97
Contributor Author

vrishank97 commented Sep 20, 2017

Script
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn import linear_model
n_samples, n_features = 10, 5
np.random.seed(0)
y = np.random.randn(n_samples)
X = np.random.randn(n_samples, n_features)
clf = linear_model.SGDRegressor()
scores = cross_val_score(clf, X, y, cv=5)
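To make a before/after comparison like this concrete, the number of FutureWarnings a call raises can be counted with the standard warnings module. This is a hedged sketch: `count_warnings` and `noisy_cv` are illustrative helpers, not sklearn code; `noisy_cv` merely stands in for a cross-validation run that warns once per clone.

```python
import warnings

def count_warnings(fn, category=FutureWarning):
    """Run fn and count how many warnings of `category` it raises."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # keep duplicates; don't let the registry swallow them
        fn()
    return sum(1 for w in caught if issubclass(w.category, category))

# Hypothetical stand-in for cross_val_score(SGDRegressor(), X, y, cv=5),
# where each of the 5 clones triggers the deprecation warning once.
def noisy_cv(n_splits=5):
    for _ in range(n_splits):
        warnings.warn("max_iter and tol defaults will change", FutureWarning)

print(count_warnings(noisy_cv))  # 5
```

Comparing this count at master against the branch gives a cleaner signal than eyeballing repeated stderr lines.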

@vrishank97
Contributor Author

@jnothman I've included the CV output. Can you have a look at it?

@jnothman
Member

At master I get:


/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)

which is what I was expecting.

In your branch, I get:


/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)

I now realise that this is identical because __init__ already does set_max_iter=False. I believe now that set_max_iter=False should also be the case for set_params, and that this PR is unnecessary if that change is made.
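The behaviour described above can be sketched with a toy class. This is a hypothetical simplification, not the real BaseSGD source: the idea is that `__init__` and `set_params` both call `_validate_params(set_max_iter=False)`, so only `fit` resolves the deprecated `max_iter` default and emits the FutureWarning.

```python
import warnings

class BaseSGDSketch:
    """Toy stand-in for BaseSGD; names are illustrative, not sklearn's exact API."""

    def __init__(self, max_iter=None, tol=None):
        self.max_iter = max_iter
        self.tol = tol
        # __init__ defers the max_iter resolution, so no warning here.
        self._validate_params(set_max_iter=False)

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        # The fix: set_params also skips the resolution (and its FutureWarning),
        # so clones configured during cross-validation stay quiet.
        self._validate_params(set_max_iter=False)
        return self

    def fit(self, X, y):
        # Only fit resolves the deprecated default, warning at most once per fit.
        self._validate_params(set_max_iter=True)
        return self

    def _validate_params(self, set_max_iter=True):
        if not set_max_iter:
            return
        if self.max_iter is None and self.tol is None:
            warnings.warn("max_iter and tol defaults will change in 0.21",
                          FutureWarning)
            self._max_iter = 5  # legacy default until the new one lands
```

With this arrangement, constructing and reconfiguring an estimator raises nothing; only fitting it does, which matches the post-merge behaviour the thread describes.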

@vrishank97
Contributor Author

vrishank97 commented Sep 25, 2017

Should I close the PR or add set_max_iter to set_params instead?

@jnothman
Member

jnothman commented Sep 25, 2017 via email

@vrishank97
Contributor Author

I've added set_max_iter as a parameter to set_params.

Member

@jnothman jnothman left a comment

LGTM. I assume this is MRG now...

@jnothman
Member

(We could require a test, but I'm not sure it's worth much...)

@jnothman jnothman changed the title [WIP]: Add set_future_warning flag in BaseSGD [MRG+1]: Add set_future_warning flag in BaseSGD Sep 25, 2017
@jnothman
Member

But just to be sure, check the output when doing GridSearchCV(SGDRegressor(), {'alpha': np.logspace(-5, 1, 10)}).fit(...)

@jnothman jnothman changed the title [MRG+1]: Add set_future_warning flag in BaseSGD [MRG+1]: avoid FutureWarning in BaseSGD.set_params Sep 26, 2017
@jnothman jnothman added this to the 0.19.1 milestone Sep 26, 2017
@vrishank97
Contributor Author

Output:
/usr/local/lib/python2.7/site-packages/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
"and default tol will be 1e-3." % type(self), FutureWarning)

Code:
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import SGDRegressor
n_samples, n_features = 10, 5
np.random.seed(0)
y = np.random.randn(n_samples)
X = np.random.randn(n_samples, n_features)
GridSearchCV(SGDRegressor(), {'alpha': np.logspace(-5, 1, 10)}).fit(X, y)

@jnothman
Member

jnothman commented Sep 26, 2017 via email

@vrishank97
Contributor Author

No, it isn't. Is it possible that something else is triggering the warnings?

@jnothman
Member

It is for me! At master I get 62 warnings; at this branch I get 31. It's definitely an improvement!

@jnothman jnothman merged commit 5f863aa into scikit-learn:master Sep 27, 2017
@jnothman
Member

Merging, thanks.

jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Oct 3, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
Successfully merging this pull request may close these issues.

BaseSGD should not warn quite so often
2 participants