
[MRG+1]: avoid FutureWarning in BaseSGD.set_params #9802

Merged: 4 commits into scikit-learn:master on Sep 27, 2017

Conversation

@vrishank97 (Contributor) commented Sep 19, 2017

Reference Issue

Fixes #9752

What does this implement/fix? Explain your changes.

Adds a set_future_warning flag in BaseSGD to help avoid FutureWarnings when __init__ or set_params is used.

@@ -50,7 +50,7 @@ def __init__(self, loss, penalty='l2', alpha=0.0001, C=1.0,
                  l1_ratio=0.15, fit_intercept=True, max_iter=None, tol=None,
                  shuffle=True, verbose=0, epsilon=0.1, random_state=None,
                  learning_rate="optimal", eta0=0.0, power_t=0.5,
-                 warm_start=False, average=False, n_iter=None):
+                 warm_start=False, average=False, n_iter=None, set_future_warning=True):
@jnothman (Member) Sep 19, 2017

I don't think we want this as a constructor parameter. It should be a parameter to _validate_params
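
For illustration only, a sketch of the suggested shape (not the merged code; the guard condition and the warning message are abbreviated from the output shown later in this thread):

def _validate_params(self, set_future_warning=True):
    # ... existing validation unchanged ...
    if set_future_warning and self.max_iter is None and self.tol is None:
        warnings.warn("max_iter and tol parameters have been added in %s in "
                      "0.19 ..." % type(self), FutureWarning)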

@vrishank97 (Contributor, Author) Sep 20, 2017

Thanks. I'll modify the code.

@jnothman (Member) left a comment

Looks about right. Can you show the output of cross-validation over an SGDRegressor() before and after this change, please?

@vrishank97 (Contributor, Author) commented Sep 20, 2017

Before:
/usr/local/lib/python2.7/site-packages/sklearn/linear_model/stochastic_gradient.py:84: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
"and default tol will be 1e-3." % type(self), FutureWarning)

After:
/usr/local/lib/python2.7/site-packages/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
"and default tol will be 1e-3." % type(self), FutureWarning)

@vrishank97 (Contributor, Author) commented Sep 20, 2017

Script:
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn import linear_model
n_samples, n_features = 10, 5
np.random.seed(0)
y = np.random.randn(n_samples)
X = np.random.randn(n_samples, n_features)
clf = linear_model.SGDRegressor()
scores = cross_val_score(clf, X, y, cv=5)

@vrishank97 (Contributor, Author) commented Sep 24, 2017

@jnothman I've included the CV output. Can you have a look at it?

@jnothman (Member) commented Sep 25, 2017

At master I get:


/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)

which is what I was expecting.

In your branch, I get:


/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)

I now realise that the output is identical because __init__ already calls _validate_params with set_max_iter=False. I believe set_max_iter=False should also be passed in set_params, and that the new flag in this PR is unnecessary if that change is made.
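
In code, that direction amounts to roughly the following sketch (illustrative; the exact merged change may differ):

def set_params(self, *args, **kwargs):
    super(BaseSGD, self).set_params(*args, **kwargs)
    # Mirror __init__: skip resolving the max_iter/tol defaults here (and the
    # FutureWarning that comes with them); fit() will still resolve them later.
    self._validate_params(set_max_iter=False)
    return self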

@vrishank97 (Contributor, Author) commented Sep 25, 2017

Should I close the PR or add set_max_iter to set_params instead?

@jnothman (Member) commented Sep 25, 2017

@vrishank97 (Contributor, Author) commented Sep 25, 2017

I've added set_max_iter as a param for set_params.
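
A quick sanity check one could run on the branch (hypothetical snippet, not part of the PR): set_params by itself should no longer emit the max_iter/tol FutureWarning, while fit is still expected to warn when both are left unset.

import warnings
from sklearn.linear_model import SGDRegressor

est = SGDRegressor()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    est.set_params(alpha=0.01)
print([w.category.__name__ for w in caught])  # expect no FutureWarning here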

@jnothman (Member) left a comment

LGTM. I assume this is MRG now...

@jnothman (Member) commented Sep 25, 2017

(We could require a test, but I'm not sure it's worth much...)

@jnothman jnothman changed the title [WIP]: Add set_future_warning flag in BaseSGD [MRG+1]: Add set_future_warning flag in BaseSGD Sep 25, 2017
@jnothman (Member) commented Sep 25, 2017

But just to be sure, check the output when doing GridSearchCV(SGDRegressor(), {'alpha': np.logspace(-5, 1, 10)}).fit(...)

@jnothman jnothman changed the title [MRG+1]: Add set_future_warning flag in BaseSGD [MRG+1]: avoid FutureWarning in BaseSGD.set_params Sep 26, 2017
@jnothman jnothman added this to the 0.19.1 milestone Sep 26, 2017
@vrishank97 (Contributor, Author) commented Sep 26, 2017

Output:
/usr/local/lib/python2.7/site-packages/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
"and default tol will be 1e-3." % type(self), FutureWarning)

Code:
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import SGDRegressor
n_samples, n_features = 10, 5
np.random.seed(0)
y = np.random.randn(n_samples)
X = np.random.randn(n_samples, n_features)
GridSearchCV(SGDRegressor(), {'alpha': np.logspace(-5, 1, 10)}).fit(X, y)
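
For reference, a hedged way to count how many FutureWarnings such a run emits (the kind of comparison discussed below); this is a sketch, not the procedure the reviewers used:

import warnings
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import SGDRegressor

n_samples, n_features = 10, 5
np.random.seed(0)
y = np.random.randn(n_samples)
X = np.random.randn(n_samples, n_features)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every emission, not just the first
    GridSearchCV(SGDRegressor(), {'alpha': np.logspace(-5, 1, 10)}).fit(X, y)

n_future = sum(issubclass(w.category, FutureWarning) for w in caught)
print("FutureWarnings emitted:", n_future)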

@jnothman (Member) commented Sep 26, 2017

@vrishank97 (Contributor, Author) commented Sep 26, 2017

No, it isn't. Is it possible that something else is triggering the warnings?

@jnothman (Member) commented Sep 27, 2017

It is for me! At master I get 62 warnings; at this branch I get 31. It's definitely an improvement!

@jnothman jnothman merged commit 5f863aa into scikit-learn:master Sep 27, 2017
6 checks passed
@jnothman (Member) commented Sep 27, 2017

Merging, thanks.

jnothman added a commit to jnothman/scikit-learn that referenced this issue Oct 3, 2017
maskani-moh added a commit to maskani-moh/scikit-learn that referenced this issue Nov 15, 2017
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this issue Dec 18, 2017