[MRG+1]: avoid FutureWarning in BaseSGD.set_params #9802

Merged
merged 4 commits into scikit-learn:master on Sep 27, 2017

Conversation

vrishank97 (Contributor) commented Sep 19, 2017

Reference Issue

Fixes #9752

What does this implement/fix? Explain your changes.

Adds a set_future_warning flag in BaseSGD so that FutureWarning is not
raised when __init__ or set_params is called.
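The idea can be sketched in isolation. This is a toy model of the pattern, not scikit-learn's actual code: the class name HypotheticalSGD, the flag name, and the abbreviated warning text are all illustrative. The point is that parameter validation only warns when the estimator is actually used, not when parameters are merely copied around:

```python
import warnings

class HypotheticalSGD:
    """Toy sketch: suppress the FutureWarning while parameters are merely
    being copied (construction, set_params) and emit it only at fit time."""

    def __init__(self, max_iter=None, tol=None):
        self.max_iter = max_iter
        self.tol = tol
        # No warning here: cloning utilities may call __init__ repeatedly.
        self._validate_params(set_future_warning=False)

    def _validate_params(self, set_future_warning=True):
        if set_future_warning and self.max_iter is None and self.tol is None:
            warnings.warn(
                "max_iter and tol defaults will change in a future release",
                FutureWarning,
            )

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        # Copying parameters should not trigger the deprecation notice.
        self._validate_params(set_future_warning=False)
        return self

    def fit(self, X=None, y=None):
        # The single warning is deferred until the estimator is fitted.
        self._validate_params(set_future_warning=True)
        return self
```

With this shape, constructing the estimator or calling set_params is silent, and exactly one FutureWarning appears per fit.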

@jnothman

Looks about right. Can you show the output of cross validation over an SGDRegressor() before and after this change please?

vrishank97 (Contributor) commented Sep 20, 2017

Before:
/usr/local/lib/python2.7/site-packages/sklearn/linear_model/stochastic_gradient.py:84: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)

After:
/usr/local/lib/python2.7/site-packages/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)

vrishank97 (Contributor) commented Sep 20, 2017

Script:
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn import linear_model

n_samples, n_features = 10, 5
np.random.seed(0)
y = np.random.randn(n_samples)
X = np.random.randn(n_samples, n_features)
clf = linear_model.SGDRegressor()
scores = cross_val_score(clf, X, y, cv=5)

vrishank97 (Contributor) commented Sep 24, 2017

@jnothman I've included the CV output. Can you have a look at it?

jnothman (Member) commented Sep 25, 2017

At master I get:

/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)

which is what I was expecting.

In your branch, I get:

/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)
/Users/joel/repos/scikit-learn/sklearn/linear_model/stochastic_gradient.py:129: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)

I now realise that the output is identical because __init__ already passes set_max_iter=False. I believe now that set_max_iter=False should also be used in set_params, and that this PR's new flag is unnecessary if that change is made.
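jnothman's suggestion can be illustrated with a toy estimator (hypothetical names, not the actual BaseSGD code): if set_params skips the deprecation notice, a grid-search-style loop warns once per fit instead of twice per candidate.

```python
import warnings

class TinyEstimator:
    """Hypothetical estimator mirroring the proposed fix: validation can
    skip the FutureWarning when called from set_params."""

    def __init__(self, alpha=0.0001, max_iter=None):
        self.alpha = alpha
        self.max_iter = max_iter

    def _validate_params(self, set_max_iter=True):
        if set_max_iter and self.max_iter is None:
            warnings.warn("default max_iter will change", FutureWarning)

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        # The fix: do not resolve max_iter or warn here ...
        self._validate_params(set_max_iter=False)
        return self

    def fit(self):
        # ... the warning is emitted once, at fit time.
        self._validate_params(set_max_iter=True)
        return self

def warnings_per_search(n_candidates):
    """Count FutureWarnings for a grid-search-like loop over candidates."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        for i in range(n_candidates):
            TinyEstimator().set_params(alpha=10.0 ** -i).fit()
    return len(caught)
```

With the flag, three candidates produce three warnings; had set_params also warned, the same loop would produce six.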

vrishank97 (Contributor) commented Sep 25, 2017

Should I close the PR or add set_max_iter to set_params instead?

jnothman (Member) commented Sep 25, 2017
vrishank97 (Contributor) commented Sep 25, 2017

I've added set_max_iter as a parameter to set_params.

jnothman (Member)

LGTM. I assume this is MRG now...

jnothman (Member) commented Sep 25, 2017

(We could require a test, but I'm not sure it's worth much...)

@jnothman jnothman changed the title from [WIP]: Add set_future_warning flag in BaseSGD to [MRG+1]: Add set_future_warning flag in BaseSGD Sep 25, 2017

jnothman (Member) commented Sep 25, 2017

But just to be sure, check the output when doing GridSearchCV(SGDRegressor(), {'alpha': np.logspace(-5, 1, 10)}).fit(...)

@jnothman jnothman changed the title from [MRG+1]: Add set_future_warning flag in BaseSGD to [MRG+1]: avoid FutureWarning in BaseSGD.set_params Sep 26, 2017

@jnothman jnothman added this to the 0.19.1 milestone Sep 26, 2017

vrishank97 (Contributor) commented Sep 26, 2017

Output:
/usr/local/lib/python2.7/site-packages/sklearn/linear_model/stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in <class 'sklearn.linear_model.stochastic_gradient.SGDRegressor'> in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
  "and default tol will be 1e-3." % type(self), FutureWarning)

Code:
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import SGDRegressor

n_samples, n_features = 10, 5
np.random.seed(0)
y = np.random.randn(n_samples)
X = np.random.randn(n_samples, n_features)
GridSearchCV(SGDRegressor(), {'alpha': np.logspace(-5, 1, 10)}).fit(X, y)

jnothman (Member) commented Sep 26, 2017

vrishank97 (Contributor) commented Sep 26, 2017

No, it isn't. Is it possible that something else is triggering the warnings?

jnothman (Member) commented Sep 27, 2017

It is for me! At master I get 62 warnings; at this branch I get 31. It's definitely an improvement!
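Warning counts like these can be gathered reproducibly with a small helper. A sketch using only the standard library (the helper name is made up for illustration):

```python
import warnings

def count_future_warnings(fn):
    """Run fn and return how many FutureWarnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" ensures repeated warnings are not deduplicated,
        # so the count reflects every emission.
        warnings.simplefilter("always")
        fn()
    return sum(issubclass(w.category, FutureWarning) for w in caught)
```

For example, wrapping the GridSearchCV call above in a function and passing it to count_future_warnings would give the 62-versus-31 comparison directly.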

@jnothman jnothman merged commit 5f863aa into scikit-learn:master Sep 27, 2017

6 checks passed

ci/circleci: Your tests passed on CircleCI!
codecov/patch: 100% of diff hit (target 96.17%)
codecov/project: 96.17% (+<.01%) compared to 3cfd1fa
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed
lgtm analysis: Python: No alert changes
jnothman (Member) commented Sep 27, 2017

Merging, thanks.

jnothman added a commit to jnothman/scikit-learn that referenced this pull request Oct 3, 2017

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
