Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] FIX report n_iter_ as at most max_iter, without always reducing by 1 #10723

Merged
merged 9 commits into from
Mar 5, 2018

Conversation

jnothman
Copy link
Member

Fix #10619

@jnothman jnothman changed the title [MRG] FIX report n_iter_ faithfully to scipy.optimize [MRG] FIX report n_iter_ in accordance with scipy.optimize Feb 28, 2018
@jnothman jnothman added the Bug label Feb 28, 2018
@lesteve
Copy link
Member

lesteve commented Feb 28, 2018

I pushed a minor change in the test that checks that the solver is lbfgs and that the scipy version is affected by this bug. Other than this, LGTM.

@jnothman
Copy link
Member Author

jnothman commented Feb 28, 2018 via email

@lesteve
Copy link
Member

lesteve commented Mar 1, 2018

It would be nice to have a second opinion on this one. Maybe @rth, @glemaitre or @qinhanmin2014?

Copy link
Member

@qinhanmin2014 qinhanmin2014 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, +1 for faithfully report the result from scipy. I don't think users should blame scikit-learn if they get different results here.

@@ -382,6 +382,11 @@ Linear, kernelized and related models
underlying implementation is broken. Use :class:`linear_model.Lasso` instead.
:issue:`9837` by `Alexandre Gramfort`_.

- :class:`linear_model.LogisticRegression` with ``solver='lbfgs'`` formerly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt whether it belongs to API changes, either Enhancements or Bug fixes seems fine from my side.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going from scikit-learn 0.19 to 0.20, logistic_regression.n_iter_ will change (unless you use scipy > 1.0.0). In this respect "API changes" feels like the right place.

@qinhanmin2014 qinhanmin2014 changed the title [MRG] FIX report n_iter_ in accordance with scipy.optimize [MRG+1] FIX report n_iter_ in accordance with scipy.optimize Mar 1, 2018
@qinhanmin2014
Copy link
Member

Going from scikit-learn 0.19 to 0.20, logistic_regression.n_iter_ will change (unless you use scipy > 1.0.0). In this respect "API changes" feels like the right place.

Fair enough. @lesteve @jnothman merge?

@lesteve
Copy link
Member

lesteve commented Mar 1, 2018

It'd be good to have a third opinion, it could potentially break some user code, e.g. if someone was checking logistic_regression.n_iter_ == logistic_regression.max_iter to figure out whether max_iter was reached before converging.

@jnothman
Copy link
Member Author

jnothman commented Mar 1, 2018 via email

@lesteve
Copy link
Member

lesteve commented Mar 2, 2018

IMO it's fine to leave n_iter_ as reported by scipy. Because this can break user code (a bit of a edge case though maybe), I think it'd be good to have another opinion.

@glemaitre
Copy link
Member

logistic_regression.n_iter_ == logistic_regression.max_iter

Tricky one. If we don't want to have any issue we should cover this case.

However, we could also expect that user set verbose=1 and catch the ConvergenceWarning from there. This should be one of the reason to raise those warnings, isn't it :) Not sure this a good excuse to go around the issue.

@lesteve
Copy link
Member

lesteve commented Mar 2, 2018

Tricky one.

Yep ...

Thinking a bit more about it, I think doing n_iter_ = min(n_iter_, max_iter) is the least worse option. At least we maintain a bit of internal consistency within scikit-learn (i.e. we keep n_iter_ <= max_iter). Obviously n_iter_ is still going to be wrong (in scipy <= 1.0.0).

@jnothman jnothman changed the title [MRG+1] FIX report n_iter_ in accordance with scipy.optimize [MRG] FIX report n_iter_ as at most max_iter, without always reducing by 1 Mar 3, 2018
Copy link
Member

@qinhanmin2014 qinhanmin2014 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think it's a better solution.

@qinhanmin2014
Copy link
Member

ping @jnothman @lesteve @glemaitre I quickly searched the codebase with git grep "'nit'". It seems that HuberRegressor has the same problem. Should we fix that here?

from sklearn.linear_model import HuberRegressor
from sklearn.datasets import load_boston
X, y = load_boston(return_X_y=True)
reg = HuberRegressor(max_iter=1)
reg.fit(X, y)
print(reg.n_iter_)
# 2

@jnothman
Copy link
Member Author

jnothman commented Mar 4, 2018 via email

Copy link
Member

@qinhanmin2014 qinhanmin2014 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'll give my +1 again.

.. versionchanged:: 0.20

In SciPy <= 1.0.0 the number of lbfgs iterations may exceed
``max_iter``. ``n_iter_`` will nowreport at most ``max_iter``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nowreport -> now report

@qinhanmin2014 qinhanmin2014 changed the title [MRG] FIX report n_iter_ as at most max_iter, without always reducing by 1 [MRG+1] FIX report n_iter_ as at most max_iter, without always reducing by 1 Mar 4, 2018
@lesteve
Copy link
Member

lesteve commented Mar 5, 2018

I pushed some minor tweaks. I'll merge this one when Travis is green. Thanks everyone, I feel like we reached a reasonable solution!

@lesteve lesteve merged commit 2aba6e2 into scikit-learn:master Mar 5, 2018
@jnothman
Copy link
Member Author

jnothman commented Mar 7, 2018

Yay! Travis Cron passed! No more daily "fail" emails!

@lesteve
Copy link
Member

lesteve commented Mar 7, 2018

Great stuff, thanks for tackling this one @jnothman!

jnothman added a commit to jnothman/scikit-learn that referenced this pull request Jul 4, 2018
Ma27 added a commit to Ma27/nixpkgs that referenced this pull request Jul 14, 2018
…aster

See scikit-learn/scikit-learn#10723

This fixes the build of `scikitlearn` on master and nixos-unstable.

The issue is originally an upstream issue
(see scikit-learn/scikit-learn#10619) which
was fixed on master and was mainly caused by changes to the environment.

Closes NixOS#43466
dotlambda pushed a commit to NixOS/nixpkgs that referenced this pull request Jul 14, 2018
…aster (#43483)

See scikit-learn/scikit-learn#10723

This fixes the build of `scikitlearn` on master and nixos-unstable.

The issue is originally an upstream issue
(see scikit-learn/scikit-learn#10619) which
was fixed on master and was mainly caused by changes to the environment.

Closes #43466
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants