
ENH add verbosity to newton-cg solver #27526

Merged
merged 38 commits into from Apr 12, 2024

Conversation

lorentzenchr
Member

Reference Issues/PRs

This PR is meant to be merged after #26721.

What does this implement/fix? Explain your changes.

This PR adds verbosity to _newton_cg solver in our private sklearn.utils.optimize module. It is used, e.g., in LogisticRegression.

Any other comments?

@github-actions

github-actions bot commented Oct 3, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 1aa4996.

@lorentzenchr lorentzenchr marked this pull request as draft October 5, 2023 19:52
@adrinjalali
Member

I wonder if we should add these verbosity PRs or add callbacks instead (cc @jeremiedbb )

@jeremiedbb
Member

Verbosity through callbacks is a bit limited because it would be triggered only at the places where callbacks are called, so if we really need advanced low-level verbosity we have to implement it manually. I'm not convinced that we really need advanced low-level verbosity, though; no strong opinion here.
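For illustration, the callback-based alternative discussed here might look like the following sketch. The callback signature, its dict payload, and `toy_solver` are hypothetical illustrations, not scikit-learn API:

```python
# Hypothetical sketch of callback-based verbosity. The callback hook,
# its dict payload, and `toy_solver` are illustrative assumptions, not
# actual scikit-learn API.
def verbose_callback(state):
    print(f"iter={state['iteration']}  loss={state['loss']:.6f}")

def toy_solver(x0, n_iter=3, callback=None):
    # Stand-in solver loop: the callback fires only here, once per outer
    # iteration. This is why fine-grained verbosity (inner CG steps,
    # line search details) cannot be reported through callbacks alone.
    x = x0
    for i in range(n_iter):
        x = x / 2
        if callback is not None:
            callback({"iteration": i, "loss": x * x})
    return x

toy_solver(4.0, callback=verbose_callback)
```

This also shows the limitation mentioned above: anything that happens between two callback invocations is invisible to the callback.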

@jeremiedbb jeremiedbb modified the milestones: 1.4, 1.5 Dec 21, 2023
@lorentzenchr
Member Author

@adrinjalali @jeremiedbb @ogrisel I would like to merge this. Newton-CG might be our best solver for multiclass logistic regression and the other solvers have verbose output, too. It doesn't change the API and is just convenience for a small group of users. Review should be super easy.

@ogrisel
Member

ogrisel commented Apr 5, 2024

Let me update this branch with main to make it easier to test it with my local meson build.

@ogrisel
Member

ogrisel commented Apr 5, 2024

For other reviewers here is how a typical verbose run would look like:

>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> LogisticRegression(solver="newton-cg", verbose=100).fit(*make_classification())
Newton-CG iter = 0
  Check Convergence
    1. max |gradient| 0.4887044022364469 <= 0.0001
  Inner CG solver iteration 1 stopped with
    sum(|residuals|) <= tol: 0.7554386095540608 <= 0.9662352699764765
  Line Search
    eps=16 * finfo.eps=3.552713678800501e-15
    try line search wolfe1
    wolfe1 line search was successful
Newton-CG iter = 1
  Check Convergence
    1. max |gradient| 0.14592025445011966 <= 0.0001
  Inner CG solver iteration 1 stopped with
    sum(|residuals|) <= tol: 0.2851651569422557 <= 0.34548568271404273
  Line Search
    eps=16 * finfo.eps=3.552713678800501e-15
    try line search wolfe1
    wolfe1 line search was successful
Newton-CG iter = 2
  Check Convergence
    1. max |gradient| 0.0540671431111577 <= 0.0001
  Inner CG solver iteration 2 stopped with
    sum(|residuals|) <= tol: 0.09220421429635615 <= 0.15390480184192634
  Line Search
    eps=16 * finfo.eps=3.552713678800501e-15
    try line search wolfe1
    wolfe1 line search was successful
Newton-CG iter = 3
  Check Convergence
    1. max |gradient| 0.015163843559910496 <= 0.0001
  Inner CG solver iteration 2 stopped with
    sum(|residuals|) <= tol: 0.026244178729999727 <= 0.037938424290546426
  Line Search
    eps=16 * finfo.eps=3.552713678800501e-15
    try line search wolfe1
    wolfe1 line search was successful
Newton-CG iter = 4
  Check Convergence
    1. max |gradient| 0.004579527123434936 <= 0.0001
  Inner CG solver iteration 4 stopped with
    sum(|residuals|) <= tol: 0.004167589412564355 <= 0.006848570879928111
  Line Search
    eps=16 * finfo.eps=3.552713678800501e-15
    try line search wolfe1
    wolfe1 line search was successful
Newton-CG iter = 5
  Check Convergence
    1. max |gradient| 0.0008076794452485875 <= 0.0001
  Inner CG solver iteration 5 stopped with
    sum(|residuals|) <= tol: 0.00029939950485330754 <= 0.0004458638755984877
  Line Search
    eps=16 * finfo.eps=3.552713678800501e-15
    try line search wolfe1
    wolfe1 line search was successful
Newton-CG iter = 6
  Check Convergence
    1. max |gradient| 3.631012151112875e-05 <= 0.0001
  Solver did converge at loss = 0.22370577235383718.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.0s
LogisticRegression(solver='newton-cg', verbose=100)
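Schematically, the structure behind these messages is an outer Newton loop with a convergence check on max |gradient|, an inner solve for the Newton step, and a line search. The following is a simplified sketch under those assumptions, not the actual code in `sklearn.utils.optimize` (the real `_newton_cg` uses inner conjugate gradient iterations and SciPy's Wolfe line searches):

```python
import numpy as np

def newton_sketch(grad, hess, x0, tol=1e-4, max_iter=100, verbose=False):
    # Simplified sketch mirroring the verbose output above; the real
    # _newton_cg solver uses inner conjugate gradient iterations and a
    # wolfe1/wolfe2 line search instead of a direct solve and unit step.
    x = np.asarray(x0, dtype=float)
    for it in range(max_iter):
        g = grad(x)
        if verbose:
            print(f"Newton iter = {it}")
            print(f"  max |gradient| {np.max(np.abs(g))} <= {tol}")
        if np.max(np.abs(g)) <= tol:
            break  # converged
        step = np.linalg.solve(hess(x), -g)  # inner solve for the Newton step
        x = x + step  # unit step stands in for the line search
    return x
```

On a quadratic objective this converges in a single Newton step, which is why the gradient check comes first in each iteration.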

@ogrisel
Member

ogrisel commented Apr 5, 2024

@lorentzenchr would it be possible to get more details on the line search iterations?

@lorentzenchr
Member Author

would it be possible to get more details on the line search iterations?

Not really. We use private line search functions from scipy. They do not have a verbose option. Those functions call, e.g., DCSRCH, which was only very recently ported to Python from the Fortran minpack2 code. If someone is interested, I would point them to contributing upstream in scipy.
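For reference, SciPy's public `scipy.optimize.line_search` (a strong-Wolfe line search related to the private wolfe1/wolfe2 helpers mentioned above) can be exercised directly. This is a minimal sketch on a toy quadratic, not how the solver actually calls it:

```python
import numpy as np
from scipy.optimize import line_search

def f(x):
    return float(x @ x)  # toy objective f(x) = ||x||^2

def grad(x):
    return 2.0 * x

xk = np.array([1.0, 1.0])
pk = -grad(xk)  # steepest descent direction
# line_search returns (alpha, fc, gc, new_fval, old_fval, new_slope);
# alpha is a step size satisfying the strong Wolfe conditions.
alpha = line_search(f, grad, xk, pk)[0]
```

As the comment thread notes, these SciPy routines print nothing themselves, so any verbosity around them has to be added by the caller.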

@ogrisel ogrisel left a comment (Member)

One more quick pass, but I have to go offline soon. I'll keep the Firefox tab open so I can finalize my review soon this time.

sklearn/linear_model/_glm/_newton_solver.py Show resolved Hide resolved
doc/whats_new/v1.4.rst Outdated Show resolved Hide resolved
sklearn/utils/optimize.py Outdated Show resolved Hide resolved
sklearn/utils/optimize.py Outdated Show resolved Hide resolved
sklearn/utils/optimize.py Outdated Show resolved Hide resolved
sklearn/utils/optimize.py Outdated Show resolved Hide resolved
sklearn/utils/optimize.py Outdated Show resolved Hide resolved
sklearn/utils/tests/test_optimize.py Outdated Show resolved Hide resolved
sklearn/utils/tests/test_optimize.py Outdated Show resolved Hide resolved
@adrinjalali
Member

CI failing.

@ogrisel ogrisel left a comment (Member)

Another pass, LGTM.

@@ -71,6 +93,8 @@ def _line_search_wolfe12(f, fprime, xk, pk, gfk, old_fval, old_old_fval, **kwarg
# TODO: It seems that the new check for the sum of absolute gradients above
# catches all cases that, earlier, ended up here. In fact, our tests never
# trigger this "if branch" here and we can consider to remove it.
    if is_verbose:
        print(" last resort: try line search wolfe2")
    ret = line_search_wolfe2(
        f, fprime, xk, pk, gfk, old_fval, old_old_fval, **kwargs
    )
Member

Isn't there anything interesting to print based on the result of wolfe2? Maybe we should at least print that the wolfe2 line search was successful when ret[0] is not None to be consistent with what we print for the wolfe1 line search.

Member Author

Isn't there anything interesting to print based on the result of wolfe2?

🤷

    print(
        " check sum(|gradient|) < sum(|gradient_old|): "
        f"{sum_abs_grad} < {sum_abs_grad_old} {check}"
    )
if check:
    ret = (
Member

Shall we make it explicit that we perform an update with unit step size in this case?

Member Author

The current message is 1-1 with NewtonSolver.

eps = 16 * np.finfo(np.asarray(old_fval).dtype).eps
if is_verbose:
    print(" Line Search")
    print(f" eps=16 * finfo.eps={eps}")
Member

I would defer the print of eps to when it's actually used:

Suggested change
print(f" eps=16 * finfo.eps={eps}")

otherwise one gets the impression that it's used by wolfe1, especially when it's successful.

Member Author

The current message is 1-1 with NewtonSolver.

Comment on lines +66 to +69
print(
    " check loss |improvement| <= eps * |loss_old|:"
    f" {np.abs(loss_improvement)} <= {tiny_loss} {check}"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
print(
    " check loss |improvement| <= eps * |loss_old|:"
    f" {np.abs(loss_improvement)} <= {tiny_loss} {check}"
)
to
print(f" eps=16 * finfo.eps={eps}")
print(
    " check loss |improvement| <= eps * |loss_old|:"
    f" {np.abs(loss_improvement)} <= {tiny_loss} {check}"
)

sklearn/utils/tests/test_optimize.py Outdated Show resolved Hide resolved
sklearn/utils/optimize.py Outdated Show resolved Hide resolved
@adrinjalali adrinjalali left a comment (Member)

There are some untested lines that look legitimate and should be tested; otherwise LGTM.

I think some of @ogrisel's comments are valid in terms of moving messages or being more explicit, but I don't mind them going into a separate PR since they'd touch multiple solvers.

@lorentzenchr
Member Author

lorentzenchr commented Apr 11, 2024

There are some untested lines which it seems they're legit and should be tested

Except for about one occurrence, the lines reported as untested by codecov are just hard-to-trigger corner cases. There are currently no tests that trigger them, to my knowledge (so it's a liability that is more than 10 years old). I would much prefer not to put that burden on this PR.

@adrinjalali adrinjalali merged commit 3ee60a7 into scikit-learn:main Apr 12, 2024
29 of 30 checks passed
@ogrisel ogrisel deleted the newton_cg_verbose branch April 12, 2024 13:45
@lorentzenchr lorentzenchr restored the newton_cg_verbose branch April 12, 2024 13:56
@lorentzenchr lorentzenchr deleted the newton_cg_verbose branch April 12, 2024 13:56