Stricter test for glms with newton-cholesky solver on collinear data #8
Conversation
> If you agree with this plan, let me know and I can help you implement it to get the test in this PR to pass. But feel free to do it yourself if you prefer.

I'm very happy with it. I thought a lot about the best action (from a user's perspective) for a solver in case of "dynamically detected" convergence problems:

I would appreciate your help very much. I added you as collaborator on this fork to make it easier to work together.

---

I will try to move this forward this afternoon. My plan is to:

EDIT: @lorentzenchr was too fast. Let me update this PR.

---

I merged and now the updated collinear data test passes!
This PR aims at better testing the `lbfgs_step` fallback mechanism of scikit-learn#23314 (introduced in c9b1200 and subsequent commits).

This test is currently failing and highlights the fact that newton-cholesky with an inner solver that switches to 4 steps of LBFGS whenever the Hessian is found to be ill-conditioned does not work as expected: it can be very slow, because this problem can happen many times in a single fit call, and furthermore repeatedly calling `'L-BFGS-B'` with `"maxiter": 4` via `scipy.optimize.minimize` is not equivalent to calling it once with a large `maxiter`, because each restart loses the memory of the previous gradients.

I would therefore recommend to stop attempting to implement a fine-grained fallback mechanism for the `inner_solve` step of the Newton solvers. Instead, I would suggest letting the Newton solver raise an exception when this happens and using a coarse-grained fallback to the "lbfgs" solver until convergence (possibly warm-starting from the previous coef in case the Newton solver had a chance to update them successfully for a few iterations). This coarse fallback should be much simpler to implement and maintain, and should be guaranteed to converge to a solution as good as the one obtained if the user had chosen LBFGS originally. The scikit-learn-level warning message should be adapted accordingly.
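To make the restart argument concrete, here is a small standalone sketch (not scikit-learn code; the ill-conditioned quadratic is just an arbitrary example) comparing one long `'L-BFGS-B'` run with repeated 4-iteration runs that share the same total iteration budget:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.RandomState(0)

# Arbitrary ill-conditioned quadratic: 0.5 * x^T A x - b^T x.
A = np.diag(np.logspace(0, 6, 20))
b = rng.randn(20)

def fun(x):
    return 0.5 * x @ A @ x - b @ x

def grad(x):
    return A @ x - b

x0 = np.zeros(20)

# One long run: the internal L-BFGS memory of past gradient/curvature
# pairs is kept from one iteration to the next.
single = minimize(fun, x0, jac=grad, method="L-BFGS-B",
                  options={"maxiter": 400})

# 100 runs of 4 iterations each, warm-started on the previous result:
# the memory is discarded at every restart.
x = x0
for _ in range(100):
    res = minimize(fun, x, jac=grad, method="L-BFGS-B",
                   options={"maxiter": 4})
    x = res.x

print("objective after one 400-iteration call:", fun(single.x))
print("objective after 100 x 4-iteration calls:", fun(x))
```

On a problem like this, the restarted variant typically ends at a noticeably worse objective for the same budget, because each `minimize` call starts over with an empty approximation of the Hessian.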
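For illustration, the coarse-grained fallback could be shaped roughly as follows; `NewtonSolverError`, `newton_solve` and `func_and_grad` are hypothetical placeholders for this sketch, not existing scikit-learn API:

```python
import warnings

from scipy.optimize import minimize


class NewtonSolverError(RuntimeError):
    """Raised when the Hessian is too ill-conditioned for a Newton step."""

    def __init__(self, message, coef):
        super().__init__(message)
        # Last coefficients successfully produced by the Newton iterations,
        # kept to warm-start the fallback solver.
        self.coef = coef


def fit_with_fallback(func_and_grad, coef0, newton_solve, max_iter=100):
    """Try the Newton solver; on failure, rerun lbfgs as a whole.

    `func_and_grad(coef)` must return `(loss, gradient)`, and
    `newton_solve(coef)` must either return the converged coefficients
    or raise NewtonSolverError.
    """
    try:
        return newton_solve(coef0)
    except NewtonSolverError as exc:
        warnings.warn(
            "The Newton solver hit an ill-conditioned Hessian; falling back "
            "to lbfgs until convergence."
        )
        # Coarse fallback: a single L-BFGS-B run until convergence,
        # warm-started from the last valid Newton coefficients.
        result = minimize(func_and_grad, exc.coef, jac=True,
                          method="L-BFGS-B", options={"maxiter": max_iter})
        return result.x
```

The point is that the fallback happens once, at the level of the whole fit, rather than inside each Newton iteration.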
What do you think about this plan @lorentzenchr?
Credits: this plan was originally suggested by @GaelVaroquaux IRL.
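As a rough idea of what the stricter collinear-data check could look like (this is not the actual test from this PR; it is written against the `solver="newton-cholesky"` option as released in scikit-learn >= 1.2, and the tolerance would likely need tuning):

```python
import numpy as np
from numpy.testing import assert_allclose
from sklearn.linear_model import PoissonRegressor

rng = np.random.RandomState(42)
X = rng.standard_normal(size=(100, 3))
# Make the design matrix collinear by duplicating the last column.
X = np.hstack([X, X[:, [-1]]])
y = rng.poisson(lam=np.exp(X[:, 0]))

reg_lbfgs = PoissonRegressor(alpha=0.0, solver="lbfgs", max_iter=1000).fit(X, y)
reg_newton = PoissonRegressor(alpha=0.0, solver="newton-cholesky").fit(X, y)

# With collinear columns the coefficients are not identifiable, but the
# fitted means should agree between solvers if the fallback works.
assert_allclose(reg_newton.predict(X), reg_lbfgs.predict(X), rtol=1e-4)
```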