Fix float32 `cond` linear regression #33249
antoinebaker wants to merge 12 commits into scikit-learn:main
Conversation
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Junteng Li <JasonLiJT@users.noreply.github.com>
Thanks for the quick review @JasonLiJT @ogrisel. I think this small PR can be quickly merged, so I've marked it ready for review. What are your thoughts on the follow-up PR: do you prefer option 1 (find a good default for `cond`) or option 2 (expose it as a parameter)? I personally prefer option 2; I think it will be easier (finding a good default `cond` looks hard).
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
Rah, this is annoying :( I think the most probable cause is the cutoff for "zero" singular values in the various solvers. I will try to investigate to be sure. But I have the intuition that we will probably need a more involved PR, either exposing this cutoff consistently or finding a good default.
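To make the cutoff discussion concrete, here is a small sketch (illustrative only, not this PR's code) of how `scipy.linalg.lstsq`'s `cond` parameter decides which singular values are treated as zero, and how that changes both the effective rank and the returned solution on a nearly rank-deficient design matrix:

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
# Build a nearly rank-deficient design: the 4th column is almost a copy
# of the 1st, so the smallest singular value is tiny (~1e-7 relative).
X = rng.standard_normal((50, 3))
X = np.hstack([X, X[:, :1] + 1e-7 * rng.standard_normal((50, 1))])
y = X @ np.array([1.0, 2.0, 3.0, 4.0])

# A loose cutoff treats the tiny singular value as zero and returns the
# stable minimum-norm solution; cond=None falls back to a machine-precision
# cutoff, which keeps it (and is dtype-dependent: float32 eps is ~1e-7
# versus ~2e-16 for float64, which is the crux of the float32 issue).
coef_loose, _, rank_loose, _ = linalg.lstsq(X, y, cond=1e-6)
coef_tight, _, rank_tight, _ = linalg.lstsq(X, y, cond=None)
```

With the loose cutoff the effective rank drops from 4 to 3 and the weight on the duplicated column is spread across both copies, giving a smaller-norm coefficient vector.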
I changed my mind :)
Closing in favor of #33565
TLDR: After investigation, finding a consistent choice for all solvers is not possible, or too difficult. For some solvers we do not have any control over the cutoff; for the other solvers the cutoff cannot be set consistently.
Fixes #33032 but will reopen #26164.
What does this implement/fix? Explain your changes.
As explained in #33032, the `cond` choice made in PR #30040 (to solve #26164) introduced new bugs, for instance on float32 data. This PR reverts the changes made in #30040 and adds a reproducer for the bugs reported in #33032.

Comments / next steps
This is a temporary "fix" following the plan outlined here. We need to reopen #26164.
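The flavor of float32 bug being reverted here can be sketched as follows. This is an illustration of the mechanism (a dtype-scaled cutoff like `n_samples * eps` wiping out a legitimate small singular value in float32), not the PR's actual reproducer; the scaling factors are chosen for the example:

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
n = 200
X = rng.standard_normal((n, 2)).astype(np.float32)
# The second feature has a small but legitimate singular value,
# well above float32 rounding noise (eps ~ 1.2e-7).
X[:, 1] *= 1e-5
y = X @ np.ones(2, dtype=np.float32)

eps32 = np.finfo(np.float32).eps
# A dtype-scaled cutoff like n * eps (~2.4e-5 here) truncates feature 2
# and silently drops its coefficient.
coef_cut, _, rank_cut, _ = linalg.lstsq(X, y, cond=n * eps32)
# The bare float32 eps keeps it, recovering both coefficients.
coef_keep, _, rank_keep, _ = linalg.lstsq(X, y, cond=eps32)
```

The aggressive cutoff reduces the effective rank to 1 and zeroes the second coefficient, while the looser cutoff recovers the true coefficients (both equal to 1) up to float32 accuracy.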
In follow-up PRs we should either:
- find a good default `cond` that ideally works on any data shape and dtype, and passes the sample weight consistency checks
- expose `cond` as a parameter in `LinearRegression`, as done for the sparse case in "Add `tol` to `LinearRegression`" #30521; we could actually re-use the `tol` parameter for this (which would mean `tol` playing the role of `cond` in the dense case).
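Option 2 could look something like the following hypothetical sketch (this is not scikit-learn's actual implementation; `dense_lstsq_fit` and its `tol` parameter are illustrative names), where a user-facing `tol` is forwarded as scipy's `cond` in the dense path:

```python
import numpy as np
from scipy import linalg

def dense_lstsq_fit(X, y, tol=None):
    """Hypothetical dense solver where `tol` doubles as scipy's `cond`:
    singular values below tol * s_max are treated as zero; tol=None
    defers to LAPACK's machine-precision default."""
    coef, _, rank, _ = linalg.lstsq(X, y, cond=tol)
    return coef, rank

# On well-conditioned data the default cutoff recovers the coefficients.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 2))
y = X @ np.array([2.0, -1.0])
coef, rank = dense_lstsq_fit(X, y)
```

The appeal of this option is that the user, who knows the scale and dtype of their data, picks the cutoff, instead of us having to guess one default that works for every shape and dtype.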