-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH Add Poisson datafit #78
Conversation
Note to self: Poisson datafit is not Lipschitz, need for a line search |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can instead use PN solver as it doesn't require datafits to be Lipschitz and performs a backtracking line search to find a suitable step.
We should implement two functions raw_gradient
and raw_hessian
as in the Logisitc
datafit
skglm/skglm/datafits/single_task.py
Lines 132 to 139 in 19a5a6f
def raw_grad(self, y, Xw): | |
"""Compute gradient of datafit w.r.t ``Xw``.""" | |
return -y / (1 + np.exp(y * Xw)) / len(y) | |
def raw_hessian(self, y, Xw): | |
"""Compute Hessian of datafit w.r.t ``Xw``.""" | |
exp_minus_yXw = np.exp(-y * Xw) | |
return exp_minus_yXw / (1 + exp_minus_yXw) ** 2 / len(y) |
Did I get what you meant @PABannier?
Casting It remains to add an example, test against |
Good idea ! But we don't want statsmodel as a dependency for the main code, so it will need to be a test dependency only I thought sklearn supported regularization, but it's only L2 apparently. To discuss: it we match sklearn on Poisson + L2, is it enough or do we need to add a statsmodel test requirement to test Poisson + L1 ? @QB3 @Klopfe @PABannier thoughts on this tradeoff ? |
Yes absolutely, it will be a test dependency as we did in square root Lasso |
I'd opt for comparing l1-regularized models. I've added |
For the example, do you think about something in particular @mathurinm ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR looks great, once again a nice show of force from our design and solvers: it costs 20 lines to add a whole new model !!!
We only need to be a bit more careful with the statsmodel dependecy and that will be a green light for me
skglm/tests/test_datafits.py
Outdated
@@ -3,12 +3,13 @@ | |||
|
|||
from sklearn.linear_model import HuberRegressor | |||
from numpy.testing import assert_allclose, assert_array_less | |||
from statsmodels.discrete.discrete_model import Poisson as PoissonRegressor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See discussion here: https://github.com/scikit-learn-contrib/skglm/pull/57/files#r990803934
we need a more advanced scheme to avoid imposing statsmodel for every user, yet to make it clear that it's a test dependency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i've changed it
It can be anything really, but to quote the great man "if it's not in the doc, it does not exist". Maybe check out what sklearn has ? |
Thanks @PABannier |
Closes #62