ENH add Huber datafit #14
Conversation
Hello @EnLAI111, you need to decorate your […]
Thanks a lot for the PR @EnLAI111 ! I did a first pass of comments, don't hesitate to ping me if something is not explicit enough or if you need feedback.
skglm/datafits/Huber.py
```python
n_features = X.shape[1]
self.lipschitz = np.zeros(n_features, dtype=X.dtype)
for j in range(n_features):
    self.lipschitz[j] = (np.where(np.abs(y) < self.delta, ...
```
I think the Lipschitz constant should just be `norm(X_j) ** 2`?
I would have said `norm(X_j) ** 2 / delta`.
This makes me think about how we want to solve this optimization problem. If we want to use algorithms like those in http://proceedings.mlr.press/v84/massias18a/massias18a.pdf and https://arxiv.org/pdf/1902.02509.pdf, I think we will need a custom `_cd_epoch` function.
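For reference, here is a minimal pure-numpy sketch of the Huber loss and its curvature bound (function names are illustrative, not skglm's actual datafit API). Since the second derivative of the Huber function is 0 or 1, the per-coordinate Lipschitz constant under this unscaled parametrization is bounded by `norm(X[:, j]) ** 2`; the `/ delta` factor would only appear under a rescaled parametrization.

```python
import numpy as np

def huber_value(r, delta):
    # Huber loss of residuals r: quadratic for |r| <= delta, linear beyond
    small = np.abs(r) <= delta
    return np.sum(np.where(small, 0.5 * r ** 2,
                           delta * np.abs(r) - 0.5 * delta ** 2))

def huber_grad(r, delta):
    # derivative w.r.t. r: equals r in the quadratic zone, +/- delta outside,
    # i.e. a clip; the second derivative is 0 or 1, hence bounded by 1
    return np.clip(r, -delta, delta)

# for f(w) = huber_value(y - X @ w, delta), the coordinate-wise Lipschitz
# constant satisfies lipschitz[j] <= max|f''| * norm(X[:, j]) ** 2
#                                 =  norm(X[:, j]) ** 2
```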
Sure, but for delta? https://github.com/scikit-learn-contrib/skglm/pull/14/files#diff-518236c0559dd6839714e8c437731d42952421ed7fed1319945a7d9bbe9f315eR20
There is no optimization over delta so far, so no need for this solver; but if we need to optimize over this variable, a dedicated solver will be needed.
ok I see, there is no delta for this parametrization!
Thanks for the comments and suggestions! I have fixed the problems. Now with WeightedL1 there are no more errors of […]
Note that this is not explicit in the doc. When reading it, I thought it would be automated by some […] Adding the […]
To finalize this PR, unit tests should be added. Initially I wanted to compare it to sklearn's […]
If we just want to test the convergence, we could use cvxpy to solve the optimization problem and check that the betas are the same. What do you think?
@Klopfe using cvxpy would introduce a heavy dependency; instead we can fit a sklearn Huber regressor and find its […]
I have pushed a script that does it. Beware that sklearn uses squared L2 regularization, and that the penalty is not scaled by 1/2 (their docstring is not very clear). I had to use more samples than features, otherwise it fits perfectly and […]
We don't have a way to handle unpenalized problems, so I have used WeightedL1 with 0 weights. For future work…
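A rough sketch of the sklearn side of that comparison (the data, `epsilon`, and `alpha` values here are illustrative, and the skglm side of the test is omitted; sklearn's parametrization assumptions are noted in comments):

```python
import numpy as np
from sklearn.linear_model import HuberRegressor

rng = np.random.default_rng(0)
n_samples, n_features = 100, 10  # more samples than features, per the note above
X = rng.standard_normal((n_samples, n_features))
y = X @ rng.standard_normal(n_features) + 0.1 * rng.standard_normal(n_samples)

# in sklearn's objective, alpha multiplies the *squared* L2 norm of coef_
# with no 1/2 factor, and epsilon thresholds the residuals scaled by the
# jointly estimated scale_ (sklearn optimizes the scale too)
reg = HuberRegressor(epsilon=1.35, alpha=1e-4, fit_intercept=False).fit(X, y)

# reg.coef_ (and reg.scale_, used to set delta) would then be compared with
# np.testing.assert_allclose against the skglm solver on a matching objective
```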
Another solution could be to use […]
It's what `HuberRegressor` does already, so why rewrite this code?
Did my pass, merge when green if LGTY @QB3 @PABannier
Thanks @EnLAI111!
When calling the function `construct_grad(X, y, w, Xw, datafit, ws)`, I get the following error. So maybe there are bugs in the `Huber` class somewhere, but I can't figure out where.
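For debugging, a self-contained, hypothetical version of what such a helper might compute can help localize the bug by comparison (`ToyHuber` and this `construct_grad` are illustrative sketches, not skglm's actual code or signatures):

```python
import numpy as np

class ToyHuber:
    # hypothetical stand-in for the Huber datafit, not skglm's class
    def __init__(self, delta):
        self.delta = delta

    def raw_grad(self, y, Xw):
        # gradient of the mean Huber loss w.r.t. Xw: minus the clipped residuals
        return -np.clip(y - Xw, -self.delta, self.delta) / len(y)

def construct_grad(X, y, w, Xw, datafit, ws):
    # datafit gradient restricted to the features in the working set ws
    # (w is kept in the signature to mirror the call in the comment above)
    raw = datafit.raw_grad(y, Xw)
    return np.array([X[:, j] @ raw for j in ws])

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
y = rng.standard_normal(20)
w = np.zeros(5)
grad = construct_grad(X, y, w, X @ w, ToyHuber(delta=1.0), ws=np.array([0, 2]))
```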