Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for non-convex penalties acting on the singular values #52

Closed
marcusvaltonen opened this issue Mar 15, 2022 · 6 comments
Closed

Comments

@marcusvaltonen
Copy link
Contributor

marcusvaltonen commented Mar 15, 2022

The nuclear norm is the soft-thresholding operator acting on the singular values. This essentially means that even the larger singular values are penalized just as hard as the smaller ones, which leads to the phenomenon shrinking bias which has been studied in e.g. [1]. Ideally, the smaller singular values (likely stemming from noise) should be penalized proportionally harder, and this can be done in two ways:

Weighted nuclear norm (WNN)

The weighted nuclear norm adds a penalty w_i for the i:th singular value, where {w_i} is increasing. [2]

Concave function acting on the singular value

There are many works that add instead apply a concave map f to the i:th singular σ_i -> f(σ_i). These include:

Proposition for enhancement

Implement the above penalties and their proximal operators.

References

[1] Carlsson et al. "An unbiased approach to low rank recovery", https://arxiv.org/abs/1909.13363
[2] Gu et al. "Weighted Nuclear Norm Minimization with Application to Image Denoising" In the Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), 2014
[3] Fan and Li "Variable selection via nonconcave penalized likelihood and its oracle properties" Journal of the American Statistical Association, 96(456):1348–1360, 2001
[4] Friedman "Fast sparse regression and classification", International Journal of Forecasting, 28(3):722 – 738, 2012.
[5] Zhang et al. "Nearly unbiased variable selection under minimax concave penalty" The Annals of Statistics, 38(2):894–942, 2010
[6] Gao et al. "A feasible nonconvex relaxation approach to feature selection", In Proceedings of the Conference on Artificial Intelligence (AAAI), 2011
[7] Geman and Yang "Nonlinear image recovery with half-quadratic regularization", IEEE Transactions on Image Processing, 4:932 – 946, 08 1995

@mrava87
Copy link
Contributor

mrava87 commented Mar 22, 2022

Nice :) I am familiar with the issue of soft-thresholding you mention, so I will definitely be keen to have more robust alternatives. Again, if you want to go ahead I suggest making PRs per method so we have 'small' pieces of code to review and push at once

@marcusvaltonen
Copy link
Contributor Author

I will look into this. Just a comment: these methods work together with both linear and bilinear frameworks, so they can be implemented before the second-order methods are merged. Again, I will look into the architectural design of introducing these penalties. I would like to make them classes, as also suggested in PyLops/pylops#330, and ideally using class-based solvers, as suggested in PyLops/pylops#90.

@mrava87
Copy link
Contributor

mrava87 commented Mar 23, 2022

Sounds good. We have a plan to have all solvers become class based and for the old ones we will probably support also the original version which internally will call the .solve method of the class. So far it is just an idea, I plan to spend some time playing around with the design and see what looks best. Feel free to go ahead and we can make sure things will look consistently once we get to the point you have things ready :)

@marcusvaltonen
Copy link
Contributor Author

I will not make a PR for MCP, since it coincides with f_mu for a certain choice of parameters.

@marcusvaltonen
Copy link
Contributor Author

marcusvaltonen commented Mar 31, 2022

Now PRs are out (or have been merged) for all the separable penalties mentioned above. I will look into the non-separable penalties, like weighted nuclear norm WNN, and perhaps the quadratic envelope of the projection of the rank/cardinality function.

@marcusvaltonen
Copy link
Contributor Author

Everything that was mentioned has been implemented, so closing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants