ENH flexible gram solver with penalty and using datafit #16

mathurinm · 2022-05-05T15:32:37Z

This is a smaller version of #4: it does not handle groups, but it reuses more code and supports any penalty.

PABannier · 2022-05-10T22:16:00Z

@mathurinm ready for a quick review ;)

mathurinm · 2022-05-11T06:25:27Z

skglm/solvers/cd_solver.py

+        if (isinstance(datafit, (Quadratic, Quadratic_32)) and n_samples > n_features
+                and n_features < 10_000) or solver in ("cd_gram", "fista"):
+            # Gram matrix must fit in memory hence the restriction n_features < 10_000
+            if not isinstance(datafit, (Quadratic, Quadratic_32)):


I think this bit is unreachable because the check is already performed L155

I added it because the outer condition is "isinstance(...) or solver in (...)". If the user manually passes "cd_gram", we enter the if statement even with a non-quadratic datafit, and I want to catch that wrong datafit, hence L158. Overkill maybe? Should we even expose solver? I think it is convenient for benchmarks.
WDYT?

Ok I understood, thanks.
Maybe we can indent the first if, breaking the line before "or solver", to make it more visible?

I tried to make it more obvious. WDYT?
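
For reference, the dispatch logic discussed in this thread could be sketched as below. This is only an illustration, not the code in the PR: `select_solver`, the `is_quadratic` flag (standing in for the isinstance check on the datafit), the `max_gram_features` parameter, and the choice of "cd_gram" as the automatic default are all assumptions. Validating the user-forced solver first keeps the two cases visually separate.

```python
def select_solver(is_quadratic, n_samples, n_features, solver=None,
                  max_gram_features=10_000):
    """Hypothetical helper: pick a solver name from the problem shape.

    `is_quadratic` stands in for
    isinstance(datafit, (Quadratic, Quadratic_32)).
    """
    # Case 1: the user explicitly asked for a Gram-based solver.
    if solver in ("cd_gram", "fista"):
        if not is_quadratic:
            raise ValueError(
                f"solver={solver!r} requires a quadratic datafit")
        return solver

    # Case 2: automatic choice. The Gram matrix has shape
    # (n_features, n_features) and must fit in memory.
    if is_quadratic and n_samples > n_features \
            and n_features < max_gram_features:
        return "cd_gram"  # assumed default for the automatic branch

    return "cd_ws"


print(select_solver(True, n_samples=1000, n_features=50))
```

With this layout, a forced "cd_gram" on a non-quadratic datafit raises immediately, and the automatic branch never needs the extra check.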

mathurinm · 2022-05-11T06:26:45Z

skglm/solvers/cd_solver.py


-    coefs : array, shape (n_features, n_alphas)
-        Coefficients along the path.
+    obj_out : array, shape (n_iter,)


Do we really return this, or the optimality condition violation instead?

We do return this. See L371.

mathurinm · 2022-05-11T06:34:45Z

skglm/solvers/cd_utils.py

+
+
+@njit
+def prox_vec(penalty, z, stepsize, n_features):


Arf, I thought we had penalty.prox.

make this function private, remove n_features (access as z.shape[1])

We need to think about solvers more broadly, but probably all penalties will need to implement it. We could do so in BasePenalty, but I fear that looping over all coordinates will be slower than performing it in one step as ST_vec does.

In [16]: %%time
    ...: out = _prox_vec(pen, z, 0.01)
CPU times: user 28 µs, sys: 1e+03 ns, total: 29 µs
Wall time: 34.1 µs

In [17]: %%time
    ...: out2 = ST_vec(z, 0.01)
CPU times: user 23 µs, sys: 0 ns, total: 23 µs
Wall time: 25.7 µs

not a big difference, from my experiments. I tried with different thresholds.

With @QB3 we had an issue a while ago on flashcd with finance, where this caused a big overhead. Just something to keep in mind.

PABannier

Overall LGTM.
Tests are missing for the solvers though, I can write some if needed.

PABannier · 2022-06-15T09:35:08Z

skglm/solvers/cd_solver.py

@@ -52,6 +56,9 @@ def cd_solver_path(X, y, datafit, penalty, alphas=None,
    return_n_iter : bool, optional
        If True, number of iterations along the path are returned.

+    solver : ('cd_ws'|'cd_gram'|'fista'), optional


FISTA is not a CD solver, it's confusing to expose it to the user like this.
@mathurinm WDYT?

mathurinm · 2022-08-22T08:38:28Z

skglm/solvers/gram.py

+@njit
+def _cd_epoch_gram(XtX, grad, w, datafit, penalty, n_samples, n_features):
+    lc = datafit.lipschitz
+    for j in range(n_features):


Since we have complete access to grad at each iteration, it would be interesting to use a greedy selection rule here: instead of picking j cyclically, take j = np.argmax(np.abs(grad)).

One "epoch" in this setting would only be the update of n_features coordinates.
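
A rough sketch of what greedy selection on the Gram matrix could look like for the Lasso, written in plain NumPy. This is not the PR's `_cd_epoch_gram`; all names here are hypothetical. One deliberate deviation: instead of argmax of |grad|, the sketch picks the coordinate with the largest violation of the L1 optimality conditions, because with a non-smooth penalty the raw gradient does not vanish at the solution (it equals -alpha * sign(w_j) on the support). The objective is assumed to be ||y - Xw||^2 / (2 n_samples) + alpha * ||w||_1.

```python
import numpy as np


def soft_threshold(x, tau):
    return np.sign(x) * max(abs(x) - tau, 0.0)


def l1_violation(w, grad, alpha):
    # Distance from -grad_j to the subdifferential of alpha * |w_j|.
    return np.where(w != 0,
                    np.abs(grad + alpha * np.sign(w)),
                    np.maximum(np.abs(grad) - alpha, 0.0))


def greedy_cd_gram(XtX, Xty, alpha, n_samples, max_updates=5000, tol=1e-10):
    n_features = XtX.shape[0]
    w = np.zeros(n_features)
    lc = np.diag(XtX) / n_samples      # coordinate-wise Lipschitz constants
    grad = -Xty / n_samples            # gradient of the datafit at w = 0
    for _ in range(max_updates):
        scores = l1_violation(w, grad, alpha)
        j = int(np.argmax(scores))     # greedy choice instead of cyclic
        if scores[j] <= tol:
            break
        old = w[j]
        w[j] = soft_threshold(old - grad[j] / lc[j], alpha / lc[j])
        # The full gradient stays available at O(n_features) cost per update.
        grad += (w[j] - old) * XtX[:, j] / n_samples
    return w


rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
y = rng.standard_normal(50)
alpha = 0.1
w = greedy_cd_gram(X.T @ X, X.T @ y, alpha, n_samples=50)
print(w)
```

Each update costs O(n_features) for both the argmax and the gradient refresh, so n_features greedy updates cost about the same as one cyclic epoch.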

Badr-MOUFAD · 2022-08-26T12:19:13Z

closing in favor of #59

mathurinm and others added 14 commits May 5, 2022 17:31

draft flexible gram solver with penalty and using datafit

a3c7a53

fix wrong docstring

34db4fc

reorg cd_gram_quadratic

2110db1

fix cd epoch

247bb75

green

11742dc

ERR circular import

4a44957

fix circular import

a85f5cf

linter happy

6c9b146

fix sparse

30363ea

tests are passing

14d3800

added FISTA gram

59b8f6c

linter happy

6bb4706

fix tests

f0ed28a

added solver arguments

5dd40fd

mathurinm commented May 11, 2022

PABannier and others added 5 commits May 11, 2022 09:53

Update skglm/solvers/cd_solver.py

07714ac

Co-authored-by: mathurinm <mathurinm@users.noreply.github.com>

Update skglm/solvers/gram.py

8aee1b6

Co-authored-by: mathurinm <mathurinm@users.noreply.github.com>

pass Mathurin's comments

0b940ee

Merge branch 'gram_penalty_nogroup' of https://github.com/mathurinm/s…

522cf69

…kglm into gram_penalty_nogroup

linter happy

b5b9d09

mathurinm commented May 13, 2022

mathurinm requested a review from Klopfe May 13, 2022 08:37

PABannier and others added 3 commits May 14, 2022 14:00

fix w_init

fc791d6

ENH if statement

9e19e08

Merge branch 'main' of github.com:scikit-learn-contrib/skglm into gra…

b9ddc34

…m_penalty_nogroup

PABannier requested a review from QB3 May 16, 2022 20:49

Merge branch 'main' of https://github.com/PABannier/skglm into gram_p…

7fdac87

…enalty_nogroup

PABannier requested changes Jun 15, 2022

mathurinm assigned Badr-MOUFAD Aug 22, 2022

mathurinm commented Aug 22, 2022

Badr-MOUFAD closed this Aug 26, 2022

mathurinm deleted the gram_penalty_nogroup branch August 26, 2022 12:25
