[READY] ENH - Add Gram Solver for single task Quadratic datafit #59

Merged

Badr-MOUFAD merged 27 commits into scikit-learn-contrib:main from Badr-MOUFAD:gram-solver

Aug 26, 2022

Collaborator

Badr-MOUFAD commented Aug 24, 2022 (edited)

This PR aims to add a Gram solver, which precomputes the Gram matrix to solve problems with a quadratic datafit when the number of samples is much larger than the number of features (n >> p).

It will proceed as follows:

implement gram solver for dense data
extend gram solver to sparse data
perform benchmarks to show gains compared to main solver
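For readers unfamiliar with the idea, here is a minimal sketch of Gram-based coordinate descent on the Lasso with dense data. It is not the code merged in this PR: the function name `gram_cd_lasso`, its parameters, and the stopping rule are hypothetical, and the 1/n scaling of `scaled_gram` and `scaled_Xty` is an assumption. The point it illustrates is that, once the Gram matrix is precomputed, each epoch only touches p x p quantities and never goes back to the n samples.

```python
import numpy as np


def soft_threshold(x, tau):
    """Proximal operator of tau * |.| for a scalar x."""
    return np.sign(x) * max(abs(x) - tau, 0.0)


def gram_cd_lasso(X, y, alpha, max_iter=200, tol=1e-10):
    """Cyclic coordinate descent on the Lasso using only Gram quantities.

    Hypothetical helper: minimizes 1/(2n) * ||y - Xw||^2 + alpha * ||w||_1.
    """
    n_samples, n_features = X.shape
    # One-off O(n * p^2) precomputation; afterwards an epoch costs O(p^2).
    scaled_gram = X.T @ X / n_samples
    scaled_Xty = X.T @ y / n_samples

    w = np.zeros(n_features)
    grad = -scaled_Xty.copy()  # gradient of the datafit at w = 0

    for _ in range(max_iter):
        max_update = 0.0
        for j in range(n_features):
            lipschitz_j = scaled_gram[j, j]
            if lipschitz_j == 0.0:
                continue  # empty feature, nothing to update
            old_w_j = w[j]
            w[j] = soft_threshold(
                old_w_j - grad[j] / lipschitz_j, alpha / lipschitz_j
            )
            delta = w[j] - old_w_j
            if delta != 0.0:
                # Keep the gradient in sync using one column of the Gram matrix.
                grad += delta * scaled_gram[:, j]
                max_update = max(max_update, abs(delta))
        if max_update <= tol:
            break
    return w
```

The greedy strategy mentioned later in the commit log would pick the coordinate to update from the gradient instead of cycling over all of them; that variant is not shown here.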

Badr-MOUFAD added 4 commits

August 24, 2022 10:21


          init commit

b8fd539


          gram solver && unit test

c2aecba


          fix bug gram solver && tighten test

507fc8a


          add anderson acceleration

c9b64c2

mathurinm reviewed

skglm/solvers/gram_cd.py

skglm/solvers/gram_cd.py Outdated

skglm/solvers/gram_cd.py Outdated

skglm/solvers/gram_cd.py Outdated

skglm/solvers/gram_cd.py Outdated

skglm/solvers/gram_cd.py Outdated

skglm/solvers/gram_cd.py Outdated

skglm/solvers/gram_cd.py Outdated

skglm/solvers/gram_cd.py Outdated

skglm/solvers/gram_cd.py Outdated

Badr-MOUFAD added 6 commits

August 24, 2022 17:23


          bug stop_criter && refactor

20c1911


          refactoring of var names

f2e985d


          handle w_init

2dbc8e4


          refactor _gram_cd_

8ca7a41


          gram epoch greedy and cyclic strategy


          extend to sparse case && unitest

8d3dbc1

mathurinm reviewed

skglm/solvers/gram_cd.py

mathurinm reviewed

skglm/solvers/gram_cd.py Outdated

mathurinm reviewed

skglm/solvers/gram_cd.py Outdated

mathurinm reviewed

skglm/solvers/gram_cd.py Outdated

Badr-MOUFAD added 3 commits

August 25, 2022 10:41


          one implementation of _gram_cd && unittest

cdd7e34


          greedy_cd arg instead of cd_strategy

f4bfeaf


          Merge branch 'main' of https://github.com/scikit-learn-contrib/skglm …

95cf1d4

…into gram-solver

mathurinm reviewed

skglm/solvers/gram_cd.py Outdated

mathurinm reviewed

skglm/solvers/gram_cd.py Outdated

mathurinm reviewed

skglm/solvers/gram_cd.py Outdated

mathurinm reviewed

skglm/solvers/gram_cd.py Outdated

mathurinm reviewed

skglm/solvers/gram_cd.py Outdated


          add docs

4c0acca

Collaborator Author

Badr-MOUFAD commented Aug 25, 2022

I ran a benchmark and am convinced by the significant speedups the Gram solver offers when n >> p.

However, there is a large overhead on sparse datasets that does not show up on dense data.

[Figures: two benchmark plots, both labelled "Sparse case"]

This needs further investigation. I particularly suspect the sparse matrix multiplication and the conversion to dense, since these are the two steps that differ between the sparse and the dense case.
Meanwhile, any thoughts/ideas, @mathurinm?
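One way to check this suspicion is to time the two steps separately. The snippet below is only a sketch under stated assumptions: the matrix sizes, the density, and the use of `scipy.sparse.random` to generate data are arbitrary choices for illustration, not the benchmark setup used in this PR, and it makes no claim about which step dominates.

```python
import time

import numpy as np
from scipy import sparse

# Arbitrary synthetic problem with n >> p (illustrative sizes only).
n_samples, n_features = 10_000, 200
X = sparse.random(
    n_samples, n_features, density=0.01, format="csc", random_state=0
)

# Step 1: sparse-sparse product, the result is still a sparse matrix.
start = time.perf_counter()
gram_sparse = X.T @ X
t_matmul = time.perf_counter() - start

# Step 2: conversion of the p x p result to a dense array, then scaling.
start = time.perf_counter()
scaled_gram = gram_sparse.toarray() / n_samples
t_to_dense = time.perf_counter() - start

print(f"sparse matmul: {t_matmul:.4f} s, to dense: {t_to_dense:.4f} s")
```

Running it with the shapes and densities of the benchmarked datasets would show which of the two steps, if either, accounts for the overhead.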

mathurinm and others added 5 commits

August 25, 2022 13:42


          script fast gram, not faster than scipy

dcab054


          fast gram timing

e8bc96e


          keep grads instead

61a67c4


          refactor chosen_j

1b6c169


          script to profile

c9c5575


          potential improvements, docstring

68a0458

mathurinm reviewed

skglm/solvers/gram_cd.py Outdated

+                                  grad[:] = grad_acc
+                      # store p_obj
+                      p_obj = 0.5 * w @ (scaled_gram @ w) - scaled_Xty @ w + penalty.value(w)

Collaborator

mathurinm Aug 26, 2022

I believe the constant term is missing

Collaborator Author

Badr-MOUFAD Aug 26, 2022

Indeed, nice catch!
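For context, the missing constant comes from expanding the quadratic datafit: 1/(2n) * ||y - Xw||^2 = 0.5 * w' G w - q' w + ||y||^2 / (2n), with G = X'X / n and q = X'y / n. The check below is a small numerical sketch of that identity. It assumes `scaled_gram` and `scaled_Xty` carry the 1/n scaling, which the diff excerpt above does not show, and it uses random data. The constant does not change the minimizer, but without it the reported objective value is off.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 50, 4
X = rng.standard_normal((n_samples, n_features))
y = rng.standard_normal(n_samples)
w = rng.standard_normal(n_features)

# Assumed scaling of the precomputed quantities (not shown in the excerpt).
scaled_gram = X.T @ X / n_samples
scaled_Xty = X.T @ y / n_samples

# Datafit computed directly from the residuals.
datafit_direct = 0.5 * np.sum((y - X @ w) ** 2) / n_samples

# Datafit computed from Gram quantities, as in the reviewed line (no constant).
datafit_gram_no_const = 0.5 * w @ (scaled_gram @ w) - scaled_Xty @ w

# Term that does not depend on w and was missing from p_obj.
constant = 0.5 * (y @ y) / n_samples

print(np.isclose(datafit_direct, datafit_gram_no_const + constant))
```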

mathurinm and others added 4 commits

August 26, 2022 09:30


          warnings.warn arguments in correct order

3788cc4


          cleanups: ann files

1ce391d


          fix p_obj computation

2476a34


          Merge branch 'main' of https://github.com/scikit-learn-contrib/skglm …

0f766e9

…into gram-solver

Collaborator Author

Badr-MOUFAD commented Aug 26, 2022

Following #59 (comment), we will tackle the large overhead in the sparse case, which comes from computing the Gram matrix and significantly degrades the solver's performance, in a separate issue + PR.

Badr-MOUFAD changed the title ~~[WIP] ENH - Add Gram Solver for single task Quadratic datafit~~ [READY] ENH - Add Gram Solver for single task Quadratic datafit

Badr-MOUFAD requested a review from mathurinm

August 26, 2022 11:48


          typos + less cases in test, smaller X in tests

3208dfa

mathurinm approved these changes

Collaborator

mathurinm left a comment

LGTM, merge when green @Badr-MOUFAD

Badr-MOUFAD added 2 commits

August 26, 2022 14:05


          typo: XtXw --> grad

16f6ee4


          Merge branch 'gram-solver' of https://github.com/Badr-MOUFAD/skglm in…

e9b7224

…to gram-solver

Badr-MOUFAD merged commit fe3bedd into scikit-learn-contrib:main

This was referenced Aug 26, 2022

ENH flexible gram solver with penalty and using datafit #16

Closed

ENH - slow gram_cd_solver when fitted on sparse dataset #60

Open