Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[READY] ENH - Add Gram Solver for single task Quadratic datafit #59

Merged
merged 27 commits into from Aug 26, 2022

Conversation

Badr-MOUFAD
Copy link
Collaborator

@Badr-MOUFAD Badr-MOUFAD commented Aug 24, 2022

This aims to add Gram solver that uses gram matrix precomputation to solve problems with quadratic datafit where the number of samples is largely greater than the number of features.

It will proceed as follows:

  • implement gram solver for dense data
  • extend gram solver to sparse data
  • perform benchmarks to show gains compared to main solver

skglm/solvers/gram_cd.py Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
skglm/solvers/gram_cd.py Outdated Show resolved Hide resolved
@Badr-MOUFAD
Copy link
Collaborator Author

I performed a benchmark and am convinced with the significant speedups the gram solver offers in cases n >> p.

However, I see that we have a huge overhead in the case of sparse datasets, which is not the case for dense data.

Sparse case

sparse-case

Sparse case

dense-case

That needs to be investigated further, I am particularly suspecting the sparse matrix multiplication and the conversion to dense as these are what differ the sparse from the dense case.
Meanwhile, any thoughts/ideas, @mathurinm?

grad[:] = grad_acc

# store p_obj
p_obj = 0.5 * w @ (scaled_gram @ w) - scaled_Xty @ w + penalty.value(w)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the constant term is missing

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, nice catch!

@Badr-MOUFAD
Copy link
Collaborator Author

following #59 (comment), we will tackle the big overhead that we have in the sparse case due to the computation of the gram matrix, which particularly impacts the performance of the solver, in a separate issue+PR

@Badr-MOUFAD Badr-MOUFAD changed the title [WIP] ENH - Add Gram Solver for single task Quadratic datafit [READY] ENH - Add Gram Solver for single task Quadratic datafit Aug 26, 2022
Copy link
Collaborator

@mathurinm mathurinm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, merge when green @Badr-MOUFAD

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants