EFF Optimize memory usage for sparse matrices in LLE (Hessian, Modified and LTSA) #28096
Conversation
Could you add an entry in the changelog?
glemaitre left a comment:
I am quite worried that, with a bug present, no test was failing. We should make sure to have something minimal here.
removing double loop Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
resolving loop for sparse matrix Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Sorry @giorgioangel, I did not have time to follow up before the release. I'll add the milestone for 1.6 and make a review. I'll sort out any conflicts and ping someone else for a second review.
glemaitre left a comment:
LGTM. Since we have a test that checks all solvers and all methods, it means we did not introduce a regression. So I would advocate that we don't need any additional tests.
OmarManzoor left a comment:
LGTM. Thanks @giorgioangel
What does this implement/fix? Explain your changes.
This PR optimizes memory management with sparse matrices when using Modified Locally Linear Embedding.
Before this PR, a dense NxN numpy array was created, filled, and then converted to sparse. Creating that array can require a huge amount of memory on a large dataset.
On the dataset I was working with, the algorithm tried to allocate 400 GB of RAM.
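As a back-of-envelope check of that figure (not from the PR itself): a dense NxN float64 array takes N * N * 8 bytes, so 400 GB corresponds to roughly 224,000 samples.

```python
import math

# 400 GB of float64 entries at 8 bytes each gives N^2 = 400e9 / 8 entries,
# so N is the square root of that.
n = math.sqrt(400e9 / 8)
print(round(n))  # -> 223607, i.e. a dataset of roughly 224k samples
```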
With this PR, when M_sparse is true, the algorithm constructs the sparse matrix directly, greatly reducing the memory requirements.
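A minimal sketch of the two strategies, using hypothetical neighbor indices and weights rather than the actual scikit-learn code: the "before" path fills a dense NxN array and converts it, needing O(N^2) memory, while the "after" path assembles the same matrix directly from (data, (row, col)) triplets, needing only O(N * k) memory for k neighbors per sample.

```python
import numpy as np
from scipy.sparse import csr_matrix

N, k = 200, 5
rng = np.random.default_rng(0)
# Illustrative neighbor indices (unique within each row) and weights.
neighbors = np.array([rng.choice(N, size=k, replace=False) for _ in range(N)])
weights = rng.random((N, k))

# Before: fill a dense N x N array (O(N^2) memory), then convert to sparse.
M_dense = np.zeros((N, N))
for i in range(N):
    M_dense[i, neighbors[i]] = weights[i]
M_before = csr_matrix(M_dense)

# After: build the sparse matrix directly from COO-style triplets,
# never materializing the dense array (O(N * k) memory).
rows = np.repeat(np.arange(N), k)
M_after = csr_matrix((weights.ravel(), (rows, neighbors.ravel())), shape=(N, N))

# Both constructions yield the same matrix.
assert (M_before != M_after).nnz == 0
```

The same idea applies to the Hessian and LTSA variants mentioned in the title: any accumulation that only touches each sample's neighborhood can be expressed as triplets and handed to the sparse constructor directly.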