personalization() has explosive memory requirements due to pairwise comparison #37

ahgraber · 2022-01-07T19:01:35Z

On my system (16 GB RAM), a list of 10k recommendations runs, but a list of 50k crashes. I'd like to understand the personalization score across my entire hypothetical customer base of 250k+ users.

Is there a way to chunk the scipy.sparse.csr_matrix and iteratively calculate the cosine similarity to avoid holding the whole thing in memory?
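One possible way to do this kind of chunking is sketched below. It is not the library's implementation: the function name `chunked_personalization` and the `chunk_size` parameter are hypothetical. It assumes that rows are users and columns are items, and that the score is 1 minus the mean cosine similarity over all pairs of distinct users. Only one block of similarity rows is held in memory at a time.

```python
import numpy as np
import scipy.sparse as sp


def chunked_personalization(rec_matrix, chunk_size=1000):
    """Return 1 - mean pairwise cosine similarity between distinct rows.

    Hypothetical helper: similarities are computed one row block at a time,
    so the full n x n similarity matrix is never materialized.
    """
    n_users = rec_matrix.shape[0]
    if n_users < 2:
        raise ValueError("need at least two users to compare")

    m = sp.csr_matrix(rec_matrix, dtype=np.float64)

    # L2-normalize each row so that a dot product equals cosine similarity.
    norms = np.sqrt(np.asarray(m.multiply(m).sum(axis=1)).ravel())
    norms[norms == 0] = 1.0  # leave all-zero rows as zeros
    unit = (sp.diags(1.0 / norms) @ m).tocsr()
    unit_t = unit.T.tocsr()

    total = 0.0
    for start in range(0, n_users, chunk_size):
        block = unit[start:start + chunk_size]
        # Sparse (chunk_size x n_users) block of cosine similarities.
        total += (block @ unit_t).sum()

    # Remove each row's similarity with itself (the diagonal).
    self_sim = unit.multiply(unit).sum()
    mean_sim = (total - self_sim) / (n_users * (n_users - 1))
    return 1.0 - mean_sim
```

Peak memory is then bounded by roughly `chunk_size * n_users` similarity entries instead of `n_users ** 2`, at the cost of one sparse product per block. The running time is still quadratic in the number of users.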

Alex-Bujorianu · 2023-01-31T15:03:53Z

I have the same issue. As a workaround, I randomly sampled a set of users from the population.
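A minimal sketch of that sampling workaround, assuming the recommendations are held as one list per user. The helper name `sample_recommendation_lists` and its parameters are hypothetical, not part of the library.

```python
import numpy as np


def sample_recommendation_lists(predicted, sample_size, seed=None):
    """Return a uniform random subset of users' recommendation lists.

    Hypothetical helper: `predicted` is assumed to be a sequence with one
    list of recommended items per user.
    """
    rng = np.random.default_rng(seed)
    n_users = len(predicted)
    k = min(sample_size, n_users)
    # Sample user indices without replacement so no user is counted twice.
    chosen = rng.choice(n_users, size=k, replace=False)
    return [predicted[i] for i in chosen]
```

The sampled lists can then be passed to personalization() in place of the full population. The result is an estimate, so repeating it with a few different seeds gives a sense of how much it varies.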

ibuda · 2023-03-17T19:49:33Z

@ahgraber @Alex-Bujorianu The problem with personalization(), besides its time complexity, is that it uses quadratic space.
With 50k users, the only issue I had was running time. I resolved the space problem by increasing the swap from the default 2 GB to 40 GB (I am using Ubuntu). Hope that helps.
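To put rough numbers on the quadratic growth, here is a back-of-the-envelope estimate. It assumes a dense n x n similarity matrix of 8-byte floats, which is an upper bound (a sparse result may be smaller). The function name is hypothetical.

```python
def dense_similarity_gb(n_users, bytes_per_value=8):
    """Estimated size in GB (10**9 bytes) of a dense n x n similarity matrix."""
    return n_users ** 2 * bytes_per_value / 1e9


for n in (10_000, 50_000, 250_000):
    print(f"{n:>7} users: {dense_similarity_gb(n):.1f} GB")
# prints:
#   10000 users: 0.8 GB
#   50000 users: 20.0 GB
#  250000 users: 500.0 GB
```

Under these assumptions, a 5x increase in users costs 25x the memory. That is in line with 10k running on a 16 GB machine while 50k does not.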

ibuda · 2023-03-17T19:51:39Z

@gregwchase I wouldn't label this issue as a bug, as there is no way to bypass the space complexity here. I left a recommendation on how to increase memory space for those who use Linux machines.

gregwchase added the bug label Feb 23, 2023

gregwchase removed the bug label Mar 17, 2023

Alex-Bujorianu mentioned this issue Jul 20, 2023

The personalization function will now work on large arrays #63

Open
