Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improved memory efficiency in _score_genes #1196

Merged
merged 6 commits into from May 14, 2020

Conversation

redst4r
Copy link

@redst4r redst4r commented May 3, 2020

In short: Avoids creating dense matrices for score calculation.

Currently, _score_genes() unnecessarily creates dense matrices to use the np.nanmean function (which doesn't work on sparse matrices out of the box).
This causes memory problems for larger datasets (anything 50k cells got me into trouble) and can be completely avoided with a nanmean() implementation of sparse matrices.

Not sure why the build is failing now in completely unrelated functions though!

Avoids creating dense matrices for score calculation.
Copy link
Member

@ivirshup ivirshup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

A couple questions,

  • Does score genes even work if the are nan values in the array?
  • Could any tests please be added to the test suite?

@redst4r
Copy link
Author

redst4r commented May 4, 2020

hi,

  • yeah, it should work if the original array contain nan (i.e. it would just ignore these entries)
  • ah, just saw that there's a tests/test_score_genes.py file, I'll add a few tests the next days!

@redst4r
Copy link
Author

redst4r commented May 9, 2020

hi,

finally managed to add some tests. Had to refactor the original test_score_genes.py a little, I hope that's ok: The one test that was already there still exists, I just pulled out the creation of the adata into a separate function.

@ivirshup
Copy link
Member

LGTM, thanks!

@ivirshup ivirshup merged commit bafaa50 into scverse:master May 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants