[FEA] Add batching option to filter_cells and filter_genes in rapids_scanpy_funcs #53

Open
cjnolet opened this issue Nov 12, 2020 · 0 comments
Labels
enhancement New feature or request

Comments

cjnolet commented Nov 12, 2020

Because the cuSPARSE API uses 32-bit integers to specify the size of the underlying workspaces in GPU memory, and because the SciPy/CuPy sparse APIs use 32-bit integers to specify the size of the underlying matrices, very large datasets run into problems when filtering cells and genes. We can work around this constraint in two ways: chunk the data across multiple GPUs using Dask, or batch the filters on a single GPU.

We should do this specifically for the 1M cells notebook, so that we can remove the on_device argument.
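A minimal sketch of the single-GPU batching option, written against `scipy.sparse` on the CPU so it runs anywhere. The function names `filter_cells_batched` and `filter_genes_batched`, their parameters, and the default `batch_size` are hypothetical and are not the existing `rapids_scanpy_funcs` signatures. The idea is that each row batch stays small enough for its index arrays and workspaces to fit in 32-bit sizes; a GPU version would presumably apply the same loop to CuPy sparse matrices.

```python
import numpy as np
import scipy.sparse as sp


def _row_batches(X, batch_size):
    """Yield consecutive row slices of a CSR matrix."""
    for start in range(0, X.shape[0], batch_size):
        yield X[start:start + batch_size]


def filter_cells_batched(X, min_genes, max_genes, batch_size=50_000):
    """Keep cells (rows) expressing between min_genes and max_genes genes.

    Hypothetical helper: only one batch of rows is filtered at a time.
    """
    X = sp.csr_matrix(X)
    kept = []
    for batch in _row_batches(X, batch_size):
        # Stored entries per row; assumes no explicitly stored zeros.
        genes_per_cell = batch.getnnz(axis=1)
        mask = (genes_per_cell >= min_genes) & (genes_per_cell <= max_genes)
        if mask.any():
            kept.append(batch[mask])
    if not kept:
        return sp.csr_matrix((0, X.shape[1]), dtype=X.dtype)
    return sp.vstack(kept, format="csr")


def filter_genes_batched(X, min_cells, batch_size=50_000):
    """Keep genes (columns) expressed in at least min_cells cells.

    Pass 1 accumulates per-gene counts batch by batch; pass 2 applies
    the resulting column mask to each batch and restacks the pieces.
    """
    X = sp.csr_matrix(X)
    cells_per_gene = np.zeros(X.shape[1], dtype=np.int64)
    for batch in _row_batches(X, batch_size):
        cells_per_gene += batch.getnnz(axis=0)
    mask = cells_per_gene >= min_cells
    pieces = [batch[:, mask] for batch in _row_batches(X, batch_size)]
    return sp.vstack(pieces, format="csr")
```

Gene filtering needs two passes because a gene's cell count is only known after every batch has been seen, whereas cell filtering can decide row by row within a batch.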

@cjnolet added the enhancement label on Nov 12, 2020