
program quits when the matrix is big #4

Closed
fangge-Li opened this issue Jul 3, 2018 · 8 comments

Comments

@fangge-Li

Hi all,
Thank you for developing STREAM. I have used STREAM on our data, and it works very well when the gene-expression matrix is small.
Now I want to run STREAM on new data containing 7,465 cells and 8,858 genes. I have allocated 8 cores and 16 GB of memory to the Docker container, but the program quits right after printing "Selecting features...", without any error message.
Could you please help me solve this problem?
Thank you so much!

@huidongchen
Collaborator

Hi, Fangge,

Thanks for trying STREAM.
Would you mind sharing the data matrix with us? We'd be happy to take a look.

Best,
Huidong

@fangge-Li
Author

fangge-Li commented Jul 5, 2018 via email

@huidongchen
Collaborator

huidongchen commented Jul 5, 2018

@fangge-Li Sure!

@cstill1992

Hi,

I was wondering if there has been any further update on this issue. I'm running into a similar situation where st.dimension_reduction(data) keeps quitting due to memory constraints (specifically, the process is reported as "killed" after about 2-3 hours of running). I'm also working with a large dataset: 13,230 cells and 10,622 genes (a Drop-seq dataset). I'm running this on a cluster with 30 cores and 128 GB of memory allocated. Any help would be appreciated.

Thanks,
Chris

@huidongchen
Collaborator

Hi Chris @cstill1992 ,

Sorry about the issue. st.dimension_reduction(data) uses 'MLLE' as the default dimension-reduction method, which is not yet efficiently implemented.

As an alternative solution, you can try

st.dimension_reduction(adata, method='umap', n_components=2)
st.seed_elastic_principal_graph(adata, clustering='kmeans')

Of course, you can also increase n_components to recover more detailed structure.

This will speed things up significantly for large datasets.
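As a rough back-of-envelope illustration of why cell count matters here (this is a hypothetical sizing exercise, not code from STREAM itself): manifold methods like MLLE build neighbor/weight matrices whose dense form scales with the square of the number of cells, and a solver may hold several such intermediates at once, so memory use grows quadratically as datasets get larger:

```python
# Hypothetical back-of-envelope: size of ONE dense n x n float64 matrix.
# Real algorithms may allocate several such intermediates plus solver
# workspace, so actual peak memory can be a multiple of this figure.

def dense_matrix_gb(n_cells: int, bytes_per_value: int = 8) -> float:
    """Return the size in GB of a dense n_cells x n_cells matrix."""
    return n_cells * n_cells * bytes_per_value / 1e9

# The two dataset sizes mentioned in this thread:
for n in (7465, 13230):
    print(f"{n} cells -> {dense_matrix_gb(n):.2f} GB per dense matrix")
```

Doubling the cell count roughly quadruples this cost, which is why a graph-based method such as UMAP, which works on sparse neighbor graphs, tends to scale much better.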

@cstill1992

Hi Huidong,

Thanks for the quick response. I'll give it a try and let you know how it goes.

Thanks,
Chris

@cstill1992

Hi Huidong,

Thanks for the help. The fix worked nicely.

Best,
Chris

@huidongchen
Collaborator

That’s great! Feel free to let me know if you have any questions.
