Parallel FindMarkers with Future #7000

Closed
weshorton opened this issue Mar 1, 2023 · 3 comments

Comments

@weshorton

Hello,

I am a beginner in terms of parallel computing in R and am trying to run FindMarkers() using the framework described in the vignette. I'm actually trying to use FindAllMarkers(), but my issue appears with both of them. After setting the plan and running my code, I check out my cores using htop and find that only one core is being used.

> availableCores()
system 
    24 

> plan("multisession")
> plan()
multisession:
- args: function (..., workers = availableCores(), lazy = FALSE, rscript_libs = .libPaths(), envir = parent.frame())
- tweaked: FALSE
- call: plan("multisession")

clusters_v <- levels(seurat_object)
test <- FindMarkers(seurat_object,
                    ident.1 = clusters_v[1],
                    ident.2 = clusters_v[-1],
                    only.pos = FALSE,
                    test.use = "MAST",
                    latent.vars = "batchID")

### OR
# test <- FindAllMarkers(seurat_object, only.pos = FALSE, test.use = "MAST", latent.vars = "batchID")

Looking at htop, I see this:

[Screenshot: htop output, 2023-03-01]

The function call does seem to be running faster than previously, but I feel like I should be seeing usage on other cores instead of just the one. Any comments, links to resources, etc. are appreciated!

@AustinHartman
Contributor

The MAST differential expression method is unfortunately not configured to run in parallel. If you try a different method (such as test.use = "wilcox"), you should see more cores in use.

@ppm1337

ppm1337 commented Oct 4, 2023

I would like to add to this discussion that the current vignette proves that the parallelization is not working any more. The default test.use = "wilcox" with 4 workers in the example has no speedup according to the plot in the vignette:

In an archived vignette, the same example showed a considerable speedup with 4 workers:

[Image: plot from the archived vignette]

@ishwarvh

ishwarvh commented Oct 9, 2023

Would like to add that parallelization seems to have broken when using the future package within RStudio.
Neither specifying "multisession" nor "multicore" has any effect on the number of cores used. All the previously parallelized functions run on only a single core!
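
For anyone trying to narrow this down, a minimal sketch for checking the future backend on its own, without Seurat in the picture. It assumes the future and future.apply packages are installed; the worker count of 2 and the variable names are arbitrary choices for illustration, not anything taken from the vignette. As far as I know, future disables forked ("multicore") processing inside RStudio by default, which supportsMulticore() should report, whereas "multisession" is expected to start separate background R sessions.

```r
library(future)
library(future.apply)

# Forked ("multicore") processing is not supported in every environment;
# this reports whether the current session allows it.
multicore_ok <- supportsMulticore()

# Start two background R sessions (2 is an arbitrary choice for this check).
plan(multisession, workers = 2)
n_workers <- nbrOfWorkers()

# Ask each task to report the ID of the process it ran in.
worker_pids <- unlist(future_lapply(1:4, function(i) Sys.getpid()))

print(multicore_ok)         # is forking allowed in this session?
print(n_workers)            # number of workers the current plan provides
print(unique(worker_pids))  # process IDs that actually did the work
print(Sys.getpid())         # process ID of the main R session

# Return to sequential processing, which also shuts the workers down.
plan(sequential)
```

If the worker process IDs differ from the main session's ID, the backend is distributing work, and the single busy core is more likely down to the function being called (or the chosen test.use) than to the plan. If they all match the main session, the plan is probably not taking effect in that environment.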
