
Restore %dopar% functionality for the main processing loop of Seurat objects. #9

Closed
inmanjm opened this issue Jun 8, 2020 · 3 comments


inmanjm commented Jun 8, 2020

foreach() %dopar% is failing sporadically. Determine the cause of these failures.


inmanjm commented Jun 8, 2020

The core reservation is set up as follows:

registerDoParallel( cores=num_cores )

Additional multicore options are set up as:

mcoptions=list( silent=TRUE )

The loop is entered as follows:

## Ugh! Sometimes with preschedule=F, %dopar% will finish but only process the first job.
## Until that gets straightened out, change to %dopar% without preschedule=F,
## but without it %dopar% sometimes has a job die, in which case settle for %do%. :(
## Note to self: Might need to specify a particular number of cores based on the number of files?
#foreach ( seurat_filepath=iter( seurat_files ), .options.multicore=c( mcoptions, list( preschedule=FALSE ) ) ) %dopar% {
foreach ( seurat_filepath=iter( seurat_files ) ) %dopar% {
#foreach ( seurat_filepath=iter( seurat_files ) ) %do% {
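
For reference, a minimal self-contained version of this setup might look like the sketch below. num_cores, seurat_files, and the per-file body are placeholders rather than names from the actual pipeline, and the tryCatch is one possible way to surface an R-level error in a job instead of losing it silently. (Note that the active call above never passes mcoptions, so silent=TRUE only takes effect on the commented-out variant, and tryCatch cannot help if a worker process is killed outright, e.g. by the OOM killer.)

library( foreach )
library( iterators )
library( doParallel )

num_cores <- 4                                    # placeholder; in practice taken from the job reservation
registerDoParallel( cores=num_cores )
mcoptions <- list( silent=TRUE, preschedule=FALSE )

seurat_files <- list.files( "objects", pattern="\\.rds$", full.names=TRUE )  # placeholder input list

results <- foreach ( seurat_filepath=iter( seurat_files ),
                     .options.multicore=mcoptions ) %dopar% {
    ## Catch R-level errors so one failed file comes back as a message
    ## instead of aborting the whole loop ( .errorhandling="stop" is the default ).
    tryCatch( {
        sobj <- readRDS( seurat_filepath )        # placeholder for the real per-object processing
        ## ... process sobj here ...
        sprintf( "%s: ok", seurat_filepath )
    }, error=function( e ) sprintf( "%s: %s", seurat_filepath, conditionMessage( e ) ) )
}

Since foreach's .errorhandling defaults to "stop", a single uncaught error is enough to abort the entire %dopar%, which is consistent with the "job die" behavior described in the comment above.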


inmanjm commented Jun 30, 2020

This appears to have resolved itself as quickly as it appeared. Closing for now, but it might have to be re-opened in the future.

inmanjm closed this as completed Jun 30, 2020

inmanjm commented Aug 7, 2020

I've somewhat serendipitously discovered that the original issue of %dopar% failing on a single job is very likely due to insufficient memory on the node the R script is running on. When running under %do%, the script consistently failed on the same step every time (FindAllMarkers for a particular group). I ran it a couple of times from an sinteractive node that didn't request enough memory, and it failed the same way each time. I then started a new sinteractive session with more memory allocated (64 GB vs. 48 GB) and the entire job was able to complete, including a rerun where I switched back to %dopar%. (This particular process used a peak of just over 58 GB.)

I'm going to bump up the default requested memory for the R script step in the pipeline, which should reduce how often this is seen. It may or may not be worth investigating how to predict the amount of memory needed for this step and request it accordingly.
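
For what it's worth, one way to check the memory hypothesis from inside R is to reset and then read gc()'s high-water mark around the suspect step. A minimal sketch, assuming an already-loaded Seurat object named sobj (FindAllMarkers is the Seurat call named above; the rest is illustrative, and this only measures the R heap, not total process memory):

library( Seurat )

gc( reset=TRUE )                   # zero the "max used" counters before the suspect step
markers <- FindAllMarkers( sobj )  # sobj: placeholder for the already-loaded Seurat object
mem <- gc()                        # matrix with a "max used" column followed by its (Mb) column
peak_mb <- sum( mem[ , which( colnames( mem ) == "max used" ) + 1 ] )  # Ncells + Vcells, in Mb
message( "Peak R heap since reset: ", round( peak_mb / 1024, 1 ), " GB" )

If the peak routinely lands near the node's allocation, that would support either bumping the default request (as proposed) or sizing the request from the input data.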
