
Parallelism and co-expression inference speed #36

Open · TheAustinator opened this issue Oct 1, 2023 · 4 comments

@TheAustinator

Hi Liang, I'm attempting to run gene network inference with 100 seed genes on 92 cores by passing ncore=92 to the nebula function and running the 100 differential expression models in sequence. Presumably, each seed gene is being tested against all genes in the counts matrix in parallel. However, I'm finding that the model's utilization of parallelism is very low, and I'm wondering whether there's a way to improve it. Here are my ideas, in order of increasing optimization:

  1. Of course, I could parallelize across seed genes by kicking off many R jobs, but each job would have to load the entire counts matrix, which would quickly exhaust the machine's memory.
  2. For now, I'll chunk the counts matrix into 92 subsets along the gene axis and parallelize across the chunks rather than relying on the ncore parameter (see the sketch after the parameters below).
  3. Is there a bottleneck that could be removed in nebula's built-in parallelization?
  4. Could nebula be parallelized both across the counts-matrix genes (assuming that's how it's parallelized now) and simultaneously across seed genes for the specific case of co-expression/GRN inference, so that each (seed gene, counts-matrix gene) pair gets its own thread?

Here are my parameters; please let me know if there's a better set for high-fidelity co-expression inference that would run faster.
Data: Analyzing 10827 genes with 4 subjects and 8003 cells.
Params: kappa=200, ncore=64, model="NBLMM", method="LN"
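
A minimal sketch of the chunking idea from point 2, assuming a genes x cells matrix `counts`, a per-cell subject-ID vector `sid`, and a design matrix `df` (all hypothetical placeholder names), with the parameters above:

```r
library(nebula)
library(parallel)

# Split the gene rows into roughly equal chunks.
n_chunks <- 92
gene_idx <- seq_len(nrow(counts))
chunks   <- split(gene_idx, cut(gene_idx, breaks = n_chunks, labels = FALSE))

# One single-threaded nebula fit per chunk; parallelism comes from mclapply.
results <- mclapply(chunks, function(idx) {
  nebula(count = counts[idx, , drop = FALSE], id = sid, pred = df,
         model = "NBLMM", method = "LN", kappa = 200, ncore = 1)
}, mc.cores = n_chunks)
```

Because mclapply forks the parent process, the counts matrix is shared copy-on-write rather than reloaded per job, which sidesteps the memory blow-up from point 1.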

Low parallelism utilization:
[Screenshot: htop showing mostly single-core usage, 2023-09-30]

@TheAustinator (Author)

Update: I parallelized over gene-chunks of the counts matrix and efficiency skyrocketed. I'm a Python guy, so I wrote a wrapper with joblib, which I'll share once I've cleaned it up. But I'm sure many users would love a solution built into the nebula package, so even though I've solved the problem for myself, I'm happy to help brainstorm further.

[Screenshot: htop showing near-full multi-core utilization, 2023-10-01]
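
For anyone stitching the per-chunk outputs back together, a minimal sketch, assuming `results` is the list of per-chunk nebula returns from a run like the one sketched above and that each element carries nebula's usual `summary` data frame and `convergence` vector:

```r
# Recombine per-chunk nebula outputs into genome-wide tables.
summary_all     <- do.call(rbind, lapply(results, `[[`, "summary"))
convergence_all <- unlist(lapply(results, `[[`, "convergence"))
```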

@lhe17 (Owner)

lhe17 commented Oct 2, 2023 via email

@TheAustinator (Author)

I'm just running on a large Ubuntu AWS EC2 instance (a single machine with 192 cores). My Python wrapper may have obscured parallelism-related error messages from R: I pipe stdout and stderr to log files, which showed no sign of errors, but if R routes its logging elsewhere, I could have missed them. However, when using nebula's standard parallelization and watching htop (the application in the screenshots) for a while, I occasionally saw all cores fire up for a split second before returning to single-core usage.

Is there a chance that a single-threaded bottleneck is taking most of the time, with the parallel parts finishing quickly? Are you parallelizing over genes, or over something else? If your CPU utilization is near 100% (or the CPU load is roughly equal to the number of cores) when you run it, then it could just be something about my system.
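
One quick way to check for such a serial bottleneck is to time the same nebula call at different core counts on a small gene subset; a rough sketch, with `counts_sub`, `sid`, and `df` as hypothetical placeholder inputs. If elapsed time barely improves with more cores, the run is dominated by a single-threaded section:

```r
library(nebula)

# `counts_sub`: a few hundred genes so each run finishes quickly.
t1  <- system.time(nebula(count = counts_sub, id = sid, pred = df,
                          model = "NBLMM", method = "LN", ncore = 1))
t64 <- system.time(nebula(count = counts_sub, id = sid, pred = df,
                          model = "NBLMM", method = "LN", ncore = 64))
cat("speedup:", t1["elapsed"] / t64["elapsed"], "\n")
```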

Let me know if you're interested in the Python wrapper, although if this isn't just an issue with my system, I imagine you'd be more interested in an R fix. Happy to help if there's anything else I can do!

Cheers,
Austin

@lhe17 (Owner)

lhe17 commented Oct 4, 2023 via email
