Parallelism and co-expression inference speed #36
Update: I parallelized over gene-chunks of the counts matrix and efficiency skyrocketed. I'm a Python guy, so I wrote a wrapper with joblib, which I'll share once I get it cleaned up. But I'm sure many users would love to have a solution built into the nebula package, so even though I've solved the problem for myself, I'm happy to help brainstorm further.

[Screen Shot 2023-10-01 at 3 37 09 PM](https://user-images.githubusercontent.com/30134334/271854781-b4b6acfe-a267-4fbb-bdb5-91d5db54d691.png)
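(Austin's wrapper is in Python with joblib; for readers who want to stay in R, a minimal sketch of the same gene-chunking idea using `parallel::mclapply` could look like the following. The object names `counts`, `sid`, and `pred` are placeholders, not from this thread.)

```r
# Gene-chunk parallelism: split the rows (genes) of the counts matrix into one
# chunk per worker, fit each chunk in its own forked process with ncore=1,
# then stitch the per-chunk summary tables back together.
library(nebula)
library(parallel)

n_workers <- 64
chunks <- split(seq_len(nrow(counts)),
                cut(seq_len(nrow(counts)), n_workers, labels = FALSE))

res_list <- mclapply(chunks, function(idx) {
  nebula(counts[idx, , drop = FALSE], id = sid, pred = pred,
         ncore = 1, model = "NBLMM", method = "LN")
}, mc.cores = n_workers)

# combine the per-chunk coefficient tables
summary_all <- do.call(rbind, lapply(res_list, `[[`, "summary"))
```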
Hi Austin,

You are a genius.

I am still curious why ncore=64 in nebula does not work on your side. So, from your screenshot, is there only one thread running instead of 64? Did you get a warning message from nebula like "The specified ncore is larger than the number of available cores. The detected number of cores is used instead.\n"?

Are you running the job on a cluster with some job scheduler such as Grid Engine? If you use a job scheduler, you need to request the number of available cores explicitly when submitting the job.

Best regards,
Liang
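(A quick sanity check before suspecting nebula itself: ask R how many cores it actually sees. The `parallelly` package is a suggestion here, not something used in this thread.)

```r
# How many cores does this R session detect?
parallel::detectCores()        # logical cores reported by the OS

# Under a job scheduler (e.g. Grid Engine, SLURM), this respects the
# scheduler's allocation rather than the raw hardware count:
parallelly::availableCores()
```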
I'm just running on a large Ubuntu AWS EC2 instance (so just a single machine with 192 cores). My Python wrapper may have obscured parallelism-related error messages from R -- I do pipe stdout and stderr to log files, which didn't show any sign of error, but if R routes logging elsewhere, I could have missed it. However, when using the standard nebula parallelization, if I watched htop (the application in the screenshots) for a while, I occasionally saw all cores fire up for a split second before a return to single-core.

Is there a chance that there's a single-threaded bottleneck that's taking most of the time, with the parallel parts finishing quickly? Are you parallelizing over genes or something else? Or if your CPU utilization is near 100% (or CPU load is roughly equal to the number of cores) when you're running it, then it could just be something about my system.

Let me know if you're interested in the Python wrapper, although if this isn't just an issue with my system, I imagine you'd be more interested in an R fix. And happy to help if there's anything else I can do!

Cheers,
Austin
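(One way to test the single-threaded-bottleneck hypothesis: time the same call at `ncore=1` and `ncore=64` on a small gene subset; a speedup far below 64 points at a large serial fraction, per Amdahl's law. `counts`, `sid`, and `pred` are placeholder names.)

```r
# Rough timing experiment to estimate the effective parallel speedup.
sub <- counts[1:500, ]  # a small gene subset keeps the experiment quick

t1  <- system.time(nebula(sub, id = sid, pred = pred, ncore = 1))["elapsed"]
t64 <- system.time(nebula(sub, id = sid, pred = pred, ncore = 64))["elapsed"]

unname(t1 / t64)  # observed speedup; near 1 means serial work dominates
```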
Hi Austin,

Thank you for your additional information.

I guess I know what the problem is. It is because the AWS EC2 instance is a virtual machine. It looks like a single local machine, but it actually consists of computing nodes. The current implementation of nebula only supports parallel computing on a multi-core local machine. Adding support for cloud nodes is now on my to-do list for future versions.

Best regards,
Liang
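(A quick way to tell whether this is nebula-specific or machine-wide: run a plain fork-based workload and watch htop. This diagnostic is a suggestion, not something from the thread.)

```r
# Does plain fork-based parallelism saturate the cores on this machine?
library(parallel)

# 192 CPU-bound tasks, one per core on the instance described above;
# all cores should light up in htop while this runs if forking works.
invisible(mclapply(1:192, function(i) sum(rnorm(1e7)), mc.cores = 192))
```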
Hi Liang,

I'm attempting to run gene network inference with 100 seed genes on 92 cores via the argument `ncore=92` to the `nebula` function, running each of the 100 differential expression models in sequence. Presumably, the seed gene is being tested against all genes in the counts matrix in parallel. However, I'm finding that the model's utilization of parallelism is very low, and I'm wondering if there's a way I could improve this. Here are the ideas I have in order of increasing optimization:

- … the `ncore` parameter …

Here are my parameters -- please let me know if there's a better set of parameters for high-fidelity co-expression inference that will run faster.

Data: Analyzing 10827 genes with 4 subjects and 8003 cells.
Params: kappa=200, ncore=64, model="NBLMM", method="LN"
Low parallelism utilization:
[screenshot]
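(For concreteness, a hedged sketch of the workflow described above: 100 seed-gene models fit in sequence, relying on nebula's `ncore` for within-model parallelism. The object names `counts`, `subject_id`, `design`, and `seed_genes`, and the use of the seed gene's expression as an extra covariate, are assumptions, not details confirmed in this thread.)

```r
# Hedged sketch: one NBLMM model per seed gene, run sequentially;
# counts, subject_id, design, and seed_genes are placeholder names.
library(nebula)

results <- vector("list", length(seed_genes))
names(results) <- seed_genes

for (g in seed_genes) {
  # seed gene's (log-transformed) expression as an extra covariate -- an assumption
  pred_g <- cbind(design, seed = log1p(counts[g, ]))
  results[[g]] <- nebula(counts, id = subject_id, pred = pred_g,
                         ncore = 92, kappa = 200, model = "NBLMM", method = "LN")
}
```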