error modeling knn takes up all cores #51
Comments
- The number of cores might have to do with the environment, but just in case, could you please provide the exact command you're using?
- The runtime scales linearly with the number of cells and also depends on the number of genes. Filtering out low-expressed genes from the matrix should speed up calculations considerably.
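The gene-filtering suggestion above could look roughly like the following. This is a minimal sketch in base R on simulated counts; the matrix, the thresholds `min_cells` and `min_total`, and all variable names are assumptions made for illustration, not values recommended in this thread.

```r
# Simulated UMI-like count matrix: genes in rows, cells in columns.
# (Hypothetical data; the real matrix in this thread is ~13k genes x ~4k cells.)
set.seed(1)
n_genes <- 200
n_cells <- 50
gene_means <- rep(c(0.02, 2), each = n_genes / 2)  # first half is barely expressed
counts <- matrix(rpois(n_genes * n_cells, lambda = gene_means),
                 nrow = n_genes, ncol = n_cells,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("cell", seq_len(n_cells))))

# Hypothetical thresholds: keep genes detected in at least `min_cells` cells
# and with at least `min_total` counts summed over all cells.
min_cells <- 5
min_total <- 10
keep <- rowSums(counts > 0) >= min_cells & rowSums(counts) >= min_total
filtered <- counts[keep, , drop = FALSE]

cat(sprintf("kept %d of %d genes\n", nrow(filtered), nrow(counts)))
```

Removing only genes with zero expression everywhere is a much weaker filter than this; how strict the thresholds should be depends on the dataset.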
On Sep 14, 2017, at 10:41 AM, dtm2117 wrote:
Hello,
I am trying to run the error modeling step on a sample with thousands of cells. I've increased k and min.nonfailed, but I find that even if I set n.cores = 1, I have threads running on every core of the cluster. Furthermore, the error modeling step has never finished.
Any thoughts on this issue?
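Here is the command:
knn <- knn.error.models(cd_new_nodup, k = ncol(cd)/2, n.cores = 12, min.count.threshold = 1, min.nonfailed = 20, max.model.plots = 10)
The matrix has ~13k genes and ~4k cells. I've filtered out any genes that have no expression.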
This is also UMI data.
For the runtime issue, I think k needs to be lowered considerably, to something like 50 or 100. It just needs a sufficient number of neighboring cells to calculate the few parameters of the error model. 2k cells would definitely be overkill for that.
Not sure about the number of cores though.
Can you see if doing something like
pagoda:::papply( 1:1e2, function(x) rnorm(1e3), n.cores=12)
also uses too many cores?
Best,
-peter.
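As a rough illustration of the advice above, the neighborhood size can be capped instead of growing with the number of cells. The helper `choose_k` and its `cap` argument are hypothetical names introduced here; only the range of 50 to 100 comes from this thread.

```r
# Hypothetical helper: use half the cells as neighbors, but never more than `cap`.
choose_k <- function(n_cells, cap = 100) {
  as.integer(min(cap, floor(n_cells / 2)))
}

k_original <- 4000 / 2      # what k = ncol(cd)/2 yields for ~4k cells
k_capped <- choose_k(4000)  # capped at 100

cat(sprintf("k goes from %d to %d\n", as.integer(k_original), k_capped))

# The capped value would then replace k in the call quoted in this thread, e.g.:
# knn <- knn.error.models(cd_new_nodup, k = k_capped, n.cores = 12,
#                         min.count.threshold = 1, min.nonfailed = 20,
#                         max.model.plots = 10)
```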
On Sep 21, 2017, at 12:12 PM, dtm2117 wrote:
Here is the command:
knn <- knn.error.models(cd_new_nodup, k = ncol(cd)/2, n.cores = 12, min.count.threshold = 1, min.nonfailed = 20, max.model.plots = 10)
dim of the matrix is ~ 13k genes but 4k cells. I've filtered out any genes that have no expression.
Ok, I will try this. The scde help page says that k may need to be increased for 1000s of cells, which is why I kept the denominator low. I can't run that command because the pagoda library is apparently not installed. I thought it was installed along with the SCDE package?
When running scde:::papply( 1:1e2, function(x) rnorm(1e3), n.cores=12), it finishes in about 2 seconds, so I can't tell the core usage.
Yes, I meant scde package. You'd need to increase the rnorm argument to take some sizeable amount of time.
-peter.
On Sep 21, 2017, at 12:37, dtm2117 wrote:
When running scde:::papply( 1:1e2, function(x) rnorm(1e3), n.cores=12)
It finishes in about 2 seconds, so I can't tell the core usage.
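A longer-running version of this test might look like the sketch below. It uses `parallel::mclapply` as a stand-in, since the internals of `scde:::papply` are not shown in this thread; the task count, `draws_per_task`, and `n_workers` are arbitrary values chosen for illustration.

```r
library(parallel)

# mclapply forks worker processes; forking is unavailable on Windows,
# so fall back to a single worker there.
n_workers <- if (.Platform$OS.type == "windows") 1L else 2L

# Each task draws many normals so that it runs long enough to observe.
# 1e6 is an arbitrary choice; raise it until each task lasts a few seconds.
draws_per_task <- 1e6

elapsed <- system.time(
  res <- mclapply(1:8, function(x) mean(rnorm(draws_per_task)),
                  mc.cores = n_workers)
)["elapsed"]

cat(sprintf("%d tasks on %d worker(s): %.2f s elapsed\n",
            length(res), n_workers, elapsed))
```

While it runs, a process monitor such as `top` in a second terminal shows how many cores are actually busy.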
After increasing the rnorm argument, it seems to be running on 12 cores only.
While scde:::papply( 1:1e2, function(x) rnorm(1e3), n.cores=12) runs on only 12 cores, the knn.error.models step still uses all cores. Any ideas why there is a discrepancy?