error in optimize #1

mcieslik-mctp · 2017-07-05T01:17:28Z

I tried running SAVER on a dataset of ~1500 cells. After approx. 24h on 64 cores the program crashed with the following message:

Error in optimize(calc.loglik.a, interval = c(0, var(y/sf)/mean(y/sf)^2),  :
  invalid 'xmin' value
In addition: Warning message:
In matrix(gene.means, ngenes, ncells) :
  data length [16562] is not a sub-multiple or multiple of the number of rows [16710]
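
As a hedged illustration of one way this message can arise (not a diagnosis of SAVER itself): if a gene has zero counts in every cell, var(y/sf)/mean(y/sf)^2 evaluates to 0/0 = NaN, and optimize() rejects the interval because its lower bound defaults to min(interval), which is then NaN. The toy vectors and the objective function below are hypothetical stand-ins, not SAVER's calc.loglik.a.

```r
# Toy data (hypothetical): one gene with zero counts in all cells.
y  <- c(0, 0, 0, 0)      # counts for the gene
sf <- c(1, 1.2, 0.8, 1)  # size factors

# Upper end of the search interval, as in the error message: 0 / 0 gives NaN.
upper <- var(y / sf) / mean(y / sf)^2

# Placeholder objective that stands in for the real log-likelihood.
toy_objective <- function(a) (a - 1)^2

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    optimize(toy_objective, interval = c(0, upper))
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# A simple guard: only call optimize() when the upper bound is usable.
safe_to_optimize <- is.finite(upper) && upper > 0
```

Checking is.finite() on the interval bounds before calling optimize() is one simple way to skip or special-case such genes.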

Thanks!


MaxKman · 2017-07-11T15:42:09Z

I encountered a similar error message just a few minutes after starting the algorithm. I used a very sparse dataset with ~10000 cells.

argument is not numeric or logical: returning NA
argument is not numeric or logical: returning NA
argument is not numeric or logical: returning NA
argument is not numeric or logical: returning NA
argument is not numeric or logical: returning NA
Predictions finished. Calculating posterior...
argument is not numeric or logical: returning NA
Error in optimize(calc.loglik.b, interval = c(0, var(y/sf)/mean(y/sf)), : invalid 'xmin' value

Help appreciated. Thanks!

mohuangx · 2017-07-13T15:57:38Z

Sorry for the late response. I will take a look and will let you know when it is fixed.

mohuangx · 2017-07-13T21:03:27Z

SAVER v0.1.3 should be able to solve the issue. Please let me know if this same problem occurs.

MaxKman · 2017-07-14T09:21:31Z

Thanks very much for addressing the issue. I tried to update the package, but whatever I did, packageVersion("SAVER") always returned 0.1.2. I am running Microsoft R Open 3.4.0.

I tried the following, each time after removing the package and restarting the session:

install_github("mohuangx/SAVER")
install_github("mohuangx/SAVER@v0.1.3")
downloading the tar.gz from https://github.com/mohuangx/SAVER/releases and installing it with install.packages(path_to_file, repos = NULL, type="source")

None of it worked.
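
A side note on checking which build is installed: packageVersion() only reports the Version field of the installed DESCRIPTION file, so it cannot distinguish two commits that carry the same version number. Packages installed from GitHub through devtools/remotes usually also record fields such as RemoteRef and RemoteSha. The helper below is a hypothetical sketch (install_info is not part of any package), and it is demonstrated on the base package "stats" so that it runs anywhere; for this thread one would pass "SAVER" instead.

```r
# Hypothetical helper: collect version and GitHub install metadata for a package.
install_info <- function(pkg) {
  desc <- utils::packageDescription(pkg)
  # Return NA when a field is absent (e.g. for CRAN or base packages).
  get_field <- function(name) {
    value <- desc[[name]]
    if (is.null(value)) NA_character_ else as.character(value)
  }
  c(
    version    = as.character(utils::packageVersion(pkg)),
    remote_ref = get_field("RemoteRef"),
    remote_sha = get_field("RemoteSha")
  )
}

# Demonstration on a base package; the remote fields are NA here.
info <- install_info("stats")
print(info)
```

If remote_sha changes after reinstalling while the version stays the same, the new code was installed even though the version number was not bumped.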

EDIT: I found that the version name is not updated in the DESCRIPTION file so the installation probably worked but didn't solve the problem. I am still getting:

argument is not numeric or logical: returning NA
argument is not numeric or logical: returning NA
argument is not numeric or logical: returning NA
argument is not numeric or logical: returning NA
argument is not numeric or logical: returning NA
Predictions finished. Calculating posterior...
argument is not numeric or logical: returning NA
Error in optimize(calc.loglik.b, interval = c(0, var(y/sf)/mean(y/sf)), : invalid 'xmin' value

I sent you my dataset via email to make it easier to address the issue.

Best
Max

mohuangx · 2017-07-14T13:33:40Z

Hi Max,

Apologies. I forgot to update the version number in the DESCRIPTION file. However, it is still concerning that the issue is not resolved. I will try to run it on your dataset and will let you know how it goes.

Mo

mohuangx · 2017-07-24T14:38:40Z

Hi Max,

Sorry for the lengthy turnaround. I updated the package to version 0.2.0, which I was able to run without errors on your dataset. Let me know if you're able to run it as well.

Mo

MaxKman · 2017-07-24T16:08:31Z

Hey Mo, thank you so much for addressing the issue. I can't quite get it to work yet. I run the following commands:

library(doParallel)
library(SAVER)
my.data <- read.delim("star_gene_exon_tagged.dge.10681cells.txt.gz", header = TRUE)
rownames(my.data) <- my.data[,1]
my.data <- my.data[,2:ncol(my.data)]
registerDoParallel(cores = 8)
my.data.normalized <- saver(my.data, parallel = TRUE)

It returns the following:

Removing 3 cells with zero expression.
Calculating predictions...
number of observations in y (1) not equal to the number of rows of x (10678)
argument is not numeric or logical: returning NA
number of observations in y (1) not equal to the number of rows of x (10678)
argument is not numeric or logical: returning NA
number of observations in y (1) not equal to the number of rows of x (10678)
argument is not numeric or logical: returning NA
number of observations in y (1) not equal to the number of rows of x (10678)
argument is not numeric or logical: returning NA
number of observations in y (1) not equal to the number of rows of x (10678)
argument is not numeric or logical: returning NA
Error in if (var(mu) == 0) { : missing value where TRUE/FALSE needed

I run Microsoft R Open 3.4.0. Since it worked in your environment, maybe there is an issue with the way I import the data?

Best regards
Max


mohuangx · 2017-07-24T16:22:37Z

Hi Max,

Try running the saver function on as.matrix(my.data), i.e.,
my.data.normalized <- saver(as.matrix(my.data), parallel = TRUE)
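
For context, read.delim() returns a data.frame, which is a list of columns rather than a numeric matrix, and the earlier call passed that data.frame straight to saver(). A minimal sketch of the conversion with a few sanity checks, using a hypothetical toy table in place of the real file:

```r
# Hypothetical toy table standing in for the output of read.delim():
# first column holds gene names, remaining columns hold per-cell counts.
df <- data.frame(
  gene  = c("A1BG", "A2M", "AAAS"),
  cell1 = c(0L, 3L, 1L),
  cell2 = c(2L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Move gene names to row names and drop the name column.
rownames(df) <- df$gene
df <- df[, -1, drop = FALSE]

# Convert to a matrix and verify that it is numeric with names preserved.
mat <- as.matrix(df)
stopifnot(is.matrix(mat), is.numeric(mat))
print(dim(mat))
print(rownames(mat))
```

If any column contains non-numeric entries, as.matrix() produces a character matrix, so the is.numeric() check is worth keeping before a long run.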

Mo

MaxKman · 2017-07-24T16:51:11Z

Thanks for pointing that out! I must have changed that somewhere along the way while playing around with the function. Now it's running without throwing an error so far.


MaxKman · 2017-07-26T08:38:10Z

How long did my dataset run on your machine? It has been running for 40h now on 8 cores. Thanks!


mcieslik-mctp · 2017-07-26T10:59:56Z

Thanks! Our initial tests indicate that the problem is resolved. @MaxKman, FYI: it takes approx. 24h on 64 cores with ~8k detected genes.

mohuangx · 2017-07-26T15:40:10Z

Hi Max,

I only ran it on 100 genes on 10 cores, which took about 3 hours (although the posterior calculation was performed on all ~20,000 genes). I would guess it might take around 60-80 hours running on 8 cores. I'm currently working on ways to speed up the program so look out for improvements in the coming versions!

MaxKman · 2017-08-07T06:42:23Z

Thank you for all the help so far. I let saver run over the weekend on 10 cores. When I returned to work I found the following error:

Error in `rownames<-`(`*tmp*`, value = c("A1BG", "A1BG-AS1", "A2M", "A2M-AS1", :
  length of 'dimnames' [1] not equal to array extent

Help appreciated!

Best
Max

mohuangx · 2017-08-07T13:57:46Z

Hi Max,

Sorry for the inconvenience. Could you provide the command that you used to run saver and the version?

Thanks,
Mo

MaxKman · 2017-08-08T06:15:44Z

Hey Mo,

I used saver 0.2.0 and the following commands:

cells.data <- read.delim("10681cells.txt.gz", header = TRUE)
rownames(cells.data) <- cells.data[,1]
cells.data <- cells.data[,2:ncol(cells.data)]

library(doParallel)
library(SAVER)
registerDoParallel(cores = 10)
cells.normalized <- saver(as.matrix(cells.data), parallel = TRUE)
save(cells.normalized, file="cells.normalized.rData")

The output was the following:

Removing 3 cells with zero expression.
Calculating predictions...
Approximate finish time: 2017-08-05 16:10:46
Running in parallel: 10 workers
Loading required package: Matrix
Loaded glmnet 2.0-10
Error in `rownames<-`(`*tmp*`, value = c("A1BG", "A1BG-AS1", "A2M", "A2M-AS1", :
  length of 'dimnames' [1] not equal to array extent

nicolee-mctp · 2017-08-08T14:06:49Z

Hi Mo,

I'm a collaborator of OP, and we ran into another error with SAVER. We got it to work on one dataset fine a couple weeks ago, but when we tried to use it on another dataset, we got the following error:

library(SAVER)
library(doParallel)
registerDoParallel(cores = 64)
saver <- saver(as.matrix(mat), parallel = TRUE)
calculating predictions...
Approximate finish time: 2017-08-08 05:09:55
Running in parallel: 64 workers
Loading required package: Matrix
Loaded glmnet 2.0-10

Predictions finished. Calculating posterior...
Error in mu[pred.genes, ] <- mu.par : 
  number of items to replace is not a multiple of replacement length

Any insights on how to fix this? Thanks!

Nicole

mohuangx · 2017-08-08T16:06:39Z

Hi Max,

I ran SAVER 0.2.1 on a subset of the dataset and was able to get it to run without any errors. Could you try running it on a subset using your current version SAVER 0.2.0 to see if you get the same error and then try updating to SAVER 0.2.1?
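
A minimal sketch of building such a test subset with base R, assuming a genes-by-cells count matrix. The toy matrix and the helper name subset_counts are hypothetical; cells with zero total expression are dropped because the saver output earlier in this thread reports removing them.

```r
set.seed(1)

# Hypothetical toy count matrix: 200 genes x 50 cells, sparse like scRNA-seq.
counts <- matrix(
  rpois(200 * 50, lambda = 0.3),
  nrow = 200,
  dimnames = list(paste0("gene", 1:200), paste0("cell", 1:50))
)

# Draw a random subset of genes and cells, then drop cells whose
# counts are all zero within the subset.
subset_counts <- function(x, n_genes, n_cells) {
  genes <- sample(nrow(x), min(n_genes, nrow(x)))
  cells <- sample(ncol(x), min(n_cells, ncol(x)))
  sub <- x[genes, cells, drop = FALSE]
  sub[, colSums(sub) > 0, drop = FALSE]
}

small <- subset_counts(counts, n_genes = 20, n_cells = 10)
print(dim(small))
```

A random subset finishes quickly, but it may not reproduce an error that depends on specific genes or on the full data size, so a clean subset run is only a partial check.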

Sorry again for the repeated issues.

Mo

mohuangx · 2017-08-08T16:08:11Z

Hi Nicole,

It appears that you're using an older version of SAVER. Try reinstalling SAVER and run it on a subset of the dataset to see if it works, and if it works then try running it on the full dataset. Please let me know if you are still getting an error.

Mo

MaxKman · 2017-08-12T11:22:49Z

Hi Mo,

I ran a small subset of my dataset with SAVER 0.2.1 and it went through fine. After that I tried the whole dataset, and it ran for 5 days on 10 cores before finally returning an error. See below:

Removing 3 cells with zero expression.
Calculating predictions for 11841 genes using 10678 cells...
Approximate finish time: 2017-08-11 01:17:54
Running in parallel: 10 workers
Loading required package: Matrix
Loaded glmnet 2.0-10

Error in out[[i]][lasso.genes, ] <- lasso[[i]] :
number of items to replace is not a multiple of replacement length

save(cells.normalized, file="cells.normalized.rData")
Error in save(cells.normalized, file = "cells.normalized.rData") :
object ‘cells.normalized’ not found

The commands I used are the same as posted above.

mohuangx · 2017-08-12T13:14:17Z

Hi Max,

Sorry for the repeated errors. I will try running it on the entire dataset and will get back to you when it's finished.

Mo

MaxKman · 2017-08-12T17:06:00Z

Great, thank you!

mohuangx · 2017-08-13T21:46:32Z

Hi Max,

I updated SAVER to version 0.2.2 and was able to run it on your dataset without any problems. Hopefully it will finally work for you.

Mo

MaxKman · 2017-08-13T22:41:57Z

Hi Mo,

thanks very much! I will attempt another run tomorrow. Did you by any chance save the results from your run, and could you send them to me?

Best
Max

mohuangx · 2017-08-14T15:42:24Z

Hi Max,

Sure, I will email you a link.

Mo

MaxKman · 2017-08-23T11:55:18Z

Hi Mo,

this time everything worked without throwing an error. I had to set nzero = 50 like you did for it to work though. Thank you for your help!

Best
Max

mohuangx · 2017-08-30T18:01:51Z

Hi Max,

Thanks for the response. Did it not work when nzero was not specified for SAVER version 0.2.2?

Thanks,
Mo

MaxKman · 2017-08-31T10:59:30Z

Exactly. Unfortunately I didn't log the error message this time, but from what I remember it was similar to the one before.

mohuangx · 2017-09-01T17:09:30Z

Thanks Max for bringing this to my attention. I'll try and see what the problem is.

nicolee-mctp · 2017-09-14T14:38:44Z

Hi Mo,

I got the same error as Max even though I set nzero (10 to match my other analyses). I verified that I was using version 0.2.2.

> mat <- read.csv("data/raw/counts.csv", row.names = 1)
> mat <- as.matrix(mat)
> library(SAVER)
> registerDoParallel(cores = 64) 
> saver <- saver(mat, parallel = TRUE, nzero = 10)

Calculating predictions for 19205 genes using 9393 cells...
Approximate finish time: 2017-09-13 13:05:06
Running in parallel: 64 workers
Loading required package: Matrix
Loaded glmnet 2.0-10

Error in out[[i]][lasso.genes, ] <- Reduce(rbind, lapply(lasso, `[[`,  : 
  number of items to replace is not a multiple of replacement length

mohuangx · 2017-09-14T15:33:11Z

Hi Nicole,

Sorry for the error. Do you mind sharing the dataset so that I can try to diagnose the problem?

Thanks,
Mo

nicolee-mctp · 2017-09-15T17:51:26Z

Hi Mo,

I emailed you yesterday with the dataset. Please let me know if you got it.

Thanks!
Nicole

mohuangx · 2017-09-15T18:02:09Z

Hi Nicole,

Thanks for emailing me the dataset! I'm currently running it and will hopefully identify the error soon.

Mo

fanli-gcb · 2017-09-15T22:00:36Z

Those types of errors are commonly seen in parallel computation when jobs die unexpectedly (i.e., due to lack of memory). Perhaps reducing the number of cores down from 64 would help?
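
To illustrate the failure pattern in a hedged way: if one worker returns nothing, the combined result has fewer rows than the slot it is assigned into, and R raises the same "number of items to replace" error. The per-worker results below are hypothetical toy data, not SAVER internals.

```r
# Pre-allocated output: 4 genes x 3 cells.
out <- matrix(0, nrow = 4, ncol = 3)

# Hypothetical per-worker results; the third worker died and returned NULL.
results <- list(c(1, 1, 1), c(2, 2, 2), NULL, c(4, 4, 4))

# rbind() silently drops NULL, so only 3 rows come back.
combined <- do.call(rbind, results)

# Assigning 9 values into 12 cells fails; capture the message.
msg <- tryCatch(
  {
    out[1:4, ] <- combined
    "ok"
  },
  error = function(e) conditionMessage(e)
)

# A shape check before assignment points at the failed job directly.
failed <- which(vapply(results, is.null, logical(1)))
print(msg)
print(failed)
```

Checking for missing results before combining them gives a clearer signal than the assignment error, and using fewer workers lowers the peak memory demand that can kill jobs in the first place.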

nicolee-mctp · 2017-09-15T22:20:45Z

@fanli-gcb Thanks for the suggestion! I'll try that now, hopefully that will fix the problem.

mohuangx · 2017-09-20T14:01:32Z

@fanli-gcb Thanks for pointing this out. Indeed, this seemed to be where the bottleneck was, since Reduce was being used to combine the list of lists, which is computationally intensive.

I have updated the combine function to use unlist instead, which is much faster. The changes can be found in SAVER version 0.3.0.
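
A rough sketch of the difference, on hypothetical toy data and not the actual SAVER combine function: Reduce(rbind, ...) copies the growing matrix at every step, while unlist() flattens everything once and the matrix is then built in a single call.

```r
# Hypothetical per-gene results: 5 row vectors of length 4.
rows <- lapply(1:5, function(i) as.numeric(i * 1:4))

# Incremental combine: each step copies the matrix built so far.
slow <- Reduce(rbind, rows)

# Single-pass combine: flatten once, then fill the matrix row by row.
fast <- matrix(unlist(rows), nrow = length(rows), byrow = TRUE)

# Both approaches give the same values (row names aside).
same <- isTRUE(all.equal(unname(slow), fast))
print(same)
```

With thousands of genes the repeated copying in the incremental version grows roughly quadratically, which is why the single-pass form tends to be much faster.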

@nicolee-mctp I ran your dataset with SAVER version 0.3.0 and was able to get results without any issues. I sent you an email with a link to the results.

mohuangx closed this as completed Nov 16, 2017
