Memory issue #2

Open
lzj1769 opened this issue Jan 17, 2020 · 3 comments

Comments

@lzj1769

lzj1769 commented Jan 17, 2020

Hi,

I got this error message when running scBFA on our dataset:

Error in asMethod(object) : 
  Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
Calls: scBFA ... getGeneExpr -> as.matrix -> as.matrix.Matrix -> as -> asMethod
Execution halted

Can you help me with this?

Best,
Zhijian

@gquon
Contributor

gquon commented Jan 19, 2020

Hi, can you give more details on the size of the dataset (number of cells, rows)? Any chance you could provide an anonymized version of your dataset? (you could delete the row and column names if you want).

@lzj1769
Author

lzj1769 commented Jan 31, 2020

Hi,

I uploaded my data here:
https://drive.google.com/file/d/1ESa7pcD6_uzZkrgWUeWxGelOtD2Kfsj-/view?usp=sharing

and below is my code to run scBFA:

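library(Seurat)  # CreateSeuratObject
library(scBFA)   # scBFA

# Load the count matrix and wrap it in a Seurat object
counts <- readRDS("scATAC.Rds")
x <- CreateSeuratObject(counts = counts)
# Fit the model with 30 factors
bfa_model <- scBFA(scData = x, numFactors = 30, method = "CG")
# Transpose the embedding so that cells are columns, then save it
zz <- as.matrix(t(bfa_model$ZZ))
colnames(zz) <- colnames(counts)
write.table(zz, file = "./scBFA.txt", quote = FALSE, sep = "\t")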

Thanks,
Li

@RuoxinLi
Contributor

Hi,

We are still working on the memory issue, since the number of sites in your raw data (>200,000) is quite large. We are now testing a mini-batch optimization procedure; it might take us 2-3 more weeks to get things working properly. It's worth noting that these visualizations typically involve some feature selection first (analogous to how highly variable genes are typically selected in scRNA-seq analyses before visualization). We've noticed that many of the sites are open in very few cells (<1%). Filtering to keep only sites that are open in at least 2-3% of cells will help a lot with memory.
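The filtering suggested above could be sketched roughly as follows. This is a minimal sketch, not part of scBFA: `filter_sites` and `dense_gib` are hypothetical helper names, the toy matrix only stands in for the real data, and it assumes that the count matrix is a sparse matrix from the Matrix package with sites as rows and cells as columns. The size helper estimates how large the matrix would be once converted to dense form, which is the `as.matrix` step shown in the traceback.

```r
library(Matrix)

# Hypothetical helper (not part of scBFA): keep only sites (rows) that are
# open, i.e. have a nonzero count, in at least `min_frac` of cells (columns).
filter_sites <- function(counts, min_frac = 0.02) {
  # Fraction of cells with a nonzero count for each site; stays sparse.
  open_frac <- Matrix::rowSums(counts != 0) / ncol(counts)
  counts[open_frac >= min_frac, , drop = FALSE]
}

# Hypothetical helper: approximate size in GiB of the matrix if it were
# stored densely as doubles (8 bytes per entry).
dense_gib <- function(m) prod(dim(m)) * 8 / 2^30

# Toy data standing in for the real matrix: 1000 sites x 200 cells,
# about 1% of entries nonzero.
set.seed(1)
toy <- rsparsematrix(1000, 200, density = 0.01,
                     rand.x = function(n) rep(1, n))

filtered <- filter_sites(toy, min_frac = 0.02)
cat("sites before:", nrow(toy), "after:", nrow(filtered), "\n")
cat("dense size before (GiB):", dense_gib(toy),
    "after:", dense_gib(filtered), "\n")
```

With the real data, the same call would be applied to `counts` before `CreateSeuratObject`, for example `counts <- filter_sites(counts, min_frac = 0.02)`. The threshold is a judgment call within the 2-3% range mentioned above.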

Best


3 participants