simulating a 2000cell*150000peak scATAC #3

Open
Chen-Li-17 opened this issue Nov 23, 2022 · 2 comments


Chen-Li-17 commented Nov 23, 2022

  • The simulation takes nearly a whole day, which is inconvenient.
  • I encountered the following error when trying to simulate a 2000-cell × 150000-peak scATAC dataset. Could you tell me how to fix it?
Input Data Construction Start

Warning message in asMethod(object):
“sparse->dense coercion: allocating vector of size 2.4 GiB”
Input Data Construction End

Start Marginal Fitting

Warning message in mclapply(seq_len(n), do_one, mc.preschedule = mc.preschedule, :
“scheduled cores 1, 2 did not deliver results, all values of the jobs will be affected”

Error in names(answer) <- dots[[1L]]: attempt to set an attribute on NULL
Traceback:

1. scdesign3(sce = sce_seurat, assay_use = "counts", celltype = "cell_type", 
 .     pseudotime = NULL, spatial = NULL, other_covariates = NULL, 
 .     mu_formula = "cell_type", sigma_formula = "1", family_use = "zip", 
 .     n_cores = 2, usebam = FALSE, corr_formula = "cell_type", 
 .     copula = "gaussian", DT = TRUE, pseudo_obs = FALSE, return_model = FALSE, 
 .     nonzerovar = FALSE)
2. fit_marginal(mu_formula = mu_formula, sigma_formula = sigma_formula, 
 .     n_cores = n_cores, data = input_data, family_use = family_use, 
 .     usebam = usebam, parallelization = parallelization, BPPARAM = BPPARAM)
3. suppressMessages(paraFunc(fit_model_func, gene = feature_names, 
 .     family_gene = family_use, mc.cores = n_cores, MoreArgs = list(dat_use = dat_cov, 
 .         mgcv_formula = mgcv_formula, mu_formula = mu_formula, 
 .         sigma_formula = sigma_formula, predictor = predictor, 
 .         count_mat = count_mat), SIMPLIFY = FALSE))
4. withCallingHandlers(expr, message = function(c) if (inherits(c, 
 .     classes)) tryInvokeRestart("muffleMessage"))
5. paraFunc(fit_model_func, gene = feature_names, family_gene = family_use, 
 .     mc.cores = n_cores, MoreArgs = list(dat_use = dat_cov, mgcv_formula = mgcv_formula, 
 .         mu_formula = mu_formula, sigma_formula = sigma_formula, 
 .         predictor = predictor, count_mat = count_mat), SIMPLIFY = FALSE)

@SONGDONGYUAN1994
Owner

Hi Lee,
Thanks for your interest in our work! The running time depends on your data dimensions, model settings, and the number of cores. If your data is very high-dimensional, using more cores (e.g., > 10) can reduce the time dramatically.
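
As a rough illustration of picking a core count before calling scdesign3(), here is a minimal sketch using the base `parallel` package. The cap of 16 is a hypothetical value chosen for this example, not a recommendation from the package, and whether more workers also raise memory use on your machine is something to check rather than assume.

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to 1.
available <- detectCores()
if (is.na(available)) available <- 1L

# Leave one core free for the system and cap at a hypothetical limit of 16.
n_cores <- max(1L, min(16L, available - 1L))

# n_cores can then be passed as the n_cores argument seen in the
# scdesign3() call from the traceback above.
print(n_cores)
```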

As for the error, it seems to come from the marginal regression model fitting step. Please check two things:

  1. Do you have any features that are all zeros?
  2. If you set family_use = 'poisson', does it work? ZIP is usually less stable.
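
The first check can be sketched as follows. This is only an illustration on a toy matrix: `counts`, `zero_peaks`, and `counts_filtered` are hypothetical names, and the toy data stands in for the real peaks-by-cells count matrix. It assumes the `Matrix` package, so the row sums are computed without converting the sparse matrix to dense.

```r
library(Matrix)

# Toy stand-in for a peaks x cells count matrix (hypothetical data).
# With a real SingleCellExperiment, the matrix could come from something like
# SummarizedExperiment::assay(sce_seurat, "counts").
set.seed(1)
m <- matrix(rpois(20 * 8, lambda = 2), nrow = 20, ncol = 8)
rownames(m) <- paste0("peak", seq_len(nrow(m)))
m[c(3, 7), ] <- 0  # force two all-zero peaks for the demonstration
counts <- Matrix(m, sparse = TRUE)

# Flag features whose counts are zero in every cell.
all_zero <- Matrix::rowSums(counts) == 0
zero_peaks <- rownames(counts)[all_zero]

# Drop the all-zero features before fitting.
counts_filtered <- counts[!all_zero, , drop = FALSE]

print(zero_peaks)
print(dim(counts_filtered))
```

For the second check, the only change to the call in the traceback would be `family_use = "poisson"` in place of `family_use = "zip"`.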

Without a reproducible example, I'm afraid I cannot say much more about the cause. You can email me at dongyuansong@ucla.edu, and we can set up a virtual meeting if it helps.

Best,
Dongyuan

@Chen-Li-17
Author

Dear Dongyuan,
Thank you for the reply. I'll rerun my code following your advice and report back soon.
