Error: scheduled core X did not deliver a result #160

Open · ArthurDondi opened this issue Jan 30, 2024 · 3 comments

@ArthurDondi commented Jan 30, 2024
Hi! I am running numbat via Singularity with 4 cores (128 Gb each) on a <500-cell dataset. It seems to get stuck at `Retesting CNVs..` for hours, then throws an error:

```r
out = run_numbat(
    count_mat_dgC, # gene x cell integer UMI count matrix
    ref_hca,       # reference expression profile, a gene x cell type normalized expression level matrix
    df_allele,     # allele dataframe generated by the pileup_and_phase script
    genome = "hg38",
    t = 1e-5,
    ncores = 4,
    ncores_nni = 4,
    plot = TRUE,
    out_dir = '/mnt/test'
)
```

```
Filtering out 22 cells with 0 coverage
Numbat version: 1.2.3
Running under parameters:
t = 1e-05
alpha = 1e-04
gamma = 20
min_cells = 50
init_k = 3
max_cost = 142.5
n_cut = 0
max_iter = 2
max_nni = 100
min_depth = 0
use_loh = auto
multi_allelic = TRUE
min_LLR = 5
min_overlap = 0.45
max_entropy = 0.5
skip_nj = FALSE
diploid_chroms = 
ncores = 4
ncores_nni = 4
common_diploid = TRUE
tau = 0.3
check_convergence = FALSE
plot = TRUE
genome = hg38
Input metrics:
475 cells
Mem used: 1.25Gb
Approximating initial clusters using smoothed expression ..
Mem used: 1.25Gb
number of genes left: 10941
running hclust...
Iteration 1
Mem used: 1.78Gb
Running HMMs on 5 cell groups..
Retesting CNVs..
Retesting CNVs..
Retesting CNVs..
Retesting CNVs..
Error in vctrs::vec_locate_matches(needles = needles, haystack = haystack,  : 
  Match procedure results in an allocation larger than 2^31-1 elements. Attempted allocation size was 39716884476.
ℹ In file match.c at line 2644.
ℹ Install the winch package to get additional debugging info the next time you get this error.
ℹ This is an internal error that was detected in the vctrs package.
  Please report it at <https://github.com/r-lib/vctrs/issues> with a reprex (<https://tidyverse.org/help/>) and the full backtrace.

Error in `recycle_columns()`:
! Tibble columns must have compatible sizes.
• Size 248571: Column `1`.
• Size 74786391: Column `2`.
ℹ Only values of size one are recycled.
Run `rlang::last_trace()` to see where the error occurred.
Warning messages:
1: In mclapply(bulks %>% split(.$sample), mc.cores = ncores, function(bulk) { :
  scheduled core 2 did not deliver a result, all values of the job will be affected
2: In mclapply(bulks %>% split(.$sample), mc.cores = ncores, function(bulk) { :
  scheduled core 1 encountered error in user code, all values of the job will be affected
```

The `recycle_columns()` failure looks like a problem with cores not communicating; I don't know what causes the `vctrs::vec_locate_matches()` allocation error.
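For context (my own illustration, not Numbat code): an allocation size that large is the signature of a many-to-many join blow-up, where duplicated join keys multiply the number of matches. A minimal sketch that hits the same vctrs limit:

```r
library(dplyr)

# Two tables whose join key is entirely duplicated: left_join() must pair
# every duplicate with every duplicate, so the match table would need
# 5e4 * 5e4 = 2.5e9 entries, exceeding vctrs' 2^31 - 1 element limit.
a <- data.frame(key = rep(1L, 5e4), x = seq_len(5e4))
b <- data.frame(key = rep(1L, 5e4), y = seq_len(5e4))

# Uncomment to reproduce "Match procedure results in an allocation larger
# than 2^31-1 elements" (left commented so the script runs harmlessly):
# left_join(a, b, by = "key")
```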

Any ideas on how to solve this?

And how can I improve speed in general? It's a small dataset, so I'd expect it to run on a single core in the worst case, but is it supposed to take >10 hours?

@ArthurDondi (Author) commented Jan 30, 2024
I tried running it with one core and it still crashes, after being stuck for a few hours on `Retesting CNVs..`:

```
INFO [2024-01-30 17:45:03] Filtering out 22 cells with 0 coverage
Filtering out 22 cells with 0 coverage
Numbat version: 1.2.3
Running under parameters:
t = 1e-05
alpha = 1e-04
gamma = 20
min_cells = 50
init_k = 3
max_cost = 142.5
n_cut = 0
max_iter = 2
max_nni = 100
min_depth = 0
use_loh = auto
multi_allelic = TRUE
min_LLR = 5
min_overlap = 0.45
max_entropy = 0.5
skip_nj = FALSE
diploid_chroms = 
ncores = 1
ncores_nni = 1
common_diploid = TRUE
tau = 0.3
check_convergence = FALSE
plot = TRUE
genome = hg38
Input metrics:
475 cells
Mem used: 1.25Gb
Approximating initial clusters using smoothed expression ..
Mem used: 1.25Gb
number of genes left: 11006
running hclust...
Iteration 1
Mem used: 1.78Gb
Running HMMs on 5 cell groups..
Retesting CNVs..
Error in `vctrs::vec_locate_matches()`:
! Match procedure results in an allocation larger than 2^31-1 elements. Attempted allocation size was 43404245895.
ℹ In file 'match.c' at line 2644.
ℹ This is an internal error that was detected in the vctrs package.
  Please report it at <https://github.com/r-lib/vctrs/issues> with a reprex (<https://tidyverse.org/help/>) and the full backtrace.
Backtrace:
     ▆
  1. ├─numbat::run_numbat(...)
  2. │ └─bulk_subtrees %>% ...
  3. ├─numbat:::run_group_hmms(...)
  4. │ └─parallel::mclapply(...)
  5. │   └─base::lapply(X = X, FUN = FUN, ...)
  6. │     └─numbat (local) FUN(X[[i]], ...)
  7. │       └─bulk %>% ...
  8. ├─numbat::analyze_bulk(...)
  9. │ └─... %>% ungroup()
 10. ├─dplyr::ungroup(.)
 11. ├─dplyr::mutate(., phi_mle_roll = zoo::na.locf(phi_mle_roll, na.rm = FALSE))
 12. ├─dplyr::group_by(., CHROM)
 13. ├─dplyr::left_join(...)
 14. ├─dplyr:::left_join.data.frame(...)
 15. │ └─dplyr:::join_mutate(...)
 16. │   └─dplyr:::join_rows(...)
 17. │     └─dplyr:::dplyr_locate_matches(...)
 18. │       ├─base::withCallingHandlers(...)
 19. │       └─vctrs::vec_locate_matches(...)
 20. └─rlang:::stop_internal_c_lib(...)
 21.   └─rlang::abort(message, call = call, .internal = TRUE, .frame = frame)
Warning message:
There were 18 warnings in `summarise()`.
The first warning was:
ℹ In argument: `approx_theta_post(...)`.
ℹ In group 30: `CHROM = 1`, `seg = 1jj`, `seg_start = 71440768`, `seg_end =
  71581715`, `cnv_state = "del_2"`.
Caused by warning in `cppdbbinom()`:
! NaNs produced
ℹ Run `dplyr::last_dplyr_warnings()` to see the 17 remaining warnings. 
Execution halted
```

The command I ran:

```r
library(data.table)
library(numbat)

filename<-"/mydata/bam/featurecount/B486_Tum.counts.formated.txt"
temp <- read.csv(filename, row.names=1)
sc_counts<- as.matrix(temp)
count_mat_dgC <- as(sc_counts, "dgCMatrix") 

filename<-"/mydata/bam/featurecount/B486_Om.counts.formated.txt"
temp <- read.csv(filename, row.names=1)
sc_counts<- as.matrix(temp)
refcount_mat_dgC <- as(sc_counts, "dgCMatrix")

cell_annot <- read.csv('/mnt/ctypes/B486_Om.txt', sep='\t')
cell_annot <- as.data.frame(cell_annot)

ref_internal <- numbat::aggregate_counts(refcount_mat_dgC , cell_annot)

df_allele_file<-'/mnt/run_01/B486_Tum_allele_counts.tsv.gz'
df_allele <-fread(df_allele_file)

# run
out = run_numbat(
    count_mat_dgC, # gene x cell integer UMI count matrix
    ref_internal,  # reference expression profile, a gene x cell type normalized expression level matrix
    df_allele,     # allele dataframe generated by the pileup_and_phase script
    genome = "hg38",
    t = 1e-5,
    ncores = 1,
    ncores_nni = 1,
    plot = TRUE,
    out_dir = '/mnt/test2'
)
```

From `head()`, `count_mat_dgC`, `ref_internal`, and `df_allele` all look OK.
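(A sketch of the kind of quick input checks one could run before `run_numbat()`; the `df_allele` column names below are my assumption of the pileup_and_phase output, not something verified in this thread:)

```r
# Hedged input checks before run_numbat(); column names for df_allele
# (cell, snp_id, CHROM, POS, AD, DP, GT, gene) are assumed, not verified.
stopifnot(inherits(count_mat_dgC, "dgCMatrix"))

# The counts and the reference should share most gene names.
shared_genes <- intersect(rownames(count_mat_dgC), rownames(ref_internal))
message(length(shared_genes), " genes shared between counts and reference")

# Extreme allele depths could explain the oversized join attempted above.
print(head(df_allele))
summary(df_allele$DP)
```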

@teng-gao (Collaborator) commented Feb 1, 2024

Thanks for reporting this. This shouldn't be happening unless you have extremely high coverage in those cells. If you share the input with me via email, I can take a look.

@teng-gao (Collaborator) commented Feb 23, 2024

I took a look. Please upgrade your numbat version to 1.3.2 or use the more recent Docker image; I get a more informative error message that way:

```
numbat version: 1.4.0
scistreer version: 1.2.0
hahmmr version: 1.0.0
Running under parameters:
t = 1e-05
alpha = 1e-04
gamma = 20
min_cells = 50
init_k = 3
max_cost = 142.5
n_cut = 0
max_iter = 2
max_nni = 100
min_depth = 0
use_loh = auto
segs_loh = None
call_clonal_loh = FALSE
segs_consensus_fix = None
multi_allelic = TRUE
min_LLR = 5
min_overlap = 0.45
max_entropy = 0.5
skip_nj = FALSE
diploid_chroms = None
ncores = 30
ncores_nni = 30
common_diploid = TRUE
tau = 0.3
check_convergence = FALSE
plot = TRUE
genome = hg38
Input metrics:
475 cells

Mem used: 1.35Gb

Approximating initial clusters using smoothed expression ..

Mem used: 1.35Gb

number of genes left: 10941

running hclust...

Iteration 1

Mem used: 1.88Gb

High SNP contamination detected (41%). Please make sure that cells from only one individual are included in genotyping step.

Expression noise level (MSE): high (1.8). Consider using a custom expression reference profile.

Running HMMs on 5 cell groups..

Warning message in mclapply(bulks %>% split(.$sample), mc.cores = ncores, function(bulk) {:
“scheduled cores 5, 4, 2, 1 encountered errors in user code, all values of the jobs will be affected”
Error in find_common_diploid(bulks, gamma = gamma, alpha = alpha, ncores = ncores): Error in smooth_segs(., min_genes = min_genes) : 
  No segments containing more than 10 genes for CHROM 18,21.

Traceback:

1. run_numbat(count_mat_dgC, ref_hca, df_allele, genome = "hg38", 
 .     t = 1e-05, ncores = 30, plot = TRUE, out_dir = "/home/tenggao/numbat_issues/160/results")
2. bulk_subtrees %>% run_group_hmms(t = t, gamma = gamma, alpha = alpha, 
 .     nu = nu, min_genes = min_genes, common_diploid = common_diploid, 
 .     diploid_chroms = diploid_chroms, ncores = ncores, verbose = verbose)   # at line 301-311 of file /home/tenggao/numbat/R/main.R
3. run_group_hmms(., t = t, gamma = gamma, alpha = alpha, nu = nu, 
 .     min_genes = min_genes, common_diploid = common_diploid, diploid_chroms = diploid_chroms, 
 .     ncores = ncores, verbose = verbose)
4. find_common_diploid(bulks, gamma = gamma, alpha = alpha, ncores = ncores)   # at line 814 of file /home/tenggao/numbat/R/main.R
5. stop(results[bad][[1]])   # at line 1151 of file /home/tenggao/numbat/R/utils.R
```

Notably:

> High SNP contamination detected (41%). Please make sure that cells from only one individual are included in genotyping step.
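For anyone landing here, a minimal sketch of the suggested upgrade (the GitHub repo path is an assumption, not stated in this thread):

```r
# Upgrade numbat to >= 1.3.2 as suggested above, then rerun run_numbat().
install.packages("numbat")                        # CRAN release
# or the development version (repo path assumed):
# remotes::install_github("kharchenkolab/numbat")
packageVersion("numbat")                          # expect >= 1.3.2
```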
