Error in mobster::mobster_fit(x, ...) #12

Closed
perllb opened this issue Aug 30, 2023 · 2 comments
Labels: bug (Something isn't working)

perllb commented Aug 30, 2023

Running TINC (TINC_0.1.0) on a sample gives the following error upon autofit with a strelka2-vcf:
(Please find the vcf attached - note that it is anonymised)

  • Note that
    • TINC succeeds on several other samples; only this particular sample causes the error
    • We managed to bypass the error by adding a few extra variants (PASSing some of the variants that have a Strelka EVS just below the threshold)
    • TINC succeeds for other samples with even fewer variants, so it is not a matter of too few variants
    • We use WES data, not WGS, but TINC has worked with several WES Strelka outputs before
    • The same error arises when we load the Strelka VCF with load_VCF_Strelka instead of manually
> dummy = as_tibble(read.table("dummy_data.tsv.txt", sep = "\t", header = T))

> head(dummy)
# A tibble: 6 × 9
  chr        from        to ref   alt   n_ref_count n_alt_count t_ref_count t_alt_count
  <chr>     <int>     <int> <chr> <chr>       <int>       <int>       <int>       <int>
1 chr1    1339794   1339795 A     C             560           2         470          16
2 chr1   15729850  15729851 C     G             486           0         393          10
3 chr1   42747159  42747160 C     T             697           0         580          18
4 chr1   54200483  54200484 G     C             417           0         353           8
5 chr1   89263797  89263798 A     C             391           0         418           9
6 chr1  110222976 110222977 C     G             638           0         517          16

> fit = autofit(input = dummy, cna = NULL) 
 [ TINC ] 


── Loading TINC input data ─────────────────────────────────────────────────────────────────────────────────────────────────────────
✔ Input data contains n = 86 mutations, selecting operation mode.
✔ Mutation with VAF within 0 and 0.7 ~ n = 86.

── Analysing tumour sample with MOBSTER ────────────────────────────────────────────────────────────────────────────────────────────

 [ MOBSTER fit ] 

✔ Loaded input data, n = 86.
❯ n = 86. Mixture with k = 1,2,3 Beta(s). Pareto tail: TRUE and FALSE. Output clusters with π > 0.02 and n > 10.
❯ Custom fit by Moments-matching in up to 300 steps, with ε = 1e-09 and random initialisation.
❯ Scoring (without parallel) 6 x 3 x 2 = 36 models by reICL.

[easypar] 36/36 computations returned errors and will be removed.                         
Error in mobster::mobster_fit(x, ...) : 
  All task returned errors, no fit available, raising this error to interrupt the computation....
In addition: There were 50 or more warnings (use warnings() to see the first 50) 

Warnings:

> warnings()
Warning messages:
1: Unknown or uninitialised column: `a`.
2: Unknown or uninitialised column: `b`.
3: Unknown or uninitialised column: `a`.
4: Unknown or uninitialised column: `b`.
5: In (function (X, K = 3, init = "peaks", tail = TRUE, epsilon = 1e-10,  ... :
  Possible singularity in one Beta component a/b --> Inf.
6: In (function (X, K = 3, init = "peaks", tail = TRUE, epsilon = 1e-10,  ... :
  Possible singularity in one Beta component a/b --> Inf.
..
50: In (function (X, K = 3, init = "peaks", tail = TRUE, epsilon = 1e-10,  ... :
  Possible singularity in one Beta component a/b --> Inf.
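
One plausible reading of the "Possible singularity" warnings, given that the log reports a fit by moments-matching: the standard method-of-moments estimates for a Beta distribution grow without bound as the sample variance approaches zero. The sketch below illustrates that textbook behaviour only. `beta_mom` is a hypothetical helper written for this example, not a function from mobster, and mobster's actual update rules are not shown in this thread.

```r
# Method-of-moments estimates for Beta(a, b).
# Hypothetical helper for illustration; not taken from the mobster source.
beta_mom <- function(x) {
  m <- mean(x)
  v <- var(x)
  common <- m * (1 - m) / v - 1   # diverges as v -> 0
  c(a = m * common, b = (1 - m) * common)
}

set.seed(1)
spread <- rbeta(86, 2, 50)                         # sample with ordinary spread
tight  <- 0.03 + seq(-1e-6, 1e-6, length.out = 86) # nearly constant values

beta_mom(spread)  # moderate, finite a and b
beta_mom(tight)   # very large a and b
```

If a component ends up assigned a handful of mutations with almost identical VAF, estimates of this kind become extremely large, which may be what the warning is flagging.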

sessionInfo:

sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.2

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_3.4.3 dplyr_1.1.2   tidyr_1.3.0   TINC_0.1.0    vcfR_1.14.0  

loaded via a namespace (and not attached):
  [1] VGAM_1.1-8             colorspace_2.1-0       ggsignif_0.6.4         seqinr_4.2-30          XVector_0.38.0        
  [6] GenomicRanges_1.50.2   rstudioapi_0.15.0      ggpubr_0.6.0           farver_2.1.1           graphlayouts_1.0.0    
 [11] ggrepel_0.9.3          fansi_1.0.4            mvtnorm_1.2-3          codetools_0.2-18       splines_4.2.1         
 [16] doParallel_1.0.17      memuse_4.2-3           polyclip_1.10-4        ade4_1.7-22            entropy_1.3.1         
 [21] Rsamtools_2.14.0       broom_1.0.5            cluster_2.1.3          ggforce_0.4.1          readr_2.1.4           
 [26] compiler_4.2.1         backports_1.4.1        Matrix_1.4-1           cli_3.6.1              tweenr_2.0.2          
 [31] easypar_1.0.0          prettyunits_1.1.1      tools_4.2.1            igraph_1.5.1           gtable_0.3.4          
 [36] glue_1.6.2             GenomeInfoDbData_1.2.9 reshape2_1.4.4         mobster_1.0.0          Rcpp_1.0.11           
 [41] carData_3.0-5          bbmle_1.0.25           vctrs_0.6.3            Biostrings_2.66.0      GUILDS_1.4.6          
 [46] ape_5.7-1              nlme_3.1-157           iterators_1.0.14       pinfsc50_1.2.0         ggraph_2.1.0          
 [51] stringr_1.5.0          lifecycle_1.0.3        akima_0.6-3.4          gtools_3.9.4           rstatix_0.7.2         
 [56] poilog_0.4.2           MASS_7.3-57            zlibbioc_1.44.0        scales_1.2.1           tidygraph_1.2.3       
 [61] dndscv_0.0.1.0         sads_0.4.2             clisymbols_1.2.0       hms_1.1.3              parallel_4.2.1        
 [66] tidyverse_2.0.0        VIBER_0.1.0            RColorBrewer_1.1-3     gridExtra_2.3          pio_0.1.0             
 [71] bdsmatrix_1.3-6        stringi_1.7.12         S4Vectors_0.36.2       foreach_1.5.2          permute_0.9-7         
 [76] BiocGenerics_0.44.0    BiocParallel_1.32.6    GenomeInfoDb_1.34.9    rlang_1.1.1            pkgconfig_2.0.3       
 [81] bitops_1.0-7           lattice_0.20-45        purrr_1.0.2            labeling_0.4.2         cowplot_1.1.1         
 [86] tidyselect_1.2.0       plyr_1.8.8             magrittr_2.0.3         R6_2.5.1               IRanges_2.32.0        
 [91] generics_0.1.3         CNAqc_1.0.0            pillar_1.9.0           withr_2.5.0            mgcv_1.8-40           
 [96] abind_1.4-5            RCurl_1.98-1.12        sp_2.0-0               tibble_3.2.1           crayon_1.5.2          
[101] car_3.1-2              utf8_1.2.3             tzdb_0.4.0             viridis_0.6.2          progress_1.2.2        
[106] grid_4.2.1             vegan_2.6-4            matrixcalc_1.0-6       BMix_1.0.0             digest_0.6.33         
[111] numDeriv_2016.8-1.1    stats4_4.2.1           munsell_0.5.0          viridisLite_0.4.2      ctree_1.1.0       

dummy_data.tsv.txt
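
As a quick sanity check on the input, the tumour VAF of the six rows printed by `head(dummy)` can be computed by hand. This is a minimal sketch that assumes VAF = alt reads / (ref reads + alt reads); the column name `t_vaf` is introduced here for illustration and is not part of the posted table. Only six of the 86 rows are visible in the thread, so this says nothing certain about the rest of the sample.

```r
# Tumour read counts copied from the six rows shown by head(dummy).
dummy_head <- data.frame(
  t_ref_count = c(470, 393, 580, 353, 418, 517),
  t_alt_count = c(16, 10, 18, 8, 9, 16)
)

# Assumption: VAF = alt reads / total reads in the tumour sample.
dummy_head$t_vaf <- with(dummy_head, t_alt_count / (t_ref_count + t_alt_count))

round(dummy_head$t_vaf, 3)
```

All six values fall below 0.04, so at least these rows sit far from the VAF expected for clonal mutations in a reasonably pure tumour.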

caravagn added the bug label Aug 31, 2023

caravagn (Collaborator) commented

@Militeee does this still apply?

Militeee (Contributor) commented

With the current versions of mobster and TINC, the run now terminates with an informative message (see below). The original error was actually a mobster issue, but please reopen this issue here or on the mobster page if you still encounter it.

> fit = autofit(input = dummy, cna = NULL) 
 [ TINC ] 


── Loading TINC input data ───────────────────────────────────────────────────────────────────────────────────
✔ Input data contains n = 86 mutations, selecting operation mode.
✔ Mutation with VAF within 0 and 0.7 ~ n = 86.

── Analysing tumour sample with MOBSTER ──────────────────────────────────────────────────────────────────────

 [ MOBSTER fit ] 

✔ Loaded input data, n = 86.
❯ n = 86. Mixture with k = 1,2,3 Beta(s). Pareto tail: TRUE and FALSE. Output clusters with π > 0.02 and n > 10.
❯ Custom fit by Moments-matching in up to 300 steps, with ε = 1e-09 and random initialisation.
❯ Scoring (without parallel) 6 x 3 x 2 = 36 models by reICL.

[easypar] 25/36 computations returned errors and will be removed.                         


ℹ MOBSTER fits completed in 20.6s.

── [ MOBSTER ] My MOBSTER model n = 86 with k = 0 Beta(s) and a tail ─────────────────────────────────────────
● Clusters: π = 100% [Tail], with π > 0.
● Tail [n = 86, 100%] with alpha = 1.3.
✖ No Beta fit.

ℹ Score(s): NLL = -248.68; ICL = -483.99 (-483.99), H = 0 (0). Fit converged by MM in 2 steps.

ℹ Using the location likelihood heuristic to inspect mutations' distribution
✖ Location Likelihood: all clusters  fail the test, returning the first one.
ℹ Without CNA, TINC will estimate tumour purity as 2*x, with x the clonal peak.

✔ MOBSTER found n = 0 clonal mutations from cluster NA

── Analysing normal sample with BMix ─────────────────────────────────────────────────────────────────────────

Error in analyse_BMix(x = as_normal(x) %>% dplyr::filter(OK_clonal), cna_map = cna_map,  : 
  There are no tumour clonal mutations in the normal sample, there is no contamination?
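
When running many samples in a batch, a stop like the one above can be caught so that the remaining samples still run. The following is a minimal sketch using base R `tryCatch`. `safe_fit` and `failing_fit` are hypothetical names made up for this example; `failing_fit` is a stand-in that raises the error text reported above, so the code runs without TINC installed.

```r
# Hypothetical helper: run a fitting function and return NULL (with a message)
# instead of stopping when the function raises an error.
safe_fit <- function(fit_fun, ...) {
  tryCatch(
    fit_fun(...),
    error = function(e) {
      message("Fit failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-in for autofit(); it only reproduces the error message from this thread.
failing_fit <- function(input, cna = NULL) {
  stop("There are no tumour clonal mutations in the normal sample, there is no contamination?")
}

fit <- safe_fit(failing_fit, input = data.frame(), cna = NULL)
is.null(fit)
```

With TINC attached, the same wrapper could presumably be called as `safe_fit(autofit, input = dummy, cna = NULL)`, using the `autofit` arguments shown in the transcripts above.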
