echoverse - Failing to find an unknown package #87

Closed
AMCalejandro opened this issue Mar 30, 2022 · 7 comments
@AMCalejandro
Contributor

AMCalejandro commented Mar 30, 2022

1. Bug description

The finemap_loci pipeline fails, apparently because it cannot find an echoverse package.

Console output

Registered S3 method overwritten by 'GGally':
  method from   
  +.gg   ggplot2
⠊⠉⠡⣀⣀⠊⠉⠡⣀⣀⠊⠉⠢⣀⡠⠊⠉⠢⣀⡠⠊⠉⠢⣀⡠⠊⠉⠢⣀⡠⠊⠉⠢⣀⡠⠊⠉⠢⣀⡠                                    
⠌⢁⡐⠉⣀⠊⢂⡐⠑⣀⠊⢂⡐⠑⣀⠊⢂⡐⠑⣀⠊⢂⡐⠑⣀⠊⢂⡐⠑⣀⠉⢂⡈⠑⣀⠉⢄⡈⠡⣀                                    
⠌⡈⡐⢂⢁⠒⡈⡐⢂⢁⠒⡈⡐⢂⢁⠑⡈⡈⢄⢁⠡⠌⡈⠤⢁⠡⠌⡈⠤⢁⠡⠌⡈⡠⢁⢁⠊⡈⡐⢂                                    

── 🦇  🦇  🦇 e c h o l o c a t o R 🦇  🦇  🦇 ─────────────────────────────────

── v2.0.0 ──────────────────────────────────────────────────────────────────────
⠌⡈⡐⢂⢁⠒⡈⡐⢂⢁⠒⡈⡐⢂⢁⠑⡈⡈⢄⢁⠡⠌⡈⠤⢁⠡⠌⡈⠤⢁⠡⠌⡈⡠⢁⢁⠊⡈⡐⢂                                    
⠌⢁⡐⠉⣀⠊⢂⡐⠑⣀⠊⢂⡐⠑⣀⠊⢂⡐⠑⣀⠊⢂⡐⠑⣀⠊⢂⡐⠑⣀⠉⢂⡈⠑⣀⠉⢄⡈⠡⣀                                    
⠊⠉⠡⣀⣀⠊⠉⠡⣀⣀⠊⠉⠢⣀⡠⠊⠉⠢⣀⡠⠊⠉⠢⣀⡠⠊⠉⠢⣀⡠⠊⠉⠢⣀⡠⠊⠉⠢⣀⡠                                    
ⓞ If you use echolocatoR, please cite:                                          
     ▶ Brian M Schilder, Jack Humphrey, Towfique                                
     Raj (2021) echolocatoR: an automated                                       
     end-to-end statistical and functional                                      
     genomic fine-mapping pipeline,                                             
     Bioinformatics; btab658,                                                   
     https://doi.org/10.1093/bioinformatics/btab658                             
ⓞ Please report any bugs/feature requests on GitHub:
     ▶
     https://github.com/RajLabMSSM/echolocatoR/issues
ⓞ Contributions are welcome!:
     ▶
     https://github.com/RajLabMSSM/echolocatoR/pulls

────────────────────────────────────────────────────────────────────────────────
echoconda:: Conda already installed.
echoconda:: Active conda env: 'echoverse'
echoconda:: Requested conda_env is already active: 'echoverse'
echoconda:: Attempting to activate conda env: 'echoverse'

)   )  ) ))))))}}}}}}}} LINC00511  ( 1  /  2 ) {{{{{{{{{(((((( (  (   (
+ Extracting relevant variants from fullSS...
+ Query Method: tabix
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
========= echotabix::convert =========
Converting full summary stats file to tabix format for fast querying.
Inferring comment_char from header: '#MarkerName'
Determining chrom type from file header.
Chromosome format: 1
Detecting column delimiter.
Identified column separator: \t
Sorting rows by coordinates via bash.
Searching for header row with grep.
( grep ^'#MarkerName' .../for_echolocatoR_axialOutcome_3.tsv; grep
    -v ^'#MarkerName' .../for_echolocatoR_axialOutcome_3.tsv | sort
    -k2,2n
    -k3,3n ) > .../file2dbbdf43076c0a_sorted.tsv
Constructing outputs
Using existing bgzipped file: /mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/for_echolocatoR_axialOutcome_3.tsv.bgz 
Set force_new=TRUE to override this.
Tabix-indexing file using Rsamtools
Data successfully converted to bgzip-compressed, tabix-indexed format:
  - data: /mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/for_echolocatoR_axialOutcome_3.tsv.bgz 
  - index: /mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/for_echolocatoR_axialOutcome_3.tsv.bgz.tbi
========= echotabix::query =========
query_dat is already a GRanges object. Returning directly.
Inferred format: 'table'
Querying tabular tabix file using: Rsamtools.
Checking query chromosome style is correct.
Chromosome format: 1
Retrieving data.
Converting query results to data.table.
Processing query: 17:70330179-70330179
Adding 'query' column to results.
Retrieved data with 1 rows
Saving query ==> /mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/RESULTS_25.3.2022/mixedmodels_GWAS/earlymotorPD_axial/LINC00511/LINC00511_earlymotorPD_axial_subset.tsv.gz
LD:: Standardizing summary statistics subset.
++ Preparing Gene col
Could not recognize genome build of:
 - target_genome
These will be inferred from the data.
++ Preparing A1,A1 cols
++ Preparing MAF,Freq cols
++ Inferring MAF from frequency column...
++ Preparing N_cases,N_controls cols
++ Preparing `proportion_cases` col
++ 'proportion_cases' not included in data subset.
++ Preparing N col
--
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
--
+ Mapping colnames from MungeSumstats ==> echolocatoR
++ Preparing t-stat col
+ Calculating t-statistic from Effect and StdErr...
++ Assigning lead SNP
++ Ensuring Effect, StdErr, P are numeric
++ Ensuring 1 SNP per row
++ Removing extra whitespace
++ Saving subset ==> /mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/RESULTS_25.3.2022/mixedmodels_GWAS/earlymotorPD_axial/LINC00511/LINC00511_earlymotorPD_axial_subset.tsv.gz
+ Extraction completed in 42.87 seconds
+ 1 SNPs x  12 columns
+ Mapping colnames from MungeSumstats ==> echolocatoR
Standardising column headers.
First line of summary statistics file: 
CHR	POS	SNP	P	Effect	StdErr	A1	A2	Freq	MAF	t_stat	leadSNP	
Using UK Biobank LD reference panel.
+ UKB LD file name: chr17_70000001_73000001
Downloading full .gz/.npz UKB files and saving to disk.
Downloading with axel (using 95 cores).
+ Overwriting pre-existing file.
Searching for 1 package(s) across 1 conda environment(s):
 - echoverse
Identified paths for 1 / 1 packages.
1 unique package(s) found across 1 conda environment(s).
sh: 1: echoverse: not found
axel download failed. Trying with download.file.
Downloading with download.file.
Time difference of 9.8 secs
Downloading with axel (using 95 cores).
+ Overwriting pre-existing file.
Searching for 1 package(s) across 1 conda environment(s):
 - echoverse
Identified paths for 1 / 1 packages.
1 unique package(s) found across 1 conda environment(s).
sh: 1: echoverse: not found
axel download failed. Trying with download.file.
Downloading with download.file.
Time difference of 28.3 secs
Error in py_run_file_impl(file, local, convert) : 
  Unable to open file '' (does it exist?)
In addition: Warning messages:
1: 'genome' not found: hg37 
2: In system(cmd) : error in running command
3: In system(cmd) : error in running command
Fine-mapping complete in:
Time difference of 1.5 mins
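
The log above shows axel failing (`sh: 1: echoverse: not found`) and the pipeline then retrying with download.file. As a rough illustration of that try-then-fall-back pattern, here is a minimal base R sketch. `download_with_fallback` and its `n_connections` argument are hypothetical names made up for this example; this is not the code echolocatoR or its subpackages actually run.

```r
# Hypothetical helper: try axel first, then fall back to base R's download.file().
# Illustration only; not echolocatoR's own implementation.
download_with_fallback <- function(url, destfile, n_connections = 4) {
  axel_path <- unname(Sys.which("axel"))  # "" when axel is not on the PATH
  if (nzchar(axel_path)) {
    # Remove any stale output first so the download starts from scratch.
    if (file.exists(destfile)) file.remove(destfile)
    status <- system2(
      axel_path,
      args = c("-n", n_connections, "-o", shQuote(destfile), shQuote(url)),
      stdout = FALSE, stderr = FALSE
    )
    if (identical(status, 0L) && file.exists(destfile)) {
      return(invisible("axel"))
    }
    message("axel download failed. Trying with download.file.")
  }
  # download.file() returns 0 on success.
  status <- utils::download.file(url, destfile, quiet = TRUE, mode = "wb")
  if (status != 0) stop("download.file failed for: ", url)
  invisible("download.file")
}
```

The helper returns the name of the method that succeeded, so a caller can see whether the fallback was needed. Resolving the binary with `Sys.which()` up front avoids handing the shell a name that is not an executable, which is what the `not found` line in the log suggests happened here.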

Code

library(data.table)
library(echolocatoR)  
#library(tidyverse)

fullSS_path <- "/mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/for_echolocatoR_axialOutcome_3.tsv"                                                                    
fullRS_path <- "/mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/RESULTS_25.3.2022"

top_SNPs = fread("../topSNPs_axialMotorSymptom.txt") 
top_SNPs = top_SNPs[c(5,6), ]

res = finemap_loci(top_SNPs = top_SNPs,
                     loci = top_SNPs$Locus, 
                     dataset_name = "earlymotorPD_axial", 
                     dataset_type = "mixedmodels_GWAS",   
                     force_new_subset = T, 
                     force_new_LD = T, 
                     force_new_finemap = T, 
                     remove_tmps = F, 
                                               
                     # SUMMARY STATS ARGUMENTS 
                     fullSS_genome_build = "hg19",
                     fullSS_path = fullSS_path,
                     results_dir = fullRS_path,
                     query_by = "tabix", 
                     chrom_col = "CHR", position_col = "POS", snp_col = "#MarkerName", 
                     pval_col = "Pval", effect_col = "Effect", stderr_col = "StdErr", 
                     freq_col = "medianFreq", MAF_col = "calculate", 
                     A1_col = "Allele1", 
                     A2_col = "Allele2", 
                     #N_cases_col = "TotalSampleSize",
                     #N_controls = 0,
                     
                     # FILTERING ARGUMENTS 
                     ## It's often desirable to use a larger window size,  
                     ## so a 2Mb window (bp_distance=500000*2) is used here.  
                     bp_distance = 500000*2, 
                     min_MAF = 0.001,   
                     trim_gene_limits = F, 
                     
                     # FINE-MAPPING ARGUMENTS 
                     ## General 
                     finemap_methods = c("ABF", "SUSIE", "POLYFUN_SUSIE", "FINEMAP"),  
                     n_causal = 5, 
                     PP_threshold = .95,  
                     consensus_threshold = 2,
                     # LD ARGUMENTS  
                     LD_genome_build = "hg19",
                     LD_reference = "UKB", 
                     superpopulation = "EUR", 
                     download_method = "axel", 
                      
                     # Additional arguments - My arguments
                     case_control = F,
                     nThread = 15,
                     sample_size = 3572,
                     
                     # PLOT ARGUMENTS  
                     ## general    
                     plot_types = c("fancy"), 
                     ## Generate multiple plots of different window sizes;  
                     ### all SNPs, plus 4x, 10x, and 30x zoomed-in views 
                     zoom = c("all","4x","10x", "30x"), 
                     ## XGR 
                     # plot.XGR_libnames=c("ENCODE_TFBS_ClusteredV3_CellTypes"),  
                     ## Roadmap 
                     roadmap=FALSE,
                     roadmap_query=NULL,
                     
                     #plot.Roadmap = F, 
                     #plot.Roadmap_query = NULL, 
                     # Nott et al. (2019) 
                     nott_epigenome=TRUE,
                     nott_show_placseq=TRUE,
                     #plot.Nott_epigenome = T,  
                     #plot.Nott_show_placseq = T,  
                     
                     verbose = TRUE,
                     
                     # ENVIRONMENT ARGS
                     conda_env= "echoverse"
                    )

2. Session info

R version 4.1.2 (2021-11-01)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS/LAPACK: /home/acarrasco/.conda/envs/echoverse/lib/libopenblasp-r0.3.18.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] echolocatoR_2.0.0 data.table_1.14.2

loaded via a namespace (and not attached):
  [1] utf8_1.2.2                  reticulate_1.24            
  [3] R.utils_2.11.0              tidyselect_1.1.2           
  [5] RSQLite_2.2.11              AnnotationDbi_1.56.2       
  [7] htmlwidgets_1.5.4           grid_4.1.2                 
  [9] BiocParallel_1.28.3         XGR_1.1.7                  
 [11] munsell_0.5.0               DT_0.22                    
 [13] colorspace_2.0-3            Biobase_2.54.0             
 [15] filelock_1.0.2              OrganismDbi_1.36.0         
 [17] knitr_1.38                  supraHex_1.32.0            
 [19] rstudioapi_0.13             stats4_4.1.2               
 [21] DescTools_0.99.44           MatrixGenerics_1.6.0       
 [23] GenomeInfoDbData_1.2.7      mixsqp_0.3-43              
 [25] bit64_4.0.5                 echoconda_0.99.5           
 [27] rprojroot_2.0.2             vctrs_0.3.8                
 [29] generics_0.1.2              xfun_0.30                  
 [31] biovizBase_1.42.0           BiocFileCache_2.2.1        
 [33] R6_2.5.1                    GenomeInfoDb_1.30.1        
 [35] AnnotationFilter_1.18.0     bitops_1.0-7               
 [37] cachem_1.0.6                reshape_0.8.8              
 [39] DelayedArray_0.20.0         assertthat_0.2.1           
 [41] BiocIO_1.4.0                scales_1.1.1               
 [43] nnet_7.3-17                 rootSolve_1.8.2.3          
 [45] gtable_0.3.0                lmom_2.8                   
 [47] ggbio_1.42.0                ensembldb_2.18.4           
 [49] rlang_1.0.2                 clisymbols_1.2.0           
 [51] MungeSumstats_1.3.16        echodata_0.99.7            
 [53] splines_4.1.2               rtracklayer_1.54.0         
 [55] lazyeval_0.2.2              gargle_1.2.0               
 [57] dichromat_2.0-0             hexbin_1.28.2              
 [59] checkmate_2.0.0             BiocManager_1.30.16        
 [61] yaml_2.3.5                  reshape2_1.4.4             
 [63] snpStats_1.44.0             GenomicFeatures_1.46.5     
 [65] ggnetwork_0.5.10            backports_1.4.1            
 [67] Hmisc_4.6-0                 RBGL_1.70.0                
 [69] tools_4.1.2                 echoplot_0.99.2            
 [71] ggplot2_3.3.5               ellipsis_0.3.2             
 [73] RColorBrewer_1.1-2          proxy_0.4-26               
 [75] BiocGenerics_0.40.0         coloc_5.1.2                
 [77] Rcpp_1.0.8.3                plyr_1.8.7                 
 [79] base64enc_0.1-3             progress_1.2.2             
 [81] zlibbioc_1.40.0             purrr_0.3.4                
 [83] RCurl_1.98-1.6              prettyunits_1.1.1          
 [85] rpart_4.1.16                viridis_0.6.2              
 [87] S4Vectors_0.32.4            SummarizedExperiment_1.24.0
 [89] ggrepel_0.9.1               cluster_2.1.2              
 [91] here_1.0.1                  fs_1.5.2                   
 [93] magrittr_2.0.2              echotabix_0.99.5           
 [95] dnet_1.1.7                  openxlsx_4.2.5             
 [97] gh_1.3.0                    mvtnorm_1.1-3              
 [99] ProtGenerics_1.26.0         matrixStats_0.61.0         
[101] patchwork_1.1.1             hms_1.1.1                  
[103] XML_3.99-0.9                jpeg_0.1-9                 
[105] IRanges_2.28.0              gridExtra_2.3              
[107] compiler_4.1.2              biomaRt_2.50.3             
[109] tibble_3.1.6                crayon_1.5.1               
[111] R.oo_1.24.0                 htmltools_0.5.2            
[113] echoannot_0.99.4            tzdb_0.3.0                 
[115] Formula_1.2-4               tidyr_1.2.0                
[117] expm_0.999-6                Exact_3.1                  
[119] lubridate_1.8.0             DBI_1.1.2                  
[121] dbplyr_2.1.1                MASS_7.3-56                
[123] rappdirs_0.3.3              boot_1.3-28                
[125] Matrix_1.4-1                readr_2.1.2                
[127] piggyback_0.1.1             cli_3.2.0                  
[129] R.methodsS3_1.8.1           echofinemap_0.99.0         
[131] parallel_4.1.2              igraph_1.2.11              
[133] GenomicRanges_1.46.1        pkgconfig_2.0.3            
[135] GenomicAlignments_1.30.0    RCircos_1.2.2              
[137] foreign_0.8-82              xml2_1.3.3                 
[139] XVector_0.34.0              echoLD_0.99.1              
[141] stringr_1.4.0               VariantAnnotation_1.40.0   
[143] digest_0.6.29               graph_1.72.0               
[145] Biostrings_2.62.0           htmlTable_2.4.0            
[147] gld_2.6.4                   restfulr_0.0.13            
[149] curl_4.3.2                  Rsamtools_2.10.0           
[151] rjson_0.2.21                lifecycle_1.0.1            
[153] nlme_3.1-157                jsonlite_1.8.0             
[155] viridisLite_0.4.0           BSgenome_1.62.0            
[157] fansi_1.0.3                 downloadR_0.99.1           
[159] susieR_0.11.92              pillar_1.7.0               
[161] lattice_0.20-45             GGally_2.1.2               
[163] KEGGREST_1.34.0             fastmap_1.1.0              
[165] httr_1.4.2                  survival_3.3-1             
[167] googleAuthR_2.0.0           glue_1.6.2                 
[169] zip_2.2.0                   png_0.1-7                  
[171] bit_4.0.4                   Rgraphviz_2.38.0           
[173] class_7.3-20                stringi_1.7.6              
[175] blob_1.2.2                  latticeExtra_0.6-29        
[177] memoise_2.0.1               dplyr_1.0.8                
[179] irlba_2.3.5                 e1071_1.7-9                
[181] ape_5.6-2                  


@AMCalejandro AMCalejandro added the bug Something isn't working label Mar 30, 2022
@bschilder bschilder self-assigned this Aug 27, 2022
@bschilder
Member

Could you clarify how you created the conda environment "echoverse"?

Either way, it's odd that echoconda would be looking for a software package of the same name. Looking into this now.

@bschilder
Member

bschilder commented Aug 27, 2022

OK, based on the date posted, I think this was a bug in an older version of echoverse. Could you try reinstalling from that branch (now the master branch) and try again?

Apologies for the long delay!

@bschilder
Member

@AMCalejandro, has this since been resolved for you with the updates?

@AMCalejandro
Contributor Author

AMCalejandro commented Sep 20, 2022

Just to be clear, that conda environment was created from a yml file.

Please find it attached here:


name: echoverse
channels:
  - conda-forge
  - bioconda
  - nodefaults
dependencies:
  # Python
  - python>=3.6.1
  - pandas>=0.25.0
  - pandas-plink
  - fastparquet
  - pyarrow
  - scipy
  - scikit-learn
  - tqdm
  - bitarray
  - networkx
  - rpy2
  - requests
  # Command line
  - htslib
  - plink
  - bcftools
  - wget
  - axel
  # R
  - r>=4.1.0
  - r-biocmanager
  - bioconductor-snpstats
  - bioconductor-ggbio
  - bioconductor-ensdb.hsapiens.v75
  - bioconductor-biomart
  - radian
  - pip
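
For anyone checking an environment built from a yml like this, a quick sanity check is to see which of the command-line tools actually resolve on the PATH of the R session. The sketch below is only an illustration: `check_tools` is a hypothetical helper, and the tool list is an assumption about which executables the "Command line" packages provide (htslib ships tabix and bgzip). The environment name `echoverse` is included on purpose, since the console output shows the shell being asked to run it as if it were a program.

```r
# Executables assumed to come from the "Command line" section of the yml
# (htslib provides tabix and bgzip). This mapping is an assumption for the sketch.
expected_tools <- c("tabix", "bgzip", "plink", "bcftools", "wget", "axel")

# Hypothetical helper: report where (or whether) each tool resolves on the PATH.
check_tools <- function(tools) {
  paths <- unname(Sys.which(tools))
  data.frame(
    tool = tools,
    path = paths,
    found = nzchar(paths),
    stringsAsFactors = FALSE
  )
}

# "echoverse" is the conda env name, so it would not normally resolve as a tool.
report <- check_tools(c(expected_tools, "echoverse"))
print(report)
```

Run this from an R session started inside the activated environment. If `axel` shows a path but the pipeline still reports `echoverse: not found`, that would be consistent with the environment name being passed where a binary path was expected, rather than with a missing install.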

I am happy to reinstall echolocatoR and try running the workflow again using the default echoR_mini.
I will do so once you push the fix for the sample size input to master.

@bschilder
Member

> Please find it attached here

Perfect, thanks

> I will do so once you push the fix for the sample size input to master

Cool, this has already been pushed.

@AMCalejandro
Contributor Author

AMCalejandro commented Sep 20, 2022 via email

@bschilder
Member

The necessary edits were all done in echodata. So you just need to update that subpackage.
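
A minimal sketch of updating only that subpackage, for reference. It assumes echodata is hosted under the same GitHub organisation as echolocatoR (RajLabMSSM, as in the URLs printed in the startup banner) and that the remotes package is installed; `update_subpackage` and its `dry_run` argument are hypothetical names for this example, not part of echoverse.

```r
# Assumption: echodata lives in the same GitHub organisation as echolocatoR
# (RajLabMSSM). Adjust `org` if that is not the case.
update_subpackage <- function(pkg = "echodata", org = "RajLabMSSM", dry_run = TRUE) {
  repo <- paste(org, pkg, sep = "/")
  # Look up the currently installed version; NA if the package is not installed.
  old <- tryCatch(
    as.character(utils::packageVersion(pkg)),
    error = function(e) NA_character_
  )
  message("Installed version of ", pkg, ": ", old)
  if (!dry_run) {
    # Requires the 'remotes' package and network access.
    remotes::install_github(repo, upgrade = "never")
  }
  invisible(repo)
}

update_subpackage()                   # dry run: only reports the current version
# update_subpackage(dry_run = FALSE)  # actually reinstall from GitHub
```

The session info in the original report lists echodata_0.99.7, so comparing the reported version before and after the reinstall (in a fresh R session) is a simple way to confirm the update took effect.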
