Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in run_eggla_gwas when use_info=TRUE #80

Closed
4 tasks done
annihei opened this issue Oct 24, 2022 · 8 comments · Fixed by #81
Closed
4 tasks done

Error in run_eggla_gwas when use_info=TRUE #80

annihei opened this issue Oct 24, 2022 · 8 comments · Fixed by #81
Assignees
Milestone

Comments

@annihei
Copy link
Collaborator

annihei commented Oct 24, 2022

Bug description

I'm trying run_eggla_gwas again, and managed to run it with argument use_info=FALSE. With the following command

run_eggla_gwas(
  data = "selected_cleaned_PCs_NFBC1966.csv",
  results = results_archives,
  id_column = "ID",
  traits = c("slope_.*", "auc_.*"),
  covariates = c("sex", "PC1", "PC2", "PC3", "PC4", "PC5"),
  vcfs = list.files(
    path = gen_file_path,
    pattern = "\\.vcf$|\\.vcf.gz$",
    full.names = TRUE
  ),
  working_directory = work_dir_path,
  use_info = TRUE,
  build = "37",
  strand = "+",
  vep = NULL,
  bin_path = list(
    bcftools = "/usr/local/bin/bcftools",
    plink2 = "/usr/local/bin/plink2"
  ),
  threads = 1
)

I get error and message:
Formatting VCFs ...
[chr1.dose.vcf.gz] Performing PLINK2 regression ...
[chr1.dose.vcf.gz] Extracting INFO ...
[chr1.dose.vcf.gz] Combining results files ...
[chr1.dose.vcf.gz] Writing annotated results file ...
Error in colnamesInt(x, neworder, check_dups = FALSE) :
argument specifying columns specify non existing column(s): cols[21]='INFO'
Calls: run_eggla_gwas ... resolve.list -> signalConditionsASAP -> signalConditions
Execution halted
srun: error: localhost: task 0: Exited with exit code 1

Is there something that is missing in the arguments, or in my vcf files? I don't catch where it's coming from.

$ bcftools view -h "genetic_files/chr20.dose.vcf.gz"
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="All filters passed">
##FILTER=<ID=GENOTYPED,Description="Site was genotyped">
##FILTER=<ID=GENOTYPED_ONLY,Description="Site was genotyped only">
##filedate=2016.10.6
##source=Minimac3
##contig=<ID=20>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DS,Number=1,Type=Float,Description="Estimated Alternate Allele Dosage : [P(0/1)+2*P(1/1)]">
##FORMAT=<ID=GP,Number=3,Type=Float,Description="Estimated Posterior Probabilities for Genotypes 0/0, 0/1 and 1/1">
##INFO=<ID=AF,Number=1,Type=Float,Description="Estimated Alternate Allele Frequency">
##INFO=<ID=MAF,Number=1,Type=Float,Description="Estimated Minor Allele Frequency">
##INFO=<ID=R2,Number=1,Type=Float,Description="Estimated Imputation Accuracy">
##INFO=<ID=ER2,Number=1,Type=Float,Description="Empirical (Leave-One-Out) R-square (available only for genotyped variants)">
##bcftools_viewVersion=1.4+htslib-1.4
##bcftools_viewCommand=view -h chr20.dose.vcf.gz; Date=Thu May 17 10:11:34 2018
##bcftools_viewVersion=1.10.2+htslib-1.10.2
##bcftools_viewCommand=view -h genetic_files/chr20.dose.vcf.gz; Date=Mon Oct 24 10:39:04 2022
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  ...

eggla version output

R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Locale:
  LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
  LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
  LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
  LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

Package version:
  backports_1.4.1      bayestestR_0.13.0    beeswarm_0.4.0      
  BH_1.75.0.0          bit_4.0.4            bit64_4.0.5         
  broom_1.0.1          broom.mixed_0.2.9.4  cli_3.4.1           
  clipr_0.8.0          coda_0.19.4          codetools_0.2.18    
  colorspace_2.0.3     compiler_4.2.1       cpp11_0.4.3         
  crayon_1.5.2         curl_4.3.3           data.table_1.14.4   
  datawizard_0.6.3     digest_0.6.30        distributional_0.3.1
  doParallel_1.0.17    dplyr_1.0.10         eggla_0.17.3        
  ellipsis_0.3.2       fansi_1.0.3          farver_2.1.1        
  forcats_0.5.2        foreach_1.5.2        furrr_0.3.1         
  future_1.28.0        generics_0.1.3       ggbeeswarm_0.6.0    
  ggdist_3.2.0         ggplot2_3.3.6        ggtext_0.1.2        
  globals_0.16.1       glue_1.6.2           graphics_4.2.1      
  grDevices_4.2.1      grid_4.2.1           gridtext_0.1.5      
  growthcleanr_2.0.0   gtable_0.3.1         haven_2.5.1         
  HDInterval_0.2.2     hms_1.1.2            insight_0.18.6      
  isoband_0.2.6        iterators_1.0.14     jpeg_0.1.9          
  labeling_0.4.2       labelled_2.10.0      lattice_0.20.44     
  lifecycle_1.0.3      listenv_0.8.0        magrittr_2.0.3      
  markdown_1.2         MASS_7.3.53          Matrix_1.3.3        
  methods_4.2.1        mgcv_1.8.35          mime_0.12           
  munsell_0.5.0        nlme_3.1.152         numDeriv_2016.8.1.1 
  parallel_4.2.1       parallelly_1.32.1    patchwork_1.1.2     
  performance_0.10.0   pillar_1.8.1         pkgconfig_2.0.3     
  plyr_1.8.7           png_0.1.7            prettyunits_1.1.1   
  progress_1.2.2       purrr_0.3.5          R6_2.5.1            
  RColorBrewer_1.1.3   Rcpp_1.0.9           readr_2.1.3         
  rlang_1.0.6          scales_1.2.1         splines_4.2.1       
  stats_4.2.1          stringi_1.7.8        stringr_1.4.1       
  tibble_3.1.8         tidyr_1.2.1          tidyselect_1.2.0    
  tools_4.2.1          tzdb_0.3.0           utf8_1.2.2          
  utils_4.2.1          vctrs_0.5.0          vipor_0.4.5         
  viridisLite_0.4.1    vroom_1.6.0          withr_2.5.0         
  xfun_0.34            xml2_1.3.3          

Checklist

  • formatted your issue so it is easier for us to read?
  • included a minimal, fully reproducible example in a single .qmd file? Please provide the whole file rather than the snippet you believe is causing the issue.
  • documented the version of eggla you're running, by pasting the output from running packageVersion('eggla') in the "eggla version output" text area?
  • documented which operating system you're running? If on Linux, please provide the specific distribution as well.
@annihei annihei added the bug label Oct 24, 2022
@mcanouil
Copy link
Owner

I hardcoded the imputation quality field in INFO as "INFO" while it could also be "R2".
I'll release a patch today.

@mcanouil mcanouil self-assigned this Oct 24, 2022
mcanouil added a commit that referenced this issue Oct 24, 2022
@mcanouil
Copy link
Owner

mcanouil commented Oct 24, 2022

This should be fixed now, can you try to install the development version?

remotes::install_github("mcanouil/eggla")

PS: do not forget to provide the proper information for the imputation quality metrics by changing the default value for info_type, i.e., info_type = "IMPUTE2 info score via 'bcftools +impute-info'", in your case it's something about "R2".

@annihei
Copy link
Collaborator Author

annihei commented Oct 24, 2022

Thanks, got the 0.17.4 and running the analysis now.

@mcanouil
Copy link
Owner

Ok, let me know if there is still an issue or if it goes smoothly, and in that case I'll release v0.17.4..

@annihei
Copy link
Collaborator Author

annihei commented Oct 24, 2022

Will do, with use_info=TRUE it took about 2,5 days to finish.

@mcanouil mcanouil reopened this Oct 24, 2022
@mcanouil
Copy link
Owner

I'll keep the issue open until you can confirmed it is fixed then.

@annihei
Copy link
Collaborator Author

annihei commented Oct 27, 2022

This worked! I quickly looked at the results, seems like everything else is OK I just need to add CALL_RATE.

@mcanouil
Copy link
Owner

Great! I'll start the release process for v0.17.4.

I don't think the CALL_RATE is really needed though.
In theory, your data have been QCed, this implies a call rate threshold, thus the column will be between the threshold and one, which makes it not really useful for any postprocessing step.

mcanouil added a commit that referenced this issue Dec 5, 2022
@mcanouil mcanouil added this to the v1.0.0 milestone Jan 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants