Error with binary GWAS encoding #298

@hde08

Hello,

Thank you very much for this great tool. I have been using GEMMA without issues for a while. I now want to run a GWAS on a binary trait, which the GEMMA manual lists as supported, but it does not seem to work properly.

I have encoded the binary trait as 0/1 (control/case), as outlined in the manual. However, when I run the GWAS model (here lmm, but the same occurs irrespective of model type), all individuals encoded as 0 are treated as missing and GEMMA crashes (see below):

```
##GEMMA 0.98.5 (2021-08-25) by Xiang Zhou, Pjotr Prins and team (C) 2012-2021
##Reading Files ...
##number of total individuals = 624
##number of analyzed individuals = 107
##number of covariates = 1
##number of phenotypes = 1
##number of total SNPs/var        =   440918
##number of analyzed SNPs         =   440669
##Start Eigen-Decomposition...
##pve estimate =2.4726e-06
##se(pve) =-nan
**** WARNING: Brent did not converge               5%
**** WARNING: Brent did not converge
**** WARNING: Brent did not converge
**** WARNING: Brent did not converge
**** WARNING: Brent did not converge
**** WARNING: Brent did not converge
**** WARNING: Brent did not converge
**** WARNING: Brent did not converge
``` 
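Because the log reports only 107 analyzed individuals, one quick check is to tally the phenotype codes that are actually stored in the `.fam` file passed via `-bfile`. The sketch below assumes a standard six-column PLINK `.fam` layout with the phenotype in column 6; the function name and the example rows are hypothetical. Note that PLINK's case/control convention is 1 = control, 2 = case, with 0 and -9 as missing, so a 0/1 coding can be reinterpreted if the file passed through PLINK. The log alone does not establish where the 0-coded samples are dropped.

```python
from collections import Counter


def count_phenotype_codes(fam_lines):
    """Tally the values found in column 6 (phenotype) of PLINK .fam lines."""
    counts = Counter()
    for line in fam_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        counts[fields[5]] += 1
    return counts


# Hypothetical rows: FID IID father mother sex phenotype
example = [
    "F1 S1 0 0 0 1",
    "F2 S2 0 0 0 0",
    "F3 S3 0 0 0 -9",
    "F4 S4 0 0 0 1",
]
example_counts = count_phenotype_codes(example)
print(dict(example_counts))
```

On a real file, an open file handle for the `.fam` can be passed directly in place of the list. For this dataset the tally should account for 107 cases, 104 controls, and 413 individuals without a phenotype (624 total minus 211).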

107 is the number of cases only; the total sample size is 211. I managed to run it by encoding the binary trait as 1/2 (control/case), but I suspect GEMMA is not fitting the model correctly (see below):

```
##
## GEMMA Version    = 0.98.5 (2021-08-25)
## Build profile    = /gnu/store/8rvid272yb53bgascf5c468z0jhsyflj-profile
## GCC version      = 7.5.0
## GSL Version      = 2.6
## OpenBlas         = OpenBLAS 0.3.9  - OpenBLAS 0.3.9 DYNAMIC_ARCH NO_AFFINITY SkylakeX MAX_THREADS=128
##   arch           = SkylakeX
##   threads        = 1
##   parallel type  = threaded
##
## Command Line Input = /home/hdenis/Programs/gemma-0.98.5-linux-static-AMD64 -bfile /nvme/disk0/lecellier_data/WGS_GBR_NC_data/Vcf_files/GWAS_files/Imputed_files/aspat_clean_all_chr_SNP_filtered_2_GQ15_0mis_imputed_maf0.05_maxgeno95_Group1_relat0.2_geno_pheno_binary -k /nvme/disk0/lecellier_data/WGS_GBR_NC_data/Vcf_files/GWAS_files/Imputed_files/aspat_clean_all_chr_SNP_filtered_2_GQ15_0mis_imputed_maf0.05_maxgeno95_Group1_relat0.2_binary.cXX.txt -lmm 1 -outdir /nvme/disk0/lecellier_data/WGS_GBR_NC_data/Assoc_studies_outputs/GEMMA/ -o Group1_GQ15_relat0.2_In_situ_mortality_binary_lmm 
##
## Date = Fri Aug  1 20:35:04 2025
##
## Summary Statistics:
## number of total individuals = 624
## number of analyzed individuals = 211
## number of covariates = 1
## number of phenotypes = 1
## number of total SNPs/var = 440918
## number of analyzed SNPs/var = 440865
## REMLE log-likelihood in the null model = -152.894
## MLE log-likelihood in the null model = -150.163
## pve estimate in the null model = 2.48451e-06
## se(pve) in the null model = 0.580001
## vg estimate in the null model = 2.51139e-06
## ve estimate in the null model = 0.251139
## beta estimate in the null model =   1.50711
## se(beta) =   0.0344997
##
## Computation Time:
## total computation time = 0.638227 min 
## computation time break down: 
##      time on eigen-decomposition = 8.90167e-05 min 
##      time on calculating UtX = 0.0119906 min 
##      time on optimization = 0.486226 min 
##
```
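As a sanity check on the null-model numbers above, the sketch below recomputes the mean, sample variance, and standard error of the mean for a 1/2-coded trait with 107 cases and 104 controls (211 minus 107). It assumes the linear mixed model fits the 1/2 codes as a quantitative outcome with an intercept only; the variable names are illustrative. The results land close to the reported `beta` (1.50711), `ve` (0.251139), and `se(beta)` (0.0344997), which is what one would expect if the binary trait is handled as a continuous value. That agreement alone does not indicate that the model is misbehaving.

```python
import math

n_cases = 107
n_controls = 211 - n_cases  # 104 controls, derived from the totals in the post

# Phenotype coded 1 = control, 2 = case
phenotype = [1.0] * n_controls + [2.0] * n_cases
n = len(phenotype)

mean = sum(phenotype) / n
# Unbiased sample variance (divides by n - 1)
variance = sum((y - mean) ** 2 for y in phenotype) / (n - 1)
# Standard error of the mean under an intercept-only model
se_mean = math.sqrt(variance / n)

print(f"mean = {mean:.5f}, variance = {variance:.5f}, se = {se_mean:.5f}")
```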

Can you advise how to fix this issue? Thank you very much!

Hugo
