The hogwash R package is a phylogenetically informed, convergence-based method for performing genome-wide association studies (GWAS) in bacteria. In short, the user supplies a phylogenetic tree, a phenotype (either binary or continuous), and a genotype (a binary matrix). hogwash returns the genotypes that are significantly associated with the phenotype after correcting for multiple testing, requiring convergence, and accounting for the clonal structure of the population.
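To make the three inputs concrete, here is a minimal sketch of what toy versions of them might look like in R. The sample names, locus names, and phenotype names are hypothetical. Storing the phenotype as a one-column matrix whose row names match the tree tip labels is an assumption made for illustration; the vignette documents the exact requirements. The tree is built only if the ape package is installed.

```r
# Toy inputs for illustration only; all names below are hypothetical.
set.seed(1)
samples <- paste0("isolate_", 1:6)
loci <- paste0("variant_", 1:4)

# Genotype: a binary (0/1) matrix, one row per sample, one column per locus.
geno <- matrix(
  rbinom(length(samples) * length(loci), size = 1, prob = 0.5),
  nrow = length(samples),
  dimnames = list(samples, loci)
)

# Binary phenotype: a single column of 0/1 values (assumed layout).
pheno_binary <- matrix(
  rbinom(length(samples), size = 1, prob = 0.5),
  ncol = 1,
  dimnames = list(samples, "resistant")
)

# Continuous phenotype: a single column of numeric values (assumed layout).
pheno_continuous <- matrix(
  rnorm(length(samples)),
  ncol = 1,
  dimnames = list(samples, "growth_rate")
)

# Tree: a random tree whose tips are labelled with the sample names.
# Skipped when ape is not installed.
if (requireNamespace("ape", quietly = TRUE)) {
  tree <- ape::rtree(n = length(samples), tip.label = samples)
  # Every tip should correspond to a row of the genotype matrix.
  stopifnot(setequal(tree$tip.label, rownames(geno)))
}

# Basic sanity checks on the inputs before running any analysis.
stopifnot(
  all(geno %in% c(0, 1)),
  identical(rownames(geno), rownames(pheno_binary)),
  identical(rownames(geno), rownames(pheno_continuous))
)
```

Random data like this will not show any real association; it only illustrates the expected shapes. Real analyses would read the tree and matrices from files instead.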
Install the package
Please check out the wiki or vignette for a brief primer on bacterial GWAS, detailed descriptions of the algorithms, and example data with results.
To learn about using hogwash on your bacterial data, please read our paper, "Hogwash: three methods for genome-wide association studies in bacteria". It describes the algorithms and their performance on simulated data. The simulated data used in the paper were generated using the code in this repository.