witsGWAS is a simple human GWAS analysis workflow, built at the Sydney Brenner Institute, for data quality control (QC) and basic association testing. Instead of requiring users to enter individual commands at the Unix prompt, it organizes GWAS tasks sequentially (facilitated via Ruffus) for submission to a distributed PBS Torque cluster (managed via Rubra). witsGWAS uses flag files to monitor the progress of the jobs it submits to the cluster on the user's behalf, courteously waiting for one job to finish before sending the next.
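The flag-file idea described above can be illustrated with a minimal Python sketch. It is not the witsGWAS, Ruffus, or Rubra implementation: the function names, the stage names, and the `.Success` flag suffix are assumptions made for illustration, and each "job" is a plain Python callable rather than a cluster submission.

```python
import tempfile
from pathlib import Path


def run_stage(name, action, flag_dir):
    """Run `action` unless its flag file already exists; return True if it ran."""
    flag = Path(flag_dir) / f"{name}.Success"  # hypothetical flag naming
    if flag.exists():
        return False  # stage completed in an earlier run, so skip it
    action()      # stands in for "submit the job and wait for it to finish"
    flag.touch()  # written only after the action returned without raising
    return True


def run_pipeline(stages, flag_dir):
    """Run stages strictly in order; return the names that actually executed."""
    executed = []
    for name, action in stages:
        if run_stage(name, action, flag_dir):
            executed.append(name)
    return executed


log = []
stages = [
    ("sample_qc", lambda: log.append("sample_qc")),
    ("snp_qc", lambda: log.append("snp_qc")),
    ("association", lambda: log.append("association")),
]

with tempfile.TemporaryDirectory() as flag_dir:
    first_run = run_pipeline(stages, flag_dir)   # every stage runs once
    second_run = run_pipeline(stages, flag_dir)  # flags exist, nothing reruns
```

Because a flag is written only after a stage succeeds, rerunning the pipeline after a failure would resume at the first stage that has no flag.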
Installation instructions, examples, and tutorials for witsGWAS can be accessed at the witsGWAS_wiki.
QC of Affymetrix array data (SNP6 raw .CEL files)
- genotype calling
- converting birdseed calls to PLINK format
Sample and SNP QC of PLINK Binaries
Sample QC tasks checking:
- discordant sex information
- sample missingness
- heterozygosity scores
- relatedness
- divergent ancestry
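As a rough illustration of two of the sample checks listed above, the sketch below computes a per-sample missingness rate and heterozygosity rate, then flags heterozygosity outliers. It is not the pipeline's own code (witsGWAS runs these checks on PLINK binaries). The 0/1/2 genotype coding, the function names, and the 3 standard deviation cut-off are assumptions chosen for the example.

```python
from statistics import mean, stdev


def sample_metrics(genotypes):
    """Return (missing_rate, het_rate) for one sample.

    Genotypes are coded 0/1/2 (copies of one allele); None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("sample has no called genotypes")
    missing_rate = (len(genotypes) - len(called)) / len(genotypes)
    het_rate = called.count(1) / len(called)  # heterozygous calls / called genotypes
    return missing_rate, het_rate


def het_outliers(het_rates, n_sd=3.0):
    """Return IDs of samples whose het rate lies more than n_sd SDs from the mean."""
    values = list(het_rates.values())
    centre = mean(values)
    spread = stdev(values)
    return sorted(s for s, h in het_rates.items() if abs(h - centre) > n_sd * spread)


# Toy data: one sample with 2 of 10 genotypes missing and 3 heterozygous calls.
example = sample_metrics([0, 1, 1, 2, None, 0, 1, 0, None, 2])

# Toy cohort: 20 samples with a typical het rate and one with an inflated rate.
het_rates = {f"S{i:02d}": 0.30 for i in range(20)}
het_rates["S20"] = 0.60
flagged = het_outliers(het_rates)
```

In practice the cut-offs are study-specific and are usually chosen after inspecting the distribution of both metrics.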
SNP QC tasks checking:
- minor allele frequencies
- SNP missingness
- differential missingness
- Hardy-Weinberg equilibrium deviations
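To make three of the SNP checks above concrete, here is a small Python sketch that computes a SNP's missingness, minor allele frequency, and a chi-square statistic for deviation from Hardy-Weinberg equilibrium. It only illustrates the quantities involved. The pipeline relies on PLINK for these checks, and PLINK's Hardy-Weinberg test is, as far as I know, an exact test rather than this chi-square approximation. The 0/1/2 coding and the function name are assumptions.

```python
def snp_qc_metrics(genotypes):
    """Return (missing_rate, maf, hwe_chi2) for one SNP.

    Genotypes are coded 0/1/2 (copies of allele B); None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        raise ValueError("SNP has no called genotypes")
    missing_rate = (len(genotypes) - n) / len(genotypes)

    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    maf = min(p, q)

    # Genotype counts expected under Hardy-Weinberg equilibrium.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    hwe_chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return missing_rate, maf, hwe_chi2


# Toy SNP in exact Hardy-Weinberg proportions (p = 0.7), with 25 missing calls.
genotypes = [0] * 49 + [1] * 42 + [2] * 9 + [None] * 25
missing_rate, maf, hwe_chi2 = snp_qc_metrics(genotypes)
```

For this toy SNP the observed counts equal the expected ones, so the statistic is essentially zero. A large value would suggest a deviation worth investigating, for example a genotyping artefact.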
Association testing
- Basic PLINK association tests, producing Manhattan and QQ plots
- CMH association test - association analysis accounting for clusters
- permutation testing
- logistic regression
- EMMAX association testing
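For readers unfamiliar with a basic case-control association test, the sketch below computes a 1-degree-of-freedom allelic chi-square statistic, its p-value, and the odds ratio from a 2x2 table of allele counts. It illustrates the general technique only and is not taken from witsGWAS or PLINK. The function name and the toy counts are assumptions.

```python
import math


def allelic_test(case_a, case_b, control_a, control_b):
    """Allelic chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value, odds_ratio) for allele A in cases versus controls.
    """
    total = case_a + case_b + control_a + control_b
    margins = (
        (case_a + case_b)
        * (control_a + control_b)
        * (case_a + control_a)
        * (case_b + control_b)
    )
    if margins == 0:
        raise ValueError("a row or column of the table is empty")
    chi2 = total * (case_a * control_b - case_b * control_a) ** 2 / margins
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    if case_b == 0 or control_a == 0:
        odds_ratio = math.inf
    else:
        odds_ratio = (case_a * control_b) / (case_b * control_a)
    return chi2, p_value, odds_ratio


# Toy counts: allele A seen 60 times in cases and 40 times in controls,
# out of 100 alleles in each group.
chi2, p_value, odds_ratio = allelic_test(60, 40, 40, 60)
```

A genome-wide scan repeats a test like this for every SNP, which is why results are usually summarised with Manhattan and QQ plots and judged against a multiple-testing threshold.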
The pipeline has been 'dockerized', which simplifies its use. See the Dockerized section on the witsGWAS wiki for more information.
Lerato E. Magosi, Scott Hazelhurst, Rob Clucas and the WITS Bioinformatics team
witsGWAS is offered under the MIT license. See LICENSE.txt.
Anderson, C. et al. (2010): Data quality control in genetic case-control association studies. Nature Protocols 5, 1564-1573.
Sloggett, Clare; Wakefield, Matthew; Philip, Gayle; Pope, Bernard (2014): Rubra - flexible distributed pipelines. figshare. http://dx.doi.org/10.6084/m9.figshare.895626