Performing a genome-wide association study (GWAS) can be a laborious task. In this repository, we describe the pipeline that we adapted to carry out a GWAS using ddRAD-seq-derived SNPs. Below, we outline the steps needed to run the analysis and provide the code necessary to reproduce this work.
- Stacks
- BWA-Mem
- SAMtools
- BCFtools
- GATK-HaplotypeCaller
- Freebayes
- Plink
- R and RStudio Desktop:
- ggplot2
- GAPIT
- CMplot
- SmartPCA
Step 1. ddRAD-Sequencing
Step 2. Data processing
- De-multiplexing: ------------------------> stacks.sh
- QC filtering: ------------------------> trimmomatic.sh
- Duplicate read removal: ------------------------> dedup.sh
- Reads mapping: ------------------------> align_bwa.mk
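As a rough illustration of the read-mapping step, the sketch below only assembles the BWA-MEM and SAMtools command lines for one sample and prints them; it does not run anything. It is not the content of align_bwa.mk. The file names, the sample name, the thread count, the read-group string, and the assumption of paired-end reads are all hypothetical.

```python
import shlex
from typing import List


def mapping_commands(reference: str, reads_1: str, reads_2: str,
                     sample: str, threads: int = 4) -> List[str]:
    """Build (but do not execute) shell commands to map one paired-end sample."""
    bam = f"{sample}.sorted.bam"  # hypothetical output name
    # Read group with sample name; variant callers such as GATK expect one.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}"
    align = ["bwa", "mem", "-t", str(threads), "-R", read_group,
             reference, reads_1, reads_2]
    # "-" tells samtools sort to read the alignments from standard input.
    sort = ["samtools", "sort", "-@", str(threads), "-o", bam, "-"]
    index = ["samtools", "index", bam]
    return [f"{shlex.join(align)} | {shlex.join(sort)}", shlex.join(index)]


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    for command in mapping_commands("reference.fa", "s1_R1.fq.gz",
                                    "s1_R2.fq.gz", "s1"):
        print(command)
```

Printing the commands first makes it easy to inspect them before pasting them into a shell script or a Makefile rule.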
Step 3. Variant calling and filtering
- BCFtools calling: ------------------------> bcftoolsCall.mk
- Freebayes calling: ------------------------> freebayesCall.mk
- GATK calling: ------------------------> gatkCall.mk
- MAF filtering and imputation: ------------------------> maf_imputation.mk
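To make the MAF filtering and imputation step concrete, here is a minimal sketch on a toy genotype table. It is not the logic of maf_imputation.mk. It assumes genotypes coded as alternate-allele dosages (0, 1, 2, or None for missing), a MAF threshold of 0.05, and simple mean imputation; the pipeline itself may use a different threshold and a different imputation method.

```python
from typing import Dict, List, Optional

Dosages = List[Optional[int]]  # one SNP across samples: 0, 1, 2, or None


def minor_allele_frequency(dosages: Dosages) -> float:
    """MAF computed from called genotypes only (diploid samples assumed)."""
    called = [d for d in dosages if d is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def mean_impute(dosages: Dosages) -> List[float]:
    """Replace missing calls with the mean dosage of the called samples."""
    called = [d for d in dosages if d is not None]
    mean = sum(called) / len(called)
    return [mean if d is None else float(d) for d in dosages]


def filter_and_impute(snps: Dict[str, Dosages],
                      maf_threshold: float = 0.05) -> Dict[str, List[float]]:
    """Keep SNPs with MAF >= threshold (assumed value), then impute them."""
    kept = {}
    for name, dosages in snps.items():
        has_calls = any(d is not None for d in dosages)
        if has_calls and minor_allele_frequency(dosages) >= maf_threshold:
            kept[name] = mean_impute(dosages)
    return kept


if __name__ == "__main__":
    toy = {  # hypothetical SNP names and genotypes
        "snp1": [0, 1, 2, None],
        "snp2": [0, 0, 0, 0],      # monomorphic, removed by the MAF filter
        "snp3": [0, 0, None, 1],
    }
    print(filter_and_impute(toy))
```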
Step 4. GWAS analysis
- Phenotypic data assessment: ------------------------> trait transformations
- PCA analysis
- Relatedness
- Statistical model assessment
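One transformation often applied to skewed traits before association testing is the rank-based inverse normal transformation. The sketch below shows it as an example of what a trait transformation can look like; whether this particular transformation is the one used in this pipeline is an assumption, and the offset c = 0.5 is one common choice among several.

```python
from statistics import NormalDist
from typing import List


def inverse_normal_transform(values: List[float], c: float = 0.5) -> List[float]:
    """Map trait values to normal quantiles based on their ranks.

    Tied values receive the average of their ranks. The offset c (assumed
    0.5 here) keeps the probabilities strictly between 0 and 1.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


if __name__ == "__main__":
    trait = [10.0, 200.0, 35.0, 12.0, 11.0]  # hypothetical skewed trait
    print(inverse_normal_transform(trait))
```

The transformed values keep the ordering of the original trait, while extreme observations are pulled toward the bulk of the distribution.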