Welcome to Dr. Wheeler's lab! This tutorial introduces essential bioinformatics tools for GWAS and statistical analysis used in the lab. See the Wiki tab to begin! Students will learn to use PLINK 2 to handle large genotype datasets and run GWAS, R to transform data, perform statistical validation, and visualize results, and LocusZoom to produce regional association plots. The workshop offers a step-by-step guide with instructions for both Windows and Mac users.
- R
- RStudio
- PLINK 2
- LocusZoom (web access)
- data.table
- dplyr
- ggplot2
- RNOmni
- qqman
- Genotype data
  - .bed: binary file of genotype calls
  - .bim: SNP map file (chromosome, variant ID, position, alleles)
  - .fam: sample information (family ID, individual ID, paternal and maternal IDs, sex, phenotype)
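To make the .fam layout concrete, here is a minimal sketch in Python (the tutorial itself works in PLINK and R) that splits one .fam line into its six whitespace-separated columns. The `FamRecord` type, the `parse_fam_line` helper, and the example sample IDs are hypothetical names chosen for illustration; the sex coding (1 = male, 2 = female, anything else = unknown) follows the PLINK file format.

```python
from typing import NamedTuple


class FamRecord(NamedTuple):
    """One sample row from a PLINK .fam file (hypothetical helper type)."""
    family_id: str
    individual_id: str
    paternal_id: str
    maternal_id: str
    sex: str
    phenotype: str


# PLINK codes sex as 1 = male, 2 = female; other values mean unknown.
SEX_CODES = {"1": "male", "2": "female"}


def parse_fam_line(line: str) -> FamRecord:
    """Split a whitespace-delimited .fam line into its six columns."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, found {len(fields)}")
    fid, iid, pat, mat, sex, pheno = fields
    return FamRecord(fid, iid, pat, mat, SEX_CODES.get(sex, "unknown"), pheno)


# Made-up sample line, not taken from the tutorial data.
example = parse_fam_line("FAM1 IND1 0 0 2 -9")
print(example)
```

In the tutorial workflow the phenotype comes from the separate phenotype file, so the last .fam column may simply hold a missing-value code.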
- Phenotype file: a text file containing phenotype values. Used to assess and normalize phenotype distributions.
  - tutorial_phenotype.txt
  - problem_set_phenotype.txt
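The software list includes RNOmni, which suggests the phenotype is normalized with a rank-based inverse normal transform. The sketch below shows that idea in plain Python, assuming the Blom offset (k = 0.375) and average ranks for ties. The function name `rank_normalize` and the example values are hypothetical, and this is not the tutorial's actual R code.

```python
from statistics import NormalDist


def rank_normalize(values, k=0.375):
    """Rank-based inverse normal transform (sketch, Blom offset by default).

    Each value is replaced by the standard normal quantile of
    (rank - k) / (n - 2k + 1); tied values share their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        # Find the end of the run of tied values starting at pos.
        end = pos
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for j in range(pos, end + 1):
            ranks[order[j]] = average_rank
        pos = end + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - k) / (n - 2 * k + 1)) for r in ranks]


# Made-up, right-skewed phenotype values with one extreme outlier.
skewed = [0.2, 0.5, 0.9, 1.4, 30.0]
transformed = rank_normalize(skewed)
print([round(z, 3) for z in transformed])
```

Because only the ranks are used, the outlier is pulled in to an ordinary normal quantile, which is the main reason to normalize a skewed phenotype before linear regression.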
- Population info file: contains population group assignments. Used for PCA plots and visualizing population structure.
- PLINK outputs
  - plink2.prune.in / plink2.prune.out: lists of SNPs retained and excluded after LD pruning.
  - plink2.eigenvec: principal component values per individual.
  - plink2.eigenval: eigenvalues for each principal component.
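One common use of the eigenvalues is to label PCA plot axes with the share of variance each component captures. The sketch below assumes the .eigenval file holds one eigenvalue per line; the helper names and the example numbers are hypothetical. Note that the file lists only the components that were requested, so these shares are relative to the reported components, not to the total genetic variance.

```python
def parse_eigenval(lines):
    """Convert the lines of an .eigenval file into a list of floats."""
    return [float(line) for line in lines if line.strip()]


def variance_shares(eigenvalues):
    """Return each eigenvalue as a fraction of the sum of all reported ones."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive number")
    return [value / total for value in eigenvalues]


# Made-up eigenvalues; with a real file, pass the open file handle instead:
#     with open("plink2.eigenval") as handle:
#         eigenvalues = parse_eigenval(handle)
example_lines = ["8.0\n", "4.0\n", "2.0\n", "1.0\n", "1.0\n"]
shares = variance_shares(parse_eigenval(example_lines))
for pc, share in enumerate(shares, start=1):
    print(f"PC{pc}: {share:.1%} of the variance in the reported PCs")
```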
- GWAS results
  - sampleGWAS.log: log of the GWAS run.
  - sampleGWAS.RNphenotype.glm.linear: main results file with regression output per SNP.
  - sampleGWAS.RNphenotype.glm.linear.adjusted: p-values adjusted for multiple testing.
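To show what "adjusted" means in practice, here is a sketch of two widely used multiple-testing corrections, Bonferroni and Benjamini-Hochberg. Which corrections the tutorial relies on is not stated here, so treat this as an illustration of the general idea; the function names and p-values are made up.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (false discovery rate control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over this rank and all higher ranks.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Made-up p-values for four SNPs.
raw = [0.01, 0.04, 0.03, 0.5]
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
for p, b, f in zip(raw, bonf, bh):
    print(f"raw={p:.3g}  bonferroni={b:.3g}  bh={f:.3g}")
```

Bonferroni is the stricter of the two: it controls the chance of any false positive, while Benjamini-Hochberg controls the expected fraction of false positives among the reported hits.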
- R output
  - sample_phenotype_RNphenotype.txt: normalized phenotype
  - Manhattan plot
  - QQ plot
  - PCA plots
  - Box plot
- LocusZoom output
  - Regional plot showing linkage disequilibrium and top SNPs