# Genome-Wide Association Study (GWAS)

## Brief Description:
A genome-wide association study (abbreviated GWAS) is a research approach used to identify genomic variants that are statistically associated with the risk of a disease or with a particular trait. The method involves surveying the genomes of many people, looking for genomic variants that occur more frequently in those with a specific disease or trait than in those without it. Once such genomic variants are identified, they are typically used to search for nearby variants that contribute directly to the disease or trait. (Source: [NHGRI Genetics Glossary](https://www.genome.gov/genetics-glossary/Genome-Wide-Association-Studies))
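To make the "occurs more frequently in cases than in controls" idea concrete, here is a minimal sketch of a single-variant allelic test in plain Python. It is an illustration only, not the method UF-GWAS or Hail uses: the function name `allelic_chi_square` and the allele counts are hypothetical, and a real GWAS typically fits a regression model with covariates at every variant.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are cases/controls, columns are alternate/reference allele counts.
    Returns (chi2, p_value).
    """
    observed = ((case_alt, case_ref), (ctrl_alt, ctrl_ref))
    row_totals = (case_alt + case_ref, ctrl_alt + ctrl_ref)
    col_totals = (case_alt + ctrl_alt, case_ref + ctrl_ref)
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column of the table must be non-empty")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if allele frequency were independent of disease status
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: the alternate allele is more common in cases (30%) than controls (20%)
chi2, p = allelic_chi_square(case_alt=300, case_ref=700, ctrl_alt=200, ctrl_ref=800)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

A GWAS repeats a test of this kind across all genotyped variants, which is why multiple-testing correction and parallel execution matter.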

## Methodology:

There are several steps one must take in order to perform a GWAS. These steps are summarized in the following image ([Uffelmann et al. 2021](https://www.nature.com/articles/s43586-021-00056-9)).

<img src="https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fs43586-021-00056-9/MediaObjects/43586_2021_56_Fig1_HTML.png?as=webp" alt="GWAS Image" style="height: 1000px; width:1000px;"/>

Each of the above steps requires careful consideration and planning to produce replicable, actionable results. Errors in any step can not only lead to incorrect results but can ultimately hinder proper patient care.

## Importance:

The primary reason to perform a GWAS is to find statistical associations between specific physical locations in the genome and a particular trait or phenotype. Doing so can:
* uncover biological mechanisms affecting the phenotype
* enable prediction of a phenotype from genomic information

The results obtained from GWAS can also benefit:
* **medicine** by leading to molecular or environmental interventions against harmful phenotypes
* **biotechnology** by improving the ways we utilize microbes, plants or animals
* **forensics** by enabling more accurate identification of an individual from a DNA sample
* **biogeographic ancestry inference** of individuals, populations and species
* our understanding of the role of natural selection and other evolutionary forces in the living world

Source: [What is a GWAS?](https://www.mv.helsinki.fi/home/mjxpirin/GWAS_course/material/GWAS1.pdf)

## What this tool aims to do:

UF-GWAS aims to streamline Quality Control, Imputation, and Association Testing (steps C-E in the above methodology diagram) in order to improve GWAS analyses. Specifically, UF-GWAS uses a Python library called Hail to parallelize operations and uses Jupyter notebooks to make analyses more user-friendly, reproducible, and shareable. Check out the "**gwas_hail_tutorial.ipynb**" notebook for a guided walkthrough of how to use UF-GWAS to conduct a GWAS.
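As a rough illustration of what the Quality Control step checks, the sketch below computes two common per-variant metrics, call rate and minor allele frequency (MAF), in plain Python. It does not use the Hail or UF-GWAS API: the function names `variant_qc` and `passes_qc` are hypothetical, and the thresholds are illustrative assumptions rather than values taken from this tool.

```python
def variant_qc(genotypes):
    """Return (call_rate, maf) for one variant.

    genotypes: one entry per sample, the number of alternate alleles
    (0, 1 or 2), or None when the genotype is missing.
    """
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes) if genotypes else 0.0
    if not called:
        return call_rate, 0.0
    # Each called diploid sample contributes two alleles
    alt_freq = sum(called) / (2 * len(called))
    return call_rate, min(alt_freq, 1 - alt_freq)


def passes_qc(genotypes, min_call_rate=0.95, min_maf=0.01):
    """Keep a variant only if it is well genotyped and not too rare.

    The default thresholds are illustrative assumptions.
    """
    call_rate, maf = variant_qc(genotypes)
    return call_rate >= min_call_rate and maf >= min_maf


# Hypothetical genotypes for three variants across 100 samples
variants = {
    "common": [0, 1, 2, 1] * 25,           # fully called, MAF 0.5
    "rare": [0] * 99 + [1],                # fully called, MAF 0.005
    "poorly_called": [None] * 20 + [1] * 80,  # call rate 0.8
}
kept = [name for name, gts in variants.items() if passes_qc(gts)]
print(kept)
```

Applying the same filters to millions of variants and thousands of samples is where a distributed library such as Hail becomes useful.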

## Relevant Links/Publications:

1. [Genome-wide association studies - Nature Reviews Methods Primers](https://www.nature.com/articles/s43586-021-00056-9)
2. [GWAS Catalog - Catalog of previous GWAS](https://www.ebi.ac.uk/gwas/)
3. [NIH GWAS - NIH's information page](https://www.genome.gov/genetics-glossary/Genome-Wide-Association-Studies)
4. [GWAS - Wikipedia](https://en.wikipedia.org/wiki/Genome-wide_association_study)