# Visualize GWAS results with Hail

This notebook shows how to visualize GWAS results in Hail using a Q-Q plot and a Manhattan plot. For guidance on launch specs for the JupyterLab with Spark Cluster app at different data sizes, see the documentation: https://documentation.dnanexus.com/science/using-hail-to-analyze-genomic-data

Pre-conditions for running this notebook successfully:
- There is an existing GWAS results Hail Table in DNAX (see *GWAS with Hail*)

## 1) Initiate Spark and Hail

In [None]:
# Running this cell will output a red-colored message; this is expected.
# The 'Welcome to Hail' message in the output indicates that Hail is ready to use in the notebook.

from pyspark.sql import SparkSession
import hail as hl

builder = (
    SparkSession
    .builder
    .enableHiveSupport()
)
spark = builder.getOrCreate()
hl.init(sc=spark.sparkContext)

## 2) Read GWAS results Table

In [None]:
# Define GWAS results Table URL

tb_url = "dnax://database-GFpXJ5j0vzZxPZQ2Ggf14x7q/gwas.ht"

In [None]:
# Read GWAS results Table

gwas_tb = hl.read_table(tb_url)

In [None]:
# View structure of Table

gwas_tb.describe()

## 3) Import Bokeh to visualize plots

Bokeh is a Python visualization library that is already included in this JupyterLab environment, so it can be imported directly.

In [None]:
from bokeh.io import output_notebook, show

# Configure Bokeh to render plots inline in the notebook
output_notebook()

## 4) Create Q-Q plot

Additional documentation: https://hail.is/docs/0.2/plot.html#hail.plot.qq

In [None]:
qq_plot = hl.plot.qq(gwas_tb.p_value)

In [None]:
show(qq_plot)
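
To make it clearer what the Q-Q plot displays, here is a minimal sketch in plain Python that pairs observed and expected `-log10(p)` values. It is an illustration only, not Hail's implementation: the helper name `qq_points` is hypothetical, and the `k / (n + 1)` expected quantile is one common convention that may differ from the formula Hail uses internally.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs, sorted ascending.

    Hypothetical helper for illustration; assumes every p-value is in (0, 1].
    """
    # Observed values on the -log10 scale, weakest association first.
    observed = sorted(-math.log10(p) for p in p_values)
    n = len(observed)
    # Under the null hypothesis p-values are uniform, so the k-th smallest
    # of n p-values is expected near k / (n + 1) (an assumed convention).
    expected = sorted(-math.log10(k / (n + 1)) for k in range(1, n + 1))
    return list(zip(expected, observed))


# Made-up p-values, not taken from the GWAS results Table.
points = qq_points([0.5, 0.05, 0.005])
for expected, observed in points:
    print(f"expected={expected:.3f} observed={observed:.3f}")
```

Points that rise above the diagonal (observed greater than expected) indicate stronger associations than the null hypothesis predicts.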

## 5) Create Manhattan plot

Additional documentation: https://hail.is/docs/0.2/plot.html#hail.plot.manhattan

In [None]:
manhattan_plot = hl.plot.manhattan(gwas_tb.p_value)

In [None]:
show(manhattan_plot)
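
A Manhattan plot places each variant at its genomic position with `-log10(p)` on the y-axis, so the strongest associations appear as the tallest peaks. The sketch below, in plain Python, shows that transformation and filters variants against the conventional genome-wide significance threshold of 5e-8. The helper name `significant_hits`, the variant labels, and the p-values are all hypothetical and are not taken from the GWAS results Table.

```python
import math

# Conventional genome-wide significance threshold (an assumption of this sketch).
GENOME_WIDE_P = 5e-8


def significant_hits(results, threshold=GENOME_WIDE_P):
    """Return (label, -log10 p) for rows with p below threshold, strongest first.

    `results` is an iterable of (label, p_value) pairs; non-positive p-values
    are skipped because -log10 is undefined for them.
    """
    hits = [
        (label, -math.log10(p))
        for label, p in results
        if 0 < p < threshold
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up example rows in "contig:position" form.
example_results = [("1:1000", 1e-9), ("2:2000", 0.03), ("3:3000", 4e-8)]
hits = significant_hits(example_results)
for label, score in hits:
    print(f"{label}\t-log10(p)={score:.2f}")
```

On the plot, these are the points that fall above the horizontal significance line.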