Incorporating TADs into GWAS Analysis - TAD_Pathways

IMPORTANT: This repository is no longer maintained. In addition to running a TAD_Pathways analysis for Bone Mineral Density GWAS, this analysis pipeline downloads genomic data and explores distributions across TADs. A more streamlined analysis pipeline that does not require data preprocessing is available at https://github.com/greenelab/tad_pathways_pipeline

NOTE: Several files built from this pipeline are used in the above repository including the TAD based gene index and the hg19 converted NHGRI-EBI GWAS Catalog.

Gregory P. Way and Casey S. Greene 2016

Summary

This repository contains methods for manipulating, exploring, and visualizing topologically associating domains (TADs) in the context of SNPs, genes, and repeat elements for the human (hg19) and mouse (mm9) genomes.

The repository also proposes methods and tools for incorporating TADs into the prioritization of GWAS signals, using publicly available GWAS data. We introduce TAD_Pathways as a method to identify likely causal genes from GWAS independent of their distance to the sentinel SNP.
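The idea of selecting genes by shared TAD rather than by distance to the SNP can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: the TAD intervals, gene coordinates, gene names, and function names below are all hypothetical, and a single chromosome with sorted, non-overlapping TADs is assumed.

```python
from bisect import bisect_right

# Hypothetical TAD intervals on one chromosome: (start, end), sorted, non-overlapping.
TADS = [(0, 1_000_000), (1_000_000, 2_500_000), (2_500_000, 4_000_000)]

# Hypothetical genes on the same chromosome: name -> (start, end).
GENES = {
    "GENE_A": (200_000, 250_000),
    "GENE_B": (1_100_000, 1_150_000),
    "GENE_C": (2_300_000, 2_400_000),
    "GENE_D": (3_000_000, 3_050_000),
}


def find_tad(position, tads):
    """Return the (start, end) TAD containing position, or None."""
    starts = [start for start, _ in tads]
    idx = bisect_right(starts, position) - 1
    if idx >= 0 and tads[idx][0] <= position < tads[idx][1]:
        return tads[idx]
    return None


def genes_in_tad(tad, genes):
    """Return the sorted names of genes that overlap the TAD interval."""
    tad_start, tad_end = tad
    return sorted(
        name for name, (g_start, g_end) in genes.items()
        if g_start < tad_end and g_end > tad_start
    )


def tad_gene_list(snp_positions, tads, genes):
    """Collect every gene that shares a TAD with at least one SNP."""
    selected = set()
    for position in snp_positions:
        tad = find_tad(position, tads)
        if tad is not None:
            selected.update(genes_in_tad(tad, genes))
    return sorted(selected)
```

With a SNP at position 1,200,000, `tad_gene_list([1_200_000], TADS, GENES)` returns both `GENE_B` and `GENE_C`: the second gene is more than 1 Mb away from the SNP but lies in the same TAD, so it is kept as a candidate.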

A preprint describing our method is available on bioRxiv.

Contact

For questions and bug reports, please file a GitHub issue.

For all other inquiries, contact Casey Greene at csgreene@mail.med.upenn.edu or Struan Grant at grants@email.chop.edu

Usage

There are two ways to implement a TAD_Pathways analysis:

Disease/Trait Specific - Uses GWAS identified SNPs
Custom - Uses custom SNP list

Disease/Trait Specific

This mode curates the GWAS Catalog and TAD boundaries to visualize TADs and generate TAD based gene lists. It also performs a TAD_Pathways analysis for the Bone Mineral Density GWAS, reproducing the analysis and figures used in the paper.

# Create and activate the conda environment with the Python dependencies
conda env create --quiet --force --file environment.yml
source activate tad_pathways

bash scripts/run_pipeline.sh

This will download data, perform analyses, and output several genomic figures. The command will also output TAD based genes for 299 different GWAS traits. Our TAD_Pathways method can be applied directly using these gene lists.

Custom

TAD_Pathways is customizable: a user can specify any SNP list of interest to test for TAD based pathway associations. To perform a custom analysis, create a comma separated file in which each column is one group of SNPs. The first row names the group, and the subsequent rows list the SNPs in that group.

E.g.: custom_example.csv

Group 1	Group 2
rs12345	rs67891
rs19876	rs54321
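A file in this layout could be written with Python's standard `csv` module. The sketch below is an illustration only: the `write_snp_groups` helper is hypothetical and not part of this repository, and padding shorter groups with empty cells is an assumption that has not been checked against `build_snp_list.R`.

```python
import csv
from itertools import zip_longest


def write_snp_groups(groups, path):
    """Write one column per group: a header row, then one rsID per row."""
    names = list(groups)
    # Assumption: groups of unequal length are padded with empty cells.
    rows = zip_longest(*(groups[name] for name in names), fillvalue="")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(names)
        writer.writerows(rows)


if __name__ == "__main__":
    example = {
        "Group 1": ["rs12345", "rs19876"],
        "Group 2": ["rs67891", "rs54321"],
    }
    write_snp_groups(example, "custom_example.csv")
```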

Then, perform the following steps:

# Extract locations for SNP list
Rscript --vanilla scripts/tad_util/build_snp_list.R \
        --snp_file "custom_example.csv" \
        --output_file "mapped_results.tsv"

# Build TAD based genelists for each group
python scripts/build_custom_TAD_genelist.py \
       --snp_data_file "mapped_results.tsv" \
       --output_file "custom_tad_genelist.tsv"

# The output file is then ready for the manual "TAD_Pathways" steps below

TAD_Pathways

As a case study demonstrating the utility of a TAD based approach, use the TAD based gene list for Bone Mineral Density (1,297 genes) as input to a pathway analysis.

Run a WebGestalt pathway analysis on the gene list with the parameters below.

WebGestalt Parameters

Parameter	Input
Select gene ID type	hsapiens__gene_symbol
Enrichment Analysis	GO Analysis
GO Slim Classification	Yes
Reference Set	hsapiens__genome
Statistical Method	Hypergeometric
Multiple Test Adjustment	BH
Significance Level	Top10
Minimum Number of Genes for a Category	4
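For readers unfamiliar with the "Hypergeometric" and "BH" settings, the two calculations can be sketched in plain Python. This illustrates the general statistics only and is not WebGestalt's code; the function names and argument order are choices made for this example.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    N: genes in the reference set, K: reference genes in the category,
    n: genes in the input list, k: input genes that fall in the category.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Each category gets one hypergeometric p-value, and the adjustment is then applied across all tested categories at once.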

Note: The gene lists that scripts/run_pipeline.sh writes to data/TAD_based_genes/ for all traits are ready for TAD_Pathways analysis.

After performing the WebGestalt analysis, click Export TSV Only and save the file as data/gestalt/<TRAIT>_gestalt.tsv, where <TRAIT> is "BMD" for the example.
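The naming convention can be captured in a small helper that also loads the saved export for inspection. This is a sketch under assumptions: `gestalt_path` and `load_gestalt` are hypothetical names, not functions from this repository, and the loader assumes the first line of the export is a tab separated header row. No specific column names are assumed.

```python
import csv
from pathlib import Path


def gestalt_path(trait, base_dir="data/gestalt"):
    """Build the expected location of a saved WebGestalt export."""
    return Path(base_dir) / f"{trait}_gestalt.tsv"


def load_gestalt(trait, base_dir="data/gestalt"):
    """Read the export into a list of dicts keyed by the header row."""
    with open(gestalt_path(trait, base_dir), newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

For the example trait, `gestalt_path("BMD")` resolves to `data/gestalt/BMD_gestalt.tsv`.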

GWAS/eQTL Integration

Data Access

GWAS Catalog (2016-02-25)
eQTL (2016-05-09) eQTL Browser

Nearest BMD gene GWAS reports

eQTL Browser Parameters

Analysis ID (All)
Association Test Significance Filters (p-value 1 x 10^-1)
Phenotype Traits (Bone Mineral Density)

Dependencies

All analyses were performed using the Anaconda Python distribution (Python 3.5.1). For specific package versions, refer to environment.yml. R version 3.3.0 was used for visualization. For the complete environment, refer to the accompanying Dockerfile at docker/Dockerfile
