-
Notifications
You must be signed in to change notification settings - Fork 2
/
partition_pip_tutorial.Rmd
132 lines (107 loc) · 4.7 KB
/
partition_pip_tutorial.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
---
title: "Partition fine-mapping PIPs by annotation categories"
author: Kaixuan Luo
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, eval=TRUE, comment = "#")
```
## Partitioning PIPs by cell types and annotation categories
Here, we show an example using AFib GWAS finemapping result and
scATAC-seq data from our heart study.
We assigned the likely cell type(s) through
which the causal variants act in each locus using fine-mapped SNPs,
and cell-type specific open chromatin regions (OCRs) obtained from scATAC-seq data.
**Required input data**:
* Fine-mapped summary statistics.
* A list of GRanges objects of annotation regions.
Load R packages
```{r load-packages, message=FALSE, warning=FALSE}
library(mapgen)
```
Load fine-mapping summary statistics from the AFib study
Keep SNPs with PIP > 1e-5, rename columns, and save as a GRanges object.
```{r load-finemapping-res}
finemapstats <- readRDS(system.file("extdata", "AF.finemapping.sumstats.rds", package = "mapgen"))
finemapstats <- process_finemapping_sumstats(finemapstats,
snp = 'snp', chr = 'chr',
pos = 'pos', pip = 'susie_pip',
pval = 'pval', zscore = 'zscore',
cs = 'cs', locus = 'locus',
pip.thresh = 1e-5)
```
We can partition PIPs into different functional annotation categories.
Load genomic annotations (hg19).
```{r load-genomic-annotations}
genomic.annots <- readRDS(system.file("extdata", "genomic.annots.hg19.rds", package = "mapgen"))
```
Load cell-type specific open chromatin regions (OCRs) (hg19).
```{r load-all-OCRs}
OCRs <- readRDS(system.file("extdata", "OCRs.hg19.gr.rds", package = "mapgen"))
```
Create a list of the annotations, with priority in the order of OCRs, UTRS, Exons, and Introns.
```{r create-annots.list}
annots.list <- list(OCRs = OCRs,
UTRs = genomic.annots$UTRs,
Exons = genomic.annots$exons,
Introns = genomic.annots$introns)
```
### Sum PIPs within annotation categories
Unlike \code{partition_pip_regions()}(below),
it is OK to have overlapping annotations here.
If a SNP is in multiple annotation categories,
it will be assigned to the first ordered category.
```{r sum-pip-annots}
sum.pip.res <- partition_pip_annots(finemapstats, annots.list)
```
Sum of PIPs in each annotation category:
```{r}
sum.pips <- sum.pip.res$sum.pips
head(sum.pips)
```
Obtain a matrix of the proportion of PIPs in each annotation category.
```{r get-prop-pips-annots}
locus.order <- rownames(sum.pips)[with(sum.pips, order(-OCRs, UTRs, Exons, Introns, others))]
sum.pips <- sum.pips[locus.order,]
prop.pip.mat <- sum.pips/rowSums(sum.pips)
head(prop.pip.mat)
```
We can make a structure plot to show the proportion of PIPs in each annotation category.
```{r structure-plot-annots, fig.width=10, fig.height=2.5}
categories <- c("OCRs", "UTRs", "Exons", "Introns", "others")
colors <- c(OCRs = "#E18727FF", UTRs = "#238b45", Exons = "#bee6af", Introns = "#B09C85FF", others = "#aaaaaa")
pip_structure_plot(prop.pip.mat, categories = categories, colors = colors)
```
### Partition PIPs into disjoint OCRs for different cell types
We can partition PIPs into disjoint OCRs for different cell types.
Load a list of GRanges objects containing disjoint OCRs for different cell types.
```{r load-disjoint-OCRs}
disjoint.OCRs <- readRDS(system.file("extdata", "disjoint.OCRs.hg19.grlist.rds", package = "mapgen"))
```
Sum PIPs within cell-type specific OCRs.
```{r sum-pip-disjoint-OCRs}
sum.pip.res <- partition_pip_regions(finemapstats, disjoint.OCRs)
```
Sum of PIPs in each cell type OCR category:
```{r}
sum.pips <- sum.pip.res$sum.pips
head(sum.pips)
```
Select loci with total PIPs in OCR > 0.25,
and compute the proportion of PIPs partitioned in each cell type category.
```{r filter-locus-proportion-pips-OCRs}
# reorder the loci to match the previous figure
sum.pips <- sum.pips[locus.order, ]
# filter loci with total PIPs in OCR > 0.25
sum.pips.filtered <- sum.pips[rowSums(sum.pips) > 0.25,]
prop.pip.mat <- sum.pips.filtered/rowSums(sum.pips.filtered)
head(prop.pip.mat)
```
We can make a structure plot to show the proportion of PIPs in each cell type category.
```{r structure-plot-OCRs, fig.width=10, fig.height=2.5}
categories <- c("Cardiomyocyte", "Endothelial", "Fibroblast", "Lymphoid",
"Myeloid", "Pericyte", "Shared 2-3", "Shared 4+")
colors <- c("#b22222", "#8DD3C7", "#BEBADA", "#FB8072",
"#80B1D3", "#B3DE69", "royalblue", "#003C86")
pip_structure_plot(prop.pip.mat, categories = categories, colors = colors)
```