# How to conduct colocalization analysis using T1D GWAS fine-mapped signals and pancreatic and islet QTL fine-mapped signals

In [1]:
library(susieR)
library(coloc)
library(glue)
library(tidyr)
suppressPackageStartupMessages(library(dplyr))
library(ggplot2)
library(optparse)
library(gtable)
library(grid)
library(gridExtra)
library(cowplot)
ggplot2::theme_set(theme_cowplot())
library(locuscomparer)

This is coloc version 5.2.3


Attaching package: ‘gridExtra’


The following object is masked from ‘package:dplyr’:

    combine




In [2]:
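# Remove the credible sets at positions `drop` from a susie fit, keeping the
# per-set index, coverage and purity entries in sync.
# `drop` must be non-empty: a zero-length negative index would remove every set.
susie_dropsets = function(res, drop){
    res$sets$cs = res$sets$cs[-drop]
    res$sets$cs_index = res$sets$cs_index[-drop]
    res$sets$coverage = res$sets$coverage[-drop]
    res$sets$purity = res$sets$purity[-drop,,drop=FALSE]
    return(res)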
}
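A quick way to see what `susie_dropsets` does is to run it on a mock object. The list below only mimics the `sets` element of a susie fit; it is hand-built for illustration (names and values are made up), not real fine-mapping output. It holds three credible sets, of which the second is dropped.

```r
# Same helper as above, repeated so the example runs on its own.
susie_dropsets <- function(res, drop) {
    res$sets$cs <- res$sets$cs[-drop]
    res$sets$cs_index <- res$sets$cs_index[-drop]
    res$sets$coverage <- res$sets$coverage[-drop]
    res$sets$purity <- res$sets$purity[-drop, , drop = FALSE]
    return(res)
}

# Hand-built stand-in for the `sets` part of a susie fit (illustrative values only).
mock <- list(sets = list(
    cs = list(L1 = c(a = 1, b = 2), L2 = c(c = 3), L3 = c(d = 4, e = 5)),
    cs_index = c(1, 2, 3),
    coverage = c(0.96, 0.95, 0.97),
    purity = data.frame(min.abs.corr = c(0.9, 0.8, 0.7),
                        mean.abs.corr = c(0.95, 0.85, 0.75),
                        median.abs.corr = c(0.95, 0.85, 0.75),
                        row.names = c("L1", "L2", "L3"))
))

kept <- susie_dropsets(mock, drop = 2)
names(kept$sets$cs)          # names of the credible sets that remain
kept$sets$cs_index           # matching entries of cs_index
rownames(kept$sets$purity)   # matching rows of the purity table
```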

## T1D coloc with eQTL-InsPIRE

### Step 1: Get lead SNP of all T1D and eQTL-InsPIRE fine-mapped sets

For each T1D signal, we take the lead SNP (i.e., the SNP with the highest PIP) and extend its coordinates 250 kb upstream and downstream, giving a 500 kb window centered on the lead SNP. Any QTL signal (gene-, exon- or splicing-level, from pancreatic or islet tissue) whose lead SNP falls inside this window is tested for colocalization with the T1D signal.

The T1D lead SNP file looks like this:
```
head /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_t1d-susie/results/hg38/susie/freeze/leadSNPs__250kb.bed 

chr16	80000126	80500127	rs8046043	16q23	1	16q23__credibleSet1__selected.txt
chr17	67918776	68418777	rs57209021	17q24	1	17q24__credibleSet1__selected.txt
chr2	12244667	12744668	rs1881146	2p24	1	2p24__credibleSet1__selected.txt
chr2	24666497	25166498	rs55893453	ADCY3	1	ADCY3__credibleSet1__selected.txt
chr2	99900248	100400249	rs4490209	AFF3	1	AFF3__credibleSet1__selected.txt
```
The columns in this file are `chromosome`, `start_coordinate`, `stop_coordinate`, `lead_snp`, `locusName`, `credibleset_id`, `files_with_SNPs_in_set`.
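As a rough illustration of how such a window file could be derived, the sketch below picks the highest-PIP SNP in each credible set and pads its BED interval by 250 kb on each side. The input table, its column names (`chrom`, `start`, `end`, `snp`, `pip`, `locus`, `cs`), the PIP values and the `rsToy*` SNPs are all made up for the example. The two lead-SNP positions are back-calculated from the windows shown above, assuming each window is the SNP interval extended by exactly 250,000 bp in each direction.

```r
# Toy per-SNP table: one row per SNP in a credible set (values are illustrative).
snps <- data.frame(
    chrom = c("chr16", "chr16", "chr2", "chr2"),
    start = c(80250126, 80251000, 24916497, 24920000),  # 0-based BED start
    end   = c(80250127, 80251001, 24916498, 24920001),
    snp   = c("rs8046043", "rsToyA", "rs55893453", "rsToyB"),
    pip   = c(0.62, 0.20, 0.41, 0.35),
    locus = c("16q23", "16q23", "ADCY3", "ADCY3"),
    cs    = c(1, 1, 1, 1)
)

flank <- 250000

# Lead SNP = highest PIP within each locus / credible-set combination.
groups <- split(snps, list(snps$locus, snps$cs), drop = TRUE)
lead <- do.call(rbind, lapply(groups, function(g) g[which.max(g$pip), ]))

# Extend the lead SNP interval by `flank` on both sides, never below 0.
windows <- data.frame(
    chrom = lead$chrom,
    start = pmax(lead$start - flank, 0),
    end   = lead$end + flank,
    snp   = lead$snp,
    locus = lead$locus,
    cs    = lead$cs,
    file  = paste0(lead$locus, "__credibleSet", lead$cs, "__selected.txt")
)
print(windows, row.names = FALSE)
```

With these inputs the two windows reproduce the `16q23` and `ADCY3` rows of the file above, which is what the padding assumption was chosen to match.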

Similarly, we can create a file with lead SNPs of gene-level eQTL from islet tissue:
```
head /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie/freeze/eQTL-inspire-leadSNPs.bed
chr6	46166968	46166969	rs9472715	ENSG00000001561__ENPP4__credibleSet1.txt
chr4	11496745	11496746	rs36023867	ENSG00000002587__HS3ST1__credibleSet1.txt
chr4	11578817	11578818	rs112241598	ENSG00000002587__HS3ST1__credibleSet2.txt
chr17	48233712	48233713	rs9889470	ENSG00000002919__SNX11__credibleSet1.txt
```

We can then use `bedtools intersect` to find every QTL lead SNP that falls inside a T1D window, i.e. every pair of signals whose lead SNPs lie within 250 kb of each other:
```
ml Bioinformatics Bioinformatics  gcc/10.3.0-k2osx5y bedtools2/2.30.0-svcfwbm
cd /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/4_coloc

bedtools intersect -a /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_t1d-susie/results/hg38/susie/freeze/leadSNPs__250kb.bed  -b /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie/freeze/eQTL-inspire-leadSNPs.bed -wa -wb > t1d_eQTL-inspire_snpPairs.txt
```
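To make the `-wa -wb` output format concrete, here is a small base-R sketch of the same overlap logic. It is only meant to illustrate what `bedtools intersect` reports, not to replace it. Each output row is the 7 T1D columns followed by the 5 QTL columns, so with default `V1`..`V12` names the T1D credible-set file ends up in `V7` and the QTL credible-set file in `V12`. The T1D rows and the first QTL row are copied from the files above; the second QTL row (`rsTOY`, `GENEX`) is invented so that the example contains one overlap.

```r
# T1D windows (two rows copied from the lead-SNP BED file above).
t1d <- data.frame(
    V1 = c("chr17", "chr2"),
    V2 = c(67918776, 24666497),
    V3 = c(68418777, 25166498),
    V4 = c("rs57209021", "rs55893453"),
    V5 = c("17q24", "ADCY3"),
    V6 = c(1, 1),
    V7 = c("17q24__credibleSet1__selected.txt", "ADCY3__credibleSet1__selected.txt")
)

# QTL lead SNPs: the first row is copied from above, the second is invented.
qtl <- data.frame(
    V8  = c("chr17", "chr2"),
    V9  = c(48233712, 24900000),
    V10 = c(48233713, 24900001),
    V11 = c("rs9889470", "rsTOY"),
    V12 = c("ENSG00000002919__SNX11__credibleSet1.txt",
            "ENSGTOY__GENEX__credibleSet1.txt")
)

# Pair every window with every lead SNP, then keep the pairs that share a
# chromosome and overlap by at least one base (half-open BED intervals).
idx <- expand.grid(i = seq_len(nrow(t1d)), j = seq_len(nrow(qtl)))
both <- cbind(t1d[idx$i, ], qtl[idx$j, ])
hits <- both[both$V1 == both$V8 & both$V9 < both$V3 & both$V10 > both$V2, ]
rownames(hits) <- NULL

ncol(hits)                 # 7 T1D columns + 5 QTL columns
hits[, c("V7", "V12")]     # the credible-set pair to test with coloc
```

The all-pairs comparison is fine for a toy table but scales quadratically, which is one reason to leave the real intersection to `bedtools`.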

### Step 2: Carry out coloc between signals within 250kb window

In R, we carry out the following steps:

In [3]:
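# Columns V1-V7 come from the T1D BED file and V8-V12 from the QTL BED file,
# so V7 and V12 hold the T1D and QTL credible-set file names.
pairs <- read.table("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/4_coloc/t1d_eQTL-inspire_snpPairs.txt", header = FALSE)
pairs <- pairs[, c("V7", "V12")]
head(pairs)
dim(pairs)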

,V7,V12
,<chr>,<chr>
1,17q24__credibleSet1__selected.txt,ENSG00000182481__KPNA2__credibleSet1.txt
2,ADCY3__credibleSet1__selected.txt,ENSG00000138031__ADCY3__credibleSet1.txt
3,ADCY3__credibleSet1__selected.txt,ENSG00000084710__EFR3B__credibleSet1.txt
4,ADCY3__credibleSet1__selected.txt,ENSG00000115138__POMC__credibleSet1.txt
5,AFF3__credibleSet1__selected.txt,ENSG00000170500__LONRF2__credibleSet1.txt
6,AGO2__credibleSet1__selected.txt,ENSG00000104472__CHRAC1__credibleSet1.txt


In [4]:
# load t1d relevant info, where the credible set txt file is mapped to its Rda fine-mapped file
t1d_masterList <- read.table("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_t1d-susie/results/hg38/susie/t1d-info.txt", header = F)
head(t1d_masterList, 2)

,V1,V2
,<chr>,<chr>
1,16q23__credibleSet1__selected.txt,/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_t1d-susie/results/hg38/for-coloc/t1d__16q23__rs8046043__P__chr16-80000126-80500127__250kb.selected.Rda
2,17q24__credibleSet1__selected.txt,/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_t1d-susie/results/hg38/for-coloc/t1d__17q24__rs57209021__P__chr17-67918776-68418777__250kb.selected.Rda


In [5]:
qtl_masterList <- read.table("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie/eqtl_inspire-info.txt", header = F)
head(qtl_masterList, 2)

,V1,V2
,<chr>,<chr>
1,ENSG00000260682__7SK__credibleSet1.txt,/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie-prep/gene_eQTLs__7SK__rs4591139__P__chr16-81711926-82211927__250kb.susie.Rda
2,ENSG00000260682__7SK__credibleSet1.txt,/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie-prep/gene_eQTLs__7SK__rs72834729__S__chr16-81711926-82211927__250kb.susie.Rda


In [6]:
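# Attach the path of the fine-mapped Rda file to each T1D credible set.
pairs <- inner_join(pairs, t1d_masterList, by = c("V7" = "V1"))
# Same for the QTL side. A QTL credible set can be listed with more than one
# Rda file, which is a likely source of the many-to-many warning below.
pairs <- inner_join(pairs, qtl_masterList, by = c("V12" = "V1"))
pairs <- distinct(pairs)

# T1D locus = 1st "__" field of the T1D file name; QTL gene = 2nd "__" field of the QTL file name.
pairs$t1d <- unlist(lapply(strsplit(pairs$V7, '__'), '[', 1))
pairs$qtl <- unlist(lapply(strsplit(pairs$V12, '__'), '[', 2))

colnames(pairs) <- c("T1D_cs", "QTL_cs", "T1D_file", "QTL_file", "T1D_locus", "QTL_locus")
head(pairs, 2)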

Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 6 of `x` matches multiple rows in `y`.
ℹ Row 747 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship =


,T1D_cs,QTL_cs,T1D_file,QTL_file,T1D_locus,QTL_locus
,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>
1,17q24__credibleSet1__selected.txt,ENSG00000182481__KPNA2__credibleSet1.txt,/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_t1d-susie/results/hg38/for-coloc/t1d__17q24__rs57209021__P__chr17-67918776-68418777__250kb.selected.Rda,/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie-prep/gene_eQTLs__KPNA2__rs78794747__P__chr17-67785519-68285520__250kb.susie.Rda,17q24,KPNA2
2,ADCY3__credibleSet1__selected.txt,ENSG00000138031__ADCY3__credibleSet1.txt,/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_t1d-susie/results/hg38/for-coloc/t1d__ADCY3__rs55893453__P__chr2-24666497-25166498__250kb.selected.Rda,/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie-prep/gene_eQTLs__ADCY3__rs10176214__P__chr2-24669839-25169840__250kb.susie.Rda,ADCY3,ADCY3


In [None]:
# Placeholder row that fixes the column layout for rbind(); it is filtered out after the loop.
df <- data.frame("nsnps"=NA, "hit1"=NA, "hit2"=NA,
                 "PP.H0.abf"=NA, "PP.H1.abf"=NA, "PP.H2.abf"=NA, "PP.H3.abf"=NA, "PP.H4.abf"=NA,
                 "idx1"=NA, "idx2"=NA, "t1dSignal"=NA, "qtlSignal"=NA)

for (i in seq_len(nrow(pairs))) {
    # T1D side: SNPs of the selected credible set, then the fine-mapped object
    t1dcs <- read.table(paste0("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_t1d-susie/results/hg38/susie/freeze/", 
                             pairs[i, "T1D_cs"]), header = TRUE)
    load(pairs[i, "T1D_file"])  # expected to provide the T1D fit as `S1`
    # Mark every credible set in S1 that contains a SNP outside the selected set
    drop <- c()
    for (j in seq_along(S1$sets$cs)) {
        n <- stringr::str_extract(names(S1$sets$cs[[j]]), "[^-]*")  # ID up to the first "-"
        if ( length(setdiff(n, t1dcs$snp)) > 0 ) {
            drop <- c(drop, j)
        }
    }
    if (length(drop)>0) {
        S1 <- susie_dropsets(S1, drop)
    }
    
    # QTL side: same filtering, so only the credible set paired with this T1D signal is kept
    qtl_cs <- read.table(paste0("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie/freeze/", 
                             pairs[i, "QTL_cs"]), header = TRUE)
    load(pairs[i, "QTL_file"])  # expected to provide the QTL fit as `S2`
    drop <- c()
    for (j in seq_along(S2$sets$cs)) {
        n <- stringr::str_extract(names(S2$sets$cs[[j]]), "[^-]*")
        if ( length(setdiff(n, qtl_cs$snp)) > 0 ) {
            drop <- c(drop, j)
        }
    }
    if (length(drop)>0) {
        S2 <- susie_dropsets(S2, drop)
    }
    
    # Colocalization between the remaining T1D and QTL credible sets
    res = coloc.susie(S1, S2)

    # Keep the summary row(s) and record which pair of signals they belong to
    tmp <- as.data.frame(res$summary)
    tmp$t1dSignal <- pairs$T1D_cs[i]
    tmp$qtlSignal <- pairs$QTL_cs[i]

    df <- rbind(df, tmp)
}

df <- df[!is.na(df$PP.H4.abf),]
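The two inner loops apply the same rule to the T1D and QTL fits: keep a credible set only if every SNP in it appears in the corresponding credible-set file. As a sketch, that rule can be pulled out into one helper. `sets_to_drop` is a hypothetical name, the variant IDs are made up, and the `rsID-REF-ALT` layout is an assumption based on the `[^-]*` pattern used in the loop. Base R `sub()` stands in for `stringr::str_extract()` here.

```r
# Return the indices of credible sets that contain at least one SNP
# missing from `keep_snps` (the SNP column of a credible-set file).
sets_to_drop <- function(cs_list, keep_snps) {
    drop <- integer(0)
    for (j in seq_along(cs_list)) {
        # Strip everything from the first "-" onward: "rs1-A-G" -> "rs1".
        ids <- sub("-.*$", "", names(cs_list[[j]]))
        if (length(setdiff(ids, keep_snps)) > 0) {
            drop <- c(drop, j)
        }
    }
    drop
}

# Toy credible sets: named index vectors, mimicking the layout of `S1$sets$cs`.
cs_list <- list(
    L1 = c("rs1-A-G" = 10, "rs2-C-T" = 11),
    L2 = c("rs3-G-A" = 40),
    L3 = c("rs2-C-T" = 11, "rs9-T-C" = 70)
)
selected <- c("rs1", "rs2")   # SNPs listed in the selected credible-set file

sets_to_drop(cs_list, selected)   # indices of the sets to remove
```

`seq_along()` is used so that a fit with no credible sets simply yields an empty result, whereas `1:length(x)` would iterate over `1` and `0` in that case.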

In [None]:
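df <- distinct(df)
df <- df[order(df$PP.H4.abf),]  # ascending, so the strongest colocalizations end up at the bottom
# Keep only the part of each hit ID before the first "-"
df$hit1 <- stringr::str_extract(df$hit1, "[^-]*")
df$hit2 <- stringr::str_extract(df$hit2, "[^-]*")
df
# Uncomment to save the results; the next cell reads this file back in.
#write.table(df, "/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/4_coloc/full_t1d_eQTL-inspire_coloc.txt", row.names = F, quote = F, sep = "\t")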

In [7]:
df <- read.table("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/4_coloc/full_t1d_eQTL-inspire_coloc.txt", header = T)
df

nsnps,hit1,hit2,PP.H0.abf,PP.H1.abf,PP.H2.abf,PP.H3.abf,PP.H4.abf,idx1,idx2,t1dSignal,qtlSignal
<int>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<int>,<int>,<chr>,<chr>
1171,rs12927355,rs13334203,2.606776e-79,3.797552e-45,6.864360e-35,1.0000000,1.693910e-22,1,1,DEXI__credibleSet1__selected.txt,ENSG00000175643__RMI2__credibleSet2.txt
815,rs34536443,rs35605746,5.260703e-56,4.592104e-16,1.145597e-40,1.0000000,7.100493e-18,1,1,TYK2__credibleSet1__selected.txt,ENSG00000129347__KRI1__credibleSet1.txt
465,rs11074908,rs452823,7.671871e-78,3.956590e-70,1.939011e-08,1.0000000,1.390041e-10,1,3,IL27__credibleSet1__selected.txt,ENSG00000251417__RP11-1348G14.4__credibleSet3.txt
465,rs11074908,rs117115302,3.894018e-25,2.008249e-17,1.939011e-08,1.0000000,3.894598e-10,1,4,IL27__credibleSet1__selected.txt,ENSG00000251417__RP11-1348G14.4__credibleSet6.txt
465,rs11074908,rs7498491,0.000000e+00,0.000000e+00,1.939011e-08,1.0000000,7.615871e-10,1,6,IL27__credibleSet1__selected.txt,ENSG00000251417__RP11-1348G14.4__credibleSet1.txt
918,rs229527,rs6000715,1.307545e-113,3.632741e-106,3.599334e-08,1.0000000,1.057387e-09,1,1,C1QTNF6__credibleSet1__selected.txt,ENSG00000166897__ELFN2__credibleSet1.txt
1067,rs689,rs800350,1.027657e-321,8.935066e-07,1.149636e-315,0.9999991,6.646746e-09,1,1,INS__credibleSet1__selected.txt,ENSG00000110651__CD81__credibleSet1.txt
836,rs12720356,rs35605746,8.313219e-22,4.592096e-16,1.810329e-06,0.9999982,1.371508e-08,2,1,TYK2__credibleSet2__selected.txt,ENSG00000129347__KRI1__credibleSet1.txt
667,rs35327136,rs112550936,1.097617e-18,2.294304e-13,4.784071e-06,0.9999952,5.401961e-08,1,2,MAPT__credibleSet1__selected.txt,ENSG00000267198__RP11-798G7.6__credibleSet1.txt
681,rs6434435,rs10445782,3.606560e-13,2.035739e-06,1.771618e-07,0.9999977,8.760210e-08,1,1,STAT4__credibleSet1__selected.txt,ENSG00000227542__AC092614.2__credibleSet1.txt


Two signals are considered colocalized if `PP.H4.abf >= 0.5`, i.e. if the posterior probability that they share a causal variant is at least 0.5:

In [9]:
df[df$PP.H4.abf >= 0.5,]

,nsnps,hit1,hit2,PP.H0.abf,PP.H1.abf,PP.H2.abf,PP.H3.abf,PP.H4.abf,idx1,idx2,t1dSignal,qtlSignal
,<int>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<int>,<int>,<chr>,<chr>
87,906,rs35327136,rs3972613,3.000816e-09,0.0006272489,1.287673e-06,0.26769391,0.7316776,1,1,MAPT__credibleSet1__selected.txt,ENSG00000159314__ARHGAP27__credibleSet1.txt
88,773,rs35327136,rs62065450,5.500376e-12,1.149721e-06,1.087837e-06,0.22583779,0.77416,1,1,MAPT__credibleSet1__selected.txt,ENSG00000225190__PLEKHM1__credibleSet1.txt
89,1073,rs56750287,rs870829,9.752658e-05,0.03188592,0.0005324258,0.17248445,0.7949997,1,1,GSDMB__credibleSet1__selected.txt,ENSG00000073605__GSDMB__credibleSet1.txt
90,1273,rs55893453,rs10176214,1.035486e-07,2.760424e-05,0.0001657926,0.04228235,0.9575242,1,1,ADCY3__credibleSet1__selected.txt,ENSG00000138031__ADCY3__credibleSet1.txt
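To read these tables, it helps to remember that the five `PP.H*.abf` columns are posterior probabilities of mutually exclusive hypotheses: H0 (neither trait associated), H1 (only the first trait, here T1D), H2 (only the second trait, here the QTL), H3 (both associated, distinct causal variants) and H4 (both associated, shared causal variant). The sketch below re-enters two rows from the results above by hand, checks that each row sums to roughly 1, and labels the best-supported hypothesis. The `pair` labels and helper column names are just chosen for the example.

```r
# Two rows copied from the eQTL-InsPIRE results above.
res <- data.frame(
    pair = c("DEXI vs RMI2", "ADCY3 vs ADCY3"),
    PP.H0.abf = c(2.606776e-79, 1.035486e-07),
    PP.H1.abf = c(3.797552e-45, 2.760424e-05),
    PP.H2.abf = c(6.864360e-35, 0.0001657926),
    PP.H3.abf = c(1.0000000, 0.04228235),
    PP.H4.abf = c(1.693910e-22, 0.9575242)
)

pp_cols <- paste0("PP.H", 0:4, ".abf")

# The posteriors should sum to 1 per row, up to rounding in the printed values.
res$pp_sum <- rowSums(res[, pp_cols])

# Best-supported hypothesis per pair, and the colocalization call used here.
best_idx <- max.col(as.matrix(res[, pp_cols]), ties.method = "first")
res$best <- paste0("H", best_idx - 1)
res$colocalized <- res$PP.H4.abf >= 0.5

res[, c("pair", "pp_sum", "best", "colocalized")]
```

A pair with high `PP.H3.abf`, like the first row, has signal in both datasets but most likely driven by different variants, so it is not counted as colocalized.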


## T1D coloc with exonQTL-InsPIRE, eQTL-GTEx and sQTL-GTEx

### Step 1: Get lead SNP of all T1D and QTL fine-mapped sets

Follow the same instructions as in Step 1 of the `T1D coloc with eQTL-InsPIRE` section above.

The `bedtools` step is the same, with the lead-SNP file of the QTL dataset in question passed to `-b`. For example, for exonQTL-InsPIRE:
```
ml Bioinformatics Bioinformatics  gcc/10.3.0-k2osx5y bedtools2/2.30.0-svcfwbm
cd /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/4_coloc

bedtools intersect -a /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_t1d-susie/results/hg38/susie/freeze/leadSNPs__250kb.bed  -b /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_exonQTL-inspire-susie/results/susie/freeze/exonQTL-inspire-leadSNPs.bed -wa -wb > t1d_exonQTL-inspire_snpPairs.txt
```

### Step 2: Carry out coloc between signals within 250kb window

Follow the same instructions as in Step 2 of the `T1D coloc with eQTL-InsPIRE` section above.

#### Results for T1D and `exonQTL-InsPIRE`:

In [10]:
df <- read.table("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/4_coloc/full_t1d_exonQTL-inspire_coloc.txt", header = T)
df

nsnps,hit1,hit2,PP.H0.abf,PP.H1.abf,PP.H2.abf,PP.H3.abf,PP.H4.abf,idx1,idx2,t1dSignal,qtlSignal
<int>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<int>,<int>,<chr>,<chr>
1067,rs689,rs1088978,0.000000e+00,4.778481e-11,1.149637e-315,1.0000000,3.299123e-13,1,1,INS__credibleSet1__selected.txt,ENSG00000110651__CD81__ENSG00000110651.7_2397407_2397587__credibleSet1.txt
435,rs231972,rs4149381,1.900759e-34,1.644252e-24,1.156002e-10,1.0000000,7.549817e-13,1,8,IL27__credibleSet2__selected.txt,ENSG00000196502__SULT1A1__ENSG00000196502.7_28617377_28617557__credibleSet5.txt
435,rs231972,rs40835,0.000000e+00,0.000000e+00,1.156002e-10,1.0000000,2.171700e-12,1,6,IL27__credibleSet2__selected.txt,ENSG00000196502__SULT1A1__ENSG00000196502.7_28617377_28617557__credibleSet4.txt
435,rs231972,rs190012453,4.032951e-27,3.488705e-17,1.156002e-10,1.0000000,3.596958e-12,1,10,IL27__credibleSet2__selected.txt,ENSG00000196502__SULT1A1__ENSG00000196502.7_28617377_28617557__credibleSet10.txt
435,rs231972,rs113927841,3.455287e-31,2.988996e-21,1.156002e-10,1.0000000,2.891409e-11,1,7,IL27__credibleSet2__selected.txt,ENSG00000196502__SULT1A1__ENSG00000196502.7_28617377_28617557__credibleSet7.txt
247,rs11074908,rs4149381,3.188229e-32,1.644252e-24,1.939015e-08,1.0000000,1.337187e-10,1,8,IL27__credibleSet1__selected.txt,ENSG00000196502__SULT1A1__ENSG00000196502.7_28617377_28617557__credibleSet5.txt
465,rs11074908,rs452823,5.922037e-20,3.054154e-12,1.939011e-08,1.0000000,1.631582e-10,1,6,IL27__credibleSet1__selected.txt,ENSG00000251417__RP11-1348G14.4__ENSG00000251417.1_28827457_28829149__credibleSet6.txt
465,rs11074908,rs7498491,0.000000e+00,0.000000e+00,1.939011e-08,1.0000000,7.615871e-10,1,5,IL27__credibleSet1__selected.txt,ENSG00000251417__RP11-1348G14.4__ENSG00000251417.1_28827457_28829149__credibleSet3.txt
1376,rs56994090,rs9324026,7.081783e-21,1.120374e-07,6.320908e-14,0.9999999,8.185822e-10,1,1,DLK1__credibleSet1__selected.txt,ENSG00000225746__AL132709.5__ENSG00000225746.4_101424599_101424887__credibleSet1.txt
764,rs2476601,rs1624335,7.197144e-209,6.301894e-09,1.142060e-200,1.0000000,3.718886e-09,1,1,PTPN22__credibleSet1__selected.txt,ENSG00000116793__PHTF1__ENSG00000116793.11_114267957_114268583__credibleSet1.txt


As before, two signals are considered colocalized if `PP.H4.abf >= 0.5`:

In [11]:
df[df$PP.H4.abf >= 0.5,]

,nsnps,hit1,hit2,PP.H0.abf,PP.H1.abf,PP.H2.abf,PP.H3.abf,PP.H4.abf,idx1,idx2,t1dSignal,qtlSignal
,<int>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<int>,<int>,<chr>,<chr>
97,906,rs35327136,rs3972613,2.562483e-08,0.005356258,1.444003e-06,0.30044588,0.6941964,1,1,MAPT__credibleSet1__selected.txt,ENSG00000159314__ARHGAP27__ENSG00000159314.7_43471275_43472999__credibleSet1.txt
98,435,rs231972,rs180743,0.0,0.0,3.10763e-11,0.26736027,0.7326397,1,1,IL27__credibleSet2__selected.txt,ENSG00000196502__SULT1A1__ENSG00000196502.7_28617377_28617557__credibleSet6.txt
99,526,rs1701704,rs1131017,5.255514e-59,0.006449908,1.756864e-57,0.21405478,0.7794953,1,1,IKZF4__credibleSet1__selected.txt,ENSG00000197728__RPS26__ENSG00000197728.5_56437903_56438116__credibleSet1.txt
100,773,rs35327136,rs62064652,4.219461e-16,8.819769e-11,1.037734e-06,0.21534401,0.784655,1,1,MAPT__credibleSet1__selected.txt,ENSG00000225190__PLEKHM1__ENSG00000225190.4_43545575_43545959__credibleSet3.txt
101,1073,rs56750287,rs12939565,2.302587e-08,7.528215e-06,0.000592548,0.19211661,0.8072833,1,1,GSDMB__credibleSet1__selected.txt,ENSG00000073605__GSDMB__ENSG00000073605.14_38065211_38065295__credibleSet1.txt
102,1609,rs13018977,rs7578199,2.133964e-05,0.01744443,0.0002011908,0.16282761,0.8195054,1,2,SEPT2__credibleSet1__selected.txt,ENSG00000168385__SEPT2__ENSG00000168385.13_242291358_242293442__credibleSet2.txt
103,1647,rs13018977,rs7590653,2.428236e-06,0.001985006,0.0002171782,0.17589244,0.821903,1,1,SEPT2__credibleSet1__selected.txt,ENSG00000006607__FARP2__ENSG00000006607.9_242404875_242405451__credibleSet1.txt
104,1609,rs13018977,rs6726915,2.186437e-18,1.787338e-15,0.0001756888,0.14190374,0.8579206,1,1,SEPT2__credibleSet1__selected.txt,ENSG00000168385__SEPT2__ENSG00000168385.13_242255418_242255990__credibleSet1.txt
105,1273,rs55893453,rs13393590,1.072615e-07,2.859402e-05,0.0001066194,0.02647609,0.9733886,1,1,ADCY3__credibleSet1__selected.txt,ENSG00000138031__ADCY3__ENSG00000138031.10_25062742_25062900__credibleSet1.txt
106,1016,rs55893453,rs73920612,1.434028e-06,0.0003810257,9.872902e-05,0.02428214,0.9752367,1,1,ADCY3__credibleSet1__selected.txt,ENSG00000138092__CENPO__ENSG00000138092.6_25042223_25045245__credibleSet1.txt


#### Results for T1D and `eQTL-GTEx`:

In [12]:
df <- read.table("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/4_coloc/full_t1d_eQTL-gtex_coloc.txt", header = T)
df

nsnps,hit1,hit2,PP.H0.abf,PP.H1.abf,PP.H2.abf,PP.H3.abf,PP.H4.abf,idx1,idx2,t1dSignal,qtlSignal
<int>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<int>,<int>,<chr>,<chr>
434,rs112088479,rs3121203,0.000000e+00,0.000000e+00,0.000000e+00,1.0000000,0.000000e+00,5,2,NOTCH2__credibleSet2__selected.txt,ENSG00000134250__NOTCH2__credibleSet2.txt
434,rs112088479,rs6677820,0.000000e+00,0.000000e+00,0.000000e+00,1.0000000,0.000000e+00,5,3,NOTCH2__credibleSet2__selected.txt,ENSG00000134250__NOTCH2__credibleSet1.txt
434,rs112088479,rs2603848,0.000000e+00,0.000000e+00,0.000000e+00,1.0000000,0.000000e+00,5,6,NOTCH2__credibleSet2__selected.txt,ENSG00000134250__NOTCH2__credibleSet4.txt
434,rs112088479,rs60780736,0.000000e+00,0.000000e+00,0.000000e+00,1.0000000,0.000000e+00,5,9,NOTCH2__credibleSet2__selected.txt,ENSG00000134250__NOTCH2__credibleSet6.txt
434,rs112088479,rs113736263,0.000000e+00,0.000000e+00,0.000000e+00,1.0000000,0.000000e+00,5,1,NOTCH2__credibleSet2__selected.txt,ENSG00000134250__NOTCH2__credibleSet7.txt
434,rs112088479,rs9428330,0.000000e+00,0.000000e+00,0.000000e+00,1.0000000,0.000000e+00,5,10,NOTCH2__credibleSet2__selected.txt,ENSG00000134250__NOTCH2__credibleSet3.txt
434,rs112088479,rs140779657,0.000000e+00,0.000000e+00,0.000000e+00,1.0000000,0.000000e+00,5,5,NOTCH2__credibleSet2__selected.txt,ENSG00000134250__NOTCH2__credibleSet5.txt
434,rs112088479,rs9699857,0.000000e+00,2.393437e-305,0.000000e+00,1.0000000,2.801831e-301,5,4,NOTCH2__credibleSet2__selected.txt,ENSG00000134250__NOTCH2__credibleSet8.txt
638,rs689,rs1104890,0.000000e+00,2.284996e-54,1.149637e-315,1.0000000,7.164846e-57,1,1,INS__credibleSet1__selected.txt,ENSG00000214026__MRPL23__credibleSet1.txt
434,rs112088479,rs9428186,0.000000e+00,4.876141e-58,0.000000e+00,1.0000000,3.115513e-50,5,8,NOTCH2__credibleSet2__selected.txt,ENSG00000134250__NOTCH2__credibleSet10.txt


As before, two signals are considered colocalized if `PP.H4.abf >= 0.5`:

In [13]:
df[df$PP.H4.abf >= 0.5,]

,nsnps,hit1,hit2,PP.H0.abf,PP.H1.abf,PP.H2.abf,PP.H3.abf,PP.H4.abf,idx1,idx2,t1dSignal,qtlSignal
,<int>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<int>,<int>,<chr>,<chr>
165,1246,rs55893453,rs6728219,0.0002796138,0.07668506,0.001455288,0.39807079,0.5235092,1,1,ADCY3__credibleSet1__selected.txt,ENSG00000224165__DNAJC27-AS1__credibleSet1.txt
166,1053,rs766751473,rs2522051,8.21897e-05,0.1037468,0.0001483852,0.18588399,0.7101387,1,1,IRF1__credibleSet1__selected.txt,ENSG00000197375__SLC22A5__credibleSet1.txt
167,271,rs231972,rs3785354,5.885253e-13,0.004797354,3.313159e-11,0.26861842,0.7265842,1,1,IL27__credibleSet2__selected.txt,ENSG00000188322__SBK1__credibleSet1.txt
168,1455,rs12128789,rs11120052,2.231897e-05,0.02473829,0.0002027782,0.2232553,0.7517813,1,3,BATF3__credibleSet1__selected.txt,ENSG00000162769__FLVCR1__credibleSet1.txt
169,1685,12:9878144_TA_T,rs2268146,1.693501e-15,3.442699e-10,4.431433e-07,0.08826261,0.9117369,1,1,CD69__credibleSet1__selected.txt,ENSG00000184293__CLECL1__credibleSet1.txt
170,1461,rs7237497,rs8096138,1.408501e-36,1.013935e-10,6.375198e-28,0.04398098,0.956019,1,1,PTPN2__credibleSet1__selected.txt,ENSG00000260302__RP11-973H7.1__credibleSet1.txt
171,962,rs12742756,rs28760325,7.273777e-15,4.420496e-12,7.490366e-05,0.04360861,0.9563165,1,1,INPP5B__credibleSet1__selected.txt,ENSG00000183431__SF3A3__credibleSet1.txt


#### Results for T1D and `sQTL-GTEx`:

In [14]:
df <- read.table("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/4_coloc/full_t1d_sQTL-gtex_coloc.txt", header = T)
df

nsnps,hit1,hit2,PP.H0.abf,PP.H1.abf,PP.H2.abf,PP.H3.abf,PP.H4.abf,idx1,idx2,t1dSignal,qtlSignal
<int>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<int>,<int>,<chr>,<chr>
638,rs689,rs1104890,0.000000e+00,4.124722e-43,1.149637e-315,1.0000000,2.158083e-45,1,1,INS__credibleSet1__selected.txt,ENSG00000214026__MRPL23__chr11:1952855:1972559:clu_4921:ENSG00000214026.10__credibleSet1.txt
701,rs34536443,rs7710,6.734070e-79,5.878217e-39,1.145597e-40,1.0000000,4.583470e-41,1,1,TYK2__credibleSet1__selected.txt,ENSG00000130811__EIF3G__chr19:10118956:10119088:clu_18343:ENSG00000130811.11__credibleSet1.txt
887,rs35320372,rs2402203,0.000000e+00,2.796709e-37,0.000000e+00,1.0000000,7.060248e-37,1,1,CFTR__credibleSet2__selected.txt,ENSG00000001626__CFTR__chr7:117542108:117559464:clu_25847:ENSG00000001626.14__credibleSet1.txt
663,rs34536443,rs2305791,7.315684e-55,6.385912e-15,1.145597e-40,1.0000000,3.380538e-17,1,1,TYK2__credibleSet1__selected.txt,ENSG00000130810__PPAN__chr19:10107854:10107964:clu_18337:ENSG00000130810.19__credibleSet1.txt
440,rs231972,rs147436559,1.278688e-28,1.176191e-18,1.087143e-10,1.0000000,7.847979e-13,1,5,IL27__credibleSet2__selected.txt,ENSG00000184110__EIF3C__chr16:28715434:28723164:clu_13330:ENSG00000184110.14__credibleSet7.txt
440,rs231972,rs4787453,6.695200e-170,6.158525e-160,1.087143e-10,1.0000000,1.190013e-12,1,9,IL27__credibleSet2__selected.txt,ENSG00000184110__EIF3C__chr16:28715434:28723164:clu_13330:ENSG00000184110.14__credibleSet2.txt
440,rs231972,rs12935321,1.683423e-145,1.548483e-135,1.087143e-10,1.0000000,1.227128e-12,1,10,IL27__credibleSet2__selected.txt,ENSG00000184110__EIF3C__chr16:28715434:28723164:clu_13330:ENSG00000184110.14__credibleSet6.txt
440,rs231972,rs11648192,8.014492e-96,7.372065e-86,1.087143e-10,1.0000000,1.677082e-12,1,6,IL27__credibleSet2__selected.txt,ENSG00000184110__EIF3C__chr16:28715434:28723164:clu_13330:ENSG00000184110.14__credibleSet4.txt
440,rs231972,rs34954534,1.227205e-24,1.128835e-14,1.087143e-10,1.0000000,6.651259e-11,1,2,IL27__credibleSet2__selected.txt,ENSG00000184110__EIF3C__chr16:28715434:28723164:clu_13330:ENSG00000184110.14__credibleSet5.txt
368,rs11074908,rs147436559,1.916342e-26,9.886624e-19,1.938318e-08,1.0000000,1.704554e-10,1,5,IL27__credibleSet1__selected.txt,ENSG00000184110__EIF3C__chr16:28715434:28723164:clu_13330:ENSG00000184110.14__credibleSet7.txt


As before, two signals are considered colocalized if `PP.H4.abf >= 0.5`:

In [15]:
df[df$PP.H4.abf >= 0.5,]

,nsnps,hit1,hit2,PP.H0.abf,PP.H1.abf,PP.H2.abf,PP.H3.abf,PP.H4.abf,idx1,idx2,t1dSignal,qtlSignal
,<int>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<int>,<int>,<chr>,<chr>
74,440,rs231972,rs13331170,0.0,0.0,3.644599e-11,0.333913345,0.6660867,1,1,IL27__credibleSet2__selected.txt,ENSG00000184110__EIF3C__chr16:28715434:28723164:clu_13330:ENSG00000184110.14__credibleSet10.txt
75,1442,rs28648882,rs10153800,3.351155e-65,2.202164e-62,0.0003553612,0.231985262,0.7676594,1,1,SEPT2__credibleSet1__selected.txt,ENSG00000168385__SEPT2__chr2:241316540:241317499:clu_38317:ENSG00000168385.17__credibleSet2.txt
76,887,rs35320372,rs177069,0.0,7.806095e-37,0.0,0.002031573,0.9979684,1,2,CFTR__credibleSet2__selected.txt,ENSG00000001626__CFTR__chr7:117542108:117559464:clu_25847:ENSG00000001626.14__credibleSet2.txt
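Since all the result files share the same columns, the colocalized rows can be stacked into one table to see which T1D loci are supported by more than one QTL dataset. The sketch below uses a handful of rows typed in from the tables above rather than reading the files. The `dataset` labels are names chosen here, and it assumes the gene symbol is always the second `__`-delimited field of `qtlSignal`.

```r
# One small data frame per QTL dataset (a few rows copied from the tables above).
results <- list(
    "eQTL-InsPIRE" = data.frame(
        t1dSignal = c("GSDMB__credibleSet1__selected.txt",
                      "ADCY3__credibleSet1__selected.txt"),
        qtlSignal = c("ENSG00000073605__GSDMB__credibleSet1.txt",
                      "ENSG00000138031__ADCY3__credibleSet1.txt"),
        PP.H4.abf = c(0.7949997, 0.9575242)),
    "eQTL-GTEx" = data.frame(
        t1dSignal = c("ADCY3__credibleSet1__selected.txt",
                      "CD69__credibleSet1__selected.txt"),
        qtlSignal = c("ENSG00000224165__DNAJC27-AS1__credibleSet1.txt",
                      "ENSG00000184293__CLECL1__credibleSet1.txt"),
        PP.H4.abf = c(0.5235092, 0.9117369)),
    "sQTL-GTEx" = data.frame(
        t1dSignal = "CFTR__credibleSet2__selected.txt",
        qtlSignal = "ENSG00000001626__CFTR__chr7:117542108:117559464:clu_25847:ENSG00000001626.14__credibleSet2.txt",
        PP.H4.abf = 0.9979684)
)

# Tag each table with its dataset name and stack them.
combined <- do.call(rbind, lapply(names(results), function(nm) {
    cbind(dataset = nm, results[[nm]])
}))

# Locus = first "__" field of t1dSignal; gene = second "__" field of qtlSignal.
combined$locus <- sub("__.*$", "", combined$t1dSignal)
combined$gene  <- vapply(strsplit(combined$qtlSignal, "__", fixed = TRUE),
                         `[`, character(1), 2)

# Number of distinct QTL datasets supporting each T1D locus.
n_datasets <- tapply(combined$dataset, combined$locus,
                     function(d) length(unique(d)))
n_datasets
```

With the real files, each element of `results` would instead be the `df[df$PP.H4.abf >= 0.5, ]` subset from the corresponding section.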
