# XWAS/COLOC using GTEX v8 brain dataset

**Start date:** 2023-12-28

**Updated date:** 2023-12-28

**Author(s):** Ruth Chia

**Working directory on biowulf:** `/data/ALS_50k/DementiaSeq.TopmedJointCalled.June2020/LBD/Analysis.XWAS_GLM`

In [1]:
!pwd

/vf/users/ALS_50k/DementiaSeq.TopmedJointCalled.June2020/LBD/Analysis.XWAS_GLM


## ***UPDATED 12-28-2023:*** COLOC with GTEX v8 brain dataset

Some notes:
1. Downloaded eQTL associations for 13 brain regions from GTEX v8. The formatted eQTL summ stats for the 13 brain regions are located at: `/data/NDRS_LNG/GTEX_v8/GTEX_v8_chrX_brain/eQTL/Brain_*.v8.EUR.allpairs.chrX.txt`


2. The 13 brain regions are:
- `Brain_Amygdala`
- `Brain_Anterior_cingulate_cortex_BA24`
- `Brain_Caudate_basal_ganglia`
- `Brain_Cerebellar_Hemisphere`
- `Brain_Cerebellum`
- `Brain_Cortex`
- `Brain_Frontal_Cortex_BA9`
- `Brain_Hippocampus`
- `Brain_Hypothalamus`
- `Brain_Nucleus_accumbens_basal_ganglia`
- `Brain_Putamen_basal_ganglia`
- `Brain_Spinal_cord_cervical_c-1`
- `Brain_Substantia_nigra`

    
3. XWAS summary stats to run coloc on (only for regions of interest, i.e. each significant/subsignificant hit +/- 0.5Mb, with the significance threshold set at 7.94 × 10^-6; this follows the coloc recommendation at `https://chr1swallace.github.io/coloc/articles/a02_data.html`):
    - females (not conditioned on ApoE): `/data/ALS_50k/DementiaSeq.TopmedJointCalled.June2020/LBD/Analysis.XWAS_GLM/females_only_x-autosomal-pc/toMeta.LBD.controls.UNRELATED.females.maf0.01overall.hg38.chrX.rsid_REFORMATTED.txt`
    - females (conditioned on ApoE): `/data/ALS_50k/DementiaSeq.TopmedJointCalled.June2020/LBD/Analysis.XWAS_GLM/cond_ApoE4_x-autosomal-pc/females_only/toMeta.LBD.controls.UNRELATED.females.maf0.01overall.hg38.chrX.rsid_REFORMATTED.txt`
    
____

**What I need to do:**
1. Filter the XWAS summ stats to the significant/subsignificant hits, then flank out by 1Mb on either side of each top index variant.
2. Harmonize the eQTL summ stats so that they contain the same set of variants and the same variant IDs as the XWAS (keep the `chr:pos:ref:alt` format).
3. Prep both summ stats (XWAS and eQTL) so that they contain the columns needed to run coloc with the coloc R package (tutorial on running coloc on two summ stats: `https://chr1swallace.github.io/coloc/articles/a02_data.html`).

In [5]:
!mkdir Analysis.COLOC_redo

In [6]:
!mkdir Analysis.COLOC_redo/females_only
!mkdir Analysis.COLOC_redo/females_only_cond_ApoE4

In [7]:
!mkdir Analysis.COLOC_redo/GTEXv8_eQTL_brains
!mkdir Analysis.COLOC_redo/GTEXv8_sQTL_brains

### Create subset of snp region to run coloc on 

#### females-only xwas (unconditioned on APOE4)

top index variants from Table 1 of paper are:
- rs141773145 (pos = 19,513,849)
- rs12860838 (pos = 117,373,825)

use these variants and flank out by 1Mb

In [142]:
import pandas as pd
import numpy as np

females_uncond = pd.read_csv("/data/ALS_50k/DementiaSeq.TopmedJointCalled.June2020/LBD/Analysis.XWAS_GLM/females_only_x-autosomal-pc/toMeta.LBD.controls.UNRELATED.females.maf0.01overall.hg38.chrX.rsid_REFORMATTED.txt",sep="\t")
print(females_uncond.columns)

Index(['CHROM', 'POS', 'ID', 'rsID', 'EffectAllele', 'OtherAllele', 'BETA',
       'BETA_SE', 'OR', 'OR_low95', 'OR_high95', 'P', 'MAF_EffectAllele',
       'MAF_EffectAllele_CASE', 'MAF_EffectAllele_CTRL', 'CT_EffectAllele',
       'CT_EffectAllele_CASE', 'CT_EffectAllele_CTRL', 'ALLELE_CT',
       'CASE_ALLELE_CT', 'CTRL_ALLELE_CT'],
      dtype='object')


In [143]:
females_uncond.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
POS,258247.0,79333550.0,45637530.0,2781514.0,36143080.0,83587170.0,119362400.0,155700600.0
BETA,258247.0,-0.006831485,0.1435296,-1.369221,-0.06981106,-0.00273975,0.06471038,0.9655455
BETA_SE,258247.0,0.1180516,0.06720031,0.055391,0.0636907,0.0873601,0.1600105,0.440813
OR,258247.0,1.003327,0.1435414,0.254305,0.9325705,0.997274,1.06685,2.62622
OR_low95,258247.0,0.8030415,0.1415592,0.1081022,0.7350166,0.836058,0.8983707,1.628731
OR_high95,258247.0,1.275058,0.2717356,0.5841018,1.1119,1.193951,1.344726,4.234603
P,258247.0,0.4850853,0.2933628,3.43679e-05,0.224982,0.480579,0.7400285,0.999995
MAF_EffectAllele,258247.0,0.1689946,0.1479782,0.0101396,0.0340758,0.122839,0.284741,0.5
MAF_EffectAllele_CASE,258247.0,0.1685472,0.1480764,0.00316456,0.0337553,0.12289,0.285074,0.529008
MAF_EffectAllele_CTRL,258247.0,0.1692005,0.1480586,0.00703883,0.0339806,0.12233,0.283981,0.512864


In [144]:
females_uncond.head(2)

Unnamed: 0,CHROM,POS,ID,rsID,EffectAllele,OtherAllele,BETA,BETA_SE,OR,OR_low95,...,P,MAF_EffectAllele,MAF_EffectAllele_CASE,MAF_EffectAllele_CTRL,CT_EffectAllele,CT_EffectAllele_CASE,CT_EffectAllele_CTRL,ALLELE_CT,CASE_ALLELE_CT,CTRL_ALLELE_CT
0,X,2781514,chrX:2781514:C:A,rs311165,A,C,-0.02556,0.058254,0.974764,0.869586,...,0.660837,0.416057,0.410338,0.418689,2503,778,1725,6016,1896,4120
1,X,2781604,chrX:2781604:G:T,rs28579419,T,G,0.022065,0.082083,1.02231,0.870387,...,0.788093,0.13863,0.14346,0.136408,834,272,562,6016,1896,4120
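
As a quick sanity check on the columns printed above, the sketch below assumes that `OR = exp(BETA)`, that the 95% bounds are a Wald interval `exp(BETA ± 1.96 × BETA_SE)`, and that `MAF_EffectAllele = CT_EffectAllele / ALLELE_CT`. These relationships are assumptions about how the summ stats were generated, not something stated in the notebook; the first row (rs311165) is consistent with them. The helper name `odds_ratio_ci` is hypothetical.

```python
import math

# First row of the table above (rs311165); values copied from the printed output.
beta, beta_se = -0.02556, 0.058254
ct_effect, allele_ct = 2503, 6016


def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a Wald interval on the log-odds scale."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


or_, low, high = odds_ratio_ci(beta, beta_se)
maf = ct_effect / allele_ct  # effect-allele count over total allele count

print(round(or_, 4), round(low, 4), round(maf, 4))
```

The rounded values can be compared by eye with the `OR`, `OR_low95` and `MAF_EffectAllele` columns of the same row.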


In [145]:
females_uncond_indexvar_1_flank1Mb = females_uncond[(females_uncond.POS > 19513849 - 1000000) & (females_uncond.POS < 19513849 + 1000000)]
females_uncond_indexvar_2_flank1Mb = females_uncond[(females_uncond.POS > 117373825 - 1000000) & (females_uncond.POS < 117373825 + 1000000)]

In [146]:
females_uncond_indexvar_1_flank1Mb.to_csv("Analysis.COLOC_redo/females_only/XWAS_summ_stats_rs141773145_flank1Mb_no-maf-filter.txt",sep="\t",index=False,header=True)
females_uncond_indexvar_2_flank1Mb.to_csv("Analysis.COLOC_redo/females_only/XWAS_summ_stats_rs12860838_flank1Mb_no-maf-filter.txt",sep="\t",index=False,header=True)

In [147]:
print(females_uncond_indexvar_1_flank1Mb.shape)
print(females_uncond_indexvar_2_flank1Mb.shape)

(2419, 21)
(5314, 21)
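
The per-variant filters above repeat the same window logic for each index variant. A minimal sketch of the same step as a reusable helper is shown here on toy data; the function name `flank_region` and the toy positions are made up for illustration, and the strict inequalities mirror the cells above.

```python
import pandas as pd


def flank_region(df, index_pos, flank=1_000_000, pos_col="POS"):
    """Return rows strictly within `flank` bp of `index_pos`."""
    lower, upper = index_pos - flank, index_pos + flank
    return df[(df[pos_col] > lower) & (df[pos_col] < upper)]


# Toy summary statistics; the two outer positions sit exactly on the window edges.
toy = pd.DataFrame({
    "POS": [18_513_849, 18_513_850, 19_513_849, 20_513_848, 20_513_849],
    "ID": ["a", "b", "c", "d", "e"],
})

index_variants = {"rs141773145": 19_513_849}
regions = {rsid: flank_region(toy, pos) for rsid, pos in index_variants.items()}

print(regions["rs141773145"]["ID"].tolist())
```

Because the bounds are strict, variants sitting exactly 1Mb away from the index variant are excluded.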


#### females-only xwas (conditioned on APOE4)

top index variants from Table 1 of paper are:
- rs141773145 (pos = 19,513,849)
- rs6648060 (pos = 76,575,769)
- rs141193614 (pos = 98,584,345)
- rs12860838 (pos = 117,373,825)

use these variants and flank out by 1Mb

In [149]:
import pandas as pd
import numpy as np

females_cond = pd.read_csv("/data/ALS_50k/DementiaSeq.TopmedJointCalled.June2020/LBD/Analysis.XWAS_GLM/cond_ApoE4_x-autosomal-pc/females_only/toMeta.LBD.controls.UNRELATED.females.maf0.01overall.hg38.chrX.rsid_REFORMATTED.txt",sep="\t")
print(females_cond.columns)

Index(['CHROM', 'POS', 'ID', 'rsID', 'EffectAllele', 'OtherAllele', 'BETA',
       'BETA_SE', 'OR', 'OR_low95', 'OR_high95', 'P', 'MAF_EffectAllele',
       'MAF_EffectAllele_CASE', 'MAF_EffectAllele_CTRL', 'CT_EffectAllele',
       'CT_EffectAllele_CASE', 'CT_EffectAllele_CTRL', 'ALLELE_CT',
       'CASE_ALLELE_CT', 'CTRL_ALLELE_CT'],
      dtype='object')


In [150]:
females_cond_indexvar_1_flank1Mb = females_cond[(females_cond.POS > 19513849 - 1000000) & (females_cond.POS < 19513849 + 1000000)]
females_cond_indexvar_2_flank1Mb = females_cond[(females_cond.POS > 76575769 - 1000000) & (females_cond.POS < 76575769 + 1000000)]
females_cond_indexvar_3_flank1Mb = females_cond[(females_cond.POS > 98584345 - 1000000) & (females_cond.POS < 98584345 + 1000000)]
females_cond_indexvar_4_flank1Mb = females_cond[(females_cond.POS > 117373825 - 1000000) & (females_cond.POS < 117373825 + 1000000)]

In [151]:
females_cond_indexvar_1_flank1Mb.to_csv("Analysis.COLOC_redo/females_only_cond_ApoE4/XWAS_summ_stats_rs141773145_flank1Mb_no-maf-filter.txt",sep="\t",index=False,header=True)
females_cond_indexvar_2_flank1Mb.to_csv("Analysis.COLOC_redo/females_only_cond_ApoE4/XWAS_summ_stats_rs6648060_flank1Mb_no-maf-filter.txt",sep="\t",index=False,header=True)
females_cond_indexvar_3_flank1Mb.to_csv("Analysis.COLOC_redo/females_only_cond_ApoE4/XWAS_summ_stats_rs141193614_flank1Mb_no-maf-filter.txt",sep="\t",index=False,header=True)
females_cond_indexvar_4_flank1Mb.to_csv("Analysis.COLOC_redo/females_only_cond_ApoE4/XWAS_summ_stats_rs12860838_flank1Mb_no-maf-filter.txt",sep="\t",index=False,header=True)

In [152]:
print(females_cond_indexvar_1_flank1Mb.shape)
print(females_cond_indexvar_2_flank1Mb.shape)
print(females_cond_indexvar_3_flank1Mb.shape)
print(females_cond_indexvar_4_flank1Mb.shape)

(2422, 21)
(5192, 21)
(2416, 21)
(5315, 21)


### Harmonize eQTL dataset

Use the unconditioned XWAS from females only to reformat variant ID.

The conditioned XWAS summ stats should have the same REF/ALT designation.
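
The harmonization idea can be illustrated on toy data before running it on the full files. This is a minimal sketch, assuming GTEX `variant_id` values of the form `chrX_<pos>_<ref>_<alt>_b38` (consistent with the five-way split used in the cells that follow); the variant IDs and effect sizes here are invented. A variant either matches the XWAS `ID` directly, or matches with REF/ALT swapped, in which case the sign of `beta` is flipped.

```python
import pandas as pd

# Toy inputs; IDs and effect sizes are invented for illustration.
xwas_ids = {"chrX:100:C:A", "chrX:200:T:G"}
eqtl = pd.DataFrame({
    "variant_id": ["chrX_100_C_A_b38", "chrX_200_G_T_b38", "chrX_300_A_C_b38"],
    "beta": [0.10, 0.25, -0.40],
})

# Split the GTEX-style ID into its parts and rebuild both allele orientations.
parts = eqtl["variant_id"].str.split("_", expand=True)
parts.columns = ["chrom", "POS", "ref", "alt", "build"]
eqtl = pd.concat([eqtl, parts[["chrom", "POS", "ref", "alt"]]], axis=1)
eqtl["var_id"] = eqtl["chrom"] + ":" + eqtl["POS"] + ":" + eqtl["ref"] + ":" + eqtl["alt"]
eqtl["var_id_alt"] = eqtl["chrom"] + ":" + eqtl["POS"] + ":" + eqtl["alt"] + ":" + eqtl["ref"]

# Direct matches keep their effect size as is.
direct = eqtl[eqtl["var_id"].isin(xwas_ids)].copy()
direct["ID"] = direct["var_id"]

# Swapped matches take the XWAS orientation, so the effect changes sign.
flipped = eqtl[eqtl["var_id_alt"].isin(xwas_ids)].copy()
flipped["ID"] = flipped["var_id_alt"]
flipped["beta"] = -flipped["beta"]

harmonized = pd.concat([direct, flipped], axis=0)[["ID", "beta"]].reset_index(drop=True)
print(harmonized)
```

The third toy variant is absent from the XWAS set, so it is dropped, which is the "shared variants only" behaviour of the real cells.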

In [153]:
import pandas as pd
import numpy as np

xwas = pd.read_csv("/data/ALS_50k/DementiaSeq.TopmedJointCalled.June2020/LBD/Analysis.XWAS_GLM/females_only_x-autosomal-pc/toMeta.LBD.controls.UNRELATED.females.maf0.01overall.hg38.chrX.rsid_REFORMATTED.txt",sep="\t")     
print(xwas.shape)

(258247, 21)


In [154]:
xwas.head()

Unnamed: 0,CHROM,POS,ID,rsID,EffectAllele,OtherAllele,BETA,BETA_SE,OR,OR_low95,...,P,MAF_EffectAllele,MAF_EffectAllele_CASE,MAF_EffectAllele_CTRL,CT_EffectAllele,CT_EffectAllele_CASE,CT_EffectAllele_CTRL,ALLELE_CT,CASE_ALLELE_CT,CTRL_ALLELE_CT
0,X,2781514,chrX:2781514:C:A,rs311165,A,C,-0.02556,0.058254,0.974764,0.869586,...,0.660837,0.416057,0.410338,0.418689,2503,778,1725,6016,1896,4120
1,X,2781604,chrX:2781604:G:T,rs28579419,T,G,0.022065,0.082083,1.02231,0.870387,...,0.788093,0.13863,0.14346,0.136408,834,272,562,6016,1896,4120
2,X,2781635,chrX:2781635:G:A,rs60075487,A,G,0.114979,0.07713,1.12185,0.964451,...,0.136048,0.153757,0.165084,0.148544,925,313,612,6016,1896,4120
3,X,2781927,chrX:2781927:A:G,rs2306737,A,G,0.053882,0.057451,1.05536,0.942968,...,0.348336,0.436336,0.455169,0.42767,2625,863,1762,6016,1896,4120
4,X,2781986,chrX:2781986:T:C,rs2306736,T,C,0.068425,0.057451,1.07082,0.956783,...,0.233619,0.437334,0.457806,0.427913,2631,868,1763,6016,1896,4120


In [155]:
import warnings
warnings.filterwarnings('ignore')

brain_list = ['Brain_Amygdala','Brain_Anterior_cingulate_cortex_BA24','Brain_Caudate_basal_ganglia','Brain_Cerebellar_Hemisphere','Brain_Cerebellum','Brain_Cortex','Brain_Frontal_Cortex_BA9','Brain_Hippocampus','Brain_Hypothalamus','Brain_Nucleus_accumbens_basal_ganglia','Brain_Putamen_basal_ganglia','Brain_Spinal_cord_cervical_c-1','Brain_Substantia_nigra']

for i in brain_list:
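    # use input_path rather than input, to avoid shadowing the Python built-in
    input_path = "/data/NDRS_LNG/GTEX_v8/GTEX_v8_chrX_brain/eQTL/" + i + ".v8.EUR.allpairs.chrX.txt"
    eqtl = pd.read_csv(input_path,sep="\t")
    
    # split variant_id into chrom, position, ref, alt and build
    temp = eqtl['variant_id'].str.split("_", expand=True)
    temp.columns = ['chrom','POS','ref','alt','build']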
    
    eqtl_updated = pd.concat([eqtl,temp[['chrom','POS','ref','alt']]], axis=1)
    eqtl_updated['var_id'] = eqtl_updated['chrom'] + ":" + eqtl_updated['POS'] + ":" + eqtl_updated['ref'] + ":" + eqtl_updated['alt']
    eqtl_updated['var_id_alt'] = eqtl_updated['chrom'] + ":" + eqtl_updated['POS'] + ":" + eqtl_updated['alt'] + ":" + eqtl_updated['ref']
    
    # check against xwas summstats SNP ID
    ## eqtl_updated var_id that matches chrom:pos:ref:alt
    temp1 = eqtl_updated[(eqtl_updated.var_id.isin(xwas.ID))]
    temp1['ID'] = temp1['var_id']
    temp1['REF'] = temp1['ref']
    temp1['ALT'] = temp1['alt']
    temp1 = temp1[['phenotype_id', 'variant_id', 'tss_distance', 'ma_samples', 'ma_count',
       'pval_nominal', 'beta', 'varbeta', 'N', 'MAF', 'chrom', 'POS', 'REF',
       'ALT', 'ID']]
    print(temp1.shape)

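    ## eqtl_updated var_id_alt (chrom:pos:alt:ref, i.e. REF/ALT swapped) that matches the xwas ID
    # for those that match, flip the sign of beta; a minor allele frequency is unchanged by the swap
    # (if MAF were an ALT allele frequency instead, the flipped value would be 1 - MAF)
    temp2 = eqtl_updated[(eqtl_updated.var_id_alt.isin(xwas.ID))]
    print(temp2.shape)
    temp2['beta_corr'] = -1 * temp2['beta']
    temp2['MAF_corr'] = temp2['MAF']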
    temp2['ID'] = temp2['var_id_alt']
    temp2['REF'] = temp2['alt']
    temp2['ALT'] = temp2['ref']
    temp2 = temp2[['phenotype_id', 'variant_id', 'tss_distance', 'ma_samples', 'ma_count',
       'pval_nominal', 'beta_corr', 'varbeta', 'N', 'MAF_corr', 'chrom', 'POS', 'REF',
       'ALT', 'ID']]
    temp2 = temp2.rename(columns={'beta_corr':'beta', 'MAF_corr':'MAF'})

    # concat both temp1 and temp2 dataframes - these will only contain variants that are shared between xwas summ stats and eqtl
    eqtl_updated_clean = pd.concat([temp1,temp2], axis = 0)
    print(eqtl_updated_clean.shape)
    
    # get rsid from xwas summ stats
    rsid_map = xwas[['ID','rsID']]
    eqtl_updated_clean_rsid = eqtl_updated_clean.merge(rsid_map, on="ID")
    eqtl_updated_clean_rsid.to_csv("Analysis.COLOC_redo/GTEXv8_eQTL_brains/" + i + ".v8.EUR.allpairs.chrX_harmonized_snpid_shared.txt",sep="\t",index=False,header=True)


(2074203, 15)
(0, 16)
(2074203, 15)
(2086703, 15)
(0, 16)
(2086703, 15)
(2090599, 15)
(0, 16)
(2090599, 15)
(2085779, 15)
(0, 16)
(2085779, 15)
(2087214, 15)
(0, 16)
(2087214, 15)
(2086016, 15)
(0, 16)
(2086016, 15)
(2096348, 15)
(0, 16)
(2096348, 15)
(2052472, 15)
(0, 16)
(2052472, 15)
(2166400, 15)
(0, 16)
(2166400, 15)
(2133849, 15)
(0, 16)
(2133849, 15)
(1997686, 15)
(0, 16)
(1997686, 15)
(2091078, 15)
(0, 16)
(2091078, 15)
(2022392, 15)
(0, 16)
(2022392, 15)


In [156]:
import warnings
warnings.filterwarnings('ignore')

brain_list = ['Brain_Amygdala','Brain_Anterior_cingulate_cortex_BA24','Brain_Caudate_basal_ganglia','Brain_Cerebellar_Hemisphere','Brain_Cerebellum','Brain_Cortex','Brain_Frontal_Cortex_BA9','Brain_Hippocampus','Brain_Hypothalamus','Brain_Nucleus_accumbens_basal_ganglia','Brain_Putamen_basal_ganglia','Brain_Spinal_cord_cervical_c-1','Brain_Substantia_nigra']

for i in brain_list:
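    # use input_path rather than input, to avoid shadowing the Python built-in
    input_path = "/data/NDRS_LNG/GTEX_v8/GTEX_v8_chrX_brain/eQTL/" + i + ".v8.EUR.signif_pairs.txt"
    eqtl = pd.read_csv(input_path,sep="\t")
    
    # split variant_id into chrom, position, ref, alt and build
    temp = eqtl['variant_id'].str.split("_", expand=True)
    temp.columns = ['chrom','POS','ref','alt','build']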
    
    eqtl_updated = pd.concat([eqtl,temp[['chrom','POS','ref','alt']]], axis=1)
    eqtl_updated['var_id'] = eqtl_updated['chrom'] + ":" + eqtl_updated['POS'] + ":" + eqtl_updated['ref'] + ":" + eqtl_updated['alt']
    eqtl_updated['var_id_alt'] = eqtl_updated['chrom'] + ":" + eqtl_updated['POS'] + ":" + eqtl_updated['alt'] + ":" + eqtl_updated['ref']
    
    # check against xwas summstats SNP ID
    ## eqtl_updated var_id that matches chrom:pos:ref:alt
    temp1 = eqtl_updated[(eqtl_updated.var_id.isin(xwas.ID))]
    temp1['ID'] = temp1['var_id']
    temp1['REF'] = temp1['ref']
    temp1['ALT'] = temp1['alt']
    temp1 = temp1[['phenotype_id', 'variant_id', 'tss_distance', 'ma_samples', 'ma_count',
       'pval_nominal', 'beta', 'varbeta', 'N', 'MAF', 'chrom', 'POS', 'REF',
       'ALT', 'ID']]
    print(temp1.shape)

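    ## eqtl_updated var_id_alt (chrom:pos:alt:ref, i.e. REF/ALT swapped) that matches the xwas ID
    # for those that match, flip the sign of beta; a minor allele frequency is unchanged by the swap
    # (if MAF were an ALT allele frequency instead, the flipped value would be 1 - MAF)
    temp2 = eqtl_updated[(eqtl_updated.var_id_alt.isin(xwas.ID))]
    print(temp2.shape)
    temp2['beta_corr'] = -1 * temp2['beta']
    temp2['MAF_corr'] = temp2['MAF']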
    temp2['ID'] = temp2['var_id_alt']
    temp2['REF'] = temp2['alt']
    temp2['ALT'] = temp2['ref']
    temp2 = temp2[['phenotype_id', 'variant_id', 'tss_distance', 'ma_samples', 'ma_count',
       'pval_nominal', 'beta_corr', 'varbeta', 'N', 'MAF_corr', 'chrom', 'POS', 'REF',
       'ALT', 'ID']]
    temp2 = temp2.rename(columns={'beta_corr':'beta', 'MAF_corr':'MAF'})

    # concat both temp1 and temp2 dataframes - these will only contain variants that are shared between xwas summ stats and eqtl
    eqtl_updated_clean = pd.concat([temp1,temp2], axis = 0)
    print(eqtl_updated_clean.shape)
    
    # get rsid from xwas summ stats
    rsid_map = xwas[['ID','rsID']]
    eqtl_updated_clean_rsid = eqtl_updated_clean.merge(rsid_map, on="ID")
    eqtl_updated_clean_rsid.to_csv("Analysis.COLOC_redo/GTEXv8_eQTL_brains/" + i + ".v8.EUR.signif_pairs.chrX_harmonized_snpid_shared.txt",sep="\t",index=False,header=True)


(5566, 15)
(0, 16)
(5566, 15)
(7962, 15)
(0, 16)
(7962, 15)
(16002, 15)
(0, 16)
(16002, 15)
(19720, 15)
(0, 16)
(19720, 15)
(27471, 15)
(0, 16)
(27471, 15)
(18755, 15)
(0, 16)
(18755, 15)
(17151, 15)
(0, 16)
(17151, 15)
(8407, 15)
(0, 16)
(8407, 15)
(10050, 15)
(0, 16)
(10050, 15)
(16032, 15)
(0, 16)
(16032, 15)
(10713, 15)
(0, 16)
(10713, 15)
(8080, 15)
(0, 16)
(8080, 15)
(3999, 15)
(0, 16)
(3999, 15)


In [157]:
import warnings
warnings.filterwarnings('ignore')

brain_list = ['Brain_Amygdala','Brain_Anterior_cingulate_cortex_BA24','Brain_Caudate_basal_ganglia','Brain_Cerebellar_Hemisphere','Brain_Cerebellum','Brain_Cortex','Brain_Frontal_Cortex_BA9','Brain_Hippocampus','Brain_Hypothalamus','Brain_Nucleus_accumbens_basal_ganglia','Brain_Putamen_basal_ganglia','Brain_Spinal_cord_cervical_c-1','Brain_Substantia_nigra']

for i in brain_list:
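    # use input_path rather than input, to avoid shadowing the Python built-in
    input_path = "/data/NDRS_LNG/GTEX_v8/GTEX_v8_chrX_brain/eQTL/" + i + ".v8.EUR.egenes.txt"
    eqtl = pd.read_csv(input_path,sep="\t")
    
    # split variant_id into chrom, position, ref, alt and build
    temp = eqtl['variant_id'].str.split("_", expand=True)
    temp.columns = ['chrom','POS','ref','alt','build']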
    
    eqtl_updated = pd.concat([eqtl,temp[['chrom','POS','ref','alt']]], axis=1)
    eqtl_updated['var_id'] = eqtl_updated['chrom'] + ":" + eqtl_updated['POS'] + ":" + eqtl_updated['ref'] + ":" + eqtl_updated['alt']
    eqtl_updated['var_id_alt'] = eqtl_updated['chrom'] + ":" + eqtl_updated['POS'] + ":" + eqtl_updated['alt'] + ":" + eqtl_updated['ref']
    
    # check against xwas summstats SNP ID
    ## eqtl_updated var_id that matches chrom:pos:ref:alt
    temp1 = eqtl_updated[(eqtl_updated.var_id.isin(xwas.ID))]
    temp1['ID'] = temp1['var_id']
    temp1['REF'] = temp1['ref']
    temp1['ALT'] = temp1['alt']
    temp1 = temp1[['phenotype_id', 'variant_id', 'tss_distance', 'ma_samples', 'ma_count',
       'pval_nominal', 'beta', 'varbeta', 'N', 'MAF', 'chrom', 'POS', 'REF',
       'ALT', 'ID']]
    print(temp1.shape)

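    ## eqtl_updated var_id_alt (chrom:pos:alt:ref, i.e. REF/ALT swapped) that matches the xwas ID
    # for those that match, flip the sign of beta; a minor allele frequency is unchanged by the swap
    # (if MAF were an ALT allele frequency instead, the flipped value would be 1 - MAF)
    temp2 = eqtl_updated[(eqtl_updated.var_id_alt.isin(xwas.ID))]
    print(temp2.shape)
    temp2['beta_corr'] = -1 * temp2['beta']
    temp2['MAF_corr'] = temp2['MAF']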
    temp2['ID'] = temp2['var_id_alt']
    temp2['REF'] = temp2['alt']
    temp2['ALT'] = temp2['ref']
    temp2 = temp2[['phenotype_id', 'variant_id', 'tss_distance', 'ma_samples', 'ma_count',
       'pval_nominal', 'beta_corr', 'varbeta', 'N', 'MAF_corr', 'chrom', 'POS', 'REF',
       'ALT', 'ID']]
    temp2 = temp2.rename(columns={'beta_corr':'beta', 'MAF_corr':'MAF'})

    # concat both temp1 and temp2 dataframes - these will only contain variants that are shared between xwas summ stats and eqtl
    eqtl_updated_clean = pd.concat([temp1,temp2], axis = 0)
    print(eqtl_updated_clean.shape)
    
    # get rsid from xwas summ stats
    rsid_map = xwas[['ID','rsID']]
    eqtl_updated_clean_rsid = eqtl_updated_clean.merge(rsid_map, on="ID")
    eqtl_updated_clean_rsid.to_csv("Analysis.COLOC_redo/GTEXv8_eQTL_brains/" + i + ".v8.EUR.egenes.chrX_harmonized_snpid_shared.txt",sep="\t",index=False,header=True)


(675, 15)
(0, 16)
(675, 15)
(688, 15)
(0, 16)
(688, 15)
(679, 15)
(0, 16)
(679, 15)
(687, 15)
(0, 16)
(687, 15)
(693, 15)
(0, 16)
(693, 15)
(686, 15)
(0, 16)
(686, 15)
(683, 15)
(0, 16)
(683, 15)
(680, 15)
(0, 16)
(680, 15)
(704, 15)
(0, 16)
(704, 15)
(690, 15)
(0, 16)
(690, 15)
(635, 15)
(0, 16)
(635, 15)
(698, 15)
(0, 16)
(698, 15)
(652, 15)
(0, 16)
(652, 15)
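
The three harmonization cells above differ only in the input file suffix and the matching output name. A minimal sketch of how the paths could be built from one table is shown below, so that a single loop could cover `allpairs`, `signif_pairs` and `egenes`. The names `FILE_KINDS` and `io_paths` are hypothetical; the directory and file name patterns are copied from the cells above, and nothing is read from disk here.

```python
from pathlib import Path

EQTL_DIR = Path("/data/NDRS_LNG/GTEX_v8/GTEX_v8_chrX_brain/eQTL")
OUT_DIR = Path("Analysis.COLOC_redo/GTEXv8_eQTL_brains")

# input suffix -> label used in the output file name (as in the three cells above)
FILE_KINDS = {
    ".v8.EUR.allpairs.chrX.txt": "allpairs",
    ".v8.EUR.signif_pairs.txt": "signif_pairs",
    ".v8.EUR.egenes.txt": "egenes",
}


def io_paths(tissue, in_suffix):
    """Return (input path, output path) for one tissue and one file kind."""
    out_name = f"{tissue}.v8.EUR.{FILE_KINDS[in_suffix]}.chrX_harmonized_snpid_shared.txt"
    return EQTL_DIR / f"{tissue}{in_suffix}", OUT_DIR / out_name


for in_suffix in FILE_KINDS:
    src, dst = io_paths("Brain_Amygdala", in_suffix)
    print(src.name, "->", dst.name)
```

With this in place, the harmonization body would only need to be written once and called inside a loop over tissues and suffixes.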


### run COLOC for each brain tissue and for each subsetted xwas region

#### Download LD binaries for creating LD matrix for analysis using `coloc.susie`

ref link: `https://mrcieu.github.io/ieugwasr/reference/ld_matrix.html`


In [None]:
%%bash
cd Analysis.COLOC_redo/
mkdir LD
cd LD
wget http://fileserve.mrcieu.ac.uk/ld/1kg.v3.tgz

In [None]:
%%bash
cd Analysis.COLOC_redo/LD
tar -xvzf 1kg.v3.tgz

#### females-only (unconditioned on APOE4)
##### index_var1 (rs141773145)

In [158]:
%%bash
cd Analysis.COLOC_redo/females_only

module load R/4.3
R --vanilla --no-save

require(data.table)
require(tidyverse)
# install.packages("coloc")
require(coloc)
#devtools::install_github("mrcieu/ieugwasr", force=TRUE)
require(ieugwasr)
#devtools::install_github("explodecomputer/genetics.binaRies")
genetics.binaRies::get_plink_binary()

brain_tissues <- c("Brain_Amygdala","Brain_Anterior_cingulate_cortex_BA24","Brain_Caudate_basal_ganglia","Brain_Cerebellar_Hemisphere","Brain_Cerebellum","Brain_Cortex","Brain_Frontal_Cortex_BA9","Brain_Hippocampus","Brain_Hypothalamus","Brain_Nucleus_accumbens_basal_ganglia","Brain_Putamen_basal_ganglia","Brain_Spinal_cord_cervical_c-1","Brain_Substantia_nigra")

run_coloc <- function(rsid,i){
    # read in files
    tissue <- brain_tissues[i]
    xwas <- fread(paste("XWAS_summ_stats_",rsid,"_flank1Mb_no-maf-filter.txt",sep="")) %>% arrange(POS)
    eqtl <- fread(paste("../GTEXv8_eQTL_brains/",tissue,".v8.EUR.allpairs.chrX_harmonized_snpid_shared.txt",sep="")) %>% arrange(POS)
    dim(eqtl)
    dim(xwas)    
    
    # prep so that only columns that are needed for coloc is present
    eqtl_subset <- subset(eqtl, eqtl$ID %in% xwas$ID) %>%
                   mutate(position = POS,
                          snp = ID) %>%
                   filter(!is.na(beta)) %>%
                   filter(rsID != "") %>%
                   select(rsID, snp, position, beta, varbeta, N, MAF, phenotype_id, pval_nominal) %>%
                   arrange(snp,pval_nominal) %>%
                   group_by(snp) %>%
                   slice(1:1) %>%
                   arrange(position) %>%
                   data.frame()
  
    xwas_subset <- subset(xwas, xwas$ID %in% eqtl_subset$snp) %>%
                   filter(rsID != "") %>%
                   mutate(position = POS,
                          snp = ID, 
                          beta = BETA,
                          varbeta = (BETA_SE**2)) %>%
                   select(rsID, snp, position, beta, varbeta,P)
    
    # check dimensions
    dim(eqtl_subset)
    dim(xwas_subset)
        
    # convert to acceptable data structure for input in coloc
    eqtl_subset_list <- as.list(eqtl_subset)
    eqtl_subset_list$type <- "quant"
    xwas_subset_list <- as.list(xwas_subset)
    xwas_subset_list$type <- "cc"
    xwas_subset_list$N <- 2591 + 4023
    #check_dataset(eqtl_subset_list)
    #check_dataset(xwas_subset_list)
        
    # run coloc.abf
    print(paste("run coloc for: ",tissue,sep=""))
    my.res <- coloc.abf(dataset1=xwas_subset_list,
                    dataset2=eqtl_subset_list)
    print(my.res)
    output_name <- paste("coloc_",rsid,"_",tissue, sep="")

    summary <- my.res$summary %>% data.frame() %>% t() %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    results <- my.res$results %>% data.frame() %>% mutate(Tissue = tissue,index_variant = rsid)
    priors <- my.res$priors %>% data.frame() %>% t() %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    
    summary
    results
    priors
    
    write.table(summary, paste(output_name,"_summary.txt",sep=""), quote=F, sep="\t", row.names=F, col.names=T)
    write.table(results, paste(output_name,"_results.txt",sep=""), quote=F, sep="\t", row.names=F, col.names=T)
    write.table(priors, paste(output_name,"_priors.txt",sep=""), quote=F, sep="\t", row.names=F, col.names=T)
    print(paste("saving coloc results for: ",tissue,sep=""))
}

for (i in 1:length(brain_tissues)){
    run_coloc("rs141773145", i)
}




R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
> 
> require(data.table)


Loading required package: data.table


> require(tidyverse)


Loading required package: tidyverse
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     

> # install.packages("coloc")
> require(coloc)


Loading required package: coloc
This is coloc version 5.2.3


> #devtools::install_github("mrcieu/ieugwasr", force=TRUE)
> require(ieugwasr)


Loading required package: ieugwasr
API: public: http://gwas-api.mrcieu.ac.uk/


> #devtools::install_github("explodecomputer/genetics.binaRies")
> genetics.binaRies::get_plink_binary()
[1] "/gpfs/gsfs9/users/chiarp/R/rhel8/4.3/genetics.binaRies/bin/plink"
> 

Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


       nsnps           H0           H1           H2           H3           H4 
2.015000e+03 6.361878e-01 9.806622e-02 2.183740e-01 3.364789e-02 1.372406e-02 
[1] "saving coloc results for: Brain_Amygdala"
[1] "run coloc for: Brain_Anterior_cingulate_cortex_BA24"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.6440    0.0998    0.2090    0.0324    0.0138 
[1] "PP abf for shared variant: 1.38%"
   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5345154 0.2022 0.2022 0.04086462 0.02022
       nsnps           H0           H1           H2           H3           H4 
2.022000e+03 6.444935e-01 9.980599e-02 2.094822e-01 3.242653e-02 1.379177e-02 
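
The hypothesis priors printed in the block above (the `H0` to `H4` row under `p1 p2 p12`) can be reproduced from the number of SNPs and the per-SNP priors. The sketch below assumes the usual conversion: `H1 = n·p1`, `H2 = n·p2`, `H3 = n·(n-1)·p1·p2`, `H4 = n·p12`, and `H0` as the remainder. This is an assumption about what coloc does internally, checked here only against the printed values for `nsnps = 2022`; the function name `hypothesis_priors` is hypothetical.

```python
def hypothesis_priors(nsnps, p1=1e-4, p2=1e-4, p12=1e-5):
    """Convert per-SNP priors into priors for hypotheses H0..H4."""
    h1 = nsnps * p1                      # association with trait 1 only
    h2 = nsnps * p2                      # association with trait 2 only
    h3 = nsnps * (nsnps - 1) * p1 * p2   # both traits, two distinct causal variants
    h4 = nsnps * p12                     # both traits, one shared causal variant
    h0 = 1 - (h1 + h2 + h3 + h4)         # no association with either trait
    return {"H0": h0, "H1": h1, "H2": h2, "H3": h3, "H4": h4}


priors = hypothesis_priors(2022)
print({k: round(v, 7) for k, v in priors.items()})
```

This also explains why the prior rows change slightly between tissues: `nsnps` differs from one tissue to the next while `p1`, `p2` and `p12` stay fixed.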


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


[1] "saving coloc results for: Brain_Anterior_cingulate_cortex_BA24"
[1] "run coloc for: Brain_Caudate_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
  0.01330   0.00208   0.82600   0.12900   0.03020 
[1] "PP abf for shared variant: 3.02%"
   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5315088 0.2034 0.2034 0.04135122 0.02034
       nsnps           H0           H1           H2           H3           H4 
2.034000e+03 1.333040e-02 2.079736e-03 8.256164e-01 1.287780e-01 3.019551e-02 


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


[1] "saving coloc results for: Brain_Caudate_basal_ganglia"
[1] "run coloc for: Brain_Cerebellar_Hemisphere"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
  0.06010   0.00935   0.78000   0.12100   0.02930 
[1] "PP abf for shared variant: 2.93%"
   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5320101 0.2032 0.2032 0.04126992 0.02032
       nsnps           H0           H1           H2           H3           H4 
2.032000e+03 6.007769e-02 9.348435e-03 7.799209e-01 1.213309e-01 2.932206e-02 


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


[1] "saving coloc results for: Brain_Cerebellar_Hemisphere"
[1] "run coloc for: Brain_Cerebellum"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
 5.89e-05  9.19e-06  8.39e-01  1.31e-01  2.98e-02 
[1] "PP abf for shared variant: 2.98%"
   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.5312581 0.2035 0.2035 0.0413919 0.02035
       nsnps           H0           H1           H2           H3           H4 
2.035000e+03 5.890564e-05 9.194100e-06 8.392040e-01 1.309547e-01 2.977322e-02 


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


[1] "saving coloc results for: Brain_Cerebellum"
[1] "run coloc for: Brain_Cortex"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.4600    0.0716    0.3900    0.0608    0.0181 
[1] "PP abf for shared variant: 1.81%"
   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.5312581 0.2035 0.2035 0.0413919 0.02035
       nsnps           H0           H1           H2           H3           H4 
2.035000e+03 4.595582e-01 7.163649e-02 3.899323e-01 6.076500e-02 1.810802e-02 


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


[1] "saving coloc results for: Brain_Cortex"
[1] "run coloc for: Brain_Frontal_Cortex_BA9"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1060    0.0164    0.7360    0.1140    0.0279 
[1] "PP abf for shared variant: 2.79%"
   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5330124 0.2028 0.2028 0.04110756 0.02028
       nsnps           H0           H1           H2           H3           H4 
2.028000e+03 1.055174e-01 1.640405e-02 7.358249e-01 1.143657e-01 2.788800e-02 


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


[1] "saving coloc results for: Brain_Frontal_Cortex_BA9"
[1] "run coloc for: Brain_Hippocampus"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.4700    0.0733    0.3790    0.0592    0.0185 
[1] "PP abf for shared variant: 1.85%"
   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.5312581 0.2035 0.2035 0.0413919 0.02035
       nsnps           H0           H1           H2           H3           H4 
2.035000e+03 4.695511e-01 7.328181e-02 3.794766e-01 5.920560e-02 1.848491e-02 


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


[1] "saving coloc results for: Brain_Hippocampus"
[1] "run coloc for: Brain_Hypothalamus"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.3760    0.0583    0.4710    0.0730    0.0216 
[1] "PP abf for shared variant: 2.16%"
   p1    p2   p12 
1e-04 1e-04 1e-05 
       H0     H1     H2       H3      H4
 0.533764 0.2025 0.2025 0.040986 0.02025
       nsnps           H0           H1           H2           H3           H4 
2.025000e+03 3.759906e-01 5.828073e-02 4.710916e-01 7.300031e-02 2.163673e-02 


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


[1] "saving coloc results for: Brain_Hypothalamus"
[1] "run coloc for: Brain_Nucleus_accumbens_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1830    0.0286    0.6600    0.1030    0.0251 
[1] "PP abf for shared variant: 2.51%"
   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.5312581 0.2035 0.2035 0.0413919 0.02035
       nsnps           H0           H1           H2           H3           H4 
2.035000e+03 1.832167e-01 2.859889e-02 6.600424e-01 1.030030e-01 2.513896e-02 


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


[1] "saving coloc results for: Brain_Nucleus_accumbens_basal_ganglia"
[1] "run coloc for: Brain_Putamen_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
  0.04550   0.00706   0.79200   0.12300   0.03220 
[1] "PP abf for shared variant: 3.22%"
   p1    p2   p12 
1e-04 1e-04 1e-05 
       H0     H1     H2         H3      H4
 0.533263 0.2027 0.2027 0.04106702 0.02027
       nsnps           H0           H1           H2           H3           H4 
2.027000e+03 4.546125e-02 7.064732e-03 7.922128e-01 1.230786e-01 3.218260e-02 


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


[1] "saving coloc results for: Brain_Putamen_basal_ganglia"
[1] "run coloc for: Brain_Spinal_cord_cervical_c-1"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1470    0.0227    0.6950    0.1070    0.0279 
[1] "PP abf for shared variant: 2.79%"
   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5355169 0.2018 0.2018 0.04070306 0.02018
       nsnps           H0           H1           H2           H3           H4 
2.018000e+03 1.467410e-01 2.265066e-02 6.953791e-01 1.073095e-01 2.791976e-02 


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


[1] "saving coloc results for: Brain_Spinal_cord_cervical_c-1"
[1] "run coloc for: Brain_Substantia_nigra"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.6540    0.1000    0.2010    0.0308    0.0140 
[1] "PP abf for shared variant: 1.4%"
   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.5385197 0.2006 0.2006 0.0402203 0.02006
       nsnps           H0           H1           H2           H3           H4 
2.006000e+03 6.537681e-01 1.000758e-01 2.013055e-01 3.080086e-02 1.404979e-02 


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


[1] "saving coloc results for: Brain_Substantia_nigra"




> 
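
Each run writes one `coloc_<rsid>_<tissue>_summary.txt` file per tissue (see the `write.table()` calls in `run_coloc()`), so comparing tissues means opening 13 files. A minimal sketch for stacking them into one table sorted by `PP.H4.abf`, assuming the files sit in the current directory and keep the column names produced by `run_coloc()`. `collect_coloc_summaries` is a hypothetical helper, not part of coloc, and it uses base R only so it runs without extra packages:

```r
# Hypothetical helper (not part of coloc): stack the per-tissue summary files
# written by run_coloc() for one index variant and sort by PP.H4.abf.
collect_coloc_summaries <- function(rsid, tissues, dir = ".") {
    # file names mirror the output_name pattern used in run_coloc()
    files <- file.path(dir, paste0("coloc_", rsid, "_", tissues, "_summary.txt"))
    files <- files[file.exists(files)]          # skip tissues with no output file
    if (length(files) == 0) stop("no coloc summary files found for ", rsid)
    tables <- lapply(files, read.delim, stringsAsFactors = FALSE)
    combined <- do.call(rbind, tables)
    # highest posterior for a shared variant (H4) first
    combined[order(-combined$PP.H4.abf), , drop = FALSE]
}

# Usage inside the notebook, where brain_tissues is already defined:
# collect_coloc_summaries("rs12860838", brain_tissues)
```

Tissues whose summary file is missing are skipped silently, which is an assumption made here for convenience; a stricter version could stop instead.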


##### index_var2 (rs12860838)

In [159]:
%%bash
cd Analysis.COLOC_redo/females_only

module load R/4.3
R --vanilla --no-save

require(data.table)
require(tidyverse)
# install.packages("coloc")
require(coloc)
#devtools::install_github("mrcieu/ieugwasr", force=TRUE)
require(ieugwasr)
#devtools::install_github("explodecomputer/genetics.binaRies")
genetics.binaRies::get_plink_binary()

brain_tissues <- c("Brain_Amygdala","Brain_Anterior_cingulate_cortex_BA24","Brain_Caudate_basal_ganglia","Brain_Cerebellar_Hemisphere","Brain_Cerebellum","Brain_Cortex","Brain_Frontal_Cortex_BA9","Brain_Hippocampus","Brain_Hypothalamus","Brain_Nucleus_accumbens_basal_ganglia","Brain_Putamen_basal_ganglia","Brain_Spinal_cord_cervical_c-1","Brain_Substantia_nigra")

# Run coloc.abf() for one index variant (rsid) against tissue i of brain_tissues,
# then write the summary, per-SNP results, and priors to tab-delimited files.
run_coloc <- function(rsid, i){
    # read in files
    tissue <- brain_tissues[i]
    xwas <- fread(paste0("XWAS_summ_stats_", rsid, "_flank1Mb_no-maf-filter.txt")) %>% arrange(POS)
    eqtl <- fread(paste0("../GTEXv8_eQTL_brains/", tissue, ".v8.EUR.allpairs.chrX_harmonized_snpid_shared.txt")) %>% arrange(POS)
    # dim() is not auto-printed inside a function, so check for empty inputs explicitly
    stopifnot(nrow(eqtl) > 0, nrow(xwas) > 0)
    
    # prep so that only the columns needed for coloc are present;
    # where a SNP has several eQTL rows, keep the row (phenotype_id) with the smallest nominal p-value
    eqtl_subset <- subset(eqtl, eqtl$ID %in% xwas$ID) %>%
                   mutate(position = POS,
                          snp = ID) %>%
                   filter(!is.na(beta)) %>%
                   filter(rsID != "") %>%
                   select(rsID, snp, position, beta, varbeta, N, MAF, phenotype_id, pval_nominal) %>%
                   arrange(snp, pval_nominal) %>%
                   group_by(snp) %>%
                   slice(1) %>%
                   ungroup() %>%
                   arrange(position) %>%
                   data.frame()
  
    # restrict the XWAS to the same SNPs; varbeta is the squared standard error
    xwas_subset <- subset(xwas, xwas$ID %in% eqtl_subset$snp) %>%
                   filter(rsID != "") %>%
                   mutate(position = POS,
                          snp = ID, 
                          beta = BETA,
                          varbeta = BETA_SE^2) %>%
                   select(rsID, snp, position, beta, varbeta, P)
    
    # check that both subsets still contain SNPs
    stopifnot(nrow(eqtl_subset) > 0, nrow(xwas_subset) > 0)
        
    # convert to the list structure that coloc expects
    eqtl_subset_list <- as.list(eqtl_subset)
    eqtl_subset_list$type <- "quant"
    xwas_subset_list <- as.list(xwas_subset)
    xwas_subset_list$type <- "cc"
    xwas_subset_list$N <- 2591 + 4023    # sample size for the XWAS dataset
    #check_dataset(eqtl_subset_list)
    #check_dataset(xwas_subset_list)
        
    # run coloc.abf (dataset1 = XWAS, dataset2 = eQTL)
    print(paste0("run coloc for: ", tissue))
    my.res <- coloc.abf(dataset1=xwas_subset_list,
                    dataset2=eqtl_subset_list)
    print(my.res)
    output_name <- paste0("coloc_", rsid, "_", tissue)

    # tag each output table with the tissue and index variant
    summary <- my.res$summary %>% data.frame() %>% t() %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    results <- my.res$results %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    priors <- my.res$priors %>% data.frame() %>% t() %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    
    write.table(summary, paste0(output_name, "_summary.txt"), quote=FALSE, sep="\t", row.names=FALSE, col.names=TRUE)
    write.table(results, paste0(output_name, "_results.txt"), quote=FALSE, sep="\t", row.names=FALSE, col.names=TRUE)
    write.table(priors, paste0(output_name, "_priors.txt"), quote=FALSE, sep="\t", row.names=FALSE, col.names=TRUE)
    print(paste0("saving coloc results for: ", tissue))
}

for (i in seq_along(brain_tissues)){
    run_coloc("rs12860838", i)
}


[-] Unloading gcc  11.3.0  ... 
[-] Unloading HDF5  1.12.2 
[-] Unloading netcdf  4.9.0 
[-] Unloading openmpi/4.1.3/gcc-11.3.0  ... 
[-] Unloading pandoc  2.18  on cn3180 
[-] Unloading R 4.3.0 
[+] Loading gcc  11.3.0  ... 
[+] Loading HDF5  1.12.2 
[+] Loading netcdf  4.9.0 
[-] Unloading gcc  11.3.0  ... 
[+] Loading gcc  11.3.0  ... 
[+] Loading openmpi/4.1.3/gcc-11.3.0  ... 
[+] Loading pandoc  2.18  on cn3180 
[+] Loading R 4.3.0 



R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> 
> require(data.table)


Loading required package: data.table


> require(tidyverse)


Loading required package: tidyverse
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::between()     masks data.table::between()
✖ dplyr::filter()      masks stats::filter()
✖ dplyr::first()       masks data.table::first()
✖ lubridate::hour()    masks data.table::hour()
✖ lubridate::isoweek() masks data.table::isoweek()
✖ dplyr::lag()         masks stats::lag()
✖ dplyr::last()        masks data.table::last()
✖ lubridate::mday()    masks data.table::mday()
✖ lubridate::minute()  masks data.table::minute()
✖ lubridate::month()   masks data.table::month()
✖ lubridate::quarter() masks data.table::quarter()
✖ lubridate::second()  masks data.table::second()
✖ purrr::transpose()   masks data.tab

> # install.packages("coloc")
> require(coloc)


Loading required package: coloc
This is coloc version 5.2.3


> #devtools::install_github("mrcieu/ieugwasr", force=TRUE)
> require(ieugwasr)


Loading required package: ieugwasr
API: public: http://gwas-api.mrcieu.ac.uk/


> #devtools::install_github("explodecomputer/genetics.binaRies")
> genetics.binaRies::get_plink_binary()
[1] "/gpfs/gsfs9/users/chiarp/R/rhel8/4.3/genetics.binaRies/bin/plink"
> 
> brain_tissues <- c("Brain_Amygdala","Brain_Anterior_cingulate_cortex_BA24","Brain_Caudate_basal_ganglia","Brain_Cerebellar_Hemisphere","Brain_Cerebellum","Brain_Cortex","Brain_Frontal_Cortex_BA9","Brain_Hippocampus","Brain_Hypothalamus","Brain_Nucleus_accumbens_basal_ganglia","Brain_Putamen_basal_ganglia","Brain_Spinal_cord_cervical_c-1","Brain_Substantia_nigra")
> 
> run_coloc <- function(rsid,i){
+     # read in files
+     tissue <- brain_tissues[i]
+     xwas <- fread(paste("XWAS_summ_stats_",rsid,"_flank1Mb_no-maf-filter.txt",sep="")) %>% arrange(POS)
+     eqtl <- fread(paste("../GTEXv8_eQTL_brains/",tissue,".v8.EUR.allpairs.chrX_harmonized_snpid_shared.txt",sep="")) %>% arrange(POS)
+     dim(eqtl)
+     dim(xwas)    
+     
+     # prep so that only columns that are needed for coloc is present
+     

Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2564402 0.4859 0.4859 0.2360502 0.04859
       nsnps           H0           H1           H2           H3           H4 
4.859000e+03 2.394184e-01 3.484568e-01 1.522586e-01 2.215635e-01 3.830275e-02 
[1] "saving coloc results for: Brain_Amygdala"
[1] "run coloc for: Brain_Anterior_cingulate_cortex_BA24"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.2140    0.3120    0.1790    0.2610    0.0353 
[1] "PP abf for shared variant: 3.53%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2665877 0.4892 0.4892 0.2392677 0.04892
       nsnps           H0           H1           H2           H3           H4 
4892.0000000    0.2135729    0.3115344    0.1788009    0.2607779    0.0353139 
[1] "saving coloc results for: Brain_Anterior_cingulate_cortex_BA24"
[1] "run coloc for: Brain_Caudate_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.2230    0.3250    0.1660    0.2420    0.0446 
[1] "PP abf for shared variant: 4.46%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0   H1   H2       H3    H4
 -0.269051 0.49 0.49 0.240051 0.049
       nsnps           H0           H1           H2           H3           H4 
4900.0000000    0.2227690    0.3251229    0.1656927    0.2417777    0.0446377 
[1] "saving coloc results for: Brain_Caudate_basal_ganglia"
[1] "run coloc for: Brain_Cerebellar_Hemisphere"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.0241    0.0351    0.3580    0.5220    0.0615 
[1] "PP abf for shared variant: 6.15%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2681271 0.4897 0.4897 0.2397571 0.04897
       nsnps           H0           H1           H2           H3           H4 
4.897000e+03 2.405103e-02 3.510447e-02 3.575341e-01 5.217890e-01 6.152143e-02 
[1] "saving coloc results for: Brain_Cerebellar_Hemisphere"
[1] "run coloc for: Brain_Cerebellum"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1520    0.2220    0.2340    0.3410    0.0513 
[1] "PP abf for shared variant: 5.13%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2       H3      H4
 -0.269667 0.4902 0.4902 0.240247 0.04902
       nsnps           H0           H1           H2           H3           H4 
4.902000e+03 1.520473e-01 2.219784e-01 2.336369e-01 3.410424e-01 5.129505e-02 
[1] "saving coloc results for: Brain_Cerebellum"
[1] "run coloc for: Brain_Cortex"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1950    0.2850    0.1750    0.2560    0.0891 
[1] "PP abf for shared variant: 8.91%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2684351 0.4898 0.4898 0.2398551 0.04898
       nsnps           H0           H1           H2           H3           H4 
4.898000e+03 1.949325e-01 2.845336e-01 1.754239e-01 2.559687e-01 8.914125e-02 
[1] "saving coloc results for: Brain_Cortex"
[1] "run coloc for: Brain_Frontal_Cortex_BA9"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.2370    0.3460    0.1540    0.2250    0.0391 
[1] "PP abf for shared variant: 3.91%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2       H3      H4
 -0.269667 0.4902 0.4902 0.240247 0.04902
       nsnps           H0           H1           H2           H3           H4 
4902.0000000    0.2367534    0.3456425    0.1538869    0.2246244    0.0390928 
[1] "saving coloc results for: Brain_Frontal_Cortex_BA9"
[1] "run coloc for: Brain_Hippocampus"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
    0.197     0.287     0.194     0.283     0.040 
[1] "PP abf for shared variant: 4%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2       H3      H4
 -0.269667 0.4902 0.4902 0.240247 0.04902
       nsnps           H0           H1           H2           H3           H4 
4.902000e+03 1.966987e-01 2.871731e-01 1.935820e-01 2.825828e-01 3.996337e-02 
[1] "saving coloc results for: Brain_Hippocampus"
[1] "run coloc for: Brain_Hypothalamus"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
    0.239     0.348     0.152     0.222     0.039 
[1] "PP abf for shared variant: 3.9%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2672034 0.4894 0.4894 0.2394634 0.04894
       nsnps           H0           H1           H2           H3           H4 
4.894000e+03 2.385011e-01 3.479759e-01 1.523183e-01 2.221951e-01 3.900965e-02 
[1] "saving coloc results for: Brain_Hypothalamus"
[1] "run coloc for: Brain_Nucleus_accumbens_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1950    0.2850    0.1920    0.2800    0.0485 
[1] "PP abf for shared variant: 4.85%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2702831 0.4904 0.4904 0.2404431 0.04904
       nsnps           H0           H1           H2           H3           H4 
4.904000e+03 1.951241e-01 2.849101e-01 1.916407e-01 2.797753e-01 4.854981e-02 
[1] "saving coloc results for: Brain_Nucleus_accumbens_basal_ganglia"
[1] "run coloc for: Brain_Putamen_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.2120    0.3100    0.1720    0.2510    0.0545 
[1] "PP abf for shared variant: 5.45%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2665877 0.4892 0.4892 0.2392677 0.04892
       nsnps           H0           H1           H2           H3           H4 
4.892000e+03 2.124685e-01 3.099918e-01 1.720590e-01 2.509798e-01 5.450096e-02 
[1] "saving coloc results for: Brain_Putamen_basal_ganglia"
[1] "run coloc for: Brain_Spinal_cord_cervical_c-1"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.2340    0.3410    0.1590    0.2320    0.0331 
[1] "PP abf for shared variant: 3.31%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2573618 0.4862 0.4862 0.2363418 0.04862
       nsnps           H0           H1           H2           H3           H4 
4.862000e+03 2.342308e-01 3.410173e-01 1.594840e-01 2.321601e-01 3.310783e-02 
[1] "saving coloc results for: Brain_Spinal_cord_cervical_c-1"
[1] "run coloc for: Brain_Substantia_nigra"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
    0.253     0.362     0.135     0.193     0.057 
[1] "PP abf for shared variant: 5.7%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.1139816 0.4388 0.4388 0.1925016 0.04388
       nsnps           H0           H1           H2           H3           H4 
4.388000e+03 2.532806e-01 3.617263e-01 1.351027e-01 1.928918e-01 5.699862e-02 
[1] "saving coloc results for: Brain_Substantia_nigra"
> 
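
The `Hypothesis Priors` rows in the output follow directly from the per-SNP priors (`p1`, `p2`, `p12`) and the number of SNPs in the region. A minimal sketch of that arithmetic, assuming the usual coloc relationship (H1 = n·p1, H2 = n·p2, H3 = n·(n-1)·p1·p2, H4 = n·p12, H0 = 1 minus the rest), which reproduces the printed values. `hypothesis_priors` is a hypothetical helper written for illustration, not a coloc function:

```r
# Hypothetical helper (not a coloc function): hypothesis-level priors implied by
# the per-SNP priors p1, p2, p12 and the number of SNPs (nsnps) in the region.
hypothesis_priors <- function(nsnps, p1 = 1e-4, p2 = 1e-4, p12 = 1e-5) {
    h1 <- nsnps * p1                      # causal for trait 1 only
    h2 <- nsnps * p2                      # causal for trait 2 only
    h3 <- nsnps * (nsnps - 1) * p1 * p2   # two distinct causal variants
    h4 <- nsnps * p12                     # one shared causal variant
    c(H0 = 1 - (h1 + h2 + h3 + h4), H1 = h1, H2 = h2, H3 = h3, H4 = h4)
}

# nsnps values copied from the printed output in this notebook
print(hypothesis_priors(2028))  # the 2,028-SNP Brain_Frontal_Cortex_BA9 run: H0 is about 0.533
print(hypothesis_priors(4859))  # rs12860838, Brain_Amygdala: H0 is about -0.256
```

With roughly 4,900 SNPs in the rs12860838 window, the implied priors for H1 to H4 sum to more than 1, which is why `H0` is printed as a negative number. This may be a sign that the default per-SNP priors are generous for a region of this size. Whether to lower `p1`, `p2`, `p12` or to narrow the window is a judgment call that was not part of the original analysis.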




#### females-only (conditioned on APOE4)
##### index_var1 (rs141773145)

In [160]:
%%bash
cd Analysis.COLOC_redo/females_only_cond_ApoE4/

module load R/4.3
R --vanilla --no-save

require(data.table)
require(tidyverse)
# install.packages("coloc")
require(coloc)
#devtools::install_github("mrcieu/ieugwasr", force=TRUE)
require(ieugwasr)
#devtools::install_github("explodecomputer/genetics.binaRies")
genetics.binaRies::get_plink_binary()

brain_tissues <- c("Brain_Amygdala","Brain_Anterior_cingulate_cortex_BA24","Brain_Caudate_basal_ganglia","Brain_Cerebellar_Hemisphere","Brain_Cerebellum","Brain_Cortex","Brain_Frontal_Cortex_BA9","Brain_Hippocampus","Brain_Hypothalamus","Brain_Nucleus_accumbens_basal_ganglia","Brain_Putamen_basal_ganglia","Brain_Spinal_cord_cervical_c-1","Brain_Substantia_nigra")

# Run coloc.abf() for one index variant (rsid) against tissue i of brain_tissues,
# then write the summary, per-SNP results, and priors to tab-delimited files.
run_coloc <- function(rsid, i){
    # read in files
    tissue <- brain_tissues[i]
    xwas <- fread(paste0("XWAS_summ_stats_", rsid, "_flank1Mb_no-maf-filter.txt")) %>% arrange(POS)
    eqtl <- fread(paste0("../GTEXv8_eQTL_brains/", tissue, ".v8.EUR.allpairs.chrX_harmonized_snpid_shared.txt")) %>% arrange(POS)
    # dim() is not auto-printed inside a function, so check for empty inputs explicitly
    stopifnot(nrow(eqtl) > 0, nrow(xwas) > 0)
    
    # prep so that only the columns needed for coloc are present;
    # where a SNP has several eQTL rows, keep the row (phenotype_id) with the smallest nominal p-value
    eqtl_subset <- subset(eqtl, eqtl$ID %in% xwas$ID) %>%
                   mutate(position = POS,
                          snp = ID) %>%
                   filter(!is.na(beta)) %>%
                   filter(rsID != "") %>%
                   select(rsID, snp, position, beta, varbeta, N, MAF, phenotype_id, pval_nominal) %>%
                   arrange(snp, pval_nominal) %>%
                   group_by(snp) %>%
                   slice(1) %>%
                   ungroup() %>%
                   arrange(position) %>%
                   data.frame()
  
    # restrict the XWAS to the same SNPs; varbeta is the squared standard error
    xwas_subset <- subset(xwas, xwas$ID %in% eqtl_subset$snp) %>%
                   filter(rsID != "") %>%
                   mutate(position = POS,
                          snp = ID, 
                          beta = BETA,
                          varbeta = BETA_SE^2) %>%
                   select(rsID, snp, position, beta, varbeta, P)
    
    # check that both subsets still contain SNPs
    stopifnot(nrow(eqtl_subset) > 0, nrow(xwas_subset) > 0)
        
    # convert to the list structure that coloc expects
    eqtl_subset_list <- as.list(eqtl_subset)
    eqtl_subset_list$type <- "quant"
    xwas_subset_list <- as.list(xwas_subset)
    xwas_subset_list$type <- "cc"
    xwas_subset_list$N <- 2591 + 4023    # sample size for the XWAS dataset
    #check_dataset(eqtl_subset_list)
    #check_dataset(xwas_subset_list)
        
    # run coloc.abf (dataset1 = XWAS, dataset2 = eQTL)
    print(paste0("run coloc for: ", tissue))
    my.res <- coloc.abf(dataset1=xwas_subset_list,
                    dataset2=eqtl_subset_list)
    print(my.res)
    output_name <- paste0("coloc_", rsid, "_", tissue)

    # tag each output table with the tissue and index variant
    summary <- my.res$summary %>% data.frame() %>% t() %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    results <- my.res$results %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    priors <- my.res$priors %>% data.frame() %>% t() %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    
    write.table(summary, paste0(output_name, "_summary.txt"), quote=FALSE, sep="\t", row.names=FALSE, col.names=TRUE)
    write.table(results, paste0(output_name, "_results.txt"), quote=FALSE, sep="\t", row.names=FALSE, col.names=TRUE)
    write.table(priors, paste0(output_name, "_priors.txt"), quote=FALSE, sep="\t", row.names=FALSE, col.names=TRUE)
    print(paste0("saving coloc results for: ", tissue))
}

for (i in seq_along(brain_tissues)){
    run_coloc("rs141773145", i)
}


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5367684 0.2013 0.2013 0.04050156 0.02013
       nsnps           H0           H1           H2           H3           H4 
2.013000e+03 6.278530e-01 1.057056e-01 2.153844e-01 3.624741e-02 1.480954e-02 
[1] "saving coloc results for: Brain_Amygdala"
[1] "run coloc for: Brain_Anterior_cingulate_cortex_BA24"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.6360    0.1080    0.2070    0.0349    0.0148 
[1] "PP abf for shared variant: 1.48%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5345154 0.2022 0.2022 0.04086462 0.02022
       nsnps           H0           H1           H2           H3           H4 
2.022000e+03 6.359438e-01 1.075629e-01 2.067032e-01 3.494672e-02 1.484332e-02 
[1] "saving coloc results for: Brain_Anterior_cingulate_cortex_BA24"
[1] "run coloc for: Brain_Caudate_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
  0.01310   0.00224   0.81400   0.13900   0.03170 
[1] "PP abf for shared variant: 3.17%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5320101 0.2032 0.2032 0.04126992 0.02032
       nsnps           H0           H1           H2           H3           H4 
2.032000e+03 1.314893e-02 2.237809e-03 8.143719e-01 1.385658e-01 3.167561e-02 
[1] "saving coloc results for: Brain_Caudate_basal_ganglia"
[1] "run coloc for: Brain_Cerebellar_Hemisphere"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.0593    0.0101    0.7700    0.1310    0.0302 
[1] "PP abf for shared variant: 3.02%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0    H1    H2        H3     H4
 0.5325113 0.203 0.203 0.0411887 0.0203
       nsnps           H0           H1           H2           H3           H4 
2.030000e+03 5.929509e-02 1.007064e-02 7.697602e-01 1.307054e-01 3.016871e-02 
[1] "saving coloc results for: Brain_Cerebellar_Hemisphere"
[1] "run coloc for: Brain_Cerebellum"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
 5.81e-05  9.89e-06  8.27e-01  1.41e-01  3.17e-02 
[1] "PP abf for shared variant: 3.17%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5317594 0.2033 0.2033 0.04131056 0.02033
       nsnps           H0           H1           H2           H3           H4 
2.033000e+03 5.807942e-05 9.888277e-06 8.274256e-01 1.408412e-01 3.166530e-02 
[1] "saving coloc results for: Brain_Cerebellum"
[1] "run coloc for: Brain_Cortex"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.4530    0.0771    0.3850    0.0654    0.0193 
[1] "PP abf for shared variant: 1.93%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5317594 0.2033 0.2033 0.04131056 0.02033
       nsnps           H0           H1           H2           H3           H4 
2.033000e+03 4.534575e-01 7.714125e-02 3.846618e-01 6.541856e-02 1.932089e-02 
[1] "saving coloc results for: Brain_Cortex"
[1] "run coloc for: Brain_Frontal_Cortex_BA9"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1040    0.0177    0.7270    0.1230    0.0282 
[1] "PP abf for shared variant: 2.82%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.5335135 0.2026 0.2026 0.0410265 0.02026
       nsnps           H0           H1           H2           H3           H4 
2.026000e+03 1.042159e-01 1.767424e-02 7.267259e-01 1.232191e-01 2.816486e-02 
[1] "saving coloc results for: Brain_Frontal_Cortex_BA9"
[1] "run coloc for: Brain_Hippocampus"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.4640    0.0789    0.3750    0.0637    0.0193 
[1] "PP abf for shared variant: 1.93%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5317594 0.2033 0.2033 0.04131056 0.02033
       nsnps           H0           H1           H2           H3           H4 
2.033000e+03 4.635490e-01 7.890686e-02 3.745293e-01 6.373434e-02 1.928045e-02 
[1] "saving coloc results for: Brain_Hippocampus"
[1] "run coloc for: Brain_Hypothalamus"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.3710    0.0628    0.4650    0.0787    0.0224 
[1] "PP abf for shared variant: 2.24%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5340145 0.2024 0.2024 0.04094552 0.02024
       nsnps           H0           H1           H2           H3           H4 
2.024000e+03 3.711261e-01 6.283345e-02 4.649554e-01 7.869681e-02 2.238822e-02 
[1] "saving coloc results for: Brain_Hypothalamus"
[1] "run coloc for: Brain_Nucleus_accumbens_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1810    0.0308    0.6510    0.1110    0.0262 
[1] "PP abf for shared variant: 2.62%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5317594 0.2033 0.2033 0.04131056 0.02033
       nsnps           H0           H1           H2           H3           H4 
2.033000e+03 1.808113e-01 3.078467e-02 6.513384e-01 1.108697e-01 2.619584e-02 
[1] "saving coloc results for: Brain_Nucleus_accumbens_basal_ganglia"
[1] "run coloc for: Brain_Putamen_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.0448    0.0076    0.7810    0.1320    0.0340 
[1] "PP abf for shared variant: 3.4%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.5335135 0.2026 0.2026 0.0410265 0.02026
       nsnps           H0           H1           H2           H3           H4 
2.026000e+03 4.482508e-02 7.604767e-03 7.811161e-01 1.324857e-01 3.396828e-02 
[1] "saving coloc results for: Brain_Putamen_basal_ganglia"
[1] "run coloc for: Brain_Spinal_cord_cervical_c-1"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1450    0.0244    0.6870    0.1160    0.0279 
[1] "PP abf for shared variant: 2.79%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.5357673 0.2017 0.2017 0.04066272 0.02017
       nsnps           H0           H1           H2           H3           H4 
2.017000e+03 1.449731e-01 2.442240e-02 6.869839e-01 1.157025e-01 2.791806e-02 
[1] "saving coloc results for: Brain_Spinal_cord_cervical_c-1"
[1] "run coloc for: Brain_Substantia_nigra"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.6450    0.1080    0.1980    0.0332    0.0157 
[1] "PP abf for shared variant: 1.57%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.5387698 0.2005 0.2005 0.0401802 0.02005
       nsnps           H0           H1           H2           H3           H4 
2.005000e+03 6.447120e-01 1.079086e-01 1.984477e-01 3.319942e-02 1.573231e-02 
[1] "saving coloc results for: Brain_Substantia_nigra"
> 
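
Each run prints five posteriors. In coloc these are H0 (no association with either trait), H1 (association with trait 1 only), H2 (association with trait 2 only), H3 (both traits, distinct causal variants) and H4 (both traits, one shared causal variant). Because `dataset1` is the XWAS and `dataset2` is the eQTL data, H1 is an XWAS-only signal and H2 an eQTL-only signal. A minimal sketch for naming the best-supported hypothesis from one summary row; `top_hypothesis` is a hypothetical helper, and the example values are copied from the Brain_Cerebellum output for rs141773145 above:

```r
# Hypothetical helper (not part of coloc): label of the hypothesis with the
# largest posterior probability in a coloc summary vector.
top_hypothesis <- function(pp) {
    labels <- c(PP.H0.abf = "H0: no association with either trait",
                PP.H1.abf = "H1: XWAS signal only",
                PP.H2.abf = "H2: eQTL signal only",
                PP.H3.abf = "H3: both traits, distinct causal variants",
                PP.H4.abf = "H4: both traits, shared causal variant")
    pp <- pp[names(labels)]              # ignore nsnps or other extra entries
    unname(labels[names(pp)[which.max(pp)]])
}

# Posteriors copied from the Brain_Cerebellum run for rs141773145
cerebellum <- c(nsnps = 2033,
                PP.H0.abf = 5.807942e-05, PP.H1.abf = 9.888277e-06,
                PP.H2.abf = 8.274256e-01, PP.H3.abf = 1.408412e-01,
                PP.H4.abf = 3.166530e-02)
print(top_hypothesis(cerebellum))
```

For this tissue the largest posterior is H2, meaning the region carries an eQTL signal without matching support in the XWAS, and the posterior for a shared variant (H4) stays near 3%.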




##### index_var2 (rs6648060)

In [161]:
%%bash
cd Analysis.COLOC_redo/females_only_cond_ApoE4/

module load R/4.3
R --vanilla --no-save

require(data.table)
require(tidyverse)
# install.packages("coloc")
require(coloc)
#devtools::install_github("mrcieu/ieugwasr", force=TRUE)
require(ieugwasr)
#devtools::install_github("explodecomputer/genetics.binaRies")
genetics.binaRies::get_plink_binary()

brain_tissues <- c("Brain_Amygdala","Brain_Anterior_cingulate_cortex_BA24","Brain_Caudate_basal_ganglia","Brain_Cerebellar_Hemisphere","Brain_Cerebellum","Brain_Cortex","Brain_Frontal_Cortex_BA9","Brain_Hippocampus","Brain_Hypothalamus","Brain_Nucleus_accumbens_basal_ganglia","Brain_Putamen_basal_ganglia","Brain_Spinal_cord_cervical_c-1","Brain_Substantia_nigra")

run_coloc <- function(rsid,i){
    # read in files
    tissue <- brain_tissues[i]
    xwas <- fread(paste("XWAS_summ_stats_",rsid,"_flank1Mb_no-maf-filter.txt",sep="")) %>% arrange(POS)
    eqtl <- fread(paste("../GTEXv8_eQTL_brains/",tissue,".v8.EUR.allpairs.chrX_harmonized_snpid_shared.txt",sep="")) %>% arrange(POS)
    dim(eqtl)
    dim(xwas)    
    
    # keep only the columns that coloc needs
    eqtl_subset <- subset(eqtl, eqtl$ID %in% xwas$ID) %>%
                   mutate(position = POS,
                          snp = ID) %>%
                   filter(!is.na(beta)) %>%
                   filter(rsID != "") %>%
                   select(rsID, snp, position, beta, varbeta, N, MAF, phenotype_id, pval_nominal) %>%
                   arrange(snp,pval_nominal) %>%
                   group_by(snp) %>%
                   slice(1:1) %>%
                   arrange(position) %>%
                   data.frame()
  
    xwas_subset <- subset(xwas, xwas$ID %in% eqtl_subset$snp) %>%
                   filter(rsID != "") %>%
                   mutate(position = POS,
                          snp = ID, 
                          beta = BETA,
                          varbeta = (BETA_SE**2)) %>%
                   select(rsID, snp, position, beta, varbeta,P)
    
    # check dimensions
    dim(eqtl_subset)
    dim(xwas_subset)
        
    # convert to the list structure that coloc expects
    eqtl_subset_list <- as.list(eqtl_subset)
    eqtl_subset_list$type <- "quant"   # eQTL: quantitative trait
    xwas_subset_list <- as.list(xwas_subset)
    xwas_subset_list$type <- "cc"      # XWAS: case-control trait
    xwas_subset_list$N <- 2591 + 4023  # total XWAS sample size passed to coloc as N
    #check_dataset(eqtl_subset_list)
    #check_dataset(xwas_subset_list)
        
    # run coloc.abf
    print(paste("run coloc for: ",tissue,sep=""))
    my.res <- coloc.abf(dataset1=xwas_subset_list,
                    dataset2=eqtl_subset_list)
    print(my.res)
    output_name <- paste("coloc_",rsid,"_",tissue, sep="")

    summary <- my.res$summary %>% data.frame() %>% t() %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    results <- my.res$results %>% data.frame() %>% mutate(Tissue = tissue,index_variant = rsid)
    priors <- my.res$priors %>% data.frame() %>% t() %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    
    write.table(summary, paste(output_name,"_summary.txt",sep=""), quote=F, sep="\t", row.names=F, col.names=T)
    write.table(results, paste(output_name,"_results.txt",sep=""), quote=F, sep="\t", row.names=F, col.names=T)
    write.table(priors, paste(output_name,"_priors.txt",sep=""), quote=F, sep="\t", row.names=F, col.names=T)
    print(paste("saving coloc results for: ",tissue,sep=""))
}

for (i in 1:length(brain_tissues)){
    run_coloc("rs6648060", i)
}

[-] Unloading gcc  11.3.0  ... 
[-] Unloading HDF5  1.12.2 
[-] Unloading netcdf  4.9.0 
[-] Unloading openmpi/4.1.3/gcc-11.3.0  ... 
[-] Unloading pandoc  2.18  on cn3180 
[-] Unloading R 4.3.0 
[+] Loading gcc  11.3.0  ... 
[+] Loading HDF5  1.12.2 
[+] Loading netcdf  4.9.0 
[-] Unloading gcc  11.3.0  ... 
[+] Loading gcc  11.3.0  ... 
[+] Loading openmpi/4.1.3/gcc-11.3.0  ... 
[+] Loading pandoc  2.18  on cn3180 
[+] Loading R 4.3.0 



R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> 
> require(data.table)


Loading required package: data.table


> require(tidyverse)


Loading required package: tidyverse
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::between()     masks data.table::between()
✖ dplyr::filter()      masks stats::filter()
✖ dplyr::first()       masks data.table::first()
✖ lubridate::hour()    masks data.table::hour()
✖ lubridate::isoweek() masks data.table::isoweek()
✖ dplyr::lag()         masks stats::lag()
✖ dplyr::last()        masks data.table::last()
✖ lubridate::mday()    masks data.table::mday()
✖ lubridate::minute()  masks data.table::minute()
✖ lubridate::month()   masks data.table::month()
✖ lubridate::quarter() masks data.table::quarter()
✖ lubridate::second()  masks data.table::second()
✖ purrr::transpose()   masks data.tab

> # install.packages("coloc")
> require(coloc)


Loading required package: coloc
This is coloc version 5.2.3


> #devtools::install_github("mrcieu/ieugwasr", force=TRUE)
> require(ieugwasr)


Loading required package: ieugwasr
API: public: http://gwas-api.mrcieu.ac.uk/


> #devtools::install_github("explodecomputer/genetics.binaRies")
> genetics.binaRies::get_plink_binary()
[1] "/gpfs/gsfs9/users/chiarp/R/rhel8/4.3/genetics.binaRies/bin/plink"
> 
> brain_tissues <- c("Brain_Amygdala","Brain_Anterior_cingulate_cortex_BA24","Brain_Caudate_basal_ganglia","Brain_Cerebellar_Hemisphere","Brain_Cerebellum","Brain_Cortex","Brain_Frontal_Cortex_BA9","Brain_Hippocampus","Brain_Hypothalamus","Brain_Nucleus_accumbens_basal_ganglia","Brain_Putamen_basal_ganglia","Brain_Spinal_cord_cervical_c-1","Brain_Substantia_nigra")
> 
> run_coloc <- function(rsid,i){
+     # read in files
+     tissue <- brain_tissues[i]
+     xwas <- fread(paste("XWAS_summ_stats_",rsid,"_flank1Mb_no-maf-filter.txt",sep="")) %>% arrange(POS)
+     eqtl <- fread(paste("../GTEXv8_eQTL_brains/",tissue,".v8.EUR.allpairs.chrX_harmonized_snpid_shared.txt",sep="")) %>% arrange(POS)
+     dim(eqtl)
+     dim(xwas)    
+     
+     # prep so that only columns that are needed for coloc is present
+     

Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
          H0     H1     H2        H3      H4
 -0.07396756 0.4253 0.4253 0.1808376 0.04253
       nsnps           H0           H1           H2           H3           H4 
4253.0000000    0.1895181    0.3011238    0.1807952    0.2872227    0.0413402 
[1] "saving coloc results for: Brain_Amygdala"
[1] "run coloc for: Brain_Anterior_cingulate_cortex_BA24"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
    0.106     0.167     0.269     0.425     0.033 
[1] "PP abf for shared variant: 3.3%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2       H3      H4
 -0.065714 0.4225 0.4225 0.178464 0.04225
       nsnps           H0           H1           H2           H3           H4 
4.225000e+03 1.055683e-01 1.670299e-01 2.689456e-01 4.254920e-01 3.296408e-02 
[1] "saving coloc results for: Brain_Anterior_cingulate_cortex_BA24"
[1] "run coloc for: Brain_Caudate_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1130    0.1800    0.2580    0.4100    0.0393 
[1] "PP abf for shared variant: 3.93%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
          H0     H1     H2        H3      H4
 -0.08135006 0.4278 0.4278 0.1829701 0.04278
       nsnps           H0           H1           H2           H3           H4 
4278.0000000    0.1131746    0.1801491    0.2575021    0.4098477    0.0393265 
[1] "saving coloc results for: Brain_Caudate_basal_ganglia"
[1] "run coloc for: Brain_Cerebellar_Hemisphere"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.0251    0.0400    0.3500    0.5570    0.0276 
[1] "PP abf for shared variant: 2.76%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
          H0     H1     H2        H3      H4
 -0.08135006 0.4278 0.4278 0.1829701 0.04278
       nsnps           H0           H1           H2           H3           H4 
4.278000e+03 2.513740e-02 4.001470e-02 3.500654e-01 5.572202e-01 2.756229e-02 
[1] "saving coloc results for: Brain_Cerebellar_Hemisphere"
[1] "run coloc for: Brain_Cerebellum"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
  0.00127   0.00203   0.36800   0.58600   0.04300 
[1] "PP abf for shared variant: 4.3%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
          H0     H1     H2        H3      H4
 -0.07987256 0.4273 0.4273 0.1825426 0.04273
       nsnps           H0           H1           H2           H3           H4 
4.273000e+03 1.273431e-03 2.026589e-03 3.680514e-01 5.856890e-01 4.295959e-02 
[1] "saving coloc results for: Brain_Cerebellum"
[1] "run coloc for: Brain_Cortex"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.0901    0.1430    0.2780    0.4430    0.0456 
[1] "PP abf for shared variant: 4.56%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.0822368 0.4281 0.4281 0.1832268 0.04281
       nsnps           H0           H1           H2           H3           H4 
4.281000e+03 9.006886e-02 1.433961e-01 2.781524e-01 4.427930e-01 4.558967e-02 
[1] "saving coloc results for: Brain_Cortex"
[1] "run coloc for: Brain_Frontal_Cortex_BA9"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.0182    0.0289    0.3570    0.5680    0.0283 
[1] "PP abf for shared variant: 2.83%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
          H0     H1     H2        H3      H4
 -0.07514792 0.4257 0.4257 0.1811779 0.04257
       nsnps           H0           H1           H2           H3           H4 
4.257000e+03 1.820608e-02 2.894195e-02 3.570127e-01 5.675097e-01 2.832954e-02 
[1] "saving coloc results for: Brain_Frontal_Cortex_BA9"
[1] "run coloc for: Brain_Hippocampus"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.2350    0.3730    0.1350    0.2140    0.0434 
[1] "PP abf for shared variant: 4.34%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.0763286 0.4261 0.4261 0.1815186 0.04261
       nsnps           H0           H1           H2           H3           H4 
4.261000e+03 2.346226e-01 3.729331e-01 1.347963e-01 2.142155e-01 4.343244e-02 
[1] "saving coloc results for: Brain_Hippocampus"
[1] "run coloc for: Brain_Hypothalamus"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1520    0.2410    0.2150    0.3420    0.0499 
[1] "PP abf for shared variant: 4.99%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.0704284 0.4241 0.4241 0.1798184 0.04241
       nsnps           H0           H1           H2           H3           H4 
4.241000e+03 1.519727e-01 2.413017e-01 2.151960e-01 3.416375e-01 4.989211e-02 
[1] "saving coloc results for: Brain_Hypothalamus"
[1] "run coloc for: Brain_Nucleus_accumbens_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.0569    0.0904    0.3160    0.5020    0.0350 
[1] "PP abf for shared variant: 3.5%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
          H0     H1     H2        H3      H4
 -0.07721432 0.4264 0.4264 0.1817743 0.04264
       nsnps           H0           H1           H2           H3           H4 
4.264000e+03 5.685320e-02 9.043541e-02 3.156610e-01 5.020816e-01 3.496874e-02 
[1] "saving coloc results for: Brain_Nucleus_accumbens_basal_ganglia"
[1] "run coloc for: Brain_Putamen_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.0632    0.1000    0.3120    0.4950    0.0298 
[1] "PP abf for shared variant: 2.98%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0    H1    H2        H3     H4
 -0.0730825 0.425 0.425 0.1805825 0.0425
       nsnps           H0           H1           H2           H3           H4 
4.250000e+03 6.321164e-02 1.004213e-01 3.115893e-01 4.949770e-01 2.980087e-02 
[1] "saving coloc results for: Brain_Putamen_basal_ganglia"
[1] "run coloc for: Brain_Spinal_cord_cervical_c-1"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.0147    0.0234    0.3620    0.5750    0.0249 
[1] "PP abf for shared variant: 2.49%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.0745577 0.4255 0.4255 0.1810077 0.04255
       nsnps           H0           H1           H2           H3           H4 
4.255000e+03 1.474501e-02 2.342721e-02 3.619185e-01 5.749994e-01 2.490988e-02 
[1] "saving coloc results for: Brain_Spinal_cord_cervical_c-1"
[1] "run coloc for: Brain_Substantia_nigra"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1660    0.2640    0.2020    0.3210    0.0482 
[1] "PP abf for shared variant: 4.82%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
          H0     H1     H2        H3      H4
 -0.07278752 0.4249 0.4249 0.1804975 0.04249
       nsnps           H0           H1           H2           H3           H4 
4.249000e+03 1.659074e-01 2.635033e-01 2.018544e-01 3.205481e-01 4.818679e-02 
[1] "saving coloc results for: Brain_Substantia_nigra"
> 
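Note on the negative `H0` under "Hypothesis Priors" for rs6648060: the printed hypothesis priors are consistent with simple arithmetic on `nsnps` and the per-SNP priors `p1`, `p2` and `p12`. The sketch below reproduces the printed numbers. It is inferred from the output above rather than taken from the coloc source, and the function name is hypothetical.

```python
def hypothesis_priors(nsnps, p1=1e-4, p2=1e-4, p12=1e-5):
    """Per-hypothesis priors implied by per-SNP priors over a region.

    H1/H2: exactly one trait has a causal variant in the region.
    H3:    both traits have a causal variant, at two different SNPs.
    H4:    both traits share a single causal variant.
    H0:    whatever prior mass is left over.
    """
    h1 = nsnps * p1
    h2 = nsnps * p2
    h3 = nsnps * (nsnps - 1) * p1 * p2
    h4 = nsnps * p12
    h0 = 1.0 - (h1 + h2 + h3 + h4)
    return {"H0": h0, "H1": h1, "H2": h2, "H3": h3, "H4": h4}


if __name__ == "__main__":
    # 4253 SNPs is the Brain_Amygdala run for rs6648060 shown above.
    for name, value in hypothesis_priors(4253).items():
        print(f"{name}: {value:.8f}")
```

With 4253 SNPs, H1 to H4 already add up to more than 1, which is why `H0` comes out at about -0.074. With 2026 SNPs (the Brain_Putamen_basal_ganglia run at the top of this section) the same arithmetic gives `H0` of about 0.534. A negative `H0` prior suggests the default per-SNP priors are large for a window with this many variants, so it may be worth re-checking this locus with smaller priors or a narrower window before interpreting the posteriors.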




##### index_var3 (rs141193614)

In [162]:
%%bash
cd Analysis.COLOC_redo/females_only_cond_ApoE4/

module load R/4.3
R --vanilla --no-save

require(data.table)
require(tidyverse)
# install.packages("coloc")
require(coloc)
#devtools::install_github("mrcieu/ieugwasr", force=TRUE)
require(ieugwasr)
#devtools::install_github("explodecomputer/genetics.binaRies")
genetics.binaRies::get_plink_binary()

brain_tissues <- c("Brain_Amygdala","Brain_Anterior_cingulate_cortex_BA24","Brain_Caudate_basal_ganglia","Brain_Cerebellar_Hemisphere","Brain_Cerebellum","Brain_Cortex","Brain_Frontal_Cortex_BA9","Brain_Hippocampus","Brain_Hypothalamus","Brain_Nucleus_accumbens_basal_ganglia","Brain_Putamen_basal_ganglia","Brain_Spinal_cord_cervical_c-1","Brain_Substantia_nigra")

run_coloc <- function(rsid,i){
    # read in files
    tissue <- brain_tissues[i]
    xwas <- fread(paste("XWAS_summ_stats_",rsid,"_flank1Mb_no-maf-filter.txt",sep="")) %>% arrange(POS)
    eqtl <- fread(paste("../GTEXv8_eQTL_brains/",tissue,".v8.EUR.allpairs.chrX_harmonized_snpid_shared.txt",sep="")) %>% arrange(POS)
    dim(eqtl)
    dim(xwas)    
    
    # keep only the columns that coloc needs
    eqtl_subset <- subset(eqtl, eqtl$ID %in% xwas$ID) %>%
                   mutate(position = POS,
                          snp = ID) %>%
                   filter(!is.na(beta)) %>%
                   filter(rsID != "") %>%
                   select(rsID, snp, position, beta, varbeta, N, MAF, phenotype_id, pval_nominal) %>%
                   arrange(snp,pval_nominal) %>%
                   group_by(snp) %>%
                   slice(1:1) %>%
                   arrange(position) %>%
                   data.frame()
  
    xwas_subset <- subset(xwas, xwas$ID %in% eqtl_subset$snp) %>%
                   filter(rsID != "") %>%
                   mutate(position = POS,
                          snp = ID, 
                          beta = BETA,
                          varbeta = (BETA_SE**2)) %>%
                   select(rsID, snp, position, beta, varbeta,P)
    
    # check dimensions
    dim(eqtl_subset)
    dim(xwas_subset)
        
    # convert to the list structure that coloc expects
    eqtl_subset_list <- as.list(eqtl_subset)
    eqtl_subset_list$type <- "quant"   # eQTL: quantitative trait
    xwas_subset_list <- as.list(xwas_subset)
    xwas_subset_list$type <- "cc"      # XWAS: case-control trait
    xwas_subset_list$N <- 2591 + 4023  # total XWAS sample size passed to coloc as N
    #check_dataset(eqtl_subset_list)
    #check_dataset(xwas_subset_list)
        
    # run coloc.abf
    print(paste("run coloc for: ",tissue,sep=""))
    my.res <- coloc.abf(dataset1=xwas_subset_list,
                    dataset2=eqtl_subset_list)
    print(my.res)
    output_name <- paste("coloc_",rsid,"_",tissue, sep="")

    summary <- my.res$summary %>% data.frame() %>% t() %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    results <- my.res$results %>% data.frame() %>% mutate(Tissue = tissue,index_variant = rsid)
    priors <- my.res$priors %>% data.frame() %>% t() %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    
    write.table(summary, paste(output_name,"_summary.txt",sep=""), quote=F, sep="\t", row.names=F, col.names=T)
    write.table(results, paste(output_name,"_results.txt",sep=""), quote=F, sep="\t", row.names=F, col.names=T)
    write.table(priors, paste(output_name,"_priors.txt",sep=""), quote=F, sep="\t", row.names=F, col.names=T)
    print(paste("saving coloc results for: ",tissue,sep=""))
}

for (i in 1:length(brain_tissues)){
    run_coloc("rs141193614", i)
}

Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
       H0     H1     H2         H3      H4
 0.869962 0.0602 0.0602 0.00361802 0.00602
       nsnps           H0           H1           H2           H3           H4 
6.020000e+02 8.974496e-01 4.408782e-02 5.173372e-02 2.537262e-03 4.191582e-03 
[1] "saving coloc results for: Brain_Amygdala"
[1] "run coloc for: Brain_Anterior_cingulate_cortex_BA24"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
 0.953000  0.022700  0.021400  0.000509  0.002130 
[1] "PP abf for shared variant: 0.213%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.9520315 0.0226 0.0226 0.0005085 0.00226
       nsnps           H0           H1           H2           H3           H4 
2.260000e+02 9.532109e-01 2.272881e-02 2.142625e-02 5.087724e-04 2.125248e-03 
[1] "saving coloc results for: Brain_Anterior_cingulate_cortex_BA24"
[1] "run coloc for: Brain_Caudate_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
  0.89200   0.04420   0.05620   0.00278   0.00448 
[1] "PP abf for shared variant: 0.448%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.8688516 0.0607 0.0607 0.00367842 0.00607
       nsnps           H0           H1           H2           H3           H4 
6.070000e+02 8.923663e-01 4.420335e-02 5.617161e-02 2.777979e-03 4.480789e-03 
[1] "saving coloc results for: Brain_Caudate_basal_ganglia"
[1] "run coloc for: Brain_Cerebellar_Hemisphere"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
 0.954000  0.022800  0.020400  0.000484  0.002120 
[1] "PP abf for shared variant: 0.212%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.9520315 0.0226 0.0226 0.0005085 0.00226
       nsnps           H0           H1           H2           H3           H4 
2.260000e+02 9.542574e-01 2.275376e-02 2.038152e-02 4.838630e-04 2.123422e-03 
[1] "saving coloc results for: Brain_Cerebellar_Hemisphere"
[1] "run coloc for: Brain_Cerebellum"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
 0.953000  0.022800  0.021000  0.000501  0.002250 
[1] "PP abf for shared variant: 0.225%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
       H0     H1     H2         H3      H4
 0.951817 0.0227 0.0227 0.00051302 0.00227
       nsnps           H0           H1           H2           H3           H4 
2.270000e+02 9.533992e-01 2.283337e-02 2.101768e-02 5.011130e-04 2.248664e-03 
[1] "saving coloc results for: Brain_Cerebellum"
[1] "run coloc for: Brain_Cortex"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
 0.953000  0.022800  0.021400  0.000509  0.002460 
[1] "PP abf for shared variant: 0.246%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
       H0     H1     H2         H3      H4
 0.951817 0.0227 0.0227 0.00051302 0.00227
       nsnps           H0           H1           H2           H3           H4 
2.270000e+02 9.528488e-01 2.282019e-02 2.136247e-02 5.091597e-04 2.459371e-03 
[1] "saving coloc results for: Brain_Cortex"
[1] "run coloc for: Brain_Frontal_Cortex_BA9"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
 0.954000  0.022800  0.020400  0.000487  0.002140 
[1] "PP abf for shared variant: 0.214%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
       H0     H1     H2         H3      H4
 0.951817 0.0227 0.0227 0.00051302 0.00227
       nsnps           H0           H1           H2           H3           H4 
2.270000e+02 9.540919e-01 2.284996e-02 2.042826e-02 4.871025e-04 2.142781e-03 
[1] "saving coloc results for: Brain_Frontal_Cortex_BA9"
[1] "run coloc for: Brain_Hippocampus"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
  0.88200   0.04360   0.06610   0.00326   0.00498 
[1] "PP abf for shared variant: 0.498%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.8690737 0.0606 0.0606 0.0036663 0.00606
       nsnps           H0           H1           H2           H3           H4 
6.060000e+02 8.820779e-01 4.362620e-02 6.605071e-02 3.261782e-03 4.983382e-03 
[1] "saving coloc results for: Brain_Hippocampus"
[1] "run coloc for: Brain_Hypothalamus"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
  0.89000   0.04400   0.05810   0.00287   0.00452 
[1] "PP abf for shared variant: 0.452%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.8690737 0.0606 0.0606 0.0036663 0.00606
       nsnps           H0           H1           H2           H3           H4 
6.060000e+02 8.904821e-01 4.404311e-02 5.808672e-02 2.868441e-03 4.519617e-03 
[1] "saving coloc results for: Brain_Hypothalamus"
[1] "run coloc for: Brain_Nucleus_accumbens_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
  0.88700   0.04400   0.06140   0.00304   0.00456 
[1] "PP abf for shared variant: 0.456%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.8686294 0.0608 0.0608 0.00369056 0.00608
       nsnps           H0           H1           H2           H3           H4 
6.080000e+02 8.869495e-01 4.402813e-02 6.141559e-02 3.044104e-03 4.562627e-03 
[1] "saving coloc results for: Brain_Nucleus_accumbens_basal_ganglia"
[1] "run coloc for: Brain_Putamen_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
  0.89400   0.04410   0.05490   0.00271   0.00451 
[1] "PP abf for shared variant: 0.451%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2        H3      H4
 0.8690737 0.0606 0.0606 0.0036663 0.00606
       nsnps           H0           H1           H2           H3           H4 
6.060000e+02 8.937788e-01 4.413670e-02 5.487266e-02 2.705222e-03 4.506571e-03 
[1] "saving coloc results for: Brain_Putamen_basal_ganglia"
[1] "run coloc for: Brain_Spinal_cord_cervical_c-1"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
  0.80400   0.07470   0.10400   0.00970   0.00726 
[1] "PP abf for shared variant: 0.726%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2         H3      H4
 0.6914074 0.1379 0.1379 0.01900262 0.01379
       nsnps           H0           H1           H2           H3           H4 
1.379000e+03 8.039040e-01 7.466283e-02 1.044755e-01 9.695929e-03 7.261761e-03 
[1] "saving coloc results for: Brain_Spinal_cord_cervical_c-1"
[1] "run coloc for: Brain_Substantia_nigra"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
 0.956000  0.022200  0.019700  0.000455  0.002100 
[1] "PP abf for shared variant: 0.21%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0    H1    H2        H3     H4
 0.9533182 0.022 0.022 0.0004818 0.0022
       nsnps           H0           H1           H2           H3           H4 
2.200000e+02 9.555769e-01 2.217815e-02 1.969396e-02 4.549845e-04 2.095985e-03 
[1] "saving coloc results for: Brain_Substantia_nigra"
> 
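Note on comparing tissues: each run writes a `coloc_<rsid>_<tissue>_summary.txt` file, so the per-tissue posteriors can be collected into one ranked list instead of being read off the log. The sketch below uses only the Python standard library. It assumes the files are tab-separated with the header `nsnps`, `PP.H0.abf` ... `PP.H4.abf`, `Tissue`, `index_variant`, which is what the `write.table` call in `run_coloc` should produce. The function name is hypothetical.

```python
import csv
import glob
import os


def load_coloc_summaries(directory, rsid):
    """Read all coloc summary files for one index variant.

    Returns one dict per tissue, sorted by PP.H4.abf (shared causal
    variant) from highest to lowest.
    """
    pattern = os.path.join(directory, f"coloc_{rsid}_*_summary.txt")
    rows = []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as handle:
            for record in csv.DictReader(handle, delimiter="\t"):
                rows.append({
                    "tissue": record["Tissue"],
                    "nsnps": int(float(record["nsnps"])),
                    "pp_h3": float(record["PP.H3.abf"]),
                    "pp_h4": float(record["PP.H4.abf"]),
                })
    return sorted(rows, key=lambda r: r["pp_h4"], reverse=True)
```

For example, `load_coloc_summaries("Analysis.COLOC_redo/females_only_cond_ApoE4", "rs141193614")` would list the 13 tissues for that index variant with the largest `PP.H4.abf` first, provided the summary files are in that directory.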




##### index_var4 (rs12860838)

In [163]:
%%bash
cd Analysis.COLOC_redo/females_only_cond_ApoE4/

module load R/4.3
R --vanilla --no-save

require(data.table)
require(tidyverse)
# install.packages("coloc")
require(coloc)
#devtools::install_github("mrcieu/ieugwasr", force=TRUE)
require(ieugwasr)
#devtools::install_github("explodecomputer/genetics.binaRies")
genetics.binaRies::get_plink_binary()

brain_tissues <- c("Brain_Amygdala","Brain_Anterior_cingulate_cortex_BA24","Brain_Caudate_basal_ganglia","Brain_Cerebellar_Hemisphere","Brain_Cerebellum","Brain_Cortex","Brain_Frontal_Cortex_BA9","Brain_Hippocampus","Brain_Hypothalamus","Brain_Nucleus_accumbens_basal_ganglia","Brain_Putamen_basal_ganglia","Brain_Spinal_cord_cervical_c-1","Brain_Substantia_nigra")

run_coloc <- function(rsid,i){
    # read in files
    tissue <- brain_tissues[i]
    xwas <- fread(paste("XWAS_summ_stats_",rsid,"_flank1Mb_no-maf-filter.txt",sep="")) %>% arrange(POS)
    eqtl <- fread(paste("../GTEXv8_eQTL_brains/",tissue,".v8.EUR.allpairs.chrX_harmonized_snpid_shared.txt",sep="")) %>% arrange(POS)
    dim(eqtl)
    dim(xwas)    
    
    # keep only the columns that coloc needs
    eqtl_subset <- subset(eqtl, eqtl$ID %in% xwas$ID) %>%
                   mutate(position = POS,
                          snp = ID) %>%
                   filter(!is.na(beta)) %>%
                   filter(rsID != "") %>%
                   select(rsID, snp, position, beta, varbeta, N, MAF, phenotype_id, pval_nominal) %>%
                   arrange(snp,pval_nominal) %>%
                   group_by(snp) %>%
                   slice(1:1) %>%
                   arrange(position) %>%
                   data.frame()
  
    xwas_subset <- subset(xwas, xwas$ID %in% eqtl_subset$snp) %>%
                   filter(rsID != "") %>%
                   mutate(position = POS,
                          snp = ID, 
                          beta = BETA,
                          varbeta = (BETA_SE**2)) %>%
                   select(rsID, snp, position, beta, varbeta,P)
    
    # check dimensions
    dim(eqtl_subset)
    dim(xwas_subset)
        
    # convert to the list structure that coloc expects
    eqtl_subset_list <- as.list(eqtl_subset)
    eqtl_subset_list$type <- "quant"   # eQTL: quantitative trait
    xwas_subset_list <- as.list(xwas_subset)
    xwas_subset_list$type <- "cc"      # XWAS: case-control trait
    xwas_subset_list$N <- 2591 + 4023  # total XWAS sample size passed to coloc as N
    #check_dataset(eqtl_subset_list)
    #check_dataset(xwas_subset_list)
        
    # run coloc.abf
    print(paste("run coloc for: ",tissue,sep=""))
    my.res <- coloc.abf(dataset1=xwas_subset_list,
                    dataset2=eqtl_subset_list)
    print(my.res)
    output_name <- paste("coloc_",rsid,"_",tissue, sep="")

    summary <- my.res$summary %>% data.frame() %>% t() %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    results <- my.res$results %>% data.frame() %>% mutate(Tissue = tissue,index_variant = rsid)
    priors <- my.res$priors %>% data.frame() %>% t() %>% data.frame() %>% mutate(Tissue = tissue, index_variant = rsid)
    
    write.table(summary, paste(output_name,"_summary.txt",sep=""), quote=F, sep="\t", row.names=F, col.names=T)
    write.table(results, paste(output_name,"_results.txt",sep=""), quote=F, sep="\t", row.names=F, col.names=T)
    write.table(priors, paste(output_name,"_priors.txt",sep=""), quote=F, sep="\t", row.names=F, col.names=T)
    print(paste("saving coloc results for: ",tissue,sep=""))
}

for (i in seq_along(brain_tissues)){
    run_coloc("rs12860838", i)
}
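
For orientation, the calculation behind the `coloc.abf` call above can be sketched in a few lines. This is a simplified illustration of the approximate Bayes factor approach, not the coloc package's code: each SNP gets a Wakefield log ABF from `beta` and `varbeta`, and the per-SNP values are combined into posteriors for H0 (no association), H1 (trait 1 only), H2 (trait 2 only), H3 (both traits, different causal variants) and H4 (both traits, shared causal variant). The function names, the toy effect sizes, and the prior standard deviations (0.2 for the case-control trait, 0.15 for the quantitative trait) are assumptions made for this example.

```python
import math

def logsum(xs):
    """Stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def logdiff(a, b):
    """Stable log(exp(a) - exp(b)); returns -inf when the difference vanishes."""
    if b >= a:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))

def wakefield_labf(beta, varbeta, sd_prior):
    """Log approximate Bayes factor for a single SNP."""
    z2 = beta * beta / varbeta
    r = sd_prior ** 2 / (sd_prior ** 2 + varbeta)  # shrinkage factor
    return 0.5 * (math.log(1.0 - r) + r * z2)

def combine_abf(l1, l2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 from per-SNP log ABFs of two traits."""
    s1, s2 = logsum(l1), logsum(l2)
    s12 = logsum([a + b for a, b in zip(l1, l2)])
    log_h = [
        0.0,                                                  # H0
        math.log(p1) + s1,                                    # H1
        math.log(p2) + s2,                                    # H2
        math.log(p1) + math.log(p2) + logdiff(s1 + s2, s12),  # H3
        math.log(p12) + s12,                                  # H4
    ]
    denom = logsum(log_h)
    return {f"PP.H{i}": math.exp(v - denom) for i, v in enumerate(log_h)}

# Toy data (hypothetical): three SNPs, same variance at every SNP.
gwas_beta = [0.01, 0.50, 0.02]          # trait 1 peaks at the second SNP
eqtl_beta_shared = [0.00, 0.40, 0.01]   # trait 2 peaks at the same SNP
eqtl_beta_distinct = [0.00, 0.01, 0.40] # trait 2 peaks at a different SNP

l1 = [wakefield_labf(b, 0.01, 0.2) for b in gwas_beta]
l2_shared = [wakefield_labf(b, 0.005, 0.15) for b in eqtl_beta_shared]
l2_distinct = [wakefield_labf(b, 0.005, 0.15) for b in eqtl_beta_distinct]

shared = combine_abf(l1, l2_shared)
distinct = combine_abf(l1, l2_distinct)
print({k: round(v, 4) for k, v in shared.items()})
print({k: round(v, 4) for k, v in distinct.items()})
```

In the toy data, placing both signals on the same SNP moves the posterior mass to H4, while placing them on different SNPs favours H3 over H4. This is the pattern to look for in the per-tissue output.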

[-] Unloading gcc  11.3.0  ... 
[-] Unloading HDF5  1.12.2 
[-] Unloading netcdf  4.9.0 
[-] Unloading openmpi/4.1.3/gcc-11.3.0  ... 
[-] Unloading pandoc  2.18  on cn3180 
[-] Unloading R 4.3.0 
[+] Loading gcc  11.3.0  ... 
[+] Loading HDF5  1.12.2 
[+] Loading netcdf  4.9.0 
[-] Unloading gcc  11.3.0  ... 
[+] Loading gcc  11.3.0  ... 
[+] Loading openmpi/4.1.3/gcc-11.3.0  ... 
[+] Loading pandoc  2.18  on cn3180 
[+] Loading R 4.3.0 



R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)


> 
> require(data.table)


Loading required package: data.table


> require(tidyverse)


Loading required package: tidyverse
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::between()     masks data.table::between()
✖ dplyr::filter()      masks stats::filter()
✖ dplyr::first()       masks data.table::first()
✖ lubridate::hour()    masks data.table::hour()
✖ lubridate::isoweek() masks data.table::isoweek()
✖ dplyr::lag()         masks stats::lag()
✖ dplyr::last()        masks data.table::last()
✖ lubridate::mday()    masks data.table::mday()
✖ lubridate::minute()  masks data.table::minute()
✖ lubridate::month()   masks data.table::month()
✖ lubridate::quarter() masks data.table::quarter()
✖ lubridate::second()  masks data.table::second()
✖ purrr::transpose()   masks data.tab

> # install.packages("coloc")
> require(coloc)


Loading required package: coloc
This is coloc version 5.2.3


> #devtools::install_github("mrcieu/ieugwasr", force=TRUE)
> require(ieugwasr)


Loading required package: ieugwasr
API: public: http://gwas-api.mrcieu.ac.uk/


> #devtools::install_github("explodecomputer/genetics.binaRies")
> genetics.binaRies::get_plink_binary()
[1] "/gpfs/gsfs9/users/chiarp/R/rhel8/4.3/genetics.binaRies/bin/plink"
> 
> brain_tissues <- c("Brain_Amygdala","Brain_Anterior_cingulate_cortex_BA24","Brain_Caudate_basal_ganglia","Brain_Cerebellar_Hemisphere","Brain_Cerebellum","Brain_Cortex","Brain_Frontal_Cortex_BA9","Brain_Hippocampus","Brain_Hypothalamus","Brain_Nucleus_accumbens_basal_ganglia","Brain_Putamen_basal_ganglia","Brain_Spinal_cord_cervical_c-1","Brain_Substantia_nigra")
> 
> run_coloc <- function(rsid,i){
+     # read in files
+     tissue <- brain_tissues[i]
+     xwas <- fread(paste("XWAS_summ_stats_",rsid,"_flank1Mb_no-maf-filter.txt",sep="")) %>% arrange(POS)
+     eqtl <- fread(paste("../GTEXv8_eQTL_brains/",tissue,".v8.EUR.allpairs.chrX_harmonized_snpid_shared.txt",sep="")) %>% arrange(POS)
+     dim(eqtl)
+     dim(xwas)    
+     
+     # prep so that only columns that are needed for coloc is present
+     

Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2558259 0.4857 0.4857 0.2358559 0.04857
       nsnps           H0           H1           H2           H3           H4 
4.857000e+03 1.631117e-01 4.220668e-01 1.036972e-01 2.682835e-01 4.284086e-02 
[1] "saving coloc results for: Brain_Amygdala"
[1] "run coloc for: Brain_Anterior_cingulate_cortex_BA24"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1460    0.3770    0.1220    0.3160    0.0398 
[1] "PP abf for shared variant: 3.98%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0    H1    H2        H3     H4
 -0.2659721 0.489 0.489 0.2390721 0.0489
       nsnps           H0           H1           H2           H3           H4 
4.890000e+03 1.455758e-01 3.771412e-01 1.218449e-01 3.156222e-01 3.981586e-02 
[1] "saving coloc results for: Brain_Anterior_cingulate_cortex_BA24"
[1] "run coloc for: Brain_Caudate_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1510    0.3920    0.1130    0.2920    0.0517 
[1] "PP abf for shared variant: 5.17%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2684351 0.4898 0.4898 0.2398551 0.04898
       nsnps           H0           H1           H2           H3           H4 
4.898000e+03 1.514486e-01 3.924806e-01 1.126149e-01 2.917911e-01 5.166476e-02 
[1] "saving coloc results for: Brain_Caudate_basal_ganglia"
[1] "run coloc for: Brain_Cerebellar_Hemisphere"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.0166    0.0430    0.2470    0.6390    0.0547 
[1] "PP abf for shared variant: 5.47%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2675113 0.4895 0.4895 0.2395613 0.04895
       nsnps           H0           H1           H2           H3           H4 
4.895000e+03 1.659034e-02 4.299379e-02 2.466217e-01 6.390644e-01 5.472982e-02 
[1] "saving coloc results for: Brain_Cerebellar_Hemisphere"
[1] "run coloc for: Brain_Cerebellum"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1030    0.2680    0.1590    0.4120    0.0579 
[1] "PP abf for shared variant: 5.79%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2       H3      H4
 -0.269359 0.4901 0.4901 0.240149 0.04901
       nsnps           H0           H1           H2           H3           H4 
4.901000e+03 1.034065e-01 2.680317e-01 1.588840e-01 4.117724e-01 5.790539e-02 
[1] "saving coloc results for: Brain_Cerebellum"
[1] "run coloc for: Brain_Cortex"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
    0.128     0.333     0.115     0.299     0.125 
[1] "PP abf for shared variant: 12.5%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2678192 0.4896 0.4896 0.2396592 0.04896
       nsnps           H0           H1           H2           H3           H4 
4896.0000000    0.1283041    0.3325170    0.1154377    0.2990473    0.1246939 
[1] "saving coloc results for: Brain_Cortex"
[1] "run coloc for: Brain_Frontal_Cortex_BA9"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1610    0.4170    0.1040    0.2710    0.0473 
[1] "PP abf for shared variant: 4.73%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0   H1   H2       H3    H4
 -0.269051 0.49 0.49 0.240051 0.049
       nsnps           H0           H1           H2           H3           H4 
4.900000e+03 1.607783e-01 4.167250e-01 1.044717e-01 2.707354e-01 4.728951e-02 
[1] "saving coloc results for: Brain_Frontal_Cortex_BA9"
[1] "run coloc for: Brain_Hippocampus"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1350    0.3490    0.1330    0.3440    0.0397 
[1] "PP abf for shared variant: 3.97%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2       H3      H4
 -0.269359 0.4901 0.4901 0.240149 0.04901
       nsnps           H0           H1           H2           H3           H4 
4.901000e+03 1.347504e-01 3.492862e-01 1.326013e-01 3.436759e-01 3.968615e-02 
[1] "saving coloc results for: Brain_Hippocampus"
[1] "run coloc for: Brain_Hypothalamus"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1620    0.4190    0.1030    0.2670    0.0493 
[1] "PP abf for shared variant: 4.93%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.2665877 0.4892 0.4892 0.2392677 0.04892
       nsnps           H0           H1           H2           H3           H4 
4.892000e+03 1.615881e-01 4.186825e-01 1.031654e-01 2.672573e-01 4.930665e-02 
[1] "saving coloc results for: Brain_Hypothalamus"
[1] "run coloc for: Brain_Nucleus_accumbens_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
    0.132     0.343     0.130     0.337     0.058 
[1] "PP abf for shared variant: 5.8%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
        H0     H1     H2       H3      H4
 -0.269667 0.4902 0.4902 0.240247 0.04902
       nsnps           H0           H1           H2           H3           H4 
4.902000e+03 1.323240e-01 3.430071e-01 1.299352e-01 3.367570e-01 5.797667e-02 
[1] "saving coloc results for: Brain_Nucleus_accumbens_basal_ganglia"
[1] "run coloc for: Brain_Putamen_basal_ganglia"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1450    0.3750    0.1170    0.3030    0.0607 
[1] "PP abf for shared variant: 6.07%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0    H1    H2        H3     H4
 -0.2659721 0.489 0.489 0.2390721 0.0489
       nsnps           H0           H1           H2           H3           H4 
4.890000e+03 1.445514e-01 3.745368e-01 1.170300e-01 3.031674e-01 6.071437e-02 
[1] "saving coloc results for: Brain_Putamen_basal_ganglia"
[1] "run coloc for: Brain_Spinal_cord_cervical_c-1"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1600    0.4130    0.1090    0.2810    0.0382 
[1] "PP abf for shared variant: 3.82%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0    H1    H2        H3     H4
 -0.2567474 0.486 0.486 0.2361474 0.0486
       nsnps           H0           H1           H2           H3           H4 
4.860000e+03 1.595135e-01 4.127673e-01 1.085782e-01 2.809256e-01 3.821548e-02 
[1] "saving coloc results for: Brain_Spinal_cord_cervical_c-1"
[1] "run coloc for: Brain_Substantia_nigra"
PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf 
   0.1710    0.4370    0.0911    0.2330    0.0676 
[1] "PP abf for shared variant: 6.76%"


Coloc analysis of trait 1, trait 2

SNP Priors

Hypothesis Priors

Posterior


   p1    p2   p12 
1e-04 1e-04 1e-05 
         H0     H1     H2        H3      H4
 -0.1133861 0.4386 0.4386 0.1923261 0.04386
       nsnps           H0           H1           H2           H3           H4 
4.386000e+03 1.707745e-01 4.374195e-01 9.105843e-02 2.331683e-01 6.757928e-02 
[1] "saving coloc results for: Brain_Substantia_nigra"
> 
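
The `Hypothesis Priors` row printed for each tissue can be reproduced from the per-SNP priors (`p1`, `p2`, `p12`) and `nsnps`. The sketch below is a plain arithmetic check written for this notebook, not a call into coloc, and the function name is hypothetical. It shows why the H0 prior comes out negative: with roughly 4,400 to 4,900 SNPs in the region and the priors shown, the H1 to H4 terms already add up to more than 1.

```python
def hypothesis_priors(nsnps, p1=1e-4, p2=1e-4, p12=1e-5):
    """Hypothesis-level priors implied by per-SNP priors for a region of nsnps SNPs."""
    h1 = nsnps * p1                      # one causal SNP for trait 1 only
    h2 = nsnps * p2                      # one causal SNP for trait 2 only
    h3 = nsnps * (nsnps - 1) * p1 * p2   # two different causal SNPs
    h4 = nsnps * p12                     # one shared causal SNP
    h0 = 1.0 - (h1 + h2 + h3 + h4)       # remainder; negative if the others exceed 1
    return {"H0": h0, "H1": h1, "H2": h2, "H3": h3, "H4": h4}

# nsnps values taken from the Brain_Amygdala and Brain_Substantia_nigra runs
amygdala = hypothesis_priors(4857)
nigra = hypothesis_priors(4386)
for label, priors in (("Brain_Amygdala", amygdala), ("Brain_Substantia_nigra", nigra)):
    print(label, {k: round(v, 7) for k, v in priors.items()})
```

For `nsnps = 4857` this gives H0 = -0.2558259, H1 = H2 = 0.4857, H3 = 0.2358559, and H4 = 0.04857, matching the Brain_Amygdala output.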




#### Merge coloc results and build the posterior probability table

In [164]:
%%bash
cd Analysis.COLOC_redo/

head -n 1 females_only/coloc_rs141773145_Brain_Amygdala_summary.txt > females_only/coloc_posteriors_compiled_allIndexVars_allBrainTissues.txt
cat females_only/coloc_rs141773145_Brain_*_summary.txt females_only/coloc_rs12860838_Brain_*_summary.txt | grep -v "Tissue" >> females_only/coloc_posteriors_compiled_allIndexVars_allBrainTissues.txt


head -n 1 females_only_cond_ApoE4/coloc_rs141773145_Brain_Amygdala_summary.txt > females_only_cond_ApoE4/coloc_posteriors_compiled_allIndexVars_allBrainTissues.txt
cat females_only_cond_ApoE4/coloc_rs141773145_Brain_*_summary.txt females_only_cond_ApoE4/coloc_rs6648060_Brain_*_summary.txt females_only_cond_ApoE4/coloc_rs141193614_Brain_*_summary.txt females_only_cond_ApoE4/coloc_rs12860838_Brain_*_summary.txt | grep -v "Tissue" >> females_only_cond_ApoE4/coloc_posteriors_compiled_allIndexVars_allBrainTissues.txt
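
The same merge can be written in Python, which avoids relying on `grep -v "Tissue"` to drop the repeated header lines. This is a minimal sketch using only the standard library; the function name `compile_summaries` is hypothetical, and it assumes every `*_summary.txt` file is tab-separated with an identical header row, as written by the R function above.

```python
import csv
import glob
import os

def compile_summaries(directory, index_variants, out_name):
    """Concatenate per-tissue coloc summary files into one table with a single header."""
    header = None
    rows = []
    for rsid in index_variants:
        pattern = os.path.join(directory, f"coloc_{rsid}_Brain_*_summary.txt")
        for path in sorted(glob.glob(pattern)):
            with open(path, newline="") as handle:
                reader = csv.reader(handle, delimiter="\t")
                file_header = next(reader)
                if header is None:
                    header = file_header
                elif file_header != header:
                    raise ValueError(f"unexpected header in {path}")
                rows.extend(row for row in reader if row)  # skip blank lines
    if header is None:
        raise FileNotFoundError(f"no summary files found in {directory}")
    out_path = os.path.join(directory, out_name)
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)

# Example call (paths as used in the bash cell above):
# compile_summaries("females_only", ["rs141773145", "rs12860838"],
#                   "coloc_posteriors_compiled_allIndexVars_allBrainTissues.txt")
```

Checking the header of each file also catches a summary written with a different column layout, which the `head`/`cat`/`grep` version would pass through silently.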


In [165]:
import pandas as pd
compiled_post = pd.read_csv("Analysis.COLOC_redo/females_only/coloc_posteriors_compiled_allIndexVars_allBrainTissues.txt",sep="\t")
compiled_post

| | nsnps | PP.H0.abf | PP.H1.abf | PP.H2.abf | PP.H3.abf | PP.H4.abf | Tissue | index_variant |
|---|---|---|---|---|---|---|---|---|
| 0 | 2015 | 0.636188 | 0.098066 | 0.218374 | 0.033648 | 0.013724 | Brain_Amygdala | rs141773145 |
| 1 | 2022 | 0.644494 | 0.099806 | 0.209482 | 0.032427 | 0.013792 | Brain_Anterior_cingulate_cortex_BA24 | rs141773145 |
| 2 | 2034 | 0.01333 | 0.00208 | 0.825616 | 0.128778 | 0.030196 | Brain_Caudate_basal_ganglia | rs141773145 |
| 3 | 2032 | 0.060078 | 0.009348 | 0.779921 | 0.121331 | 0.029322 | Brain_Cerebellar_Hemisphere | rs141773145 |
| 4 | 2035 | 5.9e-05 | 9e-06 | 0.839204 | 0.130955 | 0.029773 | Brain_Cerebellum | rs141773145 |
| 5 | 2035 | 0.459558 | 0.071636 | 0.389932 | 0.060765 | 0.018108 | Brain_Cortex | rs141773145 |
| 6 | 2028 | 0.105517 | 0.016404 | 0.735825 | 0.114366 | 0.027888 | Brain_Frontal_Cortex_BA9 | rs141773145 |
| 7 | 2035 | 0.469551 | 0.073282 | 0.379477 | 0.059206 | 0.018485 | Brain_Hippocampus | rs141773145 |
| 8 | 2025 | 0.375991 | 0.058281 | 0.471092 | 0.073 | 0.021637 | Brain_Hypothalamus | rs141773145 |
| 9 | 2035 | 0.183217 | 0.028599 | 0.660042 | 0.103003 | 0.025139 | Brain_Nucleus_accumbens_basal_ganglia | rs141773145 |


In [166]:
compiled_post_cond = pd.read_csv("Analysis.COLOC_redo/females_only_cond_ApoE4/coloc_posteriors_compiled_allIndexVars_allBrainTissues.txt",sep="\t")
compiled_post_cond

| | nsnps | PP.H0.abf | PP.H1.abf | PP.H2.abf | PP.H3.abf | PP.H4.abf | Tissue | index_variant |
|---|---|---|---|---|---|---|---|---|
| 0 | 2013 | 0.627853 | 0.105706 | 0.215384 | 0.036247 | 0.01481 | Brain_Amygdala | rs141773145 |
| 1 | 2022 | 0.635944 | 0.107563 | 0.206703 | 0.034947 | 0.014843 | Brain_Anterior_cingulate_cortex_BA24 | rs141773145 |
| 2 | 2032 | 0.013149 | 0.002238 | 0.814372 | 0.138566 | 0.031676 | Brain_Caudate_basal_ganglia | rs141773145 |
| 3 | 2030 | 0.059295 | 0.010071 | 0.76976 | 0.130705 | 0.030169 | Brain_Cerebellar_Hemisphere | rs141773145 |
| 4 | 2033 | 5.8e-05 | 1e-05 | 0.827426 | 0.140841 | 0.031665 | Brain_Cerebellum | rs141773145 |
| 5 | 2033 | 0.453457 | 0.077141 | 0.384662 | 0.065419 | 0.019321 | Brain_Cortex | rs141773145 |
| 6 | 2026 | 0.104216 | 0.017674 | 0.726726 | 0.123219 | 0.028165 | Brain_Frontal_Cortex_BA9 | rs141773145 |
| 7 | 2033 | 0.463549 | 0.078907 | 0.374529 | 0.063734 | 0.01928 | Brain_Hippocampus | rs141773145 |
| 8 | 2024 | 0.371126 | 0.062833 | 0.464955 | 0.078697 | 0.022388 | Brain_Hypothalamus | rs141773145 |
| 9 | 2033 | 0.180811 | 0.030785 | 0.651338 | 0.11087 | 0.026196 | Brain_Nucleus_accumbens_basal_ganglia | rs141773145 |
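
One way to scan a compiled table for colocalisation is to rank rows by `PP.H4.abf` and flag those above a cut-off. The sketch below uses three rows copied from the `females_only` table (rs141773145) and only the standard library. The 0.8 threshold is a commonly used convention and an assumption here; it is not a value set anywhere in this notebook. The function name is hypothetical.

```python
import csv
import io

# Three rows copied from the females_only compiled table (rs141773145)
ROWS = """\
nsnps,PP.H0.abf,PP.H1.abf,PP.H2.abf,PP.H3.abf,PP.H4.abf,Tissue,index_variant
2015,0.636188,0.098066,0.218374,0.033648,0.013724,Brain_Amygdala,rs141773145
2034,0.01333,0.00208,0.825616,0.128778,0.030196,Brain_Caudate_basal_ganglia,rs141773145
2035,5.9e-05,9e-06,0.839204,0.130955,0.029773,Brain_Cerebellum,rs141773145
"""

def rank_by_h4(text, threshold=0.8):
    """Sort rows by PP.H4.abf (descending) and flag those at or above the threshold."""
    records = list(csv.DictReader(io.StringIO(text)))
    for rec in records:
        rec["PP.H4.abf"] = float(rec["PP.H4.abf"])
        rec["colocalised"] = rec["PP.H4.abf"] >= threshold
    return sorted(records, key=lambda rec: rec["PP.H4.abf"], reverse=True)

ranked = rank_by_h4(ROWS)
for rec in ranked:
    print(rec["index_variant"], rec["Tissue"], rec["PP.H4.abf"], rec["colocalised"])
```

For these three rows the highest `PP.H4.abf` is 0.030196 (Brain_Caudate_basal_ganglia), so none is flagged at the assumed threshold.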
