# fastENLOC from susie objects

The first input to the pipeline is a tab-separated table listing the `susie` object for each LD block. It has two columns: `ld_block`, the name of the LD block, and `susie_object_file`, the path to the corresponding `.rds` file.

In [6]:
head /restricted/projectnb/casa/oaolayin/AD_GWAS_sum_stats/fastenloc/susie_tables/kunkle.tsv

ld_block	susie_object_file
chr1_101384274_104443097	/restricted/projectnb/casa/oaolayin/AD_GWAS_sum_stats/Final_Finemapping/outdir/kunkle/kunkle_sumstat_hg38_qc.chr1.chr1_101384274_104443097.unisusie_rss.fit.rds
chr1_104443097_106225286	/restricted/projectnb/casa/oaolayin/AD_GWAS_sum_stats/Final_Finemapping/outdir/kunkle/kunkle_sumstat_hg38_qc.chr1.chr1_104443097_106225286.unisusie_rss.fit.rds
chr1_106225286_109761915	/restricted/projectnb/casa/oaolayin/AD_GWAS_sum_stats/Final_Finemapping/outdir/kunkle/kunkle_sumstat_hg38_qc.chr1.chr1_106225286_109761915.unisusie_rss.fit.rds
chr1_109761915_111483530	/restricted/projectnb/casa/oaolayin/AD_GWAS_sum_stats/Final_Finemapping/outdir/kunkle/kunkle_sumstat_hg38_qc.chr1.chr1_109761915_111483530.unisusie_rss.fit.rds
chr1_111483530_113276642	/restricted/projectnb/casa/oaolayin/AD_GWAS_sum_stats/Final_Finemapping/outdir/kunkle/kunkle_sumstat_hg38_qc.chr1.chr1_111483530_113276642.unisusie_rss.fit.rds
chr1_113276642_115338054	/restricted/projectnb/c
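
A table like this can be generated from the fine-mapping output directory. The following is only a sketch: `make_susie_table` is a hypothetical helper, and it assumes every file name follows the `<prefix>.<chr>.<ld_block>.unisusie_rss.fit.rds` pattern seen in the listing above.

```shell
# Hypothetical helper: build the two-column table from a directory of fits.
# Assumes file names look like <prefix>.<chr>.<ld_block>.unisusie_rss.fit.rds
make_susie_table() {
    fit_dir=$1
    printf 'ld_block\tsusie_object_file\n'
    for f in "$fit_dir"/*.unisusie_rss.fit.rds; do
        [ -e "$f" ] || continue   # the directory had no matching files
        # The LD block name is the fourth dot-separated field from the end
        ld_block=$(basename "$f" | awk -F. '{print $(NF-3)}')
        printf '%s\t%s\n' "$ld_block" "$f"
    done
}

# Usage (paths are placeholders):
#   make_susie_table /path/to/outdir/kunkle > kunkle.tsv
```

Rows come out in the shell's glob order, which is lexicographic rather than genomic, so sort the table afterwards if the order matters.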

The second input is a VCF file containing the xQTL annotations for the tissue of interest.

In [1]:
zcat /restricted/projectnb/casa/oaolayin/gtex_v8.eqtl_annot_rsid.vcf.gz | head 

chr1	14677	rs201327123	G	A	ENSG00000228463:2@Skin_Not_Sun_Exposed=1.00000e+00[1.000e+00:1]|ENSG00000228463:1@Adipose_Visceral_Omentum=1.00000e+00[1.000e+00:1]|ENSG00000228463:2@Nerve_Tibial=1.00000e+00[1.000e+00:1]|ENSG00000228463:2@Muscle_Skeletal=1.00000e+00[1.000e+00:1]|ENSG00000228463:2@Skin_Sun_Exposed=9.99998e-01[1.000e+00:1]|ENSG00000228463:1@Heart_Left_Ventricle=9.99995e-01[1.000e+00:1]|ENSG00000228463:1@Heart_Atrial_Appendage=9.99990e-01[1.000e+00:1]|ENSG00000241860:1@Skin_Sun_Exposed=9.99986e-01[1.000e+00:1]|ENSG00000241860:1@Skin_Not_Sun_Exposed=9.99983e-01[1.000e+00:1]|ENSG00000228327:1@Muscle_Skeletal=9.99942e-01[9.999e-01:1]|ENSG00000241860:1@Nerve_Tibial=9.99938e-01[9.999e-01:1]|ENSG00000228327:1@Adipose_Visceral_Omentum=9.99918e-01[9.999e-01:1]|ENSG00000228327:2@Adipose_Subcutaneous=9.99918e-01[9.999e-01:1]|ENSG00000228327:1@Heart_Atrial_Appendage=9.99866e-01[9.999e-01:1]|ENSG00000228327:2@Lung=9.99832e-01[9.998e-01:1]|ENSG00000241860:1@Artery_Tibial=9.99822e-01[9.998e-
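
The sixth column packs every annotation for a variant into one `|`-separated string. As a rough sketch, the hypothetical helper `tissue_entries` below pulls out the entries for a single tissue. It assumes each entry has the form `<gene>:<signal>@<tissue>=<value>[...]`, as in the record above, and that `<value>` is the variant-level PIP. That reading is an assumption, not something this notebook states.

```shell
# Hypothetical helper: read annotation records on stdin and print, for one
# tissue, the variant ID, the gene:signal label and the value after "=".
tissue_entries() {
    awk -F'\t' -v tissue="$1" '{
        n = split($6, entries, "[|]")
        for (i = 1; i <= n; i++) {
            split(entries[i], kv, "=")   # kv[1] = gene:signal@tissue, kv[2] = value[...]
            split(kv[1], st, "@")        # st[1] = gene:signal, st[2] = tissue
            if (st[2] == tissue) {
                sub(/\[.*/, "", kv[2])   # drop the bracketed suffix
                print $3 "\t" st[1] "\t" kv[2]
            }
        }
    }'
}

# Usage (path is a placeholder):
#   zcat annot.vcf.gz | head -n 100 | tissue_entries Nerve_Tibial
```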

In [7]:
[global]
parameter: file_table = path
parameter: out_file = path
parameter: out_pre = path
parameter: eqtl_vcf = path
parameter: tissue = ''
parameter: container = ''
parameter: job_size = 1
parameter: walltime = "5h"
parameter: mem = "8G"
parameter: numThreads = 1

The first step combines the susie objects into a single table for fastENLOC, with one row per variant giving the variant ID, its LD block, and its PIP.

In [2]:
[fastenloc_1]
task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f'{step_name}_{_output:bn}'
R: expand= "$[ ]"
  # Table with one row per LD block: block name and path to its susie fit
  susie_tbl = read.csv('$[file_table]', sep = "\t", stringsAsFactors = FALSE)
  # List elements are created on first assignment, so the column order is var, set, pip
  out_tbl = list()
  for(idx in seq_len(nrow(susie_tbl))) {
    ld_block = susie_tbl$ld_block[idx]
    filename = susie_tbl$susie_object_file[idx]
    ssie = readRDS(filename)
    vars = ssie$variants
    out_tbl$var = c(out_tbl$var, vars)
    # Label every variant with the LD block it came from
    out_tbl$set = c(out_tbl$set, rep(ld_block, length(vars)))
    pip = ssie$pip
    out_tbl$pip = c(out_tbl$pip, pip)
  }
  out_tbl = as.data.frame(out_tbl)
  # Append the _b38 suffix to every variant ID
  out_tbl$var = paste0(out_tbl$var, "_b38")
  # Write a gzipped, tab-separated table without a header or row names
  gzf = gzfile('$[out_file]', 'w')
  write.table(out_tbl, gzf, sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
  close(gzf)
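
Before passing the table to fastENLOC it can be worth checking its shape. The sketch below uses a hypothetical `check_gwas_table` helper and assumes the layout the R code produces: three tab-separated fields per row (variant, LD block, PIP) with the PIP between 0 and 1.

```shell
# Hypothetical check: report rows that do not have exactly three
# tab-separated fields or whose third field is not a number in [0, 1].
# Exits non-zero if any row is reported.
check_gwas_table() {
    awk -F'\t' '
        NF != 3 || $3 !~ /^[0-9.eE+-]+$/ || $3 + 0 < 0 || $3 + 0 > 1 {
            printf "line %d looks malformed: %s\n", NR, $0
            bad = 1
        }
        END { exit bad + 0 }'
}

# Usage (path is a placeholder):
#   zcat gwas_table.gz | check_gwas_table
```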

The second step runs fastENLOC on that table.

In [3]:
[fastenloc_2]
task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = numThreads, tags = f'{step_name}_{_output:bn}'
sh: expand=True
    fastenloc -eqtl {eqtl_vcf} -gwas {out_file} -t {tissue} -prefix {out_pre}

# Example
In this example we run the pipeline once for each brain tissue that has eQTL annotations in the GTEx VCF.
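
The value given to `--tissue` has to match a tissue name used in the annotation column of the VCF. As a minimal sketch, the hypothetical helper `list_tissues` collects the distinct tissue names from any number of records. It assumes each `|`-separated entry has the form `<gene>:<signal>@<tissue>=...`, as in the excerpt shown earlier.

```shell
# Hypothetical helper: read annotation records on stdin and print the
# distinct tissue names found in column 6, one per line.
list_tissues() {
    cut -f6 |
        tr '|' '\n' |
        sed -n 's/^[^@]*@\([^=]*\)=.*/\1/p' |
        sort -u
}

# Usage (path is a placeholder):
#   zcat annot.vcf.gz | head -n 1000 | list_tissues | grep Brain
```

Scanning more than the first couple of records guards against missing a tissue that happens not to be annotated on those variants.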

In [None]:
tissues=$(zcat /restricted/projectnb/casa/oaolayin/gtex_v8.eqtl_annot_rsid.vcf.gz | head -n 2 | cut -f6 | tr "@" "\n" | grep "=" | cut -d"=" -f1 | grep Brain | sort | uniq )
tmp_file=$(mktemp)
for t in ${tissues}; do
    sos run /restricted/projectnb/casa/oaolayin/fastenloc.ipynb fastenloc \
        --file-table /restricted/projectnb/casa/oaolayin/AD_GWAS_sum_stats/fastenloc/susie_tables/kunkle.tsv \
        --eqtl-vcf /restricted/projectnb/casa/oaolayin/gtex_v8.eqtl_annot_rsid.vcf.gz \
        --out-file ${tmp_file} --out-pre /restricted/projectnb/casa/oaolayin/fastenloc_test/kunkle_${t} --tissue ${t} \
        --container /restricted/projectnb/casa/oaolayin/xqtl-pipeline/container/singularity/fastenloc.sif
done


gzip: stdout: Broken pipe
INFO: Running fastenloc_1: 
INFO: fastenloc_1 is completed.
INFO: Running fastenloc_2: 

		                     fastENLOC (v2.0)                        

		                        April, 2022                              




Parameters and options:

Input files:
    * Molecular qtl annotation file: /restricted/projectnb/casa/oaolayin/gtex_v8.eqtl_annot_rsid.vcf.gz
    * GWAS fine-mapping file: /scratch/192325.1.ood/tmp.ke7wRoXfOG
    * Tissue specified: Brain_Caudate_basal_ganglia

Enrichment parameters:
    * Rounds of multiple imputation: 25
    * Shrinkage parameter: 1.0

Miscsellaneous options:
    * Total GWAS variants: unspecified, use GWAS file input
    * Simultaneous running threads: 1

Output options:
    * Output file prefix: /restricted/projectnb/casa/oaolayin/fastenloc_test/kunkle_Brain_Caudate_basal_ganglia
    * RCP and SCP output threshold: 1.0e-04


Processing eQTL annotations ... 
read in 895725 SNPs, 585