# How to conduct fine-mapping analysis for gene-level eQTL data from islet tissue

Islet gene-level eQTL data were downloaded from Viñuela, A., Varshney, A., van de Bunt, M. et al. Genetic variant effects on gene expression in human pancreatic islets and their implications for T2D. Nat Commun 11, 4912 (2020). https://doi.org/10.1038/s41467-020-18581-8. Specifically, the list of lead signals and the full summary statistics can be downloaded from https://zenodo.org/record/3408356

This document contains instructions on how to conduct fine-mapping analysis for islet gene-level eQTL data.

In this analysis, we employed genotype data from 40,000 unrelated British individuals in the UK Biobank.

We thank Dr. Arushi Varshney (Parker Lab) for their valuable support in shaping the analysis strategies and code development.

## Step 1: LiftOver the summary data from hg19 to hg38

Inputs to this analysis are the UKBB reference data from 40K unrelated individuals, the full summary statistics file `InsPIRE_Islets_Gene_eQTLs_Nominal_Pvalues.txt.gz`, and the list of eGenes (i.e., genes with at least one significant eQTL) `PacreaticIslets_independent_Gene_eQTLs.txt`. See the example Snakemake files `scripts/hg19liftOverToHg38.sf` and `scripts/hg19liftOverToHg38_leads.sf`.

The example Snakefile `scripts/hg19liftOverToHg38.sf` was developed to conduct the following steps:
- Step 1 (`rule ukbb_rsids`): this rule excludes genotypes from the UKBB reference vcf files to make a smaller vcf file per chromosome.
- Step 2 (`rule merge_ukbb`): this rule merges the smaller per-chromosome vcf files to create a genome-wide reference file.
- Step 3 (`rule liftover`): this rule translates the hg19 coordinates of variants in the summary stat file to hg38. It requires the LiftOver tool and the hg19-to-hg38 chain file (`hg19ToHg38.over.chain.gz`). To download LiftOver, see https://hgdownload.soe.ucsc.edu/downloads.html#liftover
- Step 4 (`rule getNewCoor`): this rule creates a file with the new hg38 coordinates and the summary stats.
- Step 5 (`rule alignGenes`): this rule keeps only SNPs within 500 kb of the gene TSS, because some gene and SNP coordinates may change after liftover.
- Step 6 (`rule splitGene`): this rule writes variants and summary stats per feature (e.g., per gene).
- Step 7 (`rule getSummStatEqtl`): this rule merges variants and summary stats across all chromosomes and indexes the file.
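
The window filter in Step 5 can be illustrated in a few lines. This is a minimal sketch in Python, not the code in the Snakefile: the helper name `within_tss_window`, the tuple layout, and the inclusive boundary are assumptions made for the example, and the variants are toy values.

```python
# Hypothetical helper: keep variants within a fixed window of a gene's TSS.
WINDOW_BP = 500_000  # 500 kb, as described for `rule alignGenes`


def within_tss_window(variants, tss_chrom, tss_pos, window=WINDOW_BP):
    """Return variants on `tss_chrom` lying within `window` bp of `tss_pos`.

    `variants` is an iterable of (chrom, pos, rsid) tuples in hg38 coordinates.
    """
    return [
        (chrom, pos, rsid)
        for chrom, pos, rsid in variants
        if chrom == tss_chrom and abs(pos - tss_pos) <= window
    ]


variants = [
    ("chr10", 50_400_000, "rsA"),  # 485,675 bp from the TSS: kept
    ("chr10", 51_500_000, "rsB"),  # 614,325 bp from the TSS: dropped
    ("chr11", 50_885_675, "rsC"),  # on another chromosome after liftover: dropped
]
kept = within_tss_window(variants, "chr10", 50_885_675)
print([rsid for _, _, rsid in kept])
```

Filtering on the chromosome as well as the distance covers the case where liftover moves a variant to a different chromosome than its gene.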

The Snakefile `scripts/hg19liftOverToHg38_leads.sf` is organized similarly to `scripts/hg19liftOverToHg38.sf`, but serves a different purpose: it lifts over the lead SNPs of eGenes.

## Step 2: Set up data for the fine-mapping pipeline

We set up a file with gene-level summary stat files for all lead signals, which will be used in the next step.

```
library(dplyr)

# `ind` holds the independent eQTL signals, in hg19 coordinates
ind <- read.table("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/data/PacreaticIslets_independent_Gene_eQTLs.txt", header = T)
# label primary (P) and secondary (S) signals
ind$DiscoveryOrder2 <- ifelse(ind$DiscoveryOrder == 1, "P", "S")
# strip the version suffix from the Ensembl gene ID
ind$GeneStableID <- unlist(lapply(strsplit(ind$GeneID, '\\.'), '[', 1))
head(ind)
```



| | GeneName | Strand | GencodeLevel | GeneType | GeneID | ChrPheno | StartPheno | EndPheno | NumSNPs | DistanceWithBest | ⋯ | chrSNP | StartSNP | EndSNP | Nominal_Pval | Slope | EmpiricalAdjustedPval | BetaAdjustedPval | DiscoveryOrder | DiscoveryOrder2 | GeneStableID |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ACTR8 | - | 1 | protein_coding | ENSG00000113812.9 | 3 | 53916229 | 53916229 | 5186 | 6498 | ⋯ | 3 | 53909731 | 53909731 | 4.59563e-44 | -0.332961 | 0.000999001 | 7.04865e-37 | 1 | P | ENSG00000113812 |
| 2 | ERC2 | - | 2 | protein_coding | ENSG00000187672.8 | 3 | 56502391 | 56502391 | 5997 | -232239 | ⋯ | 3 | 56734630 | 56734630 | 9.17441e-07 | 0.212021 | 0.002997 | 0.00194155 | 1 | P | ENSG00000187672 |
| 3 | CCDC66 | + | 2 | protein_coding | ENSG00000180376.12 | 3 | 56591189 | 56591189 | 5981 | -134916 | ⋯ | 3 | 56726105 | 56726105 | 8.62068e-11 | 0.171517 | 0.000999001 | 3.52246e-07 | 1 | P | ENSG00000180376 |
| 4 | ARHGEF3 | - | 1 | protein_coding | ENSG00000163947.7 | 3 | 57113357 | 57113357 | 5666 | 319635 | ⋯ | 3 | 56793722 | 56793722 | 8.44758e-07 | -0.149698 | 0.000999001 | 0.00124083 | 1 | P | ENSG00000163947 |
| 5 | ABHD6 | + | 2 | protein_coding | ENSG00000163686.9 | 3 | 58223233 | 58223233 | 4746 | -56828 | ⋯ | 3 | 58280061 | 58280061 | 6.79776e-12 | -0.228906 | 0.000999001 | 8.71536e-09 | 1 | P | ENSG00000163686 |
| 6 | PXK | + | 1 | protein_coding | ENSG00000168297.11 | 3 | 58318607 | 58318607 | 4823 | 41071 | ⋯ | 3 | 58277536 | 58277536 | 1.21914e-16 | -0.299304 | 0.000999001 | 5.84402e-12 | 1 | P | ENSG00000168297 |
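
The two derived columns, `GeneStableID` and `DiscoveryOrder2`, follow simple rules. As a minimal sketch in Python (the function names are hypothetical and only mirror the R expressions in this step):

```python
def stable_gene_id(gene_id: str) -> str:
    """Drop the version suffix from an Ensembl gene ID (text after the first '.')."""
    return gene_id.split(".", 1)[0]


def discovery_label(discovery_order: int) -> str:
    """Map DiscoveryOrder to 'P' (primary, order 1) or 'S' (secondary, any other order)."""
    return "P" if discovery_order == 1 else "S"


print(stable_gene_id("ENSG00000113812.9"))
print(discovery_label(1), discovery_label(2))
```

Stripping the version suffix lets later joins match gene IDs across files that may use different annotation versions.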


```
# `lead` holds the lead SNPs in hg38 coordinates that are also present in the UKBB reference
lead <- read.table("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_leads/eqtl_Gene.bed.gz", 
                   comment.char = "", header = T, fill = T)
# attach the primary/secondary label by matching on nominal P-value and gene ID
lead <- inner_join(lead, ind[, c("Nominal_Pval",  "GeneStableID", "DiscoveryOrder2")], by = c("Pvalue" = "Nominal_Pval", "GeneStableID" = "GeneStableID"))
# locus name format: <gene name>__<lead SNP rsID>__<P or S>
df38 <- data.frame(seqnames = lead$X.snp_chrom, start = lead$snp_start, end = lead$snp_end,
                   name = paste0(lead$GeneName, "__", lead$SNP, "__", lead$DiscoveryOrder2),
                   gene_id = lead$GeneStableID)
head(df38)
```

“Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 112 of `x` matches multiple rows in `y`.
ℹ Row 1344 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship =


| | seqnames | start | end | name | gene_id |
|---|---|---|---|---|---|
| 1 | chr1 | 989147 | 989148 | RP11-54O7.17__rs34712273__P | ENSG00000272512 |
| 2 | chr1 | 1083799 | 1083800 | RP11-54O7.18__rs9442396__P | ENSG00000273443 |
| 3 | chr1 | 1407231 | 1407232 | B3GALT6__rs2275915__P | ENSG00000176022 |
| 4 | chr1 | 1407564 | 1407565 | MRPL20__rs34442823__P | ENSG00000242485 |
| 5 | chr1 | 1418251 | 1418252 | ANKRD65__rs4970449__P | ENSG00000235098 |
| 6 | chr1 | 1508768 | 1508769 | ATAD3B__rs190181683__P | ENSG00000160072 |
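
The many-to-many warning from `inner_join` means that the pair (P-value, gene ID) does not identify a single row on each side of the join. One way to inspect this before joining is to list the keys that occur more than once. The sketch below is in Python with toy rows; `duplicated_keys` is a hypothetical helper and the gene IDs are placeholders, not values from the real tables.

```python
from collections import Counter


def duplicated_keys(rows, key_fields):
    """Return the join keys that occur more than once in `rows` (a list of dicts)."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Toy rows: two signals of the same gene share a nominal P-value.
ind_rows = [
    {"GeneStableID": "ENSG_A", "Nominal_Pval": 1e-8, "DiscoveryOrder2": "P"},
    {"GeneStableID": "ENSG_A", "Nominal_Pval": 1e-8, "DiscoveryOrder2": "S"},
    {"GeneStableID": "ENSG_B", "Nominal_Pval": 3e-5, "DiscoveryOrder2": "P"},
]
print(duplicated_keys(ind_rows, ["GeneStableID", "Nominal_Pval"]))
```

If such keys exist, the joined table contains extra rows for them, which is why `distinct()` is applied later in this step.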


```
# list the per-gene summary stat files, excluding the tabix index files
files <- list.files("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/", pattern = "gz")
files <- files[grep("tbi", files, invert = T)]
file_df <- data.frame(eqtl_input = files)
# file names start with the versioned gene ID, followed by "__"
file_df$gene <- unlist(lapply(strsplit(file_df$eqtl_input, '__'), '[', 1))
file_df$GeneStableID <- unlist(lapply(strsplit(file_df$gene, '\\.'), '[', 1))

df38 <- inner_join(df38, file_df, by = c("gene_id" = "GeneStableID"))
df38$eqtl_input <- paste0("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/",
                          df38$eqtl_input)

# replace the lead SNP coordinates with the gene TSS coordinates
tss <- read.table("/scratch/scjp_root/scjp99/vthihong/genome/geneTSS.bed", header = F)
df38 <- inner_join(df38, tss[, c(1, 2, 3, 6)], by = c("gene_id" = "V6"))
df <- df38[, c("V1", "V2", "V3", "name", "gene_id", "eqtl_input")]
df <- distinct(df)
colnames(df) <- c("chr", "start", "end", "locus", "gene_id", "eqtl_input")
head(df)

write.table(df, row.names = F, sep = "\t", quote = F,
            file = "/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/data/gene_eQTLs-selected.tsv")
```

| | chr | start | end | locus | gene_id | eqtl_input |
|---|---|---|---|---|---|---|
| 1 | chr1 | 998051 | 998052 | RP11-54O7.17__rs34712273__P | ENSG00000272512 | /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000272512.1__InsPIRE_Islets_Gene__1:995966.bed.gz |
| 2 | chr1 | 1063288 | 1063289 | RP11-54O7.18__rs9442396__P | ENSG00000273443 | /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000273443.1__InsPIRE_Islets_Gene__1:1062208.bed.gz |
| 3 | chr1 | 1232265 | 1232266 | B3GALT6__rs2275915__P | ENSG00000176022 | /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000176022.3__InsPIRE_Islets_Gene__1:1232237.bed.gz |
| 4 | chr1 | 1407313 | 1407314 | MRPL20__rs34442823__P | ENSG00000242485 | /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000242485.1__InsPIRE_Islets_Gene__1:1401909.bed.gz |
| 5 | chr1 | 1421769 | 1421770 | ANKRD65__rs4970449__P | ENSG00000235098 | /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000235098.4__InsPIRE_Islets_Gene__1:1418420.bed.gz |
| 6 | chr1 | 1471769 | 1471770 | ATAD3B__rs190181683__P | ENSG00000160072 | /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000160072.15__InsPIRE_Islets_Gene__1:1471765.bed.gz |


The `df` object is saved to a file named `gene_eQTLs-selected.tsv`, which is used as input in Step 3.
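
Before passing `gene_eQTLs-selected.tsv` to the next step, it may help to confirm that the header and coordinate columns look as expected. The following is a minimal sketch in Python; `read_selected` is a hypothetical helper, the column list is copied from the table above, and the in-memory example uses a shortened placeholder path rather than a real file.

```python
import csv
import io

EXPECTED_COLUMNS = ["chr", "start", "end", "locus", "gene_id", "eqtl_input"]


def read_selected(handle):
    """Parse the tab-separated table and check its header against EXPECTED_COLUMNS."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    rows = []
    for row in reader:
        # coordinates are stored as integers; int() raises on malformed values
        row["start"], row["end"] = int(row["start"]), int(row["end"])
        rows.append(row)
    return rows


# In-memory stand-in for the file; the eqtl_input value is a placeholder.
example = io.StringIO(
    "chr\tstart\tend\tlocus\tgene_id\teqtl_input\n"
    "chr1\t998051\t998052\tRP11-54O7.17__rs34712273__P\tENSG00000272512\texample.bed.gz\n"
)
rows = read_selected(example)
print(rows[0]["locus"], rows[0]["end"] - rows[0]["start"])
```

To check the real file, the same function could be called on `open(path, newline="")`.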

## Step 3: Set up scripts for every eGene of interest

First, we need to set up a config file with housekeeping information such as file directories and parameters. See the example in `config.yaml`. The file `gene_eQTLs-selected.tsv` is used for both `trait1-leads` and `selected-stats`. Then, we can use the `scripts/make_susie-sh.py` script to create a SLURM job per region of interest.

Important note: the script `scripts/make_susie-sh.py` requires two template scripts, which must be specified in the config file:
```
prep-template: "{base}/scripts/dosage-template.sh"
susie-template: "{base}/scripts/susie-template.sh"
```

```
cd /scratch/scjp_root/scjp99/vthihong/2_PanKBase/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie-region

python /scratch/scjp_root/scjp99/vthihong/2_PanKBase/colocGWAS_T1D/1_eQTL-inspire-susie/scripts/make_susie-sh.py --config /scratch/scjp_root/scjp99/vthihong/2_PanKBase/colocGWAS_T1D/1_eQTL-inspire-susie/scripts/config.yaml
```

At this point, we have a pair of scripts for each region, named `gene_eQTLs__<locus name, without special characters such as ;()>__<lead SNP rsID>__<primary P or secondary S>__<region>__<window>.susieprep.sh` and `gene_eQTLs__<locus name, without special characters such as ;()>__<lead SNP rsID>__<primary P or secondary S>__<region>__<window>.susie.sh`. The `*.susieprep.sh` script fetches the information needed for fine-mapping, such as variants and dosages. The `*.susie.sh` script runs the fine-mapping analysis.

An example `susieprep.sh` file is shown below:
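
The naming scheme can be reproduced with a short helper. This is a sketch in Python under stated assumptions, not the logic of `make_susie-sh.py`: the function `script_prefix` is hypothetical, the window is assumed to be added on both sides of the TSS interval, and the TSS coordinates in the usage line were chosen so that the output matches the A1CF example name rather than read from `geneTSS.bed`.

```python
def script_prefix(locus, chrom, tss_start, tss_end, window_kb=250, trait="gene_eQTLs"):
    """Build '<trait>__<locus>__<chrom>-<start>-<end>__<window>kb'.

    Assumes the window is added on both sides of the TSS interval and that
    the region start is clamped at 0.
    """
    flank = window_kb * 1000
    start = max(0, tss_start - flank)
    end = tss_end + flank
    return f"{trait}__{locus}__{chrom}-{start}-{end}__{window_kb}kb"


prefix = script_prefix("A1CF__rs12244405__P", "chr10", 50885675, 50885676)
print(prefix + ".susieprep.sh")
print(prefix + ".susie.sh")
```

Because the locus name already contains the gene name, lead SNP, and P/S label separated by `__`, splitting a file name on `__` recovers each component.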
```
cat gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.susieprep.sh

#!/bin/bash

## fetch variants in the region and intersect UKBB vcfs
for i in /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/0_data/hg38/chr10.imputed.poly.vcf.gz; do tabix $i chr10:50635675-51135676 | awk '{if (($0 !~ /^#/ && $0 !~ /^chr/)) print "chr"$0; else print $0}' ; done | sort | uniq > gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb.genotypes
zcat /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/0_data/hg38/chr10.imputed.poly.vcf.gz | head -10000 | awk '{if (($0 ~ /^#/)) print $0}' > gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb.header
cat gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb.header gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb.genotypes | bgzip -c > gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb.vcf.gz; tabix gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb.vcf.gz
rm gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb.genotypes gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb.header

## fetch UKBB dosages 
zcat gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb.vcf.gz | head -10000 | awk -F'\t' '{if (($0 ~/^#CHROM/)) print $0}' OFS='\t' | sed -e 's:#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT:ID:g' > gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb-header.txt 
bcftools query -f "%ID-%REF-%ALT[\t%DS]\n" gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb.vcf.gz | cat gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb-header.txt - > gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb-dosages.tsv 

## bgzip to save space
module load Bioinformatics
module load Bioinformatics  gcc/10.3.0-k2osx5y
module load samtools/1.13-fwwss5n

bgzip -@ 2 gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb-dosages.tsv

## cleanup
rm -rf gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb-header.txt gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb.vcf.gz*
```
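
The `bcftools query` call in the script above writes one row per variant: an `ID-REF-ALT` key followed by one tab-separated dosage per sample. A minimal Python sketch of reading such a row is shown below. `parse_dosage_line` is a hypothetical helper, the dosage values are toy numbers, and missing dosages (which bcftools prints as `.`) are not handled.

```python
def parse_dosage_line(line):
    """Split one row of the dosage table into (rsid, ref, alt, dosages)."""
    variant, *values = line.rstrip("\n").split("\t")
    # REF and ALT are the last two '-'-separated fields of the key
    rsid, ref, alt = variant.rsplit("-", 2)
    return rsid, ref, alt, [float(v) for v in values]


print(parse_dosage_line("rs12244405-C-T\t0\t1\t2\t0.996\n"))
```

Splitting from the right keeps the variant ID intact even if the ID itself contains a hyphen.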

An example `susie.sh` file is shown below:
```
cat gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.susie.sh

#!/bin/bash

################## running SuSiE for  gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb:

## Susie 
/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/scripts/PanKgraph-finemap-coloc/1_eQTL-inspire-susie/scripts/susie-eqtl.R --prefix gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb --type quant --beta beta --p p_nominal --effect ALT --non_effect REF --sdY 1 --coverage 0.95 --maxit 10000 --min_abs_corr 0.1 --s_threshold 0.3 --number_signals_default 10 --number_signals_high_s 1 --marker rs12244405 --trait1 /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000148584.10__InsPIRE_Islets_Gene__10:50799409.bed.gz --trait1_ld gene_eQTLs__A1CF__rs12244405__P__chr10-50635675-51135676__250kb.ukbb-dosages.tsv.gz 
```

## Step 4: Conduct fine-mapping analysis for all regions of interest

After we set up the analysis scripts for each eGene, we can run the analysis for every eGene using Snakemake. See the example Snakemake file at `scripts/susie.sf`.

By default, the signals for each eGene are saved in an R data file named `*.susie.Rda`.
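
After the run, it can be useful to list the regions that produced no result file. The following Python sketch assumes one `<prefix>.susie.Rda` file per region in a single directory; `missing_outputs` is a hypothetical helper that is not part of the pipeline, and the prefixes in the demonstration are made up.

```python
from pathlib import Path
import tempfile


def missing_outputs(directory, prefixes, suffix=".susie.Rda"):
    """Return the prefixes that have no '<prefix><suffix>' file in `directory`."""
    directory = Path(directory)
    return [p for p in prefixes if not (directory / f"{p}{suffix}").exists()]


# Demonstration in a temporary directory with made-up prefixes.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "locusA.susie.Rda").touch()
    result = missing_outputs(tmp, ["locusA", "locusB"])
print(result)
```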

## Step 5: Obtain output files for PanKgraph

For PanKgraph, we extract selected outputs into text files. Example code follows:

```
library(glue)
library(tidyr)
suppressPackageStartupMessages(library(dplyr))
```

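```
# Compute pairwise correlations between the variants in `snplist` from a dosage table
process_dosage = function(f, snplist){
    ld = read.csv(f, sep='\t', check.names = F)
    # report, then drop, every copy of any variant ID that appears more than once
    dups = ld[ (duplicated(ld$ID) | duplicated(ld$ID, fromLast = TRUE)),]
    print(glue("N duplicates = {nrow(dups)}"))
    ld = ld[! (duplicated(ld$ID) | duplicated(ld$ID, fromLast = TRUE)),]
    row.names(ld) = ld$ID
    ld$ID = NULL
    # keep only the requested variants that are present in the dosage table
    idlist = intersect(snplist, row.names(ld))
    ld = ld[idlist,]
    print(ld[1:5, 1:10])
    # rows are variants, so transpose before computing correlations
    ld = cor(t(ld))
    return(ld)
}

meta <- read.table("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/data/gene_eQTLs-selected.tsv", header = T)
meta <- distinct(meta)
head(meta)
```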

| | chr | start | end | locus | gene_id | eqtl_input |
|---|---|---|---|---|---|---|
| 1 | chr1 | 998051 | 998052 | RP11-54O7.17__rs34712273__P | ENSG00000272512 | /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000272512.1__InsPIRE_Islets_Gene__1:995966.bed.gz |
| 2 | chr1 | 1063288 | 1063289 | RP11-54O7.18__rs9442396__P | ENSG00000273443 | /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000273443.1__InsPIRE_Islets_Gene__1:1062208.bed.gz |
| 3 | chr1 | 1232265 | 1232266 | B3GALT6__rs2275915__P | ENSG00000176022 | /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000176022.3__InsPIRE_Islets_Gene__1:1232237.bed.gz |
| 4 | chr1 | 1407313 | 1407314 | MRPL20__rs34442823__P | ENSG00000242485 | /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000242485.1__InsPIRE_Islets_Gene__1:1401909.bed.gz |
| 5 | chr1 | 1421769 | 1421770 | ANKRD65__rs4970449__P | ENSG00000235098 | /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000235098.4__InsPIRE_Islets_Gene__1:1418420.bed.gz |
| 6 | chr1 | 1471769 | 1471770 | ATAD3B__rs190181683__P | ENSG00000160072 | /nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/2_t1d-eQTL-coloc/results/hg38/eqtl_Gene_indexed/ENSG00000160072.15__InsPIRE_Islets_Gene__1:1471765.bed.gz |
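
The LD computation in `process_dosage` amounts to: drop every variant ID that occurs more than once, keep the requested variants, and correlate their dosage vectors across samples (the downstream code then squares the result to obtain r²). The Python sketch below illustrates the same idea on toy dosages. The function names and data are made up, and variants with constant dosage (zero variance) are not handled.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ld_r2(dosages, snplist):
    """r^2 between variants; `dosages` is a list of (variant_id, [dosage per sample])."""
    ids = [vid for vid, _ in dosages]
    # drop all copies of any duplicated variant ID
    unique = {vid: d for vid, d in dosages if ids.count(vid) == 1}
    keep = [vid for vid in snplist if vid in unique]
    return {(a, b): pearson(unique[a], unique[b]) ** 2 for a in keep for b in keep}


toy = [
    ("rs1-A-G", [0, 1, 2, 1]),
    ("rs2-C-T", [2, 1, 0, 1]),  # perfectly anti-correlated with rs1
    ("rs3-G-A", [0, 0, 1, 1]),
    ("rs3-G-A", [1, 1, 0, 0]),  # duplicated ID: both copies are dropped
]
r2 = ld_r2(toy, ["rs1-A-G", "rs2-C-T", "rs3-G-A"])
print(round(r2[("rs1-A-G", "rs2-C-T")], 6))
```

Squaring the correlation removes the sign, so two variants whose alleles are coded in opposite directions still show r² = 1.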


```
l <- "A1CF__rs12244405__P__chr10-50635675-51135676"
input <- meta[meta$locus == "A1CF__rs12244405__P", "eqtl_input"]

for (k in seq_along(input)) {
    qtl <- read.csv(input[k], sep='\t', header=T, check.names=F)
    qtl$snp <- paste0(qtl$SNP, "-", qtl$REF, "-", qtl$ALT)
    qtl$Slope <- qtl$Slope / qtl$multiply # get the slope originally reported by the study
    # load the fine-mapping result for this locus (provides the object `S2`)
    load(paste0("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie-prep/gene_eQTLs__", 
                l, "__250kb.susie.Rda"))

    # per-credible-set tables: PIP, nominal P-value, alleles, slope, and log Bayes factor
    if (length(S2$sets$cs) > 0) {
        for (j in 1:length(S2$sets$cs)) {
            pip <- data.frame(pip=S2$pip[names(S2$sets$cs[[j]])])
            # skip credible sets with coverage below 0.95
            if (S2$sets$coverage[[j]] < 0.95) {
                print(names(S2$sets$cs[[j]]))
                next
            }

            pip$snp <- row.names(pip)
            pip <- inner_join(pip, qtl[,c("snp", "Pvalue", "effect_allele", "other_allele", "Slope")])
            # log Bayes factors of the effect that defines this credible set
            idx = S2$sets$cs_index[j]
            isnps = colnames(S2$lbf_variable)
            bf = S2$lbf_variable[idx, isnps, drop=FALSE]
            bf = data.frame(snp = isnps, lbf = t(bf)[,1])
            pip <- inner_join(pip, bf, by = c("snp" = "snp"))
            print(head(pip))
            colnames(pip) <- c("pip", "snp", "nominal_p", "effect_allele", "other_allele", "slope", "lbf")

            # pairwise LD (r^2) among the credible-set variants
            ldf <- process_dosage(paste0("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie-prep/gene_eQTLs__", 
                                         l, "__250kb.ukbb-dosages.tsv.gz"), pip$snp)
            ldf <- ldf**2
            # reduce "rsID-REF-ALT" names to the rsID
            colnames(ldf) <- stringr::str_extract(colnames(ldf), "[^-]*")
            rownames(ldf) <- stringr::str_extract(rownames(ldf), "[^-]*")
            #write.table(ldf, paste0("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie/gene_eQTLs__", 
            #                            l, "__250kb__credibleSet", j, "__ld.txt"), sep = "\t", quote = F)

            pip$snp <- stringr::str_extract(pip$snp, "[^-]*")
            print(head(pip))
            #write.table(pip[, c("snp", "pip", "nominal_p", "effect_allele", "other_allele", "slope", "lbf")], 
            #            paste0("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie/gene_eQTLs__", l, "__250kb__credibleSet", j, ".txt"), row.names = F, sep = "\t", quote = F)
        }
    }

    # purity and coverage of each credible set
    if (length(S2$sets$cs) > 0) {
        purity <- c()
        coverage <- c()
        p <- data.frame(locus = rep(l, length(S2$sets$cs)), purity = NA, coverage = NA)
        for (j in 1:length(S2$sets$cs)) {
            coverage <- c(coverage, S2$sets$coverage[[j]])
            purity <- c(purity, S2$sets$purity[j, 1])
        }
        p$purity <- purity
        p$coverage <- coverage
        p$credibleset <- 1:length(S2$sets$cs)
        print(head(p))
        #write.table(p, paste0("/nfs/turbo/umms-scjp-pank/vthihong/colocGWAS_T1D/1_eQTL-inspire-susie/results/susie/purity/gene_eQTLs__", 
        #                "A1CF", "__", l, ".txt"), row.names = F, sep = "\t", quote = F)
    }
}
```
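
The extraction code filters credible sets by their coverage. As background, a credible set for a single effect can be thought of as the smallest group of variants whose posterior probabilities sum to at least the requested coverage. The Python sketch below illustrates that idea on toy probabilities. It is not the susieR implementation, and `credible_set` and the values in `alpha` are made up for the example.

```python
def credible_set(alpha, coverage=0.95):
    """Smallest set of variants whose summed probability reaches `coverage`.

    `alpha` maps variant ID -> posterior probability for one effect (sums to 1).
    Returns (variant IDs in decreasing probability, achieved coverage).
    """
    total, chosen = 0.0, []
    for variant, prob in sorted(alpha.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(variant)
        total += prob
        if total >= coverage:
            break
    return chosen, total


alpha = {"rs1": 0.60, "rs2": 0.30, "rs3": 0.07, "rs4": 0.03}
cs, achieved = credible_set(alpha)
print(cs, round(achieved, 2))
```

Here the first two variants reach only 0.90, so the third is added and the set attains a coverage of 0.97.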


[1m[22mJoining with `by = join_by(snp)`


         pip            snp      Pvalue effect_allele other_allele     Slope
1 0.06585744 rs61856594-A-G 1.89082e-07             G            A  0.154257
2 0.10735381 rs17500776-G-C 1.11974e-07             C            G  0.155543
3 0.11757846 rs17500846-A-G 1.01574e-07             G            A  0.155280
4 0.10256088  rs7075575-A-G 1.17591e-07             G            A -0.154399
5 0.16948025 rs12244405-C-T 6.86634e-08             T            C  0.156916
6 0.10257797  rs6479769-T-C 1.17570e-07             C            T -0.154400
       lbf
1 11.42003
2 11.90867
3 11.99964
4 11.86299
5 12.36527
6 11.86316
N duplicates = 0
               1000251 1000534 1000542 1000766 1000898 1000924 1000961 1001059
rs61856594-A-G       0       1       2       1   1.000       1       1       1
rs17500776-G-C       0       1       2       1   1.000       1       1       1
rs17500846-A-G       0       1       2       1   0.996       1       1       1
rs7075575-A-G        2       1       0       1   1.