# Getting results from molQTL analyses in the Athero-Express

This notebook extracts per-variant and per-gene results from _cis_- and _trans_-acting eQTL and mQTL analyses of gene expression and DNA methylation in carotid plaques from the Athero-Express Biobank Study.

## PGC

We will use the European (`EUR`) data from the Psychiatric Genomics Consortium (PGC) meta-analysis of GWAS on major depressive disorder (MDD). This GWAS identified 602 loci associated with MDD.

## Million Hearts

We will use the trans-ancestry data (the only data available) from the 'Million Hearts' project, a meta-analysis of GWAS for coronary artery disease (CAD).

## GIGASTROKE

We will use the European (`EUR`) data from [GIGASTROKE](https://www.nature.com/articles/s41586-022-05165-3), which include the following five phenotypes.

Phenotypes:

- `AS` or `ALLSTROKE` = all stroke
- `AIS` or `IS` = all ischemic stroke
- `CES` = cardio-embolic (ischemic) stroke
- `LAS` = large artery (ischemic) stroke
- `SVS` or `SVD` = small vessel disease/stroke

We looked up the GIGASTROKE loci in these data as follows:

```
python scripts/loci_lookup.py -A targets/gigastroke_las_loci.txt -cA Chr BP -B ~/PLINK/_GWAS_Datasets/_ISGC/gigastroke/LAS_EUR_GCST90104538_buildGRCh37.tsv.gz -cB chromosome base_pair_location -o OUTPUT/20231211_gigastroke_loci.LAS_EUR_GCST90104538_buildGRCh37.txt
```

We also looked up all the PGC hits in the European data from `GIGASTROKE` as follows:

```
python scripts/loci_lookup.py -A /[path_to_google_drive]/\#Projects/TO_AITION/MR\ CVD\ MDD/GWAS/PGC/PGC3_cojo_622.txt -cA CHR BP -B /[path_to_plink_data]/_GWAS_Datasets/_ISGC/gigastroke/LAS_EUR_GCST90104538_buildGRCh37.tsv.gz -cB chromosome base_pair_location -o OUTPUT/20231211_LAS_EUR_GCST90104538_buildGRCh37.toPGC.txt
```

## CAC

We will use the European (`EUR`) data from [coronary artery calcification (CAC) GWAS](https://www.nature.com/articles/s41588-023-01518-4). 

## CIMT

We will use the European (`EUR`) data from [carotid intima-media thickness (cIMT) GWAS](https://www.nature.com/articles/s41467-018-07340-5). This includes two phenotypes:

- `cIMT`, which is carotid IMT
- `plaque`, which is the presence of plaque defined as >25% `cIMT`

## Import necessary libraries

In [1]:
import os
import glob
import importlib
import subprocess
import sys

# Function to check for installation of required packages
def check_install_package(package_name):
    try:
        importlib.import_module(package_name)
    except ImportError:
        print(f'{package_name} is not installed. Installing it now...')
        # use the pip that belongs to the running interpreter
        subprocess.check_call([sys.executable, '-m', 'pip', 'install', package_name])

# argument parsing
import argparse

# get date and time
from datetime import datetime

# Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool
check_install_package('pandas')
import pandas as pd

# pyarrow is used for loading parquet files
check_install_package('pyarrow')
import pyarrow as pa
import pyarrow.parquet as pq

# polars is a fast dataframe library
check_install_package('polars')
import polars as pl

# for statistical analysis
check_install_package('scipy')
from scipy import stats
import numpy as np

# scientific colourmaps
# https://www.fabiocrameri.ch/ws/media-library/8c4b111121ff448e843dfef9220bf613/readme_scientificcolourmaps.pdf
check_install_package('cmcrameri')
import cmcrameri as ccm
from cmcrameri import cm

# for plotting
check_install_package('matplotlib')
import matplotlib
import matplotlib.pyplot as plt
# if using a Jupyter notebook, include:
%matplotlib inline

# use Seaborn for visualisations
check_install_package('seaborn')
import seaborn as sns

# for handling GWAS data
import gwaslab as gl

Setting some functions and standard variables, as well as creating some directories.

In [2]:
# Create directories for the GWAS data and the reference data
import os
from subprocess import check_output

# set some general defaults
PGC_HITS = "PGC"

POPULATION = "EUR"

# general plotting directory
PLOTS_loc = "PLOTS"

# location to put molQTL results
molQTL_loc = "molQTL_results"

# Check if the directory exists
if not os.path.exists(molQTL_loc):
    # If it doesn't exist, create it
    os.makedirs(molQTL_loc)

# Check if the directory exists
if not os.path.exists(PLOTS_loc):
    # If it doesn't exist, create it
    os.makedirs(PLOTS_loc)

# # regional association plots directory
# REG_PLOTS_loc = PLOTS_loc + "/Regional_Association_Plots"

# # Check if the directory exists
# if not os.path.exists(REG_PLOTS_loc):
#     # If it doesn't exist, create it
#     os.makedirs(REG_PLOTS_loc)

# Reference data directory
REF_loc = "/Users/slaan3/PLINK/references"
print("Checking contents of the reference directory:")
print(check_output(["ls", os.path.join(REF_loc)]).decode("utf8"))

# GWAS data directory
GD_loc = "/Users/slaan3/Library/CloudStorage/GoogleDrive-s.w.vanderlaan@gmail.com/My Drive/Genomics/#Projects/TO_AITION/MR CVD MDD/GWAS"
print("Checking contents of the Google Drive directory:")
print(check_output(["ls", os.path.join(GD_loc)]).decode("utf8"))

# molQTL data directory
NOM_CIS_EQTL_loc = "/Users/slaan3/git/CirculatoryHealth/molqtl/results/version1_aernas1_firstrun/nom_cis_eqtl"
print("Checking contents of the nominal cis-eQTL data directory:")
print(check_output(["ls", os.path.join(NOM_CIS_EQTL_loc)]).decode("utf8"))

PERM_CIS_MQTL_loc = "/Users/slaan3/git/CirculatoryHealth/molqtl/results/perm_cis_mqtl"
print("Checking contents of the permuted cis-mQTL data directory:")
print(check_output(["ls", os.path.join(PERM_CIS_MQTL_loc)]).decode("utf8"))

PERM_TRANS_EQTL_loc = "/Users/slaan3/git/CirculatoryHealth/molqtl/results/version1_aernas1_firstrun/perm_trans_eqtl"
print("Checking contents of the permuted trans-eQTL data directory:")
print(check_output(["ls", os.path.join(PERM_TRANS_EQTL_loc)]).decode("utf8"))

PERM_TRANS_MQTL_loc = (
    "/Users/slaan3/git/CirculatoryHealth/molqtl/results/perm_trans_mqtl"
)
print("Checking contents of the permuted trans-mQTL data directory:")
print(check_output(["ls", os.path.join(PERM_TRANS_MQTL_loc)]).decode("utf8"))

Checking contents of the reference directory:
1000G
HRC_r1_1_2016
HRCr11_1000Gp3v5
dbSNP
tcga

Checking contents of the Google Drive directory:
CAC
CIMT
GIGASTROKE
MILLIONHEARTS
PGC

Checking contents of the nominal cis-eQTL data directory:
README.md
tensorqtl_cis_nominal_chr01.cis_qtl_pairs.chr01.parquet
tensorqtl_cis_nominal_chr02.cis_qtl_pairs.chr02.parquet
tensorqtl_cis_nominal_chr03.cis_qtl_pairs.chr03.parquet
tensorqtl_cis_nominal_chr04.cis_qtl_pairs.chr04.parquet
tensorqtl_cis_nominal_chr05.cis_qtl_pairs.chr05.parquet
tensorqtl_cis_nominal_chr06.cis_qtl_pairs.chr06.parquet
tensorqtl_cis_nominal_chr07.cis_qtl_pairs.chr07.parquet
tensorqtl_cis_nominal_chr08.cis_qtl_pairs.chr08.parquet
tensorqtl_cis_nominal_chr09.cis_qtl_pairs.chr09.parquet
tensorqtl_cis_nominal_chr10.cis_qtl_pairs.chr10.parquet
tensorqtl_cis_nominal_chr11.cis_qtl_pairs.chr11.parquet
tensorqtl_cis_nominal_c
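As a sketch of an alternative to the `os.path.exists` checks and the `ls` calls in the cell above, the standard library can create and list directories directly. This avoids a dependency on an external `ls` binary, whose output can carry terminal colour codes depending on the environment. The helper names `ensure_dir` and `list_dir` are hypothetical, and the demo runs in a temporary directory rather than on the real paths.

```python
import os
import tempfile


def ensure_dir(path):
    """Create the directory (and parents) if needed; return the path."""
    os.makedirs(path, exist_ok=True)
    return path


def list_dir(path):
    """Return the directory entries as a sorted list of names."""
    return sorted(os.listdir(path))


# Demo in a throw-away directory so nothing on disk is touched
with tempfile.TemporaryDirectory() as tmp:
    results_dir = ensure_dir(os.path.join(tmp, "molQTL_results"))
    plots_dir = ensure_dir(os.path.join(tmp, "PLOTS"))
    ensure_dir(results_dir)  # calling it again is harmless
    listing = list_dir(tmp)

print(listing)
```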

In [None]:
# Example function to get data

import os
import polars as pl
import pandas as pd


def merge_and_export(target_variants, sumstats, left_col, right_col, sort_column, output_csv):
    print("Merging target variants with summary statistics.")

    # Perform the join operation
    temp = target_variants.join(
        sumstats, left_on=left_col, right_on=right_col, how="inner"
    )

    print(
        f'Sorting the DataFrame by column "{sort_column}" in ascending order.')
    # Sort by the specified column (ascending, so the smallest values come first)
    result = temp.sort(sort_column)
    del temp

    print("Showing a preview of the DataFrame.")
    # Print the sorted DataFrame (polars truncates long frames)
    print(result)

    print("Exporting the Polars DataFrame to a CSV file.")
    # Export the Polars DataFrame to a CSV file
    result.write_csv(output_csv)


# For each set of target variants, call it as follows:
merge_and_export(target_variants_pgc, sumstats_nom_cis_eqtl, "VariantID", "VariantID",
                 "pval_nominal", os.path.join(molQTL_loc, "pgc_target_variants_nom_cis_eqtl.csv"))

## Loading data

Loading the different datasets.

In [3]:
# List the files in the PGC directory on the Google Drive

from subprocess import check_output

print(check_output(["ls", os.path.join(GD_loc, PGC_HITS)]).decode("utf8"))

PGC3_cojo_622.txt



# Targets

Here we load the list of genes and variants from the GWAS.

In [23]:
import polars as pl

# polars.read_excel(
# source: str | BytesIO | Path | BinaryIO | bytes,
# *,
# sheet_id: None = None,
# sheet_name: str,
# engine: Literal['xlsx2csv', 'openpyxl', 'pyxlsb'] | None = None,
# xlsx2csv_options: dict[str, Any] | None = None,
# read_csv_options: dict[str, Any] | None = None,
# schema_overrides: SchemaDict | None = None,
# raise_if_empty: bool = True,
# )

target_variants_pgc = pl.read_excel(
    source=os.path.join("targets/targets.xlsx"), sheet_name="Variants"
)

target_variants_millionhearts = pl.read_excel(
    source=os.path.join("targets/targets.xlsx"), sheet_name="CAD"
)

target_variants_gigastroke = pl.read_excel(
    source=os.path.join("targets/targets.xlsx"), sheet_name="GIGASTROKE_IS"
)

target_variants_cac = pl.read_excel(
    source=os.path.join("targets/targets.xlsx"), sheet_name="CAC"
)

target_variants_cimt = pl.read_excel(
    source=os.path.join("targets/targets.xlsx"), sheet_name="CIMT"
)

In [5]:
target_variants_pgc

RSID,ALTID,VariantID,Chr,BP,A1,A2,FRQ_A,FRQ_U,Locus.Index,SNP.Index,VariantType,Comments
str,str,str,i64,i64,str,str,f64,f64,i64,i64,str,str
"""---""","""---""","""1:8482078""",1,8482078,"""C""","""T""",0.43,0.43,1,1,"""single nucleot…","""PGC_MDD"""
"""---""","""---""","""1:16126676""",1,16126676,"""A""","""G""",0.335,0.334,2,1,"""single nucleot…","""PGC_MDD"""
"""---""","""---""","""1:18123443""",1,18123443,"""G""","""T""",0.141,0.145,3,1,"""single nucleot…","""PGC_MDD"""
"""---""","""---""","""1:32207581""",1,32207581,"""G""","""T""",0.182,0.176,4,1,"""single nucleot…","""PGC_MDD"""
"""---""","""---""","""1:35765084""",1,35765084,"""G""","""A""",0.0354,0.0362,5,1,"""single nucleot…","""PGC_MDD"""
"""---""","""---""","""1:37167333""",1,37167333,"""T""","""G""",0.31,0.306,6,1,"""single nucleot…","""PGC_MDD"""
"""---""","""---""","""1:37672580""",1,37672580,"""C""","""A""",0.291,0.288,7,1,"""single nucleot…","""PGC_MDD"""
"""---""","""---""","""1:38431554""",1,38431554,"""C""","""T""",0.274,0.275,8,1,"""single nucleot…","""PGC_MDD"""
"""---""","""---""","""1:41104664""",1,41104664,"""A""","""G""",0.475,0.475,9,1,"""single nucleot…","""PGC_MDD"""
"""---""","""---""","""1:50316592""",1,50316592,"""T""","""C""",0.222,0.219,10,1,"""single nucleot…","""PGC_MDD"""


In [6]:
target_variants_millionhearts

VariantID,MarkerName,CHR,BP,Allele1,Allele2,Freq1,FreqSE,MinFreq,MaxFreq,Effect,StdErr,P-value,Direction,HetISq,HetChiSq,HetDf,HetPVal,Cases,Effective_Cases,N,Meta_analysis,SNPID,Index
str,str,i64,i64,str,str,f64,f64,f64,f64,f64,f64,f64,str,f64,f64,i64,f64,i64,i64,i64,str,str,i64
"""1:2245570""","""1:2245570_C_G""",1,2245570,"""C""","""G""",0.28,0.0251,0.2305,0.3312,0.041638,0.005804,0.0,"""+++-+-+-+++""",16.6,11.985,10,0.286,173909,162434,1138550,"""Cardiogram""","""1:2245570:C:G""",1
"""1:2252205""","""1:2252205_C_T""",1,2252205,"""T""","""C""",0.1442,0.0169,0.1106,0.1746,0.046794,0.007375,0.0,"""+++++-+-+++""",0.0,7.516,10,0.676,174440,160990,1143770,"""Cardiogram""","""1:2252205:T:C""",2
"""1:2917460""","""1:2917460_C_G""",1,2917460,"""C""","""G""",0.8746,0.014,0.8548,0.9191,0.031473,0.007704,0.000044,"""++--+++-+-+""",19.4,12.411,10,0.2585,181522,177219,1165680,"""Cardiogram""","""1:2917460:C:G""",3
"""1:2985885""","""1:2985885_C_G""",1,2985885,"""C""","""G""",0.6118,0.0236,0.574,0.6404,-0.035136,0.005753,0.0,"""--????-?-?-""",0.0,3.435,4,0.4878,154611,138116,1100510,"""Cardiogram""","""1:2985885:C:G""",4
"""1:3325912""","""1:3325912_A_C""",1,3325912,"""A""","""C""",0.1404,0.0148,0.1207,0.207,0.04846,0.007234,0.0,"""+++++++-+++""",36.6,15.772,10,0.1064,176686,173914,1150900,"""Cardiogram""","""1:3325912:A:C""",5
"""1:26847640""","""1:26847640_C_T…",1,26847640,"""T""","""C""",0.2329,0.021,0.2023,0.2627,-0.025522,0.005827,0.000012,"""----+--+---""",0.0,8.778,10,0.5533,181522,180431,1165630,"""Cardiogram""","""1:26847640:T:C…",6
"""1:27284913""","""1:27284913_C_T…",1,27284913,"""T""","""C""",0.0237,0.0056,0.0133,0.0364,0.095092,0.01726,0.0,"""+++??++?+++""",20.5,8.806,7,0.2669,170977,166290,1138220,"""Cardiogram""","""1:27284913:T:C…",7
"""1:38461319""","""1:38461319_A_C…",1,38461319,"""A""","""C""",0.5403,0.0166,0.5081,0.5604,0.035656,0.005178,0.0,"""+++++-+-+-+""",16.0,11.903,10,0.2916,178954,159719,1157270,"""Cardiogram""","""1:38461319:A:C…",8
"""1:41809640""","""1:41809640_A_T…",1,41809640,"""A""","""T""",0.221,0.0089,0.192,0.2334,0.0188665,0.00592,0.001439,"""++-+-++++++""",0.0,7.665,10,0.6615,181522,180355,1165640,"""Cardiogram""","""1:41809640:A:T…",9
"""1:42946462""","""1:42946462_C_T…",1,42946462,"""T""","""C""",0.3721,0.015,0.3347,0.3901,-0.024887,0.00506,0.000001,"""----++-----""",0.0,9.064,10,0.5261,181522,181178,1165620,"""Cardiogram""","""1:42946462:T:C…",10


In [7]:
target_variants_gigastroke

Index,rsID,VariantID,Chr,BP,Minor_allele_1kG,Major_allele_1kG,MAF_AFR_1kG,MAF_EAS_1kG,MAF_EUR_1kG,MAF_SAS_1kG,MAF_HIS_1kG,Locus,Analysis,Phenotype,Ancestry,SNP_Type,GIGASTROKE_Confidence_level,chromosome,base_pair_location,effect_allele_frequency,beta,standard_error,p_value,odds_ratio,ci_lower,ci_upper,effect_allele,other_allele
i64,str,str,i64,i64,str,str,f64,f64,f64,str,f64,str,str,str,str,str,str,i64,i64,f64,f64,f64,f64,f64,f64,f64,str,str
1,"""rs2455132""","""1:3221083""",1,3221083,"""T""","""C""",0.09,0.27,0.25,"""0.380""",0.29,"""PRDM16""","""IVW (METAL)""","""SVS""","""CROSS-ANC""","""NOVEL""","""INTERMEDIATE""",1,3221083,0.2497,-0.1201,0.0249,0.0000014,0.886832,0.84459,0.931186,"""T""","""C"""
2,"""rs880315""","""1:10796866""",1,10796866,"""C""","""T""",0.16,0.64,0.36,"""0.400""",0.51,"""CASZ1""","""IVW (METAL)""","""AIS""","""CROSS-ANC""","""OLD_MEGASTROKE…","""HIGH """,1,10796866,0.6315,-0.0454,0.0076,1.9990e-9,0.955615,0.941486,0.969957,"""T""","""C"""
3,"""rs3790607""","""1:113053023""",1,113053023,"""C""","""A""",0.0068,0.34,0.08,"""0.090""",0.21,"""WNT2B""","""IVW (METAL)""","""AIS""","""CROSS-ANC""","""OLD_MEGASTROKE…","""HIGH """,1,113053023,0.8812,-0.0486,0.0117,0.000032,0.952562,0.930966,0.974659,"""A""","""C"""
4,"""rs2251636""","""1:156202809""",1,156202809,"""C""","""G""",0.5,0.33,0.36,"""0.350""",0.35,"""PMF1""","""IVW (METAL)""","""SVS""","""CROSS-ANC""","""OLD_MEGASTROKE…","""HIGH """,1,156202809,0.381,-0.0841,0.0203,0.000036,0.919339,0.883479,0.956655,"""C""","""G"""
5,"""rs680084""","""1:170628255""",1,170628255,"""G""","""A""",0.49,0.62,0.44,"""0.260""",0.37,"""PRRX1""","""IVW (METAL)""","""CES""","""CROSS-ANC""","""NOVEL""","""INTERMEDIATE""",1,170628255,0.5427,-0.0846,0.0161,1.4640e-7,0.91888,0.890336,0.948338,"""A""","""G"""
6,"""rs2877984""","""1:183090497""",1,183090497,"""A""","""G""",0.56,0.37,0.44,"""0.400""",0.37,"""LAMC1""","""IVW (METAL)""","""AS""","""CROSS-ANC""","""NOVEL""","""HIGH """,1,183090497,0.4502,-0.0262,0.0065,0.000052,0.97414,0.961808,0.98663,"""A""","""G"""
7,"""rs11694327""","""2:26919429""",2,26919429,"""C""","""T""",0.03,0.7,0.22,"""0.290""",0.14,"""KCNK3""","""IVW (METAL)""","""AIS""","""CROSS-ANC""","""OLD_MEGASTROKE…","""HIGH """,2,26919429,0.758,-0.0331,0.0088,0.0001723,0.967442,0.950898,0.984273,"""T""","""C"""
8,"""rs6722806""","""2:43627715""",2,43627715,"""T""","""A""",0.49,0.03,0.36,"""0.330""",0.24,"""THADA""","""IVW (METAL)""","""AS""","""CROSS-ANC""","""NOVEL""","""INTERMEDIATE""",2,43627715,0.6481,0.0357,0.0068,1.7450e-7,1.036345,1.022624,1.05025,"""A""","""T"""
9,"""rs11691032""","""2:164788513""",2,164788513,"""C""","""G""",0.19,0.19,0.33,"""0.260""",0.21,"""FIGN""","""IVW (METAL)""","""AS""","""CROSS-ANC""","""NOVEL""","""INTERMEDIATE""",2,164788513,0.339,0.0283,0.0068,0.000032,1.028704,1.015085,1.042507,"""C""","""G"""
10,"""rs2351524""","""2:203880992""",2,203880992,"""T""","""C""",0.05,0.02,0.13,"""0.010""",0.08,"""NBEAL1""","""IVW (METAL)""","""AIS""","""CROSS-ANC""","""OLD_LAC_STROKE…","""HIGH """,2,203880992,0.1244,-0.0685,0.0109,3.3740e-10,0.933793,0.914055,0.953958,"""T""","""C"""


In [8]:
target_variants_cac

rsID,Chr,Pos_hg19,Effect_Allele,Other_Allele,EAF,Effect,SE,P_meta,I2,P_het,Nearest_gene,Annotation,SNP_type
str,i64,i64,str,str,f64,str,f64,f64,f64,f64,str,str,str
"""rs3844006""",6,132095002,"""T""","""C""",0.221,"""−0.114""",0.02,0.0,0.5,0.453,"""miR-548h-5(dis…","""intergenic""","""NOVEL"""
"""rs2854746""",7,45960645,"""C""","""G""",0.414,"""0.11""",0.018,0.0,0.0,0.76,"""IGFBP3""","""missense""","""NOVEL"""
"""rs10899970""",10,44515716,"""A""","""G""",0.474,"""0.095000""",0.017,0.0,0.0,0.94,"""AL512640.1(dis…","""intergenic""","""NOVEL"""
"""rs9633535""",10,63836088,"""T""","""C""",0.371,"""0.098000""",0.018,0.0,4.0,0.407,"""ARID5B""","""intronic""","""NOVEL"""
"""rs10762577""",10,75917431,"""A""","""G""",0.258,"""−0.107""",0.019,0.0,0.0,0.683,"""ADK""","""intronic""","""NOVEL"""
"""rs11063120""",12,4486618,"""A""","""G""",0.303,"""−0.133""",0.022,0.0,56.0,0.001,"""FGF23""","""intronic""","""NOVEL"""
"""rs9515203""",13,111049623,"""T""","""C""",0.732,"""0.123""",0.022,0.0,0.0,0.744,"""COL4A1(dist: 9…","""intronic""","""NOVEL"""
"""rs7182103""",15,79123946,"""T""","""G""",0.575,"""0.112""",0.017,0.0,4.1,0.405,"""ADAMTS7(dist.:…","""intronic""","""NOVEL"""
"""rs10456561""",6,12887465,"""A""","""G""",0.036,"""0.375""",0.069,0.0,14.7,0.292,"""PHACTR1""","""intronic""","""KNOWN"""
"""rs35355695""",6,12891103,"""T""","""G""",0.256,"""−0.115""",0.02,0.0,5.2,0.39,"""PHACTR1""","""intronic""","""KNOWN"""


In [9]:
target_variants_cimt

RSID,CHR,BP,Effect_Allele,Other_Allele,EAF,BETA,SE,p,N,Nearest coding gene,SNP_type,PHENOTYPE,dbSNP,Note
str,i64,i64,str,str,f64,str,f64,str,i64,str,str,str,str,str
"""rs201648240""",1,208953176,"""A""","""AA""",0.83,"""−0.0062""",0.0011,"""4e10-9""",54752,"""LINC01717""","""NOVEL""","""CIMT""","""https://www.nc…","""INDEL; alleles…"
"""rs224904""",5,81637916,"""C""","""G""",0.95,"""−0.0088""",0.0016,"""5e10-8""",68962,"""ATP6AP1L""","""NOVEL""","""CIMT""",,
"""rs6907215""",6,143608968,"""T""","""C""",0.6,"""−0.0040""",0.0007,"""5e10-8""",64586,"""AIG1""","""NOVEL""","""CIMT""",,
"""rs13225723""",7,106416467,"""A""","""G""",0.22,"""0.005200""",0.0009,"""3e10-9""",68070,"""PIK3CG""","""NOVEL""","""CIMT""",,
"""rs2912063""",8,6486033,"""A""","""G""",0.71,"""0.004500""",0.0008,"""9e10-9""",67401,"""MCPH1""","""NOVEL""","""CIMT""",,
"""rs11785239""",8,8205010,"""T""","""C""",0.65,"""−0.0043""",0.0008,"""9e10-9""",67107,"""SGK223""","""NOVEL""","""CIMT""",,
"""rs11196033""",10,114410998,"""A""","""C""",0.48,"""0.004200""",0.0008,"""4e10-8""",57995,"""VTI1A""","""NOVEL""","""CIMT""",,
"""rs844396""",16,88966667,"""T""","""C""",0.3,"""−0.0051""",0.0009,"""6e10-9""",50377,"""CBFA2T3""","""NOVEL""","""CIMT""",,
"""rs200495339""",19,11189298,"""GG""","""GGG""",0.11,"""−0.1023""",0.0179,"""1e10-8""",36569,"""LDLR""","""NOVEL""","""PLAQUE""","""https://www.nc…","""INDEL; alleles…"
"""rs148147734""",8,123401537,"""GGGG""","""GGGGG""",0.54,"""0.005000""",0.0007,"""3e10-11""",58141,"""ZHX2""","""KNOWN""","""CIMT""","""https://www.nc…","""INDEL; alleles…"

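In the CAC and cIMT tables above, some numeric columns were read as strings: the effect sizes use a Unicode minus sign (`−`, U+2212), and the cIMT p-values are written like `4e10-9`. A minimal sketch of how such values could be converted to floats follows. It assumes that `4e10-9` is shorthand for 4 × 10⁻⁹, which is an interpretation and should be checked against the source publication; the function names are hypothetical.

```python
def parse_signed(text):
    """Convert a number that may use a Unicode minus (U+2212) to float."""
    return float(text.replace("\u2212", "-"))


def parse_pvalue(text):
    """Convert strings such as '4e10-9' to float.

    Assumption: '4e10-9' means 4 x 10^-9. Strings without the 'e10'
    pattern followed by an exponent are passed to float() unchanged.
    """
    mantissa, sep, exponent = text.partition("e10")
    if sep and exponent:
        return float(f"{mantissa}e{exponent}")
    return float(text)


# Values copied from the tables above
effects = [parse_signed(x) for x in ["\u22120.114", "0.11", "0.095000"]]
pvalues = [parse_pvalue(x) for x in ["4e10-9", "5e10-8", "3e10-11"]]

print(effects)
print(pvalues)
```

The same functions could be applied to the corresponding DataFrame columns before any numeric filtering or sorting.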

# cis-eQTL (nominal)

Here we load the nominal _cis_-acting eQTL data.

In [10]:
# read in data
# https://stackoverflow.com/questions/33813815/how-to-read-a-parquet-file-into-pandas-dataframe

import polars as pl

# annotated cis-eQTLs
sumstats_nom_cis_eqtl = pl.read_parquet(
    source=os.path.join(
        NOM_CIS_EQTL_loc, "tensorqtl_nominal_cis_qtl_pairs.annot.parquet"
    )
)

In [11]:
sumstats_nom_cis_eqtl

EnsemblID,VariantID,tss_distance,CAF_eQTL,ma_samples,ma_count,pval_nominal,Beta,SE,chromosome,position,OtherAlleleA,CodedAlleleB,CAF,AltID,Source,AverageMaximumPosteriorCall,Info,AA_N,AB_N,BB_N,TotalN,MAF,MissingDataProportion,HWE_P
str,str,i32,f32,i32,i32,f64,f32,f32,i64,i64,str,str,f64,str,str,f64,f64,f64,f64,f64,i64,f64,f64,f64
"""ENSG0000018763…","""1:693731""",-240519,0.134185,153,168,0.881492,0.005953,0.039917,1,693731,"""A""","""G""",0.137241,"""rs12238997""","""HRCr11""",0.902046,0.624501,1637.01,451.103,35.569,2124,0.122957,0.000075,0.544841
"""ENSG0000018763…","""1:714596""",-219654,0.038339,48,48,0.834746,-0.015369,0.073635,1,714596,"""T""","""C""",0.0327213,"""rs149887893""","""HRCr11""",0.968278,0.577564,1987.73,135.495,0.677,2124,0.0322165,0.000024,0.272504
"""ENSG0000018763…","""1:715367""",-218883,0.038339,48,48,0.834746,-0.015369,0.073635,1,715367,"""A""","""G""",0.0324859,"""rs12184277""","""HRCr11""",0.975207,0.67196,1976.85,146.512,0.556,2124,0.0347528,0.000019,0.176551
"""ENSG0000018763…","""1:717485""",-216765,0.038339,48,48,0.834746,-0.015369,0.073635,1,717485,"""C""","""A""",0.0324859,"""rs12184279""","""HRCr11""",0.975183,0.670147,1978.35,145.041,0.517,2124,0.0343883,0.000022,0.174817
"""ENSG0000018763…","""1:720381""",-213869,0.038339,48,48,0.834746,-0.015369,0.073635,1,720381,"""G""","""T""",0.0327213,"""rs116801199""","""HRCr11""",0.974476,0.667536,1973.72,149.503,0.691,2124,0.0355205,0.00002,0.110452
"""ENSG0000018763…","""1:721290""",-212960,0.038339,48,48,0.834746,-0.015369,0.073635,1,721290,"""G""","""C""",0.0327213,"""rs12565286""","""HRCr11""",0.975168,0.677386,1972.67,150.564,0.687,2124,0.0357682,0.000017,0.110447
"""ENSG0000018763…","""1:726794""",-207456,0.038339,48,48,0.834746,-0.015369,0.073635,1,726794,"""C""","""G""",0.0324859,"""rs28454925""","""HRCr11""",0.977817,0.70519,1977.3,146.099,0.547,2124,0.0346509,0.000013,0.176493
"""ENSG0000018763…","""1:729632""",-204618,0.038339,48,48,0.834746,-0.015369,0.073635,1,729632,"""C""","""T""",0.0324859,"""rs116720794""","""HRCr11""",0.978581,0.715529,1976.76,146.752,0.454,2124,0.0347604,0.000008,0.176551
"""ENSG0000018763…","""1:729679""",-204571,0.829073,195,214,0.678049,0.015322,0.03689,1,729679,"""C""","""G""",0.829331,"""rs4951859""","""HRCr11""",0.936062,0.78861,58.204,600.893,1464.61,2124,0.16888,0.00007,0.757157
"""ENSG0000018763…","""1:730087""",-204163,0.010383,13,13,0.068513,-0.24805,0.135916,1,730087,"""T""","""C""",0.0216573,"""rs148120343""","""HRCr11""",0.919602,0.439455,1902.75,213.581,7.655,2124,0.0538824,0.000003,0.665385


## PGC

GWAS on MDD: major depressive disorder.

In [12]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with nominal cis-eQTLs.")
# Perform the join operation
temp = target_variants_pgc.join(
    sumstats_nom_cis_eqtl, left_on="VariantID", right_on="VariantID", how="inner"
)
print('Sorting the DataFrame by column "pval_nominal" in ascending order.')
# Sort by nominal p-value; polars sorts ascending by default, so the strongest associations come first
target_variants_pgc_nom_cis_eqtl = temp.sort("pval_nominal")
del temp

print("Showing a preview of the DataFrame.")
# Print the sorted DataFrame (polars shows a truncated preview)
print(target_variants_pgc_nom_cis_eqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_pgc_nom_cis_eqtl.write_csv(
    os.path.join(molQTL_loc, "pgc_target_variants_nom_cis_eqtl.csv")
)

Merging target variants with nominal cis-eQTLs.
Sorting the DataFrame by column "pval_nominal" in ascending order.
Showing a preview of the DataFrame.
shape: (5_822, 37)
┌──────┬───────┬─────────────┬─────┬───┬────────┬───────────┬───────────────────────┬───────────┐
│ RSID ┆ ALTID ┆ VariantID   ┆ Chr ┆ … ┆ TotalN ┆ MAF       ┆ MissingDataProportion ┆ HWE_P     │
│ ---  ┆ ---   ┆ ---         ┆ --- ┆   ┆ ---    ┆ ---       ┆ ---                   ┆ ---       │
│ str  ┆ str   ┆ str         ┆ i64 ┆   ┆ i64    ┆ f64       ┆ f64                   ┆ f64       │
╞══════╪═══════╪═════════════╪═════╪═══╪════════╪═══════════╪═══════════════════════╪═══════════╡
│ ---  ┆ ---   ┆ 11:73525881 ┆ 11  ┆ … ┆ 2124   ┆ 0.445842  ┆ 0.000016              ┆ 0.272289  │
│ ---  ┆ ---   ┆ 6:29659124  ┆ 6   ┆ … ┆ 2124   ┆ 0.345331  ┆ 0.000009              ┆ 0.564475  │
│ ---  ┆ ---   ┆ 11:61571348 ┆ 11  ┆ … ┆ 2124   ┆ 0.296173  ┆ 7.0622e-7             ┆ 0.131178  │
│ ---  ┆ ---   ┆ 1:67500894  ┆ 1   ┆ …

## Million Hearts

GWAS on CAD: coronary artery disease.

In [13]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with nominal cis-eQTLs.")
# Perform the join operation
temp = target_variants_millionhearts.join(
    sumstats_nom_cis_eqtl, left_on="VariantID", right_on="VariantID", how="inner"
)
print('Sorting the DataFrame by column "pval_nominal" in ascending order.')
# Sort by nominal p-value; polars sorts ascending by default, so the strongest associations come first
target_variants_millionhearts_nom_cis_eqtl = temp.sort("pval_nominal")
del temp

print("Showing a preview of the DataFrame.")
# Print the sorted DataFrame (polars shows a truncated preview)
print(target_variants_millionhearts_nom_cis_eqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_millionhearts_nom_cis_eqtl.write_csv(
    os.path.join(molQTL_loc, "millionhearts_target_variants_nom_cis_eqtl.csv")
)

Merging target variants with nominal cis-eQTLs.
Sorting the DataFrame by column "pval_nominal" in ascending order.
Showing a preview of the DataFrame.
shape: (9_523, 48)
┌─────────────┬────────────────┬─────┬──────────┬───┬────────┬──────────┬───────────────┬──────────┐
│ VariantID   ┆ MarkerName     ┆ CHR ┆ BP       ┆ … ┆ TotalN ┆ MAF      ┆ MissingDataPr ┆ HWE_P    │
│ ---         ┆ ---            ┆ --- ┆ ---      ┆   ┆ ---    ┆ ---      ┆ oportion      ┆ ---      │
│ str         ┆ str            ┆ i64 ┆ i64      ┆   ┆ i64    ┆ f64      ┆ ---           ┆ f64      │
│             ┆                ┆     ┆          ┆   ┆        ┆          ┆ f64           ┆          │
╞═════════════╪════════════════╪═════╪══════════╪═══╪════════╪══════════╪═══════════════╪══════════╡
│ 6:31888367  ┆ 6:31888367_C_T ┆ 6   ┆ 31888367 ┆ … ┆ 2124   ┆ 0.134741 ┆ 0.000002      ┆ 0.512893 │
│ 7:6486067   ┆ 7:6486067_C_T  ┆ 7   ┆ 6486067  ┆ … ┆ 2124   ┆ 0.210286 ┆ 0.000004      ┆ 0.556702 │
│ 7:6446027   

## GIGASTROKE

All loci together.

In [14]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with nominal cis-eQTLs.")
# Perform the join operation
temp = target_variants_gigastroke.join(
    sumstats_nom_cis_eqtl, left_on="VariantID", right_on="VariantID", how="inner"
)
print('Sorting the DataFrame by column "pval_nominal" in ascending order.')
# Sort by nominal p-value; polars sorts ascending by default, so the strongest associations come first
target_variants_gigastroke_nom_cis_eqtl = temp.sort("pval_nominal")
del temp

print("Showing a preview of the DataFrame.")
# Print the sorted DataFrame (polars shows a truncated preview)
print(target_variants_gigastroke_nom_cis_eqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_gigastroke_nom_cis_eqtl.write_csv(
    os.path.join(molQTL_loc, "gigastroke_target_variants_nom_cis_eqtl.csv")
)

Merging target variants with nominal cis-eQTLs.
Sorting the DataFrame by column "pval_nominal" in ascending order.
Showing a preview of the DataFrame.
shape: (1_323, 53)
┌───────┬────────────┬─────────────┬─────┬───┬────────┬───────────┬─────────────────────┬──────────┐
│ Index ┆ rsID       ┆ VariantID   ┆ Chr ┆ … ┆ TotalN ┆ MAF       ┆ MissingDataProporti ┆ HWE_P    │
│ ---   ┆ ---        ┆ ---         ┆ --- ┆   ┆ ---    ┆ ---       ┆ on                  ┆ ---      │
│ i64   ┆ str        ┆ str         ┆ i64 ┆   ┆ i64    ┆ f64       ┆ ---                 ┆ f64      │
│       ┆            ┆             ┆     ┆   ┆        ┆           ┆ f64                 ┆          │
╞═══════╪════════════╪═════════════╪═════╪═══╪════════╪═══════════╪═════════════════════╪══════════╡
│ 58    ┆ rs28860769 ┆ 19:10737581 ┆ 19  ┆ … ┆ 2124   ┆ 0.189858  ┆ 0.000011            ┆ 0.943753 │
│ 19    ┆ rs36229526 ┆ 6:32820656  ┆ 6   ┆ … ┆ 2124   ┆ 0.0777784 ┆ 7.0622e-7           ┆ 0.87893  │
│ 29    ┆ rs15

## CAC

In [15]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with nominal cis-eQTLs.")
# Perform the join operation
temp = target_variants_cac.join(
    sumstats_nom_cis_eqtl, left_on="rsID", right_on="AltID", how="inner"
)
print('Sorting the DataFrame by column "pval_nominal" in ascending order.')
# Sort by nominal p-value; polars sorts ascending by default, so the strongest associations come first
target_variants_cac_nom_cis_eqtl = temp.sort("pval_nominal")
del temp

print("Showing a preview of the DataFrame.")
# Print the sorted DataFrame (polars shows a truncated preview)
print(target_variants_cac_nom_cis_eqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_cac_nom_cis_eqtl.write_csv(
    os.path.join(molQTL_loc, "cac_target_variants_nom_cis_eqtl.csv")
)

Merging target variants with nominal cis-eQTLs.
Sorting the DataFrame by column "pval_nominal" in ascending order.
Showing a preview of the DataFrame.
shape: (176, 38)
┌────────────┬─────┬───────────┬───────────────┬───┬────────┬───────────┬───────────────┬──────────┐
│ rsID       ┆ Chr ┆ Pos_hg19  ┆ Effect_Allele ┆ … ┆ TotalN ┆ MAF       ┆ MissingDataPr ┆ HWE_P    │
│ ---        ┆ --- ┆ ---       ┆ ---           ┆   ┆ ---    ┆ ---       ┆ oportion      ┆ ---      │
│ str        ┆ i64 ┆ i64       ┆ str           ┆   ┆ i64    ┆ f64       ┆ ---           ┆ f64      │
│            ┆     ┆           ┆               ┆   ┆        ┆           ┆ f64           ┆          │
╞════════════╪═════╪═══════════╪═══════════════╪═══╪════════╪═══════════╪═══════════════╪══════════╡
│ rs9349379  ┆ 6   ┆ 12903957  ┆ A             ┆ … ┆ 2124   ┆ 0.365801  ┆ 0.000003      ┆ 0.708139 │
│ rs7412     ┆ 19  ┆ 45412079  ┆ T             ┆ … ┆ 2124   ┆ 0.0738128 ┆ 0.000008      ┆ 0.42312  │
│ rs7412     ┆ 1

## CIMT

All cIMT and Plaque loci together.

In [16]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with nominal cis-eQTLs.")
# Perform the join operation
temp = target_variants_cimt.join(
    sumstats_nom_cis_eqtl, left_on="RSID", right_on="AltID", how="inner"
)
print('Sorting the DataFrame by column "pval_nominal" in ascending order.')
# Sort by nominal p-value; polars sorts ascending by default, so the strongest associations come first
target_variants_cimt_nom_cis_eqtl = temp.sort("pval_nominal")
del temp

print("Showing a preview of the DataFrame.")
# Print the sorted DataFrame (polars shows a truncated preview)
print(target_variants_cimt_nom_cis_eqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_cimt_nom_cis_eqtl.write_csv(
    os.path.join(molQTL_loc, "cimt_target_variants_nom_cis_eqtl.csv")
)

Merging target variants with nominal cis-eQTLs.
Sorting the DataFrame by column "pval_nominal" in ascending order.
Showing a preview of the DataFrame.
shape: (232, 39)
┌─────────────┬─────┬───────────┬───────────────┬───┬────────┬───────────┬──────────────┬──────────┐
│ RSID        ┆ CHR ┆ BP        ┆ Effect_Allele ┆ … ┆ TotalN ┆ MAF       ┆ MissingDataP ┆ HWE_P    │
│ ---         ┆ --- ┆ ---       ┆ ---           ┆   ┆ ---    ┆ ---       ┆ roportion    ┆ ---      │
│ str         ┆ i64 ┆ i64       ┆ str           ┆   ┆ i64    ┆ f64       ┆ ---          ┆ f64      │
│             ┆     ┆           ┆               ┆   ┆        ┆           ┆ f64          ┆          │
╞═════════════╪═════╪═══════════╪═══════════════╪═══╪════════╪═══════════╪══════════════╪══════════╡
│ rs17477177  ┆ 7   ┆ 106411858 ┆ T             ┆ … ┆ 2124   ┆ 0.221096  ┆ 0.000004     ┆ 0.528212 │
│ rs13225723  ┆ 7   ┆ 106416467 ┆ A             ┆ … ┆ 2124   ┆ 0.223969  ┆ 0.000008     ┆ 0.573563 │
│ rs7412      ┆ 

In [17]:
del sumstats_nom_cis_eqtl

# cis-mQTL (permuted)

Here we load the permuted _cis_-acting mQTL data.

In [18]:
# read in data
import polars as pl

# Specify the file path to your data
file_path = os.path.join(PERM_CIS_MQTL_loc, "tensormqtl.perm_cis_mqtl.txt")

# Read the data into a Polars DataFrame
sumstats_perm_cis_mqtl = pl.read_csv(
    file_path, has_header=True, separator="\t", ignore_errors=True
)

In [19]:
# Display the first few rows of the DataFrame to verify the data loading
sumstats_perm_cis_mqtl

| phenotype_id (str) | num_var (i64) | beta_shape1 (f64) | beta_shape2 (f64) | true_df (f64) | pval_true_df (f64) | variant_id (str) | tss_distance (i64) | ma_samples (i64) | ma_count (i64) | af (f64) | pval_nominal (f64) | slope (f64) | slope_se (f64) | pval_perm (f64) | pval_beta (f64) | qval (f64) | pval_nominal_threshold (f64) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| cg21870274 | 309 | 1.01037 | 15.0882 | 351.533 | 0.0495503 | 1:752307 | 682715 | 15 | 15 | 0.0170068 | 0.036876 | -0.406806 | 0.194254 | 0.531247 | 0.530703 | 0.468602 | 0.001232 |
| cg08258224 | 731 | 1.03658 | 40.9219 | 342.718 | 0.101636 | 1:1636400 | 836316 | 37 | 38 | 0.0430839 | 0.0780652 | -0.160865 | 0.0910595 | 0.984102 | 0.986523 | 0.605861 | 0.000509 |
| cg18147296 | 750 | 1.02594 | 43.8044 | 350.307 | 0.000046 | 1:771410 | -41130 | 130 | 141 | 0.840136 | 0.000014 | -0.195278 | 0.0444456 | 0.0017 | 0.001693 | 0.006203 | 0.000454 |
| cg13938959 | 787 | 1.00503 | 41.991 | 346.635 | 0.0160055 | 1:800193 | -33991 | 7 | 7 | 0.007937 | 0.009935 | 0.568674 | 0.219518 | 0.49525 | 0.48971 | 0.45302 | 0.000433 |
| cg12445832 | 787 | 1.01013 | 42.2714 | 344.888 | 0.003059 | 1:1706886 | 872590 | 44 | 46 | 0.0521542 | 0.001483 | -0.25942 | 0.08106 | 0.118388 | 0.118428 | 0.211842 | 0.000439 |
| cg23999112 | 787 | 1.03227 | 42.5622 | 342.63 | 0.0111282 | 1:810286 | -24071 | 114 | 124 | 0.14059 | 0.006277 | 0.168039 | 0.0611592 | 0.364464 | 0.363523 | 0.395055 | 0.00048 |
| cg08128007 | 799 | 1.01323 | 45.7034 | 353.328 | 0.025363 | 1:1037047 | 197611 | 177 | 207 | 0.234694 | 0.0177802 | 0.174208 | 0.0731943 | 0.686431 | 0.685759 | 0.522438 | 0.000412 |
| cg23733394 | 799 | 1.02275 | 42.9169 | 345.861 | 0.000206 | 1:753405 | -86348 | 108 | 118 | 0.866213 | 0.00007 | -0.393622 | 0.0979262 | 0.006899 | 0.007824 | 0.0248428 | 0.000457 |
| cg13371836 | 800 | 0.994356 | 47.1897 | 362.606 | 0.0186669 | 1:1830151 | 989876 | 125 | 141 | 0.159864 | 0.0138438 | -0.193078 | 0.0780974 | 0.588241 | 0.591562 | 0.491116 | 0.000367 |
| cg04407431 | 801 | 1.01772 | 44.621 | 351.137 | 0.023597 | 1:768448 | -72171 | 34 | 37 | 0.0419501 | 0.0160789 | -0.358302 | 0.148214 | 0.650535 | 0.648136 | 0.51084 | 0.00043 |


## PGC

In [20]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted cis-mQTLs.")
# Inner join: keep only target variants present in the permuted cis-mQTL results
temp = target_variants_pgc.join(
    sumstats_perm_cis_mqtl, left_on="VariantID", right_on="variant_id", how="inner"
)
print('Sorting the DataFrame by column "pval_nominal" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_pgc_perm_cis_mqtl = temp.sort("pval_nominal")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_pgc_perm_cis_mqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_pgc_perm_cis_mqtl.write_csv(
    os.path.join(molQTL_loc, "pgc_target_variants_perm_cis_mqtl.csv")
)

Merging target variants with permuted cis-mQTLs.
Sorting the DataFrame by column "pval_nominal" in ascending order.
Showing the sorted DataFrame.
shape: (28, 30)
┌──────┬───────┬──────────────┬─────┬───┬───────────┬────────────┬────────────┬────────────────────┐
│ RSID ┆ ALTID ┆ VariantID    ┆ Chr ┆ … ┆ pval_perm ┆ pval_beta  ┆ qval       ┆ pval_nominal_thres │
│ ---  ┆ ---   ┆ ---          ┆ --- ┆   ┆ ---       ┆ ---        ┆ ---        ┆ hold               │
│ str  ┆ str   ┆ str          ┆ i64 ┆   ┆ f64       ┆ f64        ┆ f64        ┆ ---                │
│      ┆       ┆              ┆     ┆   ┆           ┆            ┆            ┆ f64                │
╞══════╪═══════╪══════════════╪═════╪═══╪═══════════╪════════════╪════════════╪════════════════════╡
│ ---  ┆ ---   ┆ 12:121207147 ┆ 12  ┆ … ┆ 0.0001    ┆ 3.7327e-26 ┆ 9.3753e-25 ┆ 0.000077           │
│ ---  ┆ ---   ┆ 7:2760750    ┆ 7   ┆ … ┆ 0.0001    ┆ 9.0450e-13 ┆ 7.9183e-12 ┆ 0.000039           │
│ ---  ┆ ---   ┆ 

## Million Hearts

In [21]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted cis-mQTLs.")
# Inner join: keep only target variants present in the permuted cis-mQTL results
temp = target_variants_millionhearts.join(
    sumstats_perm_cis_mqtl, left_on="VariantID", right_on="variant_id", how="inner"
)
print('Sorting the DataFrame by column "pval_nominal" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_millionhearts_perm_cis_mqtl = temp.sort("pval_nominal")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_millionhearts_perm_cis_mqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_millionhearts_perm_cis_mqtl.write_csv(
    os.path.join(molQTL_loc, "millionhearts_target_variants_perm_cis_mqtl.csv")
)

Merging target variants with permuted cis-mQTLs.
Sorting the DataFrame by column "pval_nominal" in ascending order.
Showing the sorted DataFrame.
shape: (83, 41)
┌────────────┬────────────┬─────┬───────────┬───┬───────────┬────────────┬────────────┬────────────┐
│ VariantID  ┆ MarkerName ┆ CHR ┆ BP        ┆ … ┆ pval_perm ┆ pval_beta  ┆ qval       ┆ pval_nomin │
│ ---        ┆ ---        ┆ --- ┆ ---       ┆   ┆ ---       ┆ ---        ┆ ---        ┆ al_thresho │
│ str        ┆ str        ┆ i64 ┆ i64       ┆   ┆ f64       ┆ f64        ┆ f64        ┆ ld         │
│            ┆            ┆     ┆           ┆   ┆           ┆            ┆            ┆ ---        │
│            ┆            ┆     ┆           ┆   ┆           ┆            ┆            ┆ f64        │
╞════════════╪════════════╪═════╪═══════════╪═══╪═══════════╪════════════╪════════════╪════════════╡
│ 1:10981719 ┆ 1:10981719 ┆ 1   ┆ 109817192 ┆ … ┆ 0.0001    ┆ 2.4600e-10 ┆ 1.4757e-10 ┆ 0.000055   │
│ 2          ┆ 2_

## GIGASTROKE

In [22]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted cis-mQTLs.")
# Inner join: keep only target variants present in the permuted cis-mQTL results
temp = target_variants_gigastroke.join(
    sumstats_perm_cis_mqtl, left_on="VariantID", right_on="variant_id", how="inner"
)
print('Sorting the DataFrame by column "pval_nominal" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_gigastroke_perm_cis_mqtl = temp.sort("pval_nominal")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_gigastroke_perm_cis_mqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_gigastroke_perm_cis_mqtl.write_csv(
    os.path.join(molQTL_loc, "gigastroke_target_variants_perm_cis_mqtl.csv")
)

Merging target variants with permuted cis-mQTLs.
Sorting the DataFrame by column "pval_nominal" in ascending order.
Showing the sorted DataFrame.
shape: (10, 46)
┌───────┬────────────┬──────────────┬─────┬───┬───────────┬────────────┬────────────┬──────────────┐
│ Index ┆ rsID       ┆ VariantID    ┆ Chr ┆ … ┆ pval_perm ┆ pval_beta  ┆ qval       ┆ pval_nominal │
│ ---   ┆ ---        ┆ ---          ┆ --- ┆   ┆ ---       ┆ ---        ┆ ---        ┆ _threshold   │
│ i64   ┆ str        ┆ str          ┆ i64 ┆   ┆ f64       ┆ f64        ┆ f64        ┆ ---          │
│       ┆            ┆              ┆     ┆   ┆           ┆            ┆            ┆ f64          │
╞═══════╪════════════╪══════════════╪═════╪═══╪═══════════╪════════════╪════════════╪══════════════╡
│ 23    ┆ rs2107595  ┆ 7:19049388   ┆ 7   ┆ … ┆ 0.0001    ┆ 4.6910e-18 ┆ 5.9184e-17 ┆ 0.000047     │
│ 85    ┆ rs7500448  ┆ 16:83045790  ┆ 16  ┆ … ┆ 0.0001    ┆ 7.2634e-13 ┆ 7.3576e-12 ┆ 0.000022     │
│ 55    ┆ rs12445

## CAC

In [25]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted cis-mQTLs.")
# Inner join: keep only target variants present in the permuted cis-mQTL results
temp = target_variants_cac.join(
    sumstats_perm_cis_mqtl, left_on="VariantID", right_on="variant_id", how="inner"
)
print('Sorting the DataFrame by column "pval_nominal" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_cac_perm_cis_mqtl = temp.sort("pval_nominal")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_cac_perm_cis_mqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_cac_perm_cis_mqtl.write_csv(
    os.path.join(molQTL_loc, "cac_target_variants_perm_cis_mqtl.csv")
)

Merging target variants with permuted cis-mQTLs.
Sorting the DataFrame by column "pval_nominal" in ascending order.
Showing the sorted DataFrame.
shape: (5, 32)
┌────────────┬────────────┬─────┬──────────┬───┬───────────┬───────────┬───────────┬───────────────┐
│ rsID       ┆ VariantID  ┆ Chr ┆ Pos_hg19 ┆ … ┆ pval_perm ┆ pval_beta ┆ qval      ┆ pval_nominal_ │
│ ---        ┆ ---        ┆ --- ┆ ---      ┆   ┆ ---       ┆ ---       ┆ ---       ┆ threshold     │
│ str        ┆ str        ┆ i64 ┆ i64      ┆   ┆ f64       ┆ f64       ┆ f64       ┆ ---           │
│            ┆            ┆     ┆          ┆   ┆           ┆           ┆           ┆ f64           │
╞════════════╪════════════╪═════╪══════════╪═══╪═══════════╪═══════════╪═══════════╪═══════════════╡
│ rs2854746  ┆ 7:45960645 ┆ 7   ┆ 45960645 ┆ … ┆ 0.0001    ┆ 5.6344e-8 ┆ 3.3670e-7 ┆ 0.000067      │
│ rs2854746  ┆ 7:45960645 ┆ 7   ┆ 45960645 ┆ … ┆ 0.0001    ┆ 0.000001  ┆ 0.000007  ┆ 0.000064      │
│ rs2854746  ┆ 7:4

## CIMT

In [26]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted cis-mQTLs.")
# Inner join: keep only target variants present in the permuted cis-mQTL results
temp = target_variants_cimt.join(
    sumstats_perm_cis_mqtl, left_on="VariantID", right_on="variant_id", how="inner"
)
print('Sorting the DataFrame by column "pval_nominal" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_cimt_perm_cis_mqtl = temp.sort("pval_nominal")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_cimt_perm_cis_mqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_cimt_perm_cis_mqtl.write_csv(
    os.path.join(molQTL_loc, "cimt_target_variants_perm_cis_mqtl.csv")
)

Merging target variants with permuted cis-mQTLs.
Sorting the DataFrame by column "pval_nominal" in ascending order.
Showing the sorted DataFrame.
shape: (4, 33)
┌────────────┬────────────┬─────┬───────────┬───┬───────────┬────────────┬────────────┬────────────┐
│ RSID       ┆ VariantID  ┆ CHR ┆ BP        ┆ … ┆ pval_perm ┆ pval_beta  ┆ qval       ┆ pval_nomin │
│ ---        ┆ ---        ┆ --- ┆ ---       ┆   ┆ ---       ┆ ---        ┆ ---        ┆ al_thresho │
│ str        ┆ str        ┆ i64 ┆ i64       ┆   ┆ f64       ┆ f64        ┆ f64        ┆ ld         │
│            ┆            ┆     ┆           ┆   ┆           ┆            ┆            ┆ ---        │
│            ┆            ┆     ┆           ┆   ┆           ┆            ┆            ┆ f64        │
╞════════════╪════════════╪═════╪═══════════╪═══╪═══════════╪════════════╪════════════╪════════════╡
│ rs13225723 ┆ 7:10641646 ┆ 7   ┆ 106416467 ┆ … ┆ 0.0001    ┆ 5.4718e-35 ┆ 1.7193e-33 ┆ 0.000055   │
│            ┆ 7  

In [27]:
del sumstats_perm_cis_mqtl

# trans-eQTL (permuted)

Here we load the permuted _trans_-acting eQTL data.

In [28]:
# read in data
import polars as pl

# Specify the file path to your data
file_path = os.path.join(
    PERM_TRANS_EQTL_loc, "tensorqtl_trans_full.trans_qtl_pairs.parquet"
)

# Read the data into a Polars DataFrame
sumstats_perm_trans_eqtl = pl.read_parquet(file_path)

In [29]:
# Display the first few rows of the DataFrame to verify the data loading
sumstats_perm_trans_eqtl

| variant_id (str) | phenotype_id (str) | pval (f64) | b (f32) | b_se (f32) | af (f32) | `__index_level_0__` (i64) |
| --- | --- | --- | --- | --- | --- | --- |
| 1:730087 | ENSG0000013140… | 0.000009 | -0.419756 | 0.093878 | 0.010383 | 0 |
| 1:752307 | ENSG0000010803… | 0.000006 | -0.308768 | 0.067392 | 0.019169 | 1 |
| 1:752593 | ENSG0000010803… | 0.000006 | -0.308768 | 0.067392 | 0.019169 | 2 |
| 1:752617 | ENSG0000010803… | 0.000006 | -0.308768 | 0.067392 | 0.019169 | 3 |
| 1:754121 | ENSG0000010803… | 0.000006 | -0.308768 | 0.067392 | 0.019169 | 4 |
| 1:754163 | ENSG0000010803… | 0.000006 | -0.308768 | 0.067392 | 0.019169 | 5 |
| 1:754671 | ENSG0000010803… | 0.000006 | -0.308768 | 0.067392 | 0.019169 | 6 |
| 1:759177 | ENSG0000010803… | 0.000006 | -0.308768 | 0.067392 | 0.019169 | 7 |
| 1:759970 | ENSG0000010803… | 0.000006 | -0.308768 | 0.067392 | 0.019169 | 8 |
| 1:760811 | ENSG0000010803… | 0.000006 | -0.308768 | 0.067392 | 0.019169 | 9 |


## PGC

In [30]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted trans-eQTLs.")
# Inner join: keep only target variants present in the permuted trans-eQTL results
temp = target_variants_pgc.join(
    sumstats_perm_trans_eqtl, left_on="VariantID", right_on="variant_id", how="inner"
)
print('Sorting the DataFrame by column "pval" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_pgc_perm_trans_eqtl = temp.sort("pval")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_pgc_perm_trans_eqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_pgc_perm_trans_eqtl.write_csv(
    os.path.join(molQTL_loc, "pgc_target_variants_perm_trans_eqtl.csv")
)

Merging target variants with permuted trans-eQTLs.
Sorting the DataFrame by column "pval" in ascending order.
Showing the sorted DataFrame.
shape: (94, 19)
┌──────┬───────┬──────────────┬─────┬───┬───────────┬──────────┬──────────┬───────────────────┐
│ RSID ┆ ALTID ┆ VariantID    ┆ Chr ┆ … ┆ b         ┆ b_se     ┆ af       ┆ __index_level_0__ │
│ ---  ┆ ---   ┆ ---          ┆ --- ┆   ┆ ---       ┆ ---      ┆ ---      ┆ ---               │
│ str  ┆ str   ┆ str          ┆ i64 ┆   ┆ f32       ┆ f32      ┆ f32      ┆ i64               │
╞══════╪═══════╪══════════════╪═════╪═══╪═══════════╪══════════╪══════════╪═══════════════════╡
│ ---  ┆ ---   ┆ 17:27368639  ┆ 17  ┆ … ┆ -0.08548  ┆ 0.014809 ┆ 0.06869  ┆ 1125431           │
│ ---  ┆ ---   ┆ 9:23740210   ┆ 9   ┆ … ┆ -0.073517 ┆ 0.013739 ┆ 0.227636 ┆ 731827            │
│ ---  ┆ ---   ┆ 7:111391927  ┆ 7   ┆ … ┆ 0.292759  ┆ 0.05673  ┆ 0.066294 ┆ 626547            │
│ ---  ┆ ---   ┆ 16:15558272  ┆ 16  ┆ … ┆ 0.075144  ┆ 0.0

## Million Hearts

In [31]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted trans-eQTLs.")
# Inner join: keep only target variants present in the permuted trans-eQTL results
temp = target_variants_millionhearts.join(
    sumstats_perm_trans_eqtl, left_on="VariantID", right_on="variant_id", how="inner"
)
print('Sorting the DataFrame by column "pval" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_millionhearts_perm_trans_eqtl = temp.sort("pval")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_millionhearts_perm_trans_eqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_millionhearts_perm_trans_eqtl.write_csv(
    os.path.join(
        molQTL_loc, "millionhearts_target_variants_perm_trans_eqtl.csv"
    )
)

Merging target variants with permuted trans-eQTLs.
Sorting the DataFrame by column "pval" in ascending order.
Showing the sorted DataFrame.
shape: (110, 30)
┌─────────────┬──────────────┬─────┬───────────┬───┬───────────┬──────────┬──────────┬─────────────┐
│ VariantID   ┆ MarkerName   ┆ CHR ┆ BP        ┆ … ┆ b         ┆ b_se     ┆ af       ┆ __index_lev │
│ ---         ┆ ---          ┆ --- ┆ ---       ┆   ┆ ---       ┆ ---      ┆ ---      ┆ el_0__      │
│ str         ┆ str          ┆ i64 ┆ i64       ┆   ┆ f32       ┆ f32      ┆ f32      ┆ ---         │
│             ┆              ┆     ┆           ┆   ┆           ┆          ┆          ┆ i64         │
╞═════════════╪══════════════╪═════╪═══════════╪═══╪═══════════╪══════════╪══════════╪═════════════╡
│ 6:31888367  ┆ 6:31888367_C ┆ 6   ┆ 31888367  ┆ … ┆ -0.524084 ┆ 0.076206 ┆ 0.869808 ┆ 479722      │
│             ┆ _T           ┆     ┆           ┆   ┆           ┆          ┆          ┆             │
│ 20:44607661 ┆ 

## GIGASTROKE

In [32]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted trans-eQTLs.")
# Inner join: keep only target variants present in the permuted trans-eQTL results
temp = target_variants_gigastroke.join(
    sumstats_perm_trans_eqtl, left_on="VariantID", right_on="variant_id", how="inner"
)
print('Sorting the DataFrame by column "pval" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_gigastroke_perm_trans_eqtl = temp.sort("pval")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_gigastroke_perm_trans_eqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_gigastroke_perm_trans_eqtl.write_csv(
    os.path.join(molQTL_loc, "gigastroke_target_variants_perm_trans_eqtl.csv")
)

Merging target variants with permuted trans-eQTLs.
Sorting the DataFrame by column "pval" in ascending order.
Showing the sorted DataFrame.
shape: (11, 35)
┌───────┬───────────┬──────────────┬─────┬───┬───────────┬──────────┬──────────┬───────────────────┐
│ Index ┆ rsID      ┆ VariantID    ┆ Chr ┆ … ┆ b         ┆ b_se     ┆ af       ┆ __index_level_0__ │
│ ---   ┆ ---       ┆ ---          ┆ --- ┆   ┆ ---       ┆ ---      ┆ ---      ┆ ---               │
│ i64   ┆ str       ┆ str          ┆ i64 ┆   ┆ f32       ┆ f32      ┆ f32      ┆ i64               │
╞═══════╪═══════════╪══════════════╪═════╪═══╪═══════════╪══════════╪══════════╪═══════════════════╡
│ 78    ┆ rs1412444 ┆ 10:91002927  ┆ 10  ┆ … ┆ 0.083318  ┆ 0.01443  ┆ 0.369808 ┆ 813894            │
│ 30    ┆ rs2738158 ┆ 8:6749669    ┆ 8   ┆ … ┆ -0.125952 ┆ 0.02513  ┆ 0.905751 ┆ 655633            │
│ 84    ┆ rs8014986 ┆ 14:100135718 ┆ 14  ┆ … ┆ 0.136306  ┆ 0.027278 ┆ 0.162141 ┆ 1039841           │
│ 12    ┆ rs46813

## CAC

In [33]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted trans-eQTLs.")
# Inner join: keep only target variants present in the permuted trans-eQTL results
temp = target_variants_cac.join(
    sumstats_perm_trans_eqtl, left_on="VariantID", right_on="variant_id", how="inner"
)
print('Sorting the DataFrame by column "pval" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_cac_perm_trans_eqtl = temp.sort("pval")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_cac_perm_trans_eqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_cac_perm_trans_eqtl.write_csv(
    os.path.join(molQTL_loc, "cac_target_variants_perm_trans_eqtl.csv")
)

Merging target variants with permuted trans-eQTLs.
Sorting the DataFrame by column "pval" in ascending order.
Showing the sorted DataFrame.
shape: (1, 21)
┌────────────┬────────────┬─────┬──────────┬───┬──────────┬──────────┬──────────┬──────────────────┐
│ rsID       ┆ VariantID  ┆ Chr ┆ Pos_hg19 ┆ … ┆ b        ┆ b_se     ┆ af       ┆ __index_level_0_ │
│ ---        ┆ ---        ┆ --- ┆ ---      ┆   ┆ ---      ┆ ---      ┆ ---      ┆ _                │
│ str        ┆ str        ┆ i64 ┆ i64      ┆   ┆ f32      ┆ f32      ┆ f32      ┆ ---              │
│            ┆            ┆     ┆          ┆   ┆          ┆          ┆          ┆ i64              │
╞════════════╪════════════╪═════╪══════════╪═══╪══════════╪══════════╪══════════╪══════════════════╡
│ rs35355695 ┆ 6:12891103 ┆ 6   ┆ 12891103 ┆ … ┆ 0.157713 ┆ 0.029411 ┆ 0.276358 ┆ 461822           │
└────────────┴────────────┴─────┴──────────┴───┴──────────┴──────────┴──────────┴──────────────────┘
Exporting the Polars DataFrame to a CSV file.

## CIMT

In [36]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted trans-eQTLs.")
# Inner join: keep only target variants present in the permuted trans-eQTL results
temp = target_variants_cimt.join(
    sumstats_perm_trans_eqtl, left_on="VariantID", right_on="variant_id", how="inner"
)
print('Sorting the DataFrame by column "pval" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_cimt_perm_trans_eqtl = temp.sort("pval")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_cimt_perm_trans_eqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_cimt_perm_trans_eqtl.write_csv(
    os.path.join(molQTL_loc, "cimt_target_variants_perm_trans_eqtl.csv")
)

Merging target variants with permuted trans-eQTLs.
Sorting the DataFrame by column "pval" in ascending order.
Showing the sorted DataFrame.
shape: (0, 22)
┌──────┬───────────┬─────┬─────┬───┬─────┬──────┬─────┬───────────────────┐
│ RSID ┆ VariantID ┆ CHR ┆ BP  ┆ … ┆ b   ┆ b_se ┆ af  ┆ __index_level_0__ │
│ ---  ┆ ---       ┆ --- ┆ --- ┆   ┆ --- ┆ ---  ┆ --- ┆ ---               │
│ str  ┆ str       ┆ i64 ┆ i64 ┆   ┆ f32 ┆ f32  ┆ f32 ┆ i64               │
╞══════╪═══════════╪═════╪═════╪═══╪═════╪══════╪═════╪═══════════════════╡
└──────┴───────────┴─────┴─────┴───┴─────┴──────┴─────┴───────────────────┘
Exporting the Polars DataFrame to a CSV file.


In [37]:
del sumstats_perm_trans_eqtl

# trans-mQTL (permuted)

Here we load the permuted _trans_-acting mQTL data.

In [38]:
# read in data
import polars as pl

# Specify the file path to your data
file_path = os.path.join(
    PERM_TRANS_MQTL_loc, "tensormqtl_perm_trans_qtl_pairs.annot.parquet"
)

# Read the data into a Polars DataFrame
sumstats_perm_trans_mqtl = pl.read_parquet(file_path)

In [39]:
# Display the first few rows of the DataFrame to verify the data loading
sumstats_perm_trans_mqtl

| VariantID (str) | CpG (str) | pval_perm (f64) | Beta (f32) | SE (f32) | CAF_mQTL (f32) | `__index_level_0__` (i64) | chromosome (i64) | position (i64) | OtherAlleleA (str) | CodedAlleleB (str) | CAF (f64) | AltID (str) | Source (str) | AverageMaximumPosteriorCall (f64) | Info (f64) | AA_N (f64) | AB_N (f64) | BB_N (f64) | TotalN (i64) | MAF (f64) | MissingDataProportion (f64) | HWE_P (f64) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1:693731 | cg23922040 | 0.000007 | -0.356294 | 0.078441 | 0.126984 | 0 | 1 | 693731 | A | G | 0.137241 | rs12238997 | HRCr11 | 0.902046 | 0.624501 | 1637.01 | 451.103 | 35.569 | 2124 | 0.122957 | 0.000075 | 0.544841 |
| 1:714596 | cg26079250 | 4.7736e-7 | 0.601047 | 0.117393 | 0.038549 | 0 | 1 | 714596 | T | C | 0.0327213 | rs149887893 | HRCr11 | 0.968278 | 0.577564 | 1987.73 | 135.495 | 0.677 | 2124 | 0.0322165 | 0.000024 | 0.272504 |
| 1:715367 | cg26079250 | 4.7736e-7 | 0.601047 | 0.117393 | 0.038549 | 1 | 1 | 715367 | A | G | 0.0324859 | rs12184277 | HRCr11 | 0.975207 | 0.67196 | 1976.85 | 146.512 | 0.556 | 2124 | 0.0347528 | 0.000019 | 0.176551 |
| 1:717485 | cg26079250 | 4.7736e-7 | 0.601047 | 0.117393 | 0.038549 | 2 | 1 | 717485 | C | A | 0.0324859 | rs12184279 | HRCr11 | 0.975183 | 0.670147 | 1978.35 | 145.041 | 0.517 | 2124 | 0.0343883 | 0.000022 | 0.174817 |
| 1:720381 | cg26079250 | 4.7736e-7 | 0.601047 | 0.117393 | 0.038549 | 3 | 1 | 720381 | G | T | 0.0327213 | rs116801199 | HRCr11 | 0.974476 | 0.667536 | 1973.72 | 149.503 | 0.691 | 2124 | 0.0355205 | 0.00002 | 0.110452 |
| 1:721290 | cg26079250 | 4.7736e-7 | 0.601047 | 0.117393 | 0.038549 | 4 | 1 | 721290 | G | C | 0.0327213 | rs12565286 | HRCr11 | 0.975168 | 0.677386 | 1972.67 | 150.564 | 0.687 | 2124 | 0.0357682 | 0.000017 | 0.110447 |
| 1:726794 | cg26079250 | 4.7736e-7 | 0.601047 | 0.117393 | 0.038549 | 5 | 1 | 726794 | C | G | 0.0324859 | rs28454925 | HRCr11 | 0.977817 | 0.70519 | 1977.3 | 146.099 | 0.547 | 2124 | 0.0346509 | 0.000013 | 0.176493 |
| 1:729632 | cg26079250 | 4.7736e-7 | 0.601047 | 0.117393 | 0.038549 | 6 | 1 | 729632 | C | T | 0.0324859 | rs116720794 | HRCr11 | 0.978581 | 0.715529 | 1976.76 | 146.752 | 0.454 | 2124 | 0.0347604 | 0.000008 | 0.176551 |
| 1:729679 | cg23922040 | 0.000003 | 0.343842 | 0.072648 | 0.835601 | 1 | 1 | 729679 | C | G | 0.829331 | rs4951859 | HRCr11 | 0.936062 | 0.78861 | 58.204 | 600.893 | 1464.61 | 2124 | 0.16888 | 0.00007 | 0.757157 |
| 1:729679 | cg20911180 | 0.00001 | 0.115228 | 0.025699 | 0.835601 | 0 | 1 | 729679 | C | G | 0.829331 | rs4951859 | HRCr11 | 0.936062 | 0.78861 | 58.204 | 600.893 | 1464.61 | 2124 | 0.16888 | 0.00007 | 0.757157 |


## PGC

In [40]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted trans-mQTLs.")
# Inner join: keep only target variants present in the permuted trans-mQTL results
temp = target_variants_pgc.join(
    sumstats_perm_trans_mqtl, left_on="VariantID", right_on="VariantID", how="inner"
)
print('Sorting the DataFrame by column "pval_perm" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_pgc_perm_trans_mqtl = temp.sort("pval_perm")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_pgc_perm_trans_mqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_pgc_perm_trans_mqtl.write_csv(
    os.path.join(molQTL_loc, "pgc_target_variants_perm_trans_mqtl.csv")
)

Merging target variants with permuted trans-mQTLs.
Sorting the DataFrame by column "pval_perm" in ascending order.
Showing the sorted DataFrame.
shape: (2_568, 35)
┌──────┬───────┬─────────────┬─────┬───┬────────┬───────────┬───────────────────────┬───────────┐
│ RSID ┆ ALTID ┆ VariantID   ┆ Chr ┆ … ┆ TotalN ┆ MAF       ┆ MissingDataProportion ┆ HWE_P     │
│ ---  ┆ ---   ┆ ---         ┆ --- ┆   ┆ ---    ┆ ---       ┆ ---                   ┆ ---       │
│ str  ┆ str   ┆ str         ┆ i64 ┆   ┆ i64    ┆ f64       ┆ f64                   ┆ f64       │
╞══════╪═══════╪═════════════╪═════╪═══╪════════╪═══════════╪═══════════════════════╪═══════════╡
│ ---  ┆ ---   ┆ 3:52540773  ┆ 3   ┆ … ┆ 2124   ┆ 0.464328  ┆ 0.00001               ┆ 0.512693  │
│ ---  ┆ ---   ┆ 9:114952087 ┆ 9   ┆ … ┆ 2124   ┆ 0.216406  ┆ 0.000007              ┆ 0.898262  │
│ ---  ┆ ---   ┆ 17:8068045  ┆ 17  ┆ … ┆ 2124   ┆ 0.353823  ┆ 0.000005              ┆ 0.235288  │
│ ---  ┆ ---   ┆ 11:61571348 ┆ 11  ┆ …

## Million Hearts

In [41]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted trans-mQTLs.")
# Inner join: keep only target variants present in the permuted trans-mQTL results
temp = target_variants_millionhearts.join(
    sumstats_perm_trans_mqtl, left_on="VariantID", right_on="VariantID", how="inner"
)
print('Sorting the DataFrame by column "pval_perm" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_millionhearts_perm_trans_mqtl = temp.sort("pval_perm")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_millionhearts_perm_trans_mqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_millionhearts_perm_trans_mqtl.write_csv(
    os.path.join(
        molQTL_loc, "millionhearts_target_variants_perm_trans_mqtl.csv"
    )
)

Merging target variants with permuted trans-mQTLs.
Sorting the DataFrame by column "pval_perm" in ascending order.
Showing the sorted DataFrame.
shape: (3_096, 46)
┌─────────────┬───────────────┬─────┬───────────┬───┬────────┬───────────┬──────────────┬──────────┐
│ VariantID   ┆ MarkerName    ┆ CHR ┆ BP        ┆ … ┆ TotalN ┆ MAF       ┆ MissingDataP ┆ HWE_P    │
│ ---         ┆ ---           ┆ --- ┆ ---       ┆   ┆ ---    ┆ ---       ┆ roportion    ┆ ---      │
│ str         ┆ str           ┆ i64 ┆ i64       ┆   ┆ i64    ┆ f64       ┆ ---          ┆ f64      │
│             ┆               ┆     ┆           ┆   ┆        ┆           ┆ f64          ┆          │
╞═════════════╪═══════════════╪═════╪═══════════╪═══╪════════╪═══════════╪══════════════╪══════════╡
│ 14:75614504 ┆ 14:75614504_A ┆ 14  ┆ 75614504  ┆ … ┆ 2124   ┆ 0.46603   ┆ 0.00002      ┆ 0.930546 │
│             ┆ _ACCCG        ┆     ┆           ┆   ┆        ┆           ┆              ┆          │
│ 1:109817192 

## GIGASTROKE

In [42]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted trans-mQTLs.")
# Inner join: keep only target variants present in the permuted trans-mQTL results
temp = target_variants_gigastroke.join(
    sumstats_perm_trans_mqtl, left_on="VariantID", right_on="VariantID", how="inner"
)
print('Sorting the DataFrame by column "pval_perm" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_gigastroke_perm_trans_mqtl = temp.sort("pval_perm")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_gigastroke_perm_trans_mqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_gigastroke_perm_trans_mqtl.write_csv(
    os.path.join(molQTL_loc, "gigastroke_target_variants_perm_trans_mqtl.csv")
)

Merging target variants with permuted trans-mQTLs.
Sorting the DataFrame by column "pval_perm" in ascending order.
Showing the sorted DataFrame.
shape: (452, 51)
┌───────┬────────────┬──────────────┬─────┬───┬────────┬──────────┬─────────────────────┬──────────┐
│ Index ┆ rsID       ┆ VariantID    ┆ Chr ┆ … ┆ TotalN ┆ MAF      ┆ MissingDataProporti ┆ HWE_P    │
│ ---   ┆ ---        ┆ ---          ┆ --- ┆   ┆ ---    ┆ ---      ┆ on                  ┆ ---      │
│ i64   ┆ str        ┆ str          ┆ i64 ┆   ┆ i64    ┆ f64      ┆ ---                 ┆ f64      │
│       ┆            ┆              ┆     ┆   ┆        ┆          ┆ f64                 ┆          │
╞═══════╪════════════╪══════════════╪═════╪═══╪════════╪══════════╪═════════════════════╪══════════╡
│ 87    ┆ rs4471742  ┆ 17:17877771  ┆ 17  ┆ … ┆ 2124   ┆ 0.341493 ┆ 0.000011            ┆ 0.469488 │
│ 88    ┆ rs1788820  ┆ 18:21101944  ┆ 18  ┆ … ┆ 2124   ┆ 0.36816  ┆ 0.000015            ┆ 0.30465  │
│ 81    ┆ rs5576

## CAC

In [43]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permuted trans-mQTLs.")
# Inner join: keep only target variants present in the permuted trans-mQTL results
temp = target_variants_cac.join(
    sumstats_perm_trans_mqtl, left_on="VariantID", right_on="VariantID", how="inner"
)
print('Sorting the DataFrame by column "pval_perm" in ascending order.')
# Sort ascending so the smallest p-values come first
target_variants_cac_perm_trans_mqtl = temp.sort("pval_perm")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars prints a truncated preview)
print(target_variants_cac_perm_trans_mqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_cac_perm_trans_mqtl.write_csv(
    os.path.join(molQTL_loc, "cac_target_variants_perm_trans_mqtl.csv")
)

Merging target variants with permuted trans-mQTLs.
Sorting the DataFrame by column "pval_perm" in ascending order.
Showing the sorted DataFrame.
shape: (68, 37)
┌────────────┬─────────────┬─────┬───────────┬───┬────────┬───────────┬─────────────────┬──────────┐
│ rsID       ┆ VariantID   ┆ Chr ┆ Pos_hg19  ┆ … ┆ TotalN ┆ MAF       ┆ MissingDataProp ┆ HWE_P    │
│ ---        ┆ ---         ┆ --- ┆ ---       ┆   ┆ ---    ┆ ---       ┆ ortion          ┆ ---      │
│ str        ┆ str         ┆ i64 ┆ i64       ┆   ┆ i64    ┆ f64       ┆ ---             ┆ f64      │
│            ┆             ┆     ┆           ┆   ┆        ┆           ┆ f64             ┆          │
╞════════════╪═════════════╪═════╪═══════════╪═══╪════════╪═══════════╪═════════════════╪══════════╡
│ rs2854746  ┆ 7:45960645  ┆ 7   ┆ 45960645  ┆ … ┆ 2124   ┆ 0.39495   ┆ 0.000009        ┆ 0.413688 │
│ rs7182103  ┆ 15:79123946 ┆ 15  ┆ 79123946  ┆ … ┆ 2124   ┆ 0.418275  ┆ 0.000005        ┆ 0.12942  │
│ rs7182103  ┆ 15

## CIMT

In [44]:
import os
import polars as pl
import pandas as pd

print("Merging target variants with permutation-based trans-mQTLs.")
# Inner join on the shared variant identifier
temp = target_variants_cimt.join(
    sumstats_perm_trans_mqtl, left_on="VariantID", right_on="VariantID", how="inner"
)
print('Sorting the DataFrame by column "pval_perm" in ascending order.')
# Sort by permutation p-value, smallest first (Polars sorts ascending by default)
target_variants_cimt_perm_trans_mqtl = temp.sort("pval_perm")
del temp

print("Showing the sorted DataFrame.")
# Display the sorted DataFrame (Polars truncates long tables when printing)
print(target_variants_cimt_perm_trans_mqtl)

print("Exporting the Polars DataFrame to a CSV file.")
# Export the Polars DataFrame to a CSV file
target_variants_cimt_perm_trans_mqtl.write_csv(
    os.path.join(molQTL_loc, "cimt_target_variants_perm_trans_mqtl.csv")
)

Merging target variants with permutation-based trans-mQTLs.
Sorting the DataFrame by column "pval_perm" in ascending order.
Showing the sorted DataFrame.
shape: (400, 38)
┌─────────────┬─────────────┬─────┬───────────┬───┬────────┬──────────┬─────────────────┬──────────┐
│ RSID        ┆ VariantID   ┆ CHR ┆ BP        ┆ … ┆ TotalN ┆ MAF      ┆ MissingDataProp ┆ HWE_P    │
│ ---         ┆ ---         ┆ --- ┆ ---       ┆   ┆ ---    ┆ ---      ┆ ortion          ┆ ---      │
│ str         ┆ str         ┆ i64 ┆ i64       ┆   ┆ i64    ┆ f64      ┆ ---             ┆ f64      │
│             ┆             ┆     ┆           ┆   ┆        ┆          ┆ f64             ┆          │
╞═════════════╪═════════════╪═════╪═══════════╪═══╪════════╪══════════╪═════════════════╪══════════╡
│ rs113309773 ┆ 16:75432686 ┆ 16  ┆ 75432686  ┆ … ┆ 2124   ┆ 0.371528 ┆ 0.000016        ┆ 0.74495  │
│ rs13225723  ┆ 7:106416467 ┆ 7   ┆ 106416467 ┆ … ┆ 2124   ┆ 0.223969 ┆ 0.000008        ┆ 0.573563 │
│ rs17477177  ┆ 

In [45]:
del sumstats_perm_trans_mqtl