# Pleiotropy project for asthma, adiposity and type 2 diabetes in the JAZF1 region

## Data files and documents
* The location of the phenotype and genotype data is described [here](https://github.com/statgenetics/UKBB_GWAS_dev/blob/master/analysis/pleiotropy/data_description.ipynb)
* The phenotype and regenie sumstats files are also copied to my cluster account:

    **Pheno**
    
    > **asthma**:/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/pheno_asthma_ind_PC.txt
    >
    > **t2d**:/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/pheno_asthma_ind_PC.txt
    >
    > **waist**:/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/pheno_WC_ind_PC.txt
    
    **Sumstats**
    > **asthma**:/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_PC10_step2_imp.regenie_ASTHMA.regenie
    >
    > **t2d**:/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/T2D_PC10_step2_imp.regenie_T2D.regenie
    >
    > **waist**:/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/WC_PC10_step2_imp.regenie_WAISTcirc_invranknorm.regenie

## Input file preparation for GSMR

GSMR requires GWAS summary data for both the risk factor and the outcome (disease): the SNP, a1, a2, a1_freq, effect size, standard error, p value and GWAS sample size for each phenotype.


In [2]:
import pandas as pd

In [3]:
# reading in the regenie on the imputed data and subsetting regenie data to only keep information within jazf1 region - 7 27868573 28273990
# within jazf1 region - 7 27868573 28273990 of the sumstats files there are 2067 variants for asthma and t2d, and 2068 variants for waist

# asthma data
asthma_regenie = pd.read_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_PC10_step2_imp.regenie_ASTHMA.regenie", sep=" ")
asthma_regenie = asthma_regenie[(asthma_regenie["CHROM"] == 7) & (asthma_regenie["GENPOS"] >= 27868573) & (asthma_regenie["GENPOS"] <= 28273990)][["CHROM", "GENPOS", "ID", "ALLELE0", "ALLELE1", "A1FREQ", "N", "BETA", "SE", "LOG10P"]]

# t2d data
t2d_regenie = pd.read_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/T2D_PC10_step2_imp.regenie_T2D.regenie", sep=" ")
t2d_regenie = t2d_regenie[(t2d_regenie["CHROM"] == 7) & (t2d_regenie["GENPOS"] >= 27868573) & (t2d_regenie["GENPOS"] <= 28273990)][["CHROM", "GENPOS", "ID", "ALLELE0", "ALLELE1", "A1FREQ", "N", "BETA", "SE", "LOG10P"]]

# waist circumference data
waist_regenie = pd.read_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/WC_PC10_step2_imp.regenie_WAISTcirc_invranknorm.regenie", sep=" ")
waist_regenie = waist_regenie[(waist_regenie["CHROM"] == 7) & (waist_regenie["GENPOS"] >= 27868573) & (waist_regenie["GENPOS"] <= 28273990)][["CHROM", "GENPOS", "ID", "ALLELE0", "ALLELE1", "A1FREQ", "N", "BETA", "SE", "LOG10P"]]

In [4]:
asthma_regenie

Unnamed: 0,CHROM,GENPOS,ID,ALLELE0,ALLELE1,A1FREQ,N,BETA,SE,LOG10P
6069486,7,27869098,rs545409685,C,T,0.997340,339345,0.116773,0.071129,0.997184
6069487,7,27869261,7:27869261_CAGTA_C,C,CAGTA,0.998498,339345,0.036830,0.098431,0.149797
6069488,7,27869377,rs73075348,G,A,0.943372,339345,-0.000204,0.015226,0.004665
6069489,7,27869782,rs6948467,A,G,0.607234,339345,-0.002206,0.007223,0.119179
6069490,7,27869794,rs73075354,G,C,0.883803,339345,0.012311,0.011001,0.579898
...,...,...,...,...,...,...,...,...,...,...
6071548,7,28273623,7:28273623_TTTCCTTCCTTCC_T,T,TTTCCTTCCTTCC,0.842749,339345,0.000601,0.009892,0.021555
6071549,7,28273697,rs188426589,A,T,0.987408,339345,-0.004086,0.031982,0.046563
6071550,7,28273719,rs6944995,G,T,0.146693,339345,0.006399,0.009963,0.283409
6071551,7,28273829,rs192297723,C,A,0.988365,339345,0.013026,0.035082,0.148496


In [7]:
# gsmr sumdata uses SNP, a1, a2, a1_freq, bzx, bzx_se, bzx_pval, bzx_n, bzy, bzy_se, bzy_pval, bzy_n as columns

# from current regenie data need to calculate A0FREQ and PVAL

# A0FREQ
get_a0freq = lambda row: 1 - row["A1FREQ"]

asthma_regenie["A0FREQ"] = asthma_regenie.apply(get_a0freq, axis=1)
t2d_regenie["A0FREQ"] = t2d_regenie.apply(get_a0freq, axis=1)
waist_regenie["A0FREQ"] = waist_regenie.apply(get_a0freq, axis=1)

# PVAL
get_pval = lambda row: 10 ** (-row["LOG10P"])

asthma_regenie["PVAL"] = asthma_regenie.apply(get_pval, axis=1)
t2d_regenie["PVAL"] = t2d_regenie.apply(get_pval, axis=1)
waist_regenie["PVAL"] = waist_regenie.apply(get_pval, axis=1)

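# also renaming all the columns for merging later on
# NOTE: regenie reports BETA and A1FREQ for ALLELE1; here ALLELE0 is labelled a1 and gets 1 - A1FREQ,
# but BETA is kept unchanged, so the effect sizes refer to a2 unless their sign is flipped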
asthma_regenie = asthma_regenie.rename(columns={"ID":"SNP", "ALLELE0":"a1", "ALLELE1":"a2", "A0FREQ":"asthma_a1_freq", "N":"asthma_n", "BETA":"asthma_beta", "SE":"asthma_se", "PVAL":"asthma_pval"})
t2d_regenie = t2d_regenie.rename(columns={"ID":"SNP", "ALLELE0":"a1", "ALLELE1":"a2", "A0FREQ":"t2d_a1_freq", "N":"t2d_n", "BETA":"t2d_beta", "SE":"t2d_se", "PVAL":"t2d_pval"})
waist_regenie = waist_regenie.rename(columns={"ID":"SNP", "ALLELE0":"a1", "ALLELE1":"a2", "A0FREQ":"waist_a1_freq", "N":"waist_n", "BETA":"waist_beta", "SE":"waist_se", "PVAL":"waist_pval"})

# keeping only relevant columns
asthma_regenie = asthma_regenie[["SNP", "a1", "a2", "asthma_a1_freq", "asthma_beta", "asthma_se", "asthma_pval", "asthma_n"]]
t2d_regenie = t2d_regenie[["SNP", "a1", "a2", "t2d_a1_freq", "t2d_beta", "t2d_se", "t2d_pval", "t2d_n"]]
waist_regenie = waist_regenie[["SNP", "a1", "a2", "waist_a1_freq", "waist_beta", "waist_se", "waist_pval", "waist_n"]]
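The two derived columns can also be computed column-wise, without the row-wise `apply`. A minimal sketch on a toy table (the values are made up; only the column names `A1FREQ` and `LOG10P` come from the regenie output above):

```python
import pandas as pd

# Toy regenie-like rows; the values are illustrative only.
toy = pd.DataFrame({
    "ID": ["snp1", "snp2"],
    "A1FREQ": [0.25, 0.9],
    "LOG10P": [2.0, 8.0],
})

# Frequency of ALLELE0 is the complement of the ALLELE1 frequency.
toy["A0FREQ"] = 1 - toy["A1FREQ"]

# regenie stores -log10(p), so the p value is 10 ** (-LOG10P).
toy["PVAL"] = 10.0 ** (-toy["LOG10P"])

print(toy)
```

Column arithmetic gives the same numbers as the lambdas and is usually much faster on full-size summary statistics.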

This is the input format for the GSMR analysis.  
SNP: the genetic instrument  
a1: effect allele  
a2: the other allele  
a1_freq: frequency of a1  
bzx: the effect size of a1 on risk factor  
bzx_se: standard error of bzx  
bzx_pval: p value for bzx  
bzx_n: per-SNP sample size of GWAS for the risk factor  
bzy: the effect size of a1 on disease  
bzy_se: standard error of bzy  
bzy_pval: p value for bzy  
bzy_n: per-SNP sample size of GWAS for the disease
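Before a table in this format is handed to GSMR, it can be checked against the column list above. The helper below is a hypothetical sketch (`check_gsmr_input` is not part of GSMR), and the checks it runs are assumptions about what a sensible input looks like:

```python
import pandas as pd

# Column names taken from the format description above.
GSMR_COLUMNS = ["SNP", "a1", "a2", "a1_freq",
                "bzx", "bzx_se", "bzx_pval", "bzx_n",
                "bzy", "bzy_se", "bzy_pval", "bzy_n"]

def check_gsmr_input(df):
    """Return a list of problems found in a GSMR-style table (hypothetical helper)."""
    missing = [c for c in GSMR_COLUMNS if c not in df.columns]
    if missing:
        return [f"missing columns: {missing}"]
    problems = []
    if df["SNP"].duplicated().any():
        problems.append("duplicated SNP identifiers")
    if not df["a1_freq"].between(0, 1).all():
        problems.append("a1_freq outside [0, 1]")
    for col in ("bzx_se", "bzy_se"):
        if (df[col] <= 0).any():
            problems.append(f"non-positive {col}")
    for col in ("bzx_pval", "bzy_pval"):
        if not df[col].between(0, 1).all():
            problems.append(f"{col} outside [0, 1]")
    return problems

# Toy table with made-up values: two valid rows.
good = pd.DataFrame({
    "SNP": ["rs1", "rs2"], "a1": ["A", "C"], "a2": ["G", "T"], "a1_freq": [0.3, 0.6],
    "bzx": [0.01, -0.02], "bzx_se": [0.005, 0.004], "bzx_pval": [0.05, 1e-6], "bzx_n": [1000, 1000],
    "bzy": [0.02, -0.01], "bzy_se": [0.006, 0.005], "bzy_pval": [0.01, 0.04], "bzy_n": [900, 900],
})
bad = pd.concat([good, good.iloc[[0]]], ignore_index=True)  # repeats rs1

print(check_gsmr_input(good))
print(check_gsmr_input(bad))
```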

### The goal is to assess pleiotropic relationships between asthma-T2D, waist-T2D and waist-asthma


In [11]:
## Prepare asthma_t2d sumdata for GSMR
asthma_v_t2d = pd.merge(asthma_regenie, t2d_regenie, how='inner', on=['SNP','a1','a2']).drop(["t2d_a1_freq"], axis=1)
asthma_v_t2d = asthma_v_t2d.rename(columns={"asthma_a1_freq":"a1_freq", "asthma_beta":"bzx", "asthma_se":"bzx_se", "asthma_pval":"bzx_pval", "asthma_n":"bzx_n", "t2d_beta":"bzy", "t2d_se":"bzy_se", "t2d_pval":"bzy_pval", "t2d_n":"bzy_n"})

## Prepare waist_t2d sumdata for GSMR
waist_v_t2d = pd.merge(waist_regenie, t2d_regenie, how='inner', on=['SNP','a1','a2']).drop(["t2d_a1_freq"], axis=1)
waist_v_t2d = waist_v_t2d.rename(columns={"waist_a1_freq":"a1_freq", "waist_beta":"bzx", "waist_se":"bzx_se", "waist_pval":"bzx_pval", "waist_n":"bzx_n", "t2d_beta":"bzy", "t2d_se":"bzy_se", "t2d_pval":"bzy_pval", "t2d_n":"bzy_n"})

## Prepare waist_asthma sumdata for GSMR
waist_v_asthma = pd.merge(waist_regenie, asthma_regenie, how='inner', on=['SNP','a1','a2']).drop(["asthma_a1_freq"], axis=1)
waist_v_asthma = waist_v_asthma.rename(columns={"waist_a1_freq":"a1_freq", "waist_beta":"bzx", "waist_se":"bzx_se", "waist_pval":"bzx_pval", "waist_n":"bzx_n", "asthma_beta":"bzy", "asthma_se":"bzy_se", "asthma_pval":"bzy_pval", "asthma_n":"bzy_n"})

# display the asthma_t2d table
asthma_v_t2d



Unnamed: 0,SNP,a1,a2,a1_freq,bzx,bzx_se,bzx_pval,bzx_n,bzy,bzy_se,bzy_pval,bzy_n
0,rs545409685,C,T,0.002660,0.116773,0.071129,0.100651,339345,0.038626,0.100036,0.699410,336074
1,7:27869261_CAGTA_C,C,CAGTA,0.001502,0.036830,0.098431,0.708277,339345,0.108386,0.140165,0.439362,336074
2,rs73075348,G,A,0.056628,-0.000204,0.015226,0.989316,339345,0.016225,0.021522,0.450928,336074
3,rs6948467,A,G,0.392766,-0.002206,0.007223,0.760013,339345,-0.020544,0.010241,0.044843,336074
4,rs73075354,G,C,0.116197,0.012311,0.011001,0.263089,339345,0.038589,0.015628,0.013539,336074
...,...,...,...,...,...,...,...,...,...,...,...,...
2060,7:28273623_TTTCCTTCCTTCC_T,T,TTTCCTTCCTTCC,0.157251,0.000601,0.009892,0.951580,339345,-0.007129,0.014019,0.611093,336074
2061,rs188426589,A,T,0.012592,-0.004086,0.031982,0.898333,339345,-0.013464,0.045338,0.766489,336074
2062,rs6944995,G,T,0.853307,0.006399,0.009963,0.520704,339345,0.013798,0.014138,0.329086,336074
2063,rs192297723,C,A,0.011635,0.013026,0.035082,0.710402,339345,-0.051344,0.049307,0.297730,336074
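The three merges follow the same pattern, so they could be wrapped in one function. The sketch below uses a hypothetical helper, `make_gsmr_pair`, and toy tables with made-up values; it assumes per-trait columns named `<trait>_a1_freq`, `<trait>_beta`, `<trait>_se`, `<trait>_pval` and `<trait>_n`, as in the cells above:

```python
import pandas as pd

def make_gsmr_pair(exposure, outcome, x, y):
    """Merge two per-trait tables on SNP and alleles, then rename to GSMR columns.

    Hypothetical helper: x and y are the trait prefixes of the exposure
    (risk factor) and outcome tables.
    """
    merged = pd.merge(exposure, outcome, how="inner", on=["SNP", "a1", "a2"])
    # Keep the exposure allele frequency only, as in the cells above.
    merged = merged.drop(columns=[f"{y}_a1_freq"])
    rename = {f"{x}_a1_freq": "a1_freq"}
    for trait, prefix in ((x, "bzx"), (y, "bzy")):
        rename[f"{trait}_beta"] = prefix
        rename[f"{trait}_se"] = f"{prefix}_se"
        rename[f"{trait}_pval"] = f"{prefix}_pval"
        rename[f"{trait}_n"] = f"{prefix}_n"
    return merged.rename(columns=rename)

def toy_trait(name, snps):
    """Build a small per-trait table with made-up values."""
    return pd.DataFrame({
        "SNP": snps, "a1": "A", "a2": "G",
        f"{name}_a1_freq": 0.3, f"{name}_beta": 0.01,
        f"{name}_se": 0.005, f"{name}_pval": 0.05, f"{name}_n": 1000,
    })

pair = make_gsmr_pair(toy_trait("asthma", ["rs1", "rs2"]),
                      toy_trait("t2d", ["rs2", "rs3"]),
                      "asthma", "t2d")
print(pair)
```

Only `rs2` is present in both toy tables, so the inner join keeps a single row.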


In [14]:
# Save GSMR data
asthma_v_t2d.to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_v_t2d_gsmr_data", sep=" ", index=False)
waist_v_t2d.to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_v_t2d_gsmr_data", sep=" ", index=False)
waist_v_asthma.to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_v_asthma_gsmr_data", sep=" ", index=False)

In [15]:
# Save the genetic variants and effect alleles to estimate LD correlation matrix
asthma_v_t2d[["SNP", "a1"]].to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_v_t2d_snps.allele", sep=" ", header=False, index=False)
waist_v_t2d[["SNP", "a1"]].to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_v_t2d_snps.allele", sep=" ", header=False, index=False)
waist_v_asthma[["SNP", "a1"]].to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_v_asthma_snps.allele", sep=" ", header=False, index=False)

## Check sumstats against bgenfile and bfile to determine intersection of variants

* Within the JAZF1 region (chr7:27868573-28273990) the sumstats files contain 2067 variants for asthma and T2D, and 2068 variants for waist
    * The intersection with the bgen file shows that 1952 (asthma, T2D) and 1953 (waist) of these variants are found in both datasets
    

In [18]:
# asthma data
asthma_sumstats = pd.read_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_PC10_step2_imp.regenie_ASTHMA.regenie", sep=" ")
asthma_sumstats = asthma_sumstats[(asthma_sumstats["CHROM"] == 7) & (asthma_sumstats["GENPOS"] >= 27868573) & (asthma_sumstats["GENPOS"] <= 28273990)][["CHROM", "GENPOS", "ID", "ALLELE0", "ALLELE1", "A1FREQ", "N", "BETA", "SE", "LOG10P"]]

# t2d data
t2d_sumstats = pd.read_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/T2D_PC10_step2_imp.regenie_T2D.regenie", sep=" ")
t2d_sumstats = t2d_sumstats[(t2d_sumstats["CHROM"] == 7) & (t2d_sumstats["GENPOS"] >= 27868573) & (t2d_sumstats["GENPOS"] <= 28273990)][["CHROM", "GENPOS", "ID", "ALLELE0", "ALLELE1", "A1FREQ", "N", "BETA", "SE", "LOG10P"]]

# waist circumference data
waist_sumstats = pd.read_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/WC_PC10_step2_imp.regenie_WAISTcirc_invranknorm.regenie", sep=" ")
waist_sumstats = waist_sumstats[(waist_sumstats["CHROM"] == 7) & (waist_sumstats["GENPOS"] >= 27868573) & (waist_sumstats["GENPOS"] <= 28273990)][["CHROM", "GENPOS", "ID", "ALLELE0", "ALLELE1", "A1FREQ", "N", "BETA", "SE", "LOG10P"]]

# from current regenie data need to calculate A0FREQ and PVAL

# A0FREQ
get_a0freq = lambda row: 1 - row["A1FREQ"]

asthma_sumstats["A0FREQ"] = asthma_sumstats.apply(get_a0freq, axis=1)
t2d_sumstats["A0FREQ"] = t2d_sumstats.apply(get_a0freq, axis=1)
waist_sumstats["A0FREQ"] = waist_sumstats.apply(get_a0freq, axis=1)

# PVAL
get_pval = lambda row: 10 ** (-row["LOG10P"])

asthma_sumstats["PVAL"] = asthma_sumstats.apply(get_pval, axis=1)
t2d_sumstats["PVAL"] = t2d_sumstats.apply(get_pval, axis=1)
waist_sumstats["PVAL"] = waist_sumstats.apply(get_pval, axis=1)

# keeping only relevant columns
asthma_sumstats = asthma_sumstats[["CHROM","GENPOS","ID", "ALLELE0", "ALLELE1", "A0FREQ", "BETA", "SE", "PVAL", "N"]]
t2d_sumstats = t2d_sumstats[["CHROM","GENPOS","ID", "ALLELE0", "ALLELE1", "A0FREQ", "BETA", "SE", "PVAL", "N"]]
waist_sumstats = waist_sumstats[["CHROM","GENPOS","ID", "ALLELE0", "ALLELE1", "A0FREQ", "BETA", "SE", "PVAL", "N"]]



In [20]:
# renaming columns in Sumstat format
asthma_sumstats = asthma_sumstats.rename(columns={"CHROM":"CHR", "GENPOS":"POS", "ID":"SNP", "ALLELE0":"A1", "ALLELE1":"A2", "A0FREQ":"A1FREQ", "BETA":"beta", "SE":"se", "PVAL":"p"})
t2d_sumstats = t2d_sumstats.rename(columns={"CHROM":"CHR", "GENPOS":"POS", "ID":"SNP", "ALLELE0":"A1", "ALLELE1":"A2", "A0FREQ":"A1FREQ", "BETA":"beta", "SE":"se", "PVAL":"p"})
waist_sumstats = waist_sumstats.rename(columns={"CHROM":"CHR", "GENPOS":"POS", "ID":"SNP", "ALLELE0":"A1", "ALLELE1":"A2", "A0FREQ":"A1FREQ", "BETA":"beta", "SE":"se", "PVAL":"p"})

In [21]:
# Save sumstats in jazf1 region for future use

asthma_sumstats.to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_sumstats_jazf1", sep="\t", index=False)
t2d_sumstats.to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/t2d_sumstats_jazf1", sep="\t", index=False)
waist_sumstats.to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_sumstats_jazf1", sep="\t", index=False)

In [30]:
# Define set of chr:pos_Ref.allele_Alt.allele format to check the intersection of variants with the bgen file.

def chromsnp(row):
    return f"{row['CHR']}:{row['POS']}_{row['A2']}_{row['A1']}"
asthma_sumstats_chromsnp = asthma_sumstats.apply(chromsnp, axis=1)
asthma_sumstats_chromsnp = set(asthma_sumstats_chromsnp.to_list())
t2d_sumstats_chromsnp = t2d_sumstats.apply(chromsnp, axis=1)
t2d_sumstats_chromsnp = set(t2d_sumstats_chromsnp.to_list())
waist_sumstats_chromsnp = waist_sumstats.apply(chromsnp, axis=1)
waist_sumstats_chromsnp = set(waist_sumstats_chromsnp.to_list())

In [31]:
# import the variant list (ukb_mfi) of the imputed bgen data for chr7

bgenfile = pd.read_csv("/mnt/mfs/statgen/archive/UKBiobank_Yale_transfer/ukb39554_imputeddataset/ukb_mfi_chr7_v3.txt", sep="\t", header=None)


In [32]:
bgenfile

Unnamed: 0,0,1,2,3,4,5,6,7
0,7:14808_T_C,rs555283805,14808,T,C,6.696880e-05,C,0.364399
1,7:15064_T_C,rs576737504,15064,T,C,9.846300e-04,C,0.652527
2,7:16454_C_T,rs544026442,16454,C,T,7.442310e-07,T,0.011086
3,7:16692_G_C,rs370739206,16692,G,C,0.000000e+00,G,
4,7:16712_T_G,rs373250171,16712,T,G,2.717570e-04,G,0.207113
...,...,...,...,...,...,...,...,...
5405519,7:159128544_A_C,rs183389554,159128544,A,C,7.966900e-05,C,0.308943
5405520,7:159128550_C_G,rs145893243,159128550,C,G,2.656680e-02,G,0.939808
5405521,7:159128554_C_T,rs77350961,159128554,C,T,2.120010e-04,T,0.254989
5405522,7:159128560_T_C,rs542634737,159128560,T,C,5.464860e-03,C,0.532634


In [33]:
# create the variant id set 
bgen_chromsnp = set(bgenfile[0].to_list())

In [34]:
# check the intersection between the sumstats data and bgenfile
len(asthma_sumstats_chromsnp.intersection(bgen_chromsnp))

1952

In [35]:
len(t2d_sumstats_chromsnp.intersection(bgen_chromsnp))

1952

In [36]:
len(waist_sumstats_chromsnp.intersection(bgen_chromsnp))

1953
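The intersection counts above only say how many variants overlap. A column-wise variant of the same comparison also shows which rows are missing from the reference list. This is a sketch on toy data; the positions, alleles and `reference_ids` are made up:

```python
import pandas as pd

# Toy sumstats rows using the column names from the cells above.
sumstats = pd.DataFrame({
    "CHR": [7, 7, 7],
    "POS": [100, 200, 300],
    "A1": ["C", "G", "A"],
    "A2": ["T", "A", "C"],
})

# Toy stand-in for the set of variant IDs read from the bgen variant list.
reference_ids = {"7:100_T_C", "7:300_C_A", "7:400_G_T"}

# Build chr:pos_A2_A1 identifiers with vectorised string concatenation.
marker = (sumstats["CHR"].astype(str) + ":" + sumstats["POS"].astype(str)
          + "_" + sumstats["A2"] + "_" + sumstats["A1"])

# Boolean mask: True where the identifier is present in the reference set.
in_reference = marker.isin(reference_ids)

print(int(in_reference.sum()))   # number of overlapping variants
print(sumstats[~in_reference])   # rows missing from the reference
```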

In [39]:
# checking the list of significant SNPs
import numpy as np
import pandas as pd

In [40]:

asthma_sumstats_chromsnp = asthma_sumstats.apply(chromsnp, axis=1)
t2d_sumstats_chromsnp = t2d_sumstats.apply(chromsnp, axis=1)
waist_sumstats_chromsnp = waist_sumstats.apply(chromsnp, axis=1)

In [41]:
asthma_sumstats_chromsnp

6069486                7:27869098_T_C
6069487            7:27869261_CAGTA_C
6069488                7:27869377_A_G
6069489                7:27869782_G_A
6069490                7:27869794_C_G
                      ...            
6071548    7:28273623_TTTCCTTCCTTCC_T
6071549                7:28273697_T_A
6071550                7:28273719_T_G
6071551                7:28273829_A_C
6071552                7:28273986_T_C
Length: 2067, dtype: object

In [2]:
library("dplyr")


Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union




In [1]:
# import the sumstats saved for the JAZF1 region
asthma_sumstats <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_sumstats_jazf1", header=TRUE, sep="\t")
t2d_sumstats <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/t2d_sumstats_jazf1", header=TRUE, sep="\t")
waist_sumstats <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_sumstats_jazf1", header=TRUE, sep="\t")

In [3]:
head(asthma_sumstats)

Unnamed: 0_level_0,CHR,POS,SNP,A1,A2,A1FREQ,beta,se,p,N
Unnamed: 0_level_1,<int>,<int>,<chr>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<int>
1,7,27869098,rs545409685,C,T,0.00266,0.116773,0.0711289,0.1006505,339345
2,7,27869261,7:27869261_CAGTA_C,C,CAGTA,0.001502,0.0368299,0.0984305,0.7082768,339345
3,7,27869377,rs73075348,G,A,0.056628,-0.000203885,0.0152262,0.9893163,339345
4,7,27869782,rs6948467,A,G,0.392766,-0.00220631,0.00722281,0.760013,339345
5,7,27869794,rs73075354,G,C,0.116197,0.0123111,0.0110007,0.2630886,339345
6,7,27869921,rs35410592,A,C,0.009441,-0.0158857,0.0369552,0.6672952,339345


In [15]:
asthma_sumstats_p508<-filter(asthma_sumstats, p<=5.0e-08)
t2d_sumstats_p508<-filter(t2d_sumstats, p<=5.0e-08)
waist_sumstats_p508<-filter(waist_sumstats, p<=5.0e-08)

In [34]:
dim(asthma_sumstats_p508)

In [35]:
dim(t2d_sumstats_p508)

In [36]:
dim(waist_sumstats_p508)

In [39]:
# Save list of significant snps
write.table(asthma_sumstats_p508,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_sumstats_p508.txt", col.names=TRUE,row.names=FALSE, sep="\t",quote=FALSE)
write.table(t2d_sumstats_p508,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/t2d_sumstats_p508.txt", col.names=TRUE,row.names=FALSE, sep="\t",quote=FALSE)
write.table(waist_sumstats_p508,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_sumstats_p508.txt", col.names=TRUE,row.names=FALSE, sep="\t",quote=FALSE)

In [51]:
# create a marker name chr:pos_a2_a1, consistent with the bgen file, to subset variants not found in the bgen file

asthma_sumstats_marker <- asthma_sumstats %>% mutate(MARKER = paste0(CHR,":",POS,"_",A2,"_",A1))
t2d_sumstats_marker <- t2d_sumstats %>% mutate(MARKER = paste0(CHR,":",POS,"_",A2,"_",A1))
waist_sumstats_marker <- waist_sumstats %>% mutate(MARKER = paste0(CHR,":",POS,"_",A2,"_",A1))

In [55]:
write.table(asthma_sumstats_marker,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_sumstats_marker.txt", col.names=TRUE,row.names=FALSE, sep="\t",quote=FALSE)
write.table(t2d_sumstats_marker,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/t2d_sumstats_marker.txt", col.names=TRUE,row.names=FALSE, sep="\t",quote=FALSE)
write.table(waist_sumstats_marker,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_sumstats_marker.txt", col.names=TRUE,row.names=FALSE, sep="\t",quote=FALSE)

In [52]:
head(asthma_sumstats_marker)

Unnamed: 0_level_0,CHR,POS,SNP,A1,A2,A1FREQ,beta,se,p,N,MARKER
Unnamed: 0_level_1,<int>,<int>,<chr>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<int>,<chr>
1,7,27869098,rs545409685,C,T,0.00266,0.116773,0.0711289,0.1006505,339345,7:27869098_T_C
2,7,27869261,7:27869261_CAGTA_C,C,CAGTA,0.001502,0.0368299,0.0984305,0.7082768,339345,7:27869261_CAGTA_C
3,7,27869377,rs73075348,G,A,0.056628,-0.000203885,0.0152262,0.9893163,339345,7:27869377_A_G
4,7,27869782,rs6948467,A,G,0.392766,-0.00220631,0.00722281,0.760013,339345,7:27869782_G_A
5,7,27869794,rs73075354,G,C,0.116197,0.0123111,0.0110007,0.2630886,339345,7:27869794_C_G
6,7,27869921,rs35410592,A,C,0.009441,-0.0158857,0.0369552,0.6672952,339345,7:27869921_C_A


In [5]:
# import bgenfile
bgenfile <- read.table("/mnt/mfs/statgen/archive/UKBiobank_Yale_transfer/ukb39554_imputeddataset/ukb_mfi_chr7_v3.txt", sep="\t", header=FALSE)

In [6]:
head(bgenfile)

Unnamed: 0_level_0,V1,V2,V3,V4,V5,V6,V7,V8
Unnamed: 0_level_1,<chr>,<chr>,<int>,<chr>,<chr>,<dbl>,<chr>,<dbl>
1,7:14808_T_C,rs555283805,14808,T,C,6.69688e-05,C,0.364399
2,7:15064_T_C,rs576737504,15064,T,C,0.00098463,C,0.652527
3,7:16454_C_T,rs544026442,16454,C,T,7.44231e-07,T,0.0110856
4,7:16692_G_C,rs370739206,16692,G,C,0.0,G,
5,7:16712_T_G,rs373250171,16712,T,G,0.000271757,G,0.207113
6,7:16731_T_G,rs541070747,16731,T,G,0.000805592,G,0.551481


In [7]:
dim(bgenfile)

In [8]:
# Import sumstats data
asthma_sumstats_marker <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_sumstats_marker.txt", header=TRUE, sep="\t")
t2d_sumstats_marker <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/t2d_sumstats_marker.txt", header=TRUE, sep="\t")
waist_sumstats_marker <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_sumstats_marker.txt", header=TRUE,sep="\t")


In [10]:
bgen_chrsnp <- list(bgenfile$V1)

In [19]:
library(dplyr)





In [21]:

# use anti_join() from dplyr to keep the sumstats variants that are not in the bgen file

asthma_sumstats_notin_bgen <- anti_join(asthma_sumstats_marker, bgenfile, by = c("MARKER" = "V1"))
t2d_sumstats_notin_bgen <- anti_join(t2d_sumstats_marker, bgenfile, by = c("MARKER" = "V1"))
waist_sumstats_notin_bgen <- anti_join(waist_sumstats_marker, bgenfile, by = c("MARKER" = "V1"))



In [22]:
dim(asthma_sumstats_notin_bgen)

In [23]:
dim(t2d_sumstats_notin_bgen)

In [24]:
dim(waist_sumstats_notin_bgen)

In [25]:
head(asthma_sumstats_notin_bgen)

Unnamed: 0_level_0,CHR,POS,SNP,A1,A2,A1FREQ,beta,se,p,N,MARKER
Unnamed: 0_level_1,<int>,<int>,<chr>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<int>,<chr>
1,7,27869377,rs73075348,G,A,0.056628,-0.000203885,0.0152262,0.9893163,339345,7:27869377_A_G
2,7,27872466,rs143697346,G,A,0.001164,0.00457096,0.104266,0.9650324,339345,7:27872466_A_G
3,7,27875642,rs112759592,C,G,0.037919,0.0112174,0.0183894,0.5418661,339345,7:27875642_G_C
4,7,27877537,rs10281008,G,A,0.10365,0.00500289,0.0115376,0.6645673,339345,7:27877537_A_G
5,7,27879552,rs117933100,C,T,0.020039,-0.00578915,0.0250778,0.8174334,339345,7:27879552_T_C
6,7,27881058,rs58651394,T,C,0.010441,0.0225396,0.0346362,0.5152073,339345,7:27881058_C_T


In [27]:
# save the files
write.table(asthma_sumstats_notin_bgen,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_sumstats_notinbgen.txt",col.names=TRUE, row.names=FALSE,sep="\t",quote=FALSE)
write.table(t2d_sumstats_notin_bgen,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/t2d_sumstats_notinbgen.txt",col.names=TRUE, row.names=FALSE,sep="\t",quote=FALSE)
write.table(waist_sumstats_notin_bgen,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_sumstats_notinbgen.txt",col.names=TRUE, row.names=FALSE,sep="\t",quote=FALSE)


## bfile

In [4]:
library(dplyr)





In [1]:
# bfile from imputed dataset
# Join the sumstat data with the bfile to check the overlap of variants

bfile <- read.table("/mnt/mfs/statgen/archive/UKBiobank_Yale_transfer/pleiotropy_geneticfiles/UKB_Caucasians_phenotypeindepqc120319_updated020221removedwithdrawnindiv.bim",sep="\t", header=FALSE)


In [2]:
head(bfile)

Unnamed: 0_level_0,V1,V2,V3,V4,V5,V6
Unnamed: 0_level_1,<int>,<chr>,<int>,<int>,<chr>,<chr>
1,1,rs3131962,0,756604,A,G
2,1,rs12562034,0,768448,A,G
3,1,rs4040617,0,779322,G,A
4,1,rs79373928,0,801536,G,T
5,1,rs11240779,0,808631,G,A
6,1,rs57181708,0,809876,G,A


In [5]:
bfile1 <- bfile %>% filter(V1 == 7 & V4 >= 27868573 & V4 <= 28273990)

In [6]:
dim(bfile)

In [7]:
dim(bfile1)

In [27]:
# Import sumstats data
asthma <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_sumstats_marker.txt", header=TRUE, sep="\t")
t2d <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/t2d_sumstats_marker.txt", header=TRUE, sep="\t")
waist <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_sumstats_marker.txt", header=TRUE,sep="\t")


## Estimate LD correlation matrix

The available bfile contains only 96 variants in the JAZF1 region, so I used plink to generate bfiles from the imputed bgen data.

In [None]:
# ran this command on the terminal to generate plink files from the imputed bgen file to estimate LD correlation matrix
./plink2 --bgen /mnt/mfs/statgen/archive/UKBiobank_Yale_transfer/ukb39554_imputeddataset/ukb_imp_chr7_v3.bgen 'ref-first' \
         --sample /mnt/mfs/statgen/archive/UKBiobank_Yale_transfer/ukb39554_imputeddataset/ukb32285_imputedindiv.sample \
         --make-bed \
         --out /mnt/mfs/statgen/bst2126/pleiotropy/geno/UKB_imputed_chr7_v3
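An LD correlation matrix holds the pairwise Pearson correlations between the allele counts of the variants. A toy sketch with NumPy, assuming a small genotype matrix that is already in memory (the values are made up, and this is not how the plink files above are read):

```python
import numpy as np

# Rows are individuals, columns are variants; each entry counts copies of
# the effect allele (0, 1 or 2). Toy values, not UK Biobank data.
genotypes = np.array([
    [0, 0, 2],
    [1, 1, 1],
    [2, 2, 0],
    [1, 1, 1],
    [0, 1, 2],
], dtype=float)

# rowvar=False treats each column (variant) as one variable.
ld_r = np.corrcoef(genotypes, rowvar=False)

print(np.round(ld_r, 3))
```

Counting the other allele of a variant replaces each count g with 2 - g and flips the sign of its correlations, which is presumably why the effect allele is stored next to each SNP in the `.allele` files. In the toy matrix the third variant is exactly 2 minus the first, so their correlation is -1.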

In [2]:
library(data.table)
library(dplyr)



Attaching package: ‘dplyr’


The following objects are masked from ‘package:data.table’:

    between, first, last


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union




In [6]:
# Prepare the GWAS sumstats in GCTA-COJO format: SNP, A1, A2, freq, b, se, p, N
asthma <- fread("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_sumstats_jazf1",header=TRUE, sep="\t")
t2d <- fread("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/t2d_sumstats_jazf1",header=TRUE, sep="\t")
waist <- fread("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_sumstats_jazf1",header=TRUE, sep="\t")
cols<-c("SNP","A1","A2","freq","b","se","p","N")
asthma <- asthma[,c(3:10)]
colnames(asthma)<- cols
t2d <- t2d[,c(3:10)]
colnames(t2d)<-cols
waist<-waist[,c(3:10)]
colnames(waist)<-cols

# Remove any duplicates (the GCTA-GSMR module does not allow duplicates in the SNP column)
t2d <- t2d[!duplicated(t2d$SNP), ] 
asthma <- asthma[!duplicated(asthma$SNP), ] 
waist <- waist[!duplicated(waist$SNP), ] 

In [7]:
dim(t2d)
head(t2d)

SNP,A1,A2,freq,b,se,p,N
<chr>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<int>
rs545409685,C,T,0.002663,0.0386257,0.100036,0.69941026,336074
7:27869261_CAGTA_C,C,CAGTA,0.001508,0.108386,0.140165,0.4393615,336074
rs73075348,G,A,0.056693,0.0162246,0.0215218,0.45092779,336074
rs6948467,A,G,0.392739,-0.0205441,0.0102407,0.04484252,336074
rs73075354,G,C,0.116102,0.038589,0.0156277,0.01353879,336074
rs35410592,A,C,0.009458,0.00427717,0.0522587,0.934769,336074


In [None]:
# Save sumstats in gsmr format
write.table(asthma,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_gsmr_jazf1.txt",col.names=TRUE, row.names=FALSE, sep="\t",quote=FALSE)
write.table(t2d,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/t2d_gsmr_jazf1.txt",col.names=TRUE, row.names=FALSE, sep="\t",quote=FALSE)
write.table(waist,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_gsmr_jazf1.txt",col.names=TRUE, row.names=FALSE, sep="\t",quote=FALSE)


In [8]:
asthma_t2d_snp <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_v_t2d_snps.allele",header=FALSE, sep=" ")
waist_t2d_snp <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_v_t2d_snps.allele",header=FALSE, sep=" ")
waist_asthma_snp <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_v_asthma_snps.allele",header=FALSE, sep=" ")

In [9]:
dim(asthma_t2d_snp)

In [10]:
head(asthma_t2d_snp)

Unnamed: 0_level_0,V1,V2
Unnamed: 0_level_1,<chr>,<chr>
1,rs545409685,C
2,7:27869261_CAGTA_C,C
3,rs73075348,G
4,rs6948467,A
5,rs73075354,G
6,rs35410592,A


In [11]:
# drop any duplicates
asthma_t2d_snp <- asthma_t2d_snp[!duplicated(asthma_t2d_snp$V1), ] 
waist_t2d_snp <- waist_t2d_snp[!duplicated(waist_t2d_snp$V1), ]
waist_asthma_snp <- waist_asthma_snp[!duplicated(waist_asthma_snp$V1), ] 


“attempt to set 'col.names' ignored”
“attempt to set 'col.names' ignored”
“attempt to set 'col.names' ignored”


In [12]:
# save file
write.table(asthma_t2d_snp,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_v_t2d_snps.allele.txt",col.names=FALSE, row.names=FALSE,quote=FALSE)
write.table(waist_t2d_snp,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_v_t2d_snps.allele.txt",col.names=FALSE, row.names=FALSE,quote=FALSE)
write.table(waist_asthma_snp,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_v_asthma_snps.allele.txt",col.names=FALSE, row.names=FALSE,quote=FALSE)


In [None]:
# rs61469546 creates a problem when generating the LD matrix because of a wrong ref. allele


In [1]:
%cd /mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum

/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum

In [2]:
asthma_v_t2d <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_v_t2d_snps.allele.txt")
waist_v_t2d <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_v_t2d_snps.allele.txt")
waist_v_asthma <- read.table("/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_v_asthma_snps.allele.txt")

In [3]:
asthma_v_t2d <- asthma_v_t2d[!(asthma_v_t2d$V1 == "rs61469546"),]
waist_v_t2d <- waist_v_t2d[!(waist_v_t2d$V1 == "rs61469546"),]
waist_v_asthma <- waist_v_asthma[!(waist_v_asthma$V1 == "rs61469546"),]
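A variant like this could be caught before building the LD matrix by comparing each listed effect allele with the two alleles recorded in the reference `.bim` file. A hedged sketch in pandas: the tables, the column names `bim_a1`/`bim_a2` and the rsIDs are hypothetical, and the notebook does not state how the mismatch for rs61469546 was actually detected.

```python
import pandas as pd

# Toy SNP / effect-allele list, like the two-column .allele files.
allele_list = pd.DataFrame({
    "SNP": ["rs1", "rs2", "rs3"],
    "a1": ["C", "G", "A"],
})

# Toy stand-in for the two allele columns of a .bim file.
bim = pd.DataFrame({
    "SNP": ["rs1", "rs2", "rs3"],
    "bim_a1": ["C", "T", "G"],
    "bim_a2": ["T", "G", "T"],
})

merged = allele_list.merge(bim, on="SNP", how="left")

# An effect allele is usable only if it equals one of the two reference alleles.
matches = (merged["a1"] == merged["bim_a1"]) | (merged["a1"] == merged["bim_a2"])
mismatched = merged.loc[~matches, "SNP"].tolist()

print(mismatched)
```

Variants reported this way could then be dropped from the allele list, as done for rs61469546 in the cell above.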


In [5]:
# save file
write.table(asthma_v_t2d,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/asthma_v_t2d_snps.allele.txt",col.names=FALSE, row.names=FALSE,quote=FALSE)
write.table(waist_v_t2d,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_v_t2d_snps.allele.txt",col.names=FALSE, row.names=FALSE,quote=FALSE)
write.table(waist_v_asthma,"/mnt/mfs/statgen/bst2126/pleiotropy/JAZF1_sum/waist_v_asthma_snps.allele.txt",col.names=FALSE, row.names=FALSE,quote=FALSE)


### Checking the updated asthma, T2D and waist circumference REGENIE analysis results

data location: '/mnt/mfs/statgen/UKBiobank/results_pleiotropy/REGENIE_results/results_imputed_data'

In [1]:
import pandas as pd

In [3]:
# reading in the regenie on the imputed data and subsetting regenie data to only keep information within jazf1 region - 7 27868573 28273990
# within jazf1 region - 7 27868573 28273990 of the sumstats files there are 2067 variants for asthma and t2d, and 2068 variants for waist

# asthma data
asthma_regenie = pd.read_csv("/mnt/mfs/statgen/UKBiobank/results_pleiotropy/REGENIE_results/results_imputed_data/asthma_final_step2_imp.regenie_ASTHMA.regenie", sep=" ")
asthma_regenie = asthma_regenie[(asthma_regenie["CHROM"] == 7) & (asthma_regenie["GENPOS"] >= 27868573) & (asthma_regenie["GENPOS"] <= 28273990)][["CHROM", "GENPOS", "ID", "ALLELE0", "ALLELE1", "A1FREQ", "N", "BETA", "SE", "LOG10P"]]

# t2d data
t2d_regenie = pd.read_csv("/mnt/mfs/statgen/UKBiobank/results_pleiotropy/REGENIE_results/results_imputed_data/T2D_final_step2_imp.regenie_T2D.regenie", sep=" ")
t2d_regenie = t2d_regenie[(t2d_regenie["CHROM"] == 7) & (t2d_regenie["GENPOS"] >= 27868573) & (t2d_regenie["GENPOS"] <= 28273990)][["CHROM", "GENPOS", "ID", "ALLELE0", "ALLELE1", "A1FREQ", "N", "BETA", "SE", "LOG10P"]]

# waist circumference data
waist_regenie = pd.read_csv("/mnt/mfs/statgen/UKBiobank/results_pleiotropy/REGENIE_results/results_imputed_data/WC_final_step2_imp.regenie_rankNorm_WAIST.regenie", sep=" ")
waist_regenie = waist_regenie[(waist_regenie["CHROM"] == 7) & (waist_regenie["GENPOS"] >= 27868573) & (waist_regenie["GENPOS"] <= 28273990)][["CHROM", "GENPOS", "ID", "ALLELE0", "ALLELE1", "A1FREQ", "N", "BETA", "SE", "LOG10P"]]

In [4]:
asthma_regenie

Unnamed: 0,CHROM,GENPOS,ID,ALLELE0,ALLELE1,A1FREQ,N,BETA,SE,LOG10P
6070229,7,27869098,rs545409685,C,T,0.997328,361547,0.107914,0.071057,0.889958
6070230,7,27869261,7:27869261_CAGTA_C,C,CAGTA,0.998500,361547,0.045641,0.098320,0.192128
6070231,7,27869377,rs73075348,G,A,0.943334,361547,-0.002170,0.015242,0.052182
6070232,7,27869782,rs6948467,A,G,0.607163,361547,-0.001612,0.007229,0.084324
6070233,7,27869794,rs73075354,G,C,0.883874,361547,0.012535,0.011008,0.593751
...,...,...,...,...,...,...,...,...,...,...
6072291,7,28273623,7:28273623_TTTCCTTCCTTCC_T,T,TTTCCTTCCTTCC,0.842775,361547,0.002123,0.009901,0.080806
6072292,7,28273697,rs188426589,A,T,0.987428,361547,-0.007622,0.032049,0.090438
6072293,7,28273719,rs6944995,G,T,0.146688,361547,0.005589,0.009977,0.240083
6072294,7,28273829,rs192297723,C,A,0.988314,361547,0.022642,0.035026,0.285667


In [5]:
# GSMR uses the columns SNP, A1, A2, freq, b, se, p and N for the analyses
# from current regenie data need to calculate A0FREQ and PVAL

# A0FREQ
get_a0freq = lambda row: 1 - row["A1FREQ"]

asthma_regenie["A0FREQ"] = asthma_regenie.apply(get_a0freq, axis=1)
t2d_regenie["A0FREQ"] = t2d_regenie.apply(get_a0freq, axis=1)
waist_regenie["A0FREQ"] = waist_regenie.apply(get_a0freq, axis=1)

# PVAL
get_pval = lambda row: 10 ** (-row["LOG10P"])

asthma_regenie["PVAL"] = asthma_regenie.apply(get_pval, axis=1)
t2d_regenie["PVAL"] = t2d_regenie.apply(get_pval, axis=1)
waist_regenie["PVAL"] = waist_regenie.apply(get_pval, axis=1)

# also renaming all the columns for merging later on
asthma_regenie = asthma_regenie.rename(columns={"ID":"SNP", "ALLELE0":"A1", "ALLELE1":"A2", "A0FREQ":"freq","N":"N", "BETA":"b", "SE":"se", "PVAL":"p"})
t2d_regenie = t2d_regenie.rename(columns={"ID":"SNP", "ALLELE0":"A1", "ALLELE1":"A2", "A0FREQ":"freq","N":"N", "BETA":"b", "SE":"se", "PVAL":"p"})
waist_regenie = waist_regenie.rename(columns={"ID":"SNP", "ALLELE0":"A1", "ALLELE1":"A2", "A0FREQ":"freq","N":"N", "BETA":"b", "SE":"se", "PVAL":"p"})

# keeping only relevant columns
asthma_regenie = asthma_regenie[["SNP", "A1", "A2", "freq", "b", "se", "p", "N"]]
t2d_regenie = t2d_regenie[["SNP", "A1", "A2", "freq", "b", "se", "p", "N"]]
waist_regenie = waist_regenie[["SNP", "A1", "A2", "freq", "b", "se", "p", "N"]]

In [7]:
asthma_regenie = asthma_regenie.drop_duplicates(subset='SNP', keep="first")
t2d_regenie = t2d_regenie.drop_duplicates(subset='SNP', keep="first")
waist_regenie = waist_regenie.drop_duplicates(subset='SNP', keep="first")

In [11]:
# Save sumstats 

asthma_regenie.to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/gsmr/asthma_gsmr_jazf1.txt", sep="\t", index=False)
t2d_regenie.to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/gsmr/t2d_gsmr_jazf1.txt", sep="\t", index=False)
waist_regenie.to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/gsmr/waist_gsmr_jazf1.txt", sep="\t", index=False)

In [10]:
t2d_regenie

Unnamed: 0,SNP,A1,A2,freq,b,se,p,N
6070229,rs545409685,C,T,0.002672,0.044915,0.099664,0.652233,361547
6070230,7:27869261_CAGTA_C,C,CAGTA,0.001500,0.085851,0.140819,0.542088,361547
6070231,rs73075348,G,A,0.056666,0.017907,0.021423,0.403215,361547
6070232,rs6948467,A,G,0.392837,-0.019676,0.010197,0.053664,361547
6070233,rs73075354,G,C,0.116126,0.037787,0.015558,0.015147,361547
...,...,...,...,...,...,...,...,...
6072291,7:28273623_TTTCCTTCCTTCC_T,T,TTTCCTTCCTTCC,0.157225,-0.006717,0.013967,0.630562,361547
6072292,rs188426589,A,T,0.012572,-0.018526,0.045250,0.682229,361547
6072293,rs6944995,G,T,0.853312,0.012986,0.014084,0.356507,361547
6072294,rs192297723,C,A,0.011686,-0.049966,0.049102,0.308870,361547


In [24]:
asthma_regenie_sig = asthma_regenie[(asthma_regenie["p"] <= 5*10**(-8))]
t2d_regenie_sig = t2d_regenie[(t2d_regenie["p"] <= 5*10**(-8))]
waist_regenie_sig = waist_regenie[(waist_regenie["p"] <= 5*10**(-8))]

In [26]:
len(asthma_regenie_sig)


29

In [27]:

len(t2d_regenie_sig)


43

In [28]:

len(waist_regenie_sig)

125
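The same genome-wide threshold can also be applied on the `LOG10P` scale that regenie reports, which avoids back-transforming extremely small p values. A minimal sketch with made-up values; the toy table and its SNP names are illustrative only:

```python
import math
import pandas as pd

toy = pd.DataFrame({
    "SNP": ["rs1", "rs2", "rs3"],
    "LOG10P": [7.0, 7.5, 400.0],
})

# p <= 5e-8 is equivalent to -log10(p) >= -log10(5e-8), about 7.30.
threshold = -math.log10(5e-8)
sig = toy[toy["LOG10P"] >= threshold]

# Back-transforming a very large LOG10P underflows to 0.0 in double precision.
p_back = 10.0 ** (-toy["LOG10P"])

print(sig)
print(p_back)
```

Both filters select the same rows here; the log-scale version just keeps the ordering of the strongest signals, which would all collapse to zero after back-transformation.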

In [29]:
asthma_regenie_sig.to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/gsmr/asthma_sig_jazf1.txt", sep="\t", index=False)
t2d_regenie_sig.to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/gsmr/t2d_sig_jazf1.txt", sep="\t", index=False)
waist_regenie_sig.to_csv("/mnt/mfs/statgen/bst2126/pleiotropy/gsmr/waist_sig_jazf1.txt", sep="\t", index=False)

In [33]:
asthma_regenie_sig[1:5]

Unnamed: 0,SNP,A1,A2,freq,b,se,p,N
6071623,rs2158624,C,T,0.115878,-0.061688,0.010804,1.358188e-08,361547
6071629,rs57585717,A,G,0.117893,-0.06513,0.010712,1.492313e-09,361547
6071660,7:28154215_GT_G,G,GT,0.208021,-0.063661,0.008629,2.054471e-13,361547
6071673,rs4722758,G,C,0.19947,-0.066247,0.008667,2.796405e-14,361547


In [32]:
waist_regenie_sig

Unnamed: 0,SNP,A1,A2,freq,b,se,p,N
6070358,rs10275982,A,C,0.337575,-0.006749,0.001059,1.855325e-10,360393
6070384,rs35135622,A,T,0.340960,-0.006638,0.001056,3.228791e-10,360393
6070390,rs11411348,GT,G,0.339378,-0.006469,0.001062,1.134175e-09,360393
6070415,rs10239787,T,C,0.327381,-0.006376,0.001064,2.072956e-09,360393
6070422,rs4722745,T,C,0.328131,-0.006424,0.001063,1.536632e-09,360393
...,...,...,...,...,...,...,...,...
6071803,rs11448038,TA,T,0.550241,0.007343,0.001086,1.355189e-11,360393
6071806,rs150139082,TACAC,T,0.747870,0.006524,0.001154,1.589864e-08,360393
6071840,rs527637,C,T,0.839501,0.008695,0.001363,1.775415e-10,360393
6071846,rs498475,A,G,0.633950,0.005884,0.001038,1.440987e-08,360393
