# Rare Variant Association Testing Tutorial

This tutorial demonstrates how to perform rare variant association testing using burden tests and SKAT (Sequence Kernel Association Test). We will walk through the steps of preparing the single-cell data (the same as for [common variant eQTL analysis](./pseudobulk_eqtl.ipynb)), running burden tests, and combining p-values across annotations. Additionally, we will explore SKAT for association testing.

The tutorial illustrates the rare variant association testing procedure for a single cell type (CD8 NC).

To obtain the VEP output files used in this tutorial, see the [variant annotation tutorial](./explore_annotations.ipynb).

## Objectives
- Learn how to prepare genotype and phenotype data for rare variant association testing.
- Understand how to apply burden tests with different weighting schemes.
- Combine p-values across annotations using the Cauchy combination test.
- Perform SKAT for rare variant association testing.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import pandas as pd

In [38]:
import gc
from pathlib import Path
import warnings

import anndata as ad
import scanpy as sc
import dask.array as da
import numpy as np
from tqdm.auto import tqdm

import cellink as cl
from cellink._core import DAnn, GAnn
from cellink.tl._rvat import run_burden_test, run_skat_test, beta_weighting
from cellink.utils import column_normalize, gaussianize
from cellink.at.acat import acat_test

In [37]:
DATA = Path(cl.__file__).parent.parent.parent / "data"
DATA = Path("/s/project/sys_gen_students/2024_2025/project04_rare_variant_sc/input_data")
GENODATA = DATA  # / "OneK1K_imputation_post_qc_r2_08"

gpc_path = GENODATA / "pcdir/wgs.dose.filtered.R2_0.8.filtered.pruned.eigenvec"
adata_path = DATA / "OneK1K_cohort_gene_expression_matrix_14_celltypes.h5ad.gz"
gdata_path = GENODATA / "filter_vcf_r08/chr22.dose.filtered.R2_0.8.vcz"

In [59]:
DATA = Path(cl.__file__).parent.parent.parent / "data"
DATA = Path("/s/project/sys_gen_students/2024_2025/project04_rare_variant_sc/input_data/eqtl_catalog")
GENODATA = DATA  # / "OneK1K_imputation_post_qc_r2_08"

# gpc_path = GENODATA / "pcdir/wgs.dose.filtered.R2_0.8.filtered.pruned.eigenvec"
adata_path = DATA / "onk1k_cellxgene.h5ad"
gdata_path = GENODATA / "sample.vcz"

In [39]:
adata = ad.read_h5ad(adata_path)
adata

AnnData object with n_obs × n_vars = 1272489 × 32738
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'pool', 'individual', 'percent.mt', 'latent', 'nCount_SCT', 'nFeature_SCT', 'cell_type', 'cell_label', 'sex', 'age'
    var: 'GeneSymbol', 'features'

In [60]:
gdata = cl.io.read_sgkit_zarr(gdata_path)

In [63]:
gdata.obs

OneK1K_1
OneK1K_10
OneK1K_100
OneK1K_1000
OneK1K_1001
...
OneK1K_995
OneK1K_996
OneK1K_997
OneK1K_998
OneK1K_999


In [48]:
adata.obs[["cell_type", "predicted.celltype.l2.score", "predicted.celltype.l2"]]

barcode,cell_type,predicted.celltype.l2.score,predicted.celltype.l2
AAACCTGAGAATGTTG-1,"central memory CD4-positive, alpha-beta T cell",0.605378,CD4 TCM
AAACCTGAGAGAACAG-1,natural killer cell,1.000000,NK
AAACCTGAGCATGGCA-1,"naive thymus-derived CD4-positive, alpha-beta ...",0.557355,CD4 Naive
AAACCTGAGTATTGGA-1,"effector memory CD8-positive, alpha-beta T cell",0.359614,CD8 TEM
AAACCTGAGTGTCCCG-1,"effector memory CD8-positive, alpha-beta T cell",0.896674,CD8 TEM
...,...,...,...
TTTGTCATCCGCTGTT-9,transitional stage B cell,0.811524,B intermediate
TTTGTCATCCGTTGTC-9,"central memory CD4-positive, alpha-beta T cell",0.922046,CD4 TCM
TTTGTCATCGCCGTGA-9,"naive thymus-derived CD4-positive, alpha-beta ...",0.763648,CD4 Naive
TTTGTCATCGCGGATC-9,"central memory CD4-positive, alpha-beta T cell",0.865812,CD4 TCM


In [64]:
adata.obs["donor_id"]

barcode
AAACCTGAGAATGTTG-1    691_692
AAACCTGAGAGAACAG-1    693_694
AAACCTGAGCATGGCA-1    688_689
AAACCTGAGTATTGGA-1    683_684
AAACCTGAGTGTCCCG-1    684_685
                       ...   
TTTGTCATCCGCTGTT-9    796_797
TTTGTCATCCGTTGTC-9    800_801
TTTGTCATCGCCGTGA-9    821_822
TTTGTCATCGCGGATC-9    840_841
TTTGTCATCTCGTATT-9    821_822
Name: donor_id, Length: 1248980, dtype: category
Categories (981, object): ['1_1', '2_2', '3_3', '4_4', ..., '1078_1079', '1079_1080', '1080_1081', '1081_1082']

In [54]:
len(adata.obs["cell_type"].value_counts())

29

In [55]:
len(adata.obs["predicted.celltype.l2"].value_counts())

31

In [43]:
n_gpcs = 20
n_epcs = 15
batch_e_pcs_n_top_genes = 2000
chrom = 22
cis_window = 100_000
cell_type = "CD8 NC"
pb_gex_key = f"PB_{cell_type}"  # pseudobulk expression in dd.G.obsm[key_added]
original_donor_col = "individual"
min_percent_donors_expressed = 0.1
celltype_key = "cell_label"
do_debug = False

## Prepare data 

In [6]:
if do_debug:
    adata_path = DATA / "debug_OneK1K_cohort_gene_expression_matrix_14_celltypes.h5ad"

adata = ad.read_h5ad(adata_path)
gdata = cl.io.read_sgkit_zarr(gdata_path)

gene_ann = pd.read_csv(DATA / "gene_annotation.csv").set_index("ensembl_gene_id")
adata.var = pd.concat([adata.var, gene_ann.loc[adata.var.index]], axis=1).rename(
    columns={
        "start_position": GAnn.start,
        "end_position": GAnn.end,
        "chromosome_name": GAnn.chrom,
    }
)
adata.obs[DAnn.donor] = adata.obs[original_donor_col]
adata

AnnData object with n_obs × n_vars = 1272489 × 32738
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'pool', 'individual', 'percent.mt', 'latent', 'nCount_SCT', 'nFeature_SCT', 'cell_type', 'cell_label', 'sex', 'age', 'donor_id'
    var: 'GeneSymbol', 'features', 'start', 'end', 'chrom', 'strand', 'description', 'wikigene_name', 'wikigene_id'

In [7]:
dd = cl.DonorData(G=gdata, C=adata).copy()  # copy to avoid view warnings
dd



### Preprocessing Single-Cell Data
We normalize and log-transform the expression data, compute highly variable genes, and perform PCA to extract expression principal components (ePCs). 

In [8]:
sc.pp.normalize_total(dd.C)
sc.pp.log1p(dd.C)
sc.pp.normalize_total(dd.C)

# Expression PCs are computed on donor-level pseudobulk means across all cell types,
# i.e. before subsetting to a single cell type.
mdata = sc.get.aggregate(dd.C, by=DAnn.donor, func="mean")
mdata.X = mdata.layers.pop("mean")

sc.pp.highly_variable_genes(mdata, n_top_genes=batch_e_pcs_n_top_genes)
sc.tl.pca(mdata, n_comps=n_epcs)

dd.G.obsm["ePCs"] = mdata[dd.G.obs_names].obsm["X_pca"]

In [9]:
dd = dd[..., dd.C.obs[celltype_key] == cell_type, :].copy()
dd



In [10]:
gc.collect()

999

In [11]:
dd.aggregate(key_added=pb_gex_key, sync_var=True, verbose=True)
dd.aggregate(obs=["sex", "age"], func="first", add_to_obs=True)
dd

[2025-04-22 16:36:00,837] INFO:cellink._core.donordata: Aggregated X to PB_CD8 NC
[2025-04-22 16:36:00,838] INFO:cellink._core.donordata: Observation found for 981 donors.




In [12]:
dd.G.obsm[pb_gex_key].shape

(981, 32738)

In [13]:
gpcs = pd.read_csv(gpc_path, sep=r"\s+", index_col=1, header=None).drop(columns=0)
dd.G.obsm["gPCs"] = gpcs.loc[dd.G.obs_names].iloc[:, :n_gpcs]

In [14]:
print(f"{pb_gex_key} shape:", dd.G.obsm[pb_gex_key].shape)
print("dd.shape:", dd.shape)

keep_genes = ((dd.G.obsm[pb_gex_key] > 0).mean(axis=0) >= min_percent_donors_expressed).values
dd = dd[..., keep_genes]
print("after filtering")
print(f"{pb_gex_key} shape:", dd.G.obsm[pb_gex_key].shape)
print("dd.shape:", dd.shape)

PB_CD8 NC shape: (981, 32738)
dd.shape: (981, 143083, 133482, 32738)
after filtering
PB_CD8 NC shape: (981, 14119)
dd.shape: (981, 143083, 133482, 14119)


In [16]:
# alternative to dd[:, dd.G.var.chrom == str(chrom), :, dd.C.var.chrom == str(chrom)]
dd = dd.sel(G_var=dd.G.var.chrom == str(chrom), C_var=dd.C.var.chrom == str(chrom)).copy()
dd



### Adding variant annotations to `dd`
We use VEP (Variant Effect Predictor) annotations to add functional information to the variants (as explained [here](./explore_annotations.ipynb)). The raw annotations are stored in `dd.G.uns["variant_annotation_vep"]`; they are then aggregated to one row per variant and stored in `dd.G.varm["variant_annotation"]`.
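A variant can be annotated once per overlapping gene or transcript, so some rule is needed to collapse several rows into one. The following is a minimal pandas sketch of that idea on a toy table. The values are made up, and the `max` rule is shown only as a possible alternative to `first`; whether cellink's `aggregate_annotations_for_varm` supports it is an assumption to check against the library.

```python
import pandas as pd

# Toy per-transcript annotations (illustrative values, not taken from the VEP file).
annos = pd.DataFrame(
    {
        "snp_id": ["v1", "v1", "v2"],
        "gene_id": ["geneA", "geneB", "geneA"],
        "CADD_RAW": [0.2, 1.5, -0.1],
        "Consequence_missense_variant": [0, 1, 0],
    }
)

numeric_cols = ["CADD_RAW", "Consequence_missense_variant"]

# "first": keep the first row encountered for each variant.
agg_first = annos.groupby("snp_id")[numeric_cols].first()

# "max": keep the largest value per column across all rows of a variant.
agg_max = annos.groupby("snp_id")[numeric_cols].max()
```

With `first`, the result depends on row order in the annotation file, whereas `max` keeps the highest score regardless of order.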

In [17]:
vep_annotation_file = "/data/nasif12/home_if12/hoev/git/sc-genetics/tests/data/variants_vep_annotated_all_ch22.txt"

In [18]:
cl.tl.add_vep_annos_to_gdata(vep_anno_file=vep_annotation_file, gdata=dd.G, dummy_consequence=True)
dd.G.uns["variant_annotation_vep"]

[2025-04-22 16:36:02,739] INFO:cellink.tl._annotate_snps_genotype_data: Preparing VEP annotations for addition to gdata
[2025-04-22 16:36:02,740] INFO:cellink.tl._annotate_snps_genotype_data: Reading annotation file /data/nasif12/home_if12/hoev/git/sc-genetics/tests/data/variants_vep_annotated_all_ch22.txt
[2025-04-22 16:36:03,154] INFO:cellink.tl._annotate_snps_genotype_data: Annotation file loaded
[2025-04-22 16:36:03,179] INFO:cellink.tl._annotate_snps_genotype_data: Annotation columns: ['snp_id', 'Location', 'Allele', 'gene_id', 'transcript_id', 'Feature_type', 'Consequence', 'cDNA_position', 'CDS_position', 'Protein_position', 'Amino_acids', 'Codons', 'Existing_variation', 'IMPACT', 'DISTANCE', 'STRAND', 'FLAGS', 'BIOTYPE', 'CANONICAL', 'ENSP', 'SIFT', 'PolyPhen', 'gnomADe_AF', 'gnomADe_AFR_AF', 'gnomADe_AMR_AF', 'gnomADe_ASJ_AF', 'gnomADe_EAS_AF', 'gnomADe_FIN_AF', 'gnomADe_NFE_AF', 'gnomADe_OTH_AF', 'gnomADe_SAS_AF', 'CLIN_SIG', 'SOMATIC', 'PHENO', 'CADD_PHRED', 'CADD_RAW', 'TSS

snp_id,gene_id,transcript_id,Consequence_3_prime_UTR_variant,Consequence_5_prime_UTR_variant,Consequence_NMD_transcript_variant,Consequence_coding_sequence_variant,Consequence_downstream_gene_variant,Consequence_incomplete_terminal_codon_variant,Consequence_intergenic_variant,Consequence_intron_variant,Consequence_mature_miRNA_variant,Consequence_missense_variant,...,ENSP,BIOTYPE,SIFT,gnomADe_OTH_AF,CADD_RAW,gnomADe_ASJ_AF,DISTANCE,CADD_PHRED,STRAND,CDS_position
22_16849573_A_G,-,-,0,0,0,0,0,0,1,0,0,0,...,-,-,,,0.433139,,,8.747,,-
22_16849971_A_T,-,-,0,0,0,0,0,0,1,0,0,0,...,-,-,,,0.442607,,,8.843,,-
22_16850437_G_A,-,-,0,0,0,0,0,0,1,0,0,0,...,-,-,,,0.369731,,,8.063,,-
22_16851225_C_T,-,-,0,0,0,0,0,0,1,0,0,0,...,-,-,,,0.393139,,,8.324,,-
22_16851356_C_T,-,-,0,0,0,0,0,0,1,0,0,0,...,-,-,,,0.377289,,,8.148,,-
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
22_51211031_A_G,ENSG00000184319,ENST00000496652,0,0,0,0,0,0,0,1,0,0,...,-,processed_transcript,,,0.153297,,,5.127,1.0,-
22_51213613_C_T,ENSG00000184319,ENST00000496652,0,0,0,0,0,0,0,1,0,0,...,-,processed_transcript,,,-0.394121,,,0.190,1.0,-
22_51213613_C_T,ENSG00000079974,ENST00000395593,0,0,0,0,0,0,0,1,0,0,...,ENSP00000378958,protein_coding,,,-0.394121,,,0.190,-1.0,-
22_51216564_T_C,ENSG00000184319,ENST00000496652,0,0,0,0,0,0,0,1,0,0,...,-,processed_transcript,,,-0.113869,,,1.282,1.0,-


In [19]:
cl.tl.aggregate_annotations_for_varm(
    dd.G, "variant_annotation_vep", agg_type="first", return_data=True
)  # TODO change agg type

[2025-04-22 16:36:06,228] INFO:cellink.tl._annotate_snps_genotype_data: Aggregating using method: first


snp_id,gene_id,transcript_id,Consequence_3_prime_UTR_variant,Consequence_5_prime_UTR_variant,Consequence_NMD_transcript_variant,Consequence_coding_sequence_variant,Consequence_downstream_gene_variant,Consequence_incomplete_terminal_codon_variant,Consequence_intergenic_variant,Consequence_intron_variant,...,ENSP,BIOTYPE,SIFT,gnomADe_OTH_AF,CADD_RAW,gnomADe_ASJ_AF,DISTANCE,CADD_PHRED,STRAND,CDS_position
22_16849573_A_G,-,-,0,0,0,0,0,0,1,0,...,-,-,,,0.433139,,,8.747,,-
22_16849971_A_T,-,-,0,0,0,0,0,0,1,0,...,-,-,,,0.442607,,,8.843,,-
22_16850437_G_A,-,-,0,0,0,0,0,0,1,0,...,-,-,,,0.369731,,,8.063,,-
22_16851225_C_T,-,-,0,0,0,0,0,0,1,0,...,-,-,,,0.393139,,,8.324,,-
22_16851356_C_T,-,-,0,0,0,0,0,0,1,0,...,-,-,,,0.377289,,,8.148,,-
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
22_51202748_A_G,ENSG00000184319,ENST00000496652,0,0,0,0,0,0,0,1,...,-,processed_transcript,,,0.550962,,,9.870,1.0,-
22_51208568_G_T,ENSG00000184319,ENST00000496652,0,0,0,0,0,0,0,1,...,-,processed_transcript,,,0.102244,,,4.299,1.0,-
22_51211031_A_G,ENSG00000079974,ENST00000395593,0,0,0,0,0,0,0,1,...,ENSP00000378958,protein_coding,,,0.153297,,,5.127,-1.0,-
22_51213613_C_T,ENSG00000184319,ENST00000496652,0,0,0,0,0,0,0,1,...,-,processed_transcript,,,-0.394121,,,0.190,1.0,-


In [20]:
dd.G.varm["variant_annotation"].columns

Index(['gene_id', 'transcript_id', 'Consequence_3_prime_UTR_variant',
       'Consequence_5_prime_UTR_variant', 'Consequence_NMD_transcript_variant',
       'Consequence_coding_sequence_variant',
       'Consequence_downstream_gene_variant',
       'Consequence_incomplete_terminal_codon_variant',
       'Consequence_intergenic_variant', 'Consequence_intron_variant',
       'Consequence_mature_miRNA_variant', 'Consequence_missense_variant',
       'Consequence_non_coding_transcript_exon_variant',
       'Consequence_non_coding_transcript_variant',
       'Consequence_splice_acceptor_variant',
       'Consequence_splice_donor_5th_base_variant',
       'Consequence_splice_donor_region_variant',
       'Consequence_splice_donor_variant',
       'Consequence_splice_polypyrimidine_tract_variant',
       'Consequence_splice_region_variant', 'Consequence_start_lost',
       'Consequence_stop_gained', 'Consequence_stop_retained_variant',
       'Consequence_synonymous_variant', 'Consequence_ups

In [21]:
dd.G.uns["variant_annotation_vep"]["CADD_RAW"].describe()

count    178597.000000
mean         -0.000616
std           0.407068
min          -2.286863
25%          -0.202241
50%          -0.040239
75%           0.123338
max           7.595543
Name: CADD_RAW, dtype: float64

In [22]:
dd.G.uns["variant_annotation_vep"]["TSSDistance"].describe()

count    24466.000000
mean      2536.080724
std       1454.770799
min          1.000000
25%       1272.000000
50%       2561.000000
75%       3803.000000
max       5000.000000
Name: TSSDistance, dtype: float64

## Run the burden test

Burden tests aggregate the rare variants within a gene or region into a single score per donor and test that score for association with a phenotype (here, cell-type-specific pseudobulk expression). We use different weighting schemes, including:
- **CADD_RAW**: Raw CADD scores.
- **maf_beta**: Beta weighting based on minor allele frequency (MAF).
- **tss_distance**: Distance to the transcription start site (TSS).
- **tss_distance_exp**: Exponential weighting based on TSS distance.
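
The idea behind a weighted burden test can be sketched in plain NumPy. This is a simplified illustration on simulated data, not the implementation behind `run_burden_test`: the weights are arbitrary, the p-value uses a normal approximation to the t-statistic, and all variable names are chosen for this example only.

```python
import math

import numpy as np

rng = np.random.default_rng(0)
n_donors, n_variants = 500, 8

# Simulated rare-variant dosages (0/1/2) and arbitrary positive per-variant weights.
G = rng.binomial(2, 0.01, size=(n_donors, n_variants)).astype(float)
w = rng.uniform(0.5, 2.0, size=n_variants)

# Burden score per donor: weighted sum of dosages (the "sum" aggregation).
burden = G @ w

# Covariates (intercept + one covariate) and a phenotype with a true burden effect.
F = np.column_stack([np.ones(n_donors), rng.normal(size=n_donors)])
y = 0.8 * burden + 0.3 * F[:, 1] + rng.normal(size=n_donors)

# Ordinary least squares of y on [F, burden].
X = np.column_stack([F, burden])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
sigma2 = resid @ resid / (n_donors - X.shape[1])
cov = sigma2 * np.linalg.inv(X.T @ X)

beta = coef[-1]  # estimated effect of the burden score
beta_se = math.sqrt(cov[-1, -1])
z = beta / beta_se
# Two-sided p-value from a normal approximation to the t-statistic.
pv = math.erfc(abs(z) / math.sqrt(2))
```

Changing `w` changes which variants dominate the burden score, which is why the same gene can yield different p-values under different annotations.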



In [23]:
burden_agg_fct = "sum"
run_lrt = True
annotation_cols = ["CADD_RAW", "maf_beta", "tss_distance", "tss_distance_exp"]
rare_maf_threshold = 0.01

### Filtering for Rare Variants
We keep only variants whose minor allele frequency (MAF) is below a specified threshold (here 0.01) and discard all others, so that the analysis focuses on rare variants.
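
As a minimal sketch of this filter on a plain NumPy dosage matrix, assuming diploid genotypes coded as alternate-allele counts (0, 1, 2). The toy matrix, the toy threshold, and the variable names are illustrative and not part of cellink.

```python
import numpy as np

# Toy dosage matrix: 6 donors x 3 variants, entries are alt-allele counts.
G = np.array(
    [
        [0, 1, 2],
        [0, 0, 2],
        [1, 1, 2],
        [0, 0, 1],
        [0, 1, 2],
        [0, 0, 2],
    ]
)

# Alternate-allele frequency: total alt alleles / (2 * number of donors).
alt_af = G.sum(axis=0) / (2 * G.shape[0])
# The minor allele frequency folds this onto the rarer of the two alleles.
maf = np.minimum(alt_af, 1 - alt_af)

toy_threshold = 0.1  # illustrative; the tutorial uses rare_maf_threshold = 0.01
is_rare = maf < toy_threshold
G_rare = G[:, is_rare]
```

Note that the third variant is kept even though its alternate allele is common: its reference allele is the rare one, so its MAF is low.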

In [24]:
dd = dd.sel(G_var=dd.G.var.maf < rare_maf_threshold).copy()
dd



### Custom MAF Weights

We add custom MAF weights commonly used in burden tests, such as `Beta(MAF, 1, 25)`, which up-weight rarer variants. TSS-distance weights, as used in the SAIGE-QTL paper, are added per gene inside the test loop.
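
To make the weighting concrete, the sketch below evaluates the Beta(1, 25) probability density at a few MAF values. `beta_density_weights` is a hypothetical helper written for this example; cellink's `beta_weighting` may differ in defaults or normalization.

```python
import math

import numpy as np


def beta_density_weights(maf, a=1.0, b=25.0):
    """Beta(a, b) probability density evaluated at each MAF (requires 0 < MAF < 1)."""
    maf = np.asarray(maf, dtype=float)
    # Log of the normalizing constant 1 / B(a, b).
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return np.exp(log_norm + (a - 1) * np.log(maf) + (b - 1) * np.log1p(-maf))


mafs = np.array([0.0005, 0.005, 0.05])
weights = beta_density_weights(mafs)
```

For `a = 1` and `b = 25` the density reduces to `25 * (1 - MAF) ** 24`, so the weight decreases monotonically as the MAF grows.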

In [27]:
dd.G.varm["variant_annotation"]["maf_beta"] = beta_weighting(dd.G.var["maf"])

### Run burden tests using each annotation individually for weighting


In [None]:
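# Covariates / fixed effects: intercept, sex, age, genotype PCs, expression PCs
F = np.concatenate(
    [
        np.ones((dd.shape[0], 1)),  # intercept
        dd.G.obs[["sex"]].values - 1,  # sex, shifted by 1 (assumes 1/2 coding -> 0/1)
        dd.G.obs[["age"]].values,
        dd.G.obsm["gPCs"].values,
        dd.G.obsm["ePCs"],
    ],
    axis=1,
)
# Normalize every covariate column except the intercept and sex
F[:, 2:] = column_normalize(F[:, 2:])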

In [None]:
results = []
if isinstance(dd.G.X, da.Array | ad._core.views.DaskArrayView):
    if dd.G.is_view:
        dd._G = dd._G.copy()
    dd.G.X = dd.G.X.compute()

if do_debug:
    warnings.filterwarnings("ignore", category=RuntimeWarning)

for gene, row in tqdm(dd.C.var.iterrows(), total=dd.shape[3]):
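    # Gaussianize the pseudobulk expression of this gene; tiny random jitter breaks ties
    Y = gaussianize(dd.G.obsm[pb_gex_key][[gene]].values + 1e-5 * np.random.randn(dd.shape[0], 1))

    # Select variants inside the cis window around the gene body
    start = max(0, row.start - cis_window)
    end = row.end + cis_window
    _G = dd.G[:, (dd.G.var.pos < end)]
    _G = _G[:, (_G.var.pos > start)]
    # Drop monomorphic variants (zero variance across donors)
    _G = _G[:, (_G.X.std(0) != 0)]
    _G = _G.copy()

    # Per-gene TSS-distance weights
    # TODO make strand aware (row.start is currently used as the TSS for both strands)
    _G.varm["variant_annotation"]["tss_distance"] = np.abs(row.start - _G.var["pos"])
    _G.varm["variant_annotation"]["tss_distance_exp"] = np.exp(-1e-5 * _G.varm["variant_annotation"]["tss_distance"])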

    rdf = run_burden_test(
        _G, Y, F, gene, annotation_cols=annotation_cols, burden_agg_fct=burden_agg_fct, run_lrt=run_lrt
    )
    results.append(rdf)

rdf = pd.concat(results)
rdf

100%|██████████| 360/360 [00:12<00:00, 29.54it/s]


Unnamed: 0,burden_gene,egene,weight_col,burden_agg_fct,pv,beta,betaste,lrt
0,ENSG00000100181,ENSG00000100181,CADD_RAW,sum,0.289885,1.213162e+00,1.146252e+00,1.120153
1,ENSG00000100181,ENSG00000100181,maf_beta,sum,0.025325,-9.702817e-03,4.338561e-03,5.001546
2,ENSG00000100181,ENSG00000100181,tss_distance,sum,0.094869,-2.169492e-06,1.298900e-06,2.789745
3,ENSG00000100181,ENSG00000100181,tss_distance_exp,sum,0.032786,-3.156013e-01,1.478437e-01,4.556923
0,ENSG00000237438,ENSG00000237438,CADD_RAW,sum,0.164809,8.317045e-02,5.987450e-02,1.929542
...,...,...,...,...,...,...,...,...
3,ENSG00000100299,ENSG00000100299,tss_distance_exp,sum,0.811293,-4.636869e-03,1.942074e-02,0.057006
0,ENSG00000079974,ENSG00000079974,CADD_RAW,sum,0.134451,-3.387797e-01,2.263393e-01,2.240345
1,ENSG00000079974,ENSG00000079974,maf_beta,sum,0.297102,-2.105266e-03,2.019116e-03,1.087155
2,ENSG00000079974,ENSG00000079974,tss_distance,sum,0.272661,-8.895558e-07,8.109324e-07,1.203309


### Combine p-values per gene across annotations using ACAT

We use the ACAT test to combine p-values across different annotations for each gene. This approach provides a single p-value per gene, accounting for multiple annotations.
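
For intuition, the Cauchy combination statistic can be written in a few lines of NumPy. This is a minimal sketch with equal weights by default; `cauchy_combination` is an illustrative helper and not necessarily how cellink's `acat_test` is implemented (for example, it has no special handling of extremely small p-values).

```python
import numpy as np


def cauchy_combination(pvalues, weights=None):
    """Combine p-values with the Cauchy combination (ACAT) statistic."""
    p = np.asarray(pvalues, dtype=float)
    if weights is None:
        weights = np.full(p.shape, 1.0 / p.size)
    else:
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()
    # Map each p-value to a standard Cauchy variate and take the weighted sum,
    # which is again approximately standard Cauchy under the null.
    t = np.sum(weights * np.tan((0.5 - p) * np.pi))
    # Survival function of the standard Cauchy distribution.
    return 0.5 - np.arctan(t) / np.pi


# p-values of the first gene in the burden results (one per annotation)
example_pvs = [0.289885, 0.025325, 0.094869, 0.032786]
combined_pv = cauchy_combination(example_pvs)
```

Because the Cauchy distribution is heavy-tailed, the combined p-value is driven mostly by the smallest inputs and stays approximately valid even when the per-annotation tests are correlated.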

In [34]:
combined = rdf.dropna(subset=["pv"]).groupby("egene")["pv"].agg(lambda x: acat_test(x.values)).reset_index()
combined.sort_values("pv")

Unnamed: 0,egene,pv
141,ENSG00000100376,0.000088
88,ENSG00000100219,0.000254
215,ENSG00000183172,0.001849
78,ENSG00000100154,0.005039
287,ENSG00000212939,0.005233
...,...,...
110,ENSG00000100294,0.984078
216,ENSG00000183473,0.985963
251,ENSG00000185838,0.986744
307,ENSG00000233903,0.988543


## SKAT tests
At the moment, only the default weighting, `Beta(MAF, 1, 25)`, is supported.
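
For intuition, the SKAT statistic is a weighted sum of squared per-variant score statistics, `Q = r' G W^2 G' r`, where `r` are the phenotype residuals after regressing out covariates and `W` is the diagonal matrix of variant weights. The sketch below computes `Q` on simulated null data and, as a simple stand-in for the analytic mixture-of-chi-square p-value, uses a permutation null. It is an illustration under these assumptions, not the procedure used by `run_skat_test`.

```python
import numpy as np

rng = np.random.default_rng(1)
n_donors, n_variants = 300, 10

# Simulated rare variants and Beta(MAF; 1, 25) density weights.
maf = rng.uniform(0.002, 0.01, size=n_variants)
G = rng.binomial(2, maf, size=(n_donors, n_variants)).astype(float)
w = 25.0 * (1.0 - maf) ** 24

# Covariates and a phenotype simulated under the null (no genetic effect).
F = np.column_stack([np.ones(n_donors), rng.normal(size=n_donors)])
y = rng.normal(size=n_donors)

# Residuals after regressing out the covariates.
coef, *_ = np.linalg.lstsq(F, y, rcond=None)
resid = y - F @ coef


def skat_q(residuals, G, w):
    """SKAT statistic Q = r' G W^2 G' r with W = diag(w)."""
    score = (G * w).T @ residuals  # weighted per-variant scores
    return float(score @ score)


q_obs = skat_q(resid, G, w)

# Permutation null: shuffle residuals across donors (a simplification that
# ignores any dependence between genotypes and covariates).
n_perm = 999
q_perm = np.array([skat_q(rng.permutation(resid), G, w) for _ in range(n_perm)])
pv_perm = (1 + np.sum(q_perm >= q_obs)) / (n_perm + 1)
```

Unlike the burden test, `Q` squares each variant's score before summing, so variants with opposite effect directions do not cancel each other out.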

In [None]:
results = []

for gene, row in tqdm(dd.C.var.iterrows(), total=dd.shape[3]):
    Y = gaussianize(dd.G.obsm[pb_gex_key][[gene]].values + 1e-5 * np.random.randn(dd.shape[0], 1))

    start = max(0, row.start - cis_window)
    end = row.end + cis_window
    _G = dd.G[:, (dd.G.var.pos < end)]
    _G = _G[:, (_G.var.pos > start)]
    _G = _G[:, (_G.X.std(0) != 0)]

    rdict = run_skat_test(_G, Y, F, gene)
    results.append(rdict)

rdf = pd.DataFrame(results)
rdf

  0%|          | 0/360 [00:00<?, ?it/s]

[2025-04-22 16:06:29,662] INFO:root: Starting optimization ...
[2025-04-22 16:06:29,691] INFO:root: Starting optimization ...


...

 47%|████▋     | 170/360 [00:08<00:08, 22.38it/s]

[2025-04-22 16:06:37,913] INFO:root: Starting optimization ...
[2025-04-22 16:06:37,957] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,005] INFO:root: Starting optimization ...


 48%|████▊     | 173/360 [00:08<00:08, 21.74it/s]

[2025-04-22 16:06:38,061] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,117] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,167] INFO:root: Starting optimization ...


 49%|████▉     | 176/360 [00:08<00:08, 20.80it/s]

[2025-04-22 16:06:38,218] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,261] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,295] INFO:root: Starting optimization ...


 50%|████▉     | 179/360 [00:08<00:08, 21.82it/s]

[2025-04-22 16:06:38,339] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,373] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,416] INFO:root: Starting optimization ...


 51%|█████     | 182/360 [00:08<00:08, 21.09it/s]

[2025-04-22 16:06:38,494] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,538] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,586] INFO:root: Starting optimization ...


 51%|█████▏    | 185/360 [00:08<00:08, 21.22it/s]

[2025-04-22 16:06:38,634] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,687] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,731] INFO:root: Starting optimization ...


 52%|█████▏    | 188/360 [00:09<00:08, 21.21it/s]

[2025-04-22 16:06:38,775] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,836] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,881] INFO:root: Starting optimization ...


 53%|█████▎    | 191/360 [00:09<00:08, 20.84it/s]

[2025-04-22 16:06:38,923] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,958] INFO:root: Starting optimization ...
[2025-04-22 16:06:38,995] INFO:root: Starting optimization ...


 54%|█████▍    | 194/360 [00:09<00:07, 22.54it/s]

[2025-04-22 16:06:39,030] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,071] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,106] INFO:root: Starting optimization ...


 55%|█████▍    | 197/360 [00:09<00:06, 23.76it/s]

[2025-04-22 16:06:39,141] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,178] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,215] INFO:root: Starting optimization ...


 56%|█████▌    | 200/360 [00:09<00:06, 24.60it/s]

[2025-04-22 16:06:39,254] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,301] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,342] INFO:root: Starting optimization ...


 56%|█████▋    | 203/360 [00:09<00:06, 23.89it/s]

[2025-04-22 16:06:39,388] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,428] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,474] INFO:root: Starting optimization ...


 57%|█████▋    | 206/360 [00:09<00:06, 23.58it/s]

[2025-04-22 16:06:39,521] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,572] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,617] INFO:root: Starting optimization ...


 58%|█████▊    | 209/360 [00:09<00:06, 22.73it/s]

[2025-04-22 16:06:39,662] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,710] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,752] INFO:root: Starting optimization ...


 59%|█████▉    | 212/360 [00:10<00:06, 22.98it/s]

[2025-04-22 16:06:39,789] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,828] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,861] INFO:root: Starting optimization ...


 60%|█████▉    | 215/360 [00:10<00:06, 24.04it/s]

[2025-04-22 16:06:39,901] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,939] INFO:root: Starting optimization ...
[2025-04-22 16:06:39,981] INFO:root: Starting optimization ...


 61%|██████    | 218/360 [00:10<00:05, 24.15it/s]

[2025-04-22 16:06:40,025] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,068] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,110] INFO:root: Starting optimization ...


 61%|██████▏   | 221/360 [00:10<00:05, 23.97it/s]

[2025-04-22 16:06:40,152] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,196] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,241] INFO:root: Starting optimization ...


 62%|██████▏   | 224/360 [00:10<00:05, 23.65it/s]

[2025-04-22 16:06:40,282] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,324] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,371] INFO:root: Starting optimization ...


 63%|██████▎   | 227/360 [00:10<00:05, 23.20it/s]

[2025-04-22 16:06:40,418] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,464] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,512] INFO:root: Starting optimization ...


 64%|██████▍   | 230/360 [00:10<00:05, 22.34it/s]

[2025-04-22 16:06:40,564] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,607] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,663] INFO:root: Starting optimization ...


 65%|██████▍   | 233/360 [00:11<00:05, 21.76it/s]

[2025-04-22 16:06:40,710] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,766] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,815] INFO:root: Starting optimization ...


 66%|██████▌   | 236/360 [00:11<00:05, 20.87it/s]

[2025-04-22 16:06:40,868] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,917] INFO:root: Starting optimization ...
[2025-04-22 16:06:40,965] INFO:root: Starting optimization ...


 66%|██████▋   | 239/360 [00:11<00:05, 20.85it/s]

[2025-04-22 16:06:41,013] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,063] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,108] INFO:root: Starting optimization ...


 67%|██████▋   | 242/360 [00:11<00:05, 20.95it/s]

[2025-04-22 16:06:41,155] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,237] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,280] INFO:root: Starting optimization ...


 68%|██████▊   | 245/360 [00:11<00:05, 19.83it/s]

[2025-04-22 16:06:41,323] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,370] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,443] INFO:root: Starting optimization ...


 69%|██████▉   | 248/360 [00:11<00:05, 19.29it/s]

[2025-04-22 16:06:41,489] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,533] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,580] INFO:root: Starting optimization ...


 70%|██████▉   | 251/360 [00:11<00:05, 20.24it/s]

[2025-04-22 16:06:41,620] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,674] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,724] INFO:root: Starting optimization ...


 71%|███████   | 254/360 [00:12<00:05, 20.03it/s]

[2025-04-22 16:06:41,774] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,819] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,872] INFO:root: Starting optimization ...


 71%|███████▏  | 257/360 [00:12<00:05, 19.93it/s]

[2025-04-22 16:06:41,927] INFO:root: Starting optimization ...
[2025-04-22 16:06:41,983] INFO:root: Starting optimization ...
[2025-04-22 16:06:42,036] INFO:root: Starting optimization ...


 72%|███████▏  | 260/360 [00:12<00:05, 19.49it/s]

[2025-04-22 16:06:42,088] INFO:root: Starting optimization ...
[2025-04-22 16:06:42,155] INFO:root: Starting optimization ...


 73%|███████▎  | 262/360 [00:12<00:05, 18.56it/s]

[2025-04-22 16:06:42,214] INFO:root: Starting optimization ...
[2025-04-22 16:06:42,273] INFO:root: Starting optimization ...


 73%|███████▎  | 264/360 [00:12<00:05, 18.16it/s]

[2025-04-22 16:06:42,331] INFO:root: Starting optimization ...
[2025-04-22 16:06:42,399] INFO:root: Starting optimization ...


 74%|███████▍  | 266/360 [00:12<00:05, 17.50it/s]

[2025-04-22 16:06:42,457] INFO:root: Starting optimization ...
[2025-04-22 16:06:42,517] INFO:root: Starting optimization ...


 74%|███████▍  | 268/360 [00:12<00:05, 17.11it/s]

[2025-04-22 16:06:42,581] INFO:root: Starting optimization ...
[2025-04-22 16:06:42,624] INFO:root: Starting optimization ...
[2025-04-22 16:06:42,667] INFO:root: Starting optimization ...


 75%|███████▌  | 271/360 [00:13<00:04, 18.75it/s]

[2025-04-22 16:06:42,714] INFO:root: Starting optimization ...
[2025-04-22 16:06:42,755] INFO:root: Starting optimization ...
[2025-04-22 16:06:42,800] INFO:root: Starting optimization ...


 76%|███████▌  | 274/360 [00:13<00:04, 19.93it/s]

[2025-04-22 16:06:42,848] INFO:root: Starting optimization ...
[2025-04-22 16:06:42,898] INFO:root: Starting optimization ...


 77%|███████▋  | 276/360 [00:13<00:04, 19.80it/s]

[2025-04-22 16:06:42,950] INFO:root: Starting optimization ...
[2025-04-22 16:06:42,993] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,044] INFO:root: Starting optimization ...


 78%|███████▊  | 279/360 [00:13<00:03, 20.40it/s]

[2025-04-22 16:06:43,089] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,136] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,205] INFO:root: Starting optimization ...


 78%|███████▊  | 282/360 [00:13<00:03, 19.68it/s]

[2025-04-22 16:06:43,253] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,298] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,346] INFO:root: Starting optimization ...


 79%|███████▉  | 285/360 [00:13<00:03, 20.46it/s]

[2025-04-22 16:06:43,387] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,422] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,473] INFO:root: Starting optimization ...


 80%|████████  | 288/360 [00:13<00:03, 20.81it/s]

[2025-04-22 16:06:43,526] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,575] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,621] INFO:root: Starting optimization ...


 81%|████████  | 291/360 [00:14<00:03, 20.64it/s]

[2025-04-22 16:06:43,676] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,742] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,796] INFO:root: Starting optimization ...


 82%|████████▏ | 294/360 [00:14<00:03, 19.49it/s]

[2025-04-22 16:06:43,847] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,895] INFO:root: Starting optimization ...
[2025-04-22 16:06:43,944] INFO:root: Starting optimization ...


 82%|████████▎ | 297/360 [00:14<00:03, 19.75it/s]

[2025-04-22 16:06:43,995] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,041] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,089] INFO:root: Starting optimization ...


 83%|████████▎ | 300/360 [00:14<00:02, 20.39it/s]

[2025-04-22 16:06:44,131] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,175] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,232] INFO:root: Starting optimization ...


 84%|████████▍ | 303/360 [00:14<00:02, 19.77it/s]

[2025-04-22 16:06:44,294] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,349] INFO:root: Starting optimization ...


 85%|████████▍ | 305/360 [00:14<00:02, 19.36it/s]

[2025-04-22 16:06:44,405] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,463] INFO:root: Starting optimization ...


 85%|████████▌ | 307/360 [00:14<00:02, 19.34it/s]

[2025-04-22 16:06:44,509] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,566] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,604] INFO:root: Starting optimization ...


 86%|████████▌ | 310/360 [00:14<00:02, 20.05it/s]

[2025-04-22 16:06:44,646] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,685] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,730] INFO:root: Starting optimization ...


 87%|████████▋ | 313/360 [00:15<00:02, 20.92it/s]

[2025-04-22 16:06:44,777] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,820] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,858] INFO:root: Starting optimization ...


 88%|████████▊ | 316/360 [00:15<00:01, 22.02it/s]

[2025-04-22 16:06:44,897] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,934] INFO:root: Starting optimization ...
[2025-04-22 16:06:44,982] INFO:root: Starting optimization ...


 89%|████████▊ | 319/360 [00:15<00:01, 21.94it/s]

[2025-04-22 16:06:45,038] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,098] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,146] INFO:root: Starting optimization ...


 89%|████████▉ | 322/360 [00:15<00:01, 21.16it/s]

[2025-04-22 16:06:45,193] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,292] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,337] INFO:root: Starting optimization ...


 90%|█████████ | 325/360 [00:15<00:01, 18.13it/s]

[2025-04-22 16:06:45,409] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,452] INFO:root: Starting optimization ...


 91%|█████████ | 327/360 [00:15<00:01, 18.48it/s]

[2025-04-22 16:06:45,509] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,551] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,596] INFO:root: Starting optimization ...


 92%|█████████▏| 330/360 [00:15<00:01, 19.69it/s]

[2025-04-22 16:06:45,640] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,682] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,722] INFO:root: Starting optimization ...


 92%|█████████▎| 333/360 [00:16<00:01, 21.08it/s]

[2025-04-22 16:06:45,762] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,808] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,852] INFO:root: Starting optimization ...


 93%|█████████▎| 336/360 [00:16<00:01, 21.67it/s]

[2025-04-22 16:06:45,892] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,937] INFO:root: Starting optimization ...
[2025-04-22 16:06:45,979] INFO:root: Starting optimization ...


 94%|█████████▍| 339/360 [00:16<00:00, 22.07it/s]

[2025-04-22 16:06:46,023] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,068] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,112] INFO:root: Starting optimization ...


 95%|█████████▌| 342/360 [00:16<00:00, 21.91it/s]

[2025-04-22 16:06:46,163] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,213] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,261] INFO:root: Starting optimization ...


 96%|█████████▌| 345/360 [00:16<00:00, 21.04it/s]

[2025-04-22 16:06:46,317] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,370] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,420] INFO:root: Starting optimization ...


 97%|█████████▋| 348/360 [00:16<00:00, 20.81it/s]

[2025-04-22 16:06:46,466] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,509] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,553] INFO:root: Starting optimization ...


 98%|█████████▊| 351/360 [00:16<00:00, 21.15it/s]

[2025-04-22 16:06:46,602] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,646] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,692] INFO:root: Starting optimization ...


 98%|█████████▊| 354/360 [00:17<00:00, 20.87it/s]

[2025-04-22 16:06:46,750] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,794] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,838] INFO:root: Starting optimization ...


 99%|█████████▉| 357/360 [00:17<00:00, 21.40it/s]

[2025-04-22 16:06:46,881] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,919] INFO:root: Starting optimization ...
[2025-04-22 16:06:46,964] INFO:root: Starting optimization ...


100%|██████████| 360/360 [00:17<00:00, 20.77it/s]


| | burden_gene | egene | weight_col | pv |
|---|---|---|---|---|
| 0 | ENSG00000100181 | ENSG00000100181 | maf_beta | 0.024738 |
| 1 | ENSG00000237438 | ENSG00000237438 | maf_beta | 0.012391 |
| 2 | ENSG00000177663 | ENSG00000177663 | maf_beta | 0.526781 |
| 3 | ENSG00000069998 | ENSG00000069998 | maf_beta | 0.359992 |
| 4 | ENSG00000185837 | ENSG00000185837 | maf_beta | 0.693745 |
| ... | ... | ... | ... | ... |
| 355 | ENSG00000205560 | ENSG00000205560 | maf_beta | 0.820957 |
| 356 | ENSG00000100288 | ENSG00000100288 | maf_beta | 0.433113 |
| 357 | ENSG00000205559 | ENSG00000205559 | maf_beta | 0.205992 |
| 358 | ENSG00000100299 | ENSG00000100299 | maf_beta | 0.350120 |
