# Manuscript outline of quantile QTL and quantile TWAS

## Title

Quantile Regression-Based QTL and TWAS Analyses Reveal Heterogeneous Genetic Effects

## Abstract

- Brief background on AD genetics and expression studies
- Current limitations of mean-based approaches
- Introduction of quantile QTL/TWAS framework
- Key findings and significance
- Impact on understanding AD mechanisms

## 1 Introduction

- Traditional QTL and TWAS approaches
    - Linear regression limitations
    - Need for capturing heterogeneous effects
- Importance of quantile-specific effects
    - Expression heterogeneity in biological systems
    - Current challenges in capturing non-linear relationships
- Study overview and innovations
    - Novel quantile framework
    - Integration of QTL and TWAS analyses
    - Application to AD and brain tissue data

## 2 Results


### 1. **Characterization and Subtyping of Quantile xQTLs**

- **Motivation**: Traditional QTL methods primarily rely on mean-based modeling, which may not fully capture the heterogeneity in gene-to-phenotype regulation. By defining "pure quantile QTLs," we aim to identify regulatory signals that are undetectable with linear regression (LR) and categorize them by regulatory pattern, uncovering potential complexity in genetic regulation.

1.1 **Classification Based on Significance**

- **Unique LR (not Quantile QTL)**: QTLs showing homogeneous effects across quantiles, which are mainly captured by mean-level analysis without significant heterogeneity.
- **Unique QR (Quantile QTL)**: QTLs that are significant in QR models but not in LR. These are further divided into:
    - **No Mean Effect**: No significant mean-level effect, but strong effects at high or low quantiles.
    - **Weak Mean Effect**: QR detects significant effects at extreme quantiles where LR does not reach significance, often suggesting interaction or non-linear effects.
- **Both QR and LR Significant (Partially Quantile QTL)**:
    - **Homogeneous Effect (not Quantile QTL)**: LR p-values are substantially smaller than QR p-values, indicating a stable effect across quantiles.
    - **Heterogeneous Effect (Quantile QTL)**: QR p-values are substantially smaller than LR p-values; QR shows stronger effects and reveals quantile-specific variation, potentially due to interaction effects or other non-linear modulation.
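
The significance-based categories above can be sketched as a small decision function. This is a minimal illustration, not the study's pipeline: the function name, the `alpha` threshold, and the rule that any `p_qr < p_lr` counts as "substantially smaller" are assumptions made for the example.

```python
def classify_by_significance(p_lr, p_qr, alpha=0.05):
    """Assign one variant-gene pair to a significance-based category.

    p_lr:  p-value from linear regression (LR).
    p_qr:  combined p-value across quantiles from quantile regression (QR).
    alpha: hypothetical threshold; replace with the study's own cutoff.
    """
    lr_sig = p_lr < alpha
    qr_sig = p_qr < alpha
    if qr_sig and not lr_sig:
        return "unique_QR"
    if lr_sig and not qr_sig:
        return "unique_LR"
    if lr_sig and qr_sig:
        # Simplification: the outline says "substantially smaller";
        # here any p_qr < p_lr is treated as heterogeneous.
        return "shared_heterogeneous" if p_qr < p_lr else "shared_homogeneous"
    return "not_significant"


print(classify_by_significance(p_lr=0.20, p_qr=1e-4))  # QR only
print(classify_by_significance(p_lr=1e-8, p_qr=1e-3))  # both, LR stronger
```

A stricter margin (for example a minimum ratio between the two p-values) could replace the plain comparison once "substantially" is defined.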

1.2 **Classification Based on Effect Patterns**

- **Mean Shift**: QTLs influencing the overall mean without affecting variance, indicative of a global shift in expression.
- **Variance Change**: QTLs that increase or decrease variance across the population, often associated with individual-specific or environmental interactions.
- **Local Effects**: Significant only within specific quantile ranges, representing context-specific regulatory mechanisms.
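
To make the first two patterns concrete, the sketch below simulates expression for carriers and non-carriers of a variant and compares group quantiles. Everything here is simulated under assumed normal distributions; the group sizes, effect sizes, and the `quantile_gaps` helper are illustrative and not taken from the study data.

```python
import random
import statistics


def quantile_gaps(carriers, noncarriers):
    """Carrier minus non-carrier difference at the 10th, 50th and 90th percentiles."""
    qc = statistics.quantiles(carriers, n=10)      # 9 cut points: 10%, ..., 90%
    qn = statistics.quantiles(noncarriers, n=10)
    return {tau: qc[i] - qn[i] for tau, i in ((0.1, 0), (0.5, 4), (0.9, 8))}


rng = random.Random(1)
n = 5000  # simulated individuals per group

noncarriers = [rng.gauss(0.0, 1.0) for _ in range(n)]
mean_shift = [rng.gauss(0.5, 1.0) for _ in range(n)]       # shifted mean, same spread
variance_change = [rng.gauss(0.0, 1.5) for _ in range(n)]  # same mean, larger spread

shift_gaps = quantile_gaps(mean_shift, noncarriers)
spread_gaps = quantile_gaps(variance_change, noncarriers)

print("mean shift:     ", shift_gaps)
print("variance change:", spread_gaps)
```

Under a mean shift the gap is roughly constant across quantiles, whereas under a variance change it is negative at low quantiles, near zero at the median, and positive at high quantiles. That is why a mean-based test can miss a variance-modifying variant entirely.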

1.3 **Defining Pure Quantile QTLs**

- **Pure Quantile QTLs** are QTLs that are significant only in QR models or that show unique heterogeneity within shared QR and LR significant loci.
    - **Unique QR**: Detected by QR but not by LR.
    - **Heterogeneous Overlap**: Variants significant in both QR (using Cauchy-combined p-values) and LR models but with **distinct heterogeneity patterns** in QR:
        - **Filter Step**: Select variants where p(Cauchy-QR)<p(LR) to ensure that QR provides stronger or unique signals for these loci.
        - **Identifying Variance Change**:
            - Fit a linear regression model across quantiles: β(τ)=α+γ⋅τ+ϵ.
            - If p(γ)<0.05, this indicates significant variance change across quantiles. This approach distinguishes variance-modifying QTLs from simple mean shifts.
            - **Validation**: Apply Bootstrap with confidence bands to validate the observed non-linearity and quantify heterogeneity in effect sizes across quantiles.
        - **Quantitative Description**: Compute a **Heterogeneity Index** to provide a quantitative measure of the variance change, offering an additional layer of description for effect variability.
    - **Identification of Local Effects**: Among pure QR signals, loci that are significant only in low or only in high quantile ranges (e.g., 5%-45% or 55%-95%) can be flagged as having local effects.
    - **Summary**: Variants are classified as **Pure Quantile QTLs** if they are either **Unique QR** signals or **Heterogeneous Overlap** signals, i.e., significant in both models with p(Cauchy-QR) < p(LR), showing variance change or local effects, and passing heterogeneity validation. These loci capture genetic effects that vary across expression quantiles and that traditional LR methods miss or do not fully characterize.
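
The steps above can be sketched in a few dependency-free functions: an equal-weight Cauchy combination of per-quantile p-values, the slope γ from β(τ) = α + γ⋅τ + ϵ, and a low/high local-effect flag. Two assumptions are worth stating loudly. First, the slope p-value here comes from a permutation test, swapped in for the regression t-test only to avoid external libraries. Second, per-quantile estimates are correlated and have unequal standard errors, so this naive test is illustrative rather than a calibrated procedure. All function names are hypothetical.

```python
import math
import random


def cauchy_combine(pvalues):
    """Equal-weight Cauchy combination of per-quantile p-values."""
    t = sum(math.tan((0.5 - p) * math.pi) for p in pvalues) / len(pvalues)
    return 0.5 - math.atan(t) / math.pi


def ols_slope(taus, betas):
    """Least-squares slope gamma in beta(tau) = alpha + gamma * tau."""
    t_bar = sum(taus) / len(taus)
    b_bar = sum(betas) / len(betas)
    sxy = sum((t - t_bar) * (b - b_bar) for t, b in zip(taus, betas))
    sxx = sum((t - t_bar) ** 2 for t in taus)
    return sxy / sxx


def slope_permutation_p(taus, betas, n_perm=999, seed=0):
    """Two-sided permutation p-value for gamma (stand-in for the t-test)."""
    rng = random.Random(seed)
    observed = abs(ols_slope(taus, betas))
    shuffled = list(betas)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(ols_slope(taus, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def local_effect(taus, pvalues, alpha=0.05):
    """Flag signals significant only below or only above the median."""
    low = any(p < alpha for t, p in zip(taus, pvalues) if t < 0.5)
    high = any(p < alpha for t, p in zip(taus, pvalues) if t > 0.5)
    if low and high:
        return "non_local"
    if low:
        return "low_only"
    if high:
        return "high_only"
    return "none"


taus = [round(0.05 * i, 2) for i in range(1, 20)]   # 0.05, 0.10, ..., 0.95
betas = [0.1 + 0.4 * t for t in taus]               # made-up increasing effects
pvals = [0.5 if t <= 0.5 else 0.001 for t in taus]  # made-up per-quantile p-values

print(cauchy_combine(pvals))
print(ols_slope(taus, betas), slope_permutation_p(taus, betas))
print(local_effect(taus, pvals))
```

In practice the bootstrap confidence bands mentioned above would be the more defensible check of whether β(τ) truly varies with τ.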

---

### 2. **Global Landscape of Quantile xQTLs Reveals Diverse Regulatory Mechanisms**

- **Motivation**: This section explores the global landscape of pure QR signals across different modalities and contexts. By characterizing quantile QTL diversity and context-specific patterns, we aim to reveal unique regulatory features in various conditions.

2.1 **Characterization Across Modality and Contexts**

- **Modalities** : Analysis across eQTL, pQTL, sQTL modalities to capture multi-layered regulatory effects.
- **Contextual Factors**: Examination across tissue types and cell types to observe context-specific regulatory patterns.

**Table 1. Description of datasets for xx different quantile-xQTL data from brain cortex of ROSMAP subjects**. # pure QR

**Fig 1: Disease-agnostic quantile QTL results across cell types and phenotype modalities**.

Stacked bar plot of QR subtypes across contexts: unique QR plus overlapped heterogeneous QR (p(QR) < p(LR)).

**a**. Quantile QTL (unique, heter-overlapped) across multiple molecular phenotype modalities (eQTL, pQTL, sQTL if available). 

**b**. Quantile QTL (unique, heter-overlapped) across six brain cell types in DLPFC and bulk xQTL.

**c**. Distribution of the number of significant quantile-specific QTLs for τ from 0.05 to 0.95.

**d**. Stacked bar plot of local-effect quantile QTLs among those passing the Cauchy test (significant only in low quantiles, only in high quantiles, or both, i.e. non-local) across contexts.

**e**. Manhattan plot showing QR associations with gene labels (q<0.05). Either pure QR only, or an upper panel with unique QR and a lower panel with heter-overlapped QR.

2.2 **Functional Annotation and Enrichment Analysis**

- **Gene Set and Pathway Enrichment**: Analysis with GO, KEGG, and other databases to identify biological pathways enriched with quantile QTLs.
- **CRISPR Validation**: Utilizing CRISPR screening data to verify the functional role of quantile QTLs.
- **Perturb-seq Analysis**: Analyzing network-level effects of quantile QTLs through Perturb-seq data.
- **Hi-C Interaction Data**: Evaluating spatial associations between non-coding variants and gene regions.
- **RegulomeDB Annotations**: Identifying the regulatory context of quantile QTLs within non-coding elements to understand their role in regulatory networks.

**Fig 2: Functional Annotation and Enrichment Analysis of Pure Quantile QTLs**  

Validation of Quantile QTL signals using CRISPRi data.

**a**. Enrichment of CRISPRi concordant peak-gene links in marginal quantile xQTL signals and quantile-specific QTL signals (union of 0.05-0.35, union of 0.4-0.7, union of 0.75-0.95).

**b**. Enrichment of CRISPRi concordant links in cell types with quantile QTL signals.

**c**. Enrichment of CRISPRi concordant links in brain cortical regions and monocyte quantile QTL signals.

**d**. Enrichment in functional datasets (Hi-C)

Enrichment with Hi-C data, displaying how often Quantile QTLs reside in non-coding regions that are spatially linked to active gene loci.

**e**. Pathway Enrichment for Genes with Quantile QTLs     

Top 10 significantly enriched pathways for genes underlying neuron- and microglia-specific quantile QTL, respectively.

Other functional plots…

---

### 3. **Quantile QTL Reveals Heterogeneity Effects Compared with Linear Approaches**

- **Motivation**: This section aims to demonstrate the distinct advantage of quantile QTL (QR) in capturing heterogeneous regulatory effects that linear regression (LR) might overlook. By comparing QR- and LR-detected signals, we aim to show how QR uniquely contributes to understanding genetic regulation, especially in capturing non-linear and interaction effects.

### 3.1 **Comparison of Unique QR and LR Signals**

- **Objective**: Compare unique QR and LR signals to illustrate the distinct regulatory features captured by each method.
- **Functional Characterization and Comparison of Unique Signals**:
    - **Approach**: Evaluate and annotate unique QR and LR signals within functional regulatory contexts, using tools such as gene set enrichment and pathway analysis. This will highlight how QR captures novel regulatory elements and pathways that are undetected by LR.
    - **Functional Comparisons**: By annotating unique QR and LR signals with functional data (e.g., CRISPR, Perturb-seq, Hi-C), we can assess the biological relevance of each signal type. QR-unique signals may reveal specific regulatory mechanisms at certain quantiles, particularly those associated with variability or extreme expression values.

### 3.2 **Analysis of Heterogeneous and Non-Heterogeneous Quantile Signals Shared with LR**

- **Objective**: Investigate signals detected by both QR and LR, focusing on those with significant heterogeneity captured through QR analysis.
- **Characterization of Heterogeneity in Overlapping pure QR Signals**:
    - **Approach**: Identify overlapping QR and LR signals that exhibit meaningful heterogeneity across quantiles. Using Cauchy combination p-values and cross-quantile β(τ) variations, this analysis will pinpoint where QR detects additional layers of genetic effect variability that LR overlooks.
    - **Functional Annotation of Overlap Signals**: For overlapping signals, perform functional annotation to assess whether these signals map to known regulatory elements or pathways, providing insights into why QR captures additional variability within these loci.
    

**Fig 3: Global landscape of quantile QTLs compared with LR QTLs**

- **a**. Stacked bar plots showing distribution of QTL types (unique QR, unique LR | heter-shared, non-heter shared) across contexts
- **b**. Stacked bar plots showing distribution of QTL types (unique QR, unique LR | heter-shared, non-heter shared) across modality
- **c**. Effect size heterogeneity density plots (unique QR, heter-shared, non-heter-shared)
- **Case study: Heterogeneity in QR Signals**

**Fig 4: Functional Enrichment of QR vs. LR Signals**:

- **a**. Bar plot or dot plot showing the top 10 enriched pathways for genes underlying QR-unique, LR-unique, heter shared, non-heter shared signals, providing a biological context for the detected QTLs.
- **b**. Enrichment plot showing the QR-unique,  LR-unique, heter-shared, non-heter shared signals with functional datasets, such as CRISPRi data.

---

### 4. **Interaction QTLs Reveal Mechanisms of Effect Heterogeneity**

- **Motivation**: Interaction QTLs capture context-dependent regulatory adaptations, revealing specific effects across different conditions and individual characteristics.

4.1 **Gene-Environment Interactions**

- Analysis of interactions with factors such as sex, APOE genotype, and cell-type proportions to understand how QR signals are modulated in specific contexts.
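
As a minimal sketch of why an interaction can surface as a quantile effect, the simulation below assumes a variant that changes expression only in one sex. The effect size, the 2x2 carrier-by-sex design, and all names are hypothetical; this is not the study's iQTL model.

```python
import random
import statistics

rng = random.Random(7)
n = 4000  # simulated individuals per genotype-by-sex cell


def simulate(genotype, sex, interaction=2.0):
    # Hypothetical model: the variant acts only when sex == 1.
    return [interaction * genotype * sex + rng.gauss(0.0, 1.0) for _ in range(n)]


cells = {(g, s): simulate(g, s) for g in (0, 1) for s in (0, 1)}
mean = {key: statistics.fmean(values) for key, values in cells.items()}

# In a saturated 2x2 design, the OLS genotype-by-sex interaction coefficient
# equals this difference-in-differences of cell means.
interaction_hat = (mean[1, 1] - mean[0, 1]) - (mean[1, 0] - mean[0, 0])

# Marginal view that ignores sex, as a quantile QTL scan without the
# interaction term would see it.
carriers = cells[1, 0] + cells[1, 1]
noncarriers = cells[0, 0] + cells[0, 1]
qc = statistics.quantiles(carriers, n=10)
qn = statistics.quantiles(noncarriers, n=10)
gap_low, gap_high = qc[0] - qn[0], qc[8] - qn[8]  # 10th and 90th percentiles

print(f"interaction estimate: {interaction_hat:.2f}")
print(f"carrier gap at tau=0.1: {gap_low:.2f}, at tau=0.9: {gap_high:.2f}")
```

When sex is left out of the model, carriers form a mixture of affected and unaffected individuals, so the genotype gap is much larger in the upper tail than in the lower tail. This is the kind of heterogeneity QR can pick up without the interacting factor being specified.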

4.2 **Functional Validation**

- Utilizing functional data, including CRISPR and Perturb-seq, to validate the biological significance of interaction QTLs and ensure these signals represent genuine biological effects.

**Fig 5: Interaction and quantile effects**

- **a**. Proportion of Significant Variants detected by QR (q<0.05) with Significant Interaction Terms Absent in LR.
- **b**. Number of significant variants across interaction types (APOE, sex, cell proportion) by contexts wt/wo signs.
- **c**. Venn diagram of QR, LR, and iQTL gene signals / only show pure QR and iQTL (APOE, sex, cell proportion) gene signals by modality (eQTL, pQTL, xQTL).
- **d**. Venn diagram of QR, LR, and iQTL gene signals / only show pure QR and iQTL (APOE, sex, cell proportion) gene signals of Mic and Ast.
- **e**. Pure QR and iQTL overlapped variants across contexts (stacked barplots showing unique QR and heter overlapped QR) — number/proportion
- **f**. Heatmap of gene-environment interaction tests
- **g**. Case study of interaction-driven heterogeneity (iQTL sig, QR sig, LR not sig). Gene-environment interaction between interested SNP and interested interaction.

**Fig 6: Validation of Interaction QTL signals using CRISPRi data**. 

1. Enrichments of CRISPRi concordant peak-gene links in standard marginal xQTL signals, interaction xQTL signals, Quantile xQTL signals.

**Table 2: Comparison of detection methods (pure QR, pure LR, iQTL) in AD-associated regions** (Total Discoveries, Novel Loci, tissue)

---

### **5. Contribution of Quantile QTLs to the Genetic Architecture of Complex Traits**

**Quantile xQTL are enriched for disease heritability**

- **Motivation**: Quantile QTL provides a unique opportunity to explore the genetic architecture of complex traits, especially by uncovering the genetic contributions specific to brain and blood-related traits. By evaluating heritability enrichment across trait categories, we aim to determine the context-specific impact of quantile xQTLs on trait variability.

---

### 5.1 Heritability Enrichment Analysis

- **Objective**: Evaluate the heritability contribution of quantile-specific xQTLs across complex traits by comparing quantile xQTL annotations within different trait categories (all traits, blood-related, and brain-related). This allows us to capture the distinct heritability patterns that quantile xQTLs bring to each trait category.
- **Methodology**:
    - **Stratified LD Score Regression (S-LDSC)**: Conduct S-LDSC analysis to assess heritability enrichment for quantile xQTLs across three main trait categories: all traits, blood-related traits, and brain-related traits. Use baseline-LD annotations (e.g., coding, conserved, regulatory regions) as control and add quantile xQTLs as novel annotations.
    - **Trait Category Comparison**: For each trait category, calculate the joint $\tau^*$ to measure the cumulative heritability contribution from quantile xQTLs within that category, thereby providing insight into category-specific contributions.
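
For reference, the standardized effect size $\tau^*$ is commonly defined in the S-LDSC literature as follows. We assume this is the definition the outline intends; the notation below is introduced here for illustration.

$$
\mathrm{Var}(\beta_j) = \sum_{c} a_c(j)\,\tau_c,
\qquad
\tau_c^{*} = \frac{M_{h^2_g}\,\mathrm{sd}(a_c)}{h^2_g}\,\tau_c
$$

Here $a_c(j)$ is the value of annotation $c$ at SNP $j$, $\tau_c$ is the per-SNP heritability contribution of one unit of annotation $c$ conditional on the other annotations in the model, $\mathrm{sd}(a_c)$ is the standard deviation of the annotation, $h^2_g$ is the SNP heritability of the trait, and $M_{h^2_g}$ is the number of common SNPs used to compute $h^2_g$. A marginal $\tau^*$ conditions only on the baseline-LD annotations, while a joint $\tau^*$ is estimated with several quantile xQTL annotations included in the same model.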

---

### 5.2 Cross-Trait Patterns

- **Objective**: Assess if quantile xQTLs exhibit unique heritability enrichment in brain-related traits compared to other categories, supporting the hypothesis of tissue-specific genetic effects.
- **Methodology**:
    - **Category-Specific Analysis**: Examine quantile xQTL heritability enrichment within each of the three trait categories (all traits, blood-related, brain-related). Focus on identifying if brain-related traits show distinct enrichment patterns compared to blood-related and all traits.
    - **Tissue Context Evaluation**: For brain-related traits, specifically evaluate quantile xQTLs within brain tissue datasets (e.g., GTEx brain tissues) to determine if certain quantile-specific signals are enriched in neurological traits, providing evidence for context-specific effects.
- **Figure Outline**:

**Fig 7: Quantile QTL annotations with 97 functional annotations from the baseline-LD model: enrichment, marginal $\tau^*$, and joint $\tau^*$.** [Forward stepwise elimination for the joint $\tau^*$]

- **a**. Bar plots displaying heritability enrichment for quantile xQTLs across three categories: all traits, blood-related traits, and brain-related traits. Each bar represents the enrichment value for quantile xQTL annotations compared to baseline annotations.
- **b**. Standardized effect size plots ($\tau^*$) for quantile xQTLs across the three trait categories. These plots will illustrate the magnitude of quantile xQTL contributions within each category, highlighting the relative strength of effects in different trait contexts.
- **c**. Joint analysis of quantile xQTLs across the three trait categories (all traits, blood-related, brain-related), visualized with bar plots showing joint $\tau^*$ values. Each bar represents the combined effect size within each trait category, illustrating the relative contribution of quantile xQTLs to heritability across tissue-specific categories.

---

### 6. **Integration with Alzheimer's Disease Genetics**

- **Motivation**: Much of the genetic variability in Alzheimer's disease (AD) remains unexplained. Quantile QTL analysis can reveal previously undetected regulatory features in AD, providing new insights into AD pathology.

6.1 **Variant-Level Insights (76 loci + Fine-Mapping Loci)**

- **Novel AD Loci**: Identify AD-associated loci not previously detected by traditional xQTL methods.
- **New Interpretation of Known Loci**: Examine previously studied AD loci (explained by LR), identifying novel regulatory patterns in different contexts or target genes using quantile-specific analysis.
- **Analysis of Key AD Genes (20 Genes)**: Determine whether these genes contain quantile-specific QTL signals, expanding our understanding of AD-related gene regulation.

6.2 **Gene-Level Associations through Quantile TWAS**

- **Quantile TWAS Analysis**: Compare TWAS results across fixed and dynamic regions to capture gene-phenotype relationships at different quantiles, focusing on dynamic regions due to their data-driven advantages.
    - **Dynamic Region Approach**: Provides greater insight into heterogeneous gene expression by clustering results across quantiles.
    - Hint: **In-Depth Understanding of Heterogeneity**: Demonstrate dynamic region approach through examples aligning clustering with quantile-specific β(τ).

6.3 **Case Study**

- **Example Analysis**: Conduct a detailed case study with an AD-related gene to illustrate the biological significance of quantile QTLs. Compare dynamic and fixed regions, and present heterogeneity across contexts for practical demonstration of quantile QTL's explanatory power in AD research.

**Fig 8: Quantile-specific effects at AD loci**

- **a**. Regional association plots for key loci
- **b**. Quantile coefficient plots with standard errors
- **c**. Empirical conditional quantile functions by genotype
- **d**. AD: heat map showing correlations of quantile-specific expression across tissues, highlighting any shared patterns reported in the literature.
- **e**. Stacked Bar Plot: number of loci with categories (Known/Novel) across contexts.
- **f**. Pie charts showing: proportion of quantile QTL loci (known/novel) in 76 GWAS loci; proportion of quantile QTL loci (known/novel) in fine-mapping loci
- **g**. Analysis of 20 Key AD Genes: Number of quantile QTL loci identified per gene.

### Table 3: Summary of Quantile QTL **at AD loci**

| Category | Total | Pure QR Signals | Novel Discoveries | New Context | New Target Genes |
| --- | --- | --- | --- | --- | --- |
| GWAS Loci | 76 |  |  |  |  |
| Fine-mapping Loci | ~20 |  |  |  |  |
| Key Genes | 20 |  |  |  |  |

**Fig 9: Quantile TWAS methodology and results**

- **a**. Schematic of fixed and dynamic region approaches
- **b**. Number of novel gene discoveries of fixed model and dynamic model.
- **c**. Detailed comparison of Quantile TWAS and best model / elastic net at different significance levels (i.e., 2.5e-6, 1e-10, 1e-15, and 1e-20).

Case study: CRISPR

- **d**. Example of a microglia-specific quantile eQTL for the gene xxxxx validated by CRISPRi in xxx cells.
- **e**. Example of quantile eQTL and quantile pQTL for the gene xxx validated by CRISPRi in xxxx cells.

**Table 4: Performance metrics of quantile TWAS approaches**

- **a**. Performance of An, Cn (Cauchy, at least 2 regions significant, unique discoveries): try to illustrate that Cn is better than An, then use Cn as QTWAS to compare with the LR TWAS model.
- **b**. Tissue/Context Specificity showing AD findings

### Table 4a: Detailed Performance Analysis of TWAS Methods Compared with Elastic Net / Best Model

| Method | Total Discoveries | Multi-Region Evidence | # Novel Genes |
| --- | --- | --- | --- |
| Quantile TWAS (A1-A3) |  |  |  |
| Quantile TWAS (C1-Cn) |  |  |  |
| An+ Cn |  |  |  |

### Table 4b: Tissue/Context Specificity showing AD findings

| Context | # Quantile QTL | # Novel Quantile QTL | # Quantile TWAS Genes (QTWAS-Cn) | # Novel Quantile TWAS Genes (QTWAS-Cn) | Novel gene list | Top Pathways (QTWAS-Cn) |
| --- | --- | --- | --- | --- | --- | --- |
| Microglia |  |  |  |  |  |  |
| Astrocytes |  |  |  |  |  |  |
| Neurons |  |  |  |  |  |  |

## 3 Discussion

- Key findings and biological implications
- Methodological advances
- Clinical relevance
- Limitations and future directions

## 4 Methods

### 4.1 Study Design and Data

- ROSMAP cohort description
- GTEx data processing
- Quality control procedures

### 4.2 Quantile QTL Framework

### 4.3 Quantile TWAS Methodology

- **Workflow and updating models of TWAS**
- Weight calculation
- Region integration approaches
- Statistical inference

## 5 Supplementary Information

- Extended methods
- Additional analyses
- Supplementary figures and tables
- Quality control metrics

## 6 Acknowledgements

- Data providers
- Funding sources
- Computational resources



### Innovation of our paper:

- First comprehensive quantile framework integrating both QTL and TWAS analyses
- Novel method for capturing heterogeneous genetic effects across expression distribution
- Integration of fixed and dynamic region approaches
- Application to large-scale brain tissue data and AD genetics
- Identification of novel associations missed by traditional methods