scNetGWAS

A Python toolkit for integrating GWAS, single-cell RNA-seq, and STRING protein-protein interaction (PPI) networks to identify cell-type-specific disease modules.

Overview

scNetGWAS builds weighted gene networks by combining three complementary data sources:

GWAS gene-level p-values (e.g., from MAGMA) — identifies disease-associated genes
Single-cell RNA-seq data (AnnData .h5ad) — provides cell-type resolution and co-expression structure
STRING PPI networks — defines physical/functional protein interactions

The pipeline constructs cell-type-specific networks where:

Nodes are genes weighted by GWAS significance (-log10(p))
Edges are PPI interactions weighted by co-expression correlation and STRING confidence scores
Communities (disease modules) are detected via Louvain or other algorithms
Gene prioritization extends beyond GWAS hits using Random Walk with Restart

Key Capabilities

Category	Features
Network Construction	Cell-type stratification, p-value filtering, co-expression edge weighting, STRING confidence scoring
Analysis	Community detection, centrality metrics (5 types), Random Walk with Restart, pathway enrichment, differential co-expression
Multi-Trait	Fisher's method to combine multiple GWAS datasets
Validation	Permutation-based robustness testing, leave-one-cell-type-out cross-validation
QC & Reproducibility	Input QC report, reproducibility manifest with file hashes and package versions
Output	5 network formats, matplotlib plots, self-contained HTML report
Scalability	Parallel co-expression, sparse matrix support, Snakemake batch workflow

Architecture

scnetgwas/
├── pipeline.py        # Main orchestration — runs the full pipeline
├── validation.py      # Input validation and QC reporting
├── coexpression.py    # Co-expression computation (standard, differential, parallel)
├── network.py         # Network construction, centrality, RWR, multi-GWAS merging
├── community.py       # Community detection, robustness testing, cross-validation
├── enrichment.py      # GO/pathway enrichment via gseapy
├── gene_mapping.py    # Gene ID harmonization (Ensembl ↔ Symbol ↔ Entrez)
├── export.py          # Multi-format export, visualization, HTML reports
└── cli.py             # Command-line interface

Installation

Basic Install

git clone https://github.com/glbala87/scNetGWAS.git
cd scNetGWAS
pip install -e .

With All Optional Dependencies

pip install -e ".[all]"

Optional Dependency Groups

pip install -e ".[enrichment]"   # gseapy for pathway enrichment
pip install -e ".[progress]"     # tqdm for progress bars
pip install -e ".[config]"       # pyyaml for YAML config files
pip install -e ".[dev]"          # pytest, pytest-cov, build, twine

Requirements

Python >= 3.9 (tested on 3.9, 3.10, 3.11, 3.12)
Core: numpy, pandas, networkx, scanpy, anndata, scipy, matplotlib

Input File Formats

1. GWAS File (CSV or TSV)

Gene-level association results (e.g., from MAGMA).

Column	Type	Description
`gene`	string	Gene symbol (or Ensembl/Entrez ID)
`pval`	float	Gene-level p-value

gene,pval
TP53,1.2e-8
BRCA1,0.002
EGFR,0.05
MYC,0.5

2. PPI File (CSV or TSV)

Protein-protein interaction network (e.g., from STRING).

Column	Type	Description
`protein1`	string	First protein/gene
`protein2`	string	Second protein/gene
`combined_score`	int (optional)	STRING confidence score (0-1000)

protein1,protein2,combined_score
TP53,BRCA1,900
TP53,EGFR,700
BRCA1,EGFR,500

3. AnnData File (.h5ad)

Single-cell RNA-seq data in AnnData format.

var_names: Gene identifiers (must overlap with GWAS and PPI gene IDs)
obs["cell_type"] (optional): Cell type annotations for cell-type-specific analysis
obs["condition"] (optional): Case/control labels for differential co-expression

4. Gene Mapping File (CSV, optional)

For harmonizing gene IDs across datasets.

Column	Type	Description
`symbol`	string	HGNC gene symbol
`ensembl`	string	Ensembl gene ID
`entrez`	string	Entrez gene ID

symbol,ensembl,entrez
TP53,ENSG00000141510,7157
BRCA1,ENSG00000012048,672

Quick Start

# Install
pip install -e .

# Run with minimum required inputs
scnetgwas \
  --gwas data/gwas_results.csv \
  --ppi data/string_ppi.csv \
  --adata data/scrna.h5ad \
  --outdir results/ \
  -v

# Open the HTML report
open results/report.html

Usage Examples

Basic Run

Builds one global network using all cells, detects communities, computes centrality, and generates outputs.

scnetgwas \
  --gwas data/gwas.csv \
  --ppi data/ppi.csv \
  --adata data/scrna.h5ad \
  --outdir results/ \
  --pval-threshold 0.05 \
  -v

Cell-Type-Specific Networks

Build separate networks for each cell type with cross-validation to assess stability.

scnetgwas \
  --gwas data/gwas.csv \
  --ppi data/ppi.csv \
  --adata data/scrna.h5ad \
  --outdir results/ \
  --cell-types Excitatory Inhibitory Microglia Astrocyte \
  --cross-validation \
  -v

Full Analysis Pipeline

Run every available analysis.

scnetgwas \
  --gwas data/gwas.csv \
  --ppi data/ppi.csv \
  --adata data/scrna.h5ad \
  --outdir results/ \
  --pval-threshold 0.05 \
  --min-edge-weight 0.1 \
  --cell-types Excitatory Inhibitory Microglia \
  --use-string-scores \
  --community-method louvain \
  --community-resolution 1.0 \
  --rwr \
  --enrichment \
  --gene-sets GO_Biological_Process_2023 \
  --differential \
  --condition-col condition \
  --case-label case \
  --control-label control \
  --robustness \
  --n-permutations 100 \
  --cross-validation \
  --n-jobs 4 \
  -v

Multi-Trait GWAS Integration

Combine multiple GWAS traits using Fisher's method for p-value combination.

scnetgwas \
  --gwas data/gwas_schizophrenia.csv \
  --multi-gwas data/gwas_bipolar.csv data/gwas_depression.csv \
  --ppi data/ppi.csv \
  --adata data/scrna.h5ad \
  --outdir results_multi/ \
  --pval-threshold 0.05 \
  -v

Gene ID Harmonization

Auto-detect and convert between Ensembl, HGNC symbols, and Entrez IDs.

scnetgwas \
  --gwas data/gwas_ensembl.csv \
  --ppi data/string_ensembl.csv \
  --adata data/scrna.h5ad \
  --outdir results/ \
  --gene-mapping data/gene_id_map.csv \
  --target-id-type symbol \
  -v

Using a Config File

Store parameters in a YAML or TOML file for reproducibility.

scnetgwas \
  --gwas data/gwas.csv \
  --ppi data/ppi.csv \
  --adata data/scrna.h5ad \
  --outdir results/ \
  --config example_config.yaml

Example example_config.yaml:

pval_threshold: 0.05
min_edge_weight: 0.1
community_method: louvain
community_resolution: 1.0
coexpression: true
centrality: true
rwr: true
enrichment: true
robustness: true
n_permutations: 100
plot: true
html_report: true
qc_report: true
# cell_types:
#   - Excitatory
#   - Inhibitory
#   - Microglia

Batch Processing with Snakemake

Process multiple GWAS traits in parallel using the included Snakefile.

1. Organize your GWAS files:

data/gwas/
├── schizophrenia.csv
├── bipolar.csv
└── depression.csv

2. Create a Snakemake config:

# snakemake_config.yaml
gwas_dir: data/gwas/
ppi: data/ppi.csv
adata: data/scrna.h5ad
outdir: results/
pval_threshold: 0.05
cell_types: ["Excitatory", "Inhibitory", "Microglia"]
enrichment: true
rwr: true

3. Run:

snakemake --configfile snakemake_config.yaml -j 4

This creates separate output directories per trait under results/.

Python API

Use individual modules programmatically for custom workflows.

import scanpy as sc
import pandas as pd
from scnetgwas import (
    run_pipeline,
    dry_run,
    build_network,
    compute_centrality,
    random_walk_restart,
    detect_communities,
    run_enrichment,
    generate_qc_report,
    validate_pipeline_params,
    differential_coexpression,
    harmonize_gene_ids,
)

# ── Option 0: Validate inputs before running ─────────────────
qc = dry_run(
    gwas_path="data/gwas.csv",
    ppi_path="data/ppi.csv",
    adata_path="data/scrna.h5ad",
    outdir="results/",
    pval_threshold=0.05,
    cell_types=["Excitatory", "Inhibitory"],
)
print(qc["warnings"])  # Check for issues before committing to full run

# ── Option 1: Run the full pipeline ──────────────────────────
stats = run_pipeline(
    gwas_path="data/gwas.csv",
    ppi_path="data/ppi.csv",
    adata_path="data/scrna.h5ad",
    outdir="results/",
    pval_threshold=0.05,
    cell_types=["Excitatory", "Inhibitory"],
    rwr=True,
    enrichment=True,
    centrality=True,
    robustness=True,
)
print(stats)

# ── Option 2: Use individual modules ─────────────────────────
gwas = pd.read_csv("data/gwas.csv")
ppi = pd.read_csv("data/ppi.csv")
adata = sc.read_h5ad("data/scrna.h5ad")

# Build network for a specific cell type
G = build_network(gwas, ppi, adata, cell_type="Microglia", pval_threshold=0.05)

# Detect communities
communities = detect_communities(G, method="louvain", resolution=1.0)

# Compute centrality metrics
centrality_df = compute_centrality(G)
print(centrality_df.head(10))

# Gene prioritization via Random Walk with Restart
rwr_scores = random_walk_restart(G, seed_genes=["TP53", "BRCA1"])

# Pathway enrichment per community
enrichment_results = run_enrichment(G, communities)

# Differential co-expression
diff = differential_coexpression(
    adata, list(G.nodes()),
    condition_col="condition", case_label="case", control_label="control"
)

CLI Reference

scnetgwas [OPTIONS]

Required Arguments

Flag	Description
`--gwas PATH`	GWAS gene-level results (CSV/TSV with `gene` and `pval` columns)
`--ppi PATH`	PPI network (CSV/TSV with `protein1` and `protein2` columns)
`--adata PATH`	AnnData `.h5ad` file with scRNA-seq data
`--outdir PATH`	Output directory

Filtering

Flag	Default	Description
`--pval-threshold FLOAT`	None (keep all)	Keep GWAS genes with p-value <= threshold
`--min-edge-weight FLOAT`	0.0	Minimum co-expression weight to retain an edge

Cell-Type Analysis

Flag	Default	Description
`--cell-types TYPE [TYPE ...]`	None (all cells)	Build separate networks per cell type

Community Detection

Flag	Default	Description
`--community-method {louvain,greedy_modularity,label_propagation}`	louvain	Community detection algorithm
`--community-resolution FLOAT`	1.0	Resolution parameter for Louvain
`--robustness`	off	Permutation-based community significance testing
`--n-permutations INT`	100	Number of permutations for robustness
`--cross-validation`	off	Leave-one-cell-type-out cross-validation

Co-Expression

Flag	Default	Description
`--no-coexpression`	off	Skip co-expression computation
`--differential`	off	Compute case vs control differential co-expression
`--condition-col STR`	condition	Column in `adata.obs` for case/control
`--case-label STR`	case	Label for case group
`--control-label STR`	control	Label for control group

Network Analysis

Flag	Default	Description
`--no-centrality`	off	Skip centrality metric computation
`--rwr`	off	Run Random Walk with Restart for gene prioritization
`--rwr-restart-prob FLOAT`	0.15	RWR restart probability
`--rwr-seed-genes GENE [GENE ...]`	None (uses GWAS weights)	Seed genes for RWR
`--use-string-scores`	off	Use STRING confidence scores as edge weights

Multi-Trait Analysis

Flag	Default	Description
`--multi-gwas PATH [PATH ...]`	None	Additional GWAS files (combined via Fisher's method)

Pathway Enrichment

Flag	Default	Description
`--enrichment`	off	Run pathway enrichment on communities
`--gene-sets STR`	GO_Biological_Process_2023	Gene set library for Enrichr

Gene ID Mapping

Flag	Default	Description
`--gene-mapping PATH`	None	Gene ID mapping table (CSV with ensembl/symbol/entrez)
`--target-id-type {symbol,ensembl,entrez}`	symbol	Target ID type for harmonization

Output Options

Flag	Default	Description
`--no-plot`	off	Skip network visualization
`--no-html`	off	Skip HTML report generation
`--no-qc`	off	Skip input QC report

Performance

Flag	Default	Description
`--n-jobs INT`	1	Parallel workers for co-expression computation

Validation

Flag	Description
`--dry-run`	Validate inputs and parameters, generate QC report, skip analysis

General

Flag	Description
`--config PATH`	YAML or TOML config file (keys validated against known parameters)
`-v / --verbose`	Increase verbosity (-v INFO, -vv DEBUG)
`-q / --quiet`	Suppress all output except errors

Features

1. Gene Ontology / Pathway Enrichment

Runs Enrichr enrichment analysis on each detected community module using gseapy. Falls back to a gene-list summary if gseapy is not installed.

2. Random Walk with Restart (RWR)

Propagates GWAS association signal through the PPI network to prioritize genes beyond genome-wide significance. Seed genes are weighted by -log10(p) by default, or specified manually.

3. Differential Co-Expression

Compares Spearman co-expression between case and control groups. Reports per-gene-pair case correlation, control correlation, and the difference — highlighting edges that rewire between conditions.

4. STRING Confidence Edge Scoring

When --use-string-scores is enabled and the PPI file includes a combined_score column, edge weights become a composite of co-expression and STRING confidence (normalized to 0-1).

5. Multi-Trait GWAS Integration

Accepts multiple GWAS files via --multi-gwas. P-values are combined across traits using Fisher's method (scipy.stats.combine_pvalues), enabling identification of pleiotropic disease modules.

6. Network Centrality Metrics

Computes five centrality measures per gene:

Degree — number of direct connections
Betweenness — frequency on shortest paths
Closeness — inverse average distance to all nodes
Eigenvector — connection to other well-connected nodes
PageRank — importance based on link structure

7. Parallel Co-Expression

For large gene sets, co-expression computation can be parallelized across --n-jobs workers using chunked thread pools.

8. Sparse Matrix Support

Handles large sparse AnnData matrices efficiently with chunked column-wise dense conversion, avoiding full memory materialization.

9. PPI Gene Pre-Filtering

Before computing co-expression, genes are pre-filtered to only those present in the PPI network, avoiding unnecessary computation on unconnected genes.

10. HTML Report

Generates a self-contained HTML report (report.html) with:

QC summary and warnings
Network statistics with metric cards
Embedded network visualizations
Centrality rankings
RWR gene prioritization scores
Pathway enrichment tables
Community robustness results
Cross-validation results

11. Reproducibility Manifest

Outputs run_metadata.json containing:

Timestamp
All pipeline parameters
SHA-256 hashes and sizes of input files
Package versions (numpy, pandas, networkx, scanpy, etc.)
Python version

12. Snakemake Workflow

Included Snakefile enables batch processing of multiple GWAS traits in parallel, with automatic result aggregation.

13. Progress Bars

Optional tqdm progress bars for long-running steps. Install with pip install tqdm or pip install -e ".[progress]".

14. Gene ID Mapping

Auto-detects gene ID types (Ensembl, HGNC symbols, Entrez) and harmonizes across datasets using a provided mapping table. Handles duplicates gracefully.

15. Input QC Report & Validation

Before analysis, validates all inputs and generates qc_report.json with:

Parameter bounds validation: p-value threshold (0–1), community resolution (> 0), RWR restart probability (0–1), edge weight (>= 0)
Cell type existence check: Verifies requested --cell-types exist in AnnData before starting
GWAS validation: p-value range (0–1), duplicate gene detection, missing value handling
PPI validation: STRING score range check (0–1000), duplicate edge warnings
Config key validation: Unknown keys in YAML/TOML configs are flagged and ignored
GWAS p-value distribution statistics
PPI edge counts and self-loop detection
scRNA cell counts per cell type
Gene overlap across all three datasets
Warnings for low overlap, missing annotations, or data issues

Use --dry-run to run validation and QC only, without the full analysis.

16. Community Robustness Testing

Permutation-based significance testing via degree-preserving edge swaps. Reports real modularity, null distribution statistics, z-score, and empirical p-value.

17. Leave-One-Cell-Type-Out Cross-Validation

Assesses community stability by holding out each cell type, rebuilding the network, and comparing community assignments via Jaccard similarity.

Output Files

After running, the output directory contains:

File	Description
`network_*.edgelist`	Edge list format
`network_*.graphml`	Cytoscape-compatible GraphML
`network_*.gexf`	Gephi-compatible GEXF
`network_*.json`	JSON node-link format
`network_*_adjacency.csv`	Adjacency matrix as CSV
`network_*.png`	Network visualization (nodes colored by community, sized by GWAS weight)
`centrality_*.csv`	Centrality metrics per gene (degree, betweenness, closeness, eigenvector, PageRank)
`rwr_scores_*.csv`	Random Walk with Restart gene prioritization scores
`diff_coexpr_*.csv`	Differential co-expression (case vs control per gene pair)
`_enrichment_comm.csv`	Pathway enrichment results per community
`*_enrichment_summary.json`	Top enrichment terms per community
`qc_report.json`	Input QC report with overlap stats and warnings
`summary_stats.json`	Network statistics (nodes, edges, density, hubs, communities)
`run_metadata.json`	Reproducibility manifest (parameters, file hashes, versions)
`cross_validation.json`	Leave-one-cell-type-out Jaccard similarities
`report.html`	Self-contained HTML report with all results

Testing

# Install dev dependencies
pip install -e ".[dev]"

# Run all tests (65 tests)
pytest tests/ -v

# Run only the original pipeline tests
pytest tests/test_pipeline.py -v

# Run only the new feature tests
pytest tests/test_new_features.py -v

# Run with coverage report
pytest tests/ -v --cov=scnetgwas --cov-report=term-missing

CI Pipeline

GitHub Actions runs automatically on push/PR with:

Linting: flake8 for syntax errors and style checks
Multi-version testing: Python 3.9, 3.10, 3.11, 3.12
Coverage reporting: pytest-cov with term-missing output
Optional dependency testing: Separate job with all optional deps installed
Pip caching: Faster CI runs with dependency caching

Project Structure

scNetGWAS/
├── .github/
│   └── workflows/
│       └── ci.yml                 # GitHub Actions CI (lint, test matrix, coverage)
├── src/
│   └── scnetgwas/
│       ├── __init__.py            # Public API exports
│       ├── cli.py                 # Command-line interface
│       ├── pipeline.py            # Main pipeline orchestration
│       ├── validation.py          # Input validation & QC reporting
│       ├── coexpression.py        # Co-expression (standard, differential, parallel)
│       ├── network.py             # Network construction, centrality, RWR
│       ├── community.py           # Community detection, robustness, cross-validation
│       ├── enrichment.py          # Pathway enrichment via gseapy
│       ├── gene_mapping.py        # Gene ID harmonization
│       └── export.py              # Multi-format export, plots, HTML report
├── tests/
│   ├── test_smoke.py              # Import smoke test
│   ├── test_pipeline.py           # Core pipeline tests (28 tests)
│   └── test_new_features.py       # Enhancement tests (37 tests)
├── Snakefile                      # Snakemake batch workflow
├── example_config.yaml            # Example YAML configuration
├── pyproject.toml                 # Package configuration
├── README.md
├── LICENSE
└── .gitignore

Scalability & Resource Requirements

Memory Guidelines

Dataset Size	Genes	Cells	Approx. RAM	Notes
Small	< 5,000	< 10,000	2–4 GB	Runs on laptop
Medium	5,000–15,000	10,000–100,000	8–16 GB	Use `--n-jobs 4` for co-expression
Large	15,000–25,000	100,000–500,000	32–64 GB	Consider `--no-coexpression` or cell-type subsets
Very large	> 25,000	> 500,000	64+ GB	Subsample cells, use sparse AnnData, limit cell types

Performance Tips

Co-expression is the bottleneck: O(n² genes) pairwise correlations. Use --n-jobs for parallelism or --no-coexpression if not needed.
Sparse matrices: The tool handles sparse .X in AnnData efficiently with chunked conversion for matrices > 10⁸ elements.
PPI pre-filtering: Only PPI-connected genes are included in co-expression, reducing computation significantly.
Cell-type subsetting: Processing one cell type at a time uses far less memory than all cells together.
Permutation testing: --robustness with --n-permutations 1000 multiplies community detection runtime proportionally — start with 100.

Dry-Run Validation

Use --dry-run to validate all inputs and parameters without running the full analysis:

scnetgwas --gwas gwas.csv --ppi ppi.csv --adata data.h5ad --outdir out/ --dry-run

This loads files, checks formats, validates parameter bounds, verifies cell types, and writes a QC report — useful before committing to a long run.

Troubleshooting

Common Errors

Error	Cause	Fix
`GWAS file missing required columns: {'gene'}`	CSV column names don't match expected `gene` and `pval`	Rename columns or check delimiter (CSV vs TSV auto-detected)
`GWAS p-values must be between 0 and 1`	P-values contain -log10 transformed values or z-scores	Convert back to raw p-values before input
`No cells found for cell_type='...'`	Requested cell type doesn't exist in `adata.obs["cell_type"]`	Use `--dry-run` to list available cell types, check spelling
`Cell types not found in AnnData`	`--cell-types` includes names not in the data	Run `--dry-run` to see available types
`No overlapping genes — returning empty network`	Gene IDs don't match across GWAS/PPI/scRNA-seq	Use `--gene-mapping` with a mapping table to harmonize IDs
`CRITICAL: No genes overlap across all three datasets`	Gene naming conventions differ (e.g., Ensembl vs Symbol)	Check QC report, use `--gene-mapping` and `--target-id-type`
`PyYAML is required to use YAML config files`	Optional dependency not installed	`pip install scnetgwas[config]`
`gseapy is required for enrichment`	Optional dependency not installed	`pip install scnetgwas[enrichment]`
`Unknown config keys (will be ignored): [...]`	Typo or unsupported key in config YAML/TOML	Check key spelling against CLI reference
`Eigenvector centrality did not converge`	Disconnected graph or numerical instability	Non-fatal — zeros used as fallback; consider filtering edges

Debugging Tips

Start with --dry-run: Validates inputs cheaply before long runs
Check qc_report.json: Gene overlap stats reveal ID mismatches early
Use -vv for DEBUG logging: Shows detailed step-by-step progress
Small test first: Run on one cell type with --no-coexpression to verify the pipeline works, then scale up
Memory issues: If you hit OOM, use --cell-types to process one type at a time, or increase --min-edge-weight to produce a sparser network

Dependencies

Core (installed automatically)

Package	Version	Purpose
numpy	>= 1.23, < 3.0	Numerical computing
pandas	>= 2.0, < 3.0	Data manipulation
networkx	>= 3.0, < 4.0	Graph/network analysis
scanpy	>= 1.9.0, < 2.0	Single-cell RNA-seq processing
anndata	>= 0.9.0, < 1.0	Single-cell data structure
scipy	>= 1.10, < 2.0	Scientific computing, statistics
matplotlib	>= 3.5, < 4.0	Network visualization

Optional

Package	Install Group	Purpose
gseapy	`[enrichment]`	GO/pathway enrichment analysis
tqdm	`[progress]`	Progress bars
pyyaml	`[config]`	YAML config file support
snakemake	(separate)	Batch workflow execution

License

MIT License. See LICENSE for details.

Author

Bala Subramani — GitHub

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.github/workflows		.github/workflows
src/scnetgwas		src/scnetgwas
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
Snakefile		Snakefile
example_config.yaml		example_config.yaml
pyproject.toml		pyproject.toml

Folders and files

Latest commit

History

Repository files navigation

scNetGWAS

Table of Contents

Overview

Key Capabilities

Architecture

Installation

Basic Install

With All Optional Dependencies

Optional Dependency Groups

Requirements

Input File Formats

1. GWAS File (CSV or TSV)

2. PPI File (CSV or TSV)

3. AnnData File (.h5ad)

4. Gene Mapping File (CSV, optional)

Quick Start

Usage Examples

Basic Run

Cell-Type-Specific Networks

Full Analysis Pipeline

Multi-Trait GWAS Integration

Gene ID Harmonization

Using a Config File

Batch Processing with Snakemake

Python API

CLI Reference

Required Arguments

Filtering

Cell-Type Analysis

Community Detection

Co-Expression

Network Analysis

Multi-Trait Analysis

Pathway Enrichment

Gene ID Mapping

Output Options

Performance

Validation

General

Features

1. Gene Ontology / Pathway Enrichment

2. Random Walk with Restart (RWR)

3. Differential Co-Expression

4. STRING Confidence Edge Scoring

5. Multi-Trait GWAS Integration

6. Network Centrality Metrics

7. Parallel Co-Expression

8. Sparse Matrix Support

9. PPI Gene Pre-Filtering

10. HTML Report

11. Reproducibility Manifest

12. Snakemake Workflow

13. Progress Bars

14. Gene ID Mapping

15. Input QC Report & Validation

16. Community Robustness Testing

17. Leave-One-Cell-Type-Out Cross-Validation

Output Files

Testing

CI Pipeline

Project Structure

Scalability & Resource Requirements

Memory Guidelines

Performance Tips

Dry-Run Validation

Troubleshooting

Common Errors

Debugging Tips

Dependencies

Core (installed automatically)

Optional

License

Author

About

Resources

License

Packages