junior1p/scMultiome

scMultiome

Single-Cell Multimodal Integration Pipeline for scRNA-seq + scATAC-seq

A complete end-to-end pipeline integrating paired scRNA-seq and scATAC-seq data (10x Multiome, SHARE-seq, SNARE-seq) using scGLUE and MOFA+, with automatic cell type annotation and gene regulatory network (GRN) inference.


Features

  • Multimodal Integration: scGLUE (graph-linked unified embedding) + MOFA+ (multi-omics factor analysis)
  • Cell Type Annotation: Marker-based annotation validated across both RNA and ATAC modalities
  • GRN Inference: Peak → gene regulatory links via GLUE cosine similarity + TF motif scanning
  • Standard Output: UMAP plots, cluster labels, peak-gene links, GRN edges, reproducible .h5mu bundle

Quick Start

Installation

pip install muon scanpy scglue anndata mofapy2 leidenalg python-igraph \
    matplotlib seaborn pandas numpy scipy scikit-learn requests \
    --break-system-packages -q

Run on PBMC Demo Data (auto-downloads)

from multiome import run_multiome_skill

mdata, metrics, grn = run_multiome_skill(
    out_dir="multiome_results_demo",
    run_scglue=True,
    run_mofa=True,
    run_grn=True,
    max_epochs=100  # fewer epochs for a faster demo
)

With Your Own Data

from multiome import run_multiome_skill

# 10x .h5 or .h5mu file
mdata, metrics, grn = run_multiome_skill(
    input_path="your_multiome.h5mu",
    out_dir="my_analysis",
    max_epochs=500
)

Pipeline Overview

1. Load Data
   └── 10x Multiome .h5 / .h5mu / separate .h5ad files

2. Quality Control
   ├── RNA: gene count, total counts, mitochondrial %
   ├── ATAC: peak count, total counts
   └── Intersect: keep cells present in both modalities

3. Preprocessing
   ├── RNA: normalize → log1p → HVG → scale → PCA
   └── ATAC: TF-IDF → LSI → HVG → scale → PCA

4. Multimodal Integration
   ├── scGLUE: genomic coordinate prior (peak → gene proximity)
   └── MOFA+: multi-omics factor analysis

5. Cell Type Annotation
   └── Marker gene scoring across modalities

6. GRN Inference
   └── Peak → gene cosine similarity → TF motif → GRN

7. Outputs
   ├── multiome_integrated.h5mu  (full MuData)
   ├── cell_metadata.csv          (cluster labels)
   ├── peak_gene_links.csv        (regulatory pairs)
   └── UMAP figures
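
The ATAC branch of step 3 starts with TF-IDF followed by LSI. The following is a minimal NumPy sketch of that idea on a toy dense matrix, not the pipeline's own code. The function names `tfidf` and `lsi`, the log-scaled IDF variant, and the toy data are assumptions made for illustration.

```python
import numpy as np

def tfidf(counts: np.ndarray) -> np.ndarray:
    """TF-IDF transform of a cells x peaks count matrix (dense, for illustration)."""
    counts = np.asarray(counts, dtype=float)
    # Term frequency: each cell's counts divided by that cell's total.
    cell_totals = counts.sum(axis=1, keepdims=True)
    tf = counts / np.maximum(cell_totals, 1.0)
    # Inverse document frequency: peaks open in fewer cells get more weight.
    n_cells = counts.shape[0]
    cells_per_peak = (counts > 0).sum(axis=0)
    idf = np.log1p(n_cells / np.maximum(cells_per_peak, 1))
    return tf * idf

def lsi(counts: np.ndarray, n_components: int = 2) -> np.ndarray:
    """LSI embedding: TF-IDF followed by a truncated SVD."""
    x = tfidf(counts)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    # Keep the leading components, scaled by their singular values.
    return u[:, :n_components] * s[:n_components]

# Toy data (assumption): 20 cells x 50 peaks of sparse Poisson counts.
rng = np.random.default_rng(0)
toy_counts = rng.poisson(0.3, size=(20, 50))
embedding = lsi(toy_counts, n_components=5)
print(embedding.shape)  # (20, 5)
```

Real scATAC-seq matrices are large and sparse, so a production version would use a sparse matrix and a truncated solver rather than a full dense SVD.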

Output Files

| File | Description |
|---|---|
| multiome_integrated.h5mu | Complete MuData object with all embeddings |
| cell_metadata.csv | Cell × cluster assignments (RNA, ATAC, joint) |
| peak_gene_links.csv | GLUE-scored peak → gene regulatory pairs |
| joint_umap_clusters.png | Main UMAP: RNA clusters, ATAC clusters, joint clusters |
| marker_dotplot.png | Canonical marker gene expression by cluster |

Dependencies

muon>=0.1.6
scanpy>=1.9.6
scglue>=0.3.3
anndata>=0.10.0
mofapy2>=0.7.1
leidenalg>=0.10.1
python-igraph>=0.11.0
matplotlib>=3.7
seaborn>=0.12
pandas>=1.5
numpy>=1.24
scipy>=1.10
scikit-learn>=1.3
requests>=2.28

Requires Python 3.9+. A GPU is recommended: scGLUE detects CUDA automatically and trains roughly 5–10× faster than on CPU.


Methods

scGLUE Integration

scGLUE uses genomic proximity (peaks within 1 Mb of a gene) as a prior knowledge graph to align the RNA and ATAC modalities in a shared latent space. Because candidate peak-gene links are restricted to biologically plausible pairs, this prior reduces false positives from spurious correlations.
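
The proximity rule can be sketched in plain Python. This is an illustration only: scGLUE builds its graph with its own routines, and the pipeline's exact distance definition is not stated here. Measuring from the peak midpoint to the gene's transcription start site is an assumption, as are the `Gene` and `Peak` types, the function name `proximity_edges`, and the toy coordinates (not real annotations).

```python
from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    chrom: str
    tss: int  # transcription start site, in bp

class Peak(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int

def proximity_edges(genes, peaks, window=1_000_000):
    """Return (peak, gene, distance) for peaks whose midpoint lies within `window` bp of a gene's TSS."""
    edges = []
    for gene in genes:
        for peak in peaks:
            # Only peaks on the same chromosome can be cis-linked.
            if peak.chrom != gene.chrom:
                continue
            midpoint = (peak.start + peak.end) // 2
            distance = abs(midpoint - gene.tss)
            if distance <= window:
                edges.append((peak.name, gene.name, distance))
    return edges

# Toy coordinates (assumption), chosen to show one hit and two misses.
genes = [Gene("GENE_A", "chr11", 118_300_000)]
peaks = [
    Peak("peak_near", "chr11", 118_250_000, 118_250_500),
    Peak("peak_far", "chr11", 120_000_000, 120_000_500),
    Peak("peak_other_chrom", "chr1", 118_300_000, 118_300_500),
]
edges = proximity_edges(genes, peaks)
print(edges)  # [('peak_near', 'GENE_A', 49750)]
```

The nested loop is quadratic in the number of features; on real data an interval-overlap tool would replace it.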

MOFA+ Integration

Multi-Omics Factor Analysis (MOFA+) learns latent factors that capture both shared and modality-specific variation. Each factor can then be inspected for the biological process it reflects.
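
To illustrate what "shared" versus "modality-specific" means, the sketch below computes how much of each modality's variance each factor explains. It uses a plain SVD on concatenated views as a stand-in. This is not the MOFA+ model, which is a Bayesian factor model with sparsity priors. The function name `variance_explained_per_view` and the toy data are assumptions.

```python
import numpy as np

def variance_explained_per_view(views, n_factors):
    """Fraction of each view's variance captured by each factor of a joint SVD."""
    # Center each feature, then place the views side by side (cells x all features).
    centered = [v - v.mean(axis=0) for v in views]
    joint = np.hstack(centered)
    _, s, vt = np.linalg.svd(joint, full_matrices=False)
    r2 = np.zeros((n_factors, len(views)))
    start = 0
    for m, view in enumerate(centered):
        stop = start + view.shape[1]
        total = (view ** 2).sum()
        for k in range(n_factors):
            # Variance of view m reconstructed by factor k alone.
            r2[k, m] = (s[k] ** 2) * (vt[k, start:stop] ** 2).sum() / total
        start = stop
    return r2

# Toy data (assumption): one signal drives both views, another drives RNA only.
rng = np.random.default_rng(0)
shared = rng.normal(size=(100, 1))
rna_only = rng.normal(size=(100, 1))
rna = shared @ rng.normal(size=(1, 30)) + rna_only @ rng.normal(size=(1, 30))
atac = shared @ rng.normal(size=(1, 40))
r2 = variance_explained_per_view([rna, atac], n_factors=2)
print(np.round(r2, 2))  # rows: factors, columns: RNA, ATAC
```

A factor with high values in both columns is shared. A factor with a high value in only one column is specific to that modality.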

GRN Inference

GLUE feature embeddings place genes and peaks in the same vector space. Peaks whose embeddings have high cosine similarity to a gene's embedding are predicted to be cis-regulatory elements of that gene. TF motif scanning (via JASPAR) on these peaks yields a three-layer network: TF → enhancer peak → target gene.
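
The cosine-similarity step can be sketched as follows, assuming gene and peak embeddings are already available as NumPy arrays. The function name `cosine_links`, the 0.5 threshold, and the two-dimensional toy embeddings are assumptions, not the pipeline's actual settings.

```python
import numpy as np

def cosine_links(gene_emb, peak_emb, gene_names, peak_names, threshold=0.5):
    """Return (peak, gene, score) for pairs with cosine similarity >= threshold."""
    # Normalize rows to unit length so a dot product equals cosine similarity.
    gene_unit = gene_emb / np.linalg.norm(gene_emb, axis=1, keepdims=True)
    peak_unit = peak_emb / np.linalg.norm(peak_emb, axis=1, keepdims=True)
    sim = peak_unit @ gene_unit.T  # peaks x genes
    links = [
        (peak_names[i], gene_names[j], float(sim[i, j]))
        for i, j in zip(*np.nonzero(sim >= threshold))
    ]
    # Strongest links first.
    return sorted(links, key=lambda link: -link[2])

# Toy embeddings (assumption): peak_1 points towards GENE_A, peak_2 towards GENE_B.
gene_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
peak_emb = np.array([[2.0, 0.1], [0.1, 3.0], [-1.0, -1.0]])
links = cosine_links(
    gene_emb, peak_emb, ["GENE_A", "GENE_B"], ["peak_1", "peak_2", "peak_3"]
)
for peak, gene, score in links:
    print(f"{peak} -> {gene}: {score:.3f}")
```

In practice the comparison would likely be limited to peak-gene pairs that already pass the genomic proximity filter, rather than all pairs as in this toy example.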


Citation

If you use scMultiome in your research, please cite:

Bredikhin, D. et al. (2022). MUON: multimodal omics analysis framework. Genome Biology.
Cao, Z.-J. & Gao, G. (2022). Multi-omics single-cell data integration and regulatory inference with graph-linked embedding. Nature Biotechnology.
