scPairs: Identifying Synergistic Gene Pairs in Single-Cell and Spatial Transcriptomics

Overview

scPairs is an R package for the systematic identification and multi-evidence evaluation of synergistic gene pairs in single-cell and spatial transcriptomics data. Unlike conventional pairwise co-expression analyses that rely on a single correlation metric, scPairs integrates 14 complementary metrics across five orthogonal evidence layers to compute a composite synergy score with optional permutation-based significance testing.

The five evidence layers span cell-level co-expression (Pearson, Spearman, biweight midcorrelation, mutual information, ratio consistency), neighbourhood-aware smoothing (KNN-smoothed correlation, neighbourhood co-expression, cluster pseudo-bulk, cross-cell-type, neighbourhood synergy), prior biological knowledge (GO/KEGG co-annotation Jaccard, pathway bridge score), trans-cellular interaction, and spatial co-variation (Lee's L, co-location quotient). This multi-scale design enables researchers to move beyond simple co-expression towards a comprehensive characterisation of cooperative gene regulation at transcriptomic and spatial resolution.

1. Preparation

1.1 Installation

# Install from GitHub
if (!require("devtools")) install.packages("devtools")
devtools::install_github("zhaoqing-wang/scPairs")

# Optional: prior knowledge integration (GO/KEGG annotation)
if (!require("BiocManager")) install.packages("BiocManager")
BiocManager::install(c("org.Mm.eg.db", "org.Hs.eg.db", "AnnotationDbi"))

Install missing CRAN dependencies manually

install.packages(c("data.table", "ggplot2", "ggraph", "ggrepel",
                   "igraph", "Matrix", "patchwork", "Seurat",
                   "tidygraph", "tidyr"))
# Optional accelerators and extras
install.packages(c("RANN", "ggExtra", "crayon"))

1.2 Dependencies

Required: Seurat (≥ 4.0), data.table, ggplot2, ggraph, ggrepel, igraph, Matrix, patchwork, tidygraph, tidyr

Optional:

Package	Purpose
`org.Mm.eg.db` / `org.Hs.eg.db`	GO & KEGG prior knowledge (mouse / human)
`AnnotationDbi`	Gene annotation infrastructure
`RANN`	Fast approximate KNN (10–100× speedup for neighbourhood metrics)
`ggExtra`	Marginal density panels in `PlotPairScatter()`
`crayon`	Colourised startup message

2. Quick Start

library(scPairs)
sce <- readRDS("your_data.rds")  # Seurat object, normalised

2.1 Global Pair Discovery

# All 14 metrics (default)
result <- FindAllPairs(sce, n_top_genes = 1000, top_n = 200)

# Expression metrics only — no annotation databases required
result <- FindAllPairs(sce, mode = "expression")

# Prior knowledge only — fast pathway-based screening
result <- FindAllPairs(sce, mode = "prior_only", organism = "mouse")

# With permutation testing (empirical p-values)
result <- FindAllPairs(sce, n_top_genes = 500, n_perm = 999)

# Visualise
PlotPairNetwork(result, top_n = 50)
PlotPairHeatmap(result, top_n = 25)

2.2 Query-Centric Partner Search

tp53_partners <- FindGenePairs(sce, gene = "TP53", top_n = 20)
PlotPairNetwork(tp53_partners)
PlotPairDimplot(sce, gene1 = "TP53", gene2 = tp53_partners$pairs$gene2[1])

2.3 In-Depth Pair Assessment

assessment <- AssessGenePair(sce, gene1 = "CD8A", gene2 = "CD8B")
print(assessment)                                          # structured summary

PlotPairSmoothed(sce, gene1 = "CD8A", gene2 = "CD8B")    # raw vs KNN-smoothed
PlotPairSummary(sce,  gene1 = "CD8A", gene2 = "CD8B",
                result = assessment)                       # full evidence dashboard

# All visualisation functions accept any result class interchangeably
PlotPairNetwork(assessment)
PlotPairHeatmap(assessment)

2.4 Prior Knowledge & Bridge Gene Network

# Requires: BiocManager::install(c("org.Mm.eg.db", "AnnotationDbi"))
result <- AssessGenePair(sce, gene1 = "Adora2a", gene2 = "Ido1",
                         organism = "mouse")

# 6-panel synergy dashboard (expression + neighbourhood + prior evidence)
PlotPairSynergy(sce, gene1 = "Adora2a", gene2 = "Ido1")

# Standalone bridge gene network
PlotBridgeNetwork(sce, gene1 = "Adora2a", gene2 = "Ido1",
                  organism = "mouse", top_bridges = 15)

# Custom interaction databases (CellChatDB, CellPhoneDB, SCENIC, etc.)
custom_db <- data.frame(gene1 = c("Adora2a"), gene2 = c("Ido1"))
result <- FindGenePairs(sce, gene = "Adora2a", custom_pairs = custom_db)

2.5 Spatial Transcriptomics

Spatial metrics are detected and computed automatically for Visium, MERFISH, Slide-seq, and similar modalities:

result <- FindAllPairs(spatial_obj, n_top_genes = 500)   # spatial metrics auto-enabled
PlotPairSpatial(spatial_obj, gene1 = "EPCAM", gene2 = "KRT8")

3. Architecture

3.1 Computation Modes

Mode	Metrics Computed	Dependencies	Use Case
`"all"` (default)	All 14 metrics	All packages	Full multi-evidence analysis
`"expression"`	Co-expression + neighbourhood (10)	No annotation DBs	No external databases needed
`"prior_only"`	Prior knowledge scores only (2)	org.*.db, AnnotationDbi	Fast pathway screening

3.2 Five Evidence Layers (14 Metrics)

Layer	Metrics	Weight
Cell-level (5)	Pearson (`cor_pearson`), Spearman (`cor_spearman`), Biweight midcorrelation (`cor_biweight`), Mutual information (`mi_score`), Ratio consistency (`ratio_consistency`)	1.0 – 1.5
Neighbourhood (5)	KNN-smoothed correlation (`smoothed_cor`), Neighbourhood score (`neighbourhood_score`), Cluster correlation (`cluster_cor`), Cross-cell-type score (`cross_celltype_score`), Neighbourhood synergy (`neighbourhood_synergy`)	1.2 – 1.5
Prior knowledge (2)	GO/KEGG co-annotation Jaccard (`prior_score`), Pathway bridge score (`bridge_score`)	1.8 – 2.0
Spatial (2)	Lee's L (`spatial_lee_L`), Co-location quotient (`spatial_clq`)	1.2 – 1.5

Metrics are rank-normalised to [0, 1] and combined via weighted summation:

$$\text{synergy_score} = \frac{\sum_i w_i \cdot \text{rank_norm}(m_i)}{\sum_i w_i}$$

3.3 Unified Output Schema

All three workflows produce a pairs data.table with consistent columns: gene1, gene2, synergy_score, rank, confidence, plus all metric columns. Every visualisation function accepts any result class interchangeably.

Confidence assignment:

With permutation p-values	Without p-values (score quantiles)
High: p_adj < 0.01	High: ≥ 95th percentile
Medium: p_adj < 0.05	Medium: ≥ 80th percentile
Low: p_adj < 0.10	Low: ≥ 50th percentile

4. Visualization Gallery

The following plots are generated from the built-in scpairs_testdata object (100 cells × 20 genes, synthetic co-expression patterns). GENE3 and GENE4 are the injected globally co-expressed pair (Pearson r ≈ 0.89).

Pair Discovery — Network & Heatmap

Gene Pair Network	Synergy Score Heatmap

Left: PlotPairNetwork() — synergy-weighted interaction graph; node size ∝ connectivity, edge width ∝ synergy score.
Right: PlotPairHeatmap() — pairwise synergy matrix; top-ranked pairs clustered by score.

Co-expression — UMAP Overlays

Raw expression (3-panel):

PlotPairDimplot() — UMAP coloured by individual gene expression and co-expression product. GENE3 and GENE4 show overlapping high-expression regions, revealing their synergistic co-activation.

Raw vs KNN-smoothed (6-panel):

PlotPairSmoothed() — top row: raw expression; bottom row: KNN-smoothed. Smoothing reduces single-cell noise and clarifies the co-expression domain across the embedding.

Expression Distribution & Scatter

Violin by Cluster	Gene–Gene Scatter

Left: PlotPairViolin() — expression distributions of GENE3, GENE4, and their product per cluster.
Right: PlotPairScatter() — cell-level gene–gene scatter coloured by cluster; positive correlation visible across all three groups.

5. Functions

5.1 Discovery (3 functions)

Function	Purpose
`FindAllPairs()`	Global discovery of synergistic gene pairs across all variable genes
`FindGenePairs()`	Find synergistic partners for a specific query gene
`AssessGenePair()`	In-depth assessment of a specific gene pair with per-cluster detail

5.2 Visualization (11 functions)

Function	Input	Purpose
`PlotPairNetwork()`	Any result class	Synergy-weighted gene interaction network
`PlotPairHeatmap()`	Any result class	Synergy score heatmap across top gene pairs
`PlotPairDimplot()`	Seurat object	UMAP co-expression overlay (3-panel)
`PlotPairSmoothed()`	Seurat object	Raw + KNN-smoothed UMAP expression (6-panel)
`PlotPairSummary()`	Seurat object	Comprehensive multi-panel summary figure
`PlotPairSpatial()`	Spatial Seurat	Spatial co-expression map (3-panel)
`PlotPairCrossType()`	Seurat object	Cross-cell-type interaction heatmap
`PlotPairViolin()`	Seurat object	Expression distributions by cluster
`PlotPairScatter()`	Seurat object	Gene–gene scatter with optional marginal densities
`PlotPairSynergy()`	Seurat object	Multi-evidence synergy dashboard (4-panel)
`PlotBridgeNetwork()`	Seurat object	Bridge gene network with pathway-weighted edges

6. Built-in Test Dataset

scpairs_testdata is a minimal synthetic Seurat object shipped with the package for immediate use in examples and tests:

data(scpairs_testdata)
scpairs_testdata
#> An object of class Seurat
#> 20 features across 100 samples within 1 assay
#> Active assay: RNA (20 features, 20 variable features)
#>  3 layers present: counts, data, scale.data
#>  2 dimensional reductions calculated: pca, umap

Property	Value
Genes	GENE1–GENE20 (synthetic)
Cells	CELL001–CELL100
Clusters	3 balanced clusters (`seurat_clusters`)
Reductions	PCA (5 components), UMAP (2 dimensions)
Co-expression	GENE3/GENE4 globally (r ≈ 0.89); GENE1/GENE2 cluster-1-specific
File size	26 KB (xz-compressed)

The dataset is generated by data-raw/make_testdata.R with set.seed(7391) for full reproducibility.

7. Citation

Wang Z (2026). scPairs: Identifying Synergistic Gene Pairs in Single-Cell and
Spatial Transcriptomics. R package version 0.1.8.
https://github.com/zhaoqing-wang/scPairs

8. License

MIT

9. Contact

Author: Zhaoqing Wang (ORCID) | Email: zhaoqingwang@mail.sdu.edu.cn | Issues: scPairs Issues

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
R		R
data		data
inst		inst
man		man
tests		tests
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
MD5		MD5
NAMESPACE		NAMESPACE
NEWS.md		NEWS.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

scPairs: Identifying Synergistic Gene Pairs in Single-Cell and Spatial Transcriptomics

Overview

Table of Contents

1. Preparation

1.1 Installation

1.2 Dependencies

2. Quick Start

2.1 Global Pair Discovery

2.2 Query-Centric Partner Search

2.3 In-Depth Pair Assessment

2.4 Prior Knowledge & Bridge Gene Network

2.5 Spatial Transcriptomics

3. Architecture

3.1 Computation Modes

3.2 Five Evidence Layers (14 Metrics)

3.3 Unified Output Schema

4. Visualization Gallery

Pair Discovery — Network & Heatmap

Co-expression — UMAP Overlays

Expression Distribution & Scatter

5. Functions

5.1 Discovery (3 functions)

5.2 Visualization (11 functions)

6. Built-in Test Dataset

7. Citation

8. License

9. Contact

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

scPairs: Identifying Synergistic Gene Pairs in Single-Cell and Spatial Transcriptomics

Overview

Table of Contents

1. Preparation

1.1 Installation

1.2 Dependencies

2. Quick Start

2.1 Global Pair Discovery

2.2 Query-Centric Partner Search

2.3 In-Depth Pair Assessment

2.4 Prior Knowledge & Bridge Gene Network

2.5 Spatial Transcriptomics

3. Architecture

3.1 Computation Modes

3.2 Five Evidence Layers (14 Metrics)

3.3 Unified Output Schema

4. Visualization Gallery

Pair Discovery — Network & Heatmap

Co-expression — UMAP Overlays

Expression Distribution & Scatter

5. Functions

5.1 Discovery (3 functions)

5.2 Visualization (11 functions)

6. Built-in Test Dataset

7. Citation

8. License

9. Contact

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages