
05 Single Gene Set

songif edited this page Jun 12, 2026 · 3 revisions

Single Gene Set Analysis

dsge_perm_test()

Tests a single custom gene list against a permutation null, without running the full pathway pipeline.

Usage

my_genes <- c("CALN1", "GAD1", "GAD2", "SLC32A1", "SLC17A6")

test <- dsge_perm_test(
  gene_list        = my_genes,
  pvalue           = res$pvalue,
  base_mean        = res$baseMean,
  gene_names       = res$geneName,
  base_mean_cutoff = 0.1,
  n_perm           = 10000,
  seed             = 42,
  progress         = TRUE,
  heterogeneity    = FALSE,
  directional      = FALSE,
  direction_vec    = NULL,
  use_std          = TRUE,
  use_gpd          = TRUE,
  gpd_threshold    = 0.99,
  gpd_method       = "mle",
  safety_margin    = 1.6,
  nds_top_frac     = 0.25
)

test$observed   # DSGE of the gene set
test$p_value    # empirical right-tail p-value
test$dsge_std   # standardised DSGE (if use_std = TRUE)
test$nds        # Normalized Direction Score (if directional = TRUE)
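Because gene_names must be unique and gene_list is matched against it, a quick pre-flight check of the inputs can catch problems before a long permutation run. The following is a minimal sketch in base R, not part of the package: the toy `res` table and the "keep first occurrence" de-duplication policy are assumptions for illustration, and how dsge_perm_test() itself treats genes that are missing from gene_names is not stated on this page.

```r
# Toy results table standing in for a real differential expression result
# (hypothetical values; note the duplicated "GAD1" row).
res <- data.frame(
  geneName = c("GAD1", "GAD2", "SLC32A1", "ACTB", "GAD1"),
  pvalue   = c(0.001, 0.020, 0.300, 0.800, 0.004),
  baseMean = c(120, 85, 40, 5000, 118),
  stringsAsFactors = FALSE
)
my_genes <- c("CALN1", "GAD1", "GAD2")

# Gene symbols that appear more than once.
dup <- res$geneName[duplicated(res$geneName)]

# One possible policy (an assumption): keep the first row per symbol.
res_unique <- res[!duplicated(res$geneName), ]

# Gene-set members absent from / present in the results table.
missing <- setdiff(my_genes, res_unique$geneName)
present <- intersect(my_genes, res_unique$geneName)
```

After this check, `res_unique$pvalue`, `res_unique$baseMean`, and `res_unique$geneName` are aligned vectors of equal length that could be passed as pvalue, base_mean, and gene_names.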

Parameters

| Parameter | Default | Description |
|---|---|---|
| gene_list | (required) | Character vector of gene symbols |
| pvalue | (required) | p-value vector from differential expression analysis |
| base_mean | NULL | Mean expression vector; NULL skips filtering |
| gene_names | (required) | Gene symbols, must be unique |
| base_mean_cutoff | 0.1 | Exclude genes with baseMean at or below this value |
| n_perm | 10000 | Number of permutations |
| seed | NULL | Random seed |
| progress | TRUE | Show progress bar |
| heterogeneity | FALSE | If TRUE, also compute Gini, CV, and het_p |
| directional | FALSE | If TRUE, compute NDS using direction_vec |
| direction_vec | NULL | Numeric vector (e.g. log2FoldChange); required when directional = TRUE |
| use_std | TRUE | If TRUE, return dsge_std = (observed - mean(null)) / sd(null) |
| use_gpd | TRUE | If TRUE, use GPD tail extrapolation (avoids p = 0) |
| gpd_threshold | 0.99 | Tail quantile threshold for GPD fitting |
| gpd_method | "mle" | GPD estimation method passed to POT::fitgpd |
| safety_margin | 1.6 | Safety margin for GPD support-constrained adjustment |
| nds_top_frac | 0.25 | Fraction of most-perturbed genes retained for NDS (only used when directional = TRUE) |
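The standardised score and the empirical right-tail p-value can be illustrated with a few lines of base R. This is a sketch under stated assumptions, not the package's internal code: `null_stats` is a simulated stand-in for the permutation null (the real null comes from the n_perm permutations), and `observed` is a made-up value. It also shows why a plain empirical p-value can be exactly 0, which is the situation use_gpd is described as avoiding.

```r
set.seed(1)

# Stand-in for the permutation null distribution (assumption: simulated here).
null_stats <- rnorm(10000)

# Hypothetical observed statistic for the gene set.
observed <- 2.5

# Standardised score, following the formula given for use_std:
# (observed - mean(null)) / sd(null)
dsge_std <- (observed - mean(null_stats)) / sd(null_stats)

# Plain empirical right-tail p-value: share of null draws at or above observed.
p_emp <- mean(null_stats >= observed)

# If the observed value exceeds every null draw, the empirical p-value is 0,
# however many permutations were run.
extreme <- max(null_stats) + 1
p_zero  <- mean(null_stats >= extreme)
```

With a finite number of permutations the smallest non-zero empirical p-value is 1 / n_perm, so resolving smaller p-values requires either more permutations or a model for the tail.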

calc_dsge()

Computes a single scalar DSGE for the whole transcriptome and returns it together with the per-gene z-score pool.

dsge_res <- calc_dsge(
  pvalue           = res$pvalue,
  base_mean        = res$baseMean,
  base_mean_cutoff = 0.1
)

dsge_res$dsge       # scalar: global perturbation strength
dsge_res$n_genes    # number of genes after filtering
hist(dsge_res$z_scores, breaks = 100)

Parameters

| Parameter | Default | Description |
|---|---|---|
| pvalue | (required) | p-value vector from differential expression analysis |
| base_mean | NULL | Mean expression vector; NULL skips filtering |
| base_mean_cutoff | 0.1 | Exclude genes with mean expression at or below this value |
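The filtering rule described for base_mean and base_mean_cutoff can be sketched in base R as follows. The toy vectors are invented for illustration, and this shows only the documented rule (drop genes at or below the cutoff; NULL skips filtering), not the package's actual implementation.

```r
# Toy inputs (hypothetical values).
pvalue    <- c(0.001, 0.20, 0.04, 0.75, 0.03)
base_mean <- c(150, 0.05, 12, 0.1, 3.2)
cutoff    <- 0.1

# Keep genes strictly above the cutoff; a NULL base_mean keeps everything.
keep <- if (is.null(base_mean)) {
  rep(TRUE, length(pvalue))
} else {
  base_mean > cutoff
}

pvalue_kept <- pvalue[keep]
n_genes     <- sum(keep)  # number of genes remaining after filtering
```

Note that a gene with mean expression exactly equal to the cutoff (0.1 here) is excluded, matching the "at or below" wording.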
