04 DSGE Analysis

Pathway DSGE Analysis

The core function. Computes DSGE for every pathway, generates size-grouped permutation null distributions, fits GPD to the upper tail, and applies multiple testing correction.

Usage

result <- pathway_dsge(
  pathway_genes    = pw,
  pvalue           = res$pvalue,
  base_mean        = res$baseMean,
  gene_names       = res$geneName,
  gene_id_col      = "db_object_symbol",
  base_mean_cutoff = 0.1,
  min_size         = 5,
  max_size         = 500,
  n_perm           = 10000,
  seed             = 42,
  return_null      = TRUE,
  progress         = TRUE,
  heterogeneity    = FALSE,
  directional      = FALSE,
  direction_vec    = NULL,
  use_std          = TRUE,
  use_gpd          = TRUE,
  gpd_threshold    = 0.99,
  gpd_method       = "mle",
  safety_margin    = 1.6,
  n_cores          = 1,
  p_adjust_method  = "BY",
  nds_top_frac     = 0.25
)

result_tbl <- result$table    # data.frame, sorted by p_adj ascending

Parameters

Parameter	Default	Description
`pathway_genes`	(required)	Named list from `get_pathway_genes()` or `get_pathway_genes_db()`
`pvalue`	(required)	p-value vector from differential expression analysis
`base_mean`	`NULL`	Mean expression vector (e.g., DESeq2 baseMean); `NULL` skips filtering
`gene_names`	(required)	Gene symbols, must be unique
`gene_id_col`	`"db_object_symbol"`	Column in pathway data.frames to match `gene_names`
`base_mean_cutoff`	`0.1`	Exclude genes with baseMean at or below this value
`min_size`	`5`	Minimum matched genes per pathway
`max_size`	`500`	Maximum matched genes (set `Inf` to disable)
`n_perm`	`10000`	Permutations per size group
`seed`	`NULL`	Random seed for reproducibility
`return_null`	`FALSE`	If `TRUE`, return list with null distributions (needed for `plot_dsge`)
`progress`	`TRUE`	Show progress bars during computation
`heterogeneity`	`FALSE`	If `TRUE`, also compute Gini, CV, and heterogeneity p-values
`directional`	`FALSE`	If `TRUE`, compute Normalized Direction Score (NDS) using `direction_vec`
`direction_vec`	`NULL`	Numeric vector (e.g. log2FoldChange), same length as `pvalue`; required when `directional = TRUE`
`use_std`	`TRUE`	If `TRUE`, compute `(observed - mean(null)) / sd(null)` and include `dsge_std` column
`use_gpd`	`TRUE`	If `TRUE`, use GPD tail extrapolation with support-constrained adjustment (avoids p=0)
`gpd_threshold`	`0.99`	Tail quantile threshold for GPD fitting
`gpd_method`	`"mle"`	GPD estimation method passed to `POT::fitgpd`
`safety_margin`	`1.6`	Safety margin for GPD support-constrained adjustment
`n_cores`	`1`	Number of CPU cores for parallel null generation (uses Rcpp; limited benefit on Windows)
`p_adjust_method`	`"BY"`	Multiple testing correction method. Default BY (Benjamini-Yekutieli) for FDR control under arbitrary dependence
`nds_top_frac`	`0.25`	Fraction of most-perturbed genes retained for NDS calculation (only used when `directional = TRUE`)

Result Columns

Column	Description
`go_id`	GO term identifier
`go_name`	Human-readable GO term name
`aspect`	Ontology: BP (Biological Process), MF (Molecular Function), CC (Cellular Component)
`n_pathway`	Total genes in the pathway annotation
`n_matched`	Genes matched in the expression data
`dsge`	Observed DSGE (mean z-score of pathway genes)
`dsge_std`	Standardised DSGE (only when `use_std = TRUE`)
`nds`	Normalized Direction Score (only when `directional = TRUE`)
`gini`	Gini coefficient (only when `heterogeneity = TRUE`)
`cv`	Coefficient of Variation (only when `heterogeneity = TRUE`)
`p_value`	Permutation p-value
`p_adj`	Adjusted p-value (BY correction by default)
`het_p_value`	Heterogeneity p-value (only when `heterogeneity = TRUE`)
`het_p_adj`	Adjusted heterogeneity p-value (only when `heterogeneity = TRUE`)

Inspecting Results

# How many pathways are significant?
sum(result_tbl$p_adj < 0.05)

# Ontology breakdown
table(result_tbl$aspect)

# Top pathways
head(result_tbl[, c("go_id", "go_name", "aspect", "n_matched", "dsge", "dsge_std", "p_adj")])

# Save
write.csv(result_tbl, "pathway_dsge_results.csv", row.names = FALSE)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

04 DSGE Analysis

Pathway DSGE Analysis

Usage

Parameters

Result Columns

Inspecting Results

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally