02 Input Data
DSGE accepts p-values from any differential expression tool (DESeq2, edgeR, limma, Seurat, etc.).
Required columns:
- pvalue: nominal p-values (not adjusted)
- geneName (or similar): gene symbols/identifiers, must be unique
Optional columns:
- baseMean (or AveExpr): mean expression for expression-level filtering
- log2FoldChange (or similar): direction vector for NDS computation
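The column requirements above can be checked before handing a table to DSGE. The following is a minimal base-R sketch, not part of the package: `prepare_de_table()` is a hypothetical helper, the toy values are invented, and keeping the smallest p-value per duplicated symbol is an assumed policy that you may want to replace. Whether renaming is needed at all depends on how you pass column names to DSGE.

```r
# Toy limma-style table; all values are invented for illustration
toy_res <- data.frame(
  gene    = c("TP53", "FLT3", "", "MYC", "MYC", NA),
  P.Value = c(0.001, 0.20, 0.50, 0.40, 0.03, 0.01),
  AveExpr = c(8.1, 6.3, 2.0, 7.7, 7.9, 5.5),
  logFC   = c(1.2, -0.8, 0.1, 0.4, 2.0, -1.1),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of DSGE): rename tool-specific columns,
# then enforce the requirements listed above.
prepare_de_table <- function(df, rename) {
  # `rename` is a named vector: names = new column names, values = existing ones
  stopifnot(all(rename %in% names(df)))
  names(df)[match(rename, names(df))] <- names(rename)

  # Drop rows without a usable gene symbol or p-value
  ok <- !is.na(df$geneName) & df$geneName != "" & !is.na(df$pvalue)
  df <- df[ok, , drop = FALSE]

  # Nominal p-values must lie in [0, 1]
  stopifnot(all(df$pvalue >= 0 & df$pvalue <= 1))

  # Assumed policy: for duplicated symbols keep the row with the smallest p-value
  df <- df[order(df$pvalue), , drop = FALSE]
  df <- df[!duplicated(df$geneName), , drop = FALSE]
  rownames(df) <- NULL
  df
}

clean <- prepare_de_table(
  toy_res,
  rename = c(geneName = "gene", pvalue = "P.Value",
             baseMean = "AveExpr", log2FoldChange = "logFC")
)
```

For other tools only the `rename` vector changes, for example `PValue` and `logFC` for edgeR or `p_val` and `avg_log2FC` for Seurat.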
res <- read.csv("inst/data_exp/limma_FLT3_IR_vs_FLT3.csv", stringsAsFactors = FALSE)
# Remove genes without a valid symbol
res <- subset(res, gene != "" & !is.na(gene))

Instead of GAF + OBO files, you can build the pathway-gene map directly from a Bioconductor OrgDb package. This is simpler for common model organisms and avoids managing external files.
library(org.Hs.eg.db)
pw <- get_pathway_genes_db(org.Hs.eg.db)

| Parameter | Default | Description |
|---|---|---|
| orgdb | (required) | An OrgDb object (e.g., org.Hs.eg.db) |
| keytype | "ENTREZID" | Key type for gene IDs in the OrgDb |
| min_size | 5 | Drop pathways below this gene count |
| aspect | NULL | Ontology filter: "BP", "MF", "CC", or NULL (all) |
| evidence | NULL | Evidence code filter (e.g., "IDA"); NULL = all |
| attach_go_names | TRUE | Fetch GO term names via GO.db |
For the complete parameter list and usage details (non-model organisms, AnnotationHub, etc.), see the 03-Pathway-Genes page.
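To picture what the aspect, evidence, and min_size filters in the table above do, here is a base-R sketch on a toy annotation table. It is an illustration of the filtering idea only, not the implementation of get_pathway_genes_db(): `filter_annotations()` is a hypothetical helper, and the genes and term identifiers are placeholders.

```r
# Toy annotation table: one row per gene-term annotation (placeholder values)
ann <- data.frame(
  gene     = c("g1", "g2", "g3", "g1", "g2", "g4"),
  term     = c("term_A", "term_A", "term_A", "term_B", "term_B", "term_C"),
  aspect   = c("BP", "BP", "BP", "MF", "MF", "BP"),
  evidence = c("IDA", "IEA", "IDA", "IDA", "IDA", "IDA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented filters; NULL means "no filter"
filter_annotations <- function(ann, aspect = NULL, evidence = NULL, min_size = 5) {
  if (!is.null(aspect))   ann <- ann[ann$aspect %in% aspect, , drop = FALSE]
  if (!is.null(evidence)) ann <- ann[ann$evidence %in% evidence, , drop = FALSE]

  # Build the pathway -> unique genes map
  sets <- lapply(split(ann$gene, ann$term), unique)

  # Drop pathways with fewer than min_size genes
  sets[lengths(sets) >= min_size]
}

pw_toy <- filter_annotations(ann, aspect = "BP", evidence = "IDA", min_size = 2)
```

Note that filtering by evidence happens before the size check here, so a stricter evidence filter can push a pathway below min_size. Whether DSGE applies the filters in this order is an assumption of the sketch.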
read_gaf() reads GAF 2.2 tab-separated files with data.table::fread. Comment lines starting with ! are skipped automatically.
gaf <- read_gaf("data_exp/goa_human.gaf/goa_human.gaf")
# Inspect the file metadata
head(get_gaf_header("data_exp/goa_human.gaf/goa_human.gaf"))

| Parameter | Default | Description |
|---|---|---|
| file | (required) | Path to the GAF file |
| col_names | GAF_COLUMNS | 17 GAF 2.2 column names; set NULL to auto-detect |
| ... | - | Passed to data.table::fread |
The 17 standard columns: db, db_object_id, db_object_symbol, qualifier, go_id, db_reference, evidence_code, with_from, aspect, db_object_name, db_object_synonym, db_object_type, taxon, date, assigned_by, annotation_extension, gene_product_form_id.
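To make the file layout concrete, the sketch below parses two made-up GAF-style rows with base R and the 17 column names listed above. It is a stand-in for illustration, not how read_gaf() is implemented (the package uses data.table::fread), and the row contents are placeholders rather than real annotations.

```r
# The 17 standard GAF 2.2 column names, as listed above
gaf_cols <- c(
  "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
  "db_reference", "evidence_code", "with_from", "aspect", "db_object_name",
  "db_object_synonym", "db_object_type", "taxon", "date", "assigned_by",
  "annotation_extension", "gene_product_form_id"
)

# Two placeholder annotation rows (17 fields each) after a "!" header line
row1 <- c("DB", "ID0001", "GENE1", "enables", "TERM:1", "REF:1", "IDA", "",
          "F", "example protein one", "", "protein", "taxon:9606",
          "20240101", "SRC", "", "")
row2 <- c("DB", "ID0002", "GENE2", "involved_in", "TERM:2", "REF:2", "IEA", "",
          "P", "example protein two", "", "protein", "taxon:9606",
          "20240102", "SRC", "", "")
toy_lines <- c("!gaf-version: 2.2",
               paste(row1, collapse = "\t"),
               paste(row2, collapse = "\t"))

# quote = "" keeps quotes inside fields literal. comment.char = "!" drops the
# header line, but it would also truncate a field containing "!", so this is
# only suitable for the toy input. colClasses = "character" stops R from
# converting columns (an all-"F" aspect column would otherwise become logical).
toy_gaf <- read.delim(
  text = toy_lines, header = FALSE, sep = "\t", quote = "",
  comment.char = "!", col.names = gaf_cols, colClasses = "character"
)
```

In GAF files the aspect column holds single letters (P, F, C), unlike the "BP", "MF", "CC" values used by the aspect parameter in OrgDb mode.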
read_obo() extracts id, name, and namespace from each [Term] stanza of an OBO file. It is only needed in GAF mode, not with get_pathway_genes_db().
go <- read_obo("data_exp/go.obo")

| Parameter | Default | Description |
|---|---|---|
| file | (required) | Path to the OBO file |
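The stanza extraction described above can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the code behind read_obo(): `parse_obo_terms()` is a hypothetical helper, the input lines are invented, and only the first occurrence of each tag in a stanza is used.

```r
# Invented OBO-style lines: a file header, two [Term] stanzas, one [Typedef]
obo_lines <- c(
  "format-version: 1.2",
  "",
  "[Term]",
  "id: TERM:1",
  "name: example process",
  "namespace: biological_process",
  "",
  "[Typedef]",
  "id: part_of",
  "name: part of",
  "",
  "[Term]",
  "id: TERM:2",
  "name: example function",
  "namespace: molecular_function"
)

# Hypothetical helper: returns one row per [Term] stanza
parse_obo_terms <- function(lines) {
  # Assign each line to a stanza; the counter increments at every "[...]" line
  stanza_id <- cumsum(grepl("^\\[.+\\]$", lines))
  stanzas <- split(lines, stanza_id)

  # Keep only [Term] stanzas (drops the file header and [Typedef] blocks)
  stanzas <- Filter(function(s) s[1] == "[Term]", stanzas)

  # Value of the first "tag: value" line in a stanza, or NA if absent
  get_tag <- function(s, tag) {
    pattern <- paste0("^", tag, ": ")
    hit <- grep(pattern, s, value = TRUE)
    if (length(hit) == 0) NA_character_ else sub(pattern, "", hit[1])
  }

  data.frame(
    id        = unname(vapply(stanzas, get_tag, character(1), tag = "id")),
    name      = unname(vapply(stanzas, get_tag, character(1), tag = "name")),
    namespace = unname(vapply(stanzas, get_tag, character(1), tag = "namespace")),
    stringsAsFactors = FALSE
  )
}

toy_terms <- parse_obo_terms(obo_lines)
```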
| File | Format | Where to get it |
|---|---|---|
| Differential expression results | CSV/table with pvalue, baseMean, geneName columns (column names can be adapted) | Your own DE analysis (DESeq2, edgeR, Seurat, limma, etc.) |
| GAF annotations (mode A) | GAF 2.2 (tab-separated, 17 cols) | GOA Human |
| OBO ontology (mode A) | OBO 1.2/1.4 | Gene Ontology Downloads |
| OrgDb (mode B) | Bioconductor OrgDb package | BiocManager::install("org.Hs.eg.db") or AnnotationHub |