02 Input Data
DSGE accepts p-values from any differential expression tool (DESeq2, edgeR, limma, Seurat, etc.).
Required columns:

- `pvalue`: nominal p-values (not adjusted)
- `geneName` (or similar): gene symbols/identifiers, must be unique

Optional columns:

- `baseMean` (or `AveExpr`): mean expression for expression-level filtering
- `log2FoldChange` (or similar): direction vector for NDS computation
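
The column requirements above can be checked before running the analysis. The following is a minimal sketch in base R on a made-up table; the toy values and the policy of keeping the smallest p-value per duplicated gene are assumptions for illustration, not something DSGE prescribes.

```r
# Toy DE results table (values are invented for illustration).
res <- data.frame(
  geneName       = c("FLT3", "NPM1", "", "TP53", "TP53"),
  pvalue         = c(0.001, 0.20, 0.03, 0.50, 0.04),
  baseMean       = c(150.2, 80.5, 12.0, 300.1, 295.0),
  log2FoldChange = c(1.8, -0.4, 0.9, -2.1, -2.0),
  stringsAsFactors = FALSE
)

# 1. Required columns must be present.
required <- c("pvalue", "geneName")
missing_cols <- setdiff(required, names(res))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# 2. Drop rows without a usable gene symbol or p-value.
keep <- !is.na(res$geneName) & res$geneName != "" & !is.na(res$pvalue)
res <- res[keep, , drop = FALSE]

# 3. Make gene identifiers unique. Assumed policy: keep the row with the
#    smallest p-value for each gene.
res <- res[order(res$pvalue), , drop = FALSE]
res <- res[!duplicated(res$geneName), , drop = FALSE]

# 4. Nominal p-values must lie in [0, 1].
stopifnot(all(res$pvalue >= 0 & res$pvalue <= 1))
```

Other deduplication policies (for example, keeping the row with the highest mean expression) may suit some datasets better; the point is only that the gene column ends up unique.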
```r
res <- read.csv("inst/data_exp/limma_FLT3_IR_vs_FLT3.csv", stringsAsFactors = FALSE)
# Remove genes without a valid symbol
res <- subset(res, gene != "" & !is.na(gene))
```

Instead of GAF + OBO files, you can build the pathway-gene map directly from a Bioconductor OrgDb package. This is simpler for common model organisms and avoids managing external files.
```r
library(org.Hs.eg.db)
pw <- get_pathway_genes_db(org.Hs.eg.db)
```

| Parameter | Default | Description |
|---|---|---|
| `orgdb` | (required) | An OrgDb object (e.g., `org.Hs.eg.db`) |
| `keytype` | `"ENTREZID"` | Key type for gene IDs in the OrgDb |
| `min_size` | `5` | Drop pathways below this gene count |
| `aspect` | `NULL` | Ontology filter: `"BP"`, `"MF"`, `"CC"`, or `NULL` (all) |
| `evidence` | `NULL` | Evidence code filter (e.g., `"IDA"`); `NULL` = all |
| `attach_names` | `TRUE` | Fetch pathway names (requires GO.db, KEGGREST, or reactome.db depending on source) |

For the complete parameter list and usage details (non-model organisms, AnnotationHub, etc.), see the 03-Pathway-Genes.md page.
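
To illustrate what a `min_size` threshold does, here is a small base R sketch on a made-up pathway-gene map. The long-format layout, the column names `pathway` and `gene`, and the toy identifiers are assumptions; the actual structure returned by `get_pathway_genes_db()` may differ.

```r
# Hypothetical long-format pathway-gene map (layout is an assumption).
pw_map <- data.frame(
  pathway = c(rep("GO:0000001", 6), rep("GO:0000002", 3), rep("GO:0000003", 5)),
  gene    = paste0("gene", 1:14),
  stringsAsFactors = FALSE
)

min_size <- 5L

# Count distinct genes per pathway.
sizes <- tapply(pw_map$gene, pw_map$pathway, function(g) length(unique(g)))

# Keep only pathways with at least `min_size` genes.
keep_ids <- names(sizes)[sizes >= min_size]
pw_filtered <- pw_map[pw_map$pathway %in% keep_ids, , drop = FALSE]
```

In this toy map the pathway with three genes is dropped, while the two with five and six genes are retained.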
`read_gaf()` reads GAF 2.2 tab-separated files with `data.table::fread`. Comment lines starting with `!` are skipped automatically.
```r
gaf <- read_gaf("data_exp/goa_human.gaf/goa_human.gaf")
# Inspect the file metadata
head(get_gaf_header("data_exp/goa_human.gaf/goa_human.gaf"))
```

| Parameter | Default | Description |
|---|---|---|
| `file` | (required) | Path to the GAF file |
| `col_names` | `GAF_COLUMNS` | 17 GAF 2.2 column names; set `NULL` to auto-detect |
| `...` | - | Passed to `data.table::fread` |

The 17 standard columns: db, db_object_id, db_object_symbol, qualifier, go_id, db_reference, evidence_code, with_from, aspect, db_object_name, db_object_synonym, db_object_type, taxon, date, assigned_by, annotation_extension, gene_product_form_id.
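
For readers unfamiliar with the GAF layout, the sketch below parses a tiny in-memory GAF snippet using only base R. It is not how `read_gaf()` is implemented (that uses `data.table::fread`); the two annotation rows are invented, and `gaf_columns` simply repeats the 17 names listed above.

```r
gaf_columns <- c(
  "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
  "db_reference", "evidence_code", "with_from", "aspect",
  "db_object_name", "db_object_synonym", "db_object_type", "taxon",
  "date", "assigned_by", "annotation_extension", "gene_product_form_id"
)

# Two invented annotation rows with 17 fields each (empty fields are allowed).
row1 <- c("UniProtKB", "P00001", "GENE1", "enables", "GO:0003674",
          "PMID:0000000", "IDA", "", "F", "Example protein 1", "",
          "protein", "taxon:9606", "20240101", "UniProt", "", "")
row2 <- c("UniProtKB", "P00002", "GENE2", "involved_in", "GO:0008150",
          "PMID:0000000", "IEA", "", "P", "Example protein 2", "",
          "protein", "taxon:9606", "20240101", "UniProt", "", "")

gaf_lines <- c(
  "!gaf-version: 2.2",
  "!generated-by: example",
  paste(row1, collapse = "\t"),
  paste(row2, collapse = "\t")
)

# Separate header comments (lines starting with "!") from data lines.
is_comment <- startsWith(gaf_lines, "!")
gaf_header <- gaf_lines[is_comment]

# Parse the data lines as tab-separated text; keep every column as character.
gaf_toy <- read.delim(
  text = paste(gaf_lines[!is_comment], collapse = "\n"),
  header = FALSE, sep = "\t", quote = "", comment.char = "",
  col.names = gaf_columns, colClasses = "character"
)
```

Filtering on `evidence_code` or `aspect` then becomes an ordinary data frame subset, for example `gaf_toy[gaf_toy$evidence_code == "IDA", ]`.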
`read_obo()` extracts `id`, `name`, and `namespace` from each `[Term]` stanza of an OBO file. It is only needed in GAF mode (not with `get_pathway_genes_db()`).
```r
go <- read_obo("data_exp/go.obo")
```

| Parameter | Default | Description |
|---|---|---|
| `file` | (required) | Path to the OBO file |
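
The stanza extraction described for OBO files can be sketched in base R as follows. This is an illustrative parser on an invented snippet, not the package's own code; the helper `get_tag()` and the toy term names are hypothetical.

```r
# Invented OBO snippet: a file header, two [Term] stanzas, one [Typedef].
obo_lines <- c(
  "format-version: 1.2",
  "",
  "[Term]",
  "id: GO:0000001",
  "name: example term one",
  "namespace: biological_process",
  "",
  "[Term]",
  "id: GO:0000002",
  "name: example term two",
  "namespace: molecular_function",
  "",
  "[Typedef]",
  "id: part_of",
  "name: part of"
)

# Group lines into stanzas: the counter increases at each "[...]" header,
# so group 0 holds the file header.
is_stanza_start <- grepl("^\\[.*\\]$", obo_lines)
stanzas <- split(obo_lines, cumsum(is_stanza_start))

# Keep only [Term] stanzas.
terms <- Filter(function(s) s[1] == "[Term]", stanzas)

# Hypothetical helper: return the value of the first "tag: value" line.
get_tag <- function(stanza, tag) {
  pattern <- paste0("^", tag, ": ")
  hit <- grep(pattern, stanza, value = TRUE)
  if (length(hit) == 0) NA_character_ else sub(pattern, "", hit[1])
}

go_toy <- data.frame(
  id        = vapply(terms, get_tag, character(1), tag = "id", USE.NAMES = FALSE),
  name      = vapply(terms, get_tag, character(1), tag = "name", USE.NAMES = FALSE),
  namespace = vapply(terms, get_tag, character(1), tag = "namespace", USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
```

The `[Typedef]` stanza and the file header are ignored, leaving one row per term.
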

| File | Format | Where to get it |
|---|---|---|
| Differential expression results | CSV/table with `pvalue`, `baseMean`, `geneName` columns (column names can be adapted) | Your own DE analysis (DESeq2, edgeR, Seurat, limma, etc.) |
| GAF annotations (mode A) | GAF 2.2 (tab-separated, 17 cols) | GOA Human |
| OBO ontology (mode A) | OBO 1.2/1.4 | Gene Ontology Downloads |
| OrgDb (mode B, GO) | Bioconductor OrgDb package | `BiocManager::install("org.Hs.eg.db")` or AnnotationHub |
| KEGGREST (KEGG) | Online API | `BiocManager::install("KEGGREST")` |
| reactome.db (Reactome) | Bioconductor annotation package | `BiocManager::install("reactome.db")` |

`get_pathway_genes_kegg()` extracts KEGG pathway-to-gene mappings from a Bioconductor OrgDb (via the `PATH` column). The organism is auto-detected from the OrgDb.
```r
library(org.Hs.eg.db)
pw <- get_pathway_genes_kegg(org.Hs.eg.db, min_size = 5L)
```

| Parameter | Default | Description |
|---|---|---|
| `orgdb` | (required) | An OrgDb object (e.g., `org.Hs.eg.db`) |
| `keytype` | `"ENTREZID"` | Key type for gene IDs in the OrgDb |
| `min_size` | `5` | Drop pathways below this gene count |
| `attach_path_names` | `TRUE` | Fetch pathway names via `KEGGREST::keggList()` (online) |

Requires: AnnotationDbi + KEGGREST. Name lookup requires network access; when unavailable, names are set to NA with a warning.
For the complete parameter list and supported organisms, see the 03-Pathway-Genes.md page.
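
The offline fallback described above (names set to `NA` with a warning) follows a common pattern that can be sketched in base R. Both `lookup_names()` and the `fetch` argument are hypothetical stand-ins; the real function's internals are not shown in this page.

```r
# Hypothetical wrapper: try a name lookup and fall back to NA on failure.
# `fetch` stands in for any function that may fail without network access.
lookup_names <- function(ids, fetch) {
  tryCatch(
    fetch(ids),
    error = function(e) {
      warning("Pathway name lookup failed; names set to NA: ",
              conditionMessage(e), call. = FALSE)
      rep(NA_character_, length(ids))
    }
  )
}

# Stand-in for a lookup with no network available.
offline_fetch <- function(ids) stop("no network")

# Stand-in for a successful lookup (invented names).
online_fetch <- function(ids) paste("Pathway", ids)

ids <- c("00010", "00020")
names_offline <- suppressWarnings(lookup_names(ids, offline_fetch))
names_online  <- lookup_names(ids, online_fetch)
```

With this pattern the pathway-gene map stays usable offline; only the human-readable names are missing.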
`get_pathway_genes_reactome()` extracts Reactome pathway-to-gene mappings from reactome.db (local, no network). Gene symbols are resolved from the OrgDb.
```r
library(org.Hs.eg.db)
pw <- get_pathway_genes_reactome(org.Hs.eg.db, min_size = 5L)
```

| Parameter | Default | Description |
|---|---|---|
| `orgdb` | (required) | An OrgDb object (for gene symbol resolution) |
| `keytype` | `"ENTREZID"` | Key type for querying the OrgDb |
| `min_size` | `5` | Drop pathways below this gene count |
| `attach_path_names` | `TRUE` | Fetch pathway names from reactome.db (local) |
| `species_prefix` | `"R-HSA"` | Reactome species prefix; `NULL` = all species |

Requires: AnnotationDbi + reactome.db. Both pathway data and names are fetched locally.
For the complete parameter list and species prefixes, see the 03-Pathway-Genes.md page.
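
As a rough illustration of what a species prefix filter amounts to, the base R sketch below keeps only identifiers that begin with a given prefix. The helper `filter_by_species()` and the toy IDs are hypothetical, and matching on the prefix followed by a hyphen is an assumption about how the comparison is done.

```r
# Invented Reactome-style identifiers for three species prefixes.
pathway_ids <- c("R-HSA-0000001", "R-MMU-0000002", "R-HSA-0000003", "R-DME-0000004")

# Hypothetical helper: keep IDs starting with "<prefix>-"; NULL keeps everything.
filter_by_species <- function(ids, species_prefix = "R-HSA") {
  if (is.null(species_prefix)) {
    return(ids)
  }
  ids[startsWith(ids, paste0(species_prefix, "-"))]
}

human_ids <- filter_by_species(pathway_ids)
all_ids   <- filter_by_species(pathway_ids, species_prefix = NULL)
```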