pairedGSEA is an R package for running a paired analysis of differential gene expression (DGE) and differential gene splicing (DGS). Given bulk RNA-seq count data, pairedGSEA combines the results of DESeq2 (DGE) and DEXSeq (DGS), aggregates the p-values to gene level, and lets you run a subsequent gene set over-representation analysis using its implementation of the fgsea::fora function.
pairedGSEA is published in BMC Biology.
Please cite with citation("pairedGSEA")
Dependencies
# Install Bioconductor dependencies
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("SummarizedExperiment", "S4Vectors", "DESeq2", "DEXSeq", "fgsea", "sva", "BiocParallel"))
Install pairedGSEA from Bioconductor
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("pairedGSEA")
Install development version from GitHub
# Install pairedGSEA from GitHub
devtools::install_github("shdam/pairedGSEA", build_vignettes = TRUE)
To view the documentation for the version of this package installed on your system, start R and enter:
browseVignettes("pairedGSEA")
Please see the User Guide vignette for a detailed description of usage.
Here is a quick run-through of the functions:
Load example data.
suppressPackageStartupMessages(library("SummarizedExperiment"))
library("pairedGSEA")
data("example_se")
example_se
#> class: SummarizedExperiment
#> dim: 5611 6
#> metadata(0):
#> assays(1): counts
#> rownames(5611): ENSG00000282880:ENST00000635453
#> ENSG00000282880:ENST00000635195 ... ENSG00000249230:ENST00000504393
#> ENSG00000249244:ENST00000505994
#> rowData names(0):
#> colnames(6): GSM1499784 GSM1499785 ... GSM1499791 GSM1499792
#> colData names(5): study id source final_description group_nr
Run paired differential analysis
set.seed(500) # For reproducible results
diff_results <- paired_diff(
    example_se,
    group_col = "group_nr",
    sample_col = "id",
    baseline = 1,
    case = 2,
    store_results = FALSE,
    quiet = TRUE
)
#> No significant surrogate variables
#> converting counts to integer mode
#> Warning in DESeqDataSet(rse, design, ignoreRank = TRUE): some variables in
#> design formula are characters, converting to factors
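As noted in the introduction, transcript-level p-values are aggregated to gene level. The sketch below illustrates the general idea only, using Fisher's method in base R on made-up p-values. It is an assumption chosen for illustration and not necessarily the aggregation that pairedGSEA performs internally. The names `pvals`, `genes`, and `fisher_combine` are hypothetical; the "gene:transcript" naming mirrors the row names of `example_se` shown above.

```r
# Toy transcript-level p-values (made up for illustration).
# Names follow the "gene:transcript" pattern of rownames(example_se).
pvals <- c(
    "geneA:tx1" = 0.01,
    "geneA:tx2" = 0.20,
    "geneB:tx1" = 0.50
)

# Gene identifier is the part before the colon
genes <- sub(":.*$", "", names(pvals))

# Fisher's method (illustrative assumption, not pairedGSEA's own code):
# under the null, -2 * sum(log(p)) follows a chi-squared distribution
# with 2k degrees of freedom for k independent p-values
fisher_combine <- function(p) {
    stat <- -2 * sum(log(p))
    pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# One aggregated p-value per gene
gene_pvals <- vapply(
    split(unname(pvals), genes),
    fisher_combine,
    numeric(1)
)
gene_pvals
```

A gene with a single transcript keeps its original p-value under this scheme, while genes with several transcripts get one combined value.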
Over-representation analysis of results
# Define gene sets in your preferred way
gene_sets <- pairedGSEA::prepare_msigdb(
    species = "Homo sapiens",
    category = "C5",
    gene_id_type = "ensembl_gene"
)
ora <- paired_ora(
    paired_diff_result = diff_results,
    gene_sets = gene_sets
)
#> Running over-representation analyses
#> Joining result
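For readers new to over-representation analysis, the underlying question is whether a gene set contains more significant genes than expected by chance, which is commonly answered with a hypergeometric test. The base R sketch below shows that test on made-up gene lists. It is an illustration under stated assumptions, not the code used by paired_ora or fgsea::fora, and all object names (`universe`, `significant`, `gene_set`) are hypothetical.

```r
# Toy inputs (made up for illustration)
universe    <- paste0("gene", 1:100)          # all genes that were tested
significant <- paste0("gene", 1:10)           # genes called significant
gene_set    <- paste0("gene", c(1:5, 50:54))  # one gene set of interest

# Number of significant genes that fall in the gene set
overlap <- length(intersect(significant, gene_set))

# Hypergeometric upper tail: probability of observing at least `overlap`
# gene-set members when drawing length(significant) genes from the universe
p_value <- phyper(
    q = overlap - 1,
    m = length(gene_set),
    n = length(universe) - length(gene_set),
    k = length(significant),
    lower.tail = FALSE
)
p_value
```

In a real analysis this test is repeated for every gene set, and the resulting p-values are then adjusted for multiple testing.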
You can now plot the enrichment scores against each other and identify pathways of interest.
plot_ora(
    ora,
    paired = TRUE # Available in version 1.1.0 and newer
) +
    ggplot2::theme_classic()
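Because the `paired` argument is only available in version 1.1.0 and newer, it can be worth checking the installed version first. The snippet below is one possible way to do that; `has_min_version` is a hypothetical helper written for this example and is not part of pairedGSEA.

```r
# Hypothetical helper (not part of pairedGSEA): returns TRUE if `pkg` is
# installed at `min_version` or newer, and FALSE otherwise
has_min_version <- function(pkg, min_version) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
        return(FALSE)
    }
    utils::packageVersion(pkg) >= min_version
}

# FALSE means pairedGSEA is missing or older than 1.1.0
has_min_version("pairedGSEA", "1.1.0")
```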
If you have any issues or questions regarding the use of pairedGSEA, please do not hesitate to raise an issue on GitHub. That way, others may also benefit from the answers and discussions.