Statistics of Enrichment Analysis Methods

reubenthomas edited this page Apr 12, 2024 · 4 revisions

Description

Gene set or pathway enrichment methods are used ubiquitously to assign functional or mechanistic significance to results from high-throughput assays like RNA-sequencing. This workshop will focus on the theory behind some of these methods and on the interpretation of the associated results. Although the different methods will be illustrated using R code, the emphasis will be on understanding the background and assumptions of the methods. See our other workshops for performing enrichment using online tools and R packages.

Learning Path

Intermediate: This is an intermediate workshop in both the Biostats series and the Pathway Analysis series. The demo session will be conducted in R. Participants are expected to have a basic understanding of statistics and experimental design, awareness of high-dimensional assays such as RNA-seq and mass spectrometry, and some familiarity with R code.

Material

All the material can be downloaded here in one zipped folder. Note that the files in the folder will be up to date. The material consists of:

  1. Slides which the instructor will go over.
  2. An R Markdown file describing the implementation of the different methods, starting from raw count data.
  3. An HTML file with the output produced by "knitting" the above R Markdown file.
  4. An RDS file that stores the SummarizedExperiment object with raw read count data from an RNA-seq experiment.
  5. An RDS file that stores a list of SummarizedExperiment objects. This list has only one element, the object referred to above, to which additional information from the differential expression analyses has been added.
  6. An RData file with information on three gene set databases: Gene Ontology, WikiPathways and PFOCR.
  7. A folder with the results from all association analyses (gene expression, gene set/pathway association using each of 6 different methods).
  8. A decision flow-chart guiding the choice of the right method to use for your own enrichment analyses. This will probably make sense only after you go over the material or attend the workshop.
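
The RDS files in items 4 and 5 are read with base R's `readRDS()`. The following is a minimal sketch of that round trip, not the workshop's own code: the file names in the zipped folder are not listed here, so it uses a temporary file and a plain count matrix as a stand-in for the SummarizedExperiment object.

```r
# Stand-in for item 5: a list with a single element.
# ASSUMPTION: a plain matrix is used here for illustration; in the workshop
# material the element is a SummarizedExperiment object.
toy_counts <- matrix(
  c(10L, 0L, 25L, 3L),
  nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("sample1", "sample2"))
)
toy_list <- list(toy_counts)

# Hypothetical path: a temporary file, not one of the workshop files.
rds_path <- tempfile(fileext = ".rds")
saveRDS(toy_list, rds_path)

# readRDS() returns the stored object, which you assign to a name of your choice.
loaded <- readRDS(rds_path)
print(length(loaded))   # number of elements in the list

# Extract the single element with [[1]] before working with it.
first <- loaded[[1]]
print(dim(first))
```

The RData file in item 6 works differently: `load()` restores the saved objects under the names they were saved with, rather than returning a value to assign.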

Pre-workshop instructions

Participants who want to run the code in the provided R Markdown file will need R on their computer, along with the R packages tidyverse, magrittr and rSEA and the Bioconductor packages clusterProfiler, DESeq2, EnhancedVolcano, SAFE, PADOG and GSEABenchmarkeR. Note: participants are not expected to run the code during the session, though they are welcome to do so.
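
One possible way to check for and install these packages is sketched below. It is not part of the workshop material, and the helper names `find_missing()` and `install_workshop_packages()` are made up for illustration. It assumes that tidyverse, magrittr and rSEA are installed from CRAN, that Bioconductor packages are installed through BiocManager, and that the SAFE package is published on Bioconductor under the lowercase name `safe`; adjust the spelling if your installation reports otherwise.

```r
# ASSUMPTION: these three come from CRAN.
cran_pkgs <- c("tidyverse", "magrittr", "rSEA")

# ASSUMPTION: "safe" is the Bioconductor name of the package listed as SAFE.
bioc_pkgs <- c("clusterProfiler", "DESeq2", "EnhancedVolcano",
               "safe", "PADOG", "GSEABenchmarkeR")

# Hypothetical helper: names from `pkgs` that are not yet installed.
find_missing <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install only what is missing (needs internet access).
install_workshop_packages <- function() {
  missing_cran <- find_missing(cran_pkgs)
  if (length(missing_cran) > 0) install.packages(missing_cran)

  missing_bioc <- find_missing(bioc_pkgs)
  if (length(missing_bioc) > 0) {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(missing_bioc)
  }
  invisible(c(missing_cran, missing_bioc))
}

# Report what is still missing without installing anything.
print(find_missing(c(cran_pkgs, bioc_pkgs)))
```

Calling `install_workshop_packages()` then installs whatever the report lists. Keeping the check separate from the installation lets you see what would change before anything is downloaded.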