leading_edge and core_enrichment columns in clusterProfiler GSEA output #103

Open
8 tasks
sanjanasood opened this issue Sep 1, 2017 · 1 comment

Comments

@sanjanasood

Hi,

I am using GSEA in clusterProfiler, which returns a gseaResult object. I am struggling to interpret two columns in the output, leading_edge and core_enrichment. Could you help me understand what these columns refer to and how to interpret them?

Thanks in advance
Sanj

Prerequisites

  • Have you read Feedback and follow the guide?
    • make sure you are using the latest release version
    • read the documents
    • google your question/issue

Describe your issue

  • Make a reproducible example (e.g. 1)
  • your code should contain comments to describe the problem (e.g. what expected and actually happened?)

Ask in the right place

  • for bugs or feature requests, post here (github issue)
  • for questions, please post to Bioconductor or Biostars with tag clusterProfiler
@guidohooiveld

guidohooiveld commented Oct 19, 2017

Please check the GSEA documentation on the Broad website....

Leading edge genes: "As described in the Gene Set Enrichment Analysis PNAS paper, the leading-edge subset in a gene set are those genes that appear in the ranked list at or before the point at which the running sum reaches its maximum deviation from zero. The leading-edge subset can be interpreted as the core that accounts for the gene set’s enrichment signal." Source.

Core enrichment genes: "Genes that contribute to the leading-edge subset within the gene set. This is the subset of genes that contributes most to the enrichment result." Source.
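
To make the two definitions above concrete, here is a minimal Python sketch of a running-sum walk over a ranked gene list and the leading-edge (core enrichment) subset it yields. It is an illustration only, not the code clusterProfiler runs: the function name `leading_edge_subset`, the toy gene names and scores, and the weighting exponent `p` are all assumptions made for this example. For a negative enrichment score the sketch takes the hits after the peak, which is how I read the Broad documentation.

```python
def leading_edge_subset(ranked_genes, scores, gene_set, p=1.0):
    """Toy running-sum walk (hypothetical helper, not a clusterProfiler API).

    ranked_genes: gene IDs sorted by decreasing ranking score
    scores:       ranking scores in the same order
    gene_set:     set of gene IDs to test
    Returns (enrichment_score, peak_index, leading_edge_genes).
    Assumes the ranked list contains at least one gene outside the set.
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_genes = len(ranked_genes)
    n_hits = sum(in_set)

    # Hits step up in proportion to |score|^p; misses step down uniformly.
    hit_total = sum(abs(s) ** p for s, hit in zip(scores, in_set) if hit)
    miss_step = 1.0 / (n_genes - n_hits)

    running, best, peak = 0.0, 0.0, -1
    for i, (s, hit) in enumerate(zip(scores, in_set)):
        running += abs(s) ** p / hit_total if hit else -miss_step
        if abs(running) > abs(best):
            best, peak = running, i  # maximum deviation from zero so far

    if best >= 0:
        # Positive ES: set members at or before the peak.
        core = [g for g, hit in zip(ranked_genes[:peak + 1], in_set[:peak + 1]) if hit]
    else:
        # Negative ES: set members after the peak (the peak itself is a miss).
        core = [g for g, hit in zip(ranked_genes[peak:], in_set[peak:]) if hit]
    return best, peak, core


# Toy data: ten genes ranked by a made-up score.
genes = ["g%d" % i for i in range(1, 11)]
toy_scores = [5, 4, 3, 2, 1, -1, -2, -3, -4, -5]
es, peak, core = leading_edge_subset(genes, toy_scores, {"g1", "g2", "g4", "g9"})
print(es, peak, core)
```

In this toy run the running sum peaks at the second gene, so only g1 and g2 form the leading edge. g4 and g9 belong to the gene set but appear after the peak, so they would not be reported as core enrichment genes.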

The leading edge information is given as percentages (tags, list, signal), and these metrics are used to define the leading edge subset, whose members are thus the core enrichment genes. See about halfway down this page (section "Detailed Enrichment Results"). I agree it is somewhat confusing.

Maybe this clusterProfiler page on visualization is also of interest.
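
As a rough illustration of how the three percentages relate to each other, the sketch below computes them from simple counts. The formulas follow the GSEA user guide as I understand it (tags = fraction of the gene set in the leading edge, list = fraction of the ranked list spanned by the leading edge, signal = tags × (1 − list) × N / (N − Nh)), so please check them against the Broad documentation. The function name `leading_edge_stats`, its parameter names, and the numbers are assumptions for this example, not part of clusterProfiler.

```python
def leading_edge_stats(n_genes, n_set, n_core, n_span):
    """Hypothetical helper: tags, list and signal as fractions.

    n_genes: N, number of genes in the ranked list
    n_set:   Nh, number of gene-set members present in the ranked list
    n_core:  number of gene-set members in the leading edge
    n_span:  number of ranked genes on the leading-edge side of the peak
             (up to the peak for a positive ES, after it for a negative ES)
    Assumes n_set < n_genes.
    """
    tags = n_core / n_set
    list_frac = n_span / n_genes
    # Signal combines both: high when many set members sit in a small
    # portion of the ranked list.
    signal = tags * (1.0 - list_frac) * (n_genes / (n_genes - n_set))
    return tags, list_frac, signal


# Toy numbers: 10 ranked genes, a 4-gene set, 2 of them within the top 2 ranks.
tags, list_frac, signal = leading_edge_stats(n_genes=10, n_set=4, n_core=2, n_span=2)
print("tags=%.0f%%, list=%.0f%%, signal=%.0f%%"
      % (tags * 100, list_frac * 100, signal * 100))
```

Read this way, a high tags value with a low list value means most of the gene set is concentrated near one end of the ranking, which is the situation the signal value summarizes.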
