Add Coverage Analysis #71

Merged 4 commits on Dec 16, 2018

@gwaybio gwaybio commented Dec 14, 2018

Here I add a module (a new module 7) for tracking gene set coverage across algorithms, dimensions, and seeds.

I also add an initial bash script that runs all analyses, along with the first iteration of results.

@ajlee21 ajlee21 left a comment

Looks good.

Just a clarifying comment as I'm not as familiar with gene enrichment analyses.

So for a given gene set (which are a set of genes that are in the same pathway?) you are looking to determine which z feature contains the most gene sets? I'm a bit confused because if this assumption is true then I think that all the z features contain all genes just weighted differently so how are you looking for a difference between z's?

gwaybio commented Dec 16, 2018

> So for a given gene set (which are a set of genes that are in the same pathway?)

Gene sets are groups of genes that correspond to specific biology, including pathways.

> you are looking to determine which z feature contains the most gene sets? I'm a bit confused because if this assumption is true then I think that all the z features contain all genes just weighted differently so how are you looking for a difference between z's?

I am looking to see which gene sets are enriched in the z features. I determine enriched gene sets based on which genes are upweighted and downweighted in each z feature. Coverage is then the proportion of the gene set compendium that the algorithms capture at a given z dimension. Because I also build 5 models per dimension, I also track the gene set coverage of these ensemble models.
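
To make the coverage idea concrete, here is a minimal sketch, not the code in this pull request. It assumes enrichment has already been computed, so each model is represented as a mapping from a z feature name to the set of gene sets enriched in it. The function names `coverage` and `ensemble_coverage`, the feature names, and the toy gene set labels are all hypothetical. Treating ensemble coverage as the union over the models built from different seeds is also an assumption about how the ensemble is defined.

```python
from typing import Dict, Iterable, Set

# Hypothetical representation: one model maps each z feature to the
# gene sets found enriched in that feature.
Model = Dict[str, Set[str]]


def coverage(model: Model, compendium: Iterable[str]) -> float:
    """Fraction of compendium gene sets enriched in at least one z feature."""
    all_sets = set(compendium)
    if not all_sets:
        raise ValueError("compendium must contain at least one gene set")
    covered: Set[str] = set()
    for enriched in model.values():
        covered |= enriched
    # Ignore any enriched label that is not part of the compendium.
    return len(covered & all_sets) / len(all_sets)


def ensemble_coverage(models: Iterable[Model], compendium: Iterable[str]) -> float:
    """Coverage of the union of several models (assumed: one model per seed)."""
    all_sets = set(compendium)
    if not all_sets:
        raise ValueError("compendium must contain at least one gene set")
    covered: Set[str] = set()
    for model in models:
        for enriched in model.values():
            covered |= enriched
    return len(covered & all_sets) / len(all_sets)


# Toy data, invented for illustration only.
compendium = {"set_A", "set_B", "set_C", "set_D"}
seed_1 = {"z_0": {"set_A"}, "z_1": {"set_A", "set_B"}}
seed_2 = {"z_0": {"set_C"}, "z_1": set()}

single_1 = coverage(seed_1, compendium)
single_2 = coverage(seed_2, compendium)
combined = ensemble_coverage([seed_1, seed_2], compendium)
print(single_1, single_2, combined)
```

In this toy case the two models each cover part of the compendium, and their union covers more than either one alone, which is the motivation for tracking ensemble coverage separately from individual model coverage.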

@gwaybio gwaybio merged commit ecbd9d6 into greenelab:master Dec 16, 2018