Consider p value calculation for IC based sim calculations #80

Open
kshefchek opened this Issue Apr 3, 2017 · 2 comments

Member

kshefchek commented Apr 3, 2017

When presenting results, I'm often asked for a p-value to determine whether a match is significant. @drseb has proposed a way to generate p-values for similarity scores here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2756558/. Would it be feasible and useful to add this to the phenodigm algorithm? Could we also add phenomizer as a matcher?

@kshefchek kshefchek added the question label Apr 3, 2017

Owner

drseb commented Apr 10, 2017

It would be possible to add phenomizer, but I suggest using the Bayesian algorithms, as these will naturally give you a statistical statement. Happy to help with the empirical p-values if you decide to go this route.

Owner

cmungall commented Apr 11, 2017

Agreed about the Bayesian algorithms, but just to be clear, these yield a probability, not a p-value.

For calculating p-values, there is this code, added by Nicole I think:

public void calculateMatchSignificance(DescriptiveStatistics background) {

But this is a t-test and not meaningful here.

To get accurate p-values we can follow the methods in @drseb's paper, but I think we would have to run the simulation for all combinations of species.
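For reference, a simulation-based (empirical) p-value could look roughly like the sketch below. This is a generic Monte Carlo illustration, not code from this project or from the paper: the class name `EmpiricalPValue`, both method names, and the `scorer` callback (standing in for whatever IC-based similarity score is used against a fixed target) are all hypothetical. The add-one correction, which keeps the estimate from ever being exactly zero, is also my own assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Hypothetical helper, not part of any existing API.
final class EmpiricalPValue {

    private EmpiricalPValue() {}

    /**
     * Empirical p-value: the share of null scores that are at least as large
     * as the observed score, with an add-one correction (an assumption of
     * this sketch) so the result is never exactly 0.
     */
    static double pValue(double observed, double[] nullScores) {
        int atLeastAsExtreme = 0;
        for (double s : nullScores) {
            if (s >= observed) {
                atLeastAsExtreme++;
            }
        }
        return (atLeastAsExtreme + 1.0) / (nullScores.length + 1.0);
    }

    /**
     * Builds a null distribution by repeatedly drawing a random profile of
     * profileSize distinct terms from the vocabulary and scoring it.
     * The scorer is a placeholder for the real similarity function.
     */
    static <T> double[] simulateNull(List<T> vocabulary, int profileSize, int iterations,
                                     ToDoubleFunction<List<T>> scorer, Random rng) {
        if (profileSize > vocabulary.size()) {
            throw new IllegalArgumentException("profileSize exceeds vocabulary size");
        }
        List<T> pool = new ArrayList<>(vocabulary); // copy so the caller's list is untouched
        double[] scores = new double[iterations];
        for (int i = 0; i < iterations; i++) {
            Collections.shuffle(pool, rng);
            // The first profileSize elements of a shuffled pool form a uniform random sample.
            List<T> randomProfile = new ArrayList<>(pool.subList(0, profileSize));
            scores[i] = scorer.applyAsDouble(randomProfile);
        }
        return scores;
    }
}
```

Under this scheme the null distribution depends on the profile size and on the vocabulary being sampled, so it would presumably need to be precomputed separately for each species combination, which is where the cost comes from.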
