How Pvalue is calculated

This page describes how the p-values are calculated for results returned from this service. The p-values are used in sorting the results.

Overview

This service uses the HypergeometricDistribution class to calculate a p-value for each enrichment result, calling its probability function to obtain the value. The p-values are then adjusted following the Benjamini, Heller, Yekutieli 2009 paper.

The population size is set to the number of unique genes in all networks visible to enrichment.

The number of successes is set to the number of unique genes in the network being examined.

Open question: the number of successes value might be wrong. Should the code be changed to use a count of the number of networks in which the matched genes occur?

The sample size is set to the number of unique genes in the query.

To get the p-value, the number of unique matching genes is passed to the probability function, which returns the probability of observing exactly that many matches.
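To make the quantity concrete, here is a minimal, dependency-free sketch of the hypergeometric point probability P(X = k) = C(K, k) * C(N - K, n - k) / C(N, n), where N is the population size, K the number of successes, n the sample size, and k the number of matches. The class and method names (HypergeometricSketch, logChoose, pmf) are illustrative assumptions and are not part of the service, which relies on the library class instead.

```java
// Sketch only: a dependency-free hypergeometric point probability.
// All names here are hypothetical and do not come from the service code.
final class HypergeometricSketch {

    // Natural log of the binomial coefficient C(n, k), accumulated term by
    // term so that large gene counts do not overflow.
    static double logChoose(int n, int k) {
        if (k < 0 || k > n) {
            return Double.NEGATIVE_INFINITY; // impossible selection, C(n, k) = 0
        }
        k = Math.min(k, n - k); // C(n, k) == C(n, n - k)
        double sum = 0.0;
        for (int i = 1; i <= k; i++) {
            sum += Math.log(n - k + i) - Math.log(i);
        }
        return sum;
    }

    // P(X = k): probability of exactly k matches.
    // Assumes 0 <= sampleSize <= populationSize and
    // 0 <= numberOfSuccesses <= populationSize.
    static double pmf(int populationSize, int numberOfSuccesses,
                      int sampleSize, int k) {
        double logP = logChoose(numberOfSuccesses, k)
                + logChoose(populationSize - numberOfSuccesses, sampleSize - k)
                - logChoose(populationSize, sampleSize);
        return Math.exp(logP);
    }

    public static void main(String[] args) {
        // 10 genes overall, 4 in the network, 3 in the query, 1 match:
        // C(4,1) * C(6,2) / C(10,3) = 4 * 15 / 120, i.e. about 0.5
        System.out.println(pmf(10, 4, 3, 1));
    }
}
```

Note that this is the probability of exactly k matches, not the probability of k or more matches.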

The p-values are then adjusted: each p-value is multiplied by the number of networks queried and then divided by its rank among all the p-values (the lowest p-value has rank 1, the next lowest rank 2, and so on). Lower adjusted values are then propagated up the list so that the adjusted p-values are always ascending.
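The adjustment described above can be sketched as follows. This is not the service's adjustment class. It is a minimal illustration that assumes one p-value per network queried (so the number of networks equals the array length), assigns ranks by sort position (ties are ranked in whatever order the sort leaves them), and does not cap results at 1. The names PValueAdjustmentSketch and adjust are hypothetical.

```java
import java.util.Arrays;

// Sketch only: multiply by the number of networks, divide by rank, then
// enforce ascending order. All names here are hypothetical.
final class PValueAdjustmentSketch {

    // Returns adjusted p-values in the same order as the input array.
    static double[] adjust(double[] pvalues) {
        int m = pvalues.length; // assumed: one p-value per network queried

        // Indices of the p-values, sorted from lowest to highest p-value.
        Integer[] order = new Integer[m];
        for (int i = 0; i < m; i++) {
            order[i] = i;
        }
        Arrays.sort(order,
                (Integer a, Integer b) -> Double.compare(pvalues[a], pvalues[b]));

        double[] adjusted = new double[m];
        double runningMin = Double.POSITIVE_INFINITY;
        // Walk from the highest rank down so that a lower adjusted value
        // replaces any larger value at a better (lower) rank.
        for (int pos = m - 1; pos >= 0; pos--) {
            int rank = pos + 1; // lowest p-value has rank 1
            double scaled = pvalues[order[pos]] * m / rank;
            runningMin = Math.min(runningMin, scaled);
            adjusted[order[pos]] = runningMin;
        }
        return adjusted;
    }

    public static void main(String[] args) {
        double[] raw = {0.01, 0.02, 0.021};
        // Scaled values are roughly 0.03, 0.03, 0.021; the last one is the
        // smallest, so it is propagated to the two lower ranks.
        System.out.println(Arrays.toString(adjust(raw)));
    }
}
```

In the usage example the third p-value has the smallest scaled value, so all three adjusted p-values end up equal to it, which is what keeps the list ascending.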

Actual code

Specifically, the HypergeometricDistribution class from the Apache Commons Math library is used:

HypergeometricDistribution hd = new HypergeometricDistribution(populationSize, 
                                                               numberOfSuccesses,
                                                               sampleSize);

double pvalue = hd.probability(numGenesMatch);

For p-value adjustment see this class

Definition of values used above

populationSize is set to the number of unique genes in all networks examined by enrichment
numberOfSuccesses is set to the number of unique genes for the given network being examined
sampleSize is set to the number of unique genes in the query
numGenesMatch is set to the number of unique query genes that match the given network
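As an illustration of the definitions above, the four values could be derived from gene sets roughly as follows. This is a sketch under stated assumptions: genes are represented as plain strings, each network is a set of gene names, and the names EnrichmentCountsSketch and counts are hypothetical and do not come from the service.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: derives the four inputs from sets of gene names.
// All names here are hypothetical.
final class EnrichmentCountsSketch {

    // Returns {populationSize, numberOfSuccesses, sampleSize, numGenesMatch}.
    static int[] counts(List<Set<String>> allNetworks,
                        Set<String> network,
                        Set<String> query) {
        // Unique genes across every network examined by enrichment.
        Set<String> population = new HashSet<>();
        for (Set<String> genes : allNetworks) {
            population.addAll(genes);
        }
        // Unique query genes that also appear in the given network.
        Set<String> matches = new HashSet<>(query);
        matches.retainAll(network);

        return new int[] {
            population.size(), // populationSize
            network.size(),    // numberOfSuccesses
            query.size(),      // sampleSize
            matches.size()     // numGenesMatch
        };
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>(Arrays.asList("TP53", "BRCA1", "EGFR"));
        Set<String> b = new HashSet<>(Arrays.asList("EGFR", "MYC"));
        Set<String> query = new HashSet<>(Arrays.asList("TP53", "EGFR", "MYC"));
        // Population is {TP53, BRCA1, EGFR, MYC}; network a has 3 genes;
        // the query has 3 genes, 2 of which (TP53, EGFR) are in network a.
        System.out.println(Arrays.toString(counts(Arrays.asList(a, b), a, query)));
    }
}
```

Using sets means duplicate gene names are counted once, which matches the "unique genes" wording in the definitions.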
