PPI Randomisations and Statistics

In brief, the input PPI dataset is shuffled without replacement (i.e. keeping the original number of interacting partners per protein), and the number of predicted DMIs is calculated for the shuffled data. By default, this is done 1000 times; the number can be changed using the Number of randomisations setting (or --random=INTEGER on the command line).
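As an illustration of the shuffling idea only, the following Python sketch permutes one column of a PPI pair list, which keeps the number of partners recorded for each protein in each column. It is not the SLiMEnrich code: the function names, the representation of PPIs as (protein_a, protein_b) tuples, and the use of a set of potential DMI pairs to count predicted DMIs are all assumptions made for this example.

```python
import random


def shuffle_ppi(pairs, rng):
    """Return a shuffled copy of the PPI list (hypothetical helper).

    The partner column is permuted, i.e. sampled without replacement, so
    each protein appears exactly as many times in each column as before.
    """
    left = [a for a, _ in pairs]
    right = [b for _, b in pairs]
    rng.shuffle(right)
    return list(zip(left, right))


def count_dmi(pairs, potential_dmi):
    """Count the distinct PPI pairs that are also potential DMI pairs."""
    return len(set(pairs) & potential_dmi)


def random_dmi_counts(pairs, potential_dmi, n_random=1000, seed=0):
    """Predicted DMI count for each of n_random shuffled PPI datasets."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [
        count_dmi(shuffle_ppi(pairs, rng), potential_dmi)
        for _ in range(n_random)
    ]
```

A simple column permutation like this can occasionally produce the same pair twice; the sketch counts distinct pairs so that such duplicates are not counted twice.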

The main SLiMEnrich output histogram shows the expected distribution of predicted DMIs from these randomised PPI data, and marks the observed number of predicted DMIs in the real data. The following values are also calculated and displayed:

  • P-value: this is an empirical p-value, calculated as the proportion of randomised PPI datasets in which the number of predicted DMIs equals or exceeds the number observed in the real data.
  • Enrichment: this is the ratio of the observed number of non-redundant DMIs to the mean number of non-redundant DMIs in the randomised datasets.
  • FDR: this is the estimated proportion of observed DMIs that are false positives, based on the mean random DMI count, excluding any randomised datasets whose predicted DMI count exceeds the observed count.
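
The three values can be sketched in Python as follows. This is one reading of the definitions above, not the SLiMEnrich implementation: the function name, the returned dictionary, the interpretation of FDR as the filtered random mean divided by the observed count, and the handling of zero counts are all assumptions made for this example.

```python
from statistics import mean


def enrichment_stats(observed, random_counts):
    """Summarise an observed DMI count against randomised counts.

    observed      -- predicted DMI count in the real PPI data
    random_counts -- predicted DMI count for each randomised dataset
    """
    # Empirical p-value: share of random datasets at or above the observed count.
    n_extreme = sum(1 for r in random_counts if r >= observed)
    p_value = n_extreme / len(random_counts)

    # Enrichment: observed count over the mean random count.
    mean_random = mean(random_counts)
    # Assumption: report infinity when no random dataset predicted any DMI.
    enrichment = observed / mean_random if mean_random > 0 else float("inf")

    # FDR: mean random count, ignoring random datasets above the observed
    # count, as a fraction of the observed count (assumed interpretation).
    kept = [r for r in random_counts if r <= observed]
    if kept and observed > 0:
        fdr = mean(kept) / observed
    else:
        fdr = float("nan")  # undefined without observed DMIs

    return {"p_value": p_value, "enrichment": enrichment, "fdr": fdr}
```

For example, with an observed count of 10 and random counts of 2, 4, 6 and 12, one of the four random datasets reaches the observed count (p-value 0.25), the mean random count is 6 (enrichment 10/6, about 1.67), and the mean of the three random counts not exceeding 10 is 4 (FDR 0.4).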

