Looking at the code here (lines 533 to 540 in bb31cfd):

```python
shared_counts = (
    self.store.select(
        "/main/{}/counts".format(label), "columns in ['c_0', c_last]"
    )
    .sum(axis="index")
    .values
    + 0.5
)
```
I notice that for the denominator of the enrichment ratio (shared_counts), the code sums the counts in c_0 and c_last and then adds a single pseudo-count of 0.5 to the sum. Later, for the numerator, the code adds a pseudo-count of 0.5 to each individual count.
Wouldn't this skew the ratios so that they no longer sum to 1? For instance, suppose c_0 = [1, 3, 1, 2].
Then shared_counts = (1 + 3 + 1 + 2) + 0.5 = 7.5
And then the ratios would be:
1.5/7.5 = 0.2
3.5/7.5 = 0.467
1.5/7.5 = 0.2
2.5/7.5 = 0.333
and the sum of the ratios = 1.2 instead of 1.
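
To make the arithmetic above easy to check, here is a minimal sketch in plain Python. It is not the library's code: the variable names `pseudo`, `ratios`, and `total` are mine, and it assumes my reading of the snippet (one pseudo-count in the denominator, one per count in the numerator) is right.

```python
# Illustrative counts from the example above, not real data.
c_0 = [1, 3, 1, 2]
pseudo = 0.5

# Denominator as I read the snippet: sum first, then add a single pseudo-count.
shared_counts = sum(c_0) + pseudo

# Numerator: a pseudo-count is added to every individual count.
ratios = [(c + pseudo) / shared_counts for c in c_0]
total = sum(ratios)

print([round(r, 3) for r in ratios])
print(round(total, 3))
```

With these numbers the numerators add up to 9 while the denominator is 7.5, which is where the excess over 1 comes from.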
I would have expected shared_counts to add the 0.5 pseudo-count to each count before summing (or, alternatively, to add a single pseudo-count of 0.5 * len(c_0) after summing).
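
For comparison, a small sketch of the two alternatives I have in mind, again in plain Python with hypothetical names (`denom_per_count`, `denom_scaled`) and the same illustrative counts. It only shows that both give the same denominator and that the resulting ratios sum to 1; it says nothing about whether that is the intended behavior of the scoring method.

```python
# Illustrative counts from the example above, not real data.
c_0 = [1, 3, 1, 2]
pseudo = 0.5

# Alternative 1: add the pseudo-count to each count, then sum.
denom_per_count = sum(c + pseudo for c in c_0)

# Alternative 2: sum first, then add one pseudo-count per entry.
denom_scaled = sum(c_0) + pseudo * len(c_0)

# Either denominator matches the sum of the pseudo-counted numerators.
ratios = [(c + pseudo) / denom_per_count for c in c_0]
total = sum(ratios)

print(denom_per_count, denom_scaled)
print(round(total, 3))
```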
I may be misreading things though.