Question about 0.5 pseudo-count #75

@godotgildor

Looking at the code here:

shared_counts = (
    self.store.select(
        "/main/{}/counts".format(label), "columns in ['c_0', c_last]"
    )
    .sum(axis="index")
    .values
    + 0.5
)

I notice that for the denominator of the enrichment ratio (shared_counts), the code sums all the values for c_0 and c_last and then adds a single pseudo-count of 0.5. Later, for the numerator, the code adds a pseudo-count of 0.5 to each count in the numerator.

Wouldn't this have the effect of potentially skewing the ratios so that they wouldn't sum to 1? For instance, let's say c_0 = [1, 3, 1, 2]

Then shared_counts = (1 + 3 + 1 + 2) + 0.5 = 7.5

And then the ratios would be:
1.5/7.5 = 0.2
3.5/7.5 = 0.467
1.5/7.5 = 0.2
2.5/7.5 = 0.333

and the sum of the ratios = 1.2 instead of 1.

I would have expected the code to add the 0.5 pseudo-count to each count before summing for shared_counts (or, equivalently, to add a total pseudo-count of 0.5 * len(c_0) after summing).
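
To make the comparison concrete, here is a minimal standalone sketch in plain Python. It is not the library's code: the names counts, pseudo, denom_single and denom_per_variant are just illustrative, and it only reproduces the c_0 example above under the assumption that each numerator gets its own 0.5.

```python
# Illustrative only: reproduces the c_0 = [1, 3, 1, 2] example from this issue.
counts = [1, 3, 1, 2]
pseudo = 0.5

# Numerators: each count gets its own pseudo-count.
numerators = [c + pseudo for c in counts]

# Denominator as I read the current code: a single pseudo-count added after summing.
denom_single = sum(counts) + pseudo

# Alternative denominator: one pseudo-count per entry, i.e. 0.5 * len(counts) in total.
denom_per_variant = sum(counts) + pseudo * len(counts)

ratios_single = [n / denom_single for n in numerators]
ratios_per_variant = [n / denom_per_variant for n in numerators]

print("single pseudo-count in denominator, sum of ratios:", sum(ratios_single))
print("per-entry pseudo-count in denominator, sum of ratios:", sum(ratios_per_variant))
```

With the single pseudo-count the ratios sum to roughly 1.2, while with one pseudo-count per entry they sum to roughly 1 (up to floating-point rounding).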

I may be misreading things though.