Separation based on p-value or abs(mean) unrealistically good #5

tlnagy · 2016-04-28T01:40:56Z

Even with the new method introduced in f25e71e, which samples an "observed phenotype" from a Normal distribution centered at the "theoretical phenotype" with a standard deviation of 0.5, the separation is unrealistically good. Here I'm sorting guides by their "observed phenotype", putting the bottom 1/3 in one bin and the top 1/3 in another, and comparing the two:

Distribution of "observed phenotypes" that was used to bin

Log-log plot of guide frequencies in the two bins

Volcano plot

It's super apparent when I collapse the results down to the gene-level:

What do you think @martinkampmann? I'm ending up with AUROCs of 0.999.
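For readers following along, the sampling and binning step described above can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the code from f25e71e; the function name `sample_and_bin` and its parameters are hypothetical, and only the Normal noise with a standard deviation of 0.5 and the bottom/top 1/3 split come from the description.

```python
import numpy as np

def sample_and_bin(theoretical, sigma=0.5, seed=0):
    """Sample observed phenotypes and split guides into bottom/top thirds.

    Hypothetical helper: names and signature are assumptions made for
    illustration, not the simulation's actual API.
    """
    theoretical = np.asarray(theoretical, dtype=float)
    if theoretical.size < 3:
        raise ValueError("need at least 3 guides to form two bins")
    rng = np.random.default_rng(seed)
    # Observed phenotype ~ Normal(theoretical phenotype, sigma)
    observed = rng.normal(loc=theoretical, scale=sigma)
    # Sort guides by observed phenotype, lowest first
    order = np.argsort(observed)
    n_third = observed.size // 3
    bottom_idx = order[:n_third]   # guides in the low bin
    top_idx = order[-n_third:]     # guides in the high bin
    return observed, bottom_idx, top_idx

# Example: 90 guides with evenly spaced theoretical phenotypes
theoretical = np.linspace(-2.0, 2.0, 90)
observed, bottom_idx, top_idx = sample_and_bin(theoretical, sigma=0.5, seed=1)
print(len(bottom_idx), len(top_idx))
```

With a small `sigma` relative to the spread of theoretical phenotypes, almost every guide lands in the bin its theoretical phenotype predicts, which is one way the separation can end up looking too clean.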

tlnagy · 2016-04-30T01:51:08Z

Dropping coverage from 1000x to 100x and increasing the variance due to sorting to 1 gives the following plots...

Observed Phenotypes

Volcano plot

It's interesting to note that there seems to be binning occurring as the p-value approaches 0.

ROC curve

tlnagy · 2016-04-30T02:20:04Z

@martinkampmann Plotting the ROC curves separately for linear and sigmoidal genes:

This is with 100x coverage and sigma=1.
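A rough sketch of how a per-class AUROC could be computed, for anyone wanting to reproduce this kind of comparison. It is an illustration under assumptions, not the code behind the plots: the function names, the class labels ("linear", "sigmoidal", and "inactive" for non-hit genes), and the choice of gene-level score (for example -log10 p-value or abs(mean)) are all hypothetical. The AUROC is computed as the probability that a randomly chosen hit outscores a randomly chosen non-hit, counting ties as one half.

```python
import numpy as np

def auroc(scores, is_hit):
    """AUROC as P(hit score > non-hit score) + 0.5 * P(tie)."""
    scores = np.asarray(scores, dtype=float)
    is_hit = np.asarray(is_hit, dtype=bool)
    pos, neg = scores[is_hit], scores[~is_hit]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one hit and one non-hit")
    # Compare every hit against every non-hit (fine for a few thousand genes)
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def auroc_by_class(scores, gene_class, negative_label="inactive"):
    """AUROC of each hit class against the shared set of non-hit genes.

    `negative_label` is an assumed name for genes with no true phenotype.
    """
    scores = np.asarray(scores, dtype=float)
    gene_class = np.asarray(gene_class)
    neg = gene_class == negative_label
    result = {}
    for cls in np.unique(gene_class[~neg]):
        keep = neg | (gene_class == cls)
        result[str(cls)] = auroc(scores[keep], gene_class[keep] == cls)
    return result

# Toy example with made-up gene-level scores
scores = [0.1, 0.2, 0.3, 0.9, 0.8, 0.25, 0.15]
classes = ["inactive", "inactive", "inactive",
           "linear", "linear", "sigmoidal", "sigmoidal"]
print(auroc_by_class(scores, classes))
```

Scoring each class against the same set of non-hit genes keeps the per-class numbers comparable, so a class that is harder to detect shows up as a lower AUROC rather than being averaged away.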

tlnagy · 2016-05-06T00:59:40Z

AUROCs for optimum screen conditions are good, but I can effectively degrade them, so I'm going to go ahead and close this issue for now.

tlnagy mentioned this issue Apr 29, 2016

Guides should store their initial frequencies #7

Closed

tlnagy added a commit that referenced this issue Apr 30, 2016

track initial guide frequencies, fixes #7

ef9c963

Working towards a solution to #5. Dropping the coverage of the library introduced a couple of artifacts due to guides being absent starting at the transfection state.

tlnagy added a commit that referenced this issue Apr 30, 2016

generate separate roc curves for gene classes

ffc80f0

Working on #5.

tlnagy added a commit that referenced this issue May 2, 2016

make library more realistic, working on #5

2ce5f2f

tlnagy closed this as completed May 6, 2016
