Weird-looking plots #24

DarwinAwardWinner · 2016-10-04T19:01:26Z

When I run idr on peak lists produced by MACS on my data, I get some bizarre-looking plots:

The plots look like fractal versions of the typical IDR plots shown here: http://ccg.vital-it.ch/var/sib_april15/cases/landt12/idr.html

Do you have any idea what might be causing this? I'm invoking the script as:

idr --samples sample1.narrowPeak sample2.narrowPeak \
    --peak-list oraclepeaks.narrowPeak --input-file-type narrowPeak \
    --output-file idrValues.txt --output-file-type narrowPeak \
    --log-output-file idr.log --plot --random-seed 1986

I can share my peak files if you want them.


DarwinAwardWinner · 2016-10-04T21:32:00Z

Here's a link to example files that produce plots like the above: https://www.dropbox.com/sh/k2193eqe1j8qun9/AAASAJG9BkzXHXPDHKdlLVhha?dl=0

DarwinAwardWinner · 2016-10-05T23:29:44Z

Ok, I think I know what the problem is. The peak caller I'm using (MACS2) is returning lots of identical enrichment scores, which means that peaks with those scores are essentially sorted randomly, throwing off the IDR algorithm. The patterns of identical scores exactly match the stair-step patterns seen in the top plots.

DarwinAwardWinner · 2016-10-05T23:33:17Z

Here's a look at an example plot for one sample's peak call scores vs rank:

DarwinAwardWinner · 2016-10-10T23:26:52Z

It turns out that the answer was to rank peaks by the -log10(p-value) column instead of score or signal value, since this column seems to have the greatest number of unique values for MACS2. In contrast, for Epic, the column with the most unique values is score. So the lesson is to look at your peak caller's output and figure out which potential ranking column has the fewest duplicates.
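
For anyone who wants to run the same check, here is a rough sketch of counting distinct values per candidate ranking column. It assumes the standard 10-column, tab-separated narrowPeak layout (score in column 5, signalValue in 7, pValue in 8, qValue in 9); the function name and the example rows are made up for illustration and are not part of idr or MACS2.

```python
import csv
import io

# 0-based positions of the candidate ranking columns in a standard
# narrowPeak file (assumption: the file follows the usual 10-column layout).
CANDIDATE_COLUMNS = {"score": 4, "signalValue": 6, "pValue": 7, "qValue": 8}


def count_unique_values(lines):
    """Return {column label: number of distinct numeric values}."""
    seen = {label: set() for label in CANDIDATE_COLUMNS}
    for fields in csv.reader(lines, delimiter="\t"):
        if len(fields) < 10:
            continue  # skip blank or malformed lines
        for label, idx in CANDIDATE_COLUMNS.items():
            seen[label].add(float(fields[idx]))
    return {label: len(values) for label, values in seen.items()}


# Made-up rows standing in for a real peak file.
example = io.StringIO(
    "chr1\t100\t200\tp1\t50\t.\t3.1\t12.5\t9.8\t40\n"
    "chr1\t300\t400\tp2\t50\t.\t3.1\t12.7\t9.8\t55\n"
    "chr2\t100\t200\tp3\t50\t.\t4.0\t13.1\t10.2\t30\n"
    "chr2\t500\t600\tp4\t70\t.\t4.0\t20.4\t17.0\t62\n"
)
unique_counts = count_unique_values(example)
for label, n in unique_counts.items():
    print(f"{label}: {n} distinct values")
```

With a real file you would pass an open file handle, e.g. `count_unique_values(open("sample1.narrowPeak"))`, and pick the column with the highest count (relative to the number of peaks) as the ranking column.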

DarwinAwardWinner · 2016-10-10T23:29:12Z

Also, I think the above plots look weird partially because all the red points have black outlines. So in areas of high point density, the red points look black because all you see are the black outlines.

(Also also: MACS2 outputs up to millions of peaks if you let it, so one should filter to only the best 150k or so, or else idr will take forever to run.)
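
A minimal sketch of that filtering step, in case it helps. It assumes tab-separated narrowPeak lines and ranks by the -log10(p-value) column (0-based index 7); `top_peaks` and the example rows are hypothetical, and the 150k default is just the rough number mentioned above.

```python
import heapq


def top_peaks(lines, n=150_000, column=7):
    """Keep the n peaks with the largest value in the given 0-based column.

    Returns a list of field lists, sorted from best to worst.
    """
    rows = (line.rstrip("\n").split("\t") for line in lines if line.strip())
    return heapq.nlargest(n, rows, key=lambda fields: float(fields[column]))


# Made-up rows standing in for a real peak file.
example = [
    "chr1\t100\t200\tp1\t50\t.\t3.1\t12.5\t9.8\t40\n",
    "chr1\t300\t400\tp2\t70\t.\t4.0\t20.4\t17.0\t55\n",
    "chr2\t100\t200\tp3\t50\t.\t3.1\t13.1\t10.2\t30\n",
]
best = top_peaks(example, n=2)
for fields in best:
    print("\t".join(fields))
```

To filter a real file, iterate over the open file instead of the list and write each joined row to a new narrowPeak file before passing it to idr.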

DarwinAwardWinner closed this as completed Oct 10, 2016
