Issue with THOR p-values #124

plbaldoni · 2019-07-22T17:27:19Z

Hi,

I am running THOR with the options "--merge --no-correction --pvalue 1.0" aiming to get the set of all potential peaks and their raw p-values.

After loading the results in R, the output looks like the following:

> thor = read.table('~/Downloads/THOR_Output-diffpeaks.narrowPeak')
> 
> head(thor)
    V1     V2     V3    V4 V5 V6 V7          V8 V9 V10
1 chr1  16350  16500 Peak0  0  +  0  0.08121752  0  -1
2 chr1 761100 761300 Peak1  0  +  0 -0.40853573  0  -1
3 chr1 776950 777000 Peak2  0  -  0  0.31362374  0  -1
4 chr1 777050 777350 Peak3  0  -  0  0.01068111  0  -1
5 chr1 882750 883050 Peak4  0  +  0 -0.34060708  0  -1
6 chr1 884100 884450 Peak5  0  -  0 -0.35914791  0  -1

Which transformation has been used to produce the pValue (8th column) of the output? Is there a way I can get the raw p-values ranging on (0,1) from each peak? It looks like neither -log10 nor log10 gives me these quantities:

> summary(thor$V8)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-3.36502 -0.72130 -0.31481 -0.37389  0.04078  0.52139 
> summary(10^(-thor$V8))
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
   0.3010    0.9104    2.0645    6.1711    5.2638 2317.4833 
> summary(10^(thor$V8))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.000432 0.189976 0.484389 0.779523 1.098453 3.321928

Best wishes,
Pedro Baldoni

The text was updated successfully, but these errors were encountered:

michael-kotliar · 2019-11-29T18:43:00Z

Got the same problem. It looks like a complete mess with all this -log10 and raw p-values.
If I run with --no-correction flag I assume that my filtering is done by filter_by_pvalue_strand_lag function. pvalues in this case are already log transformed, otherwise there isn't any logical explanation why we log transform our pcutoff.

pv_pass = np.where(np.asarray(pvalues) >= -log10(pcutoff), True, False)

Then, when exporting the results in _output_BED function we log transform our pvalues one more time

p_tmp = -log10(pvalues[i]) if pvalues[i] > 0 else sys.maxint

fabio-t · 2019-12-04T18:47:03Z

Thanks all for reporting, this will be fixed in the next release. Using the --no-correction argument the p-values were indeed "double log10'd", so @plbaldoni in your original example, if you wanted a quick and dirty solution before the release, you could do:

summary(10^(-10^(-thor$V8)))

fabio-t self-assigned this Jul 29, 2019

fabio-t added this to TODO in THOR via automation Jul 29, 2019

fabio-t added bug maybe labels Jul 29, 2019

fabio-t removed the maybe label Dec 4, 2019

fabio-t mentioned this issue Mar 2, 2020

RGT-0.13.0 #143

Merged

fabio-t closed this as completed in 2054b25 Mar 2, 2020

THOR automation moved this from TODO to Done Mar 2, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue with THOR p-values #124

Issue with THOR p-values #124

plbaldoni commented Jul 22, 2019

michael-kotliar commented Nov 29, 2019

fabio-t commented Dec 4, 2019

Issue with THOR p-values #124

Issue with THOR p-values #124

Comments

plbaldoni commented Jul 22, 2019

michael-kotliar commented Nov 29, 2019

fabio-t commented Dec 4, 2019