ENH Added lfc under null and alt hypothesis #172

vcabeli · 2023-09-05T10:41:48Z

Reference Issue or PRs

Resolves #158

PR description

This PR adds the option to specify an alternative hypothesis when computing Wald statistics and p-values.
It corresponds to the 'lfcThreshold' and 'altHypothesis' parameters of the original DESeq2 results(dds), with two differences:

DESeq2 negates lfcThreshold when altHypothesis = less, as explained in the vignette. To get the same behavior in pyDESeq2, you have to pass lfc_null with the correct sign yourself.
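As a rough illustration of the sign convention, a DESeq2-style call could be translated as below. `to_lfc_null` is a hypothetical helper written for this comment and is not part of pyDESeq2; it assumes the DESeq2 threshold is non-negative.

```python
def to_lfc_null(lfc_threshold: float, alt_hypothesis: str) -> float:
    """Hypothetical helper: turn a DESeq2-style lfcThreshold into a signed lfc_null."""
    if lfc_threshold < 0:
        raise ValueError("DESeq2's lfcThreshold is expected to be non-negative")
    # DESeq2 tests LFC < -lfcThreshold when altHypothesis = less, so flip the sign.
    if alt_hypothesis == "less":
        return -lfc_threshold
    # For the other alternatives the threshold is used as given.
    return lfc_threshold


print(to_lfc_null(0.5, "less"))     # -0.5
print(to_lfc_null(0.5, "greater"))  # 0.5
```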

DESeq2 computes the p-values and the statistics separately, which gives unintuitive results where the statistic is 0 and the p-value is < 1:

	baseMean	log2FoldChange	lfcSE	stat	pvalue	padj
gene1	8.541317	0.632812	0.289104	0.459393	0.322976	1.0
gene2	21.281239	0.538552	0.149963	0.257077	0.398560	1.0
gene3	5.010123	-0.632832	0.295221	0.000000	0.999938	1.0
gene4	100.517961	-0.412102	0.118628	0.000000	1.000000	1.0
gene5	27.142450	0.582066	0.154730	0.530383	0.297923	1.0
gene6	5.413043	0.001457	0.310310	0.000000	0.945929	1.0
gene7	28.294023	0.134336	0.149987	0.000000	0.992615	1.0
gene8	40.358344	-0.270656	0.136402	0.000000	1.000000	1.0
gene9	37.166183	-0.212715	0.133244	0.000000	1.000000	1.0
gene10	11.589325	0.386011	0.244586	0.000000	0.679409	1.0

compared to pyDESeq2:

	baseMean	log2FoldChange	lfcSE	stat	pvalue	padj
gene1	8.541317	0.632812	0.289101	0.459398	0.322974	0.5
gene2	21.281239	0.538552	0.149963	0.257077	0.398560	0.5
gene3	5.010123	-0.632830	0.295236	0.000000	0.500000	0.5
gene4	100.517961	-0.412102	0.118629	0.000000	0.500000	0.5
gene5	27.142450	0.582065	0.154706	0.530462	0.297896	0.5
gene6	5.413043	0.001457	0.310311	0.000000	0.500000	0.5
gene7	28.294023	0.134338	0.149945	0.000000	0.500000	0.5
gene8	40.358344	-0.270656	0.136401	0.000000	0.500000	0.5
gene9	37.166183	-0.212715	0.133243	0.000000	0.500000	0.5
gene10	11.589325	0.386011	0.244588	0.000000	0.500000	0.5

I have not yet understood why (possibly a different Cook's adjustment?) DESeq2 returns p-values in [0, 1] even though it uses one-sided tests for the 'less', 'greater' and 'lessAbs' alternative hypotheses, whose p-values are bounded in [0, 0.5]. They must be corrected afterwards. The p-values where the statistic is > 0 agree up to tol=0.02, as shown in tests/test_pydeseq2.py:test_alt_hypothesis().
This implementation also allows setting an arbitrary lfc_null, i.e. the (log2) LFC under the null hypothesis, without any alternative hypothesis. In that case the classic Wald test is used to test for deviation from the null value.
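For the case with an lfc_null but no alternative hypothesis, the computation amounts to a two-sided Wald test centered on the null value. The sketch below is only illustrative: `wald_two_sided` and `norm_sf` are names made up here, the contrast vector is left out, and `norm_sf` stands in for scipy's `norm.sf` so that the snippet needs nothing beyond the standard library.

```python
import math


def norm_sf(x: float) -> float:
    """Survival function of the standard normal (same role as scipy's norm.sf)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def wald_two_sided(lfc: float, se: float, lfc_null: float = 0.0):
    """Two-sided Wald test of H0: LFC == lfc_null (illustrative, single gene)."""
    stat = (lfc - lfc_null) / se
    pval = 2.0 * norm_sf(abs(stat))
    return stat, pval


# An estimate that sits exactly on the null value gives stat = 0 and p = 1.
print(wald_two_sided(lfc=0.5, se=0.1, lfc_null=0.5))
```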

The test results used for comparison were generated using the R DESeq2 version 1.34.0 and results(dds, lfcThreshold=.5, altHypothesis=altHypothesis)

BorisMuzellec · 2023-09-06T10:02:10Z

Hi @vcabeli,

Thanks a lot for this PR!

A few comments:

I agree that it makes sense to expect users to provide an LFC threshold with the correct sign when altHypothesis = less. In particular, I don't see any reason why we wouldn't allow testing LFC < 0.5.
The differences between DESeq2 and this PR regarding statistics equal to 0 seem to come from the fact that when an altHypothesis is specified, DESeq2 thresholds statistics to be above 0 but uses un-thresholded statistics to compute p-values, e.g. for altHypothesis = greater:
```
newStat <- pmax((LFC - T)/SE, 0)
newPvalue <- pfunc((LFC - T)/SE)
```
vs. in this PR
```
stat = contrast @ np.fmax((lfc - lfc_null) / wald_se, 0)
pval = norm.sf(stat)
```
I'm not sure which is correct.
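To make the difference concrete, here is a minimal sketch that applies both variants to a few rows of the DESeq2 table from the PR description. It assumes those rows were produced with altHypothesis = greater and lfcThreshold = 0.5, which is not stated explicitly in the thread. `greater_test` and `norm_sf` are illustrative names, `norm_sf` stands in for `norm.sf`, and the contrast vector is left out.

```python
import math


def norm_sf(x: float) -> float:
    """Survival function of the standard normal (same role as scipy's norm.sf)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def greater_test(lfc: float, se: float, threshold: float, clip_before_pvalue: bool):
    """One-sided 'greater' test for a single gene.

    clip_before_pvalue=False mimics the DESeq2 snippet (p-value from the raw statistic);
    clip_before_pvalue=True mimics this PR (p-value from the clipped statistic).
    """
    raw = (lfc - threshold) / se
    stat = max(raw, 0.0)
    pval = norm_sf(stat if clip_before_pvalue else raw)
    return stat, pval


# (log2FoldChange, lfcSE) copied from the DESeq2 table in the PR description.
rows = {
    "gene1": (0.632812, 0.289104),
    "gene3": (-0.632832, 0.295221),
    "gene6": (0.001457, 0.310310),
}

for name, (lfc, se) in rows.items():
    _, p_deseq2 = greater_test(lfc, se, 0.5, clip_before_pvalue=False)
    stat, p_pr = greater_test(lfc, se, 0.5, clip_before_pvalue=True)
    print(f"{name}: stat={stat:.6f} DESeq2-style p={p_deseq2:.6f} PR-style p={p_pr:.6f}")
```

Whenever the raw statistic is negative, the clipped variant returns exactly 0.5 while the unclipped one returns a value above 0.5, which is the pattern visible in the two tables.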
Given that it is possible to provide a negative threshold, could you make DeseqStats throw a ValueError when lfc_null < 0 but alt_hypothesis is based on absolute values?
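Something along these lines is what I have in mind for the check. This is only a sketch: `check_lfc_null` is a hypothetical standalone function rather than the actual DeseqStats code, and the alternative names "greaterAbs" and "lessAbs" are assumed to mirror DESeq2's.

```python
from typing import Optional

# Alternatives that compare |LFC| to the threshold (names assumed to follow DESeq2).
ABS_HYPOTHESES = {"greaterAbs", "lessAbs"}


def check_lfc_null(lfc_null: float, alt_hypothesis: Optional[str] = None) -> None:
    """Reject a negative lfc_null when the alternative is based on absolute values."""
    if alt_hypothesis in ABS_HYPOTHESES and lfc_null < 0:
        raise ValueError(
            f"alt_hypothesis={alt_hypothesis!r} is based on |LFC|, "
            f"so lfc_null must be non-negative (got {lfc_null})."
        )


check_lfc_null(-0.5, "less")  # fine: testing LFC < -0.5 is a valid question
try:
    check_lfc_null(-0.5, "greaterAbs")
except ValueError as err:
    print(err)
```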

vcabeli · 2023-09-07T07:08:36Z

Thanks for the review @BorisMuzellec,
Personally, I would favor computing the p-value directly from the stat, especially since in this implementation you can specify an lfc_null without any alt_hypothesis, in which case stat = (LFC - T)/SE (and pval = pnorm(stat), although it uses the two-tailed distribution).
That way everything strictly corresponds to the input parameters.

BorisMuzellec

LGTM, thanks for the PR

vcabeli added 3 commits September 5, 2023 11:32

Added lfc under null and alt hypothesis

5d4bb1d

Linting

dc24d17

Added test

849e4d8

vcabeli requested review from BorisMuzellec, maikia, arthurPignetOwkin and mandreux-owkin as code owners September 5, 2023 10:41

Added less hypothesis in test

5387795

vcabeli changed the title ~~Added lfc under null and alt hypothesis~~ ENH Added lfc under null and alt hypothesis Sep 5, 2023

vcabeli added the enhancement New feature or request label Sep 5, 2023

Move options from DeseqDataset to DeseqStats, remove debugging part

e870ffe

Raise ValueError when lfc_null is <0 and alt hypothesis based on absolute value

e226692

docs: improve lfc_threshold example

bc41204

BorisMuzellec reviewed Sep 7, 2023

BorisMuzellec reviewed Sep 7, 2023

BorisMuzellec and others added 5 commits September 7, 2023 14:37

refactor(summary): set lfc_null and alt_hypothesis as optional keyword arguments

c7e09d1

docs: format docstring

f0046db

refactor: change log2 to natural log conversion method

d4a8f5d

refactor(summary): set lfc_null and alt_hypothesis as optional keyword arguments

55f8eed

docs: wrap docstring

9dde855

BorisMuzellec approved these changes Sep 15, 2023

BorisMuzellec merged commit 868e7fa into main Sep 15, 2023
8 checks passed

BorisMuzellec deleted the null_lfc branch September 15, 2023 10:30
