ENH: override dweibull distribution survival and inverse survival function #18232

dschmitz89 · 2023-04-02T11:51:29Z

Reference issue

What does this implement/fix?

Overrides sf and isf for the double Weibull distribution.

Additional information

For both functions, the available range and precision improved for positive values.
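A minimal sketch in plain Python of why a dedicated sf helps in the far right tail. It assumes the generic fallback computes the survival function as 1 - cdf; `sf_via_complement` and `sf_direct` are hypothetical names used only for this illustration and are not SciPy functions.

```python
import math

def sf_via_complement(x, c):
    # Hypothetical stand-in for a generic fallback: 1 - cdf(x), valid for x > 0.
    cdf = 1.0 - 0.5 * math.exp(-x**c)
    return 1.0 - cdf

def sf_direct(x, c):
    # Closed form for x > 0: no subtraction from 1, so no cancellation.
    return 0.5 * math.exp(-x**c)

x, c = 1e20, 0.1
naive = sf_via_complement(x, c)   # exactly 0.0: the tail is lost when rounding 1 - tiny to 1
direct = sf_direct(x, c)          # about 1.86e-44
print(naive, direct)
```

The cdf rounds to exactly 1.0 in double precision once the tail probability drops below roughly 1e-16, so the complement returns zero while the closed form stays accurate.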

Survival function

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

x=np.logspace(0, 30, 10000)
c = 1e-1
plt.loglog(x, stats.dweibull.sf(x, c), label="main")
plt.loglog(x, np.where(x > 0, 0.5 * np.exp(-abs(x)**c), 1 - 0.5 * np.exp(-abs(x)**c)), label="PR")
plt.legend()
plt.title(f"Dweibull survival function: $c={c}$")
plt.show()

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
from mpmath import mp

mp.dps = 100

@np.vectorize
def sf_mpmath(x, c):
    x = mp.mpf(x)
    c = mp.mpf(c)
    if x > 0:
        return mp.mpf(0.5) * mp.exp(-x**c)
    else:
        return mp.one - mp.mpf(0.5) * mp.exp(-(-x)**c)

def sf_pr(x, c):
    return np.where(x > 0,
                    0.5 * np.exp(-abs(x)**c),
                    1 - 0.5 * np.exp(-abs(x)**c))

c = 0.1

x = np.logspace(0, 30, 10000)
main = stats.dweibull.sf(x, c)
pr = sf_pr(x, c)
ref = np.array(sf_mpmath(x, c), np.float64)

plt.loglog(x, np.abs(main - ref)/ref, label="main", alpha=0.5)
plt.loglog(x, np.abs(pr - ref)/ref, label="PR", alpha=0.5)
plt.legend()
plt.title(f"Dweibull survival function relative error: $c={c}$")
plt.show()

Inverse survival function

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

def dweibull_isf(q, c):
    fac = 2. * np.where(q <= 0.5, q, 1. - q)
    fac = np.power(-np.log(fac), 1.0 / c)
    return np.where(q > 0.5, -fac, fac)

c = 0.1  # same shape parameter as in the survival function plots
q = np.logspace(-20, -10, 1000)
plt.loglog(q, stats.dweibull.isf(q, c), label="main", ls="dashed")
plt.loglog(q, dweibull_isf(q, c), label="PR", ls="dashdot")
plt.legend()
plt.title(f"Dweibull inverse survival function: $c={c}$")
plt.show()

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
from mpmath import mp

mp.dps = 50

@np.vectorize
def dweibull_isf_mpmath(q, c):
    q = mp.mpf(q)
    c = mp.mpf(c)
    if q <= mp.mpf(0.5):
        fac = mp.mpf(2.) * q
    else:
        fac = mp.mpf(2.) * (mp.one - q)
    fac = (-mp.log(fac))**(mp.one / c)
    #print(fac)
    if q > mp.mpf(0.5):
        return float(-fac)
    else:
        return float(fac)

q = np.logspace(-20, 0, 1000)
c = 0.1
ref_vals = dweibull_isf_mpmath(q, c)
ref = np.array(ref_vals, np.float64)
main = stats.dweibull.isf(q, c)
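pr = dweibull_isf(q, c)  # dweibull_isf as defined in the previous snippet

# relative error is measured against the mpmath reference
plt.loglog(q, np.abs(main - ref)/np.abs(ref), label="main", alpha=0.5)
plt.loglog(q, np.abs(pr - ref)/np.abs(ref), label="PR", alpha=0.5)
plt.legend()
plt.title(f"Dweibull inverse survival function relative error: $c={c}$")
plt.show()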

mdhaber · 2023-04-02T15:38:32Z

The dweibull survival function is half the weibull_min survival function for positive arguments.

import numpy as np
from scipy import stats
x, c = stats.uniform.rvs(size=(2, 10))
np.testing.assert_allclose(stats.dweibull(c).sf(x), stats.weibull_min(c).sf(abs(x))/2)

To simplify, can you use the private functions of weibull_min (e.g. weibull_min._sf) as we did for _entropy to make this improvement? (And if the accuracy is not as good, the weibull_min function is really what needs to be improved.)
If you're willing, defining logsf would probably be good while our attention is here. Then maybe _sf could just exponentiate _logsf.
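A rough sketch of what a log survival function could look like, with sf obtained by exponentiating it. This is not the code that was merged; `dweibull_logsf` and `dweibull_sf` are hypothetical standalone helpers, and using `log1p` for the branch x <= 0 is an assumption of this sketch.

```python
import numpy as np

def dweibull_logsf(x, c):
    # Hypothetical helper, not SciPy API.
    x = np.asarray(x, dtype=float)
    # log(0.5 * exp(-|x|**c)), evaluated without forming the tiny probability
    log_half_tail = -np.abs(x)**c + np.log(0.5)
    # x > 0: log of the half tail; x <= 0: log(1 - half tail) via log1p
    return np.where(x > 0, log_half_tail, np.log1p(-np.exp(log_half_tail)))

def dweibull_sf(x, c):
    # sf obtained by exponentiating the log survival function
    return np.exp(dweibull_logsf(x, c))

print(dweibull_sf(1e20, 0.1))     # far right tail, still nonzero
print(dweibull_logsf(1e3, 2.0))   # finite even though the sf itself underflows to 0
```

Working in log space keeps the result informative after the survival function underflows, which is the usual reason for providing logsf separately.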

For tests, rather than comparing against reference values from mpmath, we only need to test that the dweibull functions follow those of weibull_min, assuming weibull_min is well tested. If not, the mpmath tests can be added to weibull_min, since it's the more commonly used and fundamental distribution.
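A consistency check of this kind might look like the sketch below. It uses only the public `sf` methods; the sample ranges, the seed, and the tolerance are arbitrary choices for illustration and are not taken from the SciPy test suite.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=20)      # arguments on both sides of zero
c = rng.uniform(0.5, 1.5, size=20)   # shape parameters

# Half of the weibull_min tail at |x|, reflected for non-positive x
half_tail = stats.weibull_min.sf(np.abs(x), c) / 2
expected = np.where(x > 0, half_tail, 1 - half_tail)

np.testing.assert_allclose(stats.dweibull.sf(x, c), expected, rtol=1e-10)
```

A check like this says nothing about absolute accuracy; it only ties dweibull to weibull_min, so the reference values belong in the weibull_min tests.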

mdhaber · 2023-04-09T21:55:27Z

scipy/stats/_continuous_distns.py

+    def _sf(self, x, c):
+        Cx1 = 0.5 * np.exp(-abs(x)**c)
+        return np.where(x > 0, Cx1, 1 - Cx1)
+
+    def _isf(self, q, c):
+        fac = 2. * np.where(q <= 0.5, q, 1. - q)
+        fac = np.power(-np.log(fac), 1.0 / c)
+        return np.where(q > 0.5, -fac, fac)
+


Suggested change

-    def _sf(self, x, c):
-        Cx1 = 0.5 * np.exp(-abs(x)**c)
-        return np.where(x > 0, Cx1, 1 - Cx1)
-
-    def _isf(self, q, c):
-        fac = 2. * np.where(q <= 0.5, q, 1. - q)
-        fac = np.power(-np.log(fac), 1.0 / c)
-        return np.where(q > 0.5, -fac, fac)
+    def _sf(self, x, c):
+        p = weibull_min._sf(x, c) / 2
+        return np.where(x > 0, p, 1 - p)
+
+    def _isf(self, p, c):
+        i = (p <= 0.5)
+        p = 2 * np.where(i, p, 1 - p)
+        q = weibull_min._isf(p, c)
+        return np.where(i, q, -q)

or it's OK to follow the (unfortunate) naming conventions where _isf accepts argument q.

This fails tests until _isf is overridden for weibull_min. Following the style of its ppf,

def _isf(self, q, c): return pow(-np.log(q), 1.0/c)

or

def _isf(self, p, c): return (-np.log(p))**(1.0/c)

would be fine, too. I don't know how the inverse cdf/sf methods started accepting q as the input. p would make more sense.

Let's not accept and silently revert changes in the future, please.
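For readers who want to try the delegation idea outside SciPy, here is a standalone sketch. `weibull_min_sf` and `weibull_min_isf` are hypothetical stand-ins for the private `weibull_min._sf` and `weibull_min._isf`, and taking `np.abs(x)` before delegating is an assumption of this sketch so that negative arguments stay in the domain of the one-sided distribution.

```python
import numpy as np

def weibull_min_sf(x, c):
    # Stand-in for the one-sided Weibull survival function, x >= 0
    return np.exp(-x**c)

def weibull_min_isf(p, c):
    # Inverse of the function above: solve exp(-x**c) = p for x
    return (-np.log(p))**(1.0 / c)

def dweibull_sf(x, c):
    x = np.asarray(x, dtype=float)
    p = weibull_min_sf(np.abs(x), c) / 2   # half the one-sided tail at |x|
    return np.where(x > 0, p, 1 - p)

def dweibull_isf(p, c):
    p = np.asarray(p, dtype=float)
    upper = p <= 0.5                        # True where the quantile is non-negative
    q = weibull_min_isf(2 * np.where(upper, p, 1 - p), c)
    return np.where(upper, q, -q)

x = np.array([-3.0, -0.5, 0.5, 3.0])
c = 1.5
roundtrip = dweibull_isf(dweibull_sf(x, c), c)
print(roundtrip)
```

The round trip recovers x on both sides of zero, which is a quick sanity check that the two branches of `np.where` are paired consistently.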

mdhaber · 2023-04-09T21:57:19Z

scipy/stats/tests/test_distributions.py

+    # reference values were computed with mpmath
+    # from mpmath import mp
+    # mp.dps = 50
+    # def sf_mpmath(x, c):
+    #     x = mp.mpf(x)
+    #     c = mp.mpf(c)
+    #     if x > 0:
+    #         return mp.mpf(0.5) * mp.exp(-x**c)
+    #     else:
+    #         return mp.one - mp.mpf(0.5) * mp.exp(-(-x)**c)
+    # for the inverse survival function tests, swap ref and x
+
+    @pytest.mark.parametrize('x, c, ref',
+                             [(1e20, 0.1, 1.8600379880103705e-44),
+                              (1e5, 0.5, 2.306726997904701e-138)])
+    def test_sf_isf(self, x, c, ref):
+        assert_allclose(stats.dweibull.sf(x, c), ref, rtol=5e-14)
+        assert_allclose(stats.dweibull.isf(ref, c), x, rtol=5e-14)


Let's convert this to a test for weibull_min.sf/isf. The test for dweibull.sf/isf can follow the style of test_entropy. There the comment doesn't need to be repeated, though.


dschmitz89 · 2023-04-12T19:23:54Z

scipy/stats/tests/test_distributions.py

+    @pytest.mark.parametrize('x, c, ref', [(50, 1, 1.9287498479639178e-22),
+                                           (1000, 0.8,
+                                            8.131269637872743e-110)])
+    def test_sf_isf(self, x, c, ref):
+        assert_allclose(stats.weibull_min.sf(x, c), ref, rtol=1e-14)
+        assert_allclose(stats.weibull_min.isf(ref, c), x, rtol=1e-14)


The tests for the inverse survival function fail in main.

Yup. You'll need to override ~~weibull_min._sf~~ weibull_min._isf with the same idea as you originally had for ~~dweibull._sf~~ dweibull._isf.

Not sure if I understand exactly: weibull_min's sf method is already overwritten and very accurate.

Oops. As you might have guessed, I meant _isf, since we're talking about the inverse survival function. That is not overridden. (I tried it locally when I made the suggestions for what to change about dweibull, and overriding it made the existing dweibull tests pass.)
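A small sketch of why the inverse survival function can fail without an override. It assumes the generic fallback evaluates the ppf at 1 - q; the three functions are hypothetical standalone helpers rather than SciPy internals, and the test point is the first reference pair from the weibull_min test above.

```python
import numpy as np

def weibull_min_ppf(q, c):
    # Inverse cdf of the one-sided Weibull distribution
    return (-np.log1p(-q))**(1.0 / c)

def isf_via_ppf(q, c):
    # Assumed generic fallback: isf(q) = ppf(1 - q)
    with np.errstate(divide="ignore"):
        return weibull_min_ppf(1.0 - q, c)

def isf_direct(q, c):
    # Closed form: solve exp(-x**c) = q for x
    return (-np.log(q))**(1.0 / c)

q, c = 1.9287498479639178e-22, 1.0
fallback = isf_via_ppf(q, c)   # 1 - q rounds to 1.0, so the result is inf
direct = isf_direct(q, c)      # close to 50
print(fallback, direct)
```

Any q below about 1e-16 is absorbed when subtracted from 1, so the fallback cannot distinguish such probabilities, while the closed form takes the logarithm of q directly.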

mdhaber · 2023-04-13T05:39:48Z

scipy/stats/tests/test_distributions.py

@@ -6349,6 +6349,20 @@ def test_fit_min(self):
        ref = np.mean(rvs), stats.skew(rvs)
        assert_allclose(res, ref)

+    # reference values were computed via mpmath


Confirmed with

class WeibullMin(ReferenceDistribution):
    def __init__(self, *, c):
        super().__init__(c=c)

    def _pdf(self, x, c):
        return c*x**(c-mp.one)*mp.exp(-x**c)

dschmitz89 added 2 commits April 1, 2023 08:55

ENH: dweibull survival and isf

824435a

Add testing code

f067a27

dschmitz89 added scipy.stats enhancement A new feature or improvement labels Apr 2, 2023

dschmitz89 added this to the 1.11.0 milestone Apr 2, 2023

dschmitz89 added 2 commits April 7, 2023 14:00

Merge branch 'main' into dweibull_sf_isf

c231ef6

WIP

fce65ec

mdhaber requested changes Apr 9, 2023

dschmitz89 and others added 7 commits April 10, 2023 12:52

Apply suggestion from code review

0ba403d

Co-authored-by: Matt Haberland <mhaberla@calpoly.edu>

WIP

d2be57a

Merge branch 'main' into dweibull_sf_isf

2635332

WIP

4f550e4

TST: test dweibull sf versus weibull

bca40ba

Merge branch 'main' into dweibull_sf_isf

2efd4f1

Merge branch 'dweibull_sf_isf' of https://github.com/dschmitz89/scipy …

a679b8b

…into dweibull_sf_isf

dschmitz89 commented Apr 12, 2023

TST: loosen tolerance

9a17f7f

mdhaber reviewed Apr 13, 2023

mdhaber approved these changes Apr 13, 2023

mdhaber merged commit c62be9c into scipy:main Apr 13, 2023

mdhaber mentioned this pull request May 27, 2023

DOC: SciPy 1.11.0 release notes #18563

Merged

5 tasks

dschmitz89 deleted the dweibull_sf_isf branch July 18, 2023 18:08
