ENH: override halflogistic `sf` and `isf` #18002

dschmitz89 · 2023-02-19T08:04:01Z

Reference issue

What does this implement/fix?

Overrides survival function and inverse survival function for the halflogistic distribution. Using a little algebra (or Wolfram Alpha for convenience) we get:
$SF(x)=1-CDF(x)=1-\tanh(x/2)=2/(1+e^x)=2\text{expit}(-x)$
$ISF(q)=-\text{logit}(q/2)$

Additional information

The current survival function fails at about $x=38$.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from scipy import special as sc


x = np.linspace(0, 400, 200)
plt.semilogy(x, stats.halflogistic.sf(x), label="main", ls="dashed")
plt.semilogy(x, 2 * sc.expit(-x), label="PR", ls="dotted")
plt.legend()
plt.title("Halflogistic survival function")
plt.show()

Inverse survival function: greatly improved range for small values.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from scipy import special as sc


x = np.logspace(-40, -10, 200)
plt.semilogx(x, stats.halflogistic.isf(x), label="main", ls="dashed")
plt.semilogx(x, -sc.logit(0.5 * x), label="PR", ls="dotted")
plt.legend()
plt.title("Halflogistic inverse survival function")
plt.show()

WarrenWeckesser

The new _sf works great, but for _isf, it looks like we have to play whack-a-mole on the relative error. Here's a plot of the relative error of the original version of isf and the version in this pull request. The pull request fixes the large errors near 0, but it has significantly worse relative error than the default for inputs near 1. If you zoom in, you can see that a good threshold for switching between the versions is right in the middle, p=0.5.

Note: the script uses mpsci.distribution.half_logistic, a very recent addition to mpsci.

script that generates the plot


import numpy as np
from scipy import stats
from scipy.special import logit
from mpmath import mp
from mpsci.distributions import half_logistic
import matplotlib.pyplot as plt


p = np.linspace(1e-5, 1, 10000, endpoint=False)
# x1 = stats.halflogistic.isf(p)
x1 = 2*np.arctanh(1 - p)  # Same as stats.halflogistic.isf(p) in main.
x2 = -logit(0.5*p)

mp.dps = 100
x = np.array([float(half_logistic.invsf(t)) for t in p])

plt.plot(p, np.abs(x1 - x)/x, alpha=0.35, label='original')
plt.plot(p, np.abs(x2 - x)/x, alpha=0.35, label='pull request')
plt.title('half-logistic isf relative error')
plt.xlabel('p')
plt.grid()
plt.legend(framealpha=1, shadow=True)
plt.ylim(-1e-15, 4e-14)
plt.show()

WarrenWeckesser · 2023-03-03T06:22:33Z

scipy/stats/tests/test_distributions.py

+    # from mpmath import mp
+    # mp.dps = 50
+    # def sf_mpmath(x):
+    #     mp.dps = 50


Delete this--you already set dps two lines above.

dschmitz89 · 2023-03-04T10:11:53Z

Thanks for the suggestions @WarrenWeckesser , the relative error of isf is much lower now. The remaining test failures are unrelated.

Relative error using the new formulation

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from scipy import special as sc

@np.vectorize
def isf_mpmath(x):
    x = mp.mpf(x)
    x = x/mp.mpf(2.)
    return float(-mp.log(x/(mp.one - x)))

def isf_special(x):
    return np.where(x < 0.5, -sc.logit(0.5 * x),
                    2*np.arctanh(1 - x))

x = np.linspace(0, 1, 100000)
mpmath_isf = np.array(isf_mpmath(x), dtype=np.float64)
result_special = isf_special(x)

plt.plot(x, np.abs((result_special - mpmath_isf)/mpmath_isf), label="arctanh")
plt.title("Relative error: halflogistic inv. survival function")

WarrenWeckesser

Looks good. I'll make a tiny tweak to a few lines of commented code and merge.

scipy/stats/tests/test_distributions.py

dschmitz89 added 2 commits February 19, 2023 08:55

ENH: halflogistic sf/isf override

003c61b

Merge branch 'main' into halflogistic_sf

8fb2a7c

dschmitz89 added scipy.stats enhancement A new feature or improvement labels Feb 19, 2023

dschmitz89 requested a review from WarrenWeckesser February 19, 2023 08:04

WarrenWeckesser requested changes Mar 3, 2023

View reviewed changes

dschmitz89 added 2 commits March 3, 2023 08:17

Merge branch 'main' into halflogistic_sf

f97135a

ENH: increase isf precision near 1 acc. to review

25f41e6

WarrenWeckesser approved these changes Mar 7, 2023

View reviewed changes

scipy/stats/tests/test_distributions.py Outdated Show resolved Hide resolved

Update scipy/stats/tests/test_distributions.py

6ec58d4

WarrenWeckesser merged commit 81f84c8 into scipy:main Mar 7, 2023

WarrenWeckesser added this to the 1.11.0 milestone Mar 7, 2023

dschmitz89 mentioned this pull request Jul 4, 2023

ENH:stats: Add _isf method to pareto #18820

Merged

dschmitz89 deleted the halflogistic_sf branch July 18, 2023 21:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ENH: override halflogistic `sf` and `isf` #18002

ENH: override halflogistic `sf` and `isf` #18002

dschmitz89 commented Feb 19, 2023 •

edited

Loading

WarrenWeckesser left a comment

WarrenWeckesser Mar 3, 2023

dschmitz89 commented Mar 4, 2023 •

edited

Loading

WarrenWeckesser left a comment

ENH: override halflogistic sf and isf #18002

ENH: override halflogistic sf and isf #18002

Conversation

dschmitz89 commented Feb 19, 2023 • edited Loading

Reference issue

What does this implement/fix?

Additional information

WarrenWeckesser left a comment

Choose a reason for hiding this comment

WarrenWeckesser Mar 3, 2023

Choose a reason for hiding this comment

dschmitz89 commented Mar 4, 2023 • edited Loading

WarrenWeckesser left a comment

Choose a reason for hiding this comment

ENH: override halflogistic `sf` and `isf` #18002

ENH: override halflogistic `sf` and `isf` #18002

dschmitz89 commented Feb 19, 2023 •

edited

Loading

dschmitz89 commented Mar 4, 2023 •

edited

Loading