BUG: stats: fix skewnorm.cdf losing precision at large x #8473
Conversation
LGTM, it's good to have this now
Closes #5160 too, right?
Nice. The survival function
so we get accurate values for large
@WarrenWeckesser I thought about it, but
And I did not find a failing test case (admittedly, I did not try too hard). Do you have one?
Those values are all 0, but that just means the true values are smaller than ~1.11e-16. We should be able to do better than that. Here's a comparison:
Here's the default
Here's the version I suggested:
But that reveals a new problem: the survival function should not be negative! How accurate is the implementation of Owen's T with a call such as
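A minimal sketch of the closed form under discussion (the helper name `skewnorm_cdf_owens` is mine, not scipy's): the skew normal CDF can be written as Phi(x) - 2*T(x, a) using `scipy.special.owens_t`. For shape a = 1 the CDF reduces to Phi(x)**2, which gives an exact reference value deep in the left tail and makes the cancellation visible:

```python
from scipy.special import owens_t
from scipy.stats import norm

def skewnorm_cdf_owens(x, a):
    # Skew normal CDF via Owen's T function:
    #   F(x; a) = Phi(x) - 2*T(x, a)
    return norm.cdf(x) - 2.0 * owens_t(x, a)

# For a = 1 the CDF reduces to Phi(x)**2 (d/dx Phi(x)**2 = 2*phi*Phi),
# so we have an exact reference value in the far tail.
x = -20.0
naive = skewnorm_cdf_owens(x, 1.0)
exact = norm.cdf(x) ** 2  # ~7.6e-178
# Both terms of the subtraction are ~Phi(-20) ~ 2.8e-89, so everything
# below ~1e-105 is lost to cancellation: the result is pure noise
# (often 0.0, sometimes a tiny negative number).
print(naive, exact)
```

This is why the survival function can come out negative: the subtraction noise has no definite sign.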
I said
but apparently it isn't as easy as I hoped. For
The indentation in the test function is wrong; it should be four spaces, not five.
The CDF implementation using
With scipy 1.0.0, we get
According to Wolfram Alpha, the correct answer is 3.870035046664392611e-31. |
Problems in cdf and sf should be symmetric for opposite signs of x and a. One possibility would be to recompute using super() for edge cases where numerical integration (scipy 1.0) works better.
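One way to read this suggestion is a hybrid: keep the Owen's T closed form for moderate arguments and recompute the tail by direct quadrature of the pdf, which plays the role of falling back to the generic `rv_continuous` integration via super(). The cutoff and helper names here are illustrative, not scipy's actual implementation:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import owens_t
from scipy.stats import norm

def _pdf(t, a):
    # Skew normal pdf: 2*phi(t)*Phi(a*t)
    return 2.0 * norm.pdf(t) * norm.cdf(a * t)

def skewnorm_cdf_hybrid(x, a, cutoff=5.0):
    # Closed form via Owen's T for moderate arguments ...
    if abs(x) <= cutoff:
        return norm.cdf(x) - 2.0 * owens_t(x, a)
    # ... and direct quadrature of the pdf over the smaller tail
    # otherwise, where the closed form cancels catastrophically.
    if x < 0:
        val, _ = quad(_pdf, -np.inf, x, args=(a,))
        return val
    tail, _ = quad(_pdf, x, np.inf, args=(a,))
    return 1.0 - tail
```

The cutoff of 5 is an arbitrary placeholder; a real fix would need to locate where the cancellation actually becomes worse than the quadrature error.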
The current version fixes a problem which was reported by a user for somewhat large arguments. At still larger arguments, cdf is exponentially small and the current version suffers from catastrophic cancellation. In that regime, one likely needs to figure out the asymptotic expansion of the Owen's T function. While this could be a fun project, and the answer is likely available from the literature (or possibly even implemented inside of
Evgeni, so what do you think is the next step? I don't think this pull request should be merged in its current state. It fixes one problem but creates a new problem. |
A small thought: the test case
Haven't thought about whether there's cancellation in other regimes.
It is not hard to find examples where the CDF implementation using Owen's T can result in big errors even when
Here's the CDF implemented with Owen's T function:
A couple of cases where the error is big: x = -4, a = 2. The correct value (according to Wolfram Alpha) is 8.1298399188811398e-21.
x = -2, a = 5. The correct value is 1.55326826787106273e-26.
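A sketch reproducing these two cases with the values quoted above (the helper name is mine). Both terms of the subtraction are of order Phi(x), many orders of magnitude larger than the result, so nearly every significant digit cancels:

```python
from scipy.special import owens_t
from scipy.stats import norm

def skewnorm_cdf_owens(x, a):
    # Skew normal CDF via Owen's T: Phi(x) - 2*T(x, a)
    return norm.cdf(x) - 2.0 * owens_t(x, a)

# Reference values reported in the thread (via Wolfram Alpha):
cases = {(-4.0, 2.0): 8.1298399188811398e-21,
         (-2.0, 5.0): 1.55326826787106273e-26}
for (x, a), true_value in cases.items():
    naive = skewnorm_cdf_owens(x, a)
    # e.g. for x = -4: Phi(-4) ~ 3.17e-5 and 2*T(-4, 2) ~ 3.17e-5,
    # while the true difference is ~8.1e-21, i.e. below one ulp of
    # the operands -- the subtraction returns rounding noise.
    print(x, a, naive, true_value)
```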
An alternative to using Owen's T is to split the integral of the PDF at x=0, to ensure that
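A hedged sketch of one reading of that suggestion (names and structure are illustrative): integrate the pdf only over the piece that ends at x, and combine with the closed-form value at zero, F(0; a) = 1/2 - arctan(a)/pi, so the quadrature always targets a quantity of the same magnitude as the answer:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def _pdf(t, a):
    # Skew normal pdf: 2*phi(t)*Phi(a*t)
    return 2.0 * norm.pdf(t) * norm.cdf(a * t)

def skewnorm_cdf_split(x, a):
    # F(0; a) = Phi(0) - 2*T(0, a) = 1/2 - arctan(a)/pi, since
    # T(0, a) = arctan(a)/(2*pi).
    F0 = 0.5 - np.arctan(a) / np.pi
    if x <= 0:
        # Left tail: integrate the small quantity directly instead
        # of computing it as a difference of two O(1) numbers.
        val, _ = quad(_pdf, -np.inf, x, args=(a,))
        return val
    val, _ = quad(_pdf, 0.0, x, args=(a,))
    return F0 + val
```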
It would still be nice to get the CDF and SF functions working reliably with Owen's T, because it is much faster. For example,
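A rough illustration of the speed gap (timings are machine-dependent and the helpers are mine): the Owen's T form costs two special-function evaluations per point, while quad must evaluate the pdf dozens of times per call:

```python
import timeit

import numpy as np
from scipy.integrate import quad
from scipy.special import owens_t
from scipy.stats import norm

def cdf_owens(x, a):
    # Closed form: two vectorizable special-function calls.
    return norm.cdf(x) - 2.0 * owens_t(x, a)

def cdf_quad(x, a):
    # Generic numerical integration of the pdf 2*phi(t)*Phi(a*t).
    val, _ = quad(lambda t: 2.0 * norm.pdf(t) * norm.cdf(a * t),
                  -np.inf, x)
    return val

t_owens = timeit.timeit(lambda: cdf_owens(-1.0, 2.0), number=200)
t_quad = timeit.timeit(lambda: cdf_quad(-1.0, 2.0), number=200)
print(t_owens, t_quad)
```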
@josef-pkt wrote
Indeed, for the skew normal distribution,
and focus on getting the CDF function right.
It might take a while for someone to resolve the problem of the catastrophic loss of precision, so in the meantime, I incorporated my comments above into an alternative pull request: #8501 |
Closing this in favor of gh-8501 |
Now that Owen's T function is available, we can use it for the skew normal distribution.
Closes #7746