Accuracy of stats.rv_continuous integrating methods #6579

Closed
denkorzh opened this issue Sep 15, 2016 · 1 comment
Labels
duplicate Issues that describe the same problem or that are reported multiple times scipy.stats

Comments

@denkorzh

I slightly modified the code from the scipy.stats.rv_continuous documentation to get a custom exponential distribution with parameter 0.5:

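import numpy as np
from scipy.stats import rv_continuous

class CustomExpon(rv_continuous):
    """Custom exponential distribution with parameter 0.5."""
    def _pdf(self, x, *args):
        # Scalar-only branch: the density is zero for negative x.
        if x < 0.:
            return 0.
        else:
            return .5 * np.exp(- .5 * x)

# No support is specified here, so the default (-inf, inf) is used.
custom_expon = CustomExpon(name='custom_expon')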

But when I evaluate print custom_expon.mean(), I get a number of IntegrationWarning messages and the result is 2.0576933609. (The correct answer is 2.)

I have read issue #6185, and the advice to set the lower bound of the distribution's support, a=0., is helpful. But are there any other ways to increase the accuracy of the integration?

@ev-br
Member

ev-br commented Sep 15, 2016

This is really an issue of communicating the intent rather than of accuracy.

On the technical level, if you do not define a=0, your distribution is assumed to have the support (-inf, inf). The numerical integration routine then tries to evaluate the integral over this interval and finds it difficult (which is not too surprising in retrospect, cf. #5428).

Now, the framework has a standard way of specifying the support of a distribution: setting a and b. What you are doing is tricking it into believing that you want the support (-inf, inf), while normalizing the PDF on [0, inf).
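A minimal sketch of what declaring the support could look like, assuming the `a` keyword of the `rv_continuous` constructor is used to set the lower bound (the upper bound `b` is left at its default of +inf). The `x < 0` branch is dropped on the assumption that the framework only evaluates `_pdf` inside the declared support.

```python
import numpy as np
from scipy.stats import rv_continuous


class CustomExpon(rv_continuous):
    """Exponential distribution with parameter 0.5, supported on [0, inf)."""

    def _pdf(self, x):
        # Assumption: _pdf is only evaluated inside [a, b], so no x < 0 branch.
        return 0.5 * np.exp(-0.5 * x)


# a=0 declares the lower end of the support; b defaults to +inf.
custom_expon = CustomExpon(a=0.0, name='custom_expon')

mean = custom_expon.mean()
print(mean)  # expected to be close to the exact value 2

# Outside the declared support the public pdf method returns 0.
outside = custom_expon.pdf(-1.0)
print(outside)
```

With the support declared this way, the numerical routines work on [0, inf) instead of the whole real line, which is the case they were struggling with.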

As an aside, as soon as you do if x < 0, you lose vectorization (try custom_expon.pdf([1, 2, 3])).
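To illustrate the vectorization point: a Python-level `if` applied to a NumPy array with more than one element raises a ValueError, because the truth value of such an array is ambiguous. If the piecewise form is wanted anyway, one possible way to write it elementwise is with `np.where`. This is only a sketch; the class name `VectorizedExpon` is hypothetical.

```python
import numpy as np
from scipy.stats import rv_continuous


class VectorizedExpon(rv_continuous):  # hypothetical name, for illustration
    """Exponential PDF with parameter 0.5, written with array operations only."""

    def _pdf(self, x):
        x = np.asarray(x, dtype=float)
        # np.where selects elementwise, so no Python `if` is applied to an array.
        return np.where(x < 0.0, 0.0, 0.5 * np.exp(-0.5 * x))


vectorized_expon = VectorizedExpon(a=0.0, name='vectorized_expon')

values = vectorized_expon.pdf([1, 2, 3])
print(values)
```

Once a=0 is set, the `np.where` guard is purely defensive; returning `0.5 * np.exp(-0.5 * x)` directly would also accept array input.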

Conclusion: setting a=0 is the correct thing to do. I do not believe there is a bug, so I'm closing this ticket. Feel free to keep discussing, though (even though usage questions are better asked on StackOverflow or the scipy-user mailing list).

@ev-br ev-br closed this as completed Sep 15, 2016
@ev-br ev-br added duplicate Issues that describe the same problem or that are reported multiple times scipy.stats labels Sep 15, 2016