Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: stats: Use explicit MLE formulas in uniform.fit() #8554

Merged
merged 1 commit into from Apr 1, 2018

Conversation

WarrenWeckesser
Copy link
Member

uniform.fit() will now give the "exact" maximum likelihood parameter estimates.

In previous cases where I implemented the exact MLE formulas (#2519, #8358), I used decorators to tweak the fit docstring. I now think that approach is more trouble than it is worth. In this PR, I give uniform.fit its own docstring. This allows the docstring to be much simpler and clearer, by providing only the information relevant to uniform.fit. And the examples use uniform.fit instead of beta.fit and norm.fit.

(In another pull request--not yet created--I'll give the docstrings for norm.fit and expon.fit the same treatment.)

@WarrenWeckesser WarrenWeckesser added scipy.stats enhancement A new feature or improvement labels Mar 14, 2018
@WarrenWeckesser WarrenWeckesser force-pushed the uniform-fit branch 2 times, most recently from 13a73dd to c2cd5d1 Compare March 14, 2018 16:32
@chrisb83
Copy link
Member

Question: Does fit need to rely on MLE in that case? It is not the best estimator: if one uses the max of the observations to find the upper bound, one always understimates,it is better to use (1 + 1/n) * max, especially for small samples. (it is the minimum-variance unbiased estimator, see https://en.wikipedia.org/wiki/Uniform_distribution_(continuous))

@WarrenWeckesser
Copy link
Member Author

@chrisb83 wrote:

Question: Does fit need to rely on MLE in that case?

Yes. Currently fit computes the MLE. That's what it is documented to do; it must not do something else.

Enhancing the fit function with other fitting methods would be great (and it is something I have experimented with), but it is not something we should do ad hoc to individual distributions. To implement a new fitting technique, we'd need something like a method argument in the fit function, or even a new function if it turns out that changing the API of the current fit function is too awkward to do in a backwards-compatible way.

(The to-do/wish list is long. The other big enhancement to fit that is needed is to return information about the goodness of the MLE fit.)

@mirca
Copy link
Contributor

mirca commented Mar 17, 2018

@WarrenWeckesser by goodness of the MLE fit, you mean the Cramér-Rao Lower Bound? (Asking as someone who is willing to contribute some code if that's the case)

@WarrenWeckesser
Copy link
Member Author

@mirca: Yes. A standard calculation in MLE is to use the inverse of the Hessian of the log-likelihood function to estimate the asymptotic covariance matrix. If you are interested in this, it would be best to open up a new issue to discuss the appropriate calculations and the API for an implementation.

@josef-pkt
Copy link
Member

I don't think Hessian will work for uniform and other distributions where the support bounds are a function of the parameters.
AFAIR, the problem is because the likelihood function is not continuous or not differentiable.

@WarrenWeckesser
Copy link
Member Author

I added comments in uniform.fit() to explain how the maximum likelihood estimate is calculated for the uniform distribution.

@ev-br
Copy link
Member

ev-br commented Apr 1, 2018

I cannot help noticing that a significant part of this PR might be in rv_continuous.fit. Ditto for some other distributions which define special-cased fit method (gamma, beta). Maybe it would be good to have a pair of a public fit and private _fit which does all actual heavy lifting. However, this is clearly separate from this PR, and this PR looks good. Thanks Warren

@ev-br ev-br merged commit d468c4e into scipy:master Apr 1, 2018
@ev-br ev-br added this to the 1.1.0 milestone Apr 1, 2018
@WarrenWeckesser WarrenWeckesser deleted the uniform-fit branch April 1, 2018 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement scipy.stats
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants