ENH:stats:Use explicit formula for gamma.fit('mm') #19932
I'll have to check the formulas carefully, but if they are correct this looks pretty good. I solved for the formulas based on the moments listed on Wikipedia, and I checked the code against them. Looks good!
Please add tests like in gh-18824 that show this produces fits at least as good as the generic implementation.
Thanks for reviewing. Let me work on adding some tests.
As for the NB point about checking that the data lie within the support: technically this is not necessary for MM, because the formulas do not rely on the data being within the support. This is unlike MLE, where any data point outside the support has a likelihood of zero, which makes the total likelihood identically zero regardless of the parameter values, so the problem cannot be solved. A data range check for MM is therefore not a technical necessity but more of a sanity check on behalf of the user. I'm not keen to include such a check for MM, but if you think it's worth having, I can add it too.
A side note with respect to MM: there's no guarantee that the fitted parameters are within the range of valid parameters. This is a well-known limitation / characteristic of MM. I currently do not check that the fitted parameters are within range, so that the user has a chance to see the values; the exception is delayed until they actually use the parameters (if they do). This is arguably more prudent than returning some within-range but absurd parameters. What do you think?
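For readers following the discussion, here is a hedged sketch of what the closed-form estimates look like when all three parameters are free. It assumes SciPy's `gamma(a, loc, scale)` parametrization and the standard gamma moments; the symbols $\bar{x}$, $s$ and $g_1$ (sample mean, standard deviation and skewness) are notation introduced here, and the PR's actual code may differ in detail.

```math
\begin{aligned}
\text{mean} &= \mathrm{loc} + a\,\mathrm{scale}, \qquad
\text{variance} = a\,\mathrm{scale}^2, \qquad
\text{skewness} = \frac{2}{\sqrt{a}} \\[4pt]
\hat{a} &= \frac{4}{g_1^{2}}, \qquad
\widehat{\mathrm{scale}} = \frac{s\,g_1}{2}, \qquad
\widehat{\mathrm{loc}} = \bar{x} - \hat{a}\,\widehat{\mathrm{scale}} = \bar{x} - \frac{2s}{g_1}
\end{aligned}
```

Under these formulas, a sample with negative skewness ($g_1 < 0$) yields a negative scale, which is one concrete way the fitted parameters can fall outside the valid range even though the moment equations are satisfied.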
Sure - we would not do that. After I write that, I realize that's essentially what we're doing with
I forgot about this. We added the following to the documentation when we added method of moments:
So yes, we should keep it like you have it, consistent with the documentation.
Do you mean to include a script in the GitHub page to compare the old (generic) MM and the new (closed-form) MM and list the result?
Oops, no, I mean add property-based tests to
Force-pushed: 3dac624 to d3c3a63
Thanks for adding tests. These are different from the tests in the linked PR in that they use hard-coded samples instead of samples that are generated at random, and they use hard-coded reference values (from where?).
Force-pushed: 4289b99 to fa92f51
The test generates a random sample from a gamma distribution and fits a gamma distribution to it. It then checks that the first r moments of the fitted distribution are equal to those of the sample, where r is the number of estimated parameters. The simulation parameters are restricted to a fairly small range to ensure the MM estimates lie within the range of valid parameter values, so that "round-trip" testing can be performed.
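The round-trip idea described above can be sketched in plain Python. This is only an illustration: `fit_gamma_mm`, `gamma_raw_moments` and `sample_raw_moments` are hypothetical helpers written for this sketch, not SciPy functions and not the PR's test code, and the use of biased (1/n) sample moments is an assumption. In SciPy itself the fit would be `stats.gamma.fit(data, method='mm')`.

```python
import math
import random


def fit_gamma_mm(data):
    """Hypothetical stand-in for a closed-form method-of-moments gamma fit
    with shape, loc and scale all free (not SciPy's implementation)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # biased variance (assumption)
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    skew = m3 / m2 ** 1.5
    a = 4.0 / skew ** 2                   # from skewness = 2 / sqrt(a)
    scale = math.sqrt(m2) * skew / 2.0    # from variance = a * scale**2
    loc = mean - a * scale                # from mean = loc + a * scale
    return a, loc, scale


def gamma_raw_moments(a, loc, scale):
    """First three raw moments of a gamma(a, loc, scale) distribution."""
    mean = loc + a * scale
    var = a * scale ** 2
    mu3 = 2.0 * a * scale ** 3            # third central moment
    return (mean, var + mean ** 2, mu3 + 3.0 * mean * var + mean ** 3)


def sample_raw_moments(data):
    """First three raw sample moments."""
    n = len(data)
    return tuple(sum(x ** k for x in data) / n for k in (1, 2, 3))


# Simulate from a gamma distribution with moderate parameters so that the
# estimates are expected to stay in the valid range (positive shape, scale).
rng = random.Random(1234)
a_true, loc_true, scale_true = 2.0, 1.0, 1.5
data = [loc_true + rng.gammavariate(a_true, scale_true) for _ in range(2000)]

a, loc, scale = fit_gamma_mm(data)
fitted = gamma_raw_moments(a, loc, scale)
observed = sample_raw_moments(data)

# Round trip: with r = 3 free parameters, the first 3 moments should agree.
round_trip_ok = all(
    math.isclose(f, o, rel_tol=1e-9) for f, o in zip(fitted, observed)
)
```

The comparison is against the sample's own moments rather than the true simulation parameters, so it checks the defining property of MM and does not depend on how close the estimates are to the truth.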
Force-pushed: fa92f51 to e307aab
Thanks for reviewing. It makes a lot of sense. I have updated the test accordingly. Sorry about the messy force pushes; I seem to have a hard time getting it right!
In the future, just do a regular push. We can clean up the history at the end if we want to, but almost all PRs in stats get squash-merged anyway.
Thanks @fancidev; this looks great now.
@fancidev a recent SO question brought to my attention that
with pytest.raises(ValueError, match=error_msg):
    stats.halfcauchy.fit(data, method='mm', **kwds)
return
Why halfcauchy here if the PR is about gamma? Copy-paste hiccup? @fancidev
Oops, yes exactly. Let me make a PR to correct it.
First time I heard of log-laplace :-) Let me look it up after the other issues/PRs are closed.
Reference issue
Closes gh-19884.
What does this implement/fix?
Implement explicit formulas for method-of-moments fitting of the gamma distribution.
Additional information