WIP: Add scale_pcov argument to scipy.optimize.curve_fit #448
Conversation
@josef-pkt What's this case `pcov = inf`? Did I put the `if scale_pcov` in the right place?
I think you will have to put the `if scale_pcov` at line 536, inside the condition for valid pcov. I think the case covers when the outer product of the Jacobian (Hessian) is not invertible, for example reduced rank / perfect multicollinearity. I think we have `pcov is None` in that case.

I never looked at the case when `(len(ydata) > len(p0))` doesn't hold. I don't know what leastsq returns in that case. (It doesn't make much statistical sense.) Returning `inf` is reasonable, but Travis could also have chosen nan or None instead.
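For orientation, a minimal sketch of the branch being discussed, written as a hypothetical helper rather than the actual scipy source (`_finalize_pcov` and its arguments are assumptions; `residuals` are the values returned by the objective function at the solution):

```python
import numpy as np

def _finalize_pcov(pcov, residuals, n, p, scale_pcov=True):
    # Hypothetical helper, not the actual scipy source: the tail end of
    # curve_fit after leastsq has returned.
    if pcov is None or n <= p:
        # Singular Jacobian outer product (e.g. reduced rank) or too few
        # data points: no usable covariance estimate, so return inf.
        return np.full((p, p), np.inf)
    if scale_pcov:
        # Previous behaviour: scale by the residual variance estimate
        # s^2 = RSS / (n - p).
        s_sq = np.sum(residuals ** 2) / (n - p)
        pcov = pcov * s_sq
    return pcov
```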
@cdeil, @josef-pkt - what do you think about using a parameter name like `ydata_err`?

I realize I'm turning things on end a bit, and possibly even making a mistake since I haven't studied the code thoroughly. But the point is that as an end user I don't know (or care) about the internal operation of rescaling the output covariance. From the user perspective, what is conceptually happening is changing the interpretation of the `sigma` input. With this suggested change I would also drop all the explicit calculational details in the documentation. In any case it wouldn't be obvious to me what the pcov rescaling is doing and whether the output is what I would expect (from the astronomer perspective).
As I think about it more, I almost prefer the initial suggestion of using a mutually exclusive kwarg like `yerr` or `ydata_err`.
one problem with using the term in the function ...

I think my slightly preferred solution would also be to use ...
why do you scale by the standard deviation of the ydata? Can either of you write a test case - ydata, xdata, expected results for params and sqrt(diag(pcov))? My impression is that there are two mutually exclusive user groups, and each won't understand the interpretation for the other part. If this gets too complicated, then maybe adding a second function would be better.
@josef-pkt We certainly don't want to introduce a third definition of `sigma`.
we don't know the correct scale before estimation, that's why we multiply pcov by an estimate of the error variance. What do you expect in the case that sigma is not given?
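For reference, the rescaling being discussed can be written as follows (a sketch in standard least-squares notation, not taken from the thread; `J` is the Jacobian at the solution, `r_i` the residuals, `n` the number of data points and `p` the number of parameters):

```latex
\hat{s}^2 = \frac{1}{n - p} \sum_{i=1}^{n} r_i^2,
\qquad
\mathrm{pcov}_{\text{scaled}} = \hat{s}^2 \, \bigl(J^\top J\bigr)^{-1}
```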
Here's my gist. I'm pretty sure there is no factor we can apply to `pcov` ... So I guess we should forget about this idea and focus on explaining well in the docstring why `pcov` is scaled.

@josef-pkt I think a whole separate function might be too extreme, given that the difference is only one line of code. What do you think about calling the new parameter ...?
@josef-pkt - when sigma is not given I usually expect that it defaults to 1.0. But we are experiencing a language issue (coming from different disciplines) because I tend to think of `sigma` as the measurement errors on the data.

About a reference: I'm not sure if it's online, but Numerical Recipes covers this in the section describing least-squares fitting.
@cdeil - could you add to your gist an example where you fit a constant to some data to estimate the mean? The advantage is that the "correct" answers for the mean and the uncertainty on the mean are trivial to compute. For the "ydata_err" version, the uncertainty estimate should be independent of the variance of the input data, since this corresponds to the case where we are forcing our knowledge of data uncertainties and not relying on the actual data variance. This would be a good test case as well, again because the correct answers are obvious by inspection.
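A minimal sketch of this test with the current (scaled-pcov) behaviour; the data values and seed are illustrative, not from the thread:

```python
import numpy as np
from scipy.optimize import curve_fit

# Fit a constant to noisy data: the best-fit value must equal the sample
# mean, and the scaled-pcov uncertainty must equal std(ydata)/sqrt(N).
rng = np.random.default_rng(0)
ydata = rng.normal(loc=5.0, scale=2.0, size=100)
xdata = np.arange(ydata.size, dtype=float)

def constant(x, c):
    return np.full_like(x, c)

popt, pcov = curve_fit(constant, xdata, ydata, p0=[1.0])
print(popt[0], ydata.mean())  # these two agree
print(np.sqrt(pcov[0, 0]), ydata.std(ddof=1) / np.sqrt(ydata.size))  # and these
```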
I'm not immediately opposed to a new function just because it is only different by one line. But the question to ask is whether having a new function clarifies overall usage for people who want to fit, or confuses the issue. I suspect it will confuse the issue, because you then see two fit functions and have to decide between them. I think the new "ydata_err" parameter would be my choice for the simplest way to make the distinction here.
BTW, I was assuming that a new function would mostly just consist of calling the existing `curve_fit`.
I have updated the gist with a cell at the end that shows that multiplying the `sigma` by a constant factor leaves the scaled `pcov` unchanged. :-) I don't know how to explain it better, sorry; hopefully Josef will find a reference that does a better job.

@taldcroft I will make a more extensive notebook that also contains the basic example you have in your notebook, but for now I just wanted to understand myself what is going on ...
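The point of the gist can be reproduced in a few lines (a reconstruction of the idea, not the gist itself; the model and numbers are made up):

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, a, b):
    return a * x + b

rng = np.random.default_rng(1)
xdata = np.linspace(0.0, 10.0, 50)
ydata = line(xdata, 3.0, 1.0) + rng.normal(scale=0.5, size=xdata.size)
sigma = np.full(xdata.size, 0.5)

# With the default rescaling, sigma acts only as relative weights:
# multiplying it by any constant leaves pcov unchanged.
_, pcov1 = curve_fit(line, xdata, ydata, sigma=sigma)
_, pcov2 = curve_fit(line, xdata, ydata, sigma=100.0 * sigma)
print(np.allclose(pcov1, pcov2))  # True
```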
good gist, I think it explains the difference pretty well. Note, I had never heard of "chi" before this discussion, because as far as I have seen, it (chi2 / dof) is always 1 by definition. Overall I like the boolean solution better, but I understand that it would be easier to use if there were different sigma-weight arguments.
I would argue that curve_fit is inadequate for most scientists anyway: it is missing parameter bounds and the ability to fix parameters. Probably something like lmfit, kapteyn or sherpa should be included in scipy.
Numerical Recipes: ...
looking at SAS, I didn't see a weights option, but they allow a user-given residual variance; in your case it would be 1.

Related: so one option would also be to add a user-given scale for `pcov`.
I'd prefer to have a separate keyword argument, say, `yerr`. The documentation should also explain that the covariance matrix is either based on standard errors estimated from the data, or on those supplied by the user. In the estimation case, it needs to also discuss how the relative weights affect the estimate. It should also explain whether the relative weights enter as `1/sigma` or `1/sigma**2`.

I wouldn't like boolean flags and adding discussion about "scaling" the covariance matrix --- the covariance matrix is defined as the covariance of the parameter estimates.

@Tillsten: bounds and constraints (eg. using L-BFGS-B) and fixed parameter masks could probably be added to curve_fit without too many problems.
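On the `1/sigma` vs `1/sigma**2` question: as far as I can tell from the implementation of that era (a paraphrase from memory, not a verbatim copy of the source), `sigma` divides the residuals passed to leastsq, so each squared term in the objective is weighted by `1/sigma**2`:

```python
import numpy as np

def weighted_residuals(params, f, xdata, ydata, sigma):
    # leastsq minimizes sum(residuals**2), so dividing the residuals by
    # sigma puts a weight of 1/sigma**2 on each squared term.
    return (f(xdata, *params) - ydata) / np.asarray(sigma)
```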
@pv I am not sure; for the typical data fitting problems I had much better performance with Levenberg-Marquardt than with any of the other solvers. I also think there should be a scipy.fitting package. The optimize package tends to irritate people who just want to fit a function. I also think it is enough to include one of the existing packages, e.g. http://www.astro.rug.nl/software/kapteyn/kmpfit.html
I don't see a way to implement two different sigma arguments that almost mean the same thing, are mutually exclusive, should have a "statistics" default, and can still be explained easily in the docstring. In analogy to the linear model and the model without weights/sigma, we always (?) have a scale factor. This function is minimizing the sum of squares ("Nonlinear Least Squares"), not "chi" or "chisquare" as in astronomy packages.
@Tillsten: if there were additional tools on this page below the "Fitting" subtitle, would people not find them? The point is that having too many namespaces also complicates things. Anyway, I won't object too strongly. We could as well rename scipy.odr to scipy.fitting and add whatever additional stuff we want there.
@pv Maybe a submodule in optimize would be enough? As someone who tries to spread Python, the typical questions I get about curve_fit concern parameter bounds and fixing parameters. Because installing custom packages on Windows as a non-admin is quite an adventure (thank god for winpython), I would be very happy to see this basic functionality in scipy. Sorry, btw, for going a little bit off-topic.
I agree with @Tillsten's last comment that we need this functionality (but preferably not in a new submodule). We discussed including lmfit at some point before. I haven't used any of Sherpa, Kapteyn and Minuit though, and lmfit not very extensively, so I have no opinion on their relative merits.
As I indicated on the scipy-user mailing list, I would really like to see this pull request implemented, since in most cases what I care about is the true uncertainty on the parameters, taking into account the errors on the data. I also think that the docstring of `curve_fit` should explain clearly how `sigma` is interpreted.
I totally share @Tillsten's point of view, coming from his experimental-physicist experience. That's why I use lmfit for this purpose (and it is still longer than my favorite gnuplot approach).
👍 to treating ...
How about creating a second function?
But is it really specific to astronomy to not rescale the covariance matrix to chi^2/dof = 1?
same arguments as before: every (general) statistics package that I know does by default the same as the current curve_fit. Since I figured out what this story is, I have seen it mentioned in several packages and in tutorials oriented to some physical (?) sciences. But I have no idea how widespread the use is outside of astronomy and related fields.
@josef-pkt - I think that just because no one is surprised doesn't make it right though ;) I used it for a couple of years before realizing this myself... Anyway, if that is what other packages are doing, then that does add to the argument for keeping the current default, but do other packages also call the errors ...?
At least in R, the standard statistics package, they call it ...
Note, I'm not too happy now about `scale_pcov`.

There were some suggestions higher up in this PR, especially Pauli's, on how to fix this without changing the default: (without repeating the entire discussion) adding a separate keyword argument.
@josef-pkt - thanks for the clarifications. I personally feel that the approach in this PR is fine, i.e. have an optional `scale_pcov` flag.
The only two things that I really care about are that the default stays with scaling and that the terminology stays consistent with general statistics usage.

Maybe a vote on the different solutions, and someone needs to update the pull request. Then Ralf or Pauli can check and merge.
At this point the extra keyword option seems reasonable. I would vote for a flag name more like `absolute_sigma`. The name `scale_pcov` describes the internal operation of rescaling rather than the meaning of the input.
The deadline for the next release is approaching.
Perhaps it's time to get this moving. How about this: https://github.com/pv/scipy-work/compare/curve-fit-yerr
@pv looks good to me. What happens if `absolute_sigma` is True and pcov is None (wasn't returned by leastsq)? Should the behavior match the `absolute_sigma is False` case?
@pv - looks good to me too!
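For readers arriving later: the keyword in this branch, `absolute_sigma`, is the one that was eventually merged into curve_fit. A short usage sketch (the model and data are illustrative):

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, a, b):
    return a * x + b

rng = np.random.default_rng(2)
xdata = np.linspace(0.0, 10.0, 50)
ydata = line(xdata, 3.0, 1.0) + rng.normal(scale=0.5, size=xdata.size)
yerr = np.full(xdata.size, 0.5)

# sigma taken as absolute one-standard-deviation errors: pcov not rescaled.
popt, pcov_abs = curve_fit(line, xdata, ydata, sigma=yerr, absolute_sigma=True)

# Default: sigma acts as relative weights and pcov is rescaled by chi^2/dof.
popt, pcov_rel = curve_fit(line, xdata, ydata, sigma=yerr)
```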
Continued in gh-3098. @josef-pkt: I changed the returned `pcov` in that case.
Merged in 511ea8f, thanks everyone for contributing to the discussion.
An option `scale_pcov` is added to `scipy.optimize.curve_fit`, to accommodate the common cases:

- `sigma` = relative weights; the output covariance matrix `pcov` should be scaled. This was the previous behaviour, so to keep backwards compatibility I chose the default `scale_pcov=True`.
- `sigma` = one standard deviation errors on `ydata`; `pcov` should not be scaled. This use case was requested e.g. here and can now be obtained by explicitly setting `scale_pcov=False`.

@josef-pkt @taldcroft Please comment if the description in the docstring is OK or if it can be improved.

I was thinking that maybe a new option `yerr` or `ydata_err` that is mutually exclusive with `sigma`, instead of the new option `scale_pcov`, could be better?

I can add a unit test for `scale_pcov=False` tomorrow.
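A usage sketch of the API as proposed in this PR (hypothetical: the `scale_pcov` keyword shown here was later superseded by `absolute_sigma`; `f`, `xdata`, `ydata`, `weights` and `yerr` are assumed to be defined as in the examples above):

```python
from scipy.optimize import curve_fit

# sigma as relative weights; pcov rescaled (previous behaviour, the default).
popt, pcov = curve_fit(f, xdata, ydata, sigma=weights, scale_pcov=True)

# sigma as one-standard-deviation errors on ydata; pcov left unscaled.
popt, pcov = curve_fit(f, xdata, ydata, sigma=yerr, scale_pcov=False)
```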