Fix problems with arguments that are neither fit parameters nor independent variables #926

schtandard · 2023-11-18T14:43:24Z

Description

Model function arguments that are neither fit parameters nor independent variables were handled incorrect in two places.

They were incorrectly identified to not correspond to a model function argument in Model.fit() and thus a misleading spurious warning was issued.
They were not considered in ModelResult.plot_fit(), causing ModelResult.plot() to produce contradictory plots.

This PR remedies both issues and fixes #917 (moved to #920).

Type of Changes

Bug fix
New feature
Refactoring / maintenance
Documentation / examples

Tested on

Python: 3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:29:11) [MSC v.1935 64 bit (AMD64)]

lmfit: 0.0.post2796+g4e2470a.d20231118, scipy: 1.11.3, numpy: 1.26.0, asteval: 0.9.31, uncertainties: 3.1.7

Verification

Have you

included docstrings that follow PEP 257?
(No API changes.)

referenced existing Issue and/or provided relevant link to mailing list?

verified that existing tests pass locally?
(I get several warnings and three skipped tests, though.)

verified that the documentation builds locally?
(Except for issues that also appeared when building on master, as described in Documentation does not build #924.)

squashed/minimized your commits and written descriptive commit messages?

added or updated existing tests to cover the changes?
(No tests were affected by the change.)
updated the documentation and/or added an entry to the release notes (doc/whatsnew.rst)?
(I did not add anything to doc/whatsnew.rst yet. I had a look at the file history and other PRs and it seems to be done in batch, separate from the PRs. I will add an entry if you ask me to.)
added an example?
(Same as for test, I don't think any example was affected and I did not add a new one.)

Checking only against independent variables fails in the case of arguments that are neither fit parameters nore independent variables.

Additional keyword arguments used during fitting are only known to the former.

newville · 2023-11-18T19:49:22Z

@schtandard I think that arguments of a Model function are either parameters or independent variables.

We generally prefer a process of agreeing on what an issue is and maybe even a description of a solution to that issue before a PR. Some cases are easy to identify as bugs, sometimes experienced developers see a problem and know how to fix it (but sometimes I get this wrong ;)).

I am having a hard time understanding what the actual problem is that you keep misidentifying as something else. Basically, we have not identified an Issue to be solved.

It's possible that "ensure that independent variables can be left unspecified in Model.fit() and Model.eval() if default values are given in the signature" would be a fix that would satisfy what you claim to need and be agreeable to us. Maybe that is what you are trying to say? I am not sure.

schtandard · 2023-11-18T20:21:52Z

We generally prefer a process of agreeing on what an issue is and maybe even a description of a solution to that issue before a PR.

I prefer that, too. I did not get the feeling that I was getting the issue across, though. I decided to just take a look at the code and when I discovered that the problem seemed to be isolated to two lines of code, I thought that maybe a PR would help communicate the issue. This doesn't seem to have been a total success, but I haven't given up hope just yet.

I think that arguments of a Model function are either parameters or independent variables.

Ok, then this is surely a big part of the confusion. From my point of view, your documentation and the module behavior disagree. Let's look at the example from your post.

def func(x, a=3, b=5, scale=None):
    output = a +x*b
    if scale is not None:  output *= scale
    return output
mod = Model(func)

If we ask the model what its independent variables and parameters are

print("independent variables:", mod.independent_vars)
print("parameters:", mod.param_names)

we get

independent variables: ['x']
parameters: ['a', 'b']

Apparently, the model thinks that scale is neither.

model.rst contains the helpful definition

independent variable

A function argument that is not a parameter or otherwise part of the
model, and that will be required to be explicitly provided as a
keyword argument for each fit with Model.fit() or evaluation
with Model.eval().

This is both in direct contradiction to your statement "You have repeatedly said that all independent variables must always be specified but that is simply untrue." and contains a reference to function arguments that are "otherwise part of the model" (i.e. not as a parameter or an independent variable).

And then there is ModelResult.userkws which is documented as

keyword arguments passed to Model.fit(), a dict, which will have independent data arrays such as x.

This is where scale goes in the example above (when one fits, via the kwargs argument of Model.fit()); not a parameter, not an independent variable but an extra keyword argument. This is what's not handled correctly by Model.fit() (causing a spurious warning) and ModelResult.plot_fit() (being ignored there).

Does this help clarify things?

newville · 2023-11-21T18:34:28Z

@schtandard Well, I guess I think we should probably make sure that function arguments can only be Parameters or independent variables. Admittedly, the concept of "independent variable" has evolved here from "first argument, must be 1D ndarry" (as I think is still true in scipy curve_fit) to "all the non-Parameter data needed to calculate the model, without limitation on data type".

So, I think we shoud assert that independent variables with a default value in the function signature should not be required in Model.fit() or Model.eval() (or any others I might be forgetting). And, when there is a default value, that should be used.

I think that such a fix would probably address most of the issues you have faced.

schtandard · 2023-11-22T11:05:27Z

That sounds like a good idea, though I cannot tell how many things would need changing for it. You probably have a much better overview than me there. Some comments:

I guess that would mean dropping Model.opts and (similar attributes in other classes) storing "extra" keyword arguments?
An attribute containing only the independent variables with no default arguments would then be useful, so users can check which arguments must be supplied.
27db25b should then be obsolete, though bd9b9f1 would remain the correct fix for the ModelResult.plot_fit() issue, I think.
Actually, this would mean that the param_names and independent_vars arguments to Model.__init__() are equivalent (being complementary), no? Keeping both for convenience and (limited) backwards compatibility might be good, but supplying both without naming all model function arguments should then raise an error, right?
Or maybe it would be favorable to have add_params and add_independents (or something) for naming only those arguments that should fall in the category they wouldn't got to automatically? Would have the advantage that the user knows they have to provide initial values for exactly the parameters in add_params (as the others have automatic initial values) but be somewhat less explicit, so maybe not.
The rule for automatically determining independent variables would need to change to "anything that isn't a keyword argument with a numerical default value" (i.e. not a parameter), of course. Two possible cases warrant special consideration:
- No independent variables (all arguments have numerical default values). Seems unlikely to be used, but not inconceivable. One could argue for the first argument to be used unless there is an explicit independent_vars=[] but I would probably leave it for consistency.
- No parameters (no arguments have numerical default values). This means there is nothing to fit, though it may still be useful for composite models? Don't know.
The essence of Always retain numerical keyword argument defaults #925 should be unaffected by this change, as it does not deal with independent variables at all (neither the old nor the new kind, it deals with parameters). The code surrounding it would need to be slightly adapted, of course (as using independent_vars should already disable the automatic detection of parameter names, just like param_names does now).

newville · 2024-03-18T12:49:51Z

@schtandard I believe this is now better (if not 100% fixed) with the merging of #941

schtandard added 2 commits November 18, 2023 15:30

Check against all model function arguments

27db25b

Checking only against independent variables fails in the case of arguments that are neither fit parameters nore independent variables.

Evaluate using ModelResult, not Model

bd9b9f1

Additional keyword arguments used during fitting are only known to the former.

This was referenced Feb 19, 2024

Require dill #940

Merged

More uniform independent vars #941

Merged

newville closed this Mar 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix problems with arguments that are neither fit parameters nor independent variables #926

Fix problems with arguments that are neither fit parameters nor independent variables #926

schtandard commented Nov 18, 2023

newville commented Nov 18, 2023 •

edited

Loading

schtandard commented Nov 18, 2023 •

edited

Loading

newville commented Nov 21, 2023

schtandard commented Nov 22, 2023

newville commented Mar 18, 2024

Fix problems with arguments that are neither fit parameters nor independent variables #926

Fix problems with arguments that are neither fit parameters nor independent variables #926

Conversation

schtandard commented Nov 18, 2023

Description

Type of Changes

Tested on

Verification

newville commented Nov 18, 2023 • edited Loading

schtandard commented Nov 18, 2023 • edited Loading

newville commented Nov 21, 2023

schtandard commented Nov 22, 2023

newville commented Mar 18, 2024

newville commented Nov 18, 2023 •

edited

Loading

schtandard commented Nov 18, 2023 •

edited

Loading