CompositeModel depends on return type of model functions #875

Closed
Julian-Hochhaus opened this issue May 4, 2023 · 6 comments

Comments

@Julian-Hochhaus
Contributor

Description

When using a single model, whether the model function returns a numpy.ndarray or a list does not affect the result. However, when using a CompositeModel that combines two models (e.g. via the + operator), fitting fails if both model functions return lists. In contrast, if at least one of the model functions returns a numpy.ndarray, the fit succeeds.

The error is raised in the _residual() method of the Model class while computing diff = model - data, where model is the output of eval(). The problem appears to be that model is twice as long as data (and as each model function's output) when both model functions return lists. The bug is likely caused by the two lists being concatenated instead of added element-wise during CompositeModel evaluation.
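
The length doubling can be seen with plain Python (a small illustration, not lmfit code): adding two lists concatenates them, while adding two numpy arrays is element-wise.

import numpy as np

print([1.0, 2.0] + [3.0, 4.0])                      # [1.0, 2.0, 3.0, 4.0] -> length 4
print(np.array([1.0, 2.0]) + np.array([3.0, 4.0]))  # [4. 6.]              -> length 2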

Using lists as the return type of both model functions:

import numpy as np
from lmfit.model import Model


def lin1(x, k):
    # returns a plain Python list
    return [k * x1 for x1 in x]


def lin2(x, k):
    # returns a plain Python list
    return [k * x1 for x1 in x]


y = np.linspace(0, 100, 100)
x = np.linspace(0, 100, 100)

Model1 = Model(lin1, independent_vars=["x"], prefix="m1_")
Model2 = Model(lin2, independent_vars=["x"], prefix="m2_")

ModelSum = Model1 + Model2
pars = ModelSum.make_params()

pars.add('m1_k', value=0.5)
pars.add('m2_k', value=0.5)
result = ModelSum.fit(y, params=pars, x=x)
print(result.fit_report())

Error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-ad2e6908430e> in <module>
     27 pars.add('m1_k', value=0.5)
     28 pars.add('m2_k', value=0.5)
---> 29 result = ModelSum.fit(y, params=pars, x=x)
     30 print(result.fit_report())

~/anaconda3/lib/python3.8/site-packages/lmfit/model.py in fit(self, data, params, weights, method, iter_cb, scale_covar, verbose, fit_kws, nan_policy, calc_covar, max_nfev, **kwargs)
   1088                              nan_policy=self.nan_policy, calc_covar=calc_covar,
   1089                              max_nfev=max_nfev, **fit_kws)
-> 1090         output.fit(data=data, weights=weights)
   1091         output.components = self.components
   1092         return output

~/anaconda3/lib/python3.8/site-packages/lmfit/model.py in fit(self, data, params, weights, method, nan_policy, **kwargs)
   1445         self.userkws.update(kwargs)
   1446         self.init_fit = self.model.eval(params=self.params, **self.userkws)
-> 1447         _ret = self.minimize(method=self.method)
   1448 
   1449         for attr in dir(_ret):

~/anaconda3/lib/python3.8/site-packages/lmfit/minimizer.py in minimize(self, method, params, **kws)
   2377                         val.lower().startswith(user_method)):
   2378                     kwargs['method'] = val
-> 2379         return function(**kwargs)
   2380 
   2381 

~/anaconda3/lib/python3.8/site-packages/lmfit/minimizer.py in leastsq(self, params, max_nfev, **kws)
   1701         result.call_kws = lskws
   1702         try:
-> 1703             lsout = scipy_leastsq(self.__residual, variables, **lskws)
   1704         except AbortFitException:
   1705             pass

~/anaconda3/lib/python3.8/site-packages/scipy/optimize/_minpack_py.py in leastsq(func, x0, args, Dfun, full_output, col_deriv, ftol, xtol, gtol, maxfev, epsfcn, factor, diag)
    411     if not isinstance(args, tuple):
    412         args = (args,)
--> 413     shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
    414     m = shape[0]
    415 

~/anaconda3/lib/python3.8/site-packages/scipy/optimize/_minpack_py.py in _check_func(checker, argname, thefunc, x0, args, numinputs, output_shape)
     24 def _check_func(checker, argname, thefunc, x0, args, numinputs,
     25                 output_shape=None):
---> 26     res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
     27     if (output_shape is not None) and (shape(res) != output_shape):
     28         if (output_shape[0] != 1):

~/anaconda3/lib/python3.8/site-packages/lmfit/minimizer.py in __residual(self, fvars, apply_bounds_transformation)
    588             raise AbortFitException(f"fit aborted: too many function evaluations {self.max_nfev}")
    589 
--> 590         out = self.userfcn(params, *self.userargs, **self.userkws)
    591 
    592         if callable(self.iter_cb):

~/anaconda3/lib/python3.8/site-packages/lmfit/model.py in _residual(self, params, data, weights, **kwargs)
    798             raise ValueError(msg)
    799 
--> 800         diff = model - data
    801 
    802         if diff.dtype == complex:

ValueError: operands could not be broadcast together with shapes (200,) (100,) 

Using a numpy.ndarray as the return type of at least one model function:

import numpy as np
from lmfit.model import Model


def lin1(x, k):
    # returns a numpy.ndarray
    return k * x


def lin2(x, k):
    # returns a plain Python list
    return [k * x1 for x1 in x]


y = np.linspace(0, 100, 100)
x = np.linspace(0, 100, 100)

Model1 = Model(lin1, independent_vars=["x"], prefix="m1_")
Model2 = Model(lin2, independent_vars=["x"], prefix="m2_")

ModelSum = Model1 + Model2
pars = ModelSum.make_params()

pars.add('m1_k', value=0.5)
pars.add('m2_k', value=0.5)
result = ModelSum.fit(y, params=pars, x=x)
print(result.fit_report())

Fit succeeds!

Version information

Python: 3.8.3 (default, Jul 2 2020, 16:21:59)
[GCC 7.3.0]

lmfit: 1.2.1, scipy: 1.10.1, numpy: 1.23.4, asteval: 0.9.29, uncertainties: 3.1.6

If you need any further information, please let me know.
Kind regards,
Julian

@newville
Member

newville commented May 4, 2023

@Julian-Hochhaus Yup, Model functions need to return ndarrays with a datatype of Float64.
I would call the error you are seeing intentional.

@Julian-Hochhaus
Contributor Author

But that is not always the case: it still works when using a single function as the model function, or when using mixed return types (numpy.ndarray and list) in a composite model. Is that because numpy takes care of the + operator as long as at least one operand is a numpy.ndarray? I assume the reason is similar for a single model function, where it does not matter whether a list or a numpy.ndarray is returned?

That's a bit confusing.
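
For reference, a quick check outside of lmfit confirms the suspicion about the + operator: as soon as one operand is an ndarray, numpy converts the list and adds element-wise.

import numpy as np

print(np.array([1.0, 2.0]) + [3.0, 4.0])  # [4. 6.], the list is coerced to an array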

@newville
Member

newville commented May 4, 2023

@Julian-Hochhaus I view this as similar to the discussion at #873. We have tried to support "array-like" data, such as lists of numbers or several third-party objects (pandas.Series, HDF5 Datasets, xarray, ....) that can be "obviously converted" into numpy ndarrays with dtype Float64.

We could try to support these better. For instance, in the case here, it would not be too hard to apply np.asfarray() to the two results (left and right) as they are combined in CompositeModel, so that your example is not adding two lists.
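
A minimal sketch of that idea (a hypothetical helper, not the actual lmfit implementation): coerce both component outputs to float64 ndarrays before applying the binary operator, so two lists are added element-wise instead of being concatenated.

import numpy as np

def combine_results(left, right, op=np.add):
    # hypothetical: convert each side to a float64 ndarray before combining
    return op(np.asfarray(left), np.asfarray(right))

print(combine_results([1.0, 2.0], [3.0, 4.0]))  # [4. 6.]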

But why would a Model function return a list? Why would someone use list comprehension instead of ndarrays and ufuncs? Do we want to permit this poor usage or flag it?

For input data as in #873, we could try to store in the ModelResult the exact object the user sent (which we do not use) as well as the np.asfarray(data) that we do actually use. I think that causes bigger headaches (is it our responsibility to guarantee that the data passed in can be serialized?). That is a separate issue, but similar to this one.

On the other hand, we could also just be less permissive. If we're thinking about adding np.asfarray() to every result because people are using list comprehension instead of ndarrays, we could instead raise TypeErrors for values that are not strictly Float64 ndarrays.
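
A sketch of the stricter alternative (the helper name is hypothetical, not lmfit API): validate each model function's return value and raise a TypeError if it is not already a float64 ndarray.

import numpy as np

def check_model_output(result, name="model function"):
    # hypothetical strict check: no silent coercion of lists or other array-likes
    if not (isinstance(result, np.ndarray) and result.dtype == np.float64):
        raise TypeError(f"{name} must return a numpy.ndarray with dtype float64, "
                        f"got {type(result).__name__}")
    return result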

@Julian-Hochhaus
Contributor Author

Thanks, I get the point. I still find it a bit unsatisfying that it works for single models but not for composite models, which made it quite hard to identify the problem when it occurred.

One of my colleagues originally ran into the problem while using lmfit and asked me for help, and I honestly needed some time to pin down the error, because the error message is not very helpful and the behavior is a bit counterintuitive.

@newville
Member

newville commented Jul 3, 2023

@Julian-Hochhaus I was hoping that #899 would have helped here, but it doesn't, at least not yet (it gets about halfway there). But we will fix this for 1.2.2.

@newville
Member

newville commented Jul 6, 2023

This should now be fixed in the master branch. A variation of your example was added as a test.

newville closed this as completed on Jul 6, 2023