CompositeModel depends on return type of model functions #875

Closed
Julian-Hochhaus opened this issue May 4, 2023 · 6 comments

Comments

@Julian-Hochhaus
Contributor

Description

When using a single model, whether the model function returns a numpy.ndarray or a list does not affect the result. However, when using a CompositeModel that combines two models (e.g. via the + operator), fitting fails if both model functions return lists. In contrast, if at least one of the model functions returns a numpy.ndarray, the fit succeeds.

The error is raised in the _residual() method of the Model class while computing diff = model - data, where model is the output of eval(). The problem appears to be that model is twice as long as data (and as each model function's output) when both model functions return lists. The bug is likely caused by the two lists being concatenated instead of added element-wise during CompositeModel evaluation.
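
The length doubling can be seen with plain Python (a small illustration, not lmfit code): adding two lists concatenates them, while adding two numpy arrays is element-wise.

import numpy as np

print([1.0, 2.0] + [3.0, 4.0])                      # [1.0, 2.0, 3.0, 4.0] -> length 4
print(np.array([1.0, 2.0]) + np.array([3.0, 4.0]))  # [4. 6.]              -> length 2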

Using lists as the return type of both model functions:

import numpy as np
from lmfit.model import Model


def lin1(x, k):
    # returns a plain Python list
    return [k * x1 for x1 in x]


def lin2(x, k):
    # returns a plain Python list
    return [k * x1 for x1 in x]


y = np.linspace(0, 100, 100)
x = np.linspace(0, 100, 100)

Model1 = Model(lin1, independent_vars=["x"], prefix="m1_")
Model2 = Model(lin2, independent_vars=["x"], prefix="m2_")

ModelSum = Model1 + Model2
pars = ModelSum.make_params()

pars.add('m1_k', value=0.5)
pars.add('m2_k', value=0.5)
result = ModelSum.fit(y, params=pars, x=x)
print(result.fit_report())

Error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-ad2e6908430e> in <module>
     27 pars.add('m1_k', value=0.5)
     28 pars.add('m2_k', value=0.5)
---> 29 result = ModelSum.fit(y, params=pars, x=x)
     30 print(result.fit_report())

~/anaconda3/lib/python3.8/site-packages/lmfit/model.py in fit(self, data, params, weights, method, iter_cb, scale_covar, verbose, fit_kws, nan_policy, calc_covar, max_nfev, **kwargs)
   1088                              nan_policy=self.nan_policy, calc_covar=calc_covar,
   1089                              max_nfev=max_nfev, **fit_kws)
-> 1090         output.fit(data=data, weights=weights)
   1091         output.components = self.components
   1092         return output

~/anaconda3/lib/python3.8/site-packages/lmfit/model.py in fit(self, data, params, weights, method, nan_policy, **kwargs)
   1445         self.userkws.update(kwargs)
   1446         self.init_fit = self.model.eval(params=self.params, **self.userkws)
-> 1447         _ret = self.minimize(method=self.method)
   1448 
   1449         for attr in dir(_ret):

~/anaconda3/lib/python3.8/site-packages/lmfit/minimizer.py in minimize(self, method, params, **kws)
   2377                         val.lower().startswith(user_method)):
   2378                     kwargs['method'] = val
-> 2379         return function(**kwargs)
   2380 
   2381 

~/anaconda3/lib/python3.8/site-packages/lmfit/minimizer.py in leastsq(self, params, max_nfev, **kws)
   1701         result.call_kws = lskws
   1702         try:
-> 1703             lsout = scipy_leastsq(self.__residual, variables, **lskws)
   1704         except AbortFitException:
   1705             pass

~/anaconda3/lib/python3.8/site-packages/scipy/optimize/_minpack_py.py in leastsq(func, x0, args, Dfun, full_output, col_deriv, ftol, xtol, gtol, maxfev, epsfcn, factor, diag)
    411     if not isinstance(args, tuple):
    412         args = (args,)
--> 413     shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
    414     m = shape[0]
    415 

~/anaconda3/lib/python3.8/site-packages/scipy/optimize/_minpack_py.py in _check_func(checker, argname, thefunc, x0, args, numinputs, output_shape)
     24 def _check_func(checker, argname, thefunc, x0, args, numinputs,
     25                 output_shape=None):
---> 26     res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
     27     if (output_shape is not None) and (shape(res) != output_shape):
     28         if (output_shape[0] != 1):

~/anaconda3/lib/python3.8/site-packages/lmfit/minimizer.py in __residual(self, fvars, apply_bounds_transformation)
    588             raise AbortFitException(f"fit aborted: too many function evaluations {self.max_nfev}")
    589 
--> 590         out = self.userfcn(params, *self.userargs, **self.userkws)
    591 
    592         if callable(self.iter_cb):

~/anaconda3/lib/python3.8/site-packages/lmfit/model.py in _residual(self, params, data, weights, **kwargs)
    798             raise ValueError(msg)
    799 
--> 800         diff = model - data
    801 
    802         if diff.dtype == complex:

ValueError: operands could not be broadcast together with shapes (200,) (100,) 

Using a numpy.ndarray as the return type of at least one model function:

import numpy as np
from lmfit.model import Model


def lin1(x, k):
    # returns a numpy.ndarray
    return k * x


def lin2(x, k):
    # returns a plain Python list
    return [k * x1 for x1 in x]


y = np.linspace(0, 100, 100)
x = np.linspace(0, 100, 100)

Model1 = Model(lin1, independent_vars=["x"], prefix="m1_")
Model2 = Model(lin2, independent_vars=["x"], prefix="m2_")

ModelSum = Model1 + Model2
pars = ModelSum.make_params()

pars.add('m1_k', value=0.5)
pars.add('m2_k', value=0.5)
result = ModelSum.fit(y, params=pars, x=x)
print(result.fit_report())

Fit succeeds!

Version information

Python: 3.8.3 (default, Jul 2 2020, 16:21:59)
[GCC 7.3.0]

lmfit: 1.2.1, scipy: 1.10.1, numpy: 1.23.4, asteval: 0.9.29, uncertainties: 3.1.6

If you need any further information, please let me know.
Kind regards,
Julian

@newville
Member

newville commented May 4, 2023

@Julian-Hochhaus Yup, Model functions need to return ndarrays with a datatype of Float64.
I would call the error you are seeing intentional.

@Julian-Hochhaus
Contributor Author

But that is not always the case: it still works when using a single function as the model function, or when using mixed return types (numpy.ndarray and list) in a composite model. Is that because numpy takes care of the + operator as long as at least one operand is a numpy.ndarray? I assume the reason is similar for a single model function, where it does not matter whether a list or a numpy.ndarray is returned?

That's a bit confusing.
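
For reference, a quick check outside of lmfit confirms the suspicion about the + operator: as soon as one operand is an ndarray, numpy converts the list and adds element-wise.

import numpy as np

print(np.array([1.0, 2.0]) + [3.0, 4.0])  # [4. 6.], the list is coerced to an array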

@newville
Member

newville commented May 4, 2023

@Julian-Hochhaus I view this as similar to the discussion at #873. We have tried to support "array-like" data, such as lists of numbers or several third-party objects (pandas.Series, HDF5 Datasets, xarray, ....) that can be "obviously converted" into numpy ndarrays with dtype Float64.

We could try to support these better. For instance, in the case here, it would not be too hard to apply np.asfarray() to the two results (left and right) as they are combined in CompositeModel, so that your example is not adding two lists.
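
A minimal sketch of that idea (a hypothetical helper, not the actual lmfit implementation): coerce both component outputs to float64 ndarrays before applying the binary operator, so two lists are added element-wise instead of being concatenated.

import numpy as np

def combine_results(left, right, op=np.add):
    # hypothetical: convert each side to a float64 ndarray before combining
    return op(np.asfarray(left), np.asfarray(right))

print(combine_results([1.0, 2.0], [3.0, 4.0]))  # [4. 6.]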

But why would a Model function return a list? Why would someone use list comprehension instead of ndarrays and ufuncs? Do we want to permit this poor usage or flag it?

For input data as in #873, we could try to store in the ModelResult the exact object the user sent (which we do not use) as well as the np.asfarray(data) that we do actually use. I think that causes bigger headaches (is it our responsibility to guarantee that the data passed in can be serialized?). That is a separate issue, but similar to this one.

On the other hand, we could also just be less permissive. If we're thinking about adding np.asfarray() to every result because people are using list comprehension instead of ndarrays, we could instead raise TypeErrors for values that are not strictly Float64 ndarrays.
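
A sketch of the stricter alternative (the helper name is hypothetical, not lmfit API): validate each model function's return value and raise a TypeError if it is not already a float64 ndarray.

import numpy as np

def check_model_output(result, name="model function"):
    # hypothetical strict check: no silent coercion of lists or other array-likes
    if not (isinstance(result, np.ndarray) and result.dtype == np.float64):
        raise TypeError(f"{name} must return a numpy.ndarray with dtype float64, "
                        f"got {type(result).__name__}")
    return result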

@Julian-Hochhaus
Contributor Author

Thanks, I get the point. I still find it a bit unsatisfying that it works for single models but not for composite models, which made it quite hard to identify the problem when it occurred.

One of my colleagues originally ran into the problem while using lmfit and asked me for help, and I honestly needed some time to pin down the error, because the error message is not very helpful and the behavior is a bit counterintuitive.

@newville
Member

newville commented Jul 3, 2023

@Julian-Hochhaus I was hoping that #899 would have helped here, but it doesn't, at least not yet (it gets about halfway there). But we will fix this for 1.2.2.

@newville
Member

newville commented Jul 6, 2023

This should now be fixed in the master branch. A variation of your example was added as a test.

newville closed this as completed on Jul 6, 2023