fix aic/bic calculation #263

dbrakenhoff · 2021-01-13T17:03:38Z

Short Description

Fix number of parameters: nparam is number of parameters used in optimization
use noise instead of residuals for calculation

Checklist before PR can be merged:

- nparam is number of parameters where vary == True
- use noise instead of residuals
- deal with larger values for AIC/BIC
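
The parameter count in the checklist can be illustrated with a minimal sketch. It assumes the parameters live in a pandas DataFrame with a boolean `vary` column; the table contents and the helper name `count_free_parameters` are hypothetical and only serve the example.

```python
import pandas as pd

# Hypothetical parameter table; only the "vary" column matters here.
parameters = pd.DataFrame(
    {
        "initial": [1.0, 10.0, 0.5, 15.0],
        "vary": [True, True, False, True],
    },
    index=["gain", "a", "n", "noise_alpha"],
)


def count_free_parameters(parameters: pd.DataFrame) -> int:
    """Count only the parameters the optimizer is allowed to change."""
    return int(parameters["vary"].astype(bool).sum())


# Fixed parameters (vary == False) do not add to the AIC/BIC penalty.
nparam = count_free_parameters(parameters)
```

Counting fixed parameters as well would inflate the penalty term for models that hold some parameters constant.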

dbrakenhoff · 2021-02-24T17:45:27Z

I've looked into this again and I think the current implementation of the AIC/BIC calculation is incorrect. This might also explain why these statistics were yielding unexpected results, e.g. a model that fits much better after adding a missing variable still getting a higher AIC than the model without that variable. Basically, the penalty for extra parameters was outweighing the calculated increase in likelihood.

Based on this entry in the Wikipedia article about the AIC, I propose the following change:

Current implementation:

-2.0 * log((res ** 2.0).sum()) + 2.0 * nparam

Proposed implementation:

res.index.size * log((res ** 2.0).sum()) + 2.0 * nparam
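
To make the difference concrete, here is a small sketch that evaluates both expressions on synthetic residuals. The function names `aic_current` and `aic_proposed` and the data are made up for illustration; only the two formulas come from the lines above. Note that in the current form the first term grows as the sum of squares shrinks, so a better fit is never rewarded.

```python
import numpy as np
import pandas as pd


def aic_current(res: pd.Series, nparam: int) -> float:
    """Current expression: the fit term does not scale with the number of observations."""
    return -2.0 * np.log((res ** 2.0).sum()) + 2.0 * nparam


def aic_proposed(res: pd.Series, nparam: int) -> float:
    """Proposed expression: n * log(sum of squares) plus the parameter penalty."""
    return res.index.size * np.log((res ** 2.0).sum()) + 2.0 * nparam


# Synthetic residuals (illustrative only): the "good" model has residuals half
# the size of the "poor" model but uses one extra parameter.
rng = np.random.default_rng(0)
index = pd.date_range("2000-01-01", periods=500, freq="D")
poor = pd.Series(rng.normal(0.0, 1.0, 500), index=index)
good = 0.5 * poor

# Current form: the better-fitting model ends up with the higher AIC.
print(aic_current(poor, 3), aic_current(good, 4))
# Proposed form: the better-fitting model ends up with the lower AIC.
print(aic_proposed(poor, 3), aic_proposed(good, 4))
```

For a fixed number of observations n, the proposed form differs from n * log(RSS / n) + 2 * nparam only by the constant n * log(n), so model rankings on the same data are unaffected by that choice.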

Furthermore, as discussed before, the AIC/BIC are calculated using the noise (uncorrelated residuals). That leads to the question of what to return for these statistics when no noisemodel is present: do we use the residuals, or do we return NaN for the AIC/BIC? I've now opted for the former, still calculating a value, but formally it can only be interpreted if the residuals meet the statistical criteria (no autocorrelation, homoscedasticity, etc.). I think this makes sense, as the use of a noisemodel is not necessarily a guarantee that these criteria are met either. The fit report currently sets the AIC/BIC to NaN if noise=False.
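
The fallback described above could look roughly like the following sketch. The names `series_for_criteria` and `aic` and their signatures are assumptions for illustration, not the code in this PR; the AIC expression is the proposed one from earlier in this comment.

```python
from typing import Optional

import numpy as np
import pandas as pd


def series_for_criteria(
    residuals: pd.Series, noise: Optional[pd.Series] = None
) -> pd.Series:
    """Prefer the noise; fall back to the residuals when there is no noise model."""
    if noise is not None and not noise.empty:
        return noise
    return residuals


def aic(residuals: pd.Series, nparam: int, noise: Optional[pd.Series] = None) -> float:
    """Proposed AIC expression, evaluated on the noise when it is available."""
    series = series_for_criteria(residuals, noise)
    return series.index.size * np.log((series ** 2.0).sum()) + 2.0 * nparam


residuals = pd.Series([1.0, -1.0, 2.0])
noise = pd.Series([0.5, -0.5, 1.0])

aic_without_noisemodel = aic(residuals, nparam=2)        # uses the residuals
aic_with_noisemodel = aic(residuals, nparam=2, noise=noise)  # uses the noise
```

Returning a value in both cases keeps the statistic available for every model, at the cost of leaving it to the user to check that the series actually meets the statistical criteria.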

This change also leads to larger AIC/BIC values, so the fit_report is also changed slightly to accommodate the larger values.

raoulcollenteur · 2021-02-26T17:32:47Z

I checked and this is the same as what lmfit does, so I think that makes extra sense:

https://github.com/lmfit/lmfit-py/blob/b7d458b548088a8a27d5ddf3c44ff8b224304172/lmfit/minimizer.py#L379

Merging.

dbrakenhoff added 6 commits January 13, 2021 17:56

fix aic/bic calculation

a4c71db

- nparam is number of parameters where vary == True
- use noise instead of residuals

add encoding to read_waterbase for pandas 1.2

2935a0c

Merge remote-tracking branch 'origin/dev' into aicbic

a6d5e44

change calculation of AIC/BIC

daa2636

improve fit report

de4e2dc

deal with larger values for AIC/BIC

add logic to use noise or residuals in AIC/BIC

f81587b

dbrakenhoff requested a review from raoulcollenteur February 24, 2021 22:14

raoulcollenteur closed this Feb 26, 2021

raoulcollenteur reopened this Feb 26, 2021

raoulcollenteur merged commit a793a79 into dev Feb 26, 2021

raoulcollenteur deleted the aicbic branch March 15, 2021 06:48
