In [1]:
%matplotlib notebook
from pylab import *



In [2]:
import lmfit as lm
from lmfit import (Model,                  # create a model from a python function
                   models,                 # models for the most common functions 
                   conf_interval,          # robust confidence interval calculation 
                   report_fit, report_ci)  # pretty-print report fit summaries



# Simulate some data

In [3]:
real_mean = 1
real_sigma = 1
num_points = 1000

data = real_sigma * randn(num_points) + real_mean

In [4]:
bins = linspace(-3, 5, 40)
counts, _ = histogram(data, bins)
x = bins[:-1] + 0.5*(bins[1] - bins[0])
y = counts
hist(data, bins)
plot(x, y, 'o');


### Intent and caveat

We are going to fit a Gaussian to the $(x, y)$ values by curve fitting, i.e. least-squares minimization of the difference `model - data`. This is adequate as a first-order approximation and for the purpose of this example. However, it assumes:

$$ y = f(x) + \epsilon$$

where $f(x)$ is a deterministic model and $\epsilon$ is a normally distributed random variable.

In this simulation, however, the $y$ values are the counts of a histogram, and each count follows a Poisson distribution (not a Gaussian), so the initial assumption does not strictly hold. The model here is very simple and the fit will be good anyway, but in other cases (e.g. an exponential fit) the difference may be important.
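
A small simulation makes the caveat concrete. The sketch below is not part of the original analysis: it repeats the histogram many times using NumPy alone (the names `n_repeats` and `all_counts`, and the seed, are illustrative assumptions) and compares the per-bin mean and variance of the counts. For Poisson-like counts the two should be roughly equal, so bins near the peak scatter much more than bins in the tails, unlike the constant-variance noise that plain least squares assumes.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed, only for reproducibility
real_mean, real_sigma, num_points = 1, 1, 1000
bins = np.linspace(-3, 5, 40)
n_repeats = 2000                         # hypothetical number of repeated experiments

# One histogram per simulated experiment: shape (n_repeats, 39)
all_counts = np.array([
    np.histogram(real_sigma * rng.standard_normal(num_points) + real_mean, bins)[0]
    for _ in range(n_repeats)
])

bin_mean = all_counts.mean(axis=0)
bin_var = all_counts.var(axis=0, ddof=1)

# Look only at well-populated bins, where the ratio is meaningful
populated = bin_mean > 5
ratio = bin_var[populated] / bin_mean[populated]
print(ratio.round(2))                    # values close to 1 indicate Poisson-like scatter
```

When this matters, one common approach is to weight the residuals by the inverse of the estimated standard deviation (for example `1/sqrt(counts)` on non-empty bins), which `Model.fit` accepts through its `weights` argument.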

## Define the model
LMFit documentation:

- [Modeling Data and Curve Fitting](http://cars9.uchicago.edu/software/python/lmfit/model.html)

In [5]:
def gauss(x, mean=0, sigma=1, ampl=1):
    return ampl * exp((-(x - mean)**2) / (2 * sigma**2))

In [6]:
model = lm.Model(gauss)

**Note:**

- `Model` uses the default argument values of the function as initial values for the fit
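
To illustrate where those initial values come from, the sketch below reads the keyword defaults of `gauss` with the standard-library `inspect` module. This only shows the general idea of introspecting a function signature; it is an assumption that lmfit does something similar internally, and its actual implementation may differ.

```python
import inspect
from math import exp

def gauss(x, mean=0, sigma=1, ampl=1):
    return ampl * exp(-(x - mean)**2 / (2 * sigma**2))

# Collect every argument that has a default value in the signature.
# `x` has no default, so it is left out (it plays the role of the
# independent variable).
signature = inspect.signature(gauss)
defaults = {
    name: param.default
    for name, param in signature.parameters.items()
    if param.default is not inspect.Parameter.empty
}
print(defaults)  # {'mean': 0, 'sigma': 1, 'ampl': 1}
```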

In [7]:
model

<lmfit.Model: Model(gauss)>

In [8]:
model.param_names

{'ampl', 'mean', 'sigma'}

Model parameters can have constraints (e.g. bounds: `min` and/or `max`):

In [9]:
model.set_param_hint('sigma', min=0)

## Fitting model to data

This fits the defined model to the data:

In [10]:
fit_res = model.fit(y, x=x)

> **NOTE** You can optionally pass initial values for the fitted parameters
> (e.g. `model.fit(y, x=x, mean=2)`). When no initial values are specified,
> they are taken from the default values in the function definition (e.g. `gauss`).
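
When the defaults are far from the data, starting values can be estimated from the histogram itself. The following is a minimal sketch, not something the original notebook does: it regenerates comparable data with NumPy (the seed and the names `ampl0`, `mean0`, `sigma0` are illustrative assumptions) and derives rough guesses from the peak height and the count-weighted moments of the bin centers.

```python
import numpy as np

rng = np.random.default_rng(1)                 # fixed seed, only for reproducibility
data = 1 * rng.standard_normal(1000) + 1       # same mean/sigma as the simulation above
bins = np.linspace(-3, 5, 40)
y, _ = np.histogram(data, bins)
x = bins[:-1] + 0.5 * (bins[1] - bins[0])      # bin centers

# Rough, data-driven starting values
ampl0 = y.max()                                             # peak height
mean0 = np.average(x, weights=y)                            # count-weighted mean
sigma0 = np.sqrt(np.average((x - mean0)**2, weights=y))     # count-weighted std
print(ampl0, mean0, sigma0)
```

These guesses could then be passed to the fit as keyword arguments, e.g. `model.fit(y, x=x, ampl=ampl0, mean=mean0, sigma=sigma0)`.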

In [11]:
lm.report_fit(fit_res)

[[Fit Statistics]]
    # function evals   = 99
    # data points      = 39
    # variables        = 3
    chi-square         = 1310.110
    reduced chi-square = 36.392
[[Variables]]
    sigma:   1.03452073 +/- 0.036773 (3.55%) (init= 1)
    ampl:    80.2724385 +/- 2.471227 (3.08%) (init= 1)
    mean:    1.00187550 +/- 0.036773 (3.67%) (init= 0)
[[Correlations]] (unreported correlations are <  0.100)
    C(sigma, ampl)               = -0.577 


The reduced $\chi^2$ is available as the `redchi` attribute of the fit result:

In [12]:
fit_res.redchi

36.391948056456762
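
As a cross-check, the reduced $\chi^2$ is the $\chi^2$ divided by the number of degrees of freedom (data points minus fitted variables). The short sketch below recomputes it from the rounded numbers printed in the fit report; the variable names are illustrative.

```python
# Values copied from the [[Fit Statistics]] section of the report
chisqr = 1310.110   # chi-square (rounded in the report)
ndata = 39          # number of data points
nvarys = 3          # number of fitted variables

nfree = ndata - nvarys        # degrees of freedom
redchi = chisqr / nfree
print(round(redchi, 3))       # 36.392
```

Because no weights were passed to the fit, the $\chi^2$ here is presumably the plain sum of squared residuals in units of counts squared, so a value far from 1 does not by itself indicate a poor fit.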

In [13]:
fig = fit_res.plot()


In [14]:
plt.ylabel('pluto')

<matplotlib.text.Text at 0x1eed4710>

## Robust confidence intervals

> The lmfit confidence module allows you to explicitly calculate confidence intervals for variable parameters. For most models, it is not necessary: the estimation of the standard error from the estimated covariance matrix is normally quite good.

> But for some models, e.g. a sum of two exponentials, the approximation begins to fail. For this case, lmfit has the function conf_interval() to calculate confidence intervals directly. This is substantially slower than using the errors estimated from the covariance matrix, but the results are more robust.

*From [Calculation of confidence intervals](http://cars9.uchicago.edu/software/python/lmfit/confidence.html)*
*in LmFit Documentation.*

In [15]:
ci = lm.conf_interval(fit_res)

In [16]:
lm.report_ci(ci)

         99.70%    95.00%    67.40%     0.00%    67.40%    95.00%    99.70%
sigma   0.92856   0.96573   1.00015   1.03452   1.07013   1.10849   1.15325
 ampl  72.63371  75.38154  77.85899  81.70895  82.70835  85.25869  88.13957
 mean   0.88471   0.92730   0.96525   1.00062   1.03855   1.07670   1.11958
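
To see how the two error estimates compare for this model, the sketch below places the covariance-based interval (best value plus or minus one standard error, from `report_fit`) next to the 67.40% bounds from `report_ci`. All numbers are copied from the outputs above; treating the 67.40% columns as roughly a one-sigma interval is an assumption made for this comparison.

```python
# (best-fit value, standard error) copied from the report_fit output
best = {
    'sigma': (1.03452073, 0.036773),
    'ampl':  (80.2724385, 2.471227),
    'mean':  (1.00187550, 0.036773),
}
# (lower, upper) bounds copied from the 67.40% columns of report_ci
ci_1sigma = {
    'sigma': (1.00015, 1.07013),
    'ampl':  (77.85899, 82.70835),
    'mean':  (0.96525, 1.03855),
}

cov_interval = {}
for name, (value, stderr) in best.items():
    cov_interval[name] = (value - stderr, value + stderr)
    lo, hi = cov_interval[name]
    ci_lo, ci_hi = ci_1sigma[name]
    print(f"{name:>5}: covariance [{lo:.4f}, {hi:.4f}]  "
          f"conf_interval [{ci_lo:.4f}, {ci_hi:.4f}]")
```

For this simple Gaussian the two sets of bounds agree closely, which is consistent with the documentation's remark that the covariance-based errors are usually adequate.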
