# Curve fitting

Curve fitting in ERLabPy largely relies on [lmfit](https://lmfit.github.io/lmfit-py/), a flexible curve fitting library for Python, and [xarray-lmfit](https://xarray-lmfit.readthedocs.io/stable/), a compatibility layer between xarray objects and lmfit models.

ERLabPy also provides optional integration of lmfit models with [iminuit](https://github.com/scikit-hep/iminuit), which is a Python interface to the [Minuit C++ library](https://root.cern.ch/doc/master/Minuit2Page.html) developed at CERN.

:::{note}

If you are new to [lmfit](https://lmfit.github.io/lmfit-py/) or [xarray-lmfit](https://xarray-lmfit.readthedocs.io/stable/), visit the [lmfit documentation](https://lmfit.github.io/lmfit-py/fitting.html) and the [xarray-lmfit user guide](https://xarray-lmfit.readthedocs.io/stable/user-guide/) first!

:::

In this tutorial, we begin with some convenient functions that ERLabPy provides for common tasks such as Fermi edge fitting. Next, we will introduce some models that are available in ERLabPy. Finally, we will show how to use [iminuit](https://github.com/scikit-hep/iminuit) with lmfit models.


In [None]:
import lmfit
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

import erlab.analysis as era
import erlab.plotting as eplt
from erlab.io.exampledata import generate_gold_edge

In [None]:
%config InlineBackend.figure_formats = ["svg", "pdf"]
plt.rcParams["figure.constrained_layout.use"] = True
plt.rcParams["figure.dpi"] = 96
plt.rcParams["image.cmap"] = "viridis"
plt.rcParams["figure.figsize"] = eplt.figwh(wscale=1.2, fixed_height=False)

(fermi edge fitting)=

## Fermi edge fitting

Functions related to the Fermi edge are available in {mod}`erlab.analysis.gold`. To fit a polynomial to a Fermi edge, you can use {func}`erlab.analysis.gold.poly`.


:::{hint}

The interactive Fermi edge fitting tool {func}`erlab.interactive.goldtool` can be used to generate the code below interactively.

:::

In [None]:
gold = generate_gold_edge(temp=100, seed=1)

result = era.gold.poly(
    gold,
    angle_range=(-15, 15),
    eV_range=(-0.2, 0.2),
    temp=100.0,
    vary_temp=False,
    bkg_slope=False,
    degree=2,
    plot=True,
)

The resulting polynomial can be used to correct the Fermi edge with {func}`erlab.analysis.gold.correct_with_edge`:

In [None]:
corrected = era.gold.correct_with_edge(gold, result)

corrected.qplot(cmap="Greys")  # Plot the corrected data
eplt.fermiline()  # Annotate the Fermi level
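
Conceptually, the two steps above amount to locating the edge in every EDC, fitting a polynomial to the edge positions, and shifting each EDC by the fitted value. The NumPy sketch below illustrates that idea on synthetic data. It is not how ERLabPy implements `poly` or `correct_with_edge`; the curvature, the edge width, and the half-maximum edge estimate are all assumptions chosen for illustration.

```python
import numpy as np

# Synthetic cut: a step edge whose position curves quadratically with angle.
alpha = np.linspace(-15, 15, 31)      # angle axis (degrees)
eV = np.linspace(-0.2, 0.2, 401)      # energy axis (eV)
true_center = 1e-4 * alpha**2 - 0.01  # hypothetical edge curvature (eV)
width = 0.01                          # hypothetical edge width (eV)
cut = 1 / (np.exp((eV[:, None] - true_center[None, :]) / width) + 1)

# Step 1: estimate the edge of every EDC as its half-maximum crossing.
# The intensity decreases with energy, so both arrays are reversed to give
# np.interp the increasing sample points it expects.
centers = np.array([np.interp(0.5, edc[::-1], eV[::-1]) for edc in cut.T])

# Step 2: fit a second-degree polynomial to the edge positions.
coeffs = np.polyfit(alpha, centers, deg=2)
edge = np.polyval(coeffs, alpha)

# Step 3: resample every EDC so that the fitted edge lands at zero energy.
corrected = np.stack(
    [np.interp(eV + shift, eV, edc) for shift, edc in zip(edge, cut.T)], axis=1
)

print("quadratic coefficient:", coeffs[0])
```

After the shift, every column of `corrected` crosses half maximum at zero energy, which is the flat edge seen in the corrected plot.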

(pre-defined-models)=

## Pre-defined models

Creating composite models with different prefixes every time can be cumbersome, so ERLabPy provides some pre-defined models in {mod}`erlab.analysis.fit.models`.

Before fitting, let us generate a Gaussian peak on a linear background:

In [None]:
# Generate toy data
x = np.linspace(0, 10, 50)
y = -0.1 * x + 2 + 3 * np.exp(-((x - 5) ** 2) / (2 * 1**2))

# Add some noise with fixed seed for reproducibility
rng = np.random.default_rng(5)
yerr = np.full_like(x, 0.3)
y = rng.normal(y, yerr)

# Plot the data
plt.errorbar(x, y, yerr, fmt="o")

### Fitting multiple peaks

One example is {class}`MultiPeakModel <erlab.analysis.fit.models.MultiPeakModel>`, which works like a composite model of multiple Gaussian or Lorentzian peaks.

By supplying keyword arguments, you can specify the number of peaks, their shapes, whether to multiply with a Fermi-Dirac distribution, the shape of the background, and whether to convolve the result with experimental resolution.

For a detailed explanation of all the arguments, see its {class}`documentation <erlab.analysis.fit.models.MultiPeakModel>`.

The model can be constructed as follows:

In [None]:
model = era.fit.models.MultiPeakModel(
    npeaks=1, peak_shapes=["gaussian"], fd=False, background="linear", convolve=False
)
params = model.make_params(p0_center=5.0, p0_width=0.2, p0_height=3.0)
params

We can now fit the model to the toy data:

In [None]:
result = model.fit(y, x=x, params=params, weights=1 / yerr)
_ = result.plot()
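
The `weights=1 / yerr` argument matters because lmfit multiplies the residual by the weights before squaring and summing it, so this choice makes the minimized quantity the usual chi-square. The NumPy sketch below reproduces that arithmetic on the same toy data. To keep it short, the peak position and width are held at their true values, which is a simplification made here and not something the fit above does.

```python
import numpy as np

# Same toy data as above: a Gaussian peak on a linear background.
x = np.linspace(0, 10, 50)
rng = np.random.default_rng(5)
yerr = np.full_like(x, 0.3)
y = rng.normal(-0.1 * x + 2 + 3 * np.exp(-((x - 5) ** 2) / 2), yerr)

# With center and width fixed (illustration only), the model is linear in
# (height, slope, intercept) and can be written as a design matrix.
design = np.column_stack([np.exp(-((x - 5) ** 2) / 2), x, np.ones_like(x)])
weights = 1 / yerr

# Weighted least squares: scale both sides by the weights, then solve.
coef, *_ = np.linalg.lstsq(design * weights[:, None], y * weights, rcond=None)
height, slope, intercept = coef

residual = (y - design @ coef) * weights  # weighted residual
chisqr = np.sum(residual**2)              # chi-square
redchi = chisqr / (x.size - coef.size)    # divided by the degrees of freedom
print(f"height={height:.2f}, chi-square={chisqr:.1f}, reduced={redchi:.2f}")
```

A reduced chi-square near 1 indicates that the scatter of the data about the model is consistent with the stated uncertainties.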

We can also plot the individual components of the fit:

In [None]:
comps = result.eval_components()
plt.errorbar(x, y, yerr, fmt="o", zorder=-1, alpha=0.3)
plt.plot(x, result.eval(), label="Best fit")
plt.plot(x, comps["1Peak_p0"], "--", label="Peak")
plt.plot(x, comps["1Peak_bkg"], "--", label="Background")
plt.legend()

Now, let us try fitting MDCs cut from simulated data with multiple Lorentzian peaks, convolved with a common instrumental resolution.

In [None]:
from erlab.io.exampledata import generate_data

data = generate_data(seed=1).T
cut = data.qsel(ky=0.3)
cut.qplot(colorbar=True)

In [None]:
mdc = cut.qsel(eV=0.0)
mdc.qplot()

First, we define the model and set the initial parameters.

In [None]:
model = era.fit.models.MultiPeakModel(
    npeaks=2, peak_shapes=["lorentzian"], fd=False, background="linear", convolve=True
)

params = model.make_params(
    p0_center=-0.5,
    p1_center=0.5,
    p0_width=0.03,
    p1_width=0.03,
    lin_bkg={"value": 0.0, "vary": False},
    const_bkg=0.0,
    resolution=0.03,
)
params

Then, we can fit the model to the data using {meth}`xarray.DataArray.xlm.modelfit` from {mod}`xarray-lmfit`:

In [None]:
result = mdc.xlm.modelfit("kx", model=model, params=params)
_ = result.modelfit_results.item().plot()
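
To see what convolving with a common resolution does to the line shape, the NumPy sketch below broadens two Lorentzian peaks with a normalized Gaussian kernel. It assumes that the resolution is the full width at half maximum of a Gaussian, and it uses a plain discrete convolution; neither assumption is taken from ERLabPy's implementation, and the helper functions are hypothetical.

```python
import numpy as np


def lorentzian(x, center, width, height):
    """Lorentzian with full width at half maximum `width` and peak value `height`."""
    hwhm = width / 2
    return height * hwhm**2 / ((x - center) ** 2 + hwhm**2)


def gaussian_kernel(step, fwhm, nsigma=5):
    """Unit-sum Gaussian kernel sampled on a grid with spacing `step`."""
    sigma = fwhm / (2 * np.sqrt(2 * np.log(2)))
    half = int(np.ceil(nsigma * sigma / step))
    t = np.arange(-half, half + 1) * step
    kernel = np.exp(-(t**2) / (2 * sigma**2))
    return kernel / kernel.sum()


kx = np.linspace(-1, 1, 2001)
step = kx[1] - kx[0]

# Two intrinsic peaks that share one instrumental resolution.
ideal = lorentzian(kx, -0.5, 0.03, 1.0) + lorentzian(kx, 0.5, 0.03, 1.0)
broadened = np.convolve(ideal, gaussian_kernel(step, fwhm=0.03), mode="same")

print("peak height before and after:", ideal.max(), broadened.max())
```

The convolution lowers and widens both peaks while leaving their positions and integrated weight essentially unchanged, which is why the intrinsic widths and the resolution can be hard to separate in a fit unless one of them is constrained.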

## Fitting across multiple dimensions

:::{note}

There is a dedicated module for Fermi edge fitting and correction, described [here](fermi edge fitting). The following example is for illustrative purposes.

:::

Suppose you have to fit a single model to multiple data points across some dimension, or even multiple dimensions. `xarray-lmfit` can handle this with ease.

Let's demonstrate this with a simulated cut that represents a curved Fermi edge at 100 K, with an energy broadening of 20 meV.

In [None]:
from erlab.io.exampledata import generate_gold_edge

gold = generate_gold_edge(temp=100, Eres=0.02, seed=1)
gold.qplot(cmap="Greys")

We first select ±0.2 eV around the Fermi level and fit the model along the energy axis for every EDC.

In [None]:
gold_selected = gold.sel(eV=slice(-0.2, 0.2))
result_ds = gold_selected.xlm.modelfit(
    "eV",
    era.fit.models.FermiEdgeModel(),
    params={
        "temp": {"value": 100.0, "vary": False},
        "back1": {"value": 0.0, "vary": False},
    },
    guess=True,
)
result_ds

Notice how the data variables in the resulting Dataset now depend on the coordinate
`alpha`. Let's plot the center of the edge as a function of angle!

In [None]:
gold.qplot(cmap="Greys")
plt.errorbar(
    gold_selected.alpha,
    result_ds.modelfit_coefficients.sel(param="center"),
    result_ds.modelfit_stderr.sel(param="center"),
    fmt=".",
)
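
Holding `temp` fixed at 100 K in the fit above pins the thermal broadening of the edge. The sketch below evaluates a bare Fermi-Dirac function to show how wide that broadening is. The `fermi_dirac` helper is defined here for illustration only; `FermiEdgeModel` may contain further terms, such as the background parameters set above.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def fermi_dirac(eV, center, temp):
    """Fermi-Dirac occupation for energies in eV and a temperature in kelvin."""
    return 1 / (np.exp((eV - center) / (K_B * temp)) + 1)


temp = 100.0
kT = K_B * temp                   # thermal energy, about 8.6 meV at 100 K
width_10_90 = 2 * np.log(9) * kT  # energy span between 90 % and 10 % occupation

eV = np.linspace(-0.2, 0.2, 401)
occupation = fermi_dirac(eV, center=0.0, temp=temp)
print(f"kT = {kT * 1e3:.2f} meV, 10-90 % width = {width_10_90 * 1e3:.1f} meV")
```

With the temperature fixed, the shape of the step is set by temperature alone, so the fit mainly has to locate the edge center.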

## Fitting multidimensional models

Fitting is not limited to 1D models. The following example demonstrates a global fit to the cut with a multidimensional model. First, we normalize the data with the averaged intensity of each EDC and then fit the data to {class}`FermiEdge2dModel <erlab.analysis.fit.models.FermiEdge2dModel>`.

In [None]:
gold_norm = gold_selected / gold_selected.mean("eV")
result_2d = gold_norm.T.xlm.modelfit(
    coords=["eV", "alpha"],
    model=era.fit.models.FermiEdge2dModel(),
    params={"temp": {"value": 100.0, "vary": False}},
    guess=True,
)
result_2d

Let's plot the fit results and the residuals.

In [None]:
best_fit = result_2d.modelfit_best_fit.transpose(*gold_norm.dims)

fig, axs = eplt.plot_slices(
    [gold_norm, best_fit, best_fit - gold_norm],
    figsize=(4, 5),
    cmap=["viridis", "viridis", "bwr"],
    norm=[plt.Normalize(), plt.Normalize(), eplt.CenteredPowerNorm(1.0, vcenter=0)],
    colorbar="all",
    hide_colorbar_ticks=False,
    colorbar_kw={"width": 7},
)
eplt.set_titles(axs, ["Data", "FermiEdge2dModel", "Residuals"])
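
The idea behind a global fit is that one set of parameters describes the whole 2D array, so every EDC constrains the same edge polynomial. The sketch below shows this with SciPy's `least_squares` on synthetic data. The `edge_2d` function and its parametrization are hypothetical and are not taken from `FermiEdge2dModel`.

```python
import numpy as np
from scipy.optimize import least_squares

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def edge_2d(params, eV, alpha, temp):
    """Fermi edge whose center is a quadratic polynomial in angle (hypothetical)."""
    c0, c1, c2, amp = params
    center = c0 + c1 * alpha + c2 * alpha**2
    return amp / (np.exp((eV[:, None] - center[None, :]) / (K_B * temp)) + 1)


rng = np.random.default_rng(0)
eV = np.linspace(-0.2, 0.2, 81)
alpha = np.linspace(-15, 15, 31)
truth = np.array([-0.01, 0.0, 1e-4, 1.0])
data = edge_2d(truth, eV, alpha, 100.0) + rng.normal(0, 0.02, (eV.size, alpha.size))


def residual(params):
    # Flatten the 2D residual: the optimizer sees all EDCs at once.
    return (edge_2d(params, eV, alpha, 100.0) - data).ravel()


# x_scale tells the optimizer the very different magnitudes of the parameters.
fit = least_squares(
    residual, x0=[0.0, 0.0, 0.0, 0.8], x_scale=[0.01, 1e-3, 1e-4, 1.0]
)
print(fit.x)
```

Compared with fitting each EDC separately and then fitting a polynomial to the centers, the global fit uses fewer free parameters and propagates the noise of all EDCs into the polynomial coefficients in a single step.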

## Fitting background models

{meth}`xarray.Dataset.xlm.modelfit` and {meth}`xarray.DataArray.xlm.modelfit` work with any `lmfit` model, including background models from [lmfitxps](https://lmfitxps.readthedocs.io/). If you have [lmfitxps](https://lmfitxps.readthedocs.io/) installed, you can use the `ShirleyBG` model to iteratively fit a Shirley background to the data:

```python
from lmfitxps.models import ShirleyBG
from lmfit.models import GaussianModel

darr.xlm.modelfit("alpha", GaussianModel() + ShirleyBG())
```

## Visualizing fits

:::{note}

If you are viewing this documentation online, the plots will not be interactive. Run the code locally to try it out.

:::

If [hvplot](https://github.com/holoviz/hvplot) is installed, we can visualize the fit results interactively with the {meth}`xarray.Dataset.qshow` accessor.

To plot the data with the fit and fit components:

In [None]:
result_ds.qshow(plot_components=True)

To plot each parameter as a function of the coordinate:

In [None]:
result_ds.qshow.params()

## Parallelization

For non-dask objects, you can achieve [joblib](https://joblib.readthedocs.io/en/stable/)-based parallelization:

- For non-dask Datasets, basic parallelization across multiple data variables can be achieved with the `parallel` argument to {meth}`xarray.Dataset.xlm.modelfit`.

- For parallelizing fits on a single DataArray along a dimension with many points, the {meth}`xarray.DataArray.parallel_fit` accessor can be used. This method is similar to {meth}`xarray.DataArray.xlm.modelfit`, but requires the name of the dimension to parallelize over instead of the coordinates to fit along. For example, to parallelize the fit in the previous example, you can use the following code:

    ```python
    gold_selected.parallel_fit(
        dim="alpha",
        model=era.fit.models.FermiEdgeModel(),
        params={"temp": {"value": 100.0, "vary": False}},
        guess=True,
    )
    ```

    :::{note}
  
    - The initial run may take a long time because of the overhead of creating parallel workers. Subsequent calls will run faster, since joblib's default backend tries to reuse the workers.
      
    - The accessor has some intrinsic overhead due to post-processing. If you need the best performance, handle the parallelization yourself with joblib and {meth}`lmfit.Model.fit <lmfit.model.Model.fit>`.

    :::
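
A minimal sketch of handling the parallelization yourself with joblib is shown below. `fit_one` is a hypothetical stand-in that fits a straight line with NumPy so that the example stays self-contained; in practice its body would call {meth}`lmfit.Model.fit <lmfit.model.Model.fit>` and return whatever you need from the result.

```python
import numpy as np
from joblib import Parallel, delayed


def fit_one(x, y):
    """Hypothetical stand-in for a single fit; returns (slope, intercept)."""
    return np.polyfit(x, y, deg=1)


rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)
slopes = np.linspace(1, 3, 16)
curves = [s * x + 0.5 + rng.normal(0, 0.01, x.size) for s in slopes]

# One task per curve, with at most two workers running at a time.
results = Parallel(n_jobs=2)(delayed(fit_one)(x, y) for y in curves)
fitted_slopes = np.array([r[0] for r in results])
print(fitted_slopes)
```

Returning only small objects such as parameter arrays from each task keeps the cost of passing results back from the workers low.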


## Saving and loading fits

See the [xarray-lmfit documentation](https://xarray-lmfit.readthedocs.io/) for details on saving and loading fit results.

## Using `iminuit`

:::{note}

This part requires the optional [iminuit](https://github.com/scikit-hep/iminuit) dependency.

:::

[iminuit](https://github.com/scikit-hep/iminuit) is a powerful Python interface to the [Minuit C++ library](https://root.cern.ch/doc/master/Minuit2Page.html) developed at CERN. To learn more, see the [iminuit documentation](http://scikit-hep.org/iminuit/).

ERLabPy provides a thin wrapper around {class}`iminuit.Minuit` that allows you to use lmfit models with iminuit. The example below repeats the two-peak MDC fit from the section on fitting multiple peaks, this time using iminuit.

In [None]:
model = era.fit.models.MultiPeakModel(
    npeaks=2, peak_shapes=["lorentzian"], fd=False, convolve=True
)

m = era.fit.minuit.Minuit.from_lmfit(
    model,
    mdc,
    mdc.kx,
    p0_center=-0.5,
    p1_center=0.5,
    p0_width=0.03,
    p1_width=0.03,
    p0_height=1000,
    p1_height=1000,
    lin_bkg={"value": 0.0, "vary": False},
    const_bkg=0.0,
    resolution=0.03,
)

m.migrad()
m.minos()
m.hesse()

You can also use the [interactive fitting interface](https://scikit-hep.org/iminuit/notebooks/interactive.html) provided by iminuit.

:::{note}

- Requires [ipywidgets](https://github.com/jupyter-widgets/ipywidgets) to be installed.
    
- If you are viewing this documentation online, changing the sliders won’t change the plot. Run the code locally to try it out.

:::

In [None]:
m.interactive()