# Fitting across multiple dimensions

Suppose you need to fit the same model to many independent data sets stacked along one or more dimensions. The `xlm` accessor handles this directly.

To demonstrate, let's extend our previous example of fitting a 1D Gaussian peak on a linear background to 2D, where each row contains a Gaussian peak with a different center.

In [None]:
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt
import lmfit
import xarray_lmfit

In [None]:
# Define coordinates
x = np.linspace(-5.0, 5.0, 100)
y = np.arange(3)

# Center of the peaks along y
center = np.array([-2.0, 0.0, 2.0])[:, np.newaxis]

# Gaussian peak on a linear background
z = -0.1 * x + 2 + 3 * np.exp(-((x - center) ** 2) / (2 * 1**2))

# Add some noise with fixed seed for reproducibility
rng = np.random.default_rng(5)
zerr = np.full_like(z, 0.1)
z = rng.normal(z, zerr)

# Construct DataArray
darr = xr.DataArray(z, dims=["y", "x"], coords={"y": y, "x": x})
darr.plot()

{meth}`xarray.DataArray.xlm.modelfit` will automatically broadcast the model parameters across the non-fitting dimensions, allowing you to fit all rows in one go.

In [None]:
model = lmfit.models.GaussianModel() + lmfit.models.LinearModel()

params = {"center": 0.0, "slope": -0.1}

result_ds = darr.xlm.modelfit(coords="x", model=model, params=params)
result_ds

Note that {meth}`xarray.DataArray.xlm.modelfit` also allows `params` to be provided as a dictionary, structured like the keyword arguments to {meth}`lmfit.model.Model.make_params` or {func}`lmfit.parameter.create_params`.

## Providing initial guesses

What if you want to provide different initial guesses for each row? Because xarray broadcasts arrays by dimension name, you can provide initial guesses and bounds for the fitting parameters as {class}`xarray.DataArray`s.

For instance, if we want to provide different initial guesses for the peak positions along `y`, we can do so by passing a dictionary of DataArrays to the `params` argument.

In [None]:
model = lmfit.models.GaussianModel() + lmfit.models.LinearModel()

params = {
    "center": xr.DataArray([-2, 0, 2], coords=[darr.y]),
    "slope": -0.1,
}

result_ds = darr.xlm.modelfit(coords="x", model=model, params=params)
result_ds
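
The guesses do not have to be typed by hand. As a minimal sketch, assuming the peak dominates the background in every row, the position of each row's maximum can serve as the initial center. `idxmax` is a standard xarray method; the variable name `center_guess` is only illustrative.

```python
import numpy as np
import xarray as xr

# Same synthetic data as above.
x = np.linspace(-5.0, 5.0, 100)
y = np.arange(3)
center = np.array([-2.0, 0.0, 2.0])[:, np.newaxis]
rng = np.random.default_rng(5)
z = rng.normal(-0.1 * x + 2 + 3 * np.exp(-((x - center) ** 2) / 2), 0.1)
darr = xr.DataArray(z, dims=["y", "x"], coords={"y": y, "x": x})

# x position of the maximum in each row: one guess per value of y.
center_guess = darr.idxmax("x")

# Same structure as the params dictionary in the cell above.
params = {"center": center_guess, "slope": -0.1}
print(center_guess.values)
```

Since `center_guess` carries the `y` coordinate of the data, it lines up with the rows in the same way as the hand-written DataArray.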

Let's overlay the fitted peak positions on the data.

In [None]:
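# Plot the data
result_ds.modelfit_data.plot()
# Select the fit results for the "center" parameter
result_center = result_ds.sel(param="center")

# Overlay the fitted peak positions on the image
plt.plot(result_center.modelfit_coefficients, result_center.y, "o-")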

The same can be done with *all* parameter attributes that can be passed to {func}`lmfit.parameter.create_params` (e.g., `vary`, `min`, and `max`). For example:

In [None]:
model = lmfit.models.GaussianModel() + lmfit.models.LinearModel()

params = {
    "center": {
        "value": xr.DataArray([-2, 0, 2], coords=[darr.y]),
        "min": -5.0,
        "max": xr.DataArray([0, 2, 5], coords=[darr.y]),
    },
    "slope": -0.1,
}

result_ds = darr.xlm.modelfit(coords="x", model=model, params=params)
result_ds

## Parallelization

The accessors are tightly integrated with `xarray`, so passing a dask array will parallelize the fitting process. See [Parallel Computing with Dask](https://docs.xarray.dev/en/stable/user-guide/dask.html) for background on how xarray and dask work together.

With `dask` installed and a `dask` scheduler set up, you can fit large datasets in parallel.
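
As one possible setup, the sketch below selects dask's built-in thread-based scheduler through `dask.config`. This is only an example configuration: which scheduler performs best depends on the workload, and a `dask.distributed.Client` (from the separate `distributed` package) is a common alternative.

```python
import dask

# Use the thread-based local scheduler for subsequent .compute() calls.
dask.config.set(scheduler="threads")
print(dask.config.get("scheduler"))
```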

To demonstrate, we rebuild the earlier DataArray of Gaussian peaks with different centers, this time with 300 rows instead of 3.

In [None]:
# Define coordinates
x = np.linspace(-5.0, 5.0, 100)
y = np.arange(300)

# Center of the peaks along y
center = np.linspace(-2.0, 2.0, 300)[:, np.newaxis]

# Gaussian peak on a linear background
z = -0.1 * x + 2 + 3 * np.exp(-((x - center) ** 2) / (2 * 1**2))

# Construct DataArray
darr = xr.DataArray(z, dims=["y", "x"], coords={"y": y, "x": x})
darr.plot()

Now, let's chunk the data along `y`, which converts it to a dask array, before fitting:

In [None]:
darr = darr.chunk({"y": 50})
darr
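
The resulting chunk layout can be checked through the `chunks` attribute. The sketch below uses a zero-filled stand-in array with the same shape as the data; only `y` is split, so each row's full profile along `x` stays within one block.

```python
import numpy as np
import xarray as xr

# Stand-in array with the same shape and coordinates as the data above.
darr = xr.DataArray(
    np.zeros((300, 100)),
    dims=["y", "x"],
    coords={"y": np.arange(300), "x": np.linspace(-5.0, 5.0, 100)},
)
chunked = darr.chunk({"y": 50})

# Six blocks of 50 rows along y; x remains a single block of 100 points.
print(chunked.chunks)
```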

When {meth}`xarray.DataArray.xlm.modelfit` is called on a dask array, the fitting is not performed immediately. Instead, a dask graph is created that represents the computation to be performed:

In [None]:
model = lmfit.models.GaussianModel() + lmfit.models.LinearModel()

params = {
    "center": xr.DataArray(np.linspace(-2.0, 2.0, 300), coords=[darr.y]),
    "slope": -0.1,
}

result = darr.xlm.modelfit(coords="x", model=model, params=params)
result

You can call `.compute()` on the entire Dataset or on individual variables to perform the fitting in parallel.

In [None]:
result.compute()
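
The difference between the two options can be sketched with a small stand-in Dataset in place of the fit result. The variable names `a` and `b` are hypothetical; the behavior shown is that of any dask-backed xarray Dataset.

```python
import numpy as np
import xarray as xr

# Stand-in for a lazily evaluated result (hypothetical variable names).
ds = xr.Dataset(
    {
        "a": (("y",), np.arange(300.0)),
        "b": (("y",), np.ones(300)),
    }
).chunk({"y": 50})

# Builds a dask graph; nothing is evaluated yet.
lazy = ds + 1

# Evaluates only the graph behind "a"; "b" stays lazy.
a_only = lazy["a"].compute()

# Evaluates every variable in the Dataset.
everything = lazy.compute()
```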