# CURVE_FIT

## Overview
The `CURVE_FIT` function fits a user-defined model function to data using non-linear least squares, leveraging SciPy's `curve_fit` method. It is ideal for regression, parameter estimation, and curve fitting directly in Excel.

## Usage
To use the `CURVE_FIT` function in Excel, enter it as a formula in a cell, specifying the model, xdata, ydata, and initial parameter guesses:

```excel
=CURVE_FIT(model, xdata, ydata, p_zero, [bounds_lower], [bounds_upper], [method])
```

- `model` is a string representing the model function, e.g., `"a * x + b"` or `"a * exp(b * x)"`. Use variable `x` and parameter names (e.g., `a`, `b`).
- `xdata` is a 2D list or column of x values.
- `ydata` is a 2D list or column of y values.
- `p_zero` is a 2D list or row of initial parameter guesses (e.g., `[[1, 1]]` for two parameters).
- `bounds_lower` and `bounds_upper` are optional 2D lists or rows specifying lower and upper bounds for each parameter.
- `method` is an optional string specifying the optimization method (`"trf"`, `"dogbox"`, or `"lm"`). If omitted, `"trf"` is used.
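
For readers who think in Python, the arguments above map onto a direct `scipy.optimize.curve_fit` call roughly as sketched below. The helper `make_model` and its whitelist are hypothetical; they only illustrate how an expression string can be wrapped as a callable when the parameter names are supplied explicitly.

```python
import numpy as np
from scipy.optimize import curve_fit


def make_model(expr, names):
    """Hypothetical helper: wrap an expression string as f(x, *params)."""
    # Only these names are visible to the expression (illustrative whitelist).
    allowed = {"exp": np.exp, "log": np.log, "__builtins__": {}}

    def f(x, *params):
        local_vars = dict(zip(names, params))
        local_vars["x"] = x
        return eval(expr, allowed, local_vars)

    return f


# Excel ranges arrive as 2D lists; flatten them to 1D arrays for SciPy.
xdata = np.array([[1], [2], [3]], dtype=float).flatten()
ydata = np.array([[2], [4], [6]], dtype=float).flatten()
p_zero = np.array([[1, 1]], dtype=float).flatten()

line = make_model("a * x + b", ["a", "b"])
popt, _ = curve_fit(line, xdata, ydata, p0=p_zero, method="trf")

# Wrap the result as a single row, the 2D shape a spreadsheet range expects.
fitted_row = [popt.tolist()]
print(fitted_row)
```

Because the wrapped function takes `*params`, SciPy cannot infer the parameter count from its signature, which is why `p0` has to be supplied.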

## Parameters
| Parameter      | Type     | Required | Description                                                                 |
|---------------|----------|----------|-----------------------------------------------------------------------------|
| model         | string   | Yes      | Model function as a string, e.g., `"a * x + b"`. Use `x` and parameter names. |
| xdata         | 2D list  | Yes      | Input x values (independent variable).                                      |
| ydata         | 2D list  | Yes      | Observed y values (dependent variable).                                     |
| p_zero        | 2D list  | Yes      | Initial guesses for parameters.                                             |
| bounds_lower  | 2D list  | No       | Lower bounds for parameters.                                                |
| bounds_upper  | 2D list  | No       | Upper bounds for parameters.                                                |
| method        | string   | No       | Optimization method: `"trf"` (default), `"dogbox"`, or `"lm"`.              |

## Return Value
| Return Value | Type   | Description                                  |
|--------------|--------|----------------------------------------------|
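| Parameters   | 2D list| Fitted parameter values as a single row.      |
| Error        | string | Error message, returned instead of the parameters when the inputs are invalid or the fit fails. |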

## Examples

### 1. Fit a straight line to data (y = a * x + b)

**Sample Input Data:**

| x | y |
|---|---|
| 1 | 2 |
| 2 | 4 |
| 3 | 6 |

```excel
=CURVE_FIT("a * x + b", A2:A4, B2:B4, {1, 1})
```
**Sample Output:**

| a   | b   |
|-----|-----|
| 2.0 | 0.0 |

### 2. Fit an exponential model (y = a * exp(b * x))

**Sample Input Data:**

| x | y  |
|---|----|
| 1 | 2.7|
| 2 | 7.4|
| 3 | 20.1|

```excel
=CURVE_FIT("a * exp(b * x)", A2:A4, B2:B4, {1, 1})
```
**Sample Output:**

| a     | b     |
|-------|-------|
| 1.0   | 1.0   |
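
One way to sanity-check a fit like this outside Excel is to run the same model through SciPy and look at the residuals. The sketch below assumes the three sample points above and writes the model as an ordinary Python function rather than a string; the names `exponential` and `max_abs_residual` are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit


def exponential(x, a, b):
    """Model y = a * exp(b * x)."""
    return a * np.exp(b * x)


x = np.array([1.0, 2.0, 3.0])
y = np.array([2.7, 7.4, 20.1])

popt, _ = curve_fit(exponential, x, y, p0=[1.0, 1.0], method="trf")

# Residuals show how far each observation sits from the fitted curve.
residuals = y - exponential(x, *popt)
max_abs_residual = float(np.max(np.abs(residuals)))

print("a, b =", popt)
print("largest absolute residual =", max_abs_residual)
```

The sample y values are rounded, so the fitted parameters land near, not exactly on, `a = 1` and `b = 1`.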

### 3. Fit with parameter bounds (0 <= a <= 10, 0 <= b <= 10)

**Sample Input Data:**

| x | y |
|---|---|
| 1 | 2 |
| 2 | 4 |
| 3 | 6 |

```excel
=CURVE_FIT("a * x + b", A2:A4, B2:B4, {1, 1}, {0, 0}, {10, 10})
```
**Sample Output:**

| a   | b   |
|-----|-----|
| 2.0 | 0.0 |
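
In the example above the bounds do not change the answer, because the unconstrained solution already lies inside them. The sketch below uses hypothetical, tighter bounds (`0 <= a <= 1.5`) on the same data to show a bound that actually constrains the fit. It calls SciPy directly, which takes bounds as a `(lower, upper)` pair.

```python
import numpy as np
from scipy.optimize import curve_fit


def line(x, a, b):
    """Model y = a * x + b."""
    return a * x + b


x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

# Hypothetical bounds: the slope is capped below its unconstrained value of 2.
lower = [0.0, 0.0]
upper = [1.5, 10.0]

popt, _ = curve_fit(line, x, y, p0=[1.0, 1.0], bounds=(lower, upper), method="trf")
a_fit, b_fit = popt
print("a =", a_fit, "b =", b_fit)
```

With the slope pinned at its cap, the least-squares intercept for this data works out analytically to `b = 1` (the mean of `2x - 1.5x` over the three points), so the fit should settle close to `a = 1.5`, `b = 1`.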

## Optimization Methods

The `CURVE_FIT` function uses SciPy's `curve_fit`, which supports several optimization methods. Choosing the right method depends on your fitting problem:

| Method   | Description                                                      | When to Use                                                                                   |
|----------|------------------------------------------------------------------|-----------------------------------------------------------------------------------------------|
| `trf`    | Trust Region Reflective algorithm (default, supports bounds)     | General curve fitting, especially if you need to set lower or upper bounds on parameters.      |
| `dogbox` | Dogleg algorithm in rectangular trust regions (supports bounds)  | If `trf` does not converge or is slow; can sometimes handle difficult bounded problems better. |
| `lm`     | Levenberg-Marquardt algorithm (does not support bounds)          | Only if you do not need parameter bounds and have at least as many data points as parameters; usually the most efficient choice for small, unconstrained problems. |

Specify the method as the last argument in the `CURVE_FIT` formula if needed. If omitted, `CURVE_FIT` uses `trf`, whereas SciPy's own default is `lm` when no bounds are given.
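
The difference between the methods can be seen with a short SciPy sketch. It assumes the straight-line sample data from the examples and calls `scipy.optimize.curve_fit` directly; the names `fits` and `lm_bounds_error` are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit


def line(x, a, b):
    """Model y = a * x + b."""
    return a * x + b


x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

# All three methods can solve the unbounded problem.
fits = {}
for method in ("trf", "dogbox", "lm"):
    popt, _ = curve_fit(line, x, y, p0=[1.0, 1.0], method=method)
    fits[method] = popt

# "lm" does not accept finite bounds; SciPy raises ValueError in that case.
try:
    curve_fit(line, x, y, p0=[1.0, 1.0], bounds=([0, 0], [10, 10]), method="lm")
    lm_bounds_error = None
except ValueError as exc:
    lm_bounds_error = str(exc)

print(fits)
print(lm_bounds_error)
```

In `CURVE_FIT`, an error like this is not raised but comes back as the returned error string.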

## Benefits
- Enables advanced curve fitting and regression in Excel without VBA or add-ins.
- Supports nonlinear models and parameter bounds.
- Useful for scientific, engineering, and business analysis.

## Limitations
- The model string must use `x` as the independent variable and parameter names (e.g., `a`, `b`).
- The number of initial guesses must match the number of parameters in the model. Guesses are assigned to parameters in the order the names first appear in the model string.
- If the fit fails, an error message is returned as a string.
- Only methods supported by SciPy's `curve_fit` are allowed (`trf`, `dogbox`, `lm`).
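
Because initial guesses are matched to parameters by position, it helps to know the order in which names are picked up from the model string. The sketch below illustrates one way to do this, assuming names are taken in order of first appearance and that a fixed set of function names is reserved. `parameter_names` and `RESERVED` are illustrative names, not part of the function's interface.

```python
import re

# Names treated as the independent variable or as functions, not parameters.
RESERVED = ("x", "exp", "log", "sin", "cos", "tan", "abs", "pow")


def parameter_names(model):
    """Return parameter names in order of first appearance in the model string."""
    names = re.findall(r"\b[a-zA-Z_]\w*\b", model)
    unique = list(dict.fromkeys(names))  # de-duplicate while keeping order
    return [name for name in unique if name not in RESERVED]


order_ab = parameter_names("a * exp(b * x)")  # ['a', 'b']
order_ba = parameter_names("b * x + a")       # ['b', 'a']
print(order_ab, order_ba)
```

For the model `"b * x + a"`, the first initial guess therefore belongs to `b`, not `a`.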

In [None]:
import math

import numpy as np
from scipy.optimize import curve_fit as scipy_curve_fit

# Names available inside the model string. Prefer the NumPy version of each
# math function where one exists, because math.* functions reject arrays.
SAFE_GLOBALS = {
    k: getattr(np, k, getattr(math, k)) for k in dir(math) if not k.startswith("_")
}
SAFE_GLOBALS["np"] = np
SAFE_GLOBALS["numpy"] = np
SAFE_GLOBALS["abs"] = abs
SAFE_GLOBALS["pow"] = pow
# Without this key, eval() injects the full builtins module into the globals.
SAFE_GLOBALS["__builtins__"] = {}

def curve_fit(model, xdata, ydata, p_zero, bounds_lower=None, bounds_upper=None, method=None):
    """
    Fits a model to data using scipy.optimize.curve_fit.
    Args:
        model (str): Model function as a string, e.g., "a * x + b"
        xdata (list): 2D list of x values
        ydata (list): 2D list of y values
        p_zero (list): 2D list of initial parameter guesses
        bounds_lower (list, optional): 2D list of lower bounds
        bounds_upper (list, optional): 2D list of upper bounds
        method (str, optional): Optimization method
    Returns:
        2D list: Fitted parameter values as a single row, or error message string
    """
    try:
        x = np.array(xdata).flatten()
        y = np.array(ydata).flatten()
        p_zero = np.array(p_zero).flatten()
        n_params = len(p_zero)
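        import re
        # Collect identifiers in order of first appearance, skipping attribute
        # names such as the "sqrt" in "np.sqrt".
        param_names = re.findall(r'(?<![\w.])[a-zA-Z_]\w*', model)
        # Drop x, the NumPy aliases, and any function exposed to the model (sqrt, exp, ...).
        param_names = [name for name in param_names
                       if name not in ("x", "np", "numpy") and not callable(SAFE_GLOBALS.get(name))]
        param_names = list(dict.fromkeys(param_names))
        if len(param_names) != n_params:
            return f"Number of initial guesses (p_zero) does not match number of parameters in model: {param_names}"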
        # Validate bounds shapes; a side that is omitted is left unbounded
        # instead of silently discarding the side that was given.
        if bounds_lower is not None or bounds_upper is not None:
            bounds_lower_arr = (np.full(n_params, -np.inf) if bounds_lower is None
                                else np.array(bounds_lower).flatten())
            bounds_upper_arr = (np.full(n_params, np.inf) if bounds_upper is None
                                else np.array(bounds_upper).flatten())
            if bounds_lower_arr.shape != p_zero.shape or bounds_upper_arr.shape != p_zero.shape:
                return f"Bounds (lower: {bounds_lower_arr.shape}, upper: {bounds_upper_arr.shape}) must match the shape of initial guess p_zero: {p_zero.shape}."
            bounds = (bounds_lower_arr, bounds_upper_arr)
        else:
            bounds = (-np.inf, np.inf)
        def model_func(x, *params):
            local_dict = dict(zip(param_names, params))
            local_dict["x"] = x
            try:
                return eval(model, SAFE_GLOBALS, local_dict)
            except Exception as e:
                raise ValueError(f"Model evaluation error: {e}")
        fit_method = method if method is not None else 'trf'
        if fit_method not in ('trf', 'dogbox', 'lm'):
            return "`method` must be 'trf', 'dogbox' or 'lm'."
        popt, _ = scipy_curve_fit(model_func, x, y, p0=p_zero, bounds=bounds, method=fit_method, maxfev=10000)
        return [popt.tolist()]
    except Exception as e:
        return str(e)

In [None]:
%pip install -q ipytest
import ipytest
ipytest.autoconfig()

def test_linear_fit():
    model = "a * x + b"
    xdata = [[1], [2], [3]]
    ydata = [[2], [4], [6]]
    p_zero = [[1, 1]]
    result = curve_fit(model, xdata, ydata, p_zero)
    assert isinstance(result, list)
    assert len(result) == 1
    assert isinstance(result[0], list)
    assert all(isinstance(x, (float, int)) for x in result[0])
    assert abs(result[0][0] - 2.0) < 1e-2
    assert abs(result[0][1] - 0.0) < 1e-2

def test_exponential_fit():
    model = "a * exp(b * x)"
    xdata = [[1], [2], [3]]
    ydata = [[2.7], [7.4], [20.1]]
    p_zero = [[1, 1]]
    result = curve_fit(model, xdata, ydata, p_zero)
    assert isinstance(result, list)
    assert len(result) == 1
    assert isinstance(result[0], list)
    assert all(isinstance(x, (float, int)) for x in result[0])
    assert abs(result[0][0] - 1.0) < 1e-1
    assert abs(result[0][1] - 1.0) < 1e-1

def test_bounds_fit():
    model = "a * x + b"
    xdata = [[1], [2], [3]]
    ydata = [[2], [4], [6]]
    p_zero = [[1, 1]]
    bounds_lower = [[0, 0]]
    bounds_upper = [[10, 10]]
    result = curve_fit(model, xdata, ydata, p_zero, bounds_lower, bounds_upper)
    assert isinstance(result, list)
    assert len(result) == 1
    assert isinstance(result[0], list)
    assert all(isinstance(x, (float, int)) for x in result[0])
    assert 0 <= result[0][0] <= 10
    assert 0 <= result[0][1] <= 10

ipytest.run()

In [None]:
# Interactive Demo
import gradio as gr

def run_curve_fit(model, xdata, ydata, p_zero, bounds_lower, bounds_upper, method):
    result = curve_fit(model, xdata, ydata, p_zero, bounds_lower, bounds_upper, method)
    if isinstance(result, str):
        return None, result
    return result, None

examples = [
    [
        "a * x + b",
        [[1], [2], [3]],
        [[2], [4], [6]],
        [[1, 1]],
        [[0, 0]],
        [[10, 10]],
        "trf"
    ],
    [
        "a * exp(b * x)",
        [[1], [2], [3]],
        [[2.7], [7.4], [20.1]],
        [[1, 1]],
        [[0, 0]],
        [[10, 10]],
        "trf"
    ],
    [
        "a * x + b",
        [[1], [2], [3]],
        [[2], [4], [6]],
        [[1, 1]],
        [[0, 0]],
        [[10, 10]],
        "trf"
    ]
]

demo = gr.Interface(
    fn=run_curve_fit,
    inputs=[
        gr.Textbox(label="Model (Python expression, e.g. a * x + b)", value="a * x + b"),
        gr.Dataframe(label="x data (column vector)", headers=None, type="array", row_count=3, col_count=1, value=[[1],[2],[3]]),
        gr.Dataframe(label="y data (column vector)", headers=None, type="array", row_count=3, col_count=1, value=[[2],[4],[6]]),
        gr.Dataframe(label="Initial parameter guesses (row)", headers=None, type="array", row_count=1, col_count=2, value=[[1,1]]),
        gr.Dataframe(label="Lower bounds (row, optional)", headers=None, type="array", row_count=1, col_count=2, value=[[0,0]]),
        gr.Dataframe(label="Upper bounds (row, optional)", headers=None, type="array", row_count=1, col_count=2, value=[[10,10]]),
        gr.Textbox(label="Method (trf, dogbox, lm)", value="trf")
    ],
    outputs=[
        gr.Dataframe(label="Fitted Parameters (row)", type="array"),
        gr.Textbox(label="Error Message")
    ],
    examples=examples,
    description="Fit a mathematical model to your data using non-linear least squares. Enter your model as a Python expression (e.g., `a * x + b`), provide your data, and set initial guesses for the parameters. Optionally, specify parameter bounds and the optimization method. Use the demo examples below to see typical use cases.",
    flagging_mode="never",
)
demo.launch()