# LEAST_SQUARES

## Overview
The `LEAST_SQUARES` function solves nonlinear least-squares problems using the SciPy optimization library. It fits a user-defined model function to observed data by minimizing the sum of squared residuals, which makes it useful for curve fitting, parameter estimation, and regression analysis directly in Excel. For more details, see the [scipy.optimize.least_squares documentation](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.least_squares.html).

The least-squares method seeks the parameter vector $\mathbf{p}$ that minimizes the sum of squared residuals:

```math
S(\mathbf{p}) = \sum_{i=1}^n (y_i - f(x_i, \mathbf{p}))^2
```

where $y_i$ are the observed values, $x_i$ are the input values, and $f(x, \mathbf{p})$ is the model function. SciPy's `least_squares` provides three algorithms: "trf" and "dogbox", which accept parameter bounds, and "lm" (Levenberg-Marquardt), which does not.
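
As a minimal sketch of the objective above, the snippet below calls SciPy directly on a small linear model. The data values and the `residuals` helper are illustrative assumptions, not part of the `LEAST_SQUARES` function itself. The sign of the residual does not matter because each term is squared.

```python
import numpy as np
from scipy.optimize import least_squares

# Illustrative data lying exactly on y = 2x (assumed for this sketch).
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

def residuals(p):
    """Residual vector r_i = f(x_i, p) - y_i for the model f = a*x + b."""
    a, b = p
    return a * x + b - y

result = least_squares(residuals, x0=[1.0, 1.0], method="trf")
a_fit, b_fit = result.x

# SciPy reports cost = 0.5 * sum(r_i**2), so S(p) is twice that value.
S = 2.0 * result.cost
print(a_fit, b_fit, S)
```

Because the data are noise-free, the fitted parameters should land very close to `a = 2`, `b = 0`, with `S` near zero.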

This example function is provided as-is without any representation of accuracy.

## Usage
To use the `LEAST_SQUARES` function in Excel, enter it as a formula in a cell, specifying the model, data, and initial parameters:

```excel
=LEAST_SQUARES(model, xdata, ydata, p_zero, [bounds_lower], [bounds_upper], [method])
```
- `model` (string, required): Model function as a string, written with the variable `x` and parameter names (e.g., `a`, `b`). Example: "a * x + b" or "a * exp(b * x)"
- `xdata` (2D list of float, required): 2D list or column of x values. Example: `{1;2;3}`
- `ydata` (2D list of float, required): 2D list or column of y values. Example: `{2;4;6}`
- `p_zero` (2D list of float, required): 2D list or row of initial parameter guesses, one per parameter, in the order the parameters first appear in `model`. Example: `{1,1}`
- `bounds_lower` (2D list of float, optional): Lower bounds for each parameter. Example: `{0,0}`
- `bounds_upper` (2D list of float, optional): Upper bounds for each parameter. Example: `{10,10}`
- `method` (string, optional): Optimization method ("trf", "dogbox", "lm"). Defaults to "trf". Example: "trf"

The function returns the fitted parameter values as a single row (2D list of float). If the fit fails, an error message string is returned.
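
The sketch below illustrates how the 2D-list arguments described above can be reduced to flat arrays and how a result is wrapped back into a single row. The mapping from Excel array constants to nested lists is an assumption based on the argument types listed here.

```python
import numpy as np

# Assumed mapping: the Excel column {1;2;3} arrives as one-element rows,
# and the Excel row {1,1} arrives as a single row.
xdata = [[1], [2], [3]]
p_zero = [[1, 1]]

# Flatten both shapes to 1D float arrays before fitting.
x = np.array(xdata, dtype=float).flatten()
p0 = np.array(p_zero, dtype=float).flatten()
print(x.shape, p0.shape)

# A parameter vector is returned to Excel as a single row (a list with one list in it).
as_row = [p0.tolist()]
print(as_row)
```

Flattening means a row and a column of the same values are treated identically, so either orientation works for `xdata` and `ydata`.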

## Examples
```excel
=LEAST_SQUARES("a * x + b", {1;2;3}, {2;4;6}, {1,1})
```
Expected output:

| a   | b   |
|-----|-----|
| 2.0 | 0.0 |
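
For a model that is linear in its parameters, the least-squares solution also has a closed form, which gives an independent cross-check of the expected output above. This sketch uses NumPy's linear solver rather than `LEAST_SQUARES`; the design-matrix layout is a choice made for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

# Design matrix for y = a*x + b: one column for x, one column of ones for b.
A = np.column_stack([x, np.ones_like(x)])

# Solve min ||A @ [a, b] - y||^2 directly.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
a_lin, b_lin = coef
print(a_lin, b_lin)
```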

```excel
=LEAST_SQUARES("a * exp(b * x)", {1;2;3}, {2.7;7.4;20.1}, {1,1})
```
Expected output (rounded to one decimal place):

| a   | b   |
|-----|-----|
| 1.0 | 1.0 |
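
One rough way to sanity-check the exponential fit, or to pick initial guesses, is to linearize the model: taking logs of $y = a e^{bx}$ gives $\ln y = \ln a + b x$. The sketch below is an assumption-laden shortcut, not what `LEAST_SQUARES` does, because it minimizes squared error in log space rather than in the original units.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.7, 7.4, 20.1])

# Fit ln(y) = ln(a) + b*x with an ordinary linear least-squares line.
A = np.column_stack([x, np.ones_like(x)])
(b_est, ln_a_est), *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
a_est = np.exp(ln_a_est)
print(a_est, b_est)
```

Both estimates should come out close to 1, in line with the table above. They will differ slightly from the nonlinear fit because the two objectives weight the data points differently.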

```excel
=LEAST_SQUARES("a * x + b", {1;2;3}, {2;4;6}, {1,1}, {0,0}, {10,10})
```
Expected output:

| a   | b   |
|-----|-----|
| 2.0 | 0.0 |
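
In the example above the bounds contain the unconstrained optimum, so they do not change the result. The sketch below, which calls SciPy directly, shows what happens when a bound does bind, and that the "lm" method rejects finite bounds. The tightened upper bound of 1.5 is chosen purely for illustration.

```python
import numpy as np
from scipy.optimize import least_squares

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

def residuals(p):
    a, b = p
    return a * x + b - y

# The upper bound on `a` excludes the unconstrained optimum a = 2,
# so the solver is pushed toward the edge of the feasible region.
bounded = least_squares(
    residuals, x0=[1.0, 1.0], bounds=([0.0, 0.0], [1.5, 10.0]), method="trf"
)
print(bounded.x)

# Levenberg-Marquardt does not handle bounds; SciPy raises ValueError.
try:
    least_squares(residuals, x0=[1.0, 1.0], bounds=([0, 0], [10, 10]), method="lm")
    lm_error = None
except ValueError as exc:
    lm_error = str(exc)
print(lm_error)
```

With `a` capped at 1.5, the best remaining fit uses roughly `b = 1`, the mean of the leftover `y - 1.5 * x`.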

In [None]:
import numpy as np
from scipy.optimize import least_squares as scipy_least_squares
import math
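# Names available to model strings: every public name in `math`, with the common
# functions overridden by NumPy versions so they work element-wise on arrays.
SAFE_GLOBALS = {k: getattr(math, k) for k in dir(math) if not k.startswith("_")}
SAFE_GLOBALS["np"] = np
SAFE_GLOBALS["numpy"] = np
SAFE_GLOBALS["exp"] = np.exp
SAFE_GLOBALS["log"] = np.log
SAFE_GLOBALS["sin"] = np.sin
SAFE_GLOBALS["cos"] = np.cos
SAFE_GLOBALS["tan"] = np.tan
SAFE_GLOBALS["abs"] = abs
SAFE_GLOBALS["pow"] = pow
# Withhold Python's builtins from eval(). This narrows what a model string can
# reach but is not a full sandbox, so only evaluate trusted models.
SAFE_GLOBALS["__builtins__"] = {}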

def least_squares(model, xdata, ydata, p_zero, bounds_lower=None, bounds_upper=None, method=None):
    """
    Solve a nonlinear least-squares problem by fitting a user-defined model to data.

    Args:
        model (str): Model function as a string, e.g., "a * x + b". Use variable `x` and parameter names.
        xdata (list[list[float]]): 2D list of input x values (independent variable).
        ydata (list[list[float]]): 2D list of observed y values (dependent variable).
        p_zero (list[list[float]]): 2D list of initial guesses for parameters.
        bounds_lower (list[list[float]], optional): 2D list of lower bounds for parameters.
        bounds_upper (list[list[float]], optional): 2D list of upper bounds for parameters.
        method (str, optional): Optimization method ("trf", "dogbox", "lm").

    Returns:
        list[list[float]]: Fitted parameter values as a single row, or
        str: Error message if calculation fails.

    This example function is provided as-is without any representation of accuracy.
    """
    try:
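        # Convert to float so integer inputs do not produce integer arrays.
        x = np.array(xdata, dtype=float).flatten()
        y = np.array(ydata, dtype=float).flatten()
        p_zero = np.array(p_zero, dtype=float).flatten()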
        n_params = len(p_zero)
        import re
        param_names = re.findall(r'\b[a-zA-Z_]\w*\b', model)
        param_names = [name for name in param_names if name not in ("x", "exp", "log", "sin", "cos", "tan", "abs", "pow")]
        param_names = list(dict.fromkeys(param_names))
        if len(param_names) != n_params:
            return f"Number of initial guesses (p_zero) does not match number of parameters in model: {param_names}"
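        # Apply whichever bounds were supplied; a missing side is left unbounded.
        lower = np.array(bounds_lower, dtype=float).flatten() if bounds_lower is not None else -np.inf
        upper = np.array(bounds_upper, dtype=float).flatten() if bounds_upper is not None else np.inf
        bounds = (lower, upper)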
        def residuals(params):
            local_dict = dict(zip(param_names, params))
            local_dict["x"] = x
            try:
                y_pred = eval(model, SAFE_GLOBALS, local_dict)
            except Exception:
                # NaN residuals mark the model as unevaluable for these parameters.
                return np.full(y.shape, np.nan)
            return y_pred - y
        lsq_method = method if method is not None else 'trf'
        if lsq_method not in ('trf', 'dogbox', 'lm'):
            return "`method` must be 'trf', 'dogbox' or 'lm'."
        result = scipy_least_squares(residuals, p_zero, bounds=bounds, method=lsq_method)
        if not result.success:
            return f"Fit failed: {result.message}"
        return [result.x.tolist()]
    except Exception as e:
        return str(e)
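
The parameter-detection step in the function above can be sketched in isolation. The helper name `extract_param_names` is hypothetical; the regular expression and the reserved-name list are copied from the function.

```python
import re

# Names the function treats as reserved rather than as parameters.
RESERVED = ("x", "exp", "log", "sin", "cos", "tan", "abs", "pow")

def extract_param_names(model):
    """Return identifiers in `model`, in first-seen order, minus reserved names."""
    names = re.findall(r'\b[a-zA-Z_]\w*\b', model)
    names = [n for n in names if n not in RESERVED]
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(names))

print(extract_param_names("a * exp(b * x) + a"))  # ['a', 'b']
print(extract_param_names("b * x + a"))           # ['b', 'a']
print(extract_param_names("a * sqrt(x)"))         # ['a', 'sqrt']
```

Two consequences follow. The values in `p_zero` are matched to parameters in order of first appearance in the model string, not alphabetically. Any identifier outside the reserved list, such as `sqrt` in the last call, is counted as a parameter, so such a model fails the parameter-count check unless `p_zero` has a matching number of entries.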

In [None]:
%pip install -q ipytest
import ipytest
ipytest.autoconfig()

demo_cases = [
    ["a * x + b", [[1], [2], [3]], [[2], [4], [6]], [[1, 1]], [[0, 0]], [[10, 10]], "trf"],
    ["a * exp(b * x)", [[1], [2], [3]], [[2.7], [7.4], [20.1]], [[1, 1]], [[0, 0]], [[10, 10]], "trf"],
    ["a * x + b", [[1], [2], [3]], [[2], [4], [6]], [[1, 1]], [[0, 0]], [[10, 10]], "trf"]
]

def is_valid_type(val):
    if isinstance(val, (float, bool, str)):
        return True
    if isinstance(val, list):
        return all(isinstance(row, list) and all(isinstance(x, (float, bool, str)) for x in row) for row in val)
    return False

import pytest
@pytest.mark.parametrize("model, xdata, ydata, p_zero, bounds_lower, bounds_upper, method", demo_cases)
def test_demo_cases(model, xdata, ydata, p_zero, bounds_lower, bounds_upper, method):
    result = least_squares(model, xdata, ydata, p_zero, bounds_lower, bounds_upper, method)
    print(f"test_demo_cases output for {model}: {result}")
    assert is_valid_type(result), f"Output type is not valid. Got: {type(result)} Value: {result}"

def test_invalid_method():
    result = least_squares("a * x + b", [[1], [2], [3]], [[2], [4], [6]], [[1, 1]], [[0, 0]], [[10, 10]], "invalid")
    assert isinstance(result, str) and "method" in result

def test_param_mismatch():
    result = least_squares("a * x + b", [[1], [2], [3]], [[2], [4], [6]], [[1]])
    assert isinstance(result, str) and "initial guesses" in result

ipytest.run('-s')

In [None]:
import gradio as gr

demo = gr.Interface(
    fn=least_squares,
    inputs=[
        gr.Textbox(label="Model", value=demo_cases[0][0]),
        gr.Dataframe(label="xdata", type="array", value=demo_cases[0][1], headers=["x"]),
        gr.Dataframe(label="ydata", type="array", value=demo_cases[0][2], headers=["y"]),
        gr.Dataframe(label="Initial Guesses (p_zero)", type="array", value=demo_cases[0][3], headers=["a", "b"]),
        gr.Dataframe(label="Lower Bounds (optional)", type="array", value=demo_cases[0][4], headers=["a", "b"]),
        gr.Dataframe(label="Upper Bounds (optional)", type="array", value=demo_cases[0][5], headers=["a", "b"]),
        gr.Textbox(label="Method (optional)", value=demo_cases[0][6]),
    ],
    outputs=gr.Dataframe(label="Fitted Parameters", type="array", headers=["a", "b"]),
    examples=demo_cases,
    description="Fit a nonlinear model to data using least squares. Set the model, data, and initial guesses. This demo is provided as-is without any representation of accuracy.",
    flagging_mode="never",
    fill_width=True,
)
demo.launch()