# LEAST_SQUARES

## Overview
The `LEAST_SQUARES` function solves nonlinear least-squares problems with SciPy's `scipy.optimize.least_squares`. It fits a user-defined model function to observed data by minimizing the sum of squared residuals, which makes it useful for curve fitting, parameter estimation, and regression analysis directly in Excel.

The least-squares method finds the parameter vector $\mathbf{p}$ that minimizes the sum of squared residuals:

```math
S(\mathbf{p}) = \sum_{i=1}^n (y_i - f(x_i, \mathbf{p}))^2
```

where $y_i$ are the observed values, $x_i$ are the input values, and $f(x, \mathbf{p})$ is the model function. SciPy's `least_squares` offers three algorithms: "trf" and "dogbox", which support bounds on the parameters, and "lm" (Levenberg-Marquardt), which does not.
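
To make the objective concrete, here is a minimal sketch that calls `scipy.optimize.least_squares` directly on a linear model. The data and the helper name `residuals` are illustrative assumptions, not part of the `LEAST_SQUARES` function itself.

```python
import numpy as np
from scipy.optimize import least_squares

# Illustrative data lying exactly on y = 2x (an assumption for this sketch).
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

def residuals(p):
    """Return f(x_i, p) - y_i for the linear model f(x, p) = a * x + b."""
    a, b = p
    return a * x + b - y

# SciPy squares and sums the residual vector internally.
fit = least_squares(residuals, x0=[1.0, 1.0], method="trf")
a_fit, b_fit = fit.x

# S(p) at the solution; note that fit.cost is defined as half of this value.
sum_sq = float(np.sum(fit.fun ** 2))
```

The sign of the residual does not matter because each term is squared, so `f(x_i, p) - y_i` and `y_i - f(x_i, p)` give the same $S(\mathbf{p})$.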

This example function is provided as-is without any representation of accuracy.

## Usage
To use the `LEAST_SQUARES` function in Excel, enter it as a formula in a cell, specifying the model, data, and initial parameters:

```excel
=LEAST_SQUARES(model, xdata, ydata, p_zero, [bounds_lower], [bounds_upper], [method])
```
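- `model` (string, required): Model function as a string, e.g., "a * x + b" or "a * exp(b * x)". Use variable `x` and parameter names (e.g., `a`, `b`); parameters are matched to `p_zero` in order of first appearance in the string. Recognized function names include `exp`, `log`, `sin`, `cos`, `tan`, `abs`, and `pow`.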
- `xdata` (2D list of float, required): 2D list or column of x values. Example: `{1;2;3}`
- `ydata` (2D list of float, required): 2D list or column of y values. Example: `{2;4;6}`
- `p_zero` (2D list of float, required): 2D list or row of initial parameter guesses. Example: `{1,1}`
- `bounds_lower` (2D list of float, optional): Lower bounds for each parameter. Example: `{0,0}`
- `bounds_upper` (2D list of float, optional): Upper bounds for each parameter. Example: `{10,10}`
- `method` (string, optional): Optimization method: "trf" (default), "dogbox", or "lm". The "lm" method does not support bounds. Example: "trf"
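
The way a `model` string can be turned into something callable is sketched below. This is a simplified illustration, not the exact implementation: the `ALLOWED` whitelist and the helpers `parameter_names` and `evaluate` are hypothetical names introduced for this example.

```python
import re
import numpy as np

# Hypothetical whitelist for this sketch; identifiers outside it (other than x)
# are treated as parameters to fit.
ALLOWED = {"exp": np.exp, "log": np.log, "sin": np.sin, "cos": np.cos}

def parameter_names(model):
    """Return parameter identifiers in order of first appearance."""
    names = re.findall(r"\b[a-zA-Z_]\w*\b", model)
    return list(dict.fromkeys(n for n in names if n != "x" and n not in ALLOWED))

def evaluate(model, x, params):
    """Evaluate the model string at x, with parameter values given in order."""
    scope = dict(zip(parameter_names(model), params))
    scope["x"] = np.asarray(x, dtype=float)
    # An empty __builtins__ removes direct access to Python builtins inside
    # the expression; this reduces risk but is not a complete sandbox.
    return eval(model, {"__builtins__": {}, **ALLOWED}, scope)

names = parameter_names("a * exp(b * x)")               # ['a', 'b']
values = evaluate("a * x + b", [1, 2, 3], [2.0, 0.5])   # array([2.5, 4.5, 6.5])
```

Because parameters are discovered by order of first appearance, the model "a * x + b" pairs the first initial guess with `a` and the second with `b`.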

The function returns the fitted parameter values as a single row (2D list of float). If the fit fails, an error message string is returned.
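
As a rough illustration of the shapes involved, the sketch below shows how Excel-style 2D lists could be flattened before fitting and how a fitted vector becomes a single output row. The `fitted` values and the `is_error` helper are placeholders assumed for this example.

```python
import numpy as np

xdata = [[1], [2], [3]]   # an Excel column such as {1;2;3} arrives as one value per row
p_zero = [[1, 1]]         # an Excel row such as {1,1} arrives as a single row

# Both layouts flatten to 1D float arrays.
x = np.array(xdata, dtype=float).flatten()
p0 = np.array(p_zero, dtype=float).flatten()

# Placeholder for fitted parameters (assumed values, not a real fit result).
fitted = np.array([2.0, 0.0])
result_row = [fitted.tolist()]   # a 2D list with one row, so Excel spills it across columns

def is_error(result):
    """A string result signals a failed fit; a list holds the parameters."""
    return isinstance(result, str)
```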

## Examples

**Example 1: Linear Fit**
```excel
=LEAST_SQUARES("a * x + b", {1;2;3}, {2;4;6}, {1,1})
```
Expected output:

| a   | b   |
|-----|-----|
| 2.0 | 0.0 |

**Example 2: Exponential Fit**
```excel
=LEAST_SQUARES("a * exp(b * x)", {1;2;3}, {2.7;7.4;20.1}, {1,1})
```
Expected output (approximate, because the y values are $e^x$ rounded to one decimal place):

| a   | b   |
|-----|-----|
| 1.0 | 1.0 |
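
One way to sanity-check this result independently is a log-linear fit: taking logarithms of $y = a e^{bx}$ gives $\ln y = \ln a + b x$, a straight line in $x$. The sketch below uses `np.polyfit` for that check. It is a different technique from the nonlinear fit, since it minimizes residuals in log space, so its estimates differ slightly.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.7, 7.4, 20.1])

# ln(y) = ln(a) + b * x, so a degree-1 polynomial fit to (x, ln y)
# returns the slope b first and the intercept ln(a) second.
b_est, log_a_est = np.polyfit(x, np.log(y), 1)
a_est = float(np.exp(log_a_est))
```

Both `a_est` and `b_est` come out close to 1, consistent with the table above.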

**Example 3: Linear Fit with Bounds**
```excel
=LEAST_SQUARES("a * x + b", {1;2;3}, {2;4;6}, {1,1}, {0,0}, {10,10})
```
Expected output:

| a   | b   |
|-----|-----|
| 2.0 | 0.0 |
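
In Example 3 the bounds do not change the answer, because the unconstrained optimum already lies inside them. The sketch below, which calls SciPy directly with illustrative data, shows a case where a lower bound of 0 on `b` is active and shifts the fit.

```python
import numpy as np
from scipy.optimize import least_squares

# Illustrative data whose unconstrained best-fit intercept is negative.
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.7, 4.3, 5.9])

def residuals(p):
    """Residuals of the linear model a * x + b."""
    a, b = p
    return a * x + b - y

# Unconstrained fit: slope about 2.1, intercept about -0.233.
free = least_squares(residuals, x0=[1.0, 1.0])

# Constrained fit: b is held at its lower bound 0, and a adjusts to about 2.0.
bounded = least_squares(residuals, x0=[1.0, 1.0], bounds=([0.0, 0.0], [10.0, 10.0]))
```

The initial guess must lie within the bounds, which is why `x0=[1.0, 1.0]` is used for both calls.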

In [None]:
import numpy as np
from scipy.optimize import least_squares as scipy_least_squares
import math
SAFE_GLOBALS = {k: getattr(math, k) for k in dir(math) if not k.startswith("_")}
SAFE_GLOBALS["np"] = np
SAFE_GLOBALS["numpy"] = np
SAFE_GLOBALS["exp"] = np.exp
SAFE_GLOBALS["log"] = np.log
SAFE_GLOBALS["sin"] = np.sin
SAFE_GLOBALS["cos"] = np.cos
SAFE_GLOBALS["tan"] = np.tan
SAFE_GLOBALS["abs"] = abs
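SAFE_GLOBALS["pow"] = pow
# eval() injects the full builtins module unless "__builtins__" is set explicitly,
# so restrict it here; abs and pow remain available through the entries above.
SAFE_GLOBALS["__builtins__"] = {}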

def least_squares(model, xdata, ydata, p_zero, bounds_lower=None, bounds_upper=None, method=None):
    """
    Solve a nonlinear least-squares problem by fitting a user-defined model to data.

    Args:
        model (str): Model function as a string, e.g., "a * x + b". Use variable `x` and parameter names.
        xdata (list[list[float]]): 2D list of input x values (independent variable).
        ydata (list[list[float]]): 2D list of observed y values (dependent variable).
        p_zero (list[list[float]]): 2D list of initial guesses for parameters.
        bounds_lower (list[list[float]], optional): 2D list of lower bounds for parameters.
        bounds_upper (list[list[float]], optional): 2D list of upper bounds for parameters.
        method (str, optional): Optimization method ("trf", "dogbox", "lm").

    Returns:
        list[list[float]]: Fitted parameter values as a single row, or
        str: Error message if calculation fails.

    This example function is provided as-is without any representation of accuracy.
    """
    try:
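        # Coerce to float so integer-valued Excel ranges do not yield integer arrays
        x = np.array(xdata, dtype=float).flatten()
        y = np.array(ydata, dtype=float).flatten()
        p_zero = np.array(p_zero, dtype=float).flatten()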
        n_params = len(p_zero)
        import re
        param_names = re.findall(r'\b[a-zA-Z_]\w*\b', model)
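        # Callables in SAFE_GLOBALS (e.g. sqrt) and the NumPy aliases are functions, not parameters
        reserved = {"x", "np", "numpy"} | {k for k, v in SAFE_GLOBALS.items() if callable(v)}
        param_names = [name for name in param_names if name not in reserved]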
        param_names = list(dict.fromkeys(param_names))
        if len(param_names) != n_params:
            return f"Number of initial guesses (p_zero) does not match number of parameters in model: {param_names}"
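        # Honor each bound independently; a missing side defaults to no limit
        lower = np.array(bounds_lower, dtype=float).flatten() if bounds_lower is not None else -np.inf
        upper = np.array(bounds_upper, dtype=float).flatten() if bounds_upper is not None else np.inf
        bounds = (lower, upper)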
        def residuals(params):
            local_dict = dict(zip(param_names, params))
            local_dict["x"] = x
            try:
                y_pred = eval(model, SAFE_GLOBALS, local_dict)
            except Exception:
                # np.full_like would fail on an integer y array, so build a float NaN array explicitly
                return np.full(y.shape, np.nan)
            return y_pred - y
        lsq_method = method if method is not None else 'trf'
        if lsq_method not in ('trf', 'dogbox', 'lm'):
            return "`method` must be 'trf', 'dogbox' or 'lm'."
        result = scipy_least_squares(residuals, p_zero, bounds=bounds, method=lsq_method)
        if not result.success:
            return f"Fit failed: {result.message}"
        return [result.x.tolist()]
    except Exception as e:
        return str(e)

In [None]:
import ipytest
ipytest.autoconfig()
import pytest

demo_cases = [
    ["a * x + b", [[1], [2], [3]], [[2.2], [3.8], [6.1]], [[1, 1]], [[0, 0]], [[10, 10]], "trf", [[1.9500000026743811, 0.1333333279846345]]],
    ["a * exp(b * x)", [[1], [2], [3]], [[2.9], [7.1], [19.7]], [[1, 1]], [[0, 0]], [[10, 10]], "trf", [[0.9897254816998906, 0.9964764194667617]]],
    ["a * x + b", [[1], [2], [3]], [[1.7], [4.3], [5.9]], [[1, 1]], [[0, 0]], [[10, 10]], "trf", [[2.0000000001596554, 1.8192803597338152e-22]]]
]

def approx_equal(a, b, rel=0.05, abs_tol=1e-4):
    # Scalar float only
    if isinstance(a, float) and isinstance(b, float):
        return a == pytest.approx(b, rel=rel, abs=abs_tol)
    # 2D list of floats only
    if (
        isinstance(a, list) and isinstance(b, list)
        and all(isinstance(x, list) for x in a)
        and all(isinstance(y, list) for y in b)
    ):
        return all(
            all(isinstance(x, float) and isinstance(y, float) and x == pytest.approx(y, rel=rel, abs=abs_tol) for x, y in zip(row_a, row_b))
            for row_a, row_b in zip(a, b)
        )
    return False

@pytest.mark.parametrize("model, xdata, ydata, p_zero, bounds_lower, bounds_upper, method, expected", demo_cases)
def test_demo_cases(model, xdata, ydata, p_zero, bounds_lower, bounds_upper, method, expected):
    result = least_squares(model, xdata, ydata, p_zero, bounds_lower, bounds_upper, method)
    print(f"test_demo_cases output for {model}: {result}")
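    # Fail loudly if the fit returned an error string instead of silently skipping the check
    assert isinstance(result, list), f"Expected a parameter row, got: {result}"
    assert approx_equal(result, expected, rel=0.05), f"Output {result} not within 5% of expected {expected}"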

def test_invalid_method():
    result = least_squares("a * x + b", [[1], [2], [3]], [[2], [4], [6]], [[1, 1]], [[0, 0]], [[10, 10]], "invalid")
    assert isinstance(result, str) and "method" in result

def test_param_mismatch():
    result = least_squares("a * x + b", [[1], [2], [3]], [[2], [4], [6]], [[1]])
    assert isinstance(result, str) and "initial guesses" in result

ipytest.run('-s')

In [None]:
import gradio as gr
import numpy as np
import matplotlib.pyplot as plt
import io
import base64

def gradio_least_squares(model, xdata, ydata, p_zero):
    result = least_squares(model, xdata, ydata, p_zero)
    # Prepare result table (DataFrame)
    if isinstance(result, str):
        result_df = [[result] + [None]*(len(p_zero[0])-1 if isinstance(p_zero, list) and len(p_zero)>0 else 0)]
    else:
        result_df = result
    # Prepare plot
    try:
        x = np.array(xdata).flatten()
        y = np.array(ydata).flatten()
        x_plot = np.linspace(np.min(x), np.max(x), 200)
        import re
        param_names = re.findall(r'\b[a-zA-Z_]\w*\b', model)
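        # Callables in SAFE_GLOBALS (e.g. sqrt) and the NumPy aliases are functions, not parameters
        reserved = {"x", "np", "numpy"} | {k for k, v in SAFE_GLOBALS.items() if callable(v)}
        param_names = [name for name in param_names if name not in reserved]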
        param_names = list(dict.fromkeys(param_names))
        if isinstance(result, list) and len(result) > 0:
            params = result[0]
            y_fit = []
            for x_val in x_plot:
                local_dict = dict(zip(param_names, params))
                local_dict["x"] = x_val
                try:
                    y_fit.append(eval(model, SAFE_GLOBALS, local_dict))
                except Exception:
                    y_fit.append(np.nan)
            plt.figure(figsize=(6, 4))
            plt.plot(x_plot, y_fit, label="Fitted Model", color="blue")
            plt.scatter(x, y, color="red", label="Data Points")
            plt.xlabel("x")
            plt.ylabel("y")
            plt.title("Least Squares Fit")
            plt.legend()
            plt.tight_layout()
            buf = io.BytesIO()
            plt.savefig(buf, format="png")
            plt.close()
            buf.seek(0)
            img_base64 = base64.b64encode(buf.read()).decode("utf-8")
            img_html = f'<img src="data:image/png;base64,{img_base64}" style="max-width:100%;height:auto;" />'
        else:
            img_html = ""
    except Exception:
        img_html = ""
    return result_df, img_html

demo = gr.Interface(
    fn=gradio_least_squares,
    inputs=[
        gr.Textbox(label="Model", value=demo_cases[0][0]),
        gr.DataFrame(label="xdata", type="array", value=demo_cases[0][1], headers=["x"]),
        gr.DataFrame(label="ydata", type="array", value=demo_cases[0][2], headers=["y"]),
        gr.DataFrame(label="Initial Guesses (p_zero)", type="array", value=demo_cases[0][3], headers=["a", "b"]),
    ],
    outputs=[
        gr.DataFrame(headers=["a", "b"], label="Fitted Parameters", type="array"),
        gr.HTML(label="Fit Plot")
    ],
    examples=[case[:4] for case in demo_cases],
    flagging_mode="never",
    fill_width=True,
)
demo.launch()