# Introduction to Nonlinear Least Squares

The method of least squares is a standard approach in statistical regression analysis. Its most important application is data fitting, where the fitted model is used to predict the behaviour of dependent variables.

In practice, we often have to determine the unknown parameters of a mathematical model so that it fits the given data. Usually the number of measurements is much greater than the number of unknowns; this is known as an **overdetermined system**.

![overdetermined_system.png](attachment:overdetermined_system.png)  

As we saw in a previous group presentation on linear least squares (LLS), a least squares problem asks us to solve this overdetermined system. Generally no exact solution exists, so we use optimization to find the best approximate solution.

There are many similarities between LLS and nonlinear least squares (NLLS), but also some significant differences.

In LLS the model is linear in its parameters, so the fitting problem reduces to a linear algebra problem that can be solved analytically. If the regression model is not linear in its parameters, it is nonlinear.
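To make the "solved analytically" point concrete, here is a minimal sketch of a linear fit. It assumes a straight-line model $y = a + b X$ and made-up data; the function name `fit_line` and the numbers are illustrative only. With four measurements and two unknowns, it is also a small overdetermined system.

```python
def fit_line(xs, ys):
    """Closed-form least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Illustrative data: 4 measurements, 2 unknown parameters
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```

No starting values and no iterations are needed here: the estimates come directly from a formula.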

Nonlinear least squares extends linear least squares to a much larger and more general class of functions. For example, linear models do not describe processes that approach an asymptote very well.

Here are some examples you may be familiar with that illustrate the diversity of nonlinear models, where $\theta$ denotes the parameters and $X$ the independent variable.

**Power-law growth model**: $\theta_1 * X^{\theta_2}$

![exponential_nonlinear_model.png](../exponential_nonlinear_model.png)



**Weibull growth model**: $\theta_1 + (\theta_2 - \theta_1) * \exp(-\theta_3 * X^{\theta_4})$

![weibull_growth_model.png](attachment:weibull_growth_model.png)


**Fourier series model**: $\theta_1 * \cos(X + \theta_4) + \theta_2 * \cos(2 * X + \theta_4) + \theta_3$

![fourier_nonlinear_model.png](attachment:fourier_nonlinear_model)

In NLLS, the model generally cannot be solved analytically, so we use iterative optimization procedures to compute the parameter estimates.

Similar to LLS, in nonlinear least squares (NLLS) we want to minimize the sum of the squared residuals:

Find $x \in \mathbb{R}^n$ that minimizes: 
$$
\| r(x) \|^2 = \sum_{i=1}^{m} r_i(x)^2
$$
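where $r: \mathbb{R}^n \to \mathbb{R}^m$ is the vector of residuals, $x$ is the vector of the $n$ unknown parameters (the $\theta$s in the examples above), and $m$ is the number of measurements.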
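As a rough illustration of this objective, the sketch below evaluates the residual vector and the sum of squares for the two-parameter model $\theta_1 * X^{\theta_2}$ listed earlier. The data and the function names (`model`, `residuals`, `sum_of_squares`) are made up for the example, and the parameter vector called $x$ in the formula is called `theta` in the code.

```python
def model(x, theta):
    """Two-parameter model theta1 * x**theta2 (illustrative choice)."""
    theta1, theta2 = theta
    return theta1 * x ** theta2


def residuals(theta, xs, ys):
    """Residual vector r(theta): measurement minus model prediction."""
    return [y - model(x, theta) for x, y in zip(xs, ys)]


def sum_of_squares(theta, xs, ys):
    """Objective ||r(theta)||^2 that NLLS tries to minimize."""
    return sum(r * r for r in residuals(theta, xs, ys))


# Made-up data generated exactly from theta = (2, 2), i.e. y = 2 * x**2
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 8.0, 18.0, 32.0]

print(sum_of_squares((2.0, 2.0), xs, ys))  # 0.0: perfect fit
print(sum_of_squares((1.0, 2.0), xs, ys))  # 354.0: worse parameters
```

An NLLS solver repeatedly evaluates a function like `residuals` while it adjusts `theta` to drive the sum of squares down.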

We know that the minimum occurs where the gradient is zero. However, the calculations in NLLS are done differently from LLS, since the parameters are refined iteratively. We will show how to obtain these values in the next section.
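As a reference point, and assuming $J(x)$ denotes the $m \times n$ Jacobian of $r$ (a symbol not introduced above), the zero-gradient condition can be written as:

$$
\nabla \| r(x) \|^2 = 2 J(x)^T r(x) = 0, \qquad J(x)_{ij} = \frac{\partial r_i(x)}{\partial x_j}
$$

In LLS the Jacobian is a constant matrix, so this condition is a linear system in $x$. In NLLS the Jacobian itself depends on $x$, which is why the condition generally has no closed-form solution.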

Unfortunately, the downside of iterative procedures is that they are more costly and require the user to provide starting values for the unknown parameters that are reasonably close to the solution; otherwise the procedure may not converge to the global minimum.
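To see why starting values matter, the sketch below scans the sum of squares over a grid for a one-parameter model. It assumes a hypothetical model $\sin(\theta X)$ with data generated from $\theta = 2$; it is a grid scan for illustration, not an iterative solver. The objective turns out to have several local minima, so a procedure that only moves downhill from a poor starting value can settle in the wrong one.

```python
import math

# Made-up data from the hypothetical model y = sin(theta * x), theta = 2
xs = [10 * i / 49 for i in range(50)]
ys = [math.sin(2.0 * x) for x in xs]


def sum_of_squares(theta):
    """Objective ||r(theta)||^2 for the one-parameter sine model."""
    return sum((y - math.sin(theta * x)) ** 2 for x, y in zip(xs, ys))


# Evaluate the objective on a grid of candidate parameter values in [0, 6]
thetas = [0.01 * k for k in range(601)]
costs = [sum_of_squares(t) for t in thetas]

# Interior grid points that are lower than both of their neighbours
local_minima = [
    thetas[k]
    for k in range(1, len(thetas) - 1)
    if costs[k] < costs[k - 1] and costs[k] < costs[k + 1]
]

# Grid point with the smallest objective value overall
best_theta = min(zip(costs, thetas))[1]

print("number of local minima on the grid:", len(local_minima))
print("global minimum on the grid at theta =", best_theta)
```

Only the minimum at $\theta = 2$ reproduces the data; the others are traps for a poorly initialized iteration.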

There are a few common methods for solving nonlinear least squares problems, but today we will focus on the Gauss-Newton algorithm. I will now let William introduce the algorithm and explain how it works.