# Generalized Linear Least Squares

The _linear_ in linear least squares refers to how the fit parameters enter the fitting function, $Y$, not to how $Y$ depends on $x$.  A function of the form:

$$Y(x; \{a_j\}) = \sum_{j=1}^M a_j Y_j(x)$$

is still linear in the $\{a_j\}$, even if the _basis functions_ $\{Y_j\}$ are nonlinear.  For example:

$$Y(x; \{a_j\}) = a_1 + a_2 x + a_3 x^2$$

is linear in the fit parameters $\{ a_j\}$.
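A minimal sketch of this quadratic example, assuming NumPy is available. The names `basis` and `model` are chosen here for illustration and are not from the text. It checks numerically that the model is linear in the parameters, even though the basis functions are nonlinear in $x$.

```python
import numpy as np

def basis(x):
    """Evaluate the basis functions Y_j(x) = 1, x, x^2; the last axis indexes j."""
    x = np.asarray(x, dtype=float)
    return np.stack([np.ones_like(x), x, x**2], axis=-1)

def model(x, a):
    """Y(x; a) = sum_j a_j Y_j(x), written as a dot product over the basis index j."""
    return basis(x) @ np.asarray(a, dtype=float)

x = np.linspace(0.0, 1.0, 5)
a = np.array([1.0, 2.0, 3.0])
b = np.array([-0.5, 0.25, 4.0])

# linearity in the parameters: Y(x; a + b) = Y(x; a) + Y(x; b)
print(np.allclose(model(x, a + b), model(x, a) + model(x, b)))
```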

We can apply the same technique we used to fit a line to this general case.

Our $\chi^2$ is:

$$\chi^2(\{a_j\}) = \sum_{i=1}^N \frac{(Y(x_i; \{a_j\}) - y_i)^2}{\sigma_i^2} =
\sum_{i=1}^N \frac{1}{\sigma_i^2} \left [\left (\sum_{j=1}^M a_j Y_j(x_i)\right ) - y_i \right ]^2$$
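The definition of $\chi^2$ translates almost directly into code. This is a sketch under the assumption that the basis functions are passed as a list of Python callables. The name `chisq` and the sample data are made up for illustration.

```python
import numpy as np

def chisq(a, x, y, sigma, basis_funcs):
    """chi^2 = sum_i [ (sum_j a_j Y_j(x_i)) - y_i ]^2 / sigma_i^2"""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # inner sum over j: the model evaluated at every x_i
    Y = sum(a_j * Y_j(x) for a_j, Y_j in zip(a, basis_funcs))
    # outer sum over i: squared residuals weighted by 1 / sigma_i^2
    return np.sum(((Y - y) / sigma)**2)

# illustrative quadratic basis: Y_1 = 1, Y_2 = x, Y_3 = x^2
basis_funcs = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2]

x = np.array([0.0, 1.0, 2.0, 3.0])
sigma = 0.1 * np.ones_like(x)
y = 1.0 + 2.0 * x + 3.0 * x**2      # noise-free data from a = (1, 2, 3)

print(chisq([1.0, 2.0, 3.0], x, y, sigma, basis_funcs))   # the generating parameters
print(chisq([1.0, 2.0, 2.5], x, y, sigma, basis_funcs))   # a worse set of parameters
```

With noise-free data, the generating parameters give $\chi^2 = 0$, and any other choice gives a larger value.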

To minimize $\chi^2$, we differentiate it with respect to each of the parameters, $a_k$, and set the derivative to zero:

\begin{align*}
\frac{\partial \chi^2}{\partial a_k} 
    &= \frac{\partial}{\partial a_k} 
          \sum_{i=1}^N \frac{1}{\sigma_i^2} \left [\left (\sum_{j=1}^M a_j Y_j(x_i)\right ) - y_i \right ]^2 \\
    &= \sum_{i=1}^N \frac{1}{\sigma_i^2} 
          \frac{\partial}{\partial a_k} \left [\left (\sum_{j=1}^M a_j Y_j(x_i)\right ) - y_i \right ]^2 \\
    &= 2 \sum_{i=1}^N \frac{1}{\sigma_i^2} \left [\left (\sum_{j=1}^M a_j Y_j(x_i)\right ) - y_i \right ] Y_k(x_i) = 0
\end{align*}

Dividing by 2 and moving the $y_i$ term to the right-hand side gives one equation for each $k = 1, \ldots, M$:

$$\sum_{i=1}^N \sum_{j=1}^M a_j \frac{Y_j(x_i) Y_k(x_i)}{\sigma_i^2} = \sum_{i=1}^N \frac{y_i Y_k(x_i)}{\sigma_i^2}$$
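As a quick check, take the special case $M = 2$ with $Y_1(x) = 1$ and $Y_2(x) = x$ (a choice made here for illustration).  Writing out the $k = 1$ and $k = 2$ equations gives:

\begin{align*}
a_1 \sum_{i=1}^N \frac{1}{\sigma_i^2} + a_2 \sum_{i=1}^N \frac{x_i}{\sigma_i^2}
    &= \sum_{i=1}^N \frac{y_i}{\sigma_i^2} \\
a_1 \sum_{i=1}^N \frac{x_i}{\sigma_i^2} + a_2 \sum_{i=1}^N \frac{x_i^2}{\sigma_i^2}
    &= \sum_{i=1}^N \frac{x_i y_i}{\sigma_i^2}
\end{align*}

which should match the straight-line result, up to notation.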

Defining the $N \times M$ _design matrix_ as

$$A_{ij} = \frac{Y_j(x_i)}{\sigma_i}$$

and the source vector as:

$$b_i = \frac{y_i}{\sigma_i}$$

our system is:

$$\sum_{i=1}^N \sum_{j=1}^M A_{ik} A_{ij} a_j = \sum_{i=1}^N A_{ik} b_i$$

Since $\sum_i A_{ik} A_{ij} = ({\bf A}^\intercal {\bf A})_{kj}$ and $\sum_i A_{ik} b_i = ({\bf A}^\intercal {\bf b})_k$, this is the linear system:

$${\bf A}^\intercal {\bf A} {\bf a} = {\bf A}^\intercal {\bf b}$$

where ${\bf A}^\intercal {\bf A}$ is an $M\times M$ matrix and ${\bf a}$ is the vector of the $M$ fit parameters.  These are known as the _normal equations_.
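The whole procedure can be sketched in a few lines of NumPy: build the design matrix and source vector, then solve the $M \times M$ system.  The function name `general_regression`, the quadratic basis, and the synthetic data are all assumptions made for this illustration.

```python
import numpy as np

def general_regression(x, y, sigma, basis_funcs):
    """Fit y ~ sum_j a_j Y_j(x) by solving the normal equations A^T A a = A^T b."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    # design matrix, A_ij = Y_j(x_i) / sigma_i, shape (N, M)
    A = np.column_stack([Y_j(x) / sigma for Y_j in basis_funcs])

    # source vector, b_i = y_i / sigma_i, length N
    b = y / sigma

    # A^T A is (M, M) and A^T b has length M
    return np.linalg.solve(A.T @ A, A.T @ b)

# illustrative quadratic basis: Y_1 = 1, Y_2 = x, Y_3 = x^2
basis_funcs = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2]

# synthetic data: a known quadratic plus Gaussian noise of width sigma
rng = np.random.default_rng(12345)
N = 50
x = np.linspace(0.0, 5.0, N)
sigma = 0.5 * np.ones(N)
a_true = np.array([1.0, -2.0, 0.5])
y = a_true[0] + a_true[1] * x + a_true[2] * x**2 + sigma * rng.standard_normal(N)

a_fit = general_regression(x, y, sigma, basis_funcs)
print(a_fit)
```

Forming ${\bf A}^\intercal {\bf A}$ explicitly squares the condition number of ${\bf A}$, so for poorly conditioned bases it can be safer to pass ${\bf A}$ and ${\bf b}$ directly to `np.linalg.lstsq`.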