# `opinf.lstsq`

```{eval-rst}
.. automodule:: opinf.lstsq

.. currentmodule:: opinf.lstsq

.. autosummary::
   :toctree: _autosummaries
   :nosignatures:

   SolverTemplate
   PlainSolver
   L2Solver
   L2DecoupledSolver
   TikhonovSolver
   TikhonovDecoupledSolver
   TotalLeastSquaresSolver
```

TODO: EXAMPLE DATA

## Least-squares Operator Inference Problems

Operator Inference uses data to learn the entries of an $r \times d$ *operator matrix* $\Ohat$ by solving a regression problem, stated generally as

::::{margin}
:::{admonition} What is $\Z$?
For continuous models (systems of ordinary differential equations), $\Z$ consists of the time derivatives of the snapshots; for discrete models (discrete dynamical systems), $\Z$ consists of state snapshots, shifted forward by one time step relative to those in $\D$.
:::
::::

$$
\begin{aligned}
    \text{find}\quad\Ohat\quad\text{such that}\quad
    \Z \approx \Ohat\D\trp
    \quad\Longleftrightarrow\quad
    \D\Ohat\trp \approx \Z\trp,
\end{aligned}
$$ (eq:lstsq:general)

where $\D$ is the $k \times d$ *data matrix* formed from state and input snapshots and $\Z$ is the $r \times k$ matrix of left-hand side data.

This module defines classes for solving the least-squares problem {eq}`eq:lstsq:plain`, as well as related problems with regularization terms and/or constraints, given the data matrices $\D$ and $\Z$.
Solver objects are passed to the constructor of {mod}`opinf.models` classes.
The model handles the construction of $\D$ and $\Z$ from snapshot data, passes these matrices to the solver's `fit()` method, calls the solver's `predict()` method to produce $\Ohat$, and interprets $\Ohat$ in the context of the model structure.

::::{admonition} Example
:class: tip

Suppose we want to construct a linear time-invariant (LTI) system,

$$
\begin{aligned}
    \ddt\qhat(t)
    = \Ahat\qhat(t) + \Bhat\u(t),
    \qquad
    \Ahat\in\RR^{r \times r},
    ~
    \Bhat\in\RR^{r \times m}.
\end{aligned}
$$ (eq:lstsq:ltiexample)

The operator matrix is $\Ohat = [~\Ahat~~\Bhat~]\in\RR^{r \times d}$ with column dimension $d = r + m$.
To learn $\Ohat$ with Operator Inference, we need data for $\qhat(t)$, $\u(t)$, and $\ddt\qhat(t)$.
For $j = 0, \ldots, k-1$, let

- $\qhat_{j}\in\RR^r$ be a measurement of the (reduced) state at time $t_{j}$,
- $\dot{\qhat}_{j} = \ddt\qhat(t)\big|_{t=t_{j}} \in \RR^r$ be the time derivative of the state at time $t_{j}$, and
- $\u_{j} = \u(t_j) \in \RR^m$ be the input at time $t_{j}$.

In this case, the data matrix $\D$ is given by $\D = [~\Qhat\trp~~\U\trp~]\in\RR^{k \times d}$, where

$$
\begin{aligned}
    \Qhat = \left[\begin{array}{ccc}
        & & \\
        \qhat_0 & \cdots & \qhat_{k-1}
        \\ & &
    \end{array}\right]
    \in \RR^{r\times k},
    \qquad
    \U = \left[\begin{array}{ccc}
        & & \\
        \u_0 & \cdots & \u_{k-1}
        \\ & &
    \end{array}\right]
    \in \RR^{m \times k}.
\end{aligned}
$$

The left-hand side data is $\Z = \dot{\Qhat} = [~\dot{\qhat}_0~~\cdots~~\dot{\qhat}_{k-1}~]\in\RR^{r\times k}$.

:::{dropdown} Derivation
We seek $\Ahat$ and $\Bhat$ such that

$$
\begin{aligned}
    \dot{\qhat}_{j}
    \approx \Ahat\qhat_j + \Bhat\u_j,
    \qquad j = 0, \ldots, k-1.
\end{aligned}
$$

Using the snapshot matrices $\Qhat$, $\U$, and $\dot{\Qhat}$ defined above, we want

$$
\begin{aligned}
    \dot{\Qhat}
    \approx \Ahat\Qhat + \Bhat\U
    = [~\Ahat~~\Bhat~]\left[\begin{array}{c} \Qhat \\ \U \end{array}\right],
    \quad\text{or}
    \\
    [~\Qhat\trp~~\U\trp~][~\Ahat~~\Bhat~]\trp \approx \dot{\Qhat}\trp,
\end{aligned}
$$

which is $\D\Ohat\trp \approx \Z\trp$.

More precisely, a regression problem for $\Ohat$ with respect to the data triples $(\qhat_j, \u_j, \dot{\qhat}_j)$ can be written as

$$
\begin{aligned}
    \argmin_{\Ahat,\Bhat}\sum_{j=0}^{k-1}\left\|
        \Ahat\qhat_j + \Bhat\u_j - \dot{\qhat}_j
    \right\|_{2}^{2}
    &= \argmin_{\Ahat,\Bhat}\left\|
        \Ahat\Qhat + \Bhat\U - \dot{\Qhat}
    \right\|_{F}^{2}
    \\
    &= \argmin_{\Ahat,\Bhat}\left\|
        [~\Ahat~~\Bhat~]\left[\begin{array}{c} \Qhat \\ \U \end{array}\right] - \Z
    \right\|_{F}^{2}
    \\
    &= \argmin_{\Ahat,\Bhat}\left\|
        [~\Qhat\trp~~\U\trp~][~\Ahat~~\Bhat~]\trp - \Z\trp
    \right\|_{F}^{2},
\end{aligned}
$$

which is $\argmin_{\Ohat}\|\D\Ohat\trp - \Z\trp\|_F^2$.
:::
::::
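
The LTI example above can be sketched numerically with plain NumPy.
This is a minimal illustration under stated assumptions, not the code path used by {mod}`opinf.models`: the "true" operators `A_true` and `B_true` and the random snapshot data are made up so that the regression has a known answer, and `numpy.linalg.lstsq` stands in for a solver object.

```python
import numpy as np

rng = np.random.default_rng(0)
r, m, k = 3, 2, 50  # state dimension, input dimension, number of snapshots

# Hypothetical "true" operators, used only to manufacture consistent data.
A_true = rng.standard_normal((r, r))
B_true = rng.standard_normal((r, m))

Q = rng.standard_normal((r, k))  # state snapshots, r x k
U = rng.standard_normal((m, k))  # input snapshots, m x k
Qdot = A_true @ Q + B_true @ U   # time derivatives, r x k

# Data matrix D = [Q^T  U^T] (k x d) and left-hand side data Z (r x k).
D = np.hstack([Q.T, U.T])
Z = Qdot

# Solve min ||D O^T - Z^T||_F^2 for O^T (d x r), then transpose to get O (r x d).
Ohat_T, *_ = np.linalg.lstsq(D, Z.T, rcond=None)
Ohat = Ohat_T.T

# Interpret the operator matrix as O = [A  B].
A_hat, B_hat = Ohat[:, :r], Ohat[:, r:]
```

Because the synthetic data are noise-free and $k > d$, `A_hat` and `B_hat` should match `A_true` and `B_true` up to floating-point error.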

## Default Solver

Most often, we pose {eq}`eq:lstsq:general` as a linear least-squares regression,

$$
\begin{aligned}
    \argmin_{\Ohat} \|\D\Ohat\trp - \Z\trp\|_F^2.
\end{aligned}
$$ (eq:lstsq:plain)

Note that the matrix least-squares problem {eq}`eq:lstsq:plain` decouples into $r$ independent vector least-squares problems, i.e.,

$$
\begin{aligned}
    \argmin_{\ohat_i} \|\D\ohat_i - \z_i\|_2^2,
    \quad i = 1, \ldots, r,
\end{aligned}
$$

where $\ohat_i\in\RR^{d}$ and $\z_i\in\RR^{k}$ are the $i$-th rows of $\Ohat$ and $\Z$, respectively, written as column vectors.

The {class}`PlainSolver` class solves {eq}`eq:lstsq:plain` without any additional terms.
This is the default solver used if another solver is not specified in the constructor of an {mod}`opinf.models` class.
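
The decoupling of {eq}`eq:lstsq:plain` into $r$ vector problems can be checked numerically.
The following sketch uses random data and `numpy.linalg.lstsq`; it illustrates the mathematical statement only and is not meant to describe how {class}`PlainSolver` is implemented.

```python
import numpy as np

rng = np.random.default_rng(1)
k, d, r = 40, 6, 3
D = rng.standard_normal((k, d))  # data matrix, k x d
Z = rng.standard_normal((r, k))  # left-hand side data, r x k

# Coupled solve: all r rows of the operator matrix at once.
Ohat_coupled = np.linalg.lstsq(D, Z.T, rcond=None)[0].T

# Decoupled solve: one vector least-squares problem per row.
Ohat_rows = np.vstack(
    [np.linalg.lstsq(D, Z[i], rcond=None)[0] for i in range(r)]
)

# Largest entrywise difference between the two solutions.
max_diff = np.abs(Ohat_coupled - Ohat_rows).max()
```

The two results should agree to within floating-point error, so `max_diff` should be tiny relative to the entries of the solution.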

## Tikhonov Regularization


It is often advantageous to add a *regularization term* $\mathcal{R}(\Ohat)$ to penalize the entries of the inferred operators.
This helps prevent overfitting to the data and promotes stability and accuracy in the learned reduced-order model {cite}`mcquarrie2021combustion`.
The regression problem then becomes

$$
\begin{aligned}
    \argmin_{\Ohat}\|
        \D\Ohat\trp - \Z\trp
    \|_{F}^{2} + \mathcal{R}(\Ohat).
\end{aligned}
$$

A [Tikhonov regularization](https://en.wikipedia.org/wiki/Ridge_regression#Tikhonov_regularization) term has the form

$$
\begin{aligned}
    \mathcal{R}(\Ohat)
    = \sum_{i=1}^{r}\|\bfGamma_i\ohat_i\|_2^2,
\end{aligned}
$$

where $\ohat_1,\ldots,\ohat_r$ are the rows of $\Ohat$ and each of $\bfGamma_1,\ldots,\bfGamma_r$ is a $d \times d$ symmetric positive-definite matrix.
In this case, the decoupled regressions for the rows of $\Ohat$ are given by

$$
\begin{aligned}
    \argmin_{\ohat_i} \|\D\ohat_i - \z_i\|_2^2 + \|\bfGamma_i\ohat_i\|_2^2,
    \quad i = 1, \ldots, r.
\end{aligned}
$$
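
For reference, each decoupled problem has a closed-form solution.
Setting the gradient of the objective with respect to $\ohat_i$ to zero gives the normal equations, which have a unique solution because $\bfGamma_i$ is positive definite and hence $\D\trp\D + \bfGamma_i\trp\bfGamma_i$ is invertible:

$$
\begin{aligned}
    \left(\D\trp\D + \bfGamma_i\trp\bfGamma_i\right)\ohat_i
    = \D\trp\z_i,
    \quad i = 1, \ldots, r.
\end{aligned}
$$

Equivalently, each problem is an ordinary least-squares problem with a stacked data matrix,

$$
\begin{aligned}
    \argmin_{\ohat_i}\left\|
        \left[\begin{array}{c} \D \\ \bfGamma_i \end{array}\right]\ohat_i
        - \left[\begin{array}{c} \z_i \\ \mathbf{0} \end{array}\right]
    \right\|_2^2,
    \quad i = 1, \ldots, r,
\end{aligned}
$$

where $\mathbf{0}\in\RR^{d}$ is the zero vector.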

The following classes solve Tikhonov-regularized least-squares Operator Inference regressions for different choices of the regularization term $\mathcal{R}(\Ohat)$.

| Solver class                     | Description                                      | Regularization $\mathcal{R}(\Ohat)$ |
| :------------------------------- | :----------------------------------------------- | :------------------ |
| {class}`L2Solver`                | One scalar regularizer for all $\ohat_i$         | $\lambda^{2}\|\Ohat\trp\|_F^2$ |
| {class}`L2DecoupledSolver`       | Different scalar regularizers for each $\ohat_i$ | $\sum_{i=1}^{r}\lambda_i^2\|\ohat_i\|_2^2$ |
| {class}`TikhonovSolver`          | One matrix regularizer for all $\ohat_i$         | $\|\bfGamma\Ohat\trp\|_F^2$ |
| {class}`TikhonovDecoupledSolver` | Different matrix regularizers for each $\ohat_i$ | $\sum_{i=1}^{r}\|\bfGamma_i\ohat_i\|_2^2$ |
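
Each row of the table corresponds to a particular choice of the matrices $\bfGamma_i$: $\lambda\mathbf{I}$ for all $i$, $\lambda_i\mathbf{I}$, a single $\bfGamma$ for all $i$, or a separate $\bfGamma_i$ for each row.
The sketch below makes this concrete with plain NumPy by solving each row problem as an ordinary least-squares problem with the regularizer stacked under the data matrix.
The helper `tikhonov_rows` and all of the data are hypothetical; this is not the implementation used by the solver classes.

```python
import numpy as np


def tikhonov_rows(D, Z, gammas):
    """Solve min ||D o_i - z_i||^2 + ||Gamma_i o_i||^2 for each row o_i."""
    d = D.shape[1]
    rows = []
    for z_i, G_i in zip(Z, gammas):
        # Stack the regularizer under D and pad the right-hand side with zeros.
        D_aug = np.vstack([D, G_i])
        z_aug = np.concatenate([z_i, np.zeros(d)])
        rows.append(np.linalg.lstsq(D_aug, z_aug, rcond=None)[0])
    return np.vstack(rows)  # r x d operator matrix


rng = np.random.default_rng(2)
k, d, r = 30, 5, 3
D = rng.standard_normal((k, d))
Z = rng.standard_normal((r, k))
eye = np.eye(d)

lam = 0.1                                    # one scalar for all rows
lams = [0.01, 0.1, 1.0]                      # one scalar per row
G = np.diag(rng.uniform(0.1, 1.0, size=d))   # one SPD matrix for all rows
Gs = [np.diag(rng.uniform(0.1, 1.0, size=d)) for _ in range(r)]  # one per row

O_l2 = tikhonov_rows(D, Z, [lam * eye] * r)
O_l2_decoupled = tikhonov_rows(D, Z, [lam_i * eye for lam_i in lams])
O_tikhonov = tikhonov_rows(D, Z, [G] * r)
O_tikhonov_decoupled = tikhonov_rows(D, Z, Gs)
```

Writing all four cases through one helper shows that they differ only in how the regularization matrices are chosen, not in the structure of the regression.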

## Total Least-Squares

Linear least-squares models for $\D\Ohat\trp \approx \Z\trp$ assume error in $\Z$ only, i.e.,

$$
\begin{aligned}
    \D\Ohat\trp = \Z\trp + \Delta_{\Z\trp}
\end{aligned}
$$

for some $\Delta_{\Z\trp} \in \RR^{k\times r}$.
[Total least-squares](https://en.wikipedia.org/wiki/Total_least_squares) is an alternative approach that assumes possible error in the data matrix $\D$ as well as in $\Z$, i.e.,

$$
\begin{aligned}
    (\D + \Delta_{\D})\Ohat\trp = \Z\trp + \Delta_{\Z\trp}
\end{aligned}
$$

for $\Delta_{\D}\in\RR^{k \times d}$ and $\Delta_{\Z\trp}\in\RR^{k \times r}$.

The {class}`TotalLeastSquaresSolver` class performs a total least-squares solve for $\Ohat$.
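
For intuition, the textbook total least-squares solution can be computed from the singular value decomposition of the concatenated matrix $[~\D~~\Z\trp~]$.
The sketch below implements that classical construction in NumPy, assuming $k \ge d + r$ and that the relevant $r \times r$ block of right singular vectors is invertible.
The function name `total_least_squares` is hypothetical, and this is not necessarily the algorithm that {class}`TotalLeastSquaresSolver` uses.

```python
import numpy as np


def total_least_squares(D, Z):
    """Classical TLS estimate of O (r x d) for D O^T ~ Z^T via the SVD of [D  Z^T]."""
    d = D.shape[1]
    # Right singular vectors of the k x (d + r) concatenated matrix (needs k >= d + r).
    _, _, Vt = np.linalg.svd(np.hstack([D, Z.T]), full_matrices=False)
    V = Vt.T
    V12 = V[:d, d:]  # d x r block
    V22 = V[d:, d:]  # r x r block, assumed invertible
    # O^T = -V12 V22^{-1}, so O = -(V22^{-T} V12^T).
    return -np.linalg.solve(V22.T, V12.T)


rng = np.random.default_rng(3)
k, d, r = 40, 4, 2
O_true = rng.standard_normal((r, d))   # made-up operator matrix
D_exact = rng.standard_normal((k, d))
Z_exact = (D_exact @ O_true.T).T

# Perturb both D and Z, which is the setting total least-squares is designed for.
D_noisy = D_exact + 1e-3 * rng.standard_normal((k, d))
Z_noisy = Z_exact + 1e-3 * rng.standard_normal((r, k))

O_exact = total_least_squares(D_exact, Z_exact)
O_noisy = total_least_squares(D_noisy, Z_noisy)
```

With the unperturbed data, `O_exact` should reproduce `O_true` up to floating-point error; with perturbed data, `O_noisy` is only an estimate.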


## Custom Solvers

The {class}`SolverTemplate` class defines the API for least-squares solvers.
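
As a rough illustration of the two-step workflow described at the top of this page (receive $\D$ and $\Z$, then produce $\Ohat$), the following stand-alone class mimics a solver with `fit()` and `predict()` methods.
It is a hypothetical sketch: it does not inherit from {class}`SolverTemplate`, and the class name, argument names, argument order, and constructor signature are all assumptions.
Consult the {class}`SolverTemplate` documentation for the actual requirements.

```python
import numpy as np


class RidgeSolverSketch:
    """Hypothetical stand-alone solver following a fit()/predict() pattern.

    This class does NOT inherit from SolverTemplate; it only imitates the
    workflow of receiving the data matrices and then producing the operator
    matrix.
    """

    def __init__(self, regularizer=1e-2):
        self.regularizer = regularizer  # scalar lambda >= 0
        self.D = None
        self.Z = None

    def fit(self, D, Z):
        """Store the k x d data matrix D and the r x k left-hand side Z."""
        if D.shape[0] != Z.shape[1]:
            raise ValueError("D and Z must share the snapshot dimension k")
        self.D, self.Z = D, Z
        return self

    def predict(self):
        """Return the r x d minimizer of ||D O^T - Z^T||_F^2 + lambda^2 ||O^T||_F^2."""
        d = self.D.shape[1]
        lhs = self.D.T @ self.D + self.regularizer**2 * np.eye(d)
        return np.linalg.solve(lhs, self.D.T @ self.Z.T).T


# Usage with made-up data.
rng = np.random.default_rng(4)
D = rng.standard_normal((25, 4))
Z = rng.standard_normal((3, 25))
Ohat = RidgeSolverSketch(regularizer=0.1).fit(D, Z).predict()
```

Separating `fit()` from `predict()` lets the data be stored once and the regression be re-solved later, for example after changing the regularizer.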