# Best Linear Unbiased Estimator

Source: search for `Best Linear Unbiased Estimator by Etan Dowley`

---

## The Model

Let $\mathbf{y}$ be the *observation* vector and $\mathbf{x}$ be the *state* vector. They shall be related to one another via the equation:

$$
\mathbf{y} = h\left(\mathbf{x} \right) + \mathbf{e}
$$

$h$ denotes an *observation* operator which maps $\mathbf{x} \in \mathbb{R}^{N_x}$ to $\mathbf{y} \in \mathbb{R}^{N_y}$. $\mathbf{e} \in \mathbb{R}^{N_y}$ is a noise vector.

The error vector $\mathbf{e}$ shall have zero mean and shall be independent of $\mathbf{x}$.

**Assumption**

With a linear estimator $\mathbf{\hat{x}} = \mathbf{A} \cdot \mathbf{y} + \mathbf{b}$ it shall be possible to get a good approximation / estimate of $\mathbf{x}$.

Matrix $\mathbf{A} \in \mathbb{R}^{N_x \times N_y}$.

Vector $\mathbf{b} \in \mathbb{R}^{N_x}$.

The matrix $\mathbf{A}$ and the vector $\mathbf{b}$ shall be determined so as to minimise the mean squared error $E\left((\mathbf{x} - \mathbf{\hat{x}})^T \cdot (\mathbf{x} - \mathbf{\hat{x}}) \right)$.

---

$$\begin{align}
(\mathbf{x} - \mathbf{\hat{x}})^T \cdot (\mathbf{x} - \mathbf{\hat{x}}) &= \left(\mathbf{x} - \left(\mathbf{A} \cdot \mathbf{y} + \mathbf{b} \right)  \right)^T \cdot \left(\mathbf{x} - \left(\mathbf{A} \cdot \mathbf{y} + \mathbf{b} \right)  \right) \\
&= \left(\mathbf{x}^T - \mathbf{y}^T \cdot \mathbf{A}^T  - \mathbf{b}^T  \right) \cdot \left(\mathbf{x} - \mathbf{A} \cdot \mathbf{y} - \mathbf{b}  \right) \\
&= \mathbf{x}^T \cdot \mathbf{x} - \mathbf{x}^T \cdot \mathbf{A} \cdot \mathbf{y} - \mathbf{x}^T \cdot \mathbf{b}  \\
&- \mathbf{y}^T \cdot \mathbf{A}^T \cdot \mathbf{x} + \mathbf{y}^T \cdot \mathbf{A}^T \cdot \mathbf{A} \cdot \mathbf{y} + \mathbf{y}^T \cdot \mathbf{A}^T \cdot \mathbf{b}  \\
&- \mathbf{b}^T \cdot \mathbf{x} + \mathbf{b}^T \cdot \mathbf{A} \cdot \mathbf{y} + \mathbf{b}^T \cdot \mathbf{b}  \\
\\
&= \mathbf{x}^T \cdot \mathbf{x} - 2 \cdot \mathbf{x}^T \cdot \mathbf{A} \cdot \mathbf{y} - 2 \cdot \mathbf{x}^T \cdot \mathbf{b} + \mathbf{y}^T \cdot \mathbf{A}^T \cdot \mathbf{A} \cdot \mathbf{y} + 2 \cdot \mathbf{y}^T \cdot \mathbf{A}^T \cdot \mathbf{b} + \mathbf{b}^T \cdot \mathbf{b}
\end{align}
$$
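The expansion holds for any values of $\mathbf{x}$, $\mathbf{y}$, $\mathbf{A}$ and $\mathbf{b}$, so it can be checked numerically. The following is a minimal sketch using NumPy; the dimensions, the random seed and all variable names are illustrative assumptions and not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_y = 3, 2                      # illustrative dimensions (assumed)

x = rng.normal(size=n_x)             # state vector
y = rng.normal(size=n_y)             # observation vector
A = rng.normal(size=(n_x, n_y))      # A maps R^{N_y} to R^{N_x}
b = rng.normal(size=n_x)

# Left-hand side: squared error computed directly.
residual = x - (A @ y + b)
lhs = residual @ residual

# Right-hand side: the six collected terms of the final expansion line.
rhs = (x @ x - 2 * x @ A @ y - 2 * x @ b
       + y @ A.T @ A @ y + 2 * y @ A.T @ b + b @ b)

expansion_gap = abs(lhs - rhs)
print(expansion_gap)                 # zero up to floating-point rounding
```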

Taking the partial derivatives $\frac{\partial}{\partial a_{i,\ j}}$ with respect to the elements of matrix $\mathbf{A}$ and the partial derivatives $\frac{\partial}{\partial b_{j}}$ with respect to the elements of vector $\mathbf{b}$, and collecting them into a matrix and a vector respectively, yields:

---

**partial derivatives $\frac{\partial}{\partial a_{i,\ j}}$**

$$\begin{align}
\frac{\partial}{\partial a_{i,\ j}} \mathbf{x}^T \cdot \mathbf{x} &= \mathbf{0} \\
\frac{\partial}{\partial a_{i,\ j}} \mathbf{x}^T \cdot \mathbf{A} \cdot \mathbf{y} &= \mathbf{x} \cdot  \mathbf{y}^T \\
\frac{\partial}{\partial a_{i,\ j}} \mathbf{x}^T \cdot \mathbf{b} &= \mathbf{0} \\
\frac{\partial}{\partial a_{i,\ j}} \mathbf{y}^T \cdot \mathbf{A}^T \cdot \mathbf{A} \cdot \mathbf{y} &= \frac{\partial}{\partial a_{i,\ j}}  \left(\mathbf{A} \cdot \mathbf{y} \right)^T \cdot \left(\mathbf{A} \cdot \mathbf{y} \right)\\
&= \mathbf{A} \cdot \left(\mathbf{y} \cdot \mathbf{y}^T + \mathbf{y} \cdot \mathbf{y}^T  \right)\\
&= 2 \cdot \mathbf{A} \cdot \mathbf{y} \cdot \mathbf{y}^T \\ 
\frac{\partial}{\partial a_{i,\ j}} \mathbf{y}^T \cdot \mathbf{A}^T \cdot \mathbf{b} &=  \mathbf{b} \cdot \mathbf{y}^T \\
\frac{\partial}{\partial a_{i,\ j}} \mathbf{b}^T \cdot \mathbf{b} &= \mathbf{0} 
\end{align}
$$

**partial derivatives $\frac{\partial}{\partial b_{j}}$**

$$\begin{align}
\frac{\partial}{\partial b_j} \mathbf{x}^T \cdot \mathbf{x} &= \mathbf{0} \\
\frac{\partial}{\partial b_j} \mathbf{x}^T \cdot \mathbf{A} \cdot \mathbf{y} &= \mathbf{0} \\
\frac{\partial}{\partial b_j} \mathbf{x}^T \cdot \mathbf{b} &= \mathbf{x} \\
\frac{\partial}{\partial b_j} \mathbf{y}^T \cdot \mathbf{A}^T \cdot \mathbf{A} \cdot \mathbf{y} &= \mathbf{0}\\
\frac{\partial}{\partial b_j} \mathbf{y}^T \cdot \mathbf{A}^T \cdot \mathbf{b} &= \mathbf{A} \cdot \mathbf{y} \\
\frac{\partial}{\partial b_j} \mathbf{b}^T \cdot \mathbf{b} &= 2 \cdot \mathbf{b} 
\end{align}
$$


Putting everything together ...

$$\begin{align}
\frac{\partial}{\partial a_{i,\ j}} (\mathbf{x} - \mathbf{\hat{x}})^T \cdot (\mathbf{x} - \mathbf{\hat{x}}) &=  - 2 \cdot \mathbf{x} \cdot  \mathbf{y}^T   + 2 \cdot \mathbf{A} \cdot \mathbf{y} \cdot \mathbf{y}^T  + 2 \cdot \mathbf{b} \cdot \mathbf{y}^T \\
\\
\frac{\partial}{\partial b_j} (\mathbf{x} - \mathbf{\hat{x}})^T \cdot (\mathbf{x} - \mathbf{\hat{x}}) &=  - 2 \cdot \mathbf{x} + 2 \cdot \mathbf{A} \cdot \mathbf{y}  + 2 \cdot \mathbf{b}\\
\end{align}
$$
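The two collected gradients can be compared against central finite differences. This is a sketch under assumed dimensions and random test values; the function names `squared_error`, `analytic_gradients` and `numeric_gradients` are hypothetical helpers introduced only for this check.

```python
import numpy as np

def squared_error(x, y, A, b):
    """Scalar cost (x - x_hat)^T (x - x_hat) with x_hat = A y + b."""
    r = x - (A @ y + b)
    return r @ r

def analytic_gradients(x, y, A, b):
    """Gradients from the derivation, collected into a matrix and a vector."""
    grad_A = -2 * np.outer(x, y) + 2 * A @ np.outer(y, y) + 2 * np.outer(b, y)
    grad_b = -2 * x + 2 * A @ y + 2 * b
    return grad_A, grad_b

def numeric_gradients(x, y, A, b, eps=1e-6):
    """Central finite differences, perturbing one element at a time."""
    grad_A = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            step = np.zeros_like(A)
            step[i, j] = eps
            grad_A[i, j] = (squared_error(x, y, A + step, b)
                            - squared_error(x, y, A - step, b)) / (2 * eps)
    grad_b = np.zeros_like(b)
    for j in range(b.size):
        step = np.zeros_like(b)
        step[j] = eps
        grad_b[j] = (squared_error(x, y, A, b + step)
                     - squared_error(x, y, A, b - step)) / (2 * eps)
    return grad_A, grad_b

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=2)
A, b = rng.normal(size=(3, 2)), rng.normal(size=3)

gA, gb = analytic_gradients(x, y, A, b)
nA, nb = numeric_gradients(x, y, A, b)
print(np.max(np.abs(gA - nA)), np.max(np.abs(gb - nb)))
```

Because the cost is quadratic in $\mathbf{A}$ and $\mathbf{b}$, the central difference has no truncation error, so any remaining discrepancy is floating-point rounding.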

Taking expectations and setting them to $\mathbf{0}$:

$$\begin{align}
E\left(\frac{\partial}{\partial a_{i,\ j}} (\mathbf{x} - \mathbf{\hat{x}})^T \cdot (\mathbf{x} - \mathbf{\hat{x}})\right) &=  - 2 \cdot E\left(\mathbf{x} \cdot  \mathbf{y}^T \right)   + 2 \cdot \mathbf{A} \cdot E\left(\mathbf{y} \cdot \mathbf{y}^T\right)  + 2 \cdot \mathbf{b} \cdot E\left(\mathbf{y}^T\right) = \mathbf{0} \\
\\
E\left(\frac{\partial}{\partial b_j} (\mathbf{x} - \mathbf{\hat{x}})^T \cdot (\mathbf{x} - \mathbf{\hat{x}})\right) &=  - 2 \cdot E\left(\mathbf{x}\right) + 2 \cdot \mathbf{A} \cdot E\left(\mathbf{y} \right) + 2 \cdot \mathbf{b} = \mathbf{0}\\
\end{align}
$$

The second equation yields:

$$
\mathbf{b} = E\left(\mathbf{x}\right) - \mathbf{A} \cdot E\left(\mathbf{y} \right)
$$

which is inserted into the equation for $E\left(\frac{\partial}{\partial a_{i,\ j}} (\mathbf{x} - \mathbf{\hat{x}})^T \cdot (\mathbf{x} - \mathbf{\hat{x}})\right)$:

$$\begin{align}
\mathbf{0} &= - E\left(\mathbf{x} \cdot  \mathbf{y}^T \right)   + \mathbf{A} \cdot E\left(\mathbf{y} \cdot \mathbf{y}^T\right)  + \left(E\left(\mathbf{x}\right) - \mathbf{A} \cdot E\left(\mathbf{y} \right) \right) \cdot E\left(\mathbf{y}^T\right) \\
&= - E\left(\mathbf{x} \cdot  \mathbf{y}^T \right)   + \mathbf{A} \cdot E\left(\mathbf{y} \cdot \mathbf{y}^T\right)  + E\left(\mathbf{x}\right) \cdot E\left(\mathbf{y}^T\right) - \mathbf{A} \cdot E\left(\mathbf{y} \right) \cdot E\left(\mathbf{y}^T\right) \\
&= - \left(E\left(\mathbf{x} \cdot  \mathbf{y}^T \right) -  E\left(\mathbf{x}\right) \cdot E\left(\mathbf{y}^T\right) \right) + \mathbf{A} \cdot \left( E\left(\mathbf{y} \cdot \mathbf{y}^T\right) - E\left(\mathbf{y} \right) \cdot E\left(\mathbf{y}^T\right)\right)
\end{align}
$$

We may write $\left(E\left(\mathbf{x} \cdot  \mathbf{y}^T \right) -  E\left(\mathbf{x}\right) \cdot E\left(\mathbf{y}^T\right) \right)$ as:

$$
\left(E\left(\mathbf{x} \cdot  \mathbf{y}^T \right) -  E\left(\mathbf{x}\right) \cdot E\left(\mathbf{y}^T\right) \right) = E\left( \left(\mathbf{x} - E(\mathbf{x}) \right) \cdot  \left(\mathbf{y} - E(\mathbf{y})  \right)^T \right)
$$

which is just the cross-covariance matrix $\mathbf{P}_{xy}$ of $\mathbf{x}$ and $\mathbf{y}$:

$$\begin{align}
\mathbf{P}_{xy} &= \left(E\left(\mathbf{x} \cdot  \mathbf{y}^T \right) -  E\left(\mathbf{x}\right) \cdot E\left(\mathbf{y}^T\right) \right) \\
&= E\left( \left(\mathbf{x} - E(\mathbf{x}) \right) \cdot  \left(\mathbf{y} - E(\mathbf{y})  \right)^T \right)
\end{align}
$$

In a similar way we may write $\left( E\left(\mathbf{y} \cdot \mathbf{y}^T\right) - E\left(\mathbf{y} \right) \cdot E\left(\mathbf{y}^T\right)\right)$ as:

$$
\left( E\left(\mathbf{y} \cdot \mathbf{y}^T\right) - E\left(\mathbf{y} \right) \cdot E\left(\mathbf{y}^T\right)\right) = E\left( \left(\mathbf{y} - E\left(\mathbf{y}\right)\right) \cdot \left(\mathbf{y} - E\left(\mathbf{y}\right)\right)^T\right)
$$

which is the covariance matrix of $\mathbf{y}$, denoted as matrix $\mathbf{P}_{yy}$:

$$\begin{align}
\mathbf{P}_{yy} &= \left( E\left(\mathbf{y} \cdot \mathbf{y}^T\right) - E\left(\mathbf{y} \right) \cdot E\left(\mathbf{y}^T\right)\right) \\
&= E\left( \left(\mathbf{y} - E\left(\mathbf{y}\right)\right) \cdot \left(\mathbf{y} - E\left(\mathbf{y}\right)\right)^T\right)
\end{align} 
$$

Now we have:

$$
\mathbf{P}_{xy} = \mathbf{A} \cdot \mathbf{P}_{yy}
$$

from which we obtain matrix $\mathbf{A}$, provided that $\mathbf{P}_{yy}$ is invertible:

$$
\mathbf{A} = \mathbf{P}_{xy} \cdot \mathbf{P}_{yy}^{-1}
$$
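As a quick sanity check (an added illustration, not taken from the source), consider the scalar case $N_x = N_y = 1$. The two results then reduce to the slope and intercept of the simple linear regression of $x$ on $y$:

$$
A = \frac{\mathrm{Cov}(x,\ y)}{\mathrm{Var}(y)}, \qquad b = E\left(x\right) - A \cdot E\left(y\right)
$$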

The linear estimator is then:

$$\begin{align}
\mathbf{\hat{x}} &= \mathbf{A} \cdot \mathbf{y} + \mathbf{b} \\
&= \mathbf{P}_{xy} \cdot \mathbf{P}_{yy}^{-1} \cdot \mathbf{y} + E\left(\mathbf{x}\right) - \mathbf{P}_{xy} \cdot \mathbf{P}_{yy}^{-1} \cdot E\left(\mathbf{y} \right) \\
&= E\left(\mathbf{x}\right) + \mathbf{P}_{xy} \cdot \mathbf{P}_{yy}^{-1} \cdot \left(\mathbf{y} - E\left(\mathbf{y} \right)\right) 
\end{align}
$$ 

The *unbiasedness* of the estimator $\mathbf{\hat{x}}$ is proved by taking the expectation $E\left(\mathbf{\hat{x}}\right)$:

$$
E\left(\mathbf{\hat{x}}\right) = E\left(\mathbf{x}\right) + \mathbf{P}_{xy} \cdot \mathbf{P}_{yy}^{-1} \cdot \underbrace{\left(E\left(\mathbf{y}\right) - E\left(\mathbf{y} \right)\right)}_{\mathbf{0}} = E\left(\mathbf{x}\right)
$$
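The results so far can be illustrated with sample moments in place of expectations. The sketch below is only an illustration under loud assumptions: the distributions of $\mathbf{x}$ and $\mathbf{e}$, the nonlinear operator `np.tanh(x @ W.T)`, the matrix `W` and the sample size are all hypothetical choices. With sample moments, the zero-gradient conditions and the unbiasedness hold exactly up to rounding, and the solution coincides with ordinary least squares of $\mathbf{x}$ on $\mathbf{y}$ with an intercept.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000                                       # number of samples (assumed)

# Hypothetical state and noise samples, chosen only for illustration.
x = rng.normal(size=(n, 2)) * np.array([1.0, 2.0]) + np.array([1.0, -2.0])
e = 0.3 * rng.normal(size=(n, 3))              # zero mean, independent of x

W = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
y = np.tanh(x @ W.T) + e                       # hypothetical nonlinear h(x)

# Sample versions of E(x), E(y), P_xy and P_yy.
x_mean, y_mean = x.mean(axis=0), y.mean(axis=0)
xc, yc = x - x_mean, y - y_mean
P_xy = xc.T @ yc / n                           # shape (2, 3)
P_yy = yc.T @ yc / n                           # shape (3, 3)

# A = P_xy P_yy^{-1}; solve a linear system instead of forming the inverse.
A = np.linalg.solve(P_yy, P_xy.T).T            # valid because P_yy is symmetric
b = x_mean - A @ y_mean

x_hat = y @ A.T + b                            # estimator applied to each sample

bias = x_hat.mean(axis=0) - x_mean             # sample version of E(x_hat) - E(x)
residual = x - x_hat
grad_A = -2 * residual.T @ y / n               # sample version of E(d/dA)
grad_b = -2 * residual.mean(axis=0)            # sample version of E(d/db)

# Ordinary least squares of x on [y, 1] should give the same A and b.
design = np.column_stack([y, np.ones(n)])
coef, *_ = np.linalg.lstsq(design, x, rcond=None)
A_ls, b_ls = coef[:-1].T, coef[-1]

print(np.max(np.abs(bias)), np.max(np.abs(grad_A)), np.max(np.abs(A - A_ls)))
```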

---

In the previous section, the observed data $\mathbf{y}$ were modelled by an observation operator and a noise contribution:

$$
\mathbf{y} = h\left(\mathbf{x} \right) + \mathbf{e}
$$

In the following, a somewhat more specific assumption shall be made with regard to the observation operator:

$$
\mathbf{y} = \mathbf{H} \cdot \mathbf{x}+ \mathbf{e}
$$

Thus a linear operator $\mathbf{H} \in \mathbb{R}^{N_y \times N_x}$ is assumed. Because $\mathbf{e}$ has zero mean, the expectation $E\left(\mathbf{y} \right)$ is:

$$
E\left(\mathbf{y} \right) = \mathbf{H} \cdot E\left(\mathbf{x}\right)+ E\left(\mathbf{e}\right) = \mathbf{H} \cdot E\left(\mathbf{x}\right)
$$

$$\begin{align}
\mathbf{P}_{yy} &= E\left( \left(\mathbf{y} - E\left(\mathbf{y}\right)\right) \cdot \left(\mathbf{y} - E\left(\mathbf{y}\right)\right)^T\right) \\
&= E\left( \left(\mathbf{y} - \mathbf{H} \cdot E\left(\mathbf{x}\right)\right) \cdot \left(\mathbf{y} - \mathbf{H} \cdot E\left(\mathbf{x}\right)\right)^T\right) \\
&= E\left( \left(\mathbf{H} \cdot \mathbf{x}+ \mathbf{e} - \mathbf{H} \cdot E\left(\mathbf{x}\right)\right) \cdot \left(\mathbf{H} \cdot \mathbf{x}+ \mathbf{e} - \mathbf{H} \cdot E\left(\mathbf{x}\right)\right)^T\right) \\
&= E\left( \left(\mathbf{H} \cdot \left(\mathbf{x} - E\left(\mathbf{x}\right)\right)+ \mathbf{e} \right) \cdot \left(\mathbf{H} \cdot \left(\mathbf{x} - E\left(\mathbf{x}\right)\right)+ \mathbf{e} \right)^T\right) \\
&= E\left( \left(\mathbf{H} \cdot \left(\mathbf{x} - E\left(\mathbf{x}\right)\right)+ \mathbf{e} \right) \cdot \left(\left(\mathbf{x} - E\left(\mathbf{x}\right)\right)^T \cdot \mathbf{H}^T + \mathbf{e}^T \right)\right) \\
&= \mathbf{H} \cdot \underbrace{E\left(\left(\mathbf{x} - E\left(\mathbf{x}\right)\right) \cdot \left(\mathbf{x} - E\left(\mathbf{x}\right)\right)^T\right)}_{\mathbf{P}_{xx}} \cdot \mathbf{H}^T + \underbrace{E\left(\mathbf{e} \cdot \mathbf{e}^T \right)}_{\mathbf{R}} \\
&= \underbrace{\mathbf{H} \cdot \mathbf{P}_{xx} \cdot \mathbf{H}^T}_{\mathbf{P}_{hh}} + \mathbf{R} \\
&= \mathbf{P}_{hh} + \mathbf{R}
\end{align} 
$$

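The cross term $\mathbf{H} \cdot E\left(\left(\mathbf{x} - E\left(\mathbf{x}\right)\right) \cdot \mathbf{e}^T\right)$ and its transpose vanish because $\mathbf{e}$ has zero mean and is independent of $\mathbf{x}$; $\mathbf{R}$ is the covariance matrix of the noise. With these definitions the estimator can be re-formulated like this: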

$$\begin{align}
\mathbf{\hat{x}} &= E\left(\mathbf{x}\right) + \mathbf{P}_{xy} \cdot \mathbf{P}_{yy}^{-1} \cdot \left(\mathbf{y} - E\left(\mathbf{y} \right)\right) \\
&= E\left(\mathbf{x}\right) + \mathbf{P}_{xy} \cdot \left(\mathbf{P}_{hh} + \mathbf{R} \right)^{-1} \cdot \left(\mathbf{y} - E\left(\mathbf{y} \right)\right)  \\
&= E\left(\mathbf{x}\right) + \mathbf{P}_{xy} \cdot \left(\mathbf{H} \cdot \mathbf{P}_{xx} \cdot \mathbf{H}^T + \mathbf{R} \right)^{-1} \cdot \left(\mathbf{y} - E\left(\mathbf{y} \right)\right)
\end{align}
$$ 


Furthermore we may express $\mathbf{P}_{xy}$ in terms of $\mathbf{P}_{xx}$:

First

$$\begin{align}
\mathbf{P}_{xy} &= E\left( \left(\mathbf{x} - E(\mathbf{x}) \right) \cdot  \left( \mathbf{H} \cdot \left(\mathbf{x} - E\left(\mathbf{x} \right)  \right) + \mathbf{e}  \right)^T \right) \\
&= \underbrace{E\left(\left(\mathbf{x} - E(\mathbf{x}) \right) \cdot \left(\mathbf{x} - E(\mathbf{x}) \right)^T  \right)}_{\mathbf{P}_{xx}} \cdot \mathbf{H}^T \\
&= \mathbf{P}_{xx} \cdot \mathbf{H}^T
\end{align}
$$
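Both covariance identities, $\mathbf{P}_{yy} = \mathbf{H} \cdot \mathbf{P}_{xx} \cdot \mathbf{H}^T + \mathbf{R}$ and $\mathbf{P}_{xy} = \mathbf{P}_{xx} \cdot \mathbf{H}^T$, can be checked by simulation. The sketch below assumes Gaussian $\mathbf{x}$ and $\mathbf{e}$ purely for convenience (the derivation does not require it); the numerical values of `x_mean`, `P_xx`, `H` and `R` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000                                    # number of samples (assumed)

# Illustrative values only; none of these come from the source.
x_mean = np.array([1.0, -1.0])
P_xx = np.array([[2.0, 0.5], [0.5, 1.0]])
H = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
R = np.diag([0.1, 0.2, 0.3])

x = rng.multivariate_normal(x_mean, P_xx, size=n)
e = rng.multivariate_normal(np.zeros(3), R, size=n)   # drawn independently of x
y = x @ H.T + e                                # y = H x + e for every sample

xc = x - x.mean(axis=0)
yc = y - y.mean(axis=0)
P_yy_sample = yc.T @ yc / n
P_xy_sample = xc.T @ yc / n

P_yy_theory = H @ P_xx @ H.T + R
P_xy_theory = P_xx @ H.T

err_yy = np.max(np.abs(P_yy_sample - P_yy_theory))
err_xy = np.max(np.abs(P_xy_sample - P_xy_theory))
print(err_yy, err_xy)                          # small sampling error
```

The remaining discrepancies are sampling error and should shrink roughly like $1/\sqrt{n}$ as the sample size grows.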

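Here the cross term $E\left(\left(\mathbf{x} - E(\mathbf{x}) \right) \cdot \mathbf{e}^T\right)$ vanishes, again because $\mathbf{e}$ has zero mean and is independent of $\mathbf{x}$. Then: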

$$\begin{align}
\mathbf{\hat{x}} &= E\left(\mathbf{x}\right) + \mathbf{P}_{xx} \cdot \mathbf{H}^T \cdot \left(\mathbf{H} \cdot \mathbf{P}_{xx} \cdot \mathbf{H}^T + \mathbf{R} \right)^{-1} \cdot \left(\mathbf{y} - E\left(\mathbf{y} \right)\right) \\
&= E\left(\mathbf{x}\right) + \mathbf{P}_{xx} \cdot \mathbf{H}^T \cdot \left(\mathbf{H} \cdot \mathbf{P}_{xx} \cdot \mathbf{H}^T + \mathbf{R} \right)^{-1} \cdot \left(\mathbf{y} - \mathbf{H} \cdot E\left(\mathbf{x} \right)\right)
\end{align}
$$ 
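The final formula translates almost directly into code. The following is a minimal sketch, not a definitive implementation; the function name `blue_estimate` and the scalar test values are hypothetical. It solves a linear system instead of forming the inverse explicitly, which is a common numerical choice rather than something the derivation prescribes.

```python
import numpy as np

def blue_estimate(x_mean, P_xx, H, R, y):
    """x_hat = E(x) + P_xx H^T (H P_xx H^T + R)^{-1} (y - H E(x))."""
    innovation = y - H @ x_mean                # y - E(y)
    P_yy = H @ P_xx @ H.T + R                  # P_hh + R
    # Solve P_yy z = innovation rather than inverting P_yy.
    return x_mean + P_xx @ H.T @ np.linalg.solve(P_yy, innovation)

# Scalar example: prior mean 0, P_xx = 4, H = 1, R = 1, observation y = 5.
x_hat = blue_estimate(np.array([0.0]), np.array([[4.0]]),
                      np.array([[1.0]]), np.array([[1.0]]),
                      np.array([5.0]))
print(x_hat)
```

In this scalar example the weight on the observation is $4 / (4 + 1) = 0.8$, so the estimate is $0.8 \cdot 5 = 4$: it lies between the prior mean and the observation, closer to the observation because the noise variance is smaller than the prior variance.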
