In [2]:
import math
import scipy as sp
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# required for interactive plotting
from __future__ import print_function
from ipywidgets import interact, interactive, fixed
import ipywidgets as widgets
import numpy.polynomial as np_poly

from IPython.display import Math
from IPython.display import Latex
from IPython.display import HTML

initialization  
$ \newcommand{\E}[1]{\mathbb{E}\left[#1\right]}$  
$ \newcommand{\V}[1]{\mathbb{V}\left[#1\right]}$
$ \newcommand{\cov}[1]{\text{cov} \sigma\left[#1\right]}$
$ \newcommand{\EXP}[1]{\exp\left(#1\right)}$  
$ \newcommand{\P}{\mathbb{P}}$
$\newcommand{\mat}[1]{
\left[
\begin{matrix}
#1
\end{matrix}
\right]
}$
$\newcommand{\commentgray}[1]{\color{gray}{\text{#1}}}$
$\newcommand{\arrthree}[1]{
\begin{array}{rlr}
#1
\end{array}
}
$

$\newcommand{\Nl}[3]{\mathcal{N}\left(#1 \mid #2, #3\right)}$
$\newcommand{\Nstdx}{\Nl{\mathbf{x}}{\mathbf{\mu}}{\Sigma}}$
$\newcommand{\ab}{\mathbf{a}}$
$\newcommand{\Ab}{\mathbf{A}}$
$\newcommand{\Abt}{\Ab^T}$
$\newcommand{\bb}{\mathbf{b}}$
$\newcommand{\Bb}{\mathbf{B}}$
$\newcommand{\Cb}{\mathbf{C}}$
$\newcommand{\Db}{\mathbf{D}}$
$\newcommand{\Lb}{\mathbf{L}}$
$\newcommand{\Lbi}{\Lb^{-1}}$
$\newcommand{\mb}{\mathbf{m}}$
$\newcommand{\Mb}{\mathbf{M}}$
$\newcommand{\Rb}{\mathbf{R}}$
$\newcommand{\xb}{\mathbf{x}}$
$\newcommand{\xab}{\mathbf{x_a}}$
$\newcommand{\xabt}{\mathbf{x_a}^T}$
$\newcommand{\xbb}{\mathbf{x_b}}$
$\newcommand{\xbbt}{\mathbf{x_b}^T}$
$\newcommand{\yb}{\mathbf{y}}$
$\newcommand{\zb}{\mathbf{z}}$
$\newcommand{\Ub}{\mathbf{U}}$
$\newcommand{\mub}{\mathbf{\mu}}$
$\newcommand{\muab}{\mathbf{\mu_a}}$
$\newcommand{\mubb}{\mathbf{\mu_b}}$
$\newcommand{\saa}{\Sigma_{aa}}$
$\newcommand{\sab}{\Sigma_{ab}}$
$\newcommand{\sba}{\Sigma_{ba}}$
$\newcommand{\sbb}{\Sigma_{bb}}$
$\newcommand{\laa}{\Lambda_{aa}}$
$\newcommand{\laai}{\Lambda_{aa}^{-1}}$
$\newcommand{\lab}{\Lambda_{ab}}$
$\newcommand{\lba}{\Lambda_{ba}}$
$\newcommand{\lbb}{\Lambda_{bb}}$
$\newcommand{\lbbi}{\Lambda_{bb}^{-1}}$
$\newcommand{\li}{\Lambda^{-1}}$

$\newcommand{\multivarcoeff}{\frac{1}{(2\pi)^{D/2}}
\frac{1}{\left| \mathbf{\Sigma}\right|^{1/2}}}$
$\newcommand{\multivarexp}[2]
{
\left\{
 -\frac{1}{2} 
 {#1}^T 
 #2
 {#1}
\right\}
}$
$\newcommand{\multivarexpx}[1]{\multivarexp{#1}{\Sigma^{-1}}}$
$\newcommand{\multivarexpstd}{\multivarexpx{(\xb-\mub)}}$


<div id='StandardForm' \>
Standard Form  
$$
\Nstdx
=
\multivarcoeff \exp \multivarexpstd
$$

<div id='QuadraticForm'\>
Quadratic Form:
$$
\Delta^2
=
(\xb - \mub)^T
\mathbf{\Sigma}^{-1}
(\xb - \mub)
$$
$\Delta$ is called the Mahalanobis distance from $\mub$ to $\xb$; it reduces to the Euclidean distance when $\Sigma$ is the identity matrix.

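$\Sigma$
* can be taken to be symmetric without loss of generality: if it is written as a symmetric part plus an antisymmetric part, the antisymmetric part contributes nothing to the quadratic form
* is real valued, so, being real and symmetric,
  * its eigenvalues are real
  * its eigenvectors can be chosen to be orthonormal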

init
$\newcommand{\u}{\mathbf{u}}$

* Eigen: If $\left\{\u_i\right\}, \left\{\lambda_i\right\}$ are the eigenvectors and eigenvalues of $\Sigma$, then
$$
\Sigma ~ \u_i = \lambda_i \u_i
$$

* Orthonormality
$$
\u_i^T \u_j
=
\begin{cases}
1 & \text{if } i = j\\
0 & \text{otherwise}
\end{cases}
$$

* <div id='SpectralDecomposition' \>Spectral Decomposition  
$
\begin{array}{ll}
\Sigma      &= \sum_{i=1}^{D} \lambda_i \u_i \u_i^T\\
\Sigma^{-1} &= \sum_{i=1}^{D} \frac{1}{\lambda_i} \u_i \u_i^T
\end{array}
$

* Determinant
$$
\left\lvert
\Sigma
\right\rvert
^{1/2}
=
\prod_{i=1}^{D} \lambda_i^{1/2}
$$
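
The eigen-properties listed above can be checked numerically. The following is a minimal sketch, assuming an arbitrary randomly generated symmetric positive definite matrix stands in for $\Sigma$; the seed, the size and the variable names are illustrative choices, not part of the derivation.

```python
import numpy as np

# Hypothetical 3x3 covariance matrix, built to be symmetric positive definite.
rng = np.random.default_rng(0)
B = rng.normal(size=(3, 3))
Sigma = B @ B.T + 3.0 * np.eye(3)

# eigh is meant for symmetric matrices: it returns real eigenvalues and
# orthonormal eigenvectors (stored as the columns of U_cols).
lam, U_cols = np.linalg.eigh(Sigma)

# Orthonormality: u_i^T u_j = 1 if i == j, else 0.
orthonormal = np.allclose(U_cols.T @ U_cols, np.eye(3))

# Spectral decomposition of Sigma and of its inverse.
Sigma_rebuilt = sum(l * np.outer(u, u) for l, u in zip(lam, U_cols.T))
Sigma_inv_rebuilt = sum(np.outer(u, u) / l for l, u in zip(lam, U_cols.T))

# |Sigma|^{1/2} as the product of square roots of the eigenvalues.
sqrt_det = np.prod(np.sqrt(lam))

print(orthonormal)
print(np.allclose(Sigma_rebuilt, Sigma))
print(np.allclose(Sigma_inv_rebuilt, np.linalg.inv(Sigma)))
print(np.isclose(sqrt_det, np.sqrt(np.linalg.det(Sigma))))
```

`np.linalg.eigh` is used rather than `np.linalg.eig` because it exploits the symmetry of $\Sigma$ and returns real eigenvalues directly.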

Multivariate to Independent Univariates
=====================

Substituting <a href="#SpectralDecomposition">Spectral Decomposition</a> into <a href="#QuadraticForm">Quadratic Form</a>, we get
\begin{array}{llr}
\Delta^2
&=
(\xb - \mub)^T
\left(
  \sum_{i=1}^{D} \frac{1}{\lambda_i} \u_i \u_i^T
\right)
(\xb - \mub)
\\
&=
\sum_{i=1}^{D} \frac{y_i^2}{\lambda_i}
\end{array}
where $y_i = \u_i^T (\xb - \mub)$. Further,
\begin{array}{ll}
\text{If } \yb
= 
\left[
\begin{matrix}
y_1 \\ \vdots \\ y_D
\end{matrix}
\right] 
&
\text{Then }
\yb = \Ub (\xb - \mub)
\\
\text{where }
\Ub
&=
\mat{\u_1^T \\ \vdots \\ \u_D^T}
\end{array}

Hence, in the $y_j$ coordinate system, <a href="#StandardForm">Standard Form</a> becomes
$$
p(\yb)
=
p(\xb)
\lvert \mathbf{J} \rvert
=
\prod_{i=1}^{D}
\frac{1}{(2\pi\lambda_i)^{1/2}}
\exp \left(-\frac{y_i^2}{2\lambda_i}\right)
$$
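Here $\lvert \mathbf{J} \rvert = 1$ because $\Ub$ is orthogonal. Thus, the eigenvectors define a new coordinate system, shifted by $\mub$ and rotated relative to the original one,
in which the joint density factorizes into $D$ independent univariate Gaussians:
each $y_i = \u_i^T (\xb - \mub)$ has zero mean and variance $\lambda_i$.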
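
As a sanity check on the change of coordinates, the sketch below evaluates the Mahalanobis distance and the density both in the original coordinates and in the rotated ones. The mean, covariance and test point are randomly generated placeholders, so only the agreement between the two routes is meaningful.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 3
B = rng.normal(size=(D, D))
Sigma = B @ B.T + D * np.eye(D)   # hypothetical SPD covariance
mu = rng.normal(size=D)           # hypothetical mean
x = rng.normal(size=D)            # arbitrary evaluation point

lam, U_cols = np.linalg.eigh(Sigma)
U = U_cols.T                      # rows of U are u_i^T, as in the text
y = U @ (x - mu)                  # shifted and rotated coordinates

# Mahalanobis distance squared, computed two ways.
delta2_direct = (x - mu) @ np.linalg.solve(Sigma, x - mu)
delta2_rotated = np.sum(y**2 / lam)

# Density from the standard form vs. product of univariate Gaussians in y.
p_x = np.exp(-0.5 * delta2_direct) / np.sqrt((2 * np.pi) ** D * np.linalg.det(Sigma))
p_y = np.prod(np.exp(-y**2 / (2 * lam)) / np.sqrt(2 * np.pi * lam))

# The Jacobian of an orthogonal transformation has unit magnitude.
jacobian_det = abs(np.linalg.det(U))

print(np.isclose(delta2_direct, delta2_rotated), np.isclose(p_x, p_y), jacobian_det)
```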

Moments
====

Mean
----
\begin{array}{llr}
\E{\xb}
&=
\multivarcoeff
\int \exp \multivarexpstd \xb ~d\xb
\\
&=
\multivarcoeff
\int \exp \multivarexpx{\zb} ~(\zb + \mub) ~d\zb
&
\commentgray{[1]}
\\
&=
\mub
\multivarcoeff
\int \exp \multivarexpx{\zb} ~d\zb
&
\commentgray{[2]}
\\
\implies
\E{\xb}
&=
\mub
\end{array}

1. Sub $\zb = (\xb - \mub)$
1. Since the exponent is an even function of $\zb$ and the integral ranges over $(-\infty, \infty)$,  
the $\zb$ term in the factor $(\zb + \mub)$ integrates to zero by symmetry, leaving just $\mub$ times the normalized Gaussian integral, which equals 1.


Second Moment
------------

\begin{array}{llr}
\E{\xb \xb^T}
&=
\multivarcoeff
\int \exp \multivarexpstd \xb \xb^T ~d\xb
\\
&=
\multivarcoeff
\int \exp \multivarexpx{\zb} (\zb+\mub)(\zb+\mub)^T ~d\zb
&
\commentgray{[1]}
\\
&=
\multivarcoeff
\int \exp \multivarexpx{\zb} (\zb \zb^T + \mub \mub^T) ~d\zb
&
\commentgray{[2]}
\\
&=
\mub \mub^T
+
\multivarcoeff
\int \exp \multivarexpx{\zb} \zb \zb^T ~d\zb
\\
&=
\mub \mub^T
+
\multivarcoeff
\sum_{i=1}^{D} \sum_{j=1}^{D}
\u_i \u_j^T
\int \exp
\multivarexpx{\zb}
y_i y_j ~d\yb
&
\commentgray{[3]}
\\
&=
\mub \mub^T
+
\multivarcoeff
\sum_{i=1}^{D} \sum_{j=1}^{D}
\u_i \u_j^T
\int \exp
\left\{
  -\sum_{k=1}^{D}
     \frac{y_k^2}
          {2\lambda_k}
\right\}
y_i y_j ~d\yb
&
\commentgray{[4]}
\\
&=
\mub \mub^T
+
\multivarcoeff
\sum_{i=1}^{D}
\u_i \u_i^T
\int \exp
\left\{
  -\sum_{k=1}^{D}
     \frac{y_k^2}
          {2\lambda_k}
\right\}
y_i^2 ~d\yb
&
\commentgray{[5]}
\\
&=
\mub \mub^T
+
\sum_{i=1}^{D} \u_i \u_i^T \lambda_i
&
\commentgray{[6]}
\\
\E{\xb \xb^T}
&=
\mub \mub^T
+
\Sigma
&
\commentgray{[7]}
\end{array}

1. Sub $\zb = (\xb - \mub)$
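2. Exponent is even in $\zb \implies$ the cross terms $\zb \mub^T$ and $\mub \zb^T$, which are odd in $\zb$, vanish
3. $\zb = \sum_{j=1}^{D} y_j \u_j$, so $\zb \zb^T = \sum_{i=1}^{D} \sum_{j=1}^{D} y_i y_j \u_i \u_j^T$
4. <a href='#SpectralDecomposition'>SD</a> of $\Sigma^{-1}$:
   $\left( \sum_{j=1}^{D} y_j \u_j^T \right)
    \left( \sum_{i=1}^{D} \frac{1}{\lambda_i} \u_i \u_i^T \right)
    \left( \sum_{k=1}^{D} y_k \u_k \right)
    = \sum_{k=1}^{D} \frac{y_k^2}{\lambda_k}
   $
5. By symmetry, the integrand is odd in $y_i$ whenever $i \ne j$, so
$$
\text{integral term containing } y_i y_j = 
\begin{cases}
  \text{integral term containing } y_i^2 & \text{if } i = j\\
  0 & \text{if } i \ne j
\end{cases}
$$
6. The integral factorizes into univariate Gaussians in $y_k$ with zero mean and variance $\lambda_k$. For a univariate Gaussian $\E{x^2} = \mu^2 + \sigma^2$, so $\E{y_i^2} = \lambda_i$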
7. <a href='#SpectralDecomposition'>SD</a> of $\Sigma = \sum_{i=1}^{D} \lambda_i \u_i \u_i^T$

Covariance
---------
\begin{array}{llr}
\cov{\xb}
&=
\E{\left(\xb - \E{\xb}\right)\left(\xb - \E{\xb}\right)^T}
\\
&=
\E{\left(\xb - \mub\right)\left(\xb - \mub\right)^T}
&
\commentgray{$\E{\xb} = \mub$}
\\
&=
\E{\xb \xb^T} - \mub \mub^T
&
\commentgray{expanding}
\\
&=
\Sigma
&
\commentgray{$\E{\xb\xb^T} = \mub \mub^T + \Sigma$}
\end{array}
Hence the parameter $\Sigma$ is called the covariance matrix.
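
The three moment results can be checked empirically by sampling. This is only a Monte Carlo sketch: the mean, covariance, seed and sample size below are made-up illustrative values, and the estimates agree with the formulas only up to sampling noise.

```python
import numpy as np

# Illustrative (hypothetical) parameters; any valid mean/covariance pair would do.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

rng = np.random.default_rng(42)
X = rng.multivariate_normal(mu, Sigma, size=200_000)   # each row is one sample

mean_hat = X.mean(axis=0)                # estimates E[x]
second_moment_hat = X.T @ X / len(X)     # estimates E[x x^T]
cov_hat = np.cov(X, rowvar=False)        # estimates cov[x]

print(mean_hat)
print(second_moment_hat - (np.outer(mu, mu) + Sigma))  # should be close to zero
print(cov_hat - Sigma)                                 # should be close to zero
```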

Limitations
======

1. Complexity
  1. Number of parameters =
     Parameters in
     $\mu$ + Parameters in $\Sigma = D + \frac{D(D+1)}{2} = \frac{D(D+3)}{2}$ 
  1. Inverting a large $\Sigma$ becomes prohibitive.
1. Always unimodal, so it cannot represent multimodal data
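
To get a feel for how quickly the parameter count grows, here is a small sketch. The function name `gaussian_param_count` is a hypothetical helper introduced only for this illustration.

```python
def gaussian_param_count(D: int) -> int:
    """Free parameters of a D-dimensional Gaussian with a full covariance matrix."""
    mean_params = D
    cov_params = D * (D + 1) // 2   # symmetric matrix: upper triangle incl. diagonal
    return mean_params + cov_params  # equals D * (D + 3) / 2


counts = {D: gaussian_param_count(D) for D in (1, 2, 10, 100, 1000)}
print(counts)
```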

Cure
----
1. Continuous Latent Variables: Tractable no. of parameters
  1. Markov random fields (MRFs)
  1. Linear dynamical systems
1. Mixture of Gaussians: Multimodal

Conditional Gaussian Distributions
===================

Let 
\begin{array}{rlr}
\xb
&=
\mat{\xab \\ \xbb}
\\
\mub
&=
\mat{\muab \\ \mubb}
\\
\Sigma
&=
\mat{\saa & \sab \\ \sba & \sbb}
\\
\text{since } \Sigma^T = \Sigma,
&
\text{we have } \saa \text{ and } \sbb \text{ being symmetric}
&\text{ and } \sba = \sab^T
\\
\Lambda &\equiv \Sigma^{-1}
\\
\Lambda
&=
\mat{\laa & \lab \\ \lba & \lbb}
\end{array}


To find the conditional $p(\xab \mid \xbb)$, we could fix $\xbb$ to its observed value in the joint distribution $p(\xab, \xbb)$ and normalize over $\xab$. But there is another way, which works directly with the <a href="#QuadraticForm">Quadratic Form</a> in the exponent.

<div id='StandardExpansion' />
Given the quadratic form, we can find the mean and covariance by completing the square. Expanding the quadratic form, we get
$$
-\frac{1}{2} (\xb-\mub)^T \Sigma^{-1} (\xb-\mub)
=
-\frac{1}{2} \xb^T \Sigma^{-1} \xb
+\xb^T \Sigma^{-1} \mub
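+ \text{const}
$$
Any exponent that is quadratic in $\xb$ can be matched against the RHS of the above equation: the matrix in the second-order term is the inverse covariance $\Sigma^{-1}$, and the coefficient of the linear term is $\Sigma^{-1} \mub$, from which the mean follows.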
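
The coefficient-matching recipe can be sketched in code. Here `P` and `h` are hypothetical names for the matrix of the second-order term and the coefficient of the linear term, and the Gaussian they are read from is randomly generated for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
D = 3
B = rng.normal(size=(D, D))
Sigma = B @ B.T + D * np.eye(D)   # hypothetical covariance
mu = rng.normal(size=D)           # hypothetical mean

# Coefficients as they would be read off an exponent of the form
#   -1/2 x^T P x + x^T h + const
P = np.linalg.inv(Sigma)
h = P @ mu

# Completing the square: covariance from the quadratic term, mean from the linear term.
Sigma_recovered = np.linalg.inv(P)
mu_recovered = Sigma_recovered @ h

# Check the expansion itself at an arbitrary point x.
x = rng.normal(size=D)
lhs = -0.5 * (x - mu) @ P @ (x - mu)
rhs = -0.5 * x @ P @ x + x @ h - 0.5 * mu @ P @ mu   # last term is the constant

print(np.allclose(Sigma_recovered, Sigma), np.allclose(mu_recovered, mu), np.isclose(lhs, rhs))
```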

<div id='partitionedexpansion'/>
\begin{array}{rlr}
-\frac{1}{2} (\xb - \mub)^T \Sigma^{-1} (\xb - \mub)
&=
-\frac{1}{2}
\mat{(\xab-\muab)^T & (\xbb-\mubb)^T}
\mat{\laa & \lab \\ \lba & \lbb}
\mat{\xab-\muab \\ \xbb-\mubb}
\\
&=
-\frac{1}{2} (\xab - \muab)^T \laa (\xab - \muab)
\\
&
-\frac{1}{2} (\xab - \muab)^T \lab (\xbb - \mubb)
\\
&
-\frac{1}{2} (\xbb - \mubb)^T \lba (\xab - \muab)
\\
&
-\frac{1}{2} (\xbb - \mubb)^T \lbb (\xbb - \mubb)
\end{array}

Now we need to find $\mu_{\ab \mid \bb}$ and $\Sigma_{\ab \mid \bb}$. If we consider $\xbb$ as constant and pick the terms of second order in $\xab$, we get

\begin{array}{llr}
-\frac{1}{2} \xab^T \laa \xab
\end{array}

Hence, $\Sigma_{\ab \mid \bb} = \laa^{-1}$

Terms linear in $\xab$  
$$
\arrthree{
&\xab^T \laa\muab
\\
&-\frac{1}{2} \xab^T \lab (\xbb - \mubb)
\\
&-\frac{1}{2} \xab^T \lba^T (\xbb - \mubb)
&\text{ where } \lba^T = \lab
\\
\implies
&
\xab^T
\left\{
  \laa\muab - \lab (\xbb - \mubb)
\right\}
\\
\implies
\Sigma^{-1}_{\ab \mid \bb} \mu_{\ab \mid \bb}
&=
\laa\muab - \lab (\xbb - \mubb)
\\
\laa \mu_{\ab \mid \bb}
&=
\laa\muab - \lab (\xbb - \mubb)
\\
\mu_{\ab \mid \bb}
&=
\muab - \laa^{-1} \lab (\xbb - \mubb)
}
$$

<div id='ConditionalMoments' />
Hence, the mean and covariance of the conditional $p(\xab \mid \xbb)$ are
$$
\arrthree{
\E{\xab \mid \xbb}
&=
\muab - \laa^{-1} \lab (\xbb - \mubb)
\\
\cov{\xab \mid \xbb}
&=
\laai
}
$$

<div id='ParitionInverse'/>
We use the following identity for the inverse of a partitioned matrix:
\begin{array}{rlr}
\mat{
\Ab & \Bb\\
\Cb & \Db
}^{-1}
&=
\mat{
\Mb & -\Mb\Bb\Db^{-1}
\\
-\Db^{-1}\Cb\Mb &
\Db^{-1}+\Db^{-1}\Cb\Mb\Bb\Db^{-1}
}
\\
\text{where }
\Mb
&=
\left(
  \Ab - \Bb\Db^{-1}\Cb
\right)^{-1}
\\
\end{array}
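
The partitioned inverse identity is easy to verify numerically. The sketch below assumes a randomly generated symmetric positive definite matrix split into blocks of sizes 2 and 3; the block sizes and names (`Dm` for the lower-right block) are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_a, n_b = 2, 3
n = n_a + n_b
G = rng.normal(size=(n, n))
full = G @ G.T + n * np.eye(n)    # SPD, so every block we invert is invertible

A = full[:n_a, :n_a]
B = full[:n_a, n_a:]
C = full[n_a:, :n_a]
Dm = full[n_a:, n_a:]

D_inv = np.linalg.inv(Dm)
M = np.linalg.inv(A - B @ D_inv @ C)   # inverse of the Schur complement of Dm

block_inverse = np.block([
    [M,               -M @ B @ D_inv],
    [-D_inv @ C @ M,  D_inv + D_inv @ C @ M @ B @ D_inv],
])

print(np.allclose(block_inverse, np.linalg.inv(full)))
```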

Using the covariance matrix instead of the precision matrix, we get

$$
\arrthree{
\mat{
\saa & \sab \\
\sba & \sbb
}^{-1}
&=
\mat{\laa & \lab \\ \lba & \lbb}
\\
\laa
&=
\mathbf{M} = (\saa - \sab \sbb^{-1} \sba)^{-1}
\\
\lab
&= -\mathbf{M} \sab \sbb^{-1}
 = -(\saa - \sab \sbb^{-1} \sba)^{-1} \sab \sbb^{-1}
\\
\text{Hence}
\\
\mu_{\ab \mid \bb}
&=
\muab - \laa^{-1} \lab (\xbb - \mubb)
\\
&=
\muab
-
\laa^{-1}
(-\laa \sab \sbb^{-1})
(\xbb - \mubb)
\\
&=
\muab + \sab \sbb^{-1} (\xbb -\mubb)
\\
\Sigma_{\ab \mid \bb}
&=
\saa - \sab\sbb^{-1}\sba
}
$$
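
The precision form and the covariance form of the conditional moments should give identical numbers. A minimal sketch, assuming a random joint Gaussian and a made-up observed value `x_b`:

```python
import numpy as np

rng = np.random.default_rng(4)
n_a, n_b = 2, 3
n = n_a + n_b
G = rng.normal(size=(n, n))
Sigma = G @ G.T + n * np.eye(n)   # hypothetical joint covariance
mu = rng.normal(size=n)           # hypothetical joint mean
x_b = rng.normal(size=n_b)        # hypothetical observed value of x_b

a = slice(0, n_a)
b = slice(n_a, None)
Lam = np.linalg.inv(Sigma)        # precision matrix

# Precision form: covariance Lam_aa^{-1}, mean mu_a - Lam_aa^{-1} Lam_ab (x_b - mu_b).
cov_prec = np.linalg.inv(Lam[a, a])
mean_prec = mu[a] - cov_prec @ Lam[a, b] @ (x_b - mu[b])

# Covariance form: mean mu_a + S_ab S_bb^{-1} (x_b - mu_b),
# covariance S_aa - S_ab S_bb^{-1} S_ba.
gain = Sigma[a, b] @ np.linalg.inv(Sigma[b, b])
mean_cov = mu[a] + gain @ (x_b - mu[b])
cov_cov = Sigma[a, a] - gain @ Sigma[b, a]

print(np.allclose(mean_prec, mean_cov), np.allclose(cov_prec, cov_cov))
```

Note that the conditional covariance does not depend on `x_b`, while the conditional mean is linear in it.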

Marginal Gaussian Distributions
=================

We need to find $p(\xab) = \int p(\xab, \xbb) ~d\xbb$.
From <a href="#partitionedexpansion">Partitioned Expansion</a>, collecting the terms that involve $\xbb$ gives
$$
\arrthree{
\text{Terms having }\xbb
&=
-\frac{1}{2}(\xab-\muab)^T \lab \xbb
\\
&~~~ 
-\frac{1}{2}\xbb^T \lba (\xab-\muab)
\\
&~~~
-\frac{1}{2} \xbb^T \lbb \xbb
+ \xbb^T \lbb \mubb
\\
&=
-\frac{1}{2} \xbb^T \lbb \xbb
+ \xbb^T \lbb \mubb
- \xbb^T \lba (\xab-\muab)
\\
&=
-\frac{1}{2} \xbb^T \lbb \xbb
+ \xbb^T ( \lbb \mubb - \lba (\xab-\muab))
\\
&=
-\frac{1}{2} \xbb^T \lbb \xbb
+ \xbb^T \mb
\\
&=
-\frac{1}{2} \xbb^T \lbb \xbb
+\frac{1}{2} 2 \xbb^T \lbb \lbb^{-1} \mb
-\frac{1}{2} \mb^T \lbb^{-1} \mb
+\frac{1}{2} \mb^T \lbb^{-1} \mb
\\
&=
-\frac{1}{2}
(\xbb - \lbb^{-1} \mb)^{T}
\lbb
(\xbb - \lbb^{-1} \mb)
+ \frac{1}{2} \mb^T \lbb^{-1} \mb
}
$$

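Here $\mb = \lbb \mubb - \lba (\xab-\muab)$. The first term is a Gaussian quadratic form in $\xbb$, so integrating its exponential over $\xbb$ gives a constant that does not depend on $\xab$. Only the last term depends on $\xab$. Taking this term along with the other terms that depend on $\xab$ in <a href="#partitionedexpansion">Partitioned Expansion</a>, we get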
$$
\arrthree{
\text{Terms having }\xab
&=
\frac{1}{2} \mb^T \lbbi \mb
\\ &
~~~
-\frac{1}{2}\xabt \laa \xab
+\xabt (\laa \muab + \lab \mubb)
\\
&=
\frac{1}{2} ( \lbb \mubb - \lba (\xab-\muab))^T \lbbi ( \lbb \mubb - \lba (\xab-\muab))
\\
&
~~~
-\frac{1}{2}\xabt \laa \xab
+\xabt (\laa \muab + \lab \mubb)
\\
&=
-\frac{1}{2} \xabt \left( 
\laa - \lab \lbbi \lba
\right) \xab
\\
& ~~~
+ \xabt \left( \laa - \lab \lbbi \lba \right) \muab
+ \text{const}
}
$$

From <a href='#StandardExpansion'>Standard Expansion</a>, we get
$$
\arrthree{
\text{Covariance} \\
\Sigma_a
&=
\left( \laa - \lab \lbbi \lba \right)^{-1}
\\
\text{Mean}
&=
\Sigma_a (\text{coeff of linear term})
\\
&=
\Sigma_a \left( \laa - \lab \lbbi \lba \right) \muab
\\
&= \muab
}
$$

Another vantage point
$$
\text{Since }
\mat{\laa & \lab \\ \lba & \lbb}^{-1}
=
\mat{\saa & \sab \\ \sba & \sbb}
$$
we can use <a href='#ParitionInverse'>General Partition Inverse</a>
to obtain $\saa = \left( \laa - \lab \lbbi \lba \right)^{-1}$

<div id='MarginalMoments'/>
Thus, the mean and covariance of the marginal are given by
$$
\arrthree{
\E{\xab} &= \muab \\
\cov{\xab} &= \saa
}$$
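
The marginal covariance can be cross-checked against the precision blocks: the inverse of the Schur complement $\laa - \lab \lbbi \lba$ should reproduce the upper-left block of $\Sigma$. The joint covariance below is a random placeholder.

```python
import numpy as np

rng = np.random.default_rng(5)
n_a, n_b = 2, 3
n = n_a + n_b
G = rng.normal(size=(n, n))
Sigma = G @ G.T + n * np.eye(n)   # hypothetical joint covariance
Lam = np.linalg.inv(Sigma)        # joint precision

a = slice(0, n_a)
b = slice(n_a, None)

# Marginal covariance obtained from the precision blocks.
schur = Lam[a, a] - Lam[a, b] @ np.linalg.inv(Lam[b, b]) @ Lam[b, a]
Sigma_a_from_precision = np.linalg.inv(schur)

# Marginal covariance read directly off the joint covariance.
Sigma_aa = Sigma[a, a]

print(np.allclose(Sigma_a_from_precision, Sigma_aa))
```

In practice this means marginalizing is trivial in the covariance parameterization (drop rows and columns), whereas conditioning is simpler in the precision parameterization.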

In conclusion, for partitioned Gaussians, we have

\begin{array}{rllrl}
\xb
&=
\mat{\xab \\ \xbb}
&&
\mub
&=
\mat{\muab \\ \mubb}
\\
\Sigma
&=
\mat{\saa & \sab \\ \sba & \sbb}
&&
\Lambda
&=
\mat{\laa & \lab \\ \lba & \lbb}
\\
\end{array}
\begin{array}{rlr}
\text{Conditional Distribution}\\
p(\xab \mid \xbb)
&=
\mathcal{N}
\left(
  \xab \mid \mu_{\ab \mid \bb}, \laai
\right)
\\
\mu_{\ab \mid \bb}
&=
\muab - \laa^{-1} \lab (\xbb - \mubb)
\\
\text{Marginal Distribution} \\
p(\xab)
&=
\mathcal{N} \left( \xab \mid \muab, \saa \right)
\end{array}




Bayes' Theorem for Gaussian Variables
=====================

Given
* $p(\xb)$
* $p(\yb \mid \xb)$
  * mean: linear function of **x**
  * covar: independent of **x**
* This is a linear Gaussian Model

Find
* $p(\yb)$
* $p(\xb \mid \yb)$

Let
$$
p(\xb) = \Nl{\xb}{\mub}{\Lambda^{-1}}\\
p(\yb \mid \xb) = \Nl{\yb}{\Ab\xb + \bb}{\Lb^{-1}}
$$

* let $\zb = \mat{\xb \\ \yb}$
* Then
\begin{array}{llr}
\ln p(\zb)
&=
\ln p(\xb) + \ln p(\yb \mid \xb)
\\
&=
\multivarexp{(\xb - \mub)}{\Lambda}
\\
&~~~
+\multivarexp{(\yb-\Ab\xb-\bb)}{\Lb}
+\text{const}
&
\commentgray{terms independent of $\xb, \yb$}
\\
\end{array}

To find the covariance of $\zb$, consider the second order terms, which determine its precision matrix $\Rb$
\begin{array}{llr}
&
-\frac{1}{2}\xb^T(\Lambda + \Ab^T\Lb\Ab)\xb
+\frac{1}{2}\xb^{T}\Ab^T\Lb\yb
+\frac{1}{2}\yb^{T}\Lb\Ab\xb
-\frac{1}{2}\yb^{T}\Lb\yb
\\
&=
-\frac{1}{2}
\mat{\xb \\ \yb}^T
\mat{
\Lambda+\Ab^T\Lb\Ab & -\Ab^T\Lb\\
-\Lb\Ab & \Lb
}
\mat{\xb \\ \yb}
\\
&=
-\frac{1}{2} \zb^T \Rb \zb
\end{array}

Since
$$
\mat{x \\ y}^T \mat{A & B\\ C & D} \mat{x \\ y}
=
x^TAx + x^TBy + y^TCx  + y^TDy
$$
and
\begin{array}{rlr}
\mat{
\Ab & \Bb\\
\Cb & \Db
}^{-1}
&=
\mat{
\Mb & -\Mb\Bb\Db^{-1}
\\
-\Db^{-1}\Cb\Mb &
\Db^{-1}+\Db^{-1}\Cb\Mb\Bb\Db^{-1}
}
\\
\text{where }
\Mb
&=
\left(
  \Ab - \Bb\Db^{-1}\Cb
\right)^{-1}
\\
\end{array}

<div id='JoinCovariance'/>
\begin{array}{rlr}
\text{we have for }\Rb\\
\Mb^{-1} & = \Lambda + \Ab^T\Lb\Ab - (-\Ab^T\Lb)(\Lb^{-1})(-\Lb\Ab) = \Lambda \implies \Mb = \Lambda^{-1}
\\
\text{Hence }
\cov{\zb} = \Rb^{-1}
&=
\mat{
\Lambda^{-1} & \Lambda^{-1}\Ab^T\\
\Ab\Lambda^{-1} & \Lb^{-1} + \Ab\Lambda^{-1}\Ab^T
}
\end{array}

Identifying the mean involves the linear terms in $\ln p(\zb)$
$$
\xb^T \Lambda \mub - \xb^T \Ab^T \Lb \bb + \yb^T \Lb \bb
=
\mat{\xb \\ \yb}^T
\mat{\Lambda \mub - \Ab^T \Lb \bb \\ \Lb \bb}
$$
From <a href='#StandardExpansion'>Standard Expansion</a>, we have
$$
\arrthree{
\E{\zb}
&=
\text{cov} \times \text{Coeff of linear term}
\\
&=
\Rb^{-1} \mat{\Lambda \mub - \Ab^T \Lb \bb \\ \Lb \bb}
\\
&=
\mat{
\Lambda^{-1} & \Lambda^{-1}\Ab^T\\
\Ab\Lambda^{-1} & \Lb^{-1} + \Ab\Lambda^{-1}\Ab^T
}
\mat{\Lambda \mub - \Ab^T \Lb \bb \\ \Lb \bb}
\\
&=
\left[
  \begin{array}{rlrl}
     \li
     &
     \left( \Lambda \mu - \Ab^T \Lb \bb\right)
     &+
     \li \Abt
     &
     \Lb \bb
     \\
     \Ab \li 
     &
     \left( \Lambda \mu - \Ab^T \Lb \bb\right)
     &+
     \left( \Lb^{-1} + \Ab \li \Abt \right)
     &
     \Lb \bb
  \end{array}
\right]
\\
&=
\mat{
  \mub \\
  \Ab \mub + \bb
}
}
$$

From <a href='#MarginalMoments'>Moments of Marginals</a> and 
<a href='#JoinCovariance'>Covariance **R**</a> , we have,
$$
\arrthree{
\E{\yb}
&=
\Ab \mub + \bb
\\
\cov{\yb}
&=
\Lbi + \Ab \li \Abt
}
$$
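
The joint precision $\Rb$, its inverse and the resulting moments of $\yb$ can be verified numerically. In this sketch the model parameters are random placeholders, and `random_spd` is a hypothetical helper that only serves to produce valid precision matrices.

```python
import numpy as np

rng = np.random.default_rng(6)
dx, dy = 2, 3


def random_spd(n):
    """Hypothetical helper: a random symmetric positive definite n x n matrix."""
    G = rng.normal(size=(n, n))
    return G @ G.T + n * np.eye(n)


Lam = random_spd(dx)              # precision of p(x)
L = random_spd(dy)                # precision of p(y | x)
A = rng.normal(size=(dy, dx))
b = rng.normal(size=dy)
mu = rng.normal(size=dx)

# Joint precision R of z = (x, y), read off the second order terms.
R = np.block([[Lam + A.T @ L @ A, -A.T @ L],
              [-L @ A,            L]])

# Closed-form joint covariance from the partitioned inverse.
Lam_inv = np.linalg.inv(Lam)
cov_z = np.block([[Lam_inv,     Lam_inv @ A.T],
                  [A @ Lam_inv, np.linalg.inv(L) + A @ Lam_inv @ A.T]])

# Joint mean = covariance times the coefficient of the linear term.
linear_coeff = np.concatenate([Lam @ mu - A.T @ L @ b, L @ b])
mean_z = np.linalg.solve(R, linear_coeff)

# Marginal of y: the lower blocks of the joint mean and covariance.
mean_y = mean_z[dx:]
cov_y = cov_z[dx:, dx:]

print(np.allclose(np.linalg.inv(R), cov_z), np.allclose(mean_y, A @ mu + b))
```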

Now we need the conditional $p(\xb \mid \yb)$. Since the conditional is most easily expressed using the precision matrix, as in <a href='#ConditionalMoments'>Moments of Conditionals</a>, we apply those results to the blocks of $\Rb$, with $\xb$ playing the role of $\xab$ and $\yb$ the role of $\xbb$:
$$
\newcommand{\latla}{\left( \Lambda + \Abt \Lb \Ab \right)}
\newcommand{\latlai}{\latla^{-1}}
\arrthree{
\cov{\xb \mid \yb}
&=
\laai = \latlai
\\
\E{\xb \mid \yb}
&=
\muab - \laa^{-1} \lab (\xbb - \mubb)
\\
&=
 \mub
 -\latlai
 \left( -\Abt \Lb \right)
 (\yb - \Ab \mub - \bb)
\\
&=
\latlai \latla \mub
 +\latlai \left(\Abt \Lb (\yb - \bb) \right)
\\
& ~~~
 +\latlai
 \left(\Abt \Lb \right)(-\Ab \mub)
\\
&=
\latlai \left( \Abt \Lb (\yb - \bb) + \Lambda \mub \right)
}
$$
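
The posterior moments can be cross-checked by conditioning the joint covariance instead of using the precision form. The sketch below uses random placeholder parameters and a made-up observation `y`; `random_spd` is again a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(7)
dx, dy = 2, 3


def random_spd(n):
    """Hypothetical helper: a random symmetric positive definite n x n matrix."""
    G = rng.normal(size=(n, n))
    return G @ G.T + n * np.eye(n)


Lam = random_spd(dx)              # precision of p(x)
L = random_spd(dy)                # precision of p(y | x)
A = rng.normal(size=(dy, dx))
b = rng.normal(size=dy)
mu = rng.normal(size=dx)
y = rng.normal(size=dy)           # hypothetical observation

# Precision form of the posterior p(x | y).
post_cov = np.linalg.inv(Lam + A.T @ L @ A)
post_mean = post_cov @ (A.T @ L @ (y - b) + Lam @ mu)

# Covariance form: condition the joint Gaussian of (x, y) on y.
S_xx = np.linalg.inv(Lam)
S_xy = S_xx @ A.T
S_yy = np.linalg.inv(L) + A @ S_xx @ A.T
gain = S_xy @ np.linalg.inv(S_yy)
post_mean_joint = mu + gain @ (y - (A @ mu + b))
post_cov_joint = S_xx - gain @ S_xy.T

print(np.allclose(post_mean, post_mean_joint), np.allclose(post_cov, post_cov_joint))
```

The precision form inverts a matrix of the size of $\xb$, while the covariance form inverts one of the size of $\yb$, so the cheaper route depends on which variable is lower dimensional.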

Maximum Likelihood for Gaussian
==================

Sequential Estimation
============

Bayesian Inference for Gaussian
=================