In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

def plot_samples(t, Zs, wrap, size=1.5):
    """ Plot a number of processes in a grid.
    
    Args:
        t: index array of size n
        Zs: mxn array of m process samples
        wrap: start a new line after this number of plots
        size: size of each plot
    """
    n_samples = Zs.shape[0]
    df = pd.DataFrame(dict(
        sample=np.repeat(range(n_samples), len(t)),
        t=np.tile(t, n_samples),
        Z=Zs.ravel()))
    # FacetGrid's `size` argument was renamed to `height` in seaborn 0.9
    grid = sns.FacetGrid(df, col='sample', hue='sample', col_wrap=wrap, height=size)
    grid.map(plt.plot, 't', 'Z', marker="o", ms=4)
    plt.show()

# Gaussian Process

Gaussian processes (GPs) are a tool for probabilistic regression: rather than a single fitted function, they describe a distribution over functions.

## Links
- [mathematicalmonk on YouTube](https://www.youtube.com/watch?v=vU6AiEYED9E&list=PLD0F06AA0D2E8FFBA&index=150)
- **TODO** Nando de Freitas on YouTube

## Context and definitions
A random process (also called a stochastic process) is a collection of random variables (outputs) $\{Z_t : t \in S\}$, one for each element of some index set $S$. In a GP, any finite selection of indices $t_1, \dots, t_n \in S$ yields outputs $(Z_{t_1}, \dots, Z_{t_n})$ that follow a multivariate Gaussian distribution.

### Intuition
Often, the index set $S$ contains points in time. The *output* $Z_t$ associated with a point in time $t$ (let's call it the *state* at time $t$) is then one component of a multivariate Gaussian distribution. This means that each *state at some time* of the GP is another *dimension* of the overall Gaussian. The key point of a GP is that an **entire family of mappings from $S$ to outputs** is **described by the characteristics** of this overall distribution. It has some mean $\mu$ and, more importantly, a covariance $C$. The **covariance links** an **output** $Z_{t_i}$ to **all other outputs** (in general both past and future).

### Example: random lines
Let's pick some random slope $m \sim \mathcal{N}(0, 1)$ (Gaussian with mean 0 and variance 1). Our index can be any real number, so $S = \mathbb{R}$. If we set our outputs to be

$$
Z_t = mt
$$

we get, for any finite choice of indices $t_1, \dots, t_n$, an overall Gaussian (by the affine property)

$$
(Z_{t_1}, \dots, Z_{t_n})^T = (t_1, \dots, t_n)^T m, \quad m \sim \mathcal{N}(0, 1)
$$

Let's plot some samples from this process the straightforward way, by drawing a slope and evaluating the line at each index:

In [None]:
def plot_random_lines():
    ms = np.random.randn(8)
    t = np.linspace(0, 1, 5)
    # Shortcut for computing m*t for each m and t
    lines = np.outer(ms, t)
    plot_samples(t, lines, 4)
    
plot_random_lines()
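
As a quick numerical sanity check of the affine property, the sketch below draws many slopes and looks at each output $Z_t = mt$ separately. Since $m \sim \mathcal{N}(0, 1)$, each $Z_t$ should have mean 0 and standard deviation $|t|$. The seed and the sample count are assumptions chosen only for this illustration.

```python
import numpy as np

# Assumptions for this check: a fixed seed and a large number of slopes
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 5)
ms = rng.standard_normal(100_000)

# One row per sampled slope m, one column per index t
lines = np.outer(ms, t)

# Z_t = m * t with m ~ N(0, 1), so each column should have
# mean close to 0 and standard deviation close to |t|
empirical_mean = lines.mean(axis=0)
empirical_std = lines.std(axis=0)
print(np.round(empirical_mean, 2))
print(np.round(empirical_std, 2))
```

The printed standard deviations should roughly follow the values in `t`, up to sampling noise.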

## Mean and covariance
Remember the key point: *we can express a process like random lines by the $\mu$ and $C$ of a Gaussian distribution*. In general, we let $\mu$ be a function $\mu: S \to \mathbb{R}$, so that the mean vector for indices $t_1, \dots, t_n$ has entries $\mu_i = \mu(t_i)$. The covariance matrix $C$ is defined by another function $k: S \times S \to \mathbb{R}$, such that $C_{ij} = k(t_i, t_j)$. For $C$ to be a valid covariance matrix, $k$ must make it symmetric and positive semi-definite. With this, we have defined our overall Gaussian. Note that the covariance of two outputs is defined by a function on their indices: $\mathrm{Cov}(Z_{t_i}, Z_{t_j}) = k(t_i, t_j)$.
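
The construction above can be sketched in code as follows. `gp_prior_samples`, `zero_mean` and `independent_cov` are hypothetical names introduced only for this illustration. The covariance function used here is an arbitrary placeholder (unit variance, no correlation between different indices), not one taken from the text.

```python
import numpy as np


def gp_prior_samples(t, mean_fn, cov_fn, n_samples, rng=None):
    """Hypothetical helper: sample (Z_{t_1}, ..., Z_{t_n}) for the indices in t."""
    rng = np.random.default_rng() if rng is None else rng
    t = np.asarray(t, dtype=float)
    # Mean vector: mu_i = mu(t_i)
    mu = np.array([mean_fn(ti) for ti in t], dtype=float)
    # Covariance matrix: C_ij = k(t_i, t_j)
    C = np.array([[cov_fn(ti, tj) for tj in t] for ti in t], dtype=float)
    # Each row of the result is one draw from N(mu, C)
    samples = rng.multivariate_normal(mu, C, size=n_samples)
    return mu, C, samples


def zero_mean(t):
    return 0.0


def independent_cov(ti, tj):
    # Placeholder kernel (an assumption, not from the text):
    # unit variance and no correlation between different indices
    return 1.0 if ti == tj else 0.0


t = np.linspace(0, 1, 5)
mu, C, samples = gp_prior_samples(
    t, zero_mean, independent_cov, n_samples=8, rng=np.random.default_rng(0))
print(C)
print(samples.shape)
```

The returned `samples` array has one row per sample and one column per index, which is the layout `plot_samples` expects for `Zs`.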

### Random lines revisited
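We can re-formulate the random lines process. Its mean function is zero, because $\mathrm{E}[Z_t] = t \, \mathrm{E}[m] = 0$, and its covariance function is $k(t_i, t_j) = \mathrm{Cov}(m t_i, m t_j) = t_i t_j \, \mathrm{Var}(m) = t_i t_j$. Let's draw some samples from that: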

In [None]:
# Ignore the warning, it's due to numerical precision errors.
# You can modify the function to verify that all eigenvalues
# of C are non-negative or very close to zero.

def plot_random_lines_2():
    # Define input set
    t = np.linspace(0, 1, 5)
    # Calculate mu and C
    mu = np.zeros(len(t))
    C = np.outer(t, t)  # C[i, j] = t[i] * t[j]
    # Draw samples
    lines = np.random.multivariate_normal(mu, C, 8)
    plot_samples(t, lines, 4)
    
    
plot_random_lines_2()
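
The comment in the cell above suggests checking the eigenvalues of $C$. A minimal sketch of that check, assuming the same `t` as in `plot_random_lines_2` and an arbitrary tolerance `tol` for "very close to zero":

```python
import numpy as np

t = np.linspace(0, 1, 5)
C = np.outer(t, t)  # C[i, j] = t[i] * t[j]

# eigvalsh is meant for symmetric matrices and returns
# real eigenvalues in ascending order
eigenvalues = np.linalg.eigvalsh(C)
print(eigenvalues)

# Assumption: anything within tol of zero counts as zero
tol = 1e-10
is_psd = bool(np.all(eigenvalues >= -tol))
n_nonzero = int(np.sum(np.abs(eigenvalues) > tol))
print(is_psd, n_nonzero)
```

Because $C = t t^T$ is built from a single vector, it has only one nonzero eigenvalue, equal to $\sum_i t_i^2$. The remaining eigenvalues are zero up to rounding, and tiny negative values among them are what can trigger the warning.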

**To be continued...**