### Distance Metrics for Probability Distributions

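We'll look at three ways of measuring the distance between probability distributions (the Kullback–Leibler divergence, the Hellinger distance, and the Fisher-Rao metric) and see how different distributions compare under them. The final section covers the SoftAbs metric.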

### Creating probability distributions

In [None]:
import pymc3 as pm
import numpy as np

In [None]:
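# Two arrays of 1000 uniform samples each, so that they can be compared element by element.
dist_1 = np.random.uniform(low=0, high=5, size=1000)
dist_2 = np.random.uniform(low=0, high=1, size=1000)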
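
The divergence functions later in this notebook compare probability vectors element by element, whereas the arrays above hold raw samples. One way to bridge that gap, sketched here on the assumption that a shared histogram is an acceptable density estimate, is to bin both sample sets on common edges and normalise the counts. The helper name `to_probabilities`, the sample ranges, the bin count, and the seed are illustrative choices, not part of the original notebook.

```python
import numpy as np

def to_probabilities(samples, bin_edges):
    """Bin samples on fixed edges and normalise the counts to sum to 1."""
    counts, _ = np.histogram(samples, bins=bin_edges)
    return counts / counts.sum()

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
samples_1 = rng.uniform(low=0, high=5, size=1000)
samples_2 = rng.uniform(low=0, high=1, size=1000)  # assumed range for illustration

# Shared edges so that bin k covers the same interval for both distributions.
edges = np.linspace(0, 5, 21)  # 20 bins (an assumed resolution)
prob_1 = to_probabilities(samples_1, edges)
prob_2 = to_probabilities(samples_2, edges)
```

With these inputs `prob_2` is zero in every bin above 1 while `prob_1` still has mass there, so the KL divergence from `prob_1` to `prob_2` is infinite, whereas the Hellinger distance stays finite.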

### Create Models

### Creating Manifolds

Torus. Sphere?

### Kullback–Leibler divergence

In [None]:
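def KLdivergence(dist_1, dist_2):
    # Inputs are discrete probability vectors (NumPy arrays) over the same bins.
    # Bins where dist_1 is 0 contribute 0 by convention, so they are masked out.
    mask = dist_1 > 0
    distance = np.sum(dist_1[mask] * np.log(dist_1[mask] / dist_2[mask]))
    return distance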
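
For discrete distributions $P$ and $Q$ on the same support, the quantity computed above is $D_{KL}(P \| Q) = \sum_i p_i \log(p_i / q_i)$. It is not symmetric in its arguments, which is why it is called a divergence rather than a distance. A small sketch with hand-picked vectors (the values and the helper name `kl` are illustrative):

```python
import numpy as np

def kl(p, q):
    """Discrete KL divergence sum_i p_i * log(p_i / q_i), skipping bins with p_i == 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])

forward = kl(p, q)  # D_KL(P || Q)
reverse = kl(q, p)  # D_KL(Q || P)
print(forward, reverse)  # the two values differ
print(kl(p, p))          # 0.0 for identical distributions
```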

### Hellinger Distance

In [None]:
def hellinger(dist_1, dist_2):
    # Inputs are discrete probability vectors over the same bins; the result lies in [0, 1].
    distance = np.sqrt(0.5 * ((np.sqrt(dist_1) - np.sqrt(dist_2))**2).sum())
    return distance
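
The function above implements $H(P, Q) = \frac{1}{\sqrt{2}} \lVert \sqrt{P} - \sqrt{Q} \rVert_2$, which is symmetric and bounded between 0 and 1. A quick check of those properties with hand-picked vectors (the values are illustrative, and the function is repeated so the cell runs on its own):

```python
import numpy as np

def hellinger(dist_1, dist_2):
    return np.sqrt(0.5 * ((np.sqrt(dist_1) - np.sqrt(dist_2)) ** 2).sum())

p = np.array([0.5, 0.5, 0.0])
q = np.array([0.9, 0.1, 0.0])
r = np.array([0.0, 0.0, 1.0])  # support disjoint from p

print(hellinger(p, p))                     # 0.0: identical distributions
print(hellinger(p, q) == hellinger(q, p))  # True: symmetric
print(np.isclose(hellinger(p, r), 1.0))    # True: disjoint supports give the maximum
```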

### Fisher-Rao Metric

The Fisher-Rao metric is a particular Riemannian metric on a statistical manifold, where each point is a probability distribution and the coordinates are the parameters of that distribution. In this small snippet we make do with a rough sketch rather than a full implementation.

In [None]:
def fisher_rao(prob, score_1, score_2):
    # Metric component g_12 = sum_x p(x) * d_1 log p(x) * d_2 log p(x).
    # score_k is the derivative of log p(x) along coordinate k, on the same grid as prob.
    component = np.sum(score_1 * score_2 * prob)
    return component
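
For discrete distributions, the geodesic distance induced by the Fisher-Rao metric has a closed form: the map $p \mapsto \sqrt{p}$ sends the probability simplex onto part of a sphere, and the distance becomes $2 \arccos\left(\sum_i \sqrt{p_i q_i}\right)$. A minimal sketch, assuming probability vectors on a shared support (the function name and example values are illustrative):

```python
import numpy as np

def fisher_rao_distance(p, q):
    """Geodesic Fisher-Rao distance between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Bhattacharyya coefficient: cosine of the angle between sqrt(p) and sqrt(q).
    bc = np.sum(np.sqrt(p * q))
    # Clipping guards against rounding pushing bc slightly above 1.
    return 2.0 * np.arccos(np.clip(bc, 0.0, 1.0))

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])
d_pq = fisher_rao_distance(p, q)
d_max = fisher_rao_distance([1.0, 0.0], [0.0, 1.0])  # disjoint supports
print(d_pq)
print(np.isclose(d_max, np.pi))  # True: pi is the largest possible value
```

The same coefficient also determines the Hellinger distance, since $H^2 = 1 - \sum_i \sqrt{p_i q_i}$, so the two quantities always rank pairs of distributions in the same order.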

### SoftAbs Metric

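The SoftAbs metric is built from matrix exponentials of $H$, the Hessian of $-\log \pi(q)$: it keeps the eigenvectors of $H$ and replaces each eigenvalue $\lambda_i$ with the smooth absolute value $\lambda_i \coth(\alpha \lambda_i)$, where $\alpha$ controls how sharp the approximation is.
We need the gradients of two terms: the quadratic form in the momenta and the log determinant of the metric.
Here $p$ is the momenta and $\pi(q)$ is the $N$-dimensional target density.

$H = Q \Lambda Q^T$

$\Lambda = \mathrm{Diag}(\lambda_i)$

$\Lambda$ is the diagonal matrix of eigenvalues of $H$ and $Q$ is the corresponding matrix of eigenvectors.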
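
The decomposition above translates directly to NumPy. The sketch below builds the SoftAbs matrix $Q \, \mathrm{Diag}(\lambda_i \coth(\alpha \lambda_i)) \, Q^T$ from a symmetric matrix. The function name, the example matrix, and the value of `alpha` are assumptions for illustration, and the code assumes no eigenvalue is exactly zero.

```python
import numpy as np

def softabs_metric(H, alpha=1.0):
    """Return Q . diag(lambda_i * coth(alpha * lambda_i)) . Q^T for symmetric H."""
    lam, Q = np.linalg.eigh(H)         # eigenvalues (ascending) and eigenvectors
    soft = lam / np.tanh(alpha * lam)  # lambda * coth(alpha * lambda), always positive
    return Q @ np.diag(soft) @ Q.T

# An indefinite symmetric matrix: one positive and one negative eigenvalue.
H = np.array([[2.0, 0.5],
              [0.5, -1.0]])
metric = softabs_metric(H, alpha=10.0)

print(np.linalg.eigvalsh(H))       # mixed signs
print(np.linalg.eigvalsh(metric))  # all positive, close to the absolute eigenvalues of H
```

Larger values of `alpha` bring the result closer to the exact absolute value of each eigenvalue, at the cost of a less smooth map near zero.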

In [None]:
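def grad_quad(H_ij, dH, p, alpha=1.0):
    # H_ij = Q . diag(lambda_i) . Q^T; assumes distinct, non-zero eigenvalues
    lambda_i, Q = np.linalg.eigh(H_ij)
    soft = lambda_i / np.tanh(alpha * lambda_i)  # lambda_i . coth(alpha . lambda_i)
    D = np.diag((Q.T @ p) / soft)
    # J: divided differences of soft off the diagonal, its derivative on the diagonal
    J = (soft[:, None] - soft) / (lambda_i[:, None] - lambda_i + np.eye(len(soft)))
    np.fill_diagonal(J, 1 / np.tanh(alpha * lambda_i) - alpha * lambda_i / np.sinh(alpha * lambda_i)**2)
    # dH is the derivative of H_ij with respect to one coordinate of q
    grad = -np.trace(Q @ D @ J @ D @ Q.T @ dH)
    return grad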

In [None]:
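def grad_log(H_ij, dH, alpha=1.0):
    # H_ij = Q . diag(lambda_i) . Q^T; assumes non-zero eigenvalues
    lambda_i, Q = np.linalg.eigh(H_ij)
    soft = lambda_i / np.tanh(alpha * lambda_i)  # lambda_i . coth(alpha . lambda_i)
    # R is diagonal, so R ◦ J only needs the diagonal of J: d(soft) / d(lambda_i)
    J_diag = 1 / np.tanh(alpha * lambda_i) - alpha * lambda_i / np.sinh(alpha * lambda_i)**2
    R_J = np.diag(J_diag / soft)
    # dH is the derivative of H_ij with respect to one coordinate of q
    grad = np.trace(Q @ R_J @ Q.T @ dH)
    return grad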
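
One way to sanity-check the trace formula for the log determinant is to compare it with a finite difference along a one-parameter family $H(t) = A + tB$, for which $dH = B$. The sketch below does this in NumPy. The matrices, `alpha`, the step size, and all function names are illustrative assumptions, and the eigenvalues are assumed to be distinct and non-zero.

```python
import numpy as np

def softabs_eigs(lam, alpha):
    """lambda * coth(alpha * lambda) for each eigenvalue."""
    return lam / np.tanh(alpha * lam)

def log_det_softabs(H, alpha):
    """Log determinant of the SoftAbs matrix: sum of the log SoftAbs eigenvalues."""
    return np.sum(np.log(softabs_eigs(np.linalg.eigvalsh(H), alpha)))

def grad_log_det(H, dH, alpha):
    """Trace(Q . (R o J) . Q^T . dH); R is diagonal, so only diag(J) is needed."""
    lam, Q = np.linalg.eigh(H)
    j_diag = 1 / np.tanh(alpha * lam) - alpha * lam / np.sinh(alpha * lam) ** 2
    r_j = np.diag(j_diag / softabs_eigs(lam, alpha))
    return np.trace(Q @ r_j @ Q.T @ dH)

A = np.array([[2.0, 0.5], [0.5, -1.0]])
B = np.array([[0.3, 0.1], [0.1, 0.2]])  # dH/dt for H(t) = A + t * B
alpha, eps = 1.0, 1e-6

analytic = grad_log_det(A, B, alpha)
# Central finite difference of the log determinant at t = 0.
numeric = (log_det_softabs(A + eps * B, alpha)
           - log_det_softabs(A - eps * B, alpha)) / (2 * eps)
print(analytic, numeric)  # the two values should agree to several decimal places
```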