# Learning a quadratic pseudo-metric from distance measurements

Recall that a pseudo-metric is a generalization of a metric in which the distance between two distinct points can be zero.
We are given a set of $N$ pairs of points in $\mathbf{R}^n$, $x_1, \ldots, x_N$, and $y_1, \ldots, y_N$, together with a set of distances $d_1, \ldots, d_N > 0$.
  The goal is to find (or estimate or learn) a quadratic pseudo-metric $d$
  $$d(x,y) =  \left( (x-y)^T P(x-y) \right)^{1/2},$$
  $P\in \mathbf{S}^n_{+}$, which approximates the given distances, i.e., $d(x_i, y_i) \approx d_i$. (The pseudo-metric $d$ is a metric only when $P \succ 0$; when $P\succeq 0$ is singular, it is a pseudo-metric.)
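
As a small illustration of the definition above, the sketch below evaluates $d(x,y)$ with NumPy for a singular $P$. The function name `quad_dist` and the example matrix and points are made up for this illustration and are not part of the exercise data. With a rank-deficient $P$, two distinct points can end up at distance zero, which is what separates a pseudo-metric from a metric.

```python
import numpy as np

def quad_dist(P, x, y):
    """Quadratic pseudo-metric sqrt((x-y)^T P (x-y)) for a PSD matrix P."""
    z = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    # Clip tiny negative values caused by round-off before taking the root
    return float(np.sqrt(max(z @ P @ z, 0.0)))

# Hypothetical example: P ignores the second coordinate, so it is singular
P_singular = np.array([[1.0, 0.0],
                       [0.0, 0.0]])
a = np.array([1.0, 2.0])
b = np.array([1.0, 5.0])   # differs from a only in the ignored coordinate
c = np.array([4.0, 2.0])   # differs from a in the first coordinate

dist_ab = quad_dist(P_singular, a, b)   # distinct points, zero distance
dist_ac = quad_dist(P_singular, a, c)   # ordinary nonzero distance
```

Here `a` and `b` are different points but `dist_ab` is zero, while `dist_ac` equals the gap in the first coordinate.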
  
  To do this, we will choose $P\in \mathbf{S}^n_+$ that minimizes the mean squared error objective
  
  $$f(P)=\frac{1}{N}\sum_{i=1}^N (d_i - d(x_i,y_i))^2.$$
  
  ### Theoretical part
  1. Show that the objective function $f$ is convex. (Hint: expand the square and see what happens.)
  2. Show that the convex program $\text{minimize }f(P)$, $P\succeq 0$ can be expressed as an equivalent conic program with a linear objective and a number of conic constraints using the cones $R^n_+$ (nonnegative orthant), $Q^n$ (second order cone), $Q_r^n$ (rotated second order cone), and $S^n_+$ (positive semidefinite cone).
  
  ### Programming part
  1. Solve the program $\text{minimize }f(P)$, $P\succeq 0$, preferably using a modelling package like ``cvxpy``. Note that "under the hood" your modelling package translates the program to the conic form from point 2 of the theoretical part.
  2. Use the obtained $P$ to measure the mean squared error on the test data ``X_test``, ``Y_test``, ``d_test``.
  
---- 
*This exercise originates from the "Additional Exercises" collection for the Convex Optimization textbook by S. Boyd and L. Vandenberghe. Used with permission.*

In [2]:
import cvxpy as cp
import numpy as np
from scipy import linalg as la

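1. Write $z_i = x_i - y_i$. A sum of convex functions is convex, so it is sufficient to show that each term $(d_i - \sqrt{z_i^T P z_i})^2$ is convex in $P$. Squaring a convex function does not preserve convexity in general, so following the hint we expand the square:
$$(d_i - \sqrt{z_i^T P z_i})^2 = d_i^2 - 2 d_i \sqrt{z_i^T P z_i} + z_i^T P z_i.$$
The first term is constant. The map $P \mapsto z_i^T P z_i$ is linear in $P$ and nonnegative on $\mathbf{S}^n_+$, so $\sqrt{z_i^T P z_i}$ is concave as the composition of the concave square root with a linear function. Multiplying it by $-2 d_i < 0$ makes the second term convex. The third term is linear in $P$.
So each term is convex, and $f(P)$ is convex as an average of convex functions.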
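
The convexity claim can also be sanity-checked numerically. The sketch below is not a proof: it samples random positive semidefinite matrices and tests the defining inequality $f(\theta P_1 + (1-\theta)P_2) \leq \theta f(P_1) + (1-\theta) f(P_2)$. The helper names `mse_objective` and `random_psd`, the seed, and the problem sizes are arbitrary choices made for this illustration.

```python
import numpy as np

def mse_objective(P, Z, d):
    """f(P) = mean_i (d_i - sqrt(z_i^T P z_i))^2, where row i of Z is z_i = x_i - y_i."""
    v = np.einsum("ij,jk,ik->i", Z, P, Z)      # z_i^T P z_i for every row
    return float(np.mean((d - np.sqrt(np.maximum(v, 0.0))) ** 2))

def random_psd(rng, n):
    """Random PSD matrix A A^T (PSD by construction)."""
    A = rng.standard_normal((n, n))
    return A @ A.T

rng = np.random.default_rng(0)                 # arbitrary seed for this illustration
n, N = 4, 50
Z = rng.standard_normal((N, n))
d = np.abs(rng.standard_normal(N)) + 0.1       # strictly positive target distances

# Largest observed value of f(convex combination) - (convex combination of f values)
max_violation = 0.0
for _ in range(200):
    P1, P2 = random_psd(rng, n), random_psd(rng, n)
    theta = rng.uniform()
    lhs = mse_objective(theta * P1 + (1 - theta) * P2, Z, d)
    rhs = theta * mse_objective(P1, Z, d) + (1 - theta) * mse_objective(P2, Z, d)
    max_violation = max(max_violation, lhs - rhs)
```

If $f$ is convex, `max_violation` should not exceed zero by more than floating-point round-off.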


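2. To express the convex program $\min f(P)$ subject to $P \succeq 0$ as a conic program with a linear objective, write $z_i = x_i - y_i$ and expand the square, so that
$$f(P) = \frac{1}{N}\sum_{i=1}^N \left( d_i^2 - 2 d_i \sqrt{z_i^T P z_i} + z_i^T P z_i \right).$$
Only the square root is nonlinear in $P$. For each $i$ we introduce a new variable $u_i$ with the constraint $u_i \leq \sqrt{z_i^T P z_i}$. Since $d_i > 0$, the objective decreases as $u_i$ grows, so the constraint holds with equality at the optimum. In particular $u_i \geq 0$ there, so the constraint can be written as $u_i^2 \leq z_i^T P z_i$. With the convention $Q_r^3 = \{(a,b,c) : 2ab \geq c^2,\ a \geq 0,\ b \geq 0\}$, this is the rotated second order cone constraint $(z_i^T P z_i, \tfrac{1}{2}, u_i) \in Q_r^3$, whose entries are affine in $(P, u)$. Thus the problem becomes:

$$\min_{P,\,u} \ \frac{1}{N}\sum_{i=1}^N \left( d_i^2 - 2 d_i u_i + z_i^T P z_i \right)$$

subject to:

$$(z_i^T P z_i, \tfrac{1}{2}, u_i) \in Q_r^3, \quad i = 1, \ldots, N,$$
$$P \in S^n_+.$$

The objective is linear in $(P, u)$ up to the constant $\frac{1}{N}\sum_i d_i^2$, and all constraints are conic.
So our problem can be expressed as a conic program with a linear objective, $QED$.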
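
A constraint of the form $u^2 \leq v$, with $v = z^T P z$ linear in $P$, belongs to the rotated second order cone as $(v, \tfrac{1}{2}, u) \in Q_r^3$ under the convention $2ab \geq c^2$ (an assumption here, since conventions differ between solvers). It can equivalently be written with the standard second order cone as $\|(v - \tfrac{1}{2}, \sqrt{2}\,u)\|_2 \leq v + \tfrac{1}{2}$, because $(v + \tfrac{1}{2})^2 - (v - \tfrac{1}{2})^2 = 2v$. The sketch below checks this equivalence on random points. The function names are made up for this illustration.

```python
import numpy as np

def in_rotated_cone(v, u):
    """Direct test of u^2 <= v with v >= 0, i.e. (v, 1/2, u) satisfies 2ab >= c^2."""
    return bool(v >= 0 and u * u <= v)

def in_standard_cone(v, u):
    """Equivalent second order cone test: ||(v - 1/2, sqrt(2)*u)||_2 <= v + 1/2."""
    return bool(np.hypot(v - 0.5, np.sqrt(2.0) * u) <= v + 0.5)

rng = np.random.default_rng(1)                 # arbitrary seed for this illustration
samples = rng.uniform(-3.0, 3.0, size=(1000, 2))

# Skip points within a small margin of the boundary u^2 = v,
# where round-off could flip either test
agree = all(
    in_rotated_cone(v, u) == in_standard_cone(v, u)
    for v, u in samples
    if abs(u * u - v) > 1e-6
)
```

Modelling packages typically perform conversions of this kind automatically, so either form can be used when stating the program by hand.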

In [3]:
# In this box we generate the input data

np.random.seed(5680)

n = 5 # Dimension
N = 100 # Number of samples

P = np.random.randn(n,n)
P = P.dot(P.T) + np.identity(n)
sqrtP = la.sqrtm(P)

x = np.random.randn(N,n)
y = np.random.randn(N,n)

d = np.linalg.norm(sqrtP.dot((x-y).T),axis=0)    # distances according to metric P
d = np.maximum(d+np.random.randn(N),0)           # add random noise

N_test = 10 # Samples for test set
X_test = np.random.randn(N_test,n)
Y_test = np.random.randn(N_test,n)
d_test = np.linalg.norm(sqrtP.dot((X_test-Y_test).T),axis=0)  # distances according to metric P
d_test = np.maximum(d_test+np.random.randn(N_test),0)         # add random noise

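# S plays the role of P in f(P); the name P already holds the ground-truth matrix
S = cp.Variable((n,n), PSD=True)

# z_i^T S z_i is linear in S; v collects these squared model distances, shape (N,)
Z = x - y
v = cp.sum(cp.multiply(S @ Z.T, Z.T), axis=0)

# Expanded MSE: mean(d_i^2 - 2 d_i sqrt(v_i) + v_i).
# Convex (and DCP-compliant) because d_i >= 0 and sqrt is concave.
objective = cp.Minimize(cp.sum(v - 2 * cp.multiply(d, cp.sqrt(v))) / N + np.mean(d**2))

problem = cp.Problem(objective)
problem.solve()


S_opt = S.value
print("Optimal S:", S_opt)

# Evaluate on the test set with the learned S_opt (not the ground-truth P)
Z_test = X_test - Y_test
v_test = np.sum((Z_test @ S_opt) * Z_test, axis=1)
d_pred = np.sqrt(np.maximum(v_test, 0))   # clip round-off negatives before the root
mse = np.mean((d_test - d_pred)**2)

print("MSE on test set:", mse)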
