# Constraints Learning

Assume you have an optimization problem written in terms of $\mathbf{\theta}$:

$\min_{\mathbf{\theta}} f(\mathbf{\theta})$

where $f$ may be, for instance, a polynomial or a rational function, and $\theta \in \mathbb{R}^N$ may be multidimensional. We assume that you can write the above problem in an equivalent QCQP form by using a "lifting function" $\mathbf{l}(\theta) \in \mathbb{R}^M$ and defining the higher-dimensional lifted state vector

$\mathbf{x}(\theta) = \begin{bmatrix}1 \\ \theta \\ z_1 \\ \vdots \\ z_M \end{bmatrix} = 
\begin{bmatrix}1 \\ \theta \\ \mathbf{l}(\theta) \end{bmatrix} \in \mathbb{R}^{1+N+M}$ 

Now we assume that each of the added constraints can itself be written as a quadratic function: 

$l_m(\theta) - z_m = \mathbf{x}(\theta)^\top \mathbf{A}_m \mathbf{x}(\theta) = 0$

where $\mathbf{A}_m$ ($m=1\ldots M$) are the constraint matrices. 
Sometimes, there may also exist redundant constraints, meaning some other matrices such that

$\mathbf{x}(\theta)^\top \mathbf{B}_m \mathbf{x}(\theta) = 0$. 

The goal of this notebook is to find, for a given lifting function $\mathbf{l}(\theta)$, the form of the redundant vs. primal constraints. 

**Note that currently we just find all constraints and do not distinguish between primal (moment) constraints and redundant constraints.**

In [None]:
import numpy as np
import matplotlib.pylab as plt
import pandas as pd
from IPython.display import display
%reload_ext autoreload
%autoreload 2

import shutil
usetex = shutil.which('latex') is not None
print("found latex:", usetex)
plt.rcParams.update({
    "text.usetex": usetex,
    "font.family": "DejaVu Sans",
    "font.size": 12,
})
plt.rc('text.latex', preamble=r'\usepackage{bm}')
figsize = 7

#%matplotlib notebook
%matplotlib inline

from lifters.plotting_tools import savefig



# 1. Lifting functions

Currently implemented setups:

Poly4Lifter: 

- univariate quartic polynomial
- $\mathbf{x}^\top = [1, t, \underbrace{t^2}_{z}]$
- no redundant constraints

Poly6Lifter: 

- univariate sextic polynomial
- $\mathbf{x}^\top = [1, t, \underbrace{t^2}_{z_1}, \underbrace{t^3}_{z_2}]$
- leads to one redundant constraint
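
As a minimal sketch of what such matrices look like for this lifted vector, the two moment constraints and one quadratic identity that follows from them can be written down by hand and checked numerically. The helpers `lift_poly6` and `sym` are illustrative names and not part of the `lifters` package, and the identity $z_1^2 = t z_2$ is assumed here to be the redundant constraint.

```python
import numpy as np

def lift_poly6(t):
    """Lifted vector x = [1, t, z1, z2] with z1 = t^2, z2 = t^3."""
    return np.array([1.0, t, t**2, t**3])

def sym(i, j, n=4):
    """Symmetric matrix E such that x^T E x = x_i * x_j."""
    E = np.zeros((n, n))
    E[i, j] += 0.5
    E[j, i] += 0.5
    return E

# Moment constraints (they define the lifted variables):
A_1 = sym(1, 1) - sym(0, 2)  # t^2 - z1 = 0
A_2 = sym(1, 2) - sym(0, 3)  # t * z1 - z2 = 0
# Assumed redundant constraint (implied by the two above):
B_1 = sym(2, 2) - sym(1, 3)  # z1^2 - t * z2 = 0

rng = np.random.default_rng(0)
residuals = []
for t in rng.uniform(-2.0, 2.0, size=5):
    x = lift_poly6(t)
    residuals.append([abs(x @ M @ x) for M in (A_1, A_2, B_1)])
max_residual = float(np.max(residuals))
print("max residual over all matrices and samples:", max_residual)
```

All three quadratic forms vanish on every lifted point, even though `B_1` adds no new information about $t$.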

RangeOnlyLifter:  

- $N$ positions in $d$ dimensions

- $\mathbf{x}^\top = [1, \mathbf{x}_1, \mathbf{x}_2, \cdots , 
\underbrace{||\mathbf{x}_1||^2}_{z_1}, 
\underbrace{||\mathbf{x}_2||^2}_{z_2}, \cdots]$

- $N$ moment constraints, no redundant constraints

PoseLandmarkLifter: 

- $K$ landmarks $\mathbf{y}_k$, $N$ poses in $d$ dimensions
- $\mathbf{x}^\top = [1, \mathbf{x}_1, \text{vec}(\mathbf{C}_1), \mathbf{y}_1, \cdots , 
\underbrace{\mathbf{C}_1\mathbf{y}_1}_{\mathbf{z}_1}, \underbrace{\mathbf{C}_2\mathbf{y}_1}_{\mathbf{z}_2}, \cdots]$
- $KNd + Nd^2$ moment constraints, many redundant constraints


Stereo1DLifter: 

- $K$ landmarks $y_k$, 1 position in 1 dimension ($\theta=x$)
- $\mathbf{x}^\top = [1, x, \underbrace{\frac{1}{x-y_1}}_{z_1}, \cdots, \underbrace{\frac{1}{x-y_K}}_{z_K}]$
- $K$ moment constraints, $K(K-1)/2$ redundant constraints
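
The constraint counts above can be made concrete with a small numerical check. The moment constraints follow directly from the definition, $z_k (x - y_k) - 1 = 0$. For the redundant ones, the sketch below assumes the pairwise identity $z_i - z_j - (y_i - y_j) z_i z_j = 0$, which follows from subtracting the two fractions and gives exactly $K(K-1)/2$ constraints; the notebook itself does not state their form, so treat this as one plausible candidate. The landmark and position values are made up.

```python
import numpy as np
from itertools import combinations

K = 3
landmarks = np.array([-0.5, 0.2, 0.9])  # hypothetical landmark positions y_k
x_pos = 2.0                             # hypothetical position, distinct from all y_k

# Lifted variables z_k = 1 / (x - y_k)
z = 1.0 / (x_pos - landmarks)

# K moment constraints: z_k * (x - y_k) - 1 = 0
moment = [z[k] * (x_pos - landmarks[k]) - 1.0 for k in range(K)]

# Assumed redundant constraints, one per landmark pair:
# z_i - z_j - (y_i - y_j) * z_i * z_j = 0
redundant = [
    z[i] - z[j] - (landmarks[i] - landmarks[j]) * z[i] * z[j]
    for i, j in combinations(range(K), 2)
]

max_residual = max(abs(r) for r in moment + redundant)
print(len(moment), "moment,", len(redundant), "redundant, max residual:", max_residual)
```

Both families are quadratic in $\mathbf{x}$ once the leading 1 is used to homogenize the linear terms.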

Stereo2DLifter: 

- $K$ landmarks $\mathbf{y}_k$, 1 pose in 2 dimensions ($\mathbf{\theta}=(x, y, \alpha)$, or equivalently the transform matrix $\mathbf{T}=\begin{bmatrix} \mathbf{c}_1(\alpha) & \mathbf{c}_2(\alpha) & \begin{bmatrix} x \\ y \end{bmatrix} \\ 0 & 0 & 1 \end{bmatrix}$)

- $\mathbf{x}^\top = \begin{bmatrix} 1, \mathbf{c}_1, \mathbf{c}_2, x, y, \underbrace{\frac{1}{\mathbf{e}_y^\top\mathbf{T}\mathbf{y}_1}\mathbf{T}\mathbf{y}_1}_{z_1}, \cdots, 
\underbrace{\frac{1}{\mathbf{e}_y^\top\mathbf{T}\mathbf{y}_K}\mathbf{T}\mathbf{y}_K}_{z_K}\end{bmatrix}$

with $\mathbf{e}_y$ the second column of the $3 \times 3$ identity matrix.

In [None]:
lifter_type = "stereo"
d = 1
level = 0
n_landmarks = 3

In [None]:
from lifters.custom_lifters import Poly4Lifter, Poly6Lifter, RangeOnlyLifter
from lifters.stereo1d_lifter import Stereo1DLifter
from lifters.stereo2d_lifter import Stereo2DLifter
from lifters.stereo3d_lifter import Stereo3DLifter
lifter = None
if lifter_type == "range":
    lifter = RangeOnlyLifter(n_positions=n_landmarks, d=d)
elif lifter_type == "poly":
    if d == 4:
        lifter = Poly4Lifter()
    elif d == 6:
        lifter = Poly6Lifter()
# stereo examples, working
elif lifter_type == "stereo":
    if d == 1:
        lifter = Stereo1DLifter(n_landmarks=n_landmarks)
    elif d == 2:
        lifter = Stereo2DLifter(n_landmarks=n_landmarks, level=level)
    elif d == 3:
        lifter = Stereo3DLifter(n_landmarks=n_landmarks, level=level)
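#from lifters.landmark_lifter import PoseLandmarkLifter
# just finds the moment constraint:
#lifter = PoseLandmarkLifter(n_landmarks=2, n_poses=1, d=2)

# fail early if no lifter matches the chosen settings
if lifter is None:
    raise ValueError(f"no lifter for lifter_type={lifter_type!r}, d={d}")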


# 2. "Learn" constraint matrices

The idea is to learn the nullspace of the matrix composed of many randomly generated feasible and lifted points:  
Call $\mathbf{x}_{i}$ the lifted vector of the $i$-th randomly generated setup and $\mathbf{X}_{i} = \mathbf{x}_{i}{\mathbf{x}_{i}}^\top$; then we know that

$\forall i, m: \quad \text{trace}(\mathbf{x}_{i}{\mathbf{x}_{i}}^\top\mathbf{A}_m) = 0$

$\iff$ 

$\forall i, m: \quad  \underbrace{\text{vec}(\mathbf{X}_{i})^\top}_{\mathbf{y}_{i}^\top} \underbrace{\text{vec}(\mathbf{A}_m)}_{\mathbf{a}_m} = 0$

$\iff$ 

$\mathbf{a}_m \in \mathcal{N}(\mathbf{Y}), \quad \mathbf{Y} = \begin{bmatrix} \mathbf{y}_1^{\top} \\ \vdots \\ \mathbf{y}_L^{\top}\end{bmatrix}$

where $L$ is the number of samples. Note that we can reduce the search to the upper triangular part of $\mathbf{A}_m$ since we know that the resulting matrix needs to be symmetric, but for simplicity we write everything in terms of the full matrix below. 
We find an orthonormal basis of the nullspace using an SVD or a QR decomposition, and then construct each $\mathbf{A}_m$ by undoing the (half-)vec operation. 

We call $N_0$ the dimension of the nullspace, or the number of basis vectors found.
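
The procedure above can be sketched in plain NumPy, independently of the `lifters` package. The example uses the lifted vector $[1, t, t^2, t^3]$ from the Poly6Lifter description; the function names, the sample count, and the threshold `eps` are assumptions made for this illustration and do not mirror `lifter.generate_Y` or `lifter.get_basis`.

```python
import numpy as np

def lift(t):
    """Lifted vector x = [1, t, t^2, t^3]."""
    return np.array([1.0, t, t**2, t**3])

dim = 4
iu = np.triu_indices(dim)  # upper-triangular (half-vec) indices

# Stack the half-vectorized outer products x x^T of L random feasible points.
rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=30)
Y = np.array([np.outer(lift(t), lift(t))[iu] for t in samples])

# Right singular vectors with (numerically) zero singular value span the nullspace.
_, S, Vt = np.linalg.svd(Y)
eps = 1e-8
rank = int(np.sum(S > eps))
basis = Vt[rank:]
N_0 = basis.shape[0]

def unvec(a):
    """Undo the half-vec: symmetric A with x^T A x = sum_{p<=q} a_pq x_p x_q."""
    A = np.zeros((dim, dim))
    A[iu] = a
    return 0.5 * (A + A.T)

A_learned = [unvec(a) for a in basis]

# Check the learned matrices on fresh points that were not used for learning.
test_points = rng.uniform(-1.0, 1.0, size=100)
max_error = max(abs(lift(t) @ A @ lift(t)) for t in test_points for A in A_learned)
print("Y shape:", Y.shape, "N_0:", N_0, "max error:", max_error)
```

Here the 10 half-vec entries contain only 7 distinct monomials ($t^0$ to $t^6$), so the nullspace has dimension $N_0 = 3$: two moment constraints plus one redundant constraint. The learned basis is orthonormal but generally dense, so each learned matrix is a linear combination of the hand-written ones.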

In [None]:
t = lifter.get_theta()
x = lifter.get_x(t)
print("unknowns shape", lifter.unknowns.shape)
print("theta shape", t.shape)
print("x shape", x.shape)

In [None]:
# generate many random setups and collect in matrix Y
Y = lifter.generate_Y(factor=3)
print("shape of setup matrix Y:", Y.shape)

In [None]:
from lifters.plotting_tools import plot_singular_values
# compute nullspace of Y

method = "qr"
eps = 1e-4
basis, S = lifter.get_basis(Y, method=method, eps=eps)
print("nullspace basis:", basis.shape)

fig, ax = plot_singular_values(S, eps)
savefig(fig, f"../_plots/svd_{lifter}.png")

In [None]:
A_known = lifter.get_A_known()
for Ai in A_known:
    x = lifter.get_x(lifter.get_theta())
    assert abs(x.T @ Ai @ x) <= 1e-10

In [None]:
# generate matrices from found nullspace

eps = 1e-5
A_list = lifter.generate_matrices(basis)

max_error = -np.inf

# testing only:
# make sure all constraints hold for new setups 
from lifters.stereo_lifter import get_theta_from_unknowns

for seed in range(1000):
    np.random.seed(seed)
    unknowns = lifter.generate_random_unknowns(replace=False)
    x = lifter.get_x(unknowns)
    
    for i, A in enumerate(A_list):
        ci = np.abs(x.T @ A @ x)
        max_error = max(max_error, ci)
        if seed == 0:
            print(f"error of matrix {i}: {ci:.1e}")
        if ci > eps:
            print(f"!! big error for seed {seed}, matrix {i}: {ci:.1e}")
print("max constraint error:", max_error)

In [None]:
# plot resulting matrices
from lifters.plotting_tools import plot_matrices
from math import ceil
n = 10
chunks = min(ceil(len(A_list) / n), 10)
vmin = np.min([np.min(A) for A in A_list])
vmax = np.max([np.max(A) for A in A_list])
plot_count = 0
for k in np.arange(chunks):
    fig, ax = plot_matrices(A_list, n_matrices=n, start_idx=k*n, 
                            colorbar=False, nticks=3,
                            vmin=vmin, vmax=vmax)
    if k in [0, 1, chunks-2, chunks-1]:
        savefig(fig, f"../_plots/A{plot_count}_{lifter}.png")
        plot_count += 1
    
Q, y = lifter.get_Q()
fig, ax = plt.subplots()
fig.set_size_inches(3, 3)
ax.matshow(np.abs(Q) > 1e-10)
ax.set_title("Q mask")
savefig(fig, f"../_plots/Q_{lifter}.png")

# 3. Solve dual problem

Using the learned matrices, we solve the dual problem

$
\begin{align} 
d_n^* = &\max_{\rho, \mathbf{\lambda}} -\rho \\
&\text{s.t. } \mathbf{Q} + \sum_{m=1}^n \lambda_m \mathbf{A}_m + \rho \mathbf{A}_0 \succeq 0
\end{align}
$

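where $n \leq N_0$ denotes the number of constraints we are adding and $\mathbf{A}_0$ is the matrix of the homogenization constraint $\mathbf{x}^\top \mathbf{A}_0 \mathbf{x} = 1$ (the first entry of $\mathbf{x}$ equals one). We then compare the obtained cost to the cost of the (hopefully globally optimal) solution obtained by solving the original problem with a simple local solver: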

$
\begin{align}
q^* &= \min_{\mathbf{\theta}} f(\mathbf{\theta})
\end{align}
$
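
To illustrate the comparison between $d_n^*$ and $q^*$ without an SDP solver, here is a toy sketch for a univariate quartic with $\mathbf{x} = [1, t, z]$ and a single constraint ($n=1$). The cost function, the grids, and the helper `best_minus_rho` are assumptions chosen for this example. For a fixed $\lambda$ the best $-\rho$ follows in closed form from a Schur complement, so a one-dimensional search over $\lambda$ is enough here; the notebook itself relies on `run_noise_study` for the real problems.

```python
import numpy as np

# Toy cost (an assumption for illustration): f(t) = t^4 - 2 t^2 + 0.5 t + 1,
# written as x^T Q x with x = [1, t, z] and z = t^2.
Q = np.array([[1.0, 0.25, -1.0],
              [0.25, 0.0, 0.0],
              [-1.0, 0.0, 1.0]])
A_0 = np.diag([1.0, 0.0, 0.0])      # homogenization: x^T A_0 x = 1
A_1 = np.array([[0.0, 0.0, -0.5],
                [0.0, 1.0, 0.0],
                [-0.5, 0.0, 0.0]])  # t^2 - z = 0

def best_minus_rho(lam):
    """Largest -rho such that Q + lam * A_1 + rho * A_0 is PSD (Schur complement)."""
    M = Q + lam * A_1
    block = M[1:, 1:]
    if np.min(np.linalg.eigvalsh(block)) <= 0:
        return -np.inf  # skip lambdas whose lower-right block is not positive definite
    return M[0, 0] - M[0, 1:] @ np.linalg.solve(block, M[0, 1:])

# Dual cost: one-dimensional search over lambda.
lams = np.linspace(1e-3, 2.0, 2001)
values = np.array([best_minus_rho(lam) for lam in lams])
d_star = float(values.max())
lam_star = float(lams[values.argmax()])

# Primal cost: dense grid search stands in for the local solver.
ts = np.linspace(-2.0, 2.0, 40001)
q_star = float(np.min(ts**4 - 2 * ts**2 + 0.5 * ts + 1))

# Dual matrix H at the optimum (rho = -d_star) should be PSD and singular.
H = Q + lam_star * A_1 - d_star * A_0
min_eig = float(np.linalg.eigvalsh(H).min())
print(f"d*={d_star:.5f}, q*={q_star:.5f}, min eig of H={min_eig:.1e}")
```

In this toy case the two costs agree up to the grid resolution, which is what a tight relaxation looks like; a visible gap $q^* - d_n^* > 0$ would indicate that more (redundant) constraints are needed.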

In [None]:
#df = pd.read_pickle("../_results/study_stereo1d_zero_noise.pkl")
import sys
sys.path.append("../_scripts/")
from noise_study import run_noise_study

params = dict(noise=1e-1, n_seeds=1, n_shuffles=0)
df = run_noise_study(lifter, A_list, **params)
#display(df)
#name = "study_stereo3d_mediumnoise_3level"
#df = pd.read_pickle(f"../_results/{name}.pkl")

In [None]:
from lifters.plotting_tools import plot_tightness
fig, ax = plot_tightness(df)
savefig(fig, f"../_plots/tightness_{lifter}.png")

In [None]:
n_eigs = 5
palette = "viridis"
for seed, df_seed in df.groupby("seed"):
    fig, ax = plt.subplots()
    cmap = plt.get_cmap(lut=len(df_seed.n.unique()), name=palette)
    df_seed = df_seed[df_seed.shuffle==0]
    for (n, shuffle), df_n in df_seed.groupby(["n", "shuffle"]):
        assert len(df_n) == 1
        row = df_n.iloc[0]
        label = f"n={row.n}"
        if row.eigs is not None:
            ax.plot(row.eigs[:n_eigs], color=cmap(n), label=label)
            label = None
    ax.set_title(f"smallest {n_eigs} eigenvalues of H")
    ax.set_xticks(range(n_eigs))
    #ax.legend()
    ax.set_yscale("symlog")
    break

# 4. Next steps

- Using the $\mathbf{A}_m$ that we already know, try to complete the basis such that it stays sparse. 