# KIPD: From Unconstrained to Constrained Minimization

This notebook provides the code for the blog post *"KIPD: From Unconstrained to Constrained Minimization"*. We use `cvxpy` to implement four semidefinite programming (SDP) formulations that compute the largest lower bound $\gamma$ on a polynomial $p(x)$ over the set where a constraint polynomial vanishes, $g(x)=0$.

We find the largest $\gamma$ such that $p(x) - \gamma = s(x) + t(x)g(x)$, where $s(x)$ is a sum-of-squares (SOS) polynomial and $t(x)$ is a multiplier polynomial. On the feasible set $g(x)=0$, so the identity reduces to $p(x) - \gamma = s(x) \ge 0$, which certifies $p(x) \ge \gamma$ there.

### Problem Statement

We will focus on finding the global minimum of:
$$ \min_{x} p(x) = -x \quad \text{s.t.} \quad g(x) = x^2 - 4 = 0 $$
The feasible set is $x \in \{-2, 2\}$. Since $p(-2) = 2$ and $p(2) = -2$, the minimum of $p(x)$ on this set is -2, attained at $x = 2$. We expect our optimization to find $\gamma^* = -2$.
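Because the feasible set is finite, the expected answer can be checked by brute force before any SDP is solved. The snippet below is a minimal, self-contained sketch: it redefines the coefficient arrays locally (in ascending order, the same convention the notebook uses later), and the names `feasible`, `values`, `x_star`, and `p_star` are introduced here only for illustration.

```python
import numpy as np

# Coefficients in ascending order [c0, c1, c2].
p_coeffs = np.array([0., -1., 0.])   # p(x) = -x
g_coeffs = np.array([-4., 0., 1.])   # g(x) = x^2 - 4

# np.roots expects descending order, so reverse the coefficient array.
feasible = np.sort(np.roots(g_coeffs[::-1]).real)

# Evaluate p at each feasible point (polyval takes ascending coefficients).
values = np.polynomial.polynomial.polyval(feasible, p_coeffs)

# Pick the feasible point with the smallest objective value.
best = int(np.argmin(values))
x_star, p_star = feasible[best], values[best]
print(f"feasible set: {feasible}, minimizer: {x_star:.1f}, minimum: {p_star:.1f}")
```

This enumeration only works because $g$ has finitely many real roots; the SDP formulations are what generalize to larger problems.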

### Formulation Setup

Both sides of $p(x) - \gamma = s(x) + t(x)g(x)$ must have the same degree. Since $\deg(p)=1$ and $\deg(g)=2$, the simplest choice is $\deg(s)=2$ and $\deg(t)=0$.

- The SOS polynomial $s(x)$ uses the monomial basis $\mathbf{v}(x) = [1, x]^\top$, so $s(x) = \mathbf{v}(x)^\top \mathbf{X} \mathbf{v}(x)$.
- The multiplier $t(x)$ is a scalar constant, $t(x)=t_0$.

The core identity is:
$$ p(x) - \gamma = \mathbf{v}(x)^\top \mathbf{X} \mathbf{v}(x) + t_0 g(x) $$
$$ -x - \gamma = (X_{00} + 2X_{01}x + X_{11}x^2) + t_0(x^2 - 4) $$
Collecting terms by powers of $x$:
$$ 0x^2 - 1x^1 - \gamma x^0 = (X_{11} + t_0)x^2 + (2X_{01})x^1 + (X_{00} - 4t_0)x^0 $$

Equating coefficients gives the following linear system:
- $x^2: 0 = X_{11} + t_0$
- $x^1: -1 = 2X_{01}$
- $x^0: -\gamma = X_{00} - 4t_0$

We use these equations to build our four SDPs.
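Because the system is so small, the optimum can also be worked out by hand, which gives a reference value to compare the solvers against. The derivation below is a sketch that uses only the three equations above; the substitution $u = -t_0$ is introduced here for convenience.

Eliminating $X_{11} = -t_0$, $X_{01} = -\tfrac{1}{2}$, and $X_{00} = 4t_0 - \gamma$, the condition $\mathbf{X} \succeq 0$ becomes
$$ \begin{bmatrix} 4t_0 - \gamma & -\tfrac{1}{2} \\ -\tfrac{1}{2} & -t_0 \end{bmatrix} \succeq 0 \iff -t_0 \ge 0, \quad 4t_0 - \gamma \ge 0, \quad (4t_0 - \gamma)(-t_0) \ge \tfrac{1}{4} $$
The determinant condition rules out $t_0 = 0$, so set $u = -t_0 > 0$ and divide by $u$:
$$ (-4u - \gamma)\,u \ge \tfrac{1}{4} \iff -\gamma \ge 4u + \frac{1}{4u} \ge 2 $$
The last inequality is AM-GM, with equality exactly when $4u = \tfrac{1}{4u}$, i.e. $u = \tfrac{1}{4}$. Hence $\gamma \le -2$, and the bound $\gamma^* = -2$ is attained at $t_0 = -\tfrac{1}{4}$, $X_{00} = 1$, $X_{11} = \tfrac{1}{4}$.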

In [None]:
import cvxpy as cp
import numpy as np
import time

# Set numpy print options for better readability
np.set_printoptions(precision=4, suppress=True)

In [None]:
# Coefficients of p(x) = -x (coeffs are [p0, p1, p2])
p_coeffs = np.array([0., -1., 0.])

# Coefficients of g(x) = x^2 - 4 (coeffs are [g0, g1, g2])
g_coeffs = np.array([-4., 0., 1.])

# Coefficient-matching matrices A_i for the SOS part <X, A_i>
# A_0, A_1, A_2 correspond to the constant, x, and x^2 terms.
A = [
    np.array([[1, 0], [0, 0]]), # A_0 for s_0
    np.array([[0, 1], [1, 0]]), # A_1 for s_1
    np.array([[0, 0], [0, 1]])  # A_2 for s_2
]

## The Four Optimization Formulations

In [None]:
def solve_primal_kernel_min(p, g, A_matrices):
    """Solves the (P-K) Primal Kernel Optimization Problem."""
    X = cp.Variable((2, 2), symmetric=True)
    gamma = cp.Variable()
    t0 = cp.Variable()
    
    # p_i - delta_i0*gamma = <A_i, X> + t0*g_i
    constraints = [
        p[2] == cp.trace(A_matrices[2] @ X) + t0 * g[2], # x^2 coeff
        p[1] == cp.trace(A_matrices[1] @ X) + t0 * g[1], # x^1 coeff
        p[0] - gamma == cp.trace(A_matrices[0] @ X) + t0 * g[0], # x^0 coeff
        X >> 0
    ]
    
    problem = cp.Problem(cp.Maximize(gamma), constraints)
    start_time = time.time()
    problem.solve(solver=cp.SCS)
    end_time = time.time()
    
    return gamma.value, X.value, problem.status, end_time - start_time

def solve_primal_image_min(p, g):
    """Solves the (P-I) Primal Image Optimization Problem."""
    gamma = cp.Variable()
    # From the linear system, we parameterize X in terms of gamma and a free variable.
    # Let's choose t0 as the free variable.
    t0 = cp.Variable()

    # X_11 = -t0
    # X_01 = -0.5
    # X_00 = p0 - gamma - t0*g0 = 0 - gamma - t0*(-4) = 4*t0 - gamma
    X_param = cp.bmat([
        [4*t0 - gamma, -0.5],
        [-0.5,         -t0]
    ])
    
    constraint = (X_param >> 0)
    problem = cp.Problem(cp.Maximize(gamma), [constraint])
    start_time = time.time()
    problem.solve(solver=cp.SCS)
    end_time = time.time()
    
    X_sol = X_param.value
    return gamma.value, X_sol, problem.status, end_time - start_time

def solve_dual_kernel_min(p, g, A_matrices):
    """Solves the (D-K) Dual Kernel Optimization Problem."""
    lambda_vars = cp.Variable(len(p))
    l0, l1, l2 = lambda_vars[0], lambda_vars[1], lambda_vars[2]

    constraints = [
        l0 == 1, # Normalization
        l0 * A_matrices[0] + l1 * A_matrices[1] + l2 * A_matrices[2] >> 0, # PSD constraint
        # Constraint from multiplier t0: sum_i(lambda_i * g_i) = 0
        lambda_vars @ g == 0
    ]
    
    objective = cp.Minimize(lambda_vars @ p)
    problem = cp.Problem(objective, constraints)
    start_time = time.time()
    problem.solve(solver=cp.SCS)
    end_time = time.time()

    H_value = l0.value * A_matrices[0] + l1.value * A_matrices[1] + l2.value * A_matrices[2]
    return problem.value, H_value, problem.status, end_time - start_time

def solve_dual_image_min(p, g):
    """Solves the (D-I) Dual Image Optimization Problem (Moment Relaxation)."""
    X = cp.Variable((2, 2), symmetric=True)
    
    # Y(p) is a particular solution to <A_i, Y> = p_i. A simple choice:
    Y_p = np.array([[p[0], p[1]/2], [p[1]/2, p[2]]])
    objective = cp.Minimize(cp.trace(Y_p.T @ X))

    constraints = [
        X[0, 0] == 1, # Normalization: <X, A_0> = 1, i.e., X_00 = 1 (written directly to avoid relying on the global A)
        # Constraint from g(x)=0 implies E[g(x)]=0
        # g0*E[1] + g1*E[x] + g2*E[x^2] = 0
        # g0*X_00 + g1*X_01 + g2*X_11 = 0
        g[0]*X[0,0] + g[1]*(X[0,1]) + g[2]*X[1,1] == 0,
        X >> 0
    ]
    
    problem = cp.Problem(objective, constraints)
    start_time = time.time()
    problem.solve(solver=cp.SCS)
    end_time = time.time()

    return problem.value, X.value, problem.status, end_time - start_time

## Example: Finding the Minimum of $p(x) = -x$ s.t. $x^2-4=0$

We expect to find that the optimal `γ` is -2.

In [None]:
print("--- Solving Constrained SOS Optimization Problem ---")
print(f"min p(x) = {p_coeffs[1]}x  s.t. x^2-4=0\n")

# (P-K)
gamma_pk, X_pk, status, t = solve_primal_kernel_min(p_coeffs, g_coeffs, A)
eigs = np.linalg.eigvalsh(X_pk)
print(f"(P-K) Primal Kernel: Status='{status}', Time={t:.4f}s")
print(f"Optimal gamma = {gamma_pk:.4f}")
print(f"Gram matrix X:\n{X_pk}\nEigenvalues: {eigs}\n")

# (P-I)
gamma_pi, X_pi, status, t = solve_primal_image_min(p_coeffs, g_coeffs)
eigs = np.linalg.eigvalsh(X_pi)
print(f"(P-I) Primal Image:  Status='{status}', Time={t:.4f}s")
print(f"Optimal gamma = {gamma_pi:.4f}")
print(f"Gram matrix X:\n{X_pi}\nEigenvalues: {eigs}\n")

# (D-K)
gamma_dk, L_dk, status, t = solve_dual_kernel_min(p_coeffs, g_coeffs, A)
eigs = np.linalg.eigvalsh(L_dk)
print(f"(D-K) Dual Kernel:   Status='{status}', Time={t:.4f}s")
print(f"Optimal objective (gamma) = {gamma_dk:.4f}")
print(f"Dual matrix H:\n{L_dk}\nEigenvalues: {eigs}\n")

# (D-I)
gamma_di, L_di, status, t = solve_dual_image_min(p_coeffs, g_coeffs)
eigs = np.linalg.eigvalsh(L_di)
print(f"(D-I) Dual Image:    Status='{status}', Time={t:.4f}s")
print(f"Optimal objective (gamma) = {gamma_di:.4f}")
print(f"Dual matrix (Moment matrix):\n{L_di}\nEigenvalues: {eigs}\n")

## Analysis of the Results

As expected, all four formulations recover the optimal value $\gamma^* = -2$, up to the numerical tolerance of the SCS solver. This confirms that the minimum value of $-x$ on the set $\{ -2, 2 \}$ is -2.

The rank of the solution matrices is also worth noting. In the primal solutions, the Gram matrix $\mathbf{X}$ is rank-1. Consider the solution from (P-K):

$$ \mathbf{X} \approx \begin{bmatrix} 1 & -0.5 \\ -0.5 & 0.25 \end{bmatrix} = \begin{bmatrix} 1 \\ -0.5 \end{bmatrix} \begin{bmatrix} 1 & -0.5 \end{bmatrix} $$

This corresponds to the SOS polynomial $s(x) = (1 - 0.5x)^2$. This polynomial is zero at $x=2$, which is exactly the global minimizer of our original problem.
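The certificate can be checked independently of any solver by expanding both sides of $p(x) - \gamma = s(x) + t_0 g(x)$ and comparing coefficients. The sketch below assumes the idealized values $\gamma = -2$, $t_0 = -0.25$, and $s(x) = (1 - 0.5x)^2$ rather than actual solver output, so it tests the algebra and not the numerical solution; the variable names are illustrative.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Idealized values read off the analysis (assumed exact, not solver output).
gamma = -2.0
t0 = -0.25
sqrt_s = np.array([1.0, -0.5])       # 1 - 0.5x, ascending coefficients

p = np.array([0.0, -1.0, 0.0])       # p(x) = -x
g = np.array([-4.0, 0.0, 1.0])       # g(x) = x^2 - 4

# s(x) = (1 - 0.5x)^2 = 1 - x + 0.25x^2
s = P.polymul(sqrt_s, sqrt_s)

# Left side: p(x) - gamma (gamma only affects the constant term).
lhs = p - gamma * np.array([1.0, 0.0, 0.0])

# Right side: s(x) + t0 * g(x), added coefficient by coefficient.
rhs = s + t0 * g

print("p - gamma    :", lhs)
print("s + t0 * g   :", rhs)
print("identity ok  :", np.allclose(lhs, rhs))
print("s(2)         :", P.polyval(2.0, s))
```

To run the same check on solver output, replace `s` with the coefficients `[X[0, 0], 2 * X[0, 1], X[1, 1]]` of a returned Gram matrix and expect agreement only up to solver tolerance.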

Similarly, the dual solution matrix from (D-I), which is a moment matrix, is rank-1:
$$ \mathbf{X} \approx \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \begin{bmatrix} 1 & 2 \end{bmatrix} $$

Its entries are moments: $X_{00}=1$ is $E[1]$, $X_{01}=2$ is $E[x]$, and $X_{11}=4$ is $E[x^2]$. Since $E[x^2] = E[x]^2$, the variance is zero, so the underlying probability distribution is a single point mass at $x=2$, which again recovers the global minimizer. When the relaxation is tight (i.e., it finds the true optimum), this rank-1 property of the dual solution allows the global minimizer(s) to be extracted.
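One way to carry out this extraction numerically is through an eigendecomposition of the moment matrix. The following is a minimal sketch, assuming the idealized rank-1 matrix shown above rather than real solver output; the rank threshold `1e-6` and the names `M`, `numerical_rank`, and `x_star` are illustrative choices, not part of the notebook.

```python
import numpy as np

# Idealized rank-1 moment matrix (assumed exact here; a solver would
# return it only up to numerical tolerance).
M = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# eigh returns eigenvalues in ascending order for symmetric matrices.
eigvals, eigvecs = np.linalg.eigh(M)

# Count eigenvalues that are non-negligible relative to the largest one.
numerical_rank = int(np.sum(eigvals > 1e-6 * eigvals[-1]))

# For M = v(x*) v(x*)^T with v(x) = [1, x]^T, the leading eigenvector
# is proportional to [1, x*]^T, so rescale its first entry to 1.
v = eigvecs[:, -1]
x_star = v[1] / v[0]

print(f"rank = {numerical_rank}, extracted minimizer x* = {x_star:.4f}")
```

In this two-by-two case the same value can be read directly as `M[0, 1] / M[0, 0]`; the eigenvector route is shown because it also reports the numerical rank, which is what indicates whether a single minimizer can be extracted at all.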