# Code description

In this notebook, we verify the part of the proof of Theorem C.10 from 

> "Convergence of Proximal Point and Extragradient-Based Methods Beyond Monotonicity: the Case of Negative Comonotonicity"

that involves computation of eigenvalues and singular values.

## Problem

We consider the following problem:
$$ \text{find } x^\ast \text{ such that } F(x^\ast) = 0, $$
where $F:\mathbb{R}^d \to \mathbb{R}^d$ is a $\rho$-negative comonotone and $L$-Lipschitz operator.

## Algorithm

We consider the Optimistic Gradient (OG) method with the same-stepsize policy: $\tilde{x}^0 = x^0$,
$$
\tilde{x}^k = x^k - \gamma F(\tilde{x}^{k-1}) \quad \text{for all } k > 0,
$$
$$
x^{k+1} = x^k - \gamma F(\tilde{x}^k) \quad \text{for all } k \geq 0.
$$
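The recursion above can be sketched in a few lines of plain Python. This is only an illustration, not code from the paper: the function name `optimistic_gradient`, the test operator `rotation`, the starting point, the stepsize and the iteration count are all hypothetical choices made for this example.

```python
def optimistic_gradient(F, x0, gamma, num_iters):
    """Run OG with a single stepsize gamma and return x^{num_iters}.

    Works for any point type that supports +, - and scalar *,
    e.g. float, complex, or a NumPy array.
    """
    x = x0
    x_tilde = x0  # \tilde{x}^0 = x^0
    for k in range(num_iters):
        if k > 0:
            # extrapolation: x~^k = x^k - gamma * F(x~^{k-1})
            x_tilde = x - gamma * F(x_tilde)
        # update: x^{k+1} = x^k - gamma * F(x~^k)
        x = x - gamma * F(x_tilde)
    return x


# Hypothetical test operator: rotation by 90 degrees in R^2, written as
# multiplication by 1j on the complex plane. It is monotone and 1-Lipschitz,
# and its unique zero is the origin.
def rotation(z):
    return 1j * z


z_final = optimistic_gradient(rotation, x0=1.0 + 1.0j, gamma=0.1, num_iters=5000)
print(abs(z_final))  # distance to the solution x* = 0
```

On this monotone example the iterates approach the solution. The notebook itself concerns the negative comonotone setting, where such behaviour is not guaranteed.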

## The goal

In the proof of Theorem C.10, we provide the formulas for the eigenvalues of
$$
T = \begin{pmatrix}
       1 - \gamma_2 L & \gamma_1\gamma_2L^2\\ 
       1 & - \gamma_1 L
       \end{pmatrix}
$$
and for the maximal singular value of
$$
B = \begin{pmatrix}
       I - \gamma_2 LA & \gamma_1\gamma_2L^2 A^2\\ 
       I & - \gamma_1 LA
       \end{pmatrix}, \quad \text{where} \quad A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} -\frac{1}{2} & -\frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{2} & -\frac{1}{2} \end{pmatrix} \quad \text{for} \quad \theta = \frac{2\pi}{3}.
$$
Below we derive these formulas symbolically.
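To see where a matrix of this shape can come from, here is a short worked derivation. It rests on two assumptions that are not stated in this notebook: the operator is the scalar linear map $F(x) = Lx$, and the extrapolation and update steps use separate stepsizes $\gamma_1$ and $\gamma_2$ in place of the common $\gamma$. Under these assumptions the OG recursion becomes
$$
\tilde{x}^k = x^k - \gamma_1 L \tilde{x}^{k-1}, \qquad x^{k+1} = x^k - \gamma_2 L \tilde{x}^k = (1 - \gamma_2 L) x^k + \gamma_1\gamma_2 L^2 \tilde{x}^{k-1},
$$
so that
$$
\begin{pmatrix} x^{k+1} \\ \tilde{x}^k \end{pmatrix} = T \begin{pmatrix} x^k \\ \tilde{x}^{k-1} \end{pmatrix}.
$$
Replacing the scalar $L$ by the matrix $LA$ in the same computation yields the block matrix $B$.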

In [1]:
import sympy
from sympy import Matrix, cos, sin, symbols, eye, pi, sqrt

### Eigenvalues of $T$

In [2]:
γ_1 = symbols('g_1', positive=True)
γ_2 = symbols('g_2', positive=True)
L = symbols('L', positive=True)
T = Matrix([[1 - γ_2*L, γ_1*γ_2*(L**2)],
           [1, -γ_1*L]])
T

Matrix([
[-L*g_2 + 1, L**2*g_1*g_2],
[         1,       -L*g_1]])

In [3]:
lambd = T.eigenvals(multiple=True)  # list of eigenvalues, repeated according to multiplicity
lambd[0]

-L*g_1/2 - L*g_2/2 - sqrt(L**2*g_1**2 + 2*L**2*g_1*g_2 + L**2*g_2**2 + 2*L*g_1 - 2*L*g_2 + 1)/2 + 1/2

This is exactly the eigenvalue that we report in the proof. One can see that it is a decreasing function of $\gamma_1$. Therefore, for any $\gamma_1 \geq \frac{1}{L}$, substituting $\gamma_1 = \frac{1}{L}$ gives an upper bound for this eigenvalue:

In [4]:
lambd[0].subs(γ_1, 1.0/L)

-L*g_2/2 - 1.0*sqrt(0.25*L**2*g_2**2 + 1)

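Clearly, it is smaller than $-1$ (and hence smaller than $1$), since $\frac{L\gamma_2}{2} > 0$ and $\sqrt{\frac{L^2\gamma_2^2}{4} + 1} > 1$.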
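The claims about this eigenvalue can be spot-checked independently of the ordering returned by `eigenvals`. The sketch below retypes the eigenvalue by hand (with the polynomial under the root partly factored), confirms that it is a root of the characteristic polynomial of $T$, and evaluates its derivative in $\gamma_1$ and the bound at $\gamma_1 = 1/L$ on a small grid. The grid values and variable names are arbitrary choices for illustration, and the grid evaluation is a sanity check rather than a proof.

```python
from itertools import product

from sympy import Matrix, Rational, diff, eye, simplify, sqrt, symbols

g1, g2, L = symbols('g_1 g_2 L', positive=True)
T = Matrix([[1 - g2*L, g1*g2*L**2],
            [1, -g1*L]])

# Eigenvalue reported above, with the polynomial under the root partly factored.
lam = (1 - L*g1 - L*g2 - sqrt((L*g1 + L*g2)**2 + 2*L*g1 - 2*L*g2 + 1)) / 2

# 1) lam is a root of the characteristic polynomial det(T - lam*I).
char_poly_at_lam = simplify((T - lam*eye(2)).det())

# 2) d(lam)/d(g_1) is negative on a small sample grid (a spot check, not a proof).
dlam = diff(lam, g1)
grid = [Rational(1, 10), Rational(1, 2), 1, 2, 10]
derivative_negative = all(
    float(dlam.subs({g1: a, g2: b, L: c})) < 0
    for a, b, c in product(grid, repeat=3)
)

# 3) The bound obtained at g_1 = 1/L lies below -1 on the same grid.
bound = lam.subs(g1, 1/L)
bound_below_minus_one = all(
    float(bound.subs({g2: b, L: c})) < -1
    for b, c in product(grid, repeat=2)
)

print(char_poly_at_lam, derivative_negative, bound_below_minus_one)
```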

### Spectral norm of $B$

In [5]:
θ = symbols('θ')
γ_1 = symbols('g_1', positive=True)
γ_2 = symbols('g_2', positive=True)
L = symbols('L', positive=True)
A = Matrix([[cos(θ), -sin(θ)],[sin(θ), cos(θ)]])

B = Matrix([[eye(2) - γ_2*L*A, γ_1*γ_2*(L**2)*(A**2)], 
            [eye(2), -γ_1*L*A]])

In [6]:
B.simplify()  # simplifies the entries of B in place

In [7]:
B = B.subs(θ, 2*pi/3)

In [8]:
B

Matrix([
[     L*g_2/2 + 1, sqrt(3)*L*g_2/2,         -L**2*g_1*g_2/2, sqrt(3)*L**2*g_1*g_2/2],
[-sqrt(3)*L*g_2/2,     L*g_2/2 + 1, -sqrt(3)*L**2*g_1*g_2/2,        -L**2*g_1*g_2/2],
[               1,               0,                 L*g_1/2,        sqrt(3)*L*g_1/2],
[               0,               1,        -sqrt(3)*L*g_1/2,                L*g_1/2]])

In [9]:
v = B.singular_values()  # symbolic singular values of B
B.norm(2)  # spectral norm, i.e. the largest singular value

Max(sqrt(L**4*g_1**2*g_2**2/2 + L**2*g_1**2/2 + L**2*g_2**2/2 + L*g_2/2 - sqrt(L**4*g_1**2*g_2**2 + L**2*g_1**2 + L**2*g_2**2 - 2*L*g_1 + L*g_2 + 2)*sqrt(L**4*g_1**2*g_2**2 + L**2*g_1**2 + L**2*g_2**2 + 2*L*g_1 + L*g_2 + 2)/2 + 1), sqrt(L**4*g_1**2*g_2**2/2 + L**2*g_1**2/2 + L**2*g_2**2/2 + L*g_2/2 + sqrt(L**4*g_1**2*g_2**2 + L**2*g_1**2 + L**2*g_2**2 - 2*L*g_1 + L*g_2 + 2)*sqrt(L**4*g_1**2*g_2**2 + L**2*g_1**2 + L**2*g_2**2 + 2*L*g_1 + L*g_2 + 2)/2 + 1))

The maximum is attained at the second term, since the two terms differ only in the sign of the nonnegative product of square roots. It corresponds to the following singular value:

In [10]:
v[0].simplify()

sqrt(2)*sqrt(L**4*g_1**2*g_2**2 + L**2*g_1**2 + L**2*g_2**2 + L*g_2 + sqrt(L**4*g_1**2*g_2**2 + L**2*g_1**2 + L**2*g_2**2 - 2*L*g_1 + L*g_2 + 2)*sqrt(L**4*g_1**2*g_2**2 + L**2*g_1**2 + L**2*g_2**2 + 2*L*g_1 + L*g_2 + 2) + 2)/2

Let us verify that it can be written as
$$
\sqrt{c + \sqrt{c^2 - L^2\gamma_1^2}}, \quad \text{where}\quad c = \frac{L^4\gamma_1^2\gamma_2^2 + L^2\gamma_1^2 + L^2\gamma_2^2 + L\gamma_2}{2} + 1.
$$

In [11]:
c = ((L**4)*(γ_1**2)*(γ_2**2) + (L**2)*(γ_1**2) + (L**2)*(γ_2**2) + L*γ_2)/2 + 1
norm_from_the_paper = sqrt(c + sqrt(c**2 - (L**2)*(γ_1**2)))
norm_from_the_paper.simplify()

sqrt(2)*sqrt(L**4*g_1**2*g_2**2 + L**2*g_1**2 + L**2*g_2**2 + L*g_2 + sqrt(-4*L**2*g_1**2 + (L**4*g_1**2*g_2**2 + L**2*g_1**2 + L**2*g_2**2 + L*g_2 + 2)**2) + 2)/2

In [12]:
difference = v[0]**2 - norm_from_the_paper**2
difference.simplify()

0

The difference is identically zero, which justifies our derivations.
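As an independent cross-check of the closed form, one can also compare it with a numerically computed largest singular value at a few parameter values. The sketch below uses `mpmath` (installed together with SymPy) for the numerical SVD and assembles $B$ with `hstack`/`vstack`. The sample points and the helper name `numeric_spectral_norm` are hypothetical choices for illustration.

```python
from mpmath import mp
from sympy import Matrix, Rational, cos, eye, pi, sin, sqrt, symbols

g1, g2, L = symbols('g_1 g_2 L', positive=True)

theta = 2*pi/3
A = Matrix([[cos(theta), -sin(theta)],
            [sin(theta), cos(theta)]])
# Assemble the 4x4 block matrix explicitly with hstack/vstack.
B = Matrix.vstack(
    Matrix.hstack(eye(2) - g2*L*A, g1*g2*L**2*A**2),
    Matrix.hstack(eye(2), -g1*L*A),
)

c = (L**4*g1**2*g2**2 + L**2*g1**2 + L**2*g2**2 + L*g2)/2 + 1
closed_form = sqrt(c + sqrt(c**2 - L**2*g1**2))


def numeric_spectral_norm(values):
    """Largest singular value of B at the given numeric parameter values."""
    rows = [[float(entry) for entry in row] for row in B.subs(values).tolist()]
    singular_values = mp.svd_r(mp.matrix(rows), compute_uv=False)
    return max(singular_values[i] for i in range(singular_values.rows))


# Hypothetical sample points (g_1, g_2, L); any positive values would do.
samples = [(Rational(3, 10), Rational(7, 10), 2), (1, 1, 1), (2, Rational(1, 5), 3)]
max_gap = max(
    abs(float(numeric_spectral_norm({g1: a, g2: b, L: c_val}))
        - float(closed_form.subs({g1: a, g2: b, L: c_val})))
    for a, b, c_val in samples
)
print(max_gap)  # largest absolute gap between numeric norm and closed form
```

Agreement at a handful of points does not replace the symbolic identity above, but it guards against transcription errors in the formula for $c$.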