## Matrix Games
$\newcommand{\n}[1]{\left\|#1 \right\|}$ 
$\renewcommand{\a}{\alpha}             $ 
$\renewcommand{\b}{\beta}              $ 
$\renewcommand{\c}{\gamma}             $ 
$\renewcommand{\d}{\delta}             $ 
$\newcommand{\D}{\Delta}               $ 
$\newcommand{\la}{\lambda}             $ 
$\renewcommand{\t}{\tau}               $ 
$\newcommand{\s}{\sigma}               $ 
$\newcommand{\e}{\varepsilon}          $ 
$\renewcommand{\th}{\theta}            $ 
$\newcommand{\x}{\bar x}               $ 
$\newcommand{\R}{\mathbb R}            $ 
$\newcommand{\N}{\mathbb N}            $ 
$\newcommand{\Z}{\mathbb Z}            $ 
$\newcommand{\E}{\mathcal E}           $ 
$\newcommand{\lr}[1]{\left\langle #1\right\rangle}$
$\newcommand{\nf}[1]{\nabla f(#1)}     $
$\newcommand{\hx}{\hat x}               $
$\newcommand{\hy}{\hat y}               $
$\DeclareMathOperator{\prox}{prox}      $
$\DeclareMathOperator{\argmin}{argmin}  $
$\DeclareMathOperator{\dom}{dom}        $
$\DeclareMathOperator{\id}{Id}          $
$\DeclareMathOperator{\conv}{conv}      $

We consider the following min-max matrix game:
\begin{equation}
    \min_{x \in \D_n}\max_{y\in \D_m} \lr{Ax, y},
\end{equation}
where $x\in \R^n$, $y\in \R^m$, $A\in \R^{m\times n}$, and $\D_m$ and
$\D_n$ denote the standard unit simplices in $\R^m$ and $\R^n$,
respectively.
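
For a fixed $x$, the inner maximum of the linear function $y \mapsto \lr{Ax, y}$ over $\D_m$ is attained at a vertex, so it equals $\max_i (Ax)_i$; likewise $\min_{x\in\D_n}\lr{Ax, y} = \min_j (A^* y)_j$. The difference of these two quantities is the primal-dual gap, which is nonnegative and vanishes exactly at a saddle point. The sketch below illustrates this on matching pennies, a toy matrix chosen here for illustration and not one of the instances used in this notebook; the names `pd_gap` and `A_toy` are hypothetical.

```python
import numpy as np

def pd_gap(A, x, y):
    """Primal-dual gap max_i (Ax)_i - min_j (A^T y)_j for x, y in the simplices."""
    return np.max(A @ x) - np.min(A.T @ y)

# Matching pennies: the value of the game is 0, attained at uniform strategies.
A_toy = np.array([[1.0, -1.0],
                  [-1.0, 1.0]])

x_uniform = np.array([0.5, 0.5])
y_uniform = np.array([0.5, 0.5])
x_pure = np.array([1.0, 0.0])
y_pure = np.array([1.0, 0.0])

gap_uniform = pd_gap(A_toy, x_uniform, y_uniform)  # at the equilibrium
gap_pure = pd_gap(A_toy, x_pure, y_pure)           # at a pair of pure strategies
print(gap_uniform, gap_pure)
```

The gap is zero at the uniform strategies and strictly positive at the pure ones, which is why it serves as an optimality measure for any feasible pair.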

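The corresponding variational inequality formulation is: find $z^*\in Z$ such that
$$\lr{F(z^*),z-z^*} + G(z) - G(z^*) \geq 0 \quad \forall z \in Z,$$
where
$$Z = \R^n\times \R^m,\quad z=\binom{x}{y},\quad F = \begin{bmatrix} 0 & A^*\\ -A & 0\end{bmatrix}, \quad G(z) = \d_{\D_n}(x) + \d_{\D_m}(y).$$
Here $A^*$ is the transpose of $A$ and $\d_C$ denotes the indicator function of a set $C$ (zero on $C$ and $+\infty$ elsewhere).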
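
Two properties of $F$ matter for the methods used here: it is skew-symmetric, so $\lr{F(z), z} = 0$ and $F$ is monotone, and it is Lipschitz continuous with constant equal to the spectral norm of $A$. The following is a minimal numerical check on a small random matrix; the sizes and the names `A_toy` and `F_toy` are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m_toy, n_toy = 4, 3                      # illustrative sizes
A_toy = rng.uniform(-1, 1, (m_toy, n_toy))

def F_toy(z):
    """Apply F to z = (x, y): returns (A^T y, -A x)."""
    x, y = z[:n_toy], z[n_toy:]
    return np.hstack((A_toy.T @ y, -A_toy @ x))

z = rng.normal(size=n_toy + m_toy)
w = rng.normal(size=n_toy + m_toy)

# Skew-symmetry: <F(z), z> = <A^T y, x> - <A x, y> = 0.
inner = F_toy(z) @ z

# Lipschitz bound: ||F(z) - F(w)|| <= ||A|| * ||z - w||.
lip_ratio = np.linalg.norm(F_toy(z) - F_toy(w)) / np.linalg.norm(z - w)
spec_norm = np.linalg.norm(A_toy, 2)
print(inner, lip_ratio, spec_norm)
```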


In [24]:
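# np and LA are used below; import them explicitly instead of relying
# on the star imports to provide them.
import numpy as np
import numpy.linalg as LA
import matplotlib.pyplot as plt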

from opt_operators import *
from algorithms import *
from pd_algorithms import *
%reload_ext autoreload
%autoreload 2

Choose a seed for the random number generator used to create the data. In the paper we set `gen = 100`.

In [4]:
gen = 100

Define the matrix $A\in \R^{m\times n}$. Run one of the four example cells below, or generate a new instance; each cell overwrites `m`, `n`, and `A`:

In [5]:
m = 100
n = 100
np.random.seed(gen)
A = np.random.uniform(-1, 1, [m,n])

In [4]:
m = 100
n = 100
np.random.seed(gen)
A = np.random.normal(0, 1, [m, n])

In [14]:
m = 500
n = 100
np.random.seed(gen)
A = np.random.normal(0, 10, [m,n])

In [5]:
m = 100
n = 200
np.random.seed(gen)
A = np.random.uniform(0, 1, [m, n])

Define the starting points $x^0$ and $y^0$ (the uniform distributions, i.e. the barycenters of the simplices):

In [15]:
x0 = np.ones(n)/n
y0 = np.ones(m)/m
z0 = np.hstack((x0, y0))

Define the proximal operators, the primal-dual gap, and the operators for the variational inequality formulation:

In [16]:
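# Proximal operator of the simplex indicator: the Euclidean projection.
# The step size rho is unused because the prox of an indicator function
# does not depend on it. The projection is independent of the dimension,
# so the same function serves both the primal and the dual variable.
def prox_g(x, rho):
    return proj_simplex(x)

# Primal-dual gap: max_i (Ax)_i - min_j (A^T y)_j
def J_gap(x, y):
    return np.max(A.dot(x)) - np.min(A.T.dot(y))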


### For variational inequalities formulation (Tseng FBF method and PEGM)

def F(z):
    u1 = A.T.dot(z[n:])
    u2 = -A.dot(z[:n])
    return np.hstack((u1, u2))

def prox_G(z, rho):
    u1 = proj_simplex(z[:n])
    u2 = proj_simplex(z[n:])
    return np.hstack((u1, u2))

def J_vip(z):
    return J_gap(z[:n], z[n:])
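
The function `proj_simplex` comes from `opt_operators`, and its implementation is not shown in this notebook. For readers who want to see what such a projection can look like, here is a minimal sketch of the common sort-based algorithm. It is not necessarily the method used in `opt_operators`, and the name `proj_simplex_sketch` is hypothetical.

```python
import numpy as np

def proj_simplex_sketch(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1} via sorting."""
    u = np.sort(v)[::-1]                       # entries in decreasing order
    css = np.cumsum(u) - 1.0                   # cumulative sums shifted by 1
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]   # last index kept in the support
    theta = css[rho] / (rho + 1.0)             # shift making the result sum to 1
    return np.maximum(v - theta, 0.0)

p = proj_simplex_sketch(np.array([0.5, 1.5, -1.0]))  # point outside the simplex
q = proj_simplex_sketch(np.array([0.2, 0.3, 0.5]))   # point already in the simplex
print(p, q)
```

A point that already lies in the simplex is returned unchanged, and any other point is shifted by a common threshold and clipped at zero.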

Compute the spectral norm $\n{A}$ of $A$ (its largest singular value), obtained as the square root of the largest eigenvalue of $AA^*$:

In [17]:
L = np.sqrt(np.max(LA.eigh(A.dot(A.T))[0]))
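
As a sanity check, the same quantity can be obtained in other ways. The sketch below compares the eigenvalue-based computation with the largest singular value and with NumPy's matrix 2-norm on a small random matrix; `A_toy` is an illustrative stand-in, not the matrix used in the experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
A_toy = rng.uniform(-1, 1, (5, 3))   # illustrative matrix

# Square root of the largest eigenvalue of A A^T.
L_eig = np.sqrt(np.max(np.linalg.eigvalsh(A_toy @ A_toy.T)))
# Largest singular value of A.
L_svd = np.linalg.svd(A_toy, compute_uv=False)[0]
# Matrix 2-norm computed directly.
L_norm = np.linalg.norm(A_toy, 2)
print(L_eig, L_svd, L_norm)
```

All three values agree up to rounding. When $m$ is much larger than $n$, forming $A^*A$ instead of $AA^*$ gives a smaller eigenvalue problem with the same largest eigenvalue.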

Define the step sizes for the primal-dual methods. PDA uses $\t = \s = 1/\n{A}$. For PDAL we use the simple initial guess $\t_0 =\frac{\sqrt{\min\{m,n\}}}{\n{A}_F}$, where $\n{A}_F$ is the Frobenius norm:

In [18]:
tau = 1./L
sigma = 1./L

tau0 = np.sqrt(min(m,n))/LA.norm(A)
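
The choice $\tau = \sigma = 1/\|A\|$ satisfies the usual step-size condition $\tau\sigma\|A\|^2 \le 1$ of the basic primal-dual method. To show where these step sizes enter, here is a minimal sketch of that iteration for the matrix game on a small random instance. It is an illustration written for this note, not the `pd` routine from `pd_algorithms`; the names `pda_sketch`, `proj_simplex_sketch`, and `A_toy` are hypothetical.

```python
import numpy as np

def proj_simplex_sketch(v):
    """Sort-based Euclidean projection onto the unit simplex (illustrative)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

def pda_sketch(A, x, y, tau, sigma, num_iter):
    """Basic primal-dual iteration for min_x max_y <Ax, y> over simplices."""
    for _ in range(num_iter):
        x_new = proj_simplex_sketch(x - tau * (A.T @ y))   # primal descent step
        x_bar = 2 * x_new - x                              # extrapolation
        y = proj_simplex_sketch(y + sigma * (A @ x_bar))   # dual ascent step
        x = x_new
    return x, y

rng = np.random.default_rng(0)
A_toy = rng.uniform(-1, 1, (5, 4))          # illustrative 5 x 4 game
L_toy = np.linalg.norm(A_toy, 2)            # spectral norm
x_start = np.ones(4) / 4
y_start = np.ones(5) / 5

def gap(x, y):
    return np.max(A_toy @ x) - np.min(A_toy.T @ y)

x_end, y_end = pda_sketch(A_toy, x_start, y_start, 1 / L_toy, 1 / L_toy, 5000)
gap_start, gap_end = gap(x_start, y_start), gap(x_end, y_end)
print(gap_start, gap_end)
```

The linesearch variant starts from an initial guess such as $\tau_0$ and adapts the step during the run, so it does not need the spectral norm in advance.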

Define number of iterations:

In [19]:
N = 50000

Run the algorithms: PDA, PDAL, Tseng FBF method, and PEGM:

In [20]:
ans1 = pd(J_gap, prox_g, prox_g, A, x0, y0, sigma, tau, numb_iter=N)
ans2 = pd_linesearch(J_gap, prox_g, prox_g, A, x0, y0, tau0, 1, numb_iter=N)
ans3 = tseng_fbf_linesearch(J_vip, F, prox_G, z0, delta=1.4, numb_iter=N)
ans4 = alg_VI_prox_affine(J_vip, F, prox_G, z0, numb_iter=N)


----- Primal-dual method -----
Time execution: 12.7101511955
----- Primal-dual method with linesearch-----
Time execution: 18.0970499516
---- FBF ----
Number of iterations: 50000
Number of gradients, n_grad: 147171
Number of prox_g: 97170
Time execution: 28.8428061008
---- Alg. 2, affine operator ----
Number of iterations: 50000
Number of gradients, n_grad: 50002
Number of prox_g: 50000
Time execution: 15.5855660439
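
The counts printed above can be turned into per-iteration costs, which makes the two variational inequality methods easier to compare. The sketch below copies the numbers from the output; it assumes that the reported times are in seconds and that the block labelled `Alg. 2, affine operator` is the PEGM run (`ans4`), following the order of the calls.

```python
# Counts and timings copied from the printed output above.
runs = {
    "FBF": {"iters": 50000, "n_grad": 147171, "n_prox": 97170, "time": 28.8428061008},
    "PEGM": {"iters": 50000, "n_grad": 50002, "n_prox": 50000, "time": 15.5855660439},
}

per_iter = {
    name: {
        "grad_per_iter": r["n_grad"] / r["iters"],
        "prox_per_iter": r["n_prox"] / r["iters"],
        "ms_per_iter": 1e3 * r["time"] / r["iters"],   # assumes seconds
    }
    for name, r in runs.items()
}

for name, stats in per_iter.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

In this run FBF with linesearch needed roughly 2.9 evaluations of $F$ and 1.9 projections per iteration, while PEGM needed essentially one of each.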


Plots of the primal-dual gap vs. number of iterations:

In [23]:
plt.plot(ans1[0], 'b')
plt.plot(ans2[0], 'r')
plt.plot(ans3[0], 'g')
plt.plot(ans4[0], 'm')
plt.yscale('log')
plt.show()

To reproduce the plots from the paper, run the following:

In [22]:
import matplotlib as mpl
mpl.rc('lines', linewidth=2)
mpl.rcParams.update(
    {'font.size': 13, 'font.family': 'STIXGeneral', 'mathtext.fontset': 'stix'})
mpl.rcParams['xtick.major.pad'] = 2
mpl.rcParams['ytick.major.pad'] = 2

plt.plot(ans1[0], 'b', label='PDA')
plt.plot(ans2[0], 'r', label='PDAL')
plt.plot(ans3[0], 'g', label='FBF')
plt.plot(ans4[0], 'm', label='PEGM')
plt.yscale('log')
# plt.xscale('log')
plt.xlabel('iterations')
plt.ylabel('PD gap')

plt.legend()
plt.savefig('figures/minmax-3.pdf')
plt.show()