<a href="https://colab.research.google.com/github/dnguyend/lagrange_rayleigh/blob/master/TwoLeftInverses.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

$\newcommand{\bC}{\boldsymbol{C}}$
$\newcommand{\bF}{\boldsymbol{F}}$
$\newcommand{\bI}{\boldsymbol{I}}$
$\newcommand{\bJ}{\boldsymbol{J}}$
$\newcommand{\bJtw}{\boldsymbol{J}^{(2)}}$
$\newcommand{\bJthr}{\boldsymbol{J}^{(3)}}$
$\newcommand{\bx}{\boldsymbol{x}}$
$\newcommand{\bH}{\boldsymbol{H}}$
$\newcommand{\bL}{\boldsymbol{L}}$
$\newcommand{\bT}{\boldsymbol{T}}$
$\newcommand{\bLx}{\boldsymbol{L}_{\bx}}$
$\newcommand{\blbd}{\boldsymbol{\lambda}}$
$\newcommand{\cL}{\mathcal{L}}$
$\newcommand{\PibH}{\boldsymbol{\Pi}_{\bH}}$

# Two left inverses

We want to demonstrate that there is a lot of freedom in choosing the Rayleigh quotient and the projection operator.
For an equation of the form
$$\bL(\bx, \blbd)=\bF(\bx) - \bH(\bx)\blbd = 0$$
with a constraint $\bC(\bx) = 0$, we can pick one left inverse $\bH^-_1$ of $\bH$ to compute $\blbd = \bH^-_1(\bx)\bF(\bx)$, and another left inverse $\bH^-_2$ in the Riemannian Newton equation:
$$\PibH = \bI - \bH\bH^-_2$$
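$$\PibH \bLx(\bx,\blbd)\,\boldsymbol{\eta} = - \bL(\bx, \blbd)$$
where $\boldsymbol{\eta}$ is the Newton step, taken tangent to the constraint set at $\bx$.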
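The defining properties used here can be checked numerically. The following is a minimal sketch, not the notebook's solver: `H` and `F` are random stand-ins for $\bH(\bx)$ and $\bF(\bx)$ at a fixed point, and the weighted form $(\bH^T W\bH)^{-1}\bH^T W$ with a random positive definite `W` is an assumed way of producing two different left inverses. It checks that $\bH^-\bH = 1$, that $\PibH\bH = 0$, that $\PibH$ is idempotent, and that consequently $\PibH\bL = \PibH\bF$ whichever left inverse produced $\blbd$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
H = rng.standard_normal((n, 1))   # stand-in for H(x): a single constraint direction
F = rng.standard_normal(n)        # stand-in for F(x)


def spd(n):
    """Random symmetric positive definite weight (a hypothetical choice)."""
    G = rng.standard_normal((n, n))
    return G @ G.T + np.eye(n)


def left_inverse(H, W):
    """Return (H^T W H)^{-1} H^T W, which satisfies left_inverse(H, W) @ H = I."""
    HtW = H.T @ W
    return np.linalg.solve(HtW @ H, HtW)


H1_inv = left_inverse(H, spd(n))   # used for the multiplier
H2_inv = left_inverse(H, spd(n))   # used for the projection

lbd = H1_inv @ F                   # multiplier from the first left inverse
L = F - H @ lbd                    # residual L(x, lambda)
Pi = np.eye(n) - H @ H2_inv        # projection from the second left inverse

left_inv_err = max(abs((H1_inv @ H).item() - 1.0),
                   abs((H2_inv @ H).item() - 1.0))
kills_H_err = np.abs(Pi @ H).max()             # Pi H = 0
idempotent_err = np.abs(Pi @ Pi - Pi).max()    # Pi Pi = Pi
same_rhs_err = np.abs(Pi @ L - Pi @ F).max()   # Pi L = Pi F for any lambda
print(left_inv_err, kills_H_err, idempotent_err, same_rhs_err)
```

All four printed quantities should be at the level of floating-point rounding.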

In the following, we consider the unit sphere, $\bC(\bx) = \bx^T\bx-1$, with $\bF(\bx)$ given by the tensor contraction $\bT(\bI,\bx,\cdots,\bx)$ and $\bH(\bx) = \bx$, and consider the eigentensor problem:
$$\bL(\bx, \blbd) = \bT(\bI,\bx,\cdots,\bx) - \bx\blbd =0$$
Let $A, B$ be two nondegenerate matrices, and $a, b$ be two nonnegative integers.
For the Rayleigh quotient we consider the left inverse:
$$\bH_1^-(\bx) = ((\bx^a)^T A\bx)^{-1}(\bx^a)^T A$$
where $\bx^a$ denotes the entrywise $a$-th power of $\bx$, so
$$\blbd = \bH_1^-(\bx) \bT(\bI,\bx,\cdots,\bx)$$
For the projection $\PibH$ we consider
$$\bH_2^-(\bx) = ((\bx^b)^T B\bx)^{-1}(\bx^b)^T B$$
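As a small illustration of why the choice of $A$ and $a$ is free, the sketch below evaluates the generalized Rayleigh quotient for a symmetric matrix `M`, used here as an assumed order-2 stand-in for $\bT$ so that an exact eigenpair is available from `numpy.linalg.eigh`. The helper `rayleigh` and the weight `A_spd` are hypothetical names introduced for this example. At an eigenpair every choice returns the same eigenvalue; away from it the quotients generally differ.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
M = rng.standard_normal((n, n))
M = (M + M.T) / 2                      # symmetric matrix: order-2 stand-in for T
evals, evecs = np.linalg.eigh(M)
lam, x = evals[-1], evecs[:, -1]       # an exact eigenpair with ||x|| = 1

G = rng.standard_normal((n, n))
A_spd = G @ G.T + np.eye(n)            # hypothetical positive definite weight A


def rayleigh(x, Fx, a, A):
    """((x^a)^T A x)^{-1} (x^a)^T A F(x), with x^a the entrywise power."""
    w = (x ** a) @ A
    return (w @ Fx) / (w @ x)


# Choices of (a, A) for which the denominator is guaranteed positive
cases = [(1, np.eye(n)), (1, A_spd), (3, np.eye(n))]

# At the eigenpair, every left inverse recovers the eigenvalue
quotients = [rayleigh(x, M @ x, a, A) for a, A in cases]

# At a generic unit vector, the quotients depend on the choice
y = rng.standard_normal(n)
y /= np.linalg.norm(y)
off_solution = [rayleigh(y, M @ y, a, A) for a, A in cases]

print(lam, quotients)
print(off_solution)
```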
We consider the Riemannian Newton algorithm with these two left inverses:


In [1]:
!git clone https://github.com/dnguyend/lagrange_rayleigh

fatal: destination path 'lagrange_rayleigh' already exists and is not an empty directory.


In [0]:
import numpy as np
from numpy.linalg import norm, solve
from numpy import eye
from scipy.linalg import null_space
from lagrange_rayleigh.core import utils
from lagrange_rayleigh.core.eigen_tensor_solver import symmetric_tv_mode_product
    

def ortho_sphere_power(
        T, max_itr, delta, x_init=None, a=None,
        b=None, AA=None, BB=None):
    """Tangent-form Rayleigh iteration with two different left inverses.

    Powers of x are taken entrywise. If a and b are odd integers,
    then (x^a).T x and (x^b).T x are positive.
    The left inverse for lambda is ((x^a).T AA x)^{-1} (x^a).T AA.
    The left inverse for the projection is ((x^b).T BB x)^{-1} (x^b).T BB.
    """
    def pw(x, a):
        if a == 0:
            return np.ones_like(x)
        ret = x.copy()
        for i in range(a-1):
            ret *= x
        return ret
        
    # get tensor dimensionality and order
    n_vec = T.shape
    m = len(n_vec)
    n = T.shape[0]
    R = 1

    if a is None:
        a = 1
    if b is None:
        b = 1
    if BB is None:
        BB = eye(n)
    if AA is None:
        AA = eye(n)

    converge = False

    # if not given as input, randomly initialize
    if x_init is None:
        x_init = np.random.randn(n)
        x_init = x_init/norm(x_init)

    # init lambda_(k) and x_(k)
    x_k = x_init / np.linalg.norm(x_init)
    T_x_m_2 = symmetric_tv_mode_product(T, x_k, m-2)
    T_x_m_1 = T_x_m_2 @ x_k
    # x_t_A_x = x_k.T @ x_k
    lbd = (pw(x_k.T, a) @ AA @ T_x_m_1) / (pw(x_k.T, a) @ AA @ x_k)
    ctr = 0

    while (R > delta) and (ctr < max_itr):
        # compute T(I,I,x_k,...,x_k), T(I,x_k,...,x_k) and g(x_k)
        g = -lbd * x_k + T_x_m_1
        # compute Hessian H(x_k)
        H = (m-1)*T_x_m_2-lbd*eye(n)
        xB = pw(x_k, b).reshape(1, -1) @ BB
        U_x_k_b = null_space(xB)
        U_x_k = null_space(x_k.reshape(1, -1))
        H_p = U_x_k_b.T @ H @ U_x_k
        # Newton step y in the tangent space of the sphere, then renormalize
        y = U_x_k @ solve(H_p, -U_x_k_b.T @ g)
        x_k_n = (x_k + y)/(np.linalg.norm(x_k + y))

        #  update residual and lbd
        R = norm(x_k-x_k_n)
        x_k = x_k_n
        T_x_m_2 = symmetric_tv_mode_product(T, x_k, m-2)
        T_x_m_1 = T_x_m_2 @ x_k

        lbd = (pw(x_k.T, a) @ AA @ T_x_m_1) / (pw(x_k.T, a) @ AA @ x_k)
        # print('ctr=%d lbd=%f' % (ctr, lbd))
        ctr += 1
    x = x_k
    err = norm(symmetric_tv_mode_product(
        T, x, m-1) - lbd * x)

    # converged if the last step size fell below the tolerance
    if R <= delta:
        converge = True

    return x, lbd, ctr, converge, err
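The function above relies on `symmetric_tv_mode_product` for the contractions $\bT(\bI,\bI,\bx,\cdots,\bx)$ and $\bT(\bI,\bx,\cdots,\bx)$. The sketch below is a hypothetical stand-in, `contract_last_modes`, written under the assumption that the library routine contracts the given number of modes of a symmetric tensor with `x`, as its usage in `ortho_sphere_power` suggests. It is not the library's implementation. It also checks by finite differences that the Jacobian of $\bT(\bI,\bx,\cdots,\bx)$ is $(m-1)\bT(\bI,\bI,\bx,\cdots,\bx)$, which is the first term of `H` in the loop.

```python
import itertools
import math

import numpy as np


def contract_last_modes(T, x, k):
    """Contract the last k modes of T with x (hypothetical stand-in)."""
    out = T
    for _ in range(k):
        out = np.tensordot(out, x, axes=1)   # sums over the last axis of out
    return out


rng = np.random.default_rng(2)
n, m = 4, 3

# Build a symmetric order-m tensor by averaging over all axis permutations
G = rng.standard_normal((n,) * m)
T = sum(np.transpose(G, p)
        for p in itertools.permutations(range(m))) / math.factorial(m)

x = rng.standard_normal(n)
x /= np.linalg.norm(x)

T_x_m_2 = contract_last_modes(T, x, m - 2)   # T(I, I, x): an n x n matrix
T_x_m_1 = T_x_m_2 @ x                        # T(I, x, x): a length-n vector


def F(v):
    """F(v) = T(I, v, ..., v)."""
    return contract_last_modes(T, v, m - 1)


# Central finite differences, one column per coordinate direction
h = 1e-6
J_fd = np.column_stack(
    [(F(x + h * e) - F(x - h * e)) / (2 * h) for e in np.eye(n)])
jac_err = np.abs(J_fd - (m - 1) * T_x_m_2).max()
print(jac_err)
```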


We see that separate choices of the two left inverses still give a fast-converging algorithm:

In [6]:
  
n = 10
m = 3
tol = 1e-10
max_itr = 200
np.random.seed(0)
n_test = 10
for i in range(n_test):
    a = np.random.randint(0, 5)
    b = np.random.randint(0, 5)
    x_init = np.random.randn(n)
    x_init /= np.linalg.norm(x_init)
    BB = utils.gen_random_symmetric_pos(n)
    AA = utils.gen_random_symmetric_pos(n)

    T = utils.generate_symmetric_tensor(n, m)
    x, lbd, ctr, converge, err = ortho_sphere_power(
        T, max_itr, tol, x_init, a=a,
        b=b, AA=AA, BB=BB)
    print('x=%s, lbd=%f, ctr=%d, converge=%d, err=%f' % (
        str(x), lbd, ctr, converge, err))


x=[0.3086719  0.3131101  0.32540478 0.31263685 0.32890518 0.31959297
 0.314951   0.26865241 0.30468664 0.35851523], lbd=16.482988, ctr=14, converge=1, err=0.000000
x=[-0.34087415 -0.27923109 -0.33199413 -0.31457023 -0.2997952  -0.30388138
 -0.33195865 -0.31484761 -0.31845993 -0.3220201 ], lbd=-16.386986, ctr=13, converge=1, err=0.000000
x=[0.28937886 0.31311781 0.32097044 0.34686254 0.3193481  0.34330286
 0.28451378 0.30959818 0.33163879 0.29708247], lbd=16.237043, ctr=17, converge=1, err=0.000000
x=[0.28897953 0.35212553 0.28557277 0.34409407 0.32312788 0.31224425
 0.30619494 0.30813692 0.33105809 0.30386616], lbd=16.837899, ctr=14, converge=1, err=0.000000
x=[-0.29404104 -0.32688864 -0.31055265 -0.31938101 -0.30648427 -0.30770691
 -0.28010165 -0.32443913 -0.31621864 -0.36865762], lbd=-15.564369, ctr=14, converge=1, err=0.000000
x=[-0.35666234 -0.28240407 -0.33414726 -0.30370524 -0.35245732 -0.27056948
 -0.30865049 -0.2853069  -0.31352692 -0.34168752], lbd=-16.331798, ctr=16, converge