<a href="https://colab.research.google.com/github/dnguyend/lagrange_rayleigh/blob/master/EigenTensor.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

$\newcommand{\by}{\boldsymbol{y}}$
$\newcommand{\bu}{\boldsymbol{u}}$
$\newcommand{\bz}{\boldsymbol{z}}$
$\newcommand{\bx}{\boldsymbol{x}}$
$\newcommand{\bg}{\boldsymbol{g}}$
$\newcommand{\bH}{\boldsymbol{H}}$
$\newcommand{\bI}{\boldsymbol{I}}$
$\newcommand{\bU}{\boldsymbol{U}}$
$\newcommand{\bT}{\boldsymbol{T}}$
$\newcommand{\bF}{\boldsymbol{F}}$
$\newcommand{\bJ}{\boldsymbol{J}}$
$\newcommand{\bA}{\boldsymbol{A}}$
$\newcommand{\blbd}{\boldsymbol{\lambda}}$
$\newcommand{\EL}{E_L}$
$\newcommand{\NCM}{\text{NCM}}$
$\newcommand{\ONCM}{\text{O-NCM}}$
# Schur form of Lagrange-Rayleigh algorithm for Eigen-Tensor problem.

A good example for comparing the Schur and Riemannian forms of the Lagrange-Rayleigh algorithm is the eigen-tensor problem.

The paper [1] proposed two methods, $\NCM$ (Newton correction method) and $\ONCM$ (orthogonal $\NCM$). $\ONCM$ is the Riemannian Newton method on the sphere: the updating vector $\bu$ lies in the tangent space, and the updating equation is the Riemannian Newton equation. The updating equation for $\NCM$ requires solving $\bH \by = \bg$, where $\bH$ is of the form $\nabla^2 -\lambda\bI$, an extension of the resolvent equation. It converges quadratically, but the increment $\by$ is not in the tangent space. Our Schur form Rayleigh formulation provides an update $\by_1= \by -c \zeta$ that is in the tangent space and can be computed by solving linear systems with $\bH$. $\ONCM$, on the other hand, requires solving $\bH_p \bz = -\bU'\bg$ with the projected Hessian $\bH_p$ ($\bU'$ is the projection). While $\bH_p$ has dimension one less than $\bH$, extra steps are needed to compute the projected Hessian and to embed $\bz$ from the tangent space back into $\bu$ in the ambient space. [1] found $\NCM$ to be inferior to $\ONCM$. Our Schur form is competitive in this case: we show below that it generally provides around 30% improvement in execution time over our Python implementation of $\ONCM$. In theory $\by_1$ should be identical to $\bu$ of $\ONCM$, as we proved that the Schur form is just another way of solving the updating equation. This is indeed so most of the time; however, there are instances where numerical discrepancies make the two iterations diverge from each other. Nevertheless, the Schur form solution is also quite stable and, given the time improvement, should be a competitive candidate for solving eigen-tensor problems.
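
As a rough illustration of the Schur form update described above, the sketch below performs one step for $\bF(\bx) = \bT(\bI, \bx, \cdots, \bx)$ on the unit sphere. It is a minimal reading of the text and not the code in eigen_tensor_solver.py: the names `tensor_apply` and `schur_form_step`, the sign convention for the update, and the random test tensor are all assumptions made for this example.

```python
import itertools
import numpy as np


def tensor_apply(T, x, n_contract):
    """Contract the last n_contract modes of T with the vector x."""
    v = T
    for _ in range(n_contract):
        v = np.tensordot(v, x, axes=1)
    return v


def schur_form_step(T, x):
    """One Schur form Rayleigh quotient step (illustrative sketch).

    Solves the bordered Newton system
        H eta - c x = -g,   x^T eta = 0
    with two solves against H = J_F(x) - lambda I only.
    """
    m = T.ndim
    F = tensor_apply(T, x, m - 1)            # F(x) = T(I, x, ..., x)
    J = (m - 1) * tensor_apply(T, x, m - 2)  # Jacobian of F for symmetric T
    lbd = x @ F                              # Rayleigh quotient on the unit sphere
    g = F - lbd * x                          # residual of the eigen equation
    H = J - lbd * np.eye(x.shape[0])
    sol = np.linalg.solve(H, np.column_stack([g, x]))
    y, zeta = sol[:, 0], sol[:, 1]
    c = (x @ y) / (x @ zeta)                 # Schur complement coefficient
    eta = -(y - c * zeta)                    # tangent update: x^T eta = 0
    x_new = (x + eta) / np.linalg.norm(x + eta)
    return x_new, lbd, eta, H, g, c


# Hypothetical test data: a random tensor symmetrized over index permutations.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4, 4))
perms = list(itertools.permutations(range(3)))
T = sum(np.transpose(B, p) for p in perms) / len(perms)

x0 = rng.standard_normal(4)
x0 /= np.linalg.norm(x0)
x1, lbd0, eta0, H0, g0, c0 = schur_form_step(T, x0)
print("x^T eta =", x0 @ eta0)
```

The increment `eta` is orthogonal to `x` by construction, which is the tangency property that the plain solve $\bH \by = \bg$ lacks. Only solves with the full matrix $\bH$ are used; no projected Hessian is formed.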

Further, the paper noted that although $\ONCM$ is fast, without running the homotopy method first it cannot determine whether it has recovered all real eigenpairs. We address the issue by deriving an RQI algorithm for complex eigenpairs. Here, a formula in [2] tells us that the exact number of eigenpairs, counted with multiplicity, is $((m-1)^n - 1) / (m-2)$ for $m \geq 3$ (for $m=2$ it is $n$). Before this work, computing all complex eigenvalues relied on Gröbner basis or homotopy methods, both of which are costly.
We note that the work [3] listed a number of complex eigenpairs. We show that a unitary version of RQI computes all eigenpairs much faster than the existing algorithms: instead of the few hours needed for *real* pairs, we are able to compute all *complex pairs* within 15 minutes for an $8\times 8 \times 8 \times 8$ tensor. This algorithm is an easy consequence of our main theorem.
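
The eigenpair count quoted from [2] is easy to evaluate and gives a stopping target when searching for all pairs. The helper below is a small sketch; the name `count_eigenpairs` is hypothetical and is not a function of the lagrange_rayleigh library.

```python
def count_eigenpairs(n, m):
    """Number of eigenpairs, counted with multiplicity, for an order-m
    tensor of dimension n, using the formula quoted from [2]."""
    if m < 2:
        raise ValueError("m must be at least 2")
    if m == 2:
        return n
    # (m-1)^n - 1 is always divisible by m-2 (geometric sum), so // is exact
    return ((m - 1) ** n - 1) // (m - 2)


print(count_eigenpairs(9, 3))   # 511
print(count_eigenpairs(8, 4))   # 3280
```

For $n = 9$, $m = 3$ this gives 511, which agrees with the number of pairs reported for the $9\times 9\times 9$ run later in this notebook.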

[1] *Newton Correction Methods for Computing Real Eigenpairs of Symmetric Tensors.*
Ariel Jaffe, Roi Weiss, and Boaz Nadler
SIAM Journal on Matrix Analysis and Applications 2018 39:3, 1071-1094

[2] *The number of eigenvalues of a tensor.*
Dustin Cartwright and Bernd Sturmfels,
Linear Algebra and its Applications 2013 438:2, 942-952

[3] *Shifted Power Method for Computing Tensor Eigenpairs*
Kolda, Tamara G. and Mayo, Jackson R.
SIAM Journal on Matrix Analysis and Applications 2011 32:4, 1095-1124



First we pull the code from our GitHub repository. Since our main development tool is Python, we quickly converted the MATLAB code for [1] from https://github.com/arJaffe/BinaryLatentVariables/tree/master/NCM_functions to Python for comparison. Later on we developed both the real and unitary versions in MATLAB together with the Python version. The runtimes for Python and MATLAB are mostly comparable.

We also implement a custom version of the Schur form solution. The code for these two functions is in lagrange_rayleigh/core/eigen_tensor_solver.py. Readers can run
!cat lagrange_rayleigh/core/eigen_tensor_solver.py
to view the code.

In [2]:
!git clone https://github.com/dnguyend/lagrange_rayleigh

Cloning into 'lagrange_rayleigh'...
remote: Enumerating objects: 132, done.
remote: Counting objects: 100% (132/132), done.
remote: Compressing objects: 100% (91/91), done.
remote: Total 132 (delta 62), reused 99 (delta 37), pack-reused 0
Receiving objects: 100% (132/132), 1.24 MiB | 1.39 MiB/s, done.
Resolving deltas: 100% (62/62), done.


In [0]:
# run this cell to view the codes of the two routines, one for ONCM and one for
# Schur-Rayleigh:
!cat lagrange_rayleigh/core/eigen_tensor_solver.py

We also show how to use our library routines rayleigh_quotient_iteration and rayleigh_chebyshev, by first deriving a class from the Lagrangian class (here explicit_vector_lagrangian) and providing it with methods to compute the function to solve and its derivatives. In this case the function is $\bF(\bx) = \bT(\bI, \bx,\cdots,\bx)$. For a symmetric tensor its derivative is $\bJ_{\bF}=(m-1)\bT(\bI,\bI,\bx,\cdots, \bx)$, which is the scaling applied to `_F1` in the class below. We also compute the second derivative of $\bF$ for Rayleigh-Chebyshev.
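
To make the derivative formula concrete, the following sketch compares $(m-1)\bT(\bI,\bI,\bx,\cdots,\bx)$ with a central finite difference of $\bF$. The factor $(m-1)$ corresponds to the scaling `self._F1 *= (self._m - 1)` in the class. The helper names `random_symmetric_tensor`, `F` and `J_F` are assumptions for this example and are independent of the library code.

```python
import itertools
import numpy as np


def random_symmetric_tensor(n, m, seed=0):
    """Average a random tensor over all index permutations."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((n,) * m)
    perms = list(itertools.permutations(range(m)))
    return sum(np.transpose(B, p) for p in perms) / len(perms)


def F(T, x):
    """F(x) = T(I, x, ..., x): contract the last m-1 modes with x."""
    v = T
    for _ in range(T.ndim - 1):
        v = np.tensordot(v, x, axes=1)
    return v


def J_F(T, x):
    """Jacobian (m-1) T(I, I, x, ..., x), valid for symmetric T."""
    v = T
    for _ in range(T.ndim - 2):
        v = np.tensordot(v, x, axes=1)
    return (T.ndim - 1) * v


T = random_symmetric_tensor(5, 4)
x = np.random.default_rng(1).standard_normal(5)
h = 1e-6
# column j of the Jacobian is the derivative of F along the j-th unit vector
J_fd = np.column_stack([
    (F(T, x + h * e) - F(T, x - h * e)) / (2 * h) for e in np.eye(5)])
max_diff = np.max(np.abs(J_fd - J_F(T, x)))
print("max |J_fd - J_F| =", max_diff)
```

The same contraction pattern, stopped one mode earlier, yields the second derivative used by the Chebyshev variant.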

In [0]:
from __future__ import print_function
import numpy as np
import pandas as pd
import time


import lagrange_rayleigh.core.utils as utils
from lagrange_rayleigh.core.vector_lagrangian import explicit_vector_lagrangian
from lagrange_rayleigh.core.constraints import base_constraints

from lagrange_rayleigh.core.solver import rayleigh_quotient_iteration
from lagrange_rayleigh.core.solver import rayleigh_chebyshev
from lagrange_rayleigh.core.eigen_tensor_solver import\
    orthogonal_newton_correction_method, schur_form_rayleigh,\
    symmetric_tv_mode_product, schur_form_rayleigh_chebyshev


class eigen_tensor_lagrange(explicit_vector_lagrangian):
    def calc_H(self, x):
        return x[:, None]

    def F(self, x):
        v = self._args['A'].copy()
        for i in range(self._m-3):
            v = np.tensordot(v, x, axes=1)
        self._F2 = v
        self._F1 = np.tensordot(self._F2, x, axes=1)
        self._F0 = np.tensordot(self._F1, x, axes=1)
        self._F1 *= (self._m - 1)
        self._F2 *= (self._m - 2) * (self._m - 1)
        return self._F0
    
    def __init__(self, A):
        self._args = {'A': A}
        self._k = A.shape[0]
        self._m = len(A.shape)
        self._shape_in = (A.shape[0], 0)
        self._shape_out = (A.shape[0], 0)

    def calc_J_F(self, x):
        return self._F1

    def calc_J_H(self, x):
        return np.eye(x.shape[0]).reshape(
            x.shape[0], 1, x.shape[0])

    def J_F2(self, d_x):
        return np.tensordot(
            np.tensordot(self._F2, d_x, axes=1), d_x, axes=1)

    def J_C(self, d_x):
        return np.dot(
            self._state['J_C'], d_x)

    def calc_J_F2(self, x):
        return self._F2

    def calc_J_H2(self, x):
        pass

    def J_H2(self, d_x, d_lbd):
        return np.zeros((d_x.shape[0]))

    def J_C2(self, d_x):
        return 2 * np.dot(d_x.T, d_x).reshape(1)
    
    def calc_J_RAYLEIGH(self, x):
        return (- 2 * self['RAYLEIGH'] * x.T +
                self._F0.T + np.dot(x.T, self._F1)).reshape(1, -1)

Our next function calls and compares four routines: orthogonal_newton_correction_method, schur_form_rayleigh, rayleigh_quotient_iteration and rayleigh_chebyshev. Note that schur_form_rayleigh and rayleigh_quotient_iteration are just different implementations of the same algorithm; the latter is meant to be a general-purpose routine and so is not very efficient. schur_form_rayleigh is modelled on orthogonal_newton_correction_method: we simply replace the solve with the projected Hessian by a solve with the full Hessian and apply the Schur form adjustment. In the cell below, the schur and schur_cheb columns are both produced by schur_form_rayleigh_chebyshev, with do_chebyshev=False and do_chebyshev=True respectively. The first three routines still show discrepancies: for a small number of initial values they converge to different eigenpairs, due to numerical errors that are difficult to pinpoint.

In [0]:
def test_eigen_tensor(k, m, max_err, max_itr, n_test):
    def sphere_func(x):
        return np.dot(x.T, x) - 1

    def sphere_jacobian(x):
        return 2 * x.reshape(1, -1)

    def sphere_retraction(x, u):
        return (x + u) / np.linalg.norm(x + u)

    sphere = base_constraints(
        shape_in=(k,),
        shape_constraint=(1,),
        equality=sphere_func)

    sphere.set_analytics(
        J_C=sphere_jacobian,
        retraction=sphere_retraction)

    A = utils.generate_symmetric_tensor(k, m)
    e = eigen_tensor_lagrange(A)
    e.constraints = sphere

    o_ncm_cnt = np.zeros(n_test, dtype=int)
    schur_cnt = np.zeros(n_test, dtype=int)
    ray_cnt = np.zeros(n_test, dtype=int)
    schur_cheb_cnt = np.zeros(n_test, dtype=int)

    o_ncm_err = np.zeros(n_test)
    schur_err = np.zeros(n_test)
    ray_err = np.zeros(n_test)
    schur_cheb_err = np.zeros(n_test)

    o_ncm_lbd = np.zeros(n_test)
    schur_lbd = np.zeros(n_test)
    ray_lbd = np.zeros(n_test)
    schur_cheb_lbd = np.zeros(n_test)

    o_ncm_time = np.zeros(n_test)
    schur_time = np.zeros(n_test)
    ray_time = np.zeros(n_test)
    schur_cheb_time = np.zeros(n_test)

    for jj in range(n_test):
        x0 = np.random.randn(k)
        x0 = x0 / np.linalg.norm(x0)

        # do orthogonal
        t_start = time.time()
        o_x, o_lbd, o_ctr, converge = orthogonal_newton_correction_method(
            A, max_itr, max_err, x_init=x0)
        t_end = time.time()
        o_ncm_cnt[jj] = o_ctr
        o_ncm_lbd[jj] = o_lbd
        o_ncm_err[jj] = np.linalg.norm(
            symmetric_tv_mode_product(
                A, o_x, m-1) - o_lbd * o_x)
        o_ncm_time[jj] = t_end - t_start

        # do schur_form_rayleigh: Schur form RQI, run here through
        # schur_form_rayleigh_chebyshev with the Chebyshev correction off.
        # The standalone routine would be called as
        # s_x, s_lbd, ctr, converge = schur_form_rayleigh(
        #     A, max_itr, max_err, x_init=x0)
        t_start = time.time()
        s_x, s_lbd, ctr, converge, err = schur_form_rayleigh_chebyshev(
            A, max_itr, max_err, x_init=x0, do_chebyshev=False)

        t_end = time.time()
        schur_cnt[jj] = ctr
        schur_lbd[jj] = s_lbd
        schur_err[jj] = np.linalg.norm(
            symmetric_tv_mode_product(
                A, s_x, m-1) - s_lbd * s_x)
        schur_time[jj] = t_end - t_start

        # now do rayleigh

        t_start = time.time()
        res_ray = rayleigh_quotient_iteration(
            e, x0, max_err=max_err,
            max_iter=max_itr, verbose=False,
            exit_by_diff=True)
        t_end = time.time()
        ray_time[jj] = t_end - t_start
        ray_cnt[jj] = res_ray['n_iter']
        ray_lbd[jj] = res_ray['lbd']
        ray_err[jj] = np.linalg.norm(res_ray['err'])
        # print("doing rayleigh")
        # print(res_ray)
        # print(e.L(res_ray['x'], res_ray['lbd']))

        # now do rayleigh chebyshev, using the Schur form routine with the
        # Chebyshev correction switched on
        t_start = time.time()
        sch_x, sch_lbd, ctr, converge, err = schur_form_rayleigh_chebyshev(
            A, max_itr, max_err, x_init=x0, do_chebyshev=True)
        t_end = time.time()
        # alternative using the general purpose routine:
        # res_ray_cheb = rayleigh_chebyshev(
        #     e, x0, max_err=max_err, max_iter=max_itr,
        #     verbose=False, exit_by_diff=True)

        schur_cheb_time[jj] = t_end - t_start
        schur_cheb_cnt[jj] = ctr
        schur_cheb_lbd[jj] = sch_lbd
        schur_cheb_err[jj] = np.linalg.norm(
            symmetric_tv_mode_product(
                A, sch_x, m-1) - sch_lbd * sch_x)

        # print("doing raychev")
        # print(res_ray_cheb)
        # print(e.L(res_ray_cheb['x'], res_ray_cheb['lbd']))

    summ = pd.DataFrame(
        {
            'o_ncm_iter': o_ncm_cnt,
            'schur_iter': schur_cnt,
            'ray_iter': ray_cnt, 'schur_cheb_iter': schur_cheb_cnt,
            'o_ncm_err': o_ncm_err,
            'schur_err': schur_err,
            'ray_err': ray_err, 'schur_cheb_err': schur_cheb_err,
            'o_ncm_lbd': o_ncm_lbd,
            'schur_lbd': schur_lbd,
            'ray_lbd': ray_lbd,
            'schur_cheb_lbd': schur_cheb_lbd,
            'o_ncm_time': o_ncm_time,
            'schur_time': schur_time,
            'ray_time': ray_time,
            'schur_cheb_time': schur_cheb_time
        },
        columns=['o_ncm_iter', 'o_ncm_lbd', 'o_ncm_err', 'o_ncm_time',
                 'schur_iter', 'schur_lbd', 'schur_err', 'schur_time',
                 'ray_iter', 'ray_lbd', 'ray_err', 'ray_time',
                 'schur_cheb_iter', 'schur_cheb_lbd', 'schur_cheb_err',
                 'schur_cheb_time'])
    return summ

We now run the routine for a small test size:

In [29]:
from IPython.display import display, HTML
np.random.seed(2)
k = 6
m = 3
max_err = 1e-10
max_itr = 200
n_test = 100

summ = test_eigen_tensor(k, m, max_err, max_itr, n_test)
# summ[['o_ncm_time', 'schur_time', 'ray_time', 'ray_cheb_time']].describe())
# display(HTML(summ.describe().to_html()))
display(HTML(summ[[a for a in summ.columns if 'time' in a]].describe().to_html()))
display(HTML(summ[[a for a in summ.columns if 'iter' in a]].describe().to_html()))
display(HTML(summ[[a for a in summ.columns if 'lbd' in a]].describe().to_html()))
display(HTML(summ[[a for a in summ.columns if 'err' in a]].describe().to_html()))


| | o_ncm_time | schur_time | ray_time | schur_cheb_time |
|---|---|---|---|---|
| count | 100.0 | 100.0 | 100.0 | 100.0 |
| mean | 0.010046 | 0.004415 | 0.016861 | 0.00431 |
| std | 0.008988 | 0.005305 | 0.016078 | 0.003704 |
| min | 0.002081 | 0.000863 | 0.003328 | 0.001315 |
| 25% | 0.003527 | 0.001335 | 0.005529 | 0.001949 |
| 50% | 0.006114 | 0.002418 | 0.010162 | 0.003386 |
| 75% | 0.013591 | 0.004897 | 0.021473 | 0.005394 |
| max | 0.04145 | 0.02802 | 0.072702 | 0.030089 |


| | o_ncm_iter | schur_iter | ray_iter | schur_cheb_iter |
|---|---|---|---|---|
| count | 100.0 | 100.0 | 100.0 | 100.0 |
| mean | 24.78 | 28.74 | 24.62 | 19.17 |
| std | 22.346934 | 37.076666 | 24.637895 | 22.295107 |
| min | 5.0 | 5.0 | 4.0 | 4.0 |
| 25% | 8.0 | 8.0 | 7.0 | 7.0 |
| 50% | 15.0 | 15.0 | 14.0 | 13.0 |
| 75% | 33.5 | 33.5 | 32.5 | 24.0 |
| max | 108.0 | 200.0 | 112.0 | 200.0 |


| | o_ncm_lbd | schur_lbd | ray_lbd | schur_cheb_lbd |
|---|---|---|---|---|
| count | 100.0 | 100.0 | 100.0 | 100.0 |
| mean | 0.111948 | 0.198989 | 0.087606 | 0.054705 |
| std | 1.558951 | 1.697254 | 1.565964 | 1.729099 |
| min | -6.752001 | -6.752001 | -6.752001 | -6.752001 |
| 25% | -0.427413 | -0.420226 | -0.421385 | -0.421385 |
| 50% | 0.321902 | 0.322613 | 0.322613 | 0.322258 |
| 75% | 0.54753 | 0.792287 | 0.442781 | 0.855013 |
| max | 6.752001 | 6.752001 | 6.752001 | 6.752001 |


| | o_ncm_err | schur_err | ray_err | schur_cheb_err |
|---|---|---|---|---|
| count | 100.0 | 100.0 | 100.0 | 100.0 |
| mean | 2.298026e-16 | 0.00591848 | 2.060569e-12 | 0.003485744 |
| std | 2.19117e-16 | 0.03777224 | 6.077792e-12 | 0.03485744 |
| min | 4.350664e-17 | 5.951707e-17 | 1.010318e-16 | 2.775558e-17 |
| 25% | 1.16627e-16 | 1.751306e-16 | 3.688838e-16 | 1.822372e-16 |
| 50% | 1.643246e-16 | 2.60148e-16 | 1.039251e-15 | 2.939175e-16 |
| 75% | 2.540287e-16 | 4.335983e-16 | 2.017441e-13 | 3.938848e-16 |
| max | 1.472877e-15 | 0.2889771 | 3.643705e-11 | 0.3485744 |


# Rayleigh Chebyshev versus RQI
We see in the above example that Chebyshev does offer a saving in the number of iterations, but not really in time. When we increase $m$ and $n$, Chebyshev no longer offers an advantage: in the Chebyshev routine we have to apply the Chebyshev correction only when the increment size is not too big, otherwise the iteration wanders erratically. When $n$ and $m$ are bigger, the saving in the number of iterations also disappears. This is in contrast with the nonlinear eigenvalue problem, where Chebyshev seems to be competitive for the examples we looked at.

This probably mirrors the common experience with higher-order derivative methods in unconstrained iterations.

The reader can inspect the dataframe to get more information about the eigenpairs obtained:

In [30]:
display(HTML(summ.head(10).to_html()))
        

| | o_ncm_iter | o_ncm_lbd | o_ncm_err | o_ncm_time | schur_iter | schur_lbd | schur_err | schur_time | ray_iter | ray_lbd | ray_err | ray_time | schur_cheb_iter | schur_cheb_lbd | schur_cheb_err | schur_cheb_time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8 | 0.179542 | 8.212042e-17 | 0.013855 | 8 | 0.179542 | 1.059886e-16 | 0.001312 | 7 | 0.179542 | 3.006486e-12 | 0.005511 | 7 | 0.179542 | 1.13912e-16 | 0.001804 |
| 1 | 10 | 1.251493 | 4.855243e-16 | 0.005769 | 10 | 1.251493 | 5.467214e-16 | 0.004474 | 9 | 1.251493 | 5.822059e-16 | 0.006687 | 9 | 1.251493 | 3.692639e-16 | 0.002104 |
| 2 | 63 | -0.43506 | 1.892996e-16 | 0.024881 | 56 | 0.322613 | 1.109486e-16 | 0.008445 | 70 | 0.43506 | 1.531944e-12 | 0.04669 | 46 | 0.792287 | 3.202483e-16 | 0.009371 |
| 3 | 11 | 1.043193 | 2.733607e-16 | 0.004495 | 11 | 1.043193 | 5.200468e-16 | 0.001741 | 10 | 1.043193 | 8.418683e-16 | 0.007298 | 8 | 1.043193 | 5.635903e-16 | 0.001956 |
| 4 | 26 | 0.322613 | 1.010737e-16 | 0.009971 | 26 | 0.322613 | 5.951707e-17 | 0.004464 | 25 | 0.322613 | 1.098683e-12 | 0.016618 | 21 | -0.43506 | 2.977263e-16 | 0.005864 |
| 5 | 8 | 0.792287 | 1.94786e-16 | 0.003454 | 8 | 0.792287 | 3.130972e-16 | 0.001355 | 7 | 0.792287 | 8.955017e-12 | 0.005535 | 7 | 0.792287 | 3.742284e-16 | 0.001846 |
| 6 | 25 | 0.322613 | 1.829969e-16 | 0.009742 | 25 | 0.322613 | 1.296329e-16 | 0.0037 | 24 | 0.322613 | 3.349069e-12 | 0.016323 | 24 | 0.407032 | 1.600465e-16 | 0.005131 |
| 7 | 11 | 0.322613 | 1.388463e-16 | 0.004737 | 11 | 0.322613 | 1.721516e-16 | 0.0018 | 10 | 0.322613 | 2.480609e-16 | 0.007372 | 10 | 0.322613 | 1.796554e-16 | 0.002702 |
| 8 | 30 | -0.407032 | 1.24514e-16 | 0.011734 | 30 | -0.407032 | 2.289835e-16 | 0.004573 | 29 | -0.407032 | 1.384971e-12 | 0.020231 | 16 | -1.043193 | 3.533389e-16 | 0.004434 |
| 9 | 5 | -6.752001 | 1.256074e-15 | 0.005227 | 5 | -6.752001 | 2.73755e-15 | 0.000897 | 4 | -6.752001 | 1.71995e-15 | 0.003451 | 4 | -6.752001 | 1.71995e-15 | 0.001315 |


We note that in the Python version the Schur form shows a larger improvement over the original $\ONCM$ than in the MATLAB version. It is possible that the tensor calculation in MATLAB is the bottleneck and overwhelms the effect of computing the projected Hessian.
Before moving to the complex version, we end the real version with a bigger test size, which takes a few minutes to finish:


In [33]:
np.random.seed(0)
k = 15
m = 4
max_err = 1e-10
max_itr = 200
n_test = 500

summ = test_eigen_tensor(k, m, max_err, max_itr, n_test)
display(HTML(summ[[a for a in summ.columns if 'time' in a]].describe().to_html()))

| | o_ncm_time | schur_time | ray_time | schur_cheb_time |
|---|---|---|---|---|
| count | 500.0 | 500.0 | 500.0 | 500.0 |
| mean | 0.04432 | 0.017383 | 0.065544 | 0.031082 |
| std | 0.035817 | 0.013257 | 0.052044 | 0.018585 |
| min | 0.004733 | 0.002229 | 0.00728 | 0.003196 |
| 25% | 0.015816 | 0.006304 | 0.023888 | 0.014142 |
| 50% | 0.032393 | 0.013341 | 0.049824 | 0.028473 |
| 75% | 0.062239 | 0.02387 | 0.087444 | 0.052412 |
| max | 0.167289 | 0.054603 | 0.198669 | 0.072326 |


# The Complex Eigentensor
The unitary version is a simple modification of the real version: to apply our framework, remember that $\blbd$ can be made real. Thus $\EL$ is of dimension $1$, and we need one constraint. Naturally we choose it to be $\bz^*\bz-1$. For the detailed derivations, the reader can consult the main paper. We demonstrate here how fast the algorithm solves for all eigenpairs of a tensor with around one thousand eigenpairs.
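
The claim that $\blbd$ can be made real follows from homogeneity: replacing $\bz$ by $e^{i\theta}\bz$ multiplies $\bz^*\bT(\bI,\bz,\cdots,\bz)$ by $e^{i(m-2)\theta}$, so a suitable phase rotation removes the imaginary part when $m \geq 3$. The sketch below checks this numerically on a random tensor and a random unit vector. The function name `rotate_to_real_quotient` and the test data are assumptions for illustration and are not part of the library.

```python
import numpy as np


def rotate_to_real_quotient(T, z):
    """Rotate the unit vector z by a phase so that z^* T(I, z, ..., z) is real."""
    m = T.ndim
    if m < 3:
        raise ValueError("the rotation argument needs m >= 3")

    def quotient(w):
        v = T
        for _ in range(m - 1):
            v = np.tensordot(v, w, axes=1)
        return np.vdot(w, v)   # w^* F(w); vdot conjugates its first argument

    lbd = quotient(z)
    theta = -np.angle(lbd) / (m - 2)
    z_rot = np.exp(1j * theta) * z
    return z_rot, quotient(z_rot), lbd


rng = np.random.default_rng(0)
T = rng.standard_normal((3, 3, 3, 3))            # hypothetical order-4 tensor
z = rng.standard_normal(3) + 1j * rng.standard_normal(3)
z /= np.linalg.norm(z)
z_rot, lbd_rot, lbd = rotate_to_real_quotient(T, z)
print("before:", lbd)
print("after: ", lbd_rot)
```

After the rotation the quotient equals $|\lambda|$, and the vector still satisfies the constraint $\bz^*\bz = 1$.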
First we import the required functions:


In [0]:
from __future__ import print_function
import numpy as np
import pandas as pd
import time

import lagrange_rayleigh.core.utils as utils
from lagrange_rayleigh.core.eigen_tensor_solver import\
        schur_form_rayleigh_chebyshev_unitary,\
        find_all_unitary_eigenpair



The main routine is schur_form_rayleigh_chebyshev_unitary, which has an option to run Chebyshev (do_chebyshev=True) instead of simple RQI.
We now do a quick comparison between the two versions:

In [39]:

def test_tensor_unitary_eigenpair():
    # output is the table of results
    # 2n*+2 columns: lbd, is real, real, complex eigenvalue
    from lagrange_rayleigh.core.eigen_tensor_solver import\
        schur_form_rayleigh_chebyshev_unitary

    n = 8
    m = 3
    tol = 1e-10
    max_itr = 200
    n_test = 1000

    A = utils.generate_symmetric_tensor(n, m)

    # n_eig = complex_eigen_cnt(n, m)

    # np.complex was removed from recent NumPy releases; use builtin complex
    su_x = np.zeros((n_test, n), dtype=complex)
    su_cnt = np.zeros(n_test, dtype=int)
    su_err = np.zeros(n_test)
    su_lbd = np.zeros(n_test)
    su_time = np.zeros(n_test)

    su_cheb_x = np.zeros((n_test, n), dtype=complex)
    su_cheb_cnt = np.zeros(n_test, dtype=int)
    su_cheb_err = np.zeros(n_test)
    su_cheb_lbd = np.zeros(n_test)
    su_cheb_time = np.zeros(n_test)

    for jj in range(n_test):
        x0r = np.random.randn(2*n)
        x0r /= np.linalg.norm(x0r)
        x0 = x0r[:n] + x0r[n:] * 1.j
        t_start = time.time()
        x, lbd, ctr, converge, err = schur_form_rayleigh_chebyshev_unitary(
            A, max_itr, tol, x_init=x0, do_chebyshev=False)
        
        t_end = time.time()
        # su_err[jj] = np.linalg.norm(
        # symmetric_tv_mode_product(
        # A, x, m-1) - lbd * x)
        su_err[jj] = err
        su_x[jj] = x
        su_cnt[jj] = ctr
        su_lbd[jj] = lbd
        su_time[jj] = t_end - t_start
        
        t_start = time.time()
        sc_x, sc_lbd, sc_ctr, converge, sc_err =\
            schur_form_rayleigh_chebyshev_unitary(
                A, max_itr, tol, x_init=x0, do_chebyshev=True)
        t_end = time.time()
        su_cheb_err[jj] = sc_err
        su_cheb_x[jj] = sc_x
        su_cheb_cnt[jj] = sc_ctr
        su_cheb_lbd[jj] = sc_lbd
        su_cheb_time[jj] = t_end - t_start

    summ = pd.DataFrame(
        {
            'su_time': su_time,
            'su_cnt': su_cnt,
            'su_lbd': su_lbd,
            'su_err': su_err,

            'su_cheb_time': su_cheb_time,
            'su_cheb_cnt': su_cheb_cnt,
            'su_cheb_lbd': su_cheb_lbd,
            'su_cheb_err': su_cheb_err

        },
        columns=['su_time', 'su_cnt', 'su_lbd', 'su_err',
                 'su_cheb_time', 'su_cheb_cnt', 'su_cheb_lbd', 'su_cheb_err'])
    # print(summ)
    print(summ[[a for a in summ.columns if 'err' in a]].describe())
    print(summ[[a for a in summ.columns if 'cnt' in a]].describe())
    print(summ[[a for a in summ.columns if 'lbd' in a]].describe())
    print(summ[[a for a in summ.columns if 'time' in a]].describe())
    return summ, su_x, su_cheb_x
summ, su_x, su_cheb_x = test_tensor_unitary_eigenpair()

  np.sum((x_k.conjugate() * lhs[:, 0]).real)) - lhs[:, 1]
  x_k_n = (x_k + y) / norm(x_k + y)


             su_err   su_cheb_err
count  9.990000e+02  9.990000e+02
mean   1.521712e-03  1.521712e-03
std    3.703065e-02  3.703065e-02
min    8.575923e-17  9.532846e-17
25%    2.636680e-16  2.736172e-16
50%    3.423915e-16  3.483988e-16
75%    4.348952e-16  4.325889e-16
max    1.088234e+00  1.088234e+00
           su_cnt  su_cheb_cnt
count  1000.00000  1000.000000
mean     17.52500    16.901000
std      12.41739    12.552082
min       6.00000     6.000000
25%      12.00000    11.000000
50%      15.00000    15.000000
75%      20.00000    19.000000
max     200.00000   200.000000
           su_lbd  su_cheb_lbd
count  999.000000   999.000000
mean    -0.028405    -0.029194
std      0.488015     0.489312
min     -1.569892    -1.569892
25%     -0.327341    -0.330662
50%     -0.032370    -0.037088
75%      0.296855     0.293093
max      1.569892     1.569892
           su_time  su_cheb_time
count  1000.000000   1000.000000
mean      0.003793      0.004919
std       0.002542      0.002790
min 

We see that in this case Chebyshev offers a small saving in iterations but underperforms in time. Readers can inspect the dataframe summ and the arrays su_x, su_cheb_x for more information on the eigenpairs. We move on to finding all eigenpairs next. First we define a function to run the search with given parameters:

In [0]:
def test_find_all_unitary(m, n, tol, max_itr):

    A = utils.generate_symmetric_tensor(n, m)

    # find from the beginning
    t_start = time.time()
    all_eig, n_runs = find_all_unitary_eigenpair(
        all_eig=None, eig_cnt=None, A=A, max_itr=max_itr, max_test=int(1e6), tol=tol)

    """continue finding more pairs
    all_eig, n_runs = find_all_unitary_eigenpair(
        all_eig, eig_cnt=None, A=A, max_itr=max_itr,
        max_test=int(1e6), tol=tol)
    """

    t_end = time.time()
    tot_time = t_end - t_start
    print('tot time %f avg=%f' % (tot_time, tot_time / all_eig.x.shape[0]))

    """
    np.savez_compressed('save_eigen_%d_%d.npz' % (
        n, m), A=A, lbd=all_eig.lbd,
                        x=all_eig.x, is_real=all_eig.is_real,
                        is_self_conj=all_eig.is_self_conj)
    """
    return all_eig, n_runs, A

In [51]:
np.random.seed(0)
n = 9
m = 3
tol = 1e-10
max_itr = 200
all_eig, n_runs, A = test_find_all_unitary(m, n, tol, max_itr)
print("number of runs required %d" % n_runs)
print("number of eigen pairs: %d" % all_eig.x.shape[0] )
print("first few values:")
print("lambda:")
print(all_eig.lbd[:10])
print("x:")
print(all_eig.x[:10])
print("is real:")
print(all_eig.is_real[:10])


Found 180 eigenpairs
Found 230 eigenpairs
Found 330 eigenpairs
Found 350 eigenpairs
Found 360 eigenpairs
Found 380 eigenpairs
Found 420 eigenpairs
Found 450 eigenpairs
Found 480 eigenpairs


  np.sum((x_k.conjugate() * lhs[:, 0]).real)) - lhs[:, 1]
  x_k_n = (x_k + y) / norm(x_k + y)


Found 510 eigenpairs
tot time 150.938395 avg=0.295378
number of runs required 33995
number of eigen pairs: 511
first few values:
lambda:
[0.2513244  0.2513244  0.18866206 0.18866206 0.76006037 0.54851984
 0.54851984 0.13505906 0.13505906 0.34017126]
x:
[[-0.10676053+0.09695859j -0.43619278+0.21792098j  0.72428733+0.01278017j
  -0.12202021-0.08297018j  0.07024636-0.03381531j -0.14873212+0.1418052j
   0.30056491+0.09156707j  0.01055348-0.03663936j -0.18206366-0.11526622j]
 [-0.10676053-0.09695859j -0.43619278-0.21792098j  0.72428733-0.01278017j
  -0.12202021+0.08297018j  0.07024636+0.03381531j -0.14873212-0.1418052j
   0.30056491-0.09156707j  0.01055348+0.03663936j -0.18206366+0.11526622j]
 [-0.16927686-0.03780571j -0.28168585-0.21766014j -0.12368132+0.26405896j
   0.57685995-0.17683274j -0.0795599 +0.2850413j   0.01711769+0.10566102j
   0.36175606-0.03170698j -0.04624686+0.04025455j -0.32074923+0.23787426j]
 [-0.16927686+0.03780571j -0.28168585+0.21766014j -0.12368132-0.26405896j
   0.5

As we can see, the main return value is all_eig, which keeps the eigenvectors and eigenvalues, plus two flags: is_real tells whether an eigenvector is real, and is_self_conj is a self-conjugate flag that is only for experimental use. The reader can see that we compute around 90% of the eigenpairs quite quickly; the last 10% typically take much longer. However, for the tensors that we have seen in the eigenvalue literature we are able to find all eigenpairs in around 15 minutes.

The next cell defines a function that checks that the computed pairs are actually eigenpairs, that the real flag is correct, and that there are no duplicates. We also check for multiple eigenvectors, the phenomenon where several eigenvectors may correspond to one eigenvalue.
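
Duplicates have to be understood up to a phase factor: if $(\lambda, \bz)$ is an eigenpair with real $\lambda$ and $c^{m-2} = \pm 1$, then $(c^{m-2}\lambda, c\bz)$ is an eigenpair with real eigenvalue as well. The helper below is a sketch of such an equivalence test under this reading. The name `equivalent_up_to_phase` is hypothetical, and it is not the routine used in the notebook cell that follows.

```python
import numpy as np


def equivalent_up_to_phase(z1, z2, m, tol=1e-8):
    """True if z2 = c * z1 for some c with c**(m-2) equal to +1 or -1 (m >= 3)."""
    k = 2 * (m - 2)
    # all unit complex numbers c with c**(m-2) = +1 or -1
    factors = np.exp(1j * np.pi * np.arange(k) / (m - 2))
    return any(np.linalg.norm(z2 - c * z1) < tol for c in factors)


rng = np.random.default_rng(0)
z1 = rng.standard_normal(4) + 1j * rng.standard_normal(4)
z1 /= np.linalg.norm(z1)

print(equivalent_up_to_phase(z1, 1j * z1, m=4))                       # True
print(equivalent_up_to_phase(z1, np.exp(0.25j * np.pi) * z1, m=4))    # False
print(equivalent_up_to_phase(z1, -z1, m=3))                           # True
```

For $m = 3$ the only factors are $\pm 1$, so $(\lambda, \bz)$ and $(-\lambda, -\bz)$ count as the same pair.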

In [53]:
def check_eigen(all_eig, A, tol):
    m = len(A.shape)
    good = True
    for i in range(all_eig.lbd.shape[0]):
        # first check eigen works
        err = np.sum(np.abs(
            symmetric_tv_mode_product(
                A, all_eig.x[i], m-1) - all_eig.lbd[i] * all_eig.x[i]))
        if err > tol:
            print("bad entry i=%d lbd=%f z=%s" % (
                i, all_eig.lbd[i], str(all_eig.x[i])))
            good = False
    if good:
        print("checked! They are all eigenvectors")

    neg_factor = np.exp(np.pi/(m-2)*1j)
    neg_factors = np.power(neg_factor, np.arange(m-2))
    # second check real
    good = True
    for i in range(all_eig.x.shape[0]):
        err = np.sum(np.abs(
            symmetric_tv_mode_product(
                A, all_eig.x[i].real, m-1) -
            all_eig.lbd[i] * all_eig.x[i].real))
        if (err < tol) != (all_eig.is_real[i]):
            print("bad real i=%d lbd=%f z=%s" % (
                i, all_eig.lbd[i], str(all_eig.x[i])))
            good = False
    if good:
        print("checked! The real flags are correct")

    # number of real eigen pairs:
    print("number of real=%d total=%d " % (np.where(
        all_eig.is_real)[0].shape[0],
        all_eig.x.shape[0]))
    good = True
    for i in range(all_eig.x.shape[0]):
        # third check no duplicate
        match_lbd = np.where(
            np.abs(np.abs(
                all_eig.lbd[:]) - np.abs(all_eig.lbd[i])) < tol)[0]
        match_other = match_lbd[match_lbd != i]
        if match_other.shape[0] == 0:
            continue

        for jj in range(m-2):
            dup = np.where(np.abs(
                all_eig.x[match_other] -
                all_eig.x[i] * neg_factors[jj]) < tol)[0]
            if dup.shape[0] > 0:
                print("bad dup i=%d lbd=%f z=%s, jj=%d" % (
                    i, all_eig.lbd[i], str(all_eig.x[i]), jj))
                good = False
    if good:
        print("checked! no duplicate")
           
    # check if we have multiple eigenvector again:
    rnk = np.vectorize(lambda a: '%.6f' % a)(all_eig.lbd)
    u, cnts = np.unique(rnk, return_counts=True)
    mult_vector = False
    for iu in range(len(u)):
        if cnts[iu] > 2:
            mtc = np.where(rnk == u[iu])[0]
            print("multiple eigenvectors found:")
            mult_vector = True
            print(mtc)
            print(all_eig.x[mtc])
            print(all_eig.lbd[mtc])
    if not mult_vector:
        print("No multiple eigenvector found")
    else:
        print("Found multiple eigenvector")
check_eigen(all_eig, A, tol)

checked! They are all eigenvectors
checked! The real flags are correct
number of real=79 total=511 
checked! no duplicate
No multiple eigenvector found
