# CPD Options

The *cpd* function has several options available. Some of them may improve performance or precision, and others give insight into the tensor at hand. If you look at the source code, the first line of *cpd* is the following:

>def cpd(T, r, maxiter=200, tol=1e-12, maxiter_refine=200, tol_refine=1e-10, init='random', trunc_dims=0, level=1, symm=False, display=0):

We will go through these parameters now. Let's start by importing the necessary modules and creating the same tensor used in the previous notebook.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import TensorFox as tfx

In [2]:
# Create and print the tensor.
m = 2
T = np.zeros((m, m, m))
for i in range(m):
    for j in range(m):
        for k in range(m):
            T[i,j,k] = i+j+k
            
Tsize = np.linalg.norm(T)            
tfx.showtens(T)

[[0. 1.]
 [1. 2.]]

[[1. 2.]
 [2. 3.]]



# The *Display* Option

There are four choices for the *display* option: $0, 1, 2, 3$. This option controls what the user sees during the computation. Previously we used the default, $0$, which produces no output at all. Option $1$ shows useful information about the main stages of the computation. Option $2$ shows everything option $1$ shows plus information about each iteration. Option $3$ is special: it shows everything option $2$ shows and also the relative error of the compressed tensor. Computing this error is costly, so avoid this option for big tensors.

In [3]:
# Compute the CPD of T with partial display.
r = 2
Lambda, X, Y, Z, T_approx, info = tfx.cpd(T, r, display=1)

--------------------------------------------------------------------------------------------------------------
Computing HOSVD of T
    No compression detected
    Working with dimensions (2, 2, 2)
--------------------------------------------------------------------------------------------------------------
Type of initialization: smart random
--------------------------------------------------------------------------------------------------------------
Computing CPD of T
--------------------------------------------------------------------------------------------------------------
Computing refinement of solution
Final results
    Number of steps = 29
    Relative error = 0.00040529188420815836
    Accuracy =  99.96 %


In [4]:
# Compute the CPD of T with full display.
Lambda, X, Y, Z, T_approx, info = tfx.cpd(T, r, display=2)

--------------------------------------------------------------------------------------------------------------
Computing HOSVD of T
    No compression detected
    Working with dimensions (2, 2, 2)
--------------------------------------------------------------------------------------------------------------
Type of initialization: smart random
--------------------------------------------------------------------------------------------------------------
Computing CPD of T
    Iteration | Rel Error  | Rel Error Diff |     ||g||    | Damp| #CG iterations
        1     |  0.680649  |        -       |  4.762203  | 2.0 |    0
        2     |  0.438175  |    0.242474    |  3.924482  | 2.0 |    2
        3     |  0.220500  |    0.217675    |  3.097812  | 1.0 |    2
        4     |  0.156636  |    0.063864    |  1.392917  | 1.5 |    2
        5     |  0.118028  |    0.038608    |  0.700328  | 0.75 |    2
        6     |  0.104628  |    0.013399    |  0.287187  | 0.375 |    2
        7     |  0.

In [5]:
# Compute the CPD of T with full display plus intermediate relative errors.
Lambda, X, Y, Z, T_approx, info = tfx.cpd(T, r, display=3)

--------------------------------------------------------------------------------------------------------------
Computing HOSVD of T
    No compression detected
    Working with dimensions (2, 2, 2)
    Compression relative error = 0.0
--------------------------------------------------------------------------------------------------------------
Type of initialization: smart random
    Initial guess relative error = 0.91034
--------------------------------------------------------------------------------------------------------------
Computing CPD of T
    Iteration | Rel Error  | Rel Error Diff |     ||g||    | Damp| #CG iterations
        1     |  0.711901  |        -       |  4.762203  | 2.0 |    1
        2     |  0.394365  |    0.317536    |  4.454366  | 2.0 |    2
        3     |  0.208961  |    0.185404    |  3.445028  | 2.0 |    2
        4     |  0.152590  |    0.056371    |  1.741205  | 3.0 |    3
        5     |  0.117753  |    0.034837    |  0.982957  | 1.5 |    3
        6   

# Initialization

The iteration process needs a starting point. This starting point depends on the *init* option, and there are three possible choices: *smart_random* (default), *random*, and *user*. The *smart_random* option generates a random CPD of rank $r$ with an original strategy designed to give the starting point a small relative error, so that it is already close to the objective tensor. The *random* option generates a CPD of rank $r$ with entries drawn from the normal distribution. The relative error in this case is usually close to $1$. Finally, with the *user* option the user provides a list $[X, Y, Z]$ as the starting point, passed directly as the value of *init*.
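To make the *random* option concrete, here is a minimal NumPy sketch. It is not Tensor Fox's own code. It assumes that a rank-$r$ CPD with factor matrices $X, Y, Z$ represents the sum of the outer products of their columns (no separate weights), and the helper names `cpd_to_tensor` and `relative_error` are hypothetical.

```python
import numpy as np

def cpd_to_tensor(X, Y, Z):
    # Sum over l of the outer products X[:, l] o Y[:, l] o Z[:, l].
    return np.einsum('il,jl,kl->ijk', X, Y, Z)

def relative_error(T, T_approx):
    # Frobenius-norm error relative to the size of T.
    return np.linalg.norm(T - T_approx) / np.linalg.norm(T)

# Same tensor as in the notebook: T[i,j,k] = i + j + k.
m, r = 2, 2
i, j, k = np.indices((m, m, m))
T = (i + j + k).astype(float)

# Random starting point with entries drawn from the normal distribution.
rng = np.random.default_rng(0)
X0, Y0, Z0 = (rng.standard_normal((m, r)) for _ in range(3))
print(relative_error(T, cpd_to_tensor(X0, Y0, Z0)))

# For reference, all-zero factors give a relative error of exactly 1.
zero = np.zeros((m, r))
print(relative_error(T, cpd_to_tensor(zero, zero, zero)))
```

The value printed for the random start changes with the seed; the zero start is only a reference point for what a relative error of $1$ means.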

In [6]:
# Compute the CPD of T with random initialization.
Lambda, X, Y, Z, T_approx, info = tfx.cpd(T, r, init='random', display=1)

--------------------------------------------------------------------------------------------------------------
Computing HOSVD of T
    No compression detected
    Working with dimensions (2, 2, 2)
--------------------------------------------------------------------------------------------------------------
Type of initialization: random
--------------------------------------------------------------------------------------------------------------
Computing CPD of T
--------------------------------------------------------------------------------------------------------------
Computing refinement of solution
Final results
    Number of steps = 51
    Relative error = 7.009868308962553e-05
    Accuracy =  99.99 %


In [7]:
# Compute the CPD of T with user initialization.
X = np.ones((m, r))
Y = np.ones((m, r))
Z = np.ones((m, r))
Lambda, X, Y, Z, T_approx, info = tfx.cpd(T, r, init=[X,Y,Z], display=1)

--------------------------------------------------------------------------------------------------------------
Computing HOSVD of T
    No compression detected
    Working with dimensions (2, 2, 2)
--------------------------------------------------------------------------------------------------------------
Type of initialization: user
--------------------------------------------------------------------------------------------------------------
Computing CPD of T
--------------------------------------------------------------------------------------------------------------
Computing refinement of solution
Final results
    Number of steps = 13
    Relative error = 0.1420877165623935
    Accuracy =  85.79 %


# *Maxiter* and *Tol*

As the names suggest, *maxiter* is the maximum number of iterations permitted, while *tol* is the tolerance parameter, which provides the stopping criterion. The two values are related: when we decrease *tol* we should usually increase *maxiter*. Notice that these parameters exist for both the main stage (*maxiter*, *tol*) and the refinement stage (*maxiter_refine*, *tol_refine*). Changing these parameters may not matter much in this small example, but for larger tensors we may want to increase precision by decreasing *tol*, for instance.

Let's decrease *tol* and see if we get better approximations for the CPD. We will use *tol* = 1e-16 and keep the rest with default values.

In [8]:
# Compute the CPD of T with tol = 1e-16.
Lambda, X, Y, Z, T_approx, info = tfx.cpd(T, r, tol=1e-16, display=1)

--------------------------------------------------------------------------------------------------------------
Computing HOSVD of T
    No compression detected
    Working with dimensions (2, 2, 2)
--------------------------------------------------------------------------------------------------------------
Type of initialization: smart random
--------------------------------------------------------------------------------------------------------------
Computing CPD of T
--------------------------------------------------------------------------------------------------------------
Computing refinement of solution
Final results
    Number of steps = 51
    Relative error = 5.457725867301445e-05
    Accuracy =  99.99 %


The relative error decreased only a little. This run reports about $5.46 \cdot 10^{-5}$, while the default-tolerance runs above reported about $4.05 \cdot 10^{-4}$ (In [3]) and $7.01 \cdot 10^{-5}$ (In [6]). Each run starts from a different random initialization, so these figures vary from run to run by a similar amount. This indicates that the default tolerance is already good enough for this problem.

Sometimes the tolerance parameter may not behave as expected. Decreasing this value makes the algorithm perform more iterations, but it also makes it follow a different path in the space of tensors. This path is sometimes worse, and then the user may get worse results. This is just bad luck, and in this case we can increase *maxiter* or simply repeat the computation (which generates another initialization, maybe a better one).
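The idea of repeating the computation and keeping the best result can be sketched as follows. This is only an illustration: `best_of_restarts` and `noisy_solver` are hypothetical names, and the stand-in solver simply returns $T$ plus random noise so that the example runs without Tensor Fox.

```python
import numpy as np

def best_of_restarts(solve, T, restarts=5):
    # Call solve(T) several times and keep the approximation with the
    # smallest relative error.
    best_error, best_approx = np.inf, None
    errors = []
    for _ in range(restarts):
        T_approx = solve(T)
        error = np.linalg.norm(T - T_approx) / np.linalg.norm(T)
        errors.append(error)
        if error < best_error:
            best_error, best_approx = error, T_approx
    return best_error, best_approx, errors

# Stand-in solver used only to make the sketch runnable: T plus random noise.
rng = np.random.default_rng(0)
def noisy_solver(T):
    return T + 0.01 * rng.standard_normal(T.shape)

# Same tensor as in the notebook: T[i,j,k] = i + j + k.
i, j, k = np.indices((2, 2, 2))
T = (i + j + k).astype(float)

best_error, best_approx, errors = best_of_restarts(noisy_solver, T)
print(best_error, errors)
```

With Tensor Fox, `solve` could be something like `lambda T: tfx.cpd(T, r)[4]`, assuming the return order used in the cells above, where `T_approx` is the fifth returned value.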

# *Level* and *Trunc_dims*

Consider a matrix $A \in \mathbb{R}^{m \times n}$ with $m \geq n$ and its reduced SVD

$$A = U \Sigma V^T = [U_1 \ldots U_n] \cdot \text{diag}(\sigma_1, \ldots, \sigma_n) \cdot [V_1 \ldots V_n]^T.$$

It is common to truncate $\Sigma$ in order to obtain the *truncated SVD* of $A$, given by

$$\tilde{A} = [U_1 \ldots U_p] \cdot \text{diag}(\sigma_1, \ldots, \sigma_p) \cdot [V_1 \ldots V_p]^T,$$
where $p < n$. This truncated version of $A$ should be seen as a compressed version of $A$. We always work with compressed versions here.

There are several applications of this decomposition that we won't discuss here. We just want to mention that the sum $\sigma_1^2 + \ldots + \sigma_p^2$ is called the *energy* of $\tilde{A}$. The more energy the truncation has, the closer it is to $A$. On the other hand, less energy means fewer dimensions to take into account, which translates to less computational time. As you can see, there is a trade-off between proximity and dimensionality. We want to truncate as much as possible while keeping the truncation close enough to $A$.
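The relation between energy and proximity can be checked numerically. The following NumPy sketch uses a random matrix chosen only for illustration. It relies on the standard fact that the squared Frobenius error of the truncated SVD equals the discarded energy $\sigma_{p+1}^2 + \ldots + \sigma_n^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))              # m = 6, n = 4

# Reduced SVD: U is 6 x 4, s has 4 singular values, Vt is 4 x 4.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Truncated SVD keeping the p largest singular values.
p = 2
A_trunc = U[:, :p] @ np.diag(s[:p]) @ Vt[:p, :]

energy = np.sum(s[:p] ** 2)                  # energy of the truncation
total_energy = np.sum(s ** 2)                # equals ||A||_F^2
discarded = np.sum(s[p:] ** 2)
error_sq = np.linalg.norm(A - A_trunc) ** 2  # squared Frobenius error

print(energy / total_energy)                 # fraction of energy kept
print(np.isclose(error_sq, discarded))
```

The fraction `energy / total_energy` is a convenient way to decide how small $p$ can be while staying close to $A$.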

In the same way we can compress matrices using the SVD, we can compress tensors using the HOSVD. The parameter *level* imposes threshold values at the compression stage of $T$, and it can be 0, 1, 2, 3 or 4. A higher level means a stricter constraint: the larger this value, the bigger the energy threshold required to stop truncating, so the truncated tensor keeps bigger dimensions. Small *level* values therefore give small truncated tensors, and big *level* values give bigger ones. In particular, *level* $= 4$ means no truncation at all. The default is *level* $= 1$.

For each choice of *level* the program constructs a truncation automatically, and it may happen that no truncation performs well. In this case the user can choose the truncation manually. For a tensor of dimensions $m \times n \times p$, just set *trunc_dims* = $[m', n', p']$ with

$$2 \leq m' \leq m,\quad 2 \leq n' \leq n,\quad 2 \leq p' \leq p.$$

The truncated tensor obtained is of dimension $m' \times n' \times p'$.
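A manual truncation can be pictured with the following NumPy sketch of a truncated HOSVD. It is a textbook construction under stated assumptions, not Tensor Fox's implementation, and the names `unfold`, `truncated_hosvd` and `expand` are hypothetical. The test tensor is the one from this notebook with $m = 4$, whose unfoldings all have rank $2$, so truncating to $2 \times 2 \times 2$ loses essentially nothing.

```python
import numpy as np

def unfold(T, mode):
    # Mode unfolding: move the chosen axis to the front, flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def truncated_hosvd(T, trunc_dims):
    # Keep the leading left singular vectors of each unfolding.
    factors = []
    for mode, dim in enumerate(trunc_dims):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :dim])
    U1, U2, U3 = factors
    # Core tensor: T multiplied by the transposed factors in each mode.
    S = np.einsum('ijk,ia,jb,kc->abc', T, U1, U2, U3)
    return S, factors

def expand(S, factors):
    # Map the small core tensor back to the original dimensions.
    U1, U2, U3 = factors
    return np.einsum('abc,ia,jb,kc->ijk', S, U1, U2, U3)

m = 4
i, j, k = np.indices((m, m, m))
T = (i + j + k).astype(float)

S, factors = truncated_hosvd(T, [2, 2, 2])
rel_error = np.linalg.norm(T - expand(S, factors)) / np.linalg.norm(T)
print(S.shape, rel_error)
```

The CPD would then be computed on the small core tensor `S` instead of on `T`, which is where the savings in computational time come from.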

# Symmetric Tensors

If one wants to work with symmetric tensors, just set *symm = True*. With this option activated, the initialization and all iterations of the dGN function are done with symmetric tensors. At each iteration of the dGN function we have an approximate CPD given by the triple $X, Y, Z$, which is obtained with the conjugate gradient method. The next step is to set

$$X = \frac{X+Y+Z}{3},\quad Y = X,\quad Z = X.$$

If the objective tensor is really symmetric, then this procedure converges. Otherwise it can diverge.
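The symmetrization step can be illustrated in isolation. The sketch below applies it to random factor matrices and checks that the resulting tensor is symmetric. It assumes the CPD is the sum of the outer products of the columns of $X, Y, Z$, and it does not reproduce the dGN iterations themselves.

```python
import numpy as np

rng = np.random.default_rng(0)
m, r = 3, 2
X, Y, Z = (rng.standard_normal((m, r)) for _ in range(3))

# Symmetrization step: replace the three factors by their average.
X = (X + Y + Z) / 3
Y = X
Z = X

# Tensor represented by the symmetrized factors.
T_sym = np.einsum('il,jl,kl->ijk', X, Y, Z)

# A symmetric tensor is unchanged when any two indices are swapped.
print(np.allclose(T_sym, T_sym.transpose(1, 0, 2)))
print(np.allclose(T_sym, T_sym.transpose(0, 2, 1)))
print(np.allclose(T_sym, T_sym.transpose(2, 1, 0)))
```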