# Federal University of Ceará
# Teleinformatics Department
# Graduate Program in Teleinformatics Engineering
## TIP8419 - Tensor Algebra
## Homework 3 - Least-Squares Khatri-Rao Factorization (LSKRF)
### Report and Simulation results

- Ezequias Márcio - 497779

To run this notebook properly, Python 3 must be installed along with the packages listed below:

- `numpy 1.17.2`
- `scipy 1.4.1`
- `tqdm 4.36.1`
- `bokeh 1.3.4`

Make sure that the files `tensoralg.py` and `ta_simulations.py` are in the same directory as this notebook. These files contain the tensor algebra module functions and the code listings of the simulations.

In [1]:
# Importing the simulation module:
from ta_simulations import *
np.set_printoptions(3, linewidth=175)
output_notebook()

### Part 1

- Generate $\mathbf{X} = \mathbf{A}\diamond \mathbf{B} \in \mathbb{C}^{24×2}$, for randomly chosen $\mathbf{A} \in \mathbb{C}^{4×2}$ and $\mathbf{B} \in \mathbb{C}^{6×2}$. Then, implement the Least-Squares Khatri-Rao Factorization (LSKRF) algorithm that estimates $\mathbf{A}$ and $\mathbf{B}$ by solving the following problem:

\begin{equation}
    (\hat{\mathbf{A}},\hat{\mathbf{B}}) = \underset{\mathbf{A},\mathbf{B}}{\arg\min} \, ||\mathbf{X} - \mathbf{A}\diamond \mathbf{B}|| ^{2}_{F}
\end{equation}

- Compare the estimated matrices $\hat{\mathbf{A}}$ and $\hat{\mathbf{B}}$ with the original ones. What can you conclude? Explain the results.

### Solution: 

- Testing the implemented LSKRF function:

In the cell below, the matrices $\mathbf{A} \in \mathbb{C}^{4×2}$ and $\mathbf{B} \in \mathbb{C}^{6×2}$ are randomly generated, and the squared error between these matrices and their estimates is computed for comparison.

In [2]:
# Testing the Least-Squares Khatri-Rao Factorization:
# Generating random matrices: A (4x2) and B (6x2)
A = rand(4, 2*2).view(np.complex_)
B = rand(6, 2*2).view(np.complex_)
X = tensoralg.kr(A, B)
# Estimating matrices A and B:
A_hat, B_hat = tensoralg.lskrf(X, A.shape[0], B.shape[0])
# Calculating the squared error
errorA = norm(A - A_hat, 'fro')**2
errorB = norm(B - B_hat, 'fro')**2
errorX = norm(X - tensoralg.kr(A_hat, B_hat), 'fro')**2
print(f'''Squared Errors:\n- Matrix A: {errorA}\n- Matrix B: {errorB}
- Matrix X: {errorX}''')

Squared Errors:
- Matrix A: 15.157025798631754
- Matrix B: 20.908533266091062
- Matrix X: 1.0233910241987675e-30


As the result above shows, the error between the estimated matrices and the original ones is high. However, the error between the reconstructed matrix $\hat{\mathbf{A}}\diamond \hat{\mathbf{B}}$ and the original $\mathbf{X}$ is at the level of machine precision, and this reconstruction error is the quantity that the LSKRF algorithm minimizes.
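
One way to see how the factor errors can be large while the reconstruction error is negligible is to rescale the columns of a factor pair by hand. The sketch below is only an illustration and does not use `tensoralg`: the helper `khatri_rao` and the scaling values are assumptions made for this example. It multiplies each column of $\mathbf{A}$ by a complex scalar, divides the matching column of $\mathbf{B}$ by the same scalar, and compares the two Khatri-Rao products.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x R) and B (J x R).

    Hypothetical helper for this sketch, not the notebook's tensoralg.kr.
    """
    I, R = A.shape
    J, _ = B.shape
    # Entry (i*J + j, r) of the result equals A[i, r] * B[j, r].
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
B = rng.standard_normal((6, 2)) + 1j * rng.standard_normal((6, 2))

# Arbitrary nonzero complex scalings, one per column (illustrative values).
scale = np.array([2.0 - 1.0j, 0.5j])
A_scaled = A * scale
B_scaled = B / scale

X = khatri_rao(A, B)
X_scaled = khatri_rao(A_scaled, B_scaled)

# The factors differ, yet both pairs produce the same X.
print('Squared error on A:', np.linalg.norm(A - A_scaled) ** 2)
print('Squared error on X:', np.linalg.norm(X - X_scaled) ** 2)
```

Both factor pairs fit $\mathbf{X}$ equally well, so a criterion based only on $\mathbf{X}$ cannot tell them apart.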

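The purpose of the LSKRF algorithm is to find the matrices $\hat{\mathbf{A}}$ and $\hat{\mathbf{B}}$ that minimize the squared error between the original matrix and its reconstructed version. It works column by column: the $r$-th column of $\mathbf{X}$ is $\mathbf{a}_r \otimes \mathbf{b}_r$, which can be rearranged into the rank-1 matrix $\mathbf{b}_r \mathbf{a}_r^{T}$, and the $r$-th columns of $\hat{\mathbf{A}}$ and $\hat{\mathbf{B}}$ are built from the dominant singular value and singular vectors of that matrix, that is, from its best rank-1 approximation. In the noiseless case this truncated SVD is exact, which explains the negligible error on $\mathbf{X}$. The factors themselves, however, can only be recovered up to a complex scaling of each column, since $(\alpha \mathbf{a}_r) \otimes (\alpha^{-1} \mathbf{b}_r) = \mathbf{a}_r \otimes \mathbf{b}_r$ for any $\alpha \neq 0$. The SVD fixes this scaling arbitrarily, so the estimated matrices are not close to the original ones, as evidenced by the high error values above.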
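
The column-wise rank-1 procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the `tensoralg.lskrf` implementation: the names `khatri_rao` and `lskrf_sketch` are hypothetical, and the square root of the singular value is split evenly between the two factors, which is only one of many valid scaling choices.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product (hypothetical helper for this sketch)."""
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

def lskrf_sketch(X, I, J):
    """Estimate A (I x R) and B (J x R) such that X ~ khatri_rao(A, B)."""
    R = X.shape[1]
    A_hat = np.zeros((I, R), dtype=complex)
    B_hat = np.zeros((J, R), dtype=complex)
    for r in range(R):
        # Row-major reshape gives M[i, j] = a_r[i] * b_r[j], i.e. M = a_r b_r^T.
        # (A column-major reshape would give the transposed matrix b_r a_r^T.)
        M = X[:, r].reshape(I, J)
        U, s, Vh = np.linalg.svd(M, full_matrices=False)
        # Best rank-1 approximation: s[0] * outer(U[:, 0], Vh[0, :]).
        A_hat[:, r] = np.sqrt(s[0]) * U[:, 0]
        B_hat[:, r] = np.sqrt(s[0]) * Vh[0, :]
    return A_hat, B_hat

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
B = rng.standard_normal((6, 2)) + 1j * rng.standard_normal((6, 2))
X = khatri_rao(A, B)

A_hat, B_hat = lskrf_sketch(X, 4, 6)
X_hat = khatri_rao(A_hat, B_hat)
print('Squared error on X:', np.linalg.norm(X - X_hat) ** 2)
```

Each estimated column is expected to be collinear with the true one, so dividing every column by, for example, its first entry would be one way to compare the factors with the scaling removed.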

- Validating the results:

Using the provided matrices, it can be seen below that the implemented algorithm is working as expected, presenting a small error for the reconstructed matrix.

In [3]:
# Validating the results with the matrices provided:
# Loading the given .mat file as a Python dictionary:
krf_data = loadmat('m-files/krf_matrix.mat')
# Extracting data:
realA = np.array(krf_data['A'])
realB = np.array(krf_data['B'])
realX = np.array(krf_data['X'])
# print(realA, realB)
# Estimating matrices A and B:
Ahat, Bhat = tensoralg.lskrf(realX, realA.shape[0], realB.shape[0])
# Calculating the squared error
A_error = norm(realA - Ahat, 'fro')**2
B_error = norm(realB - Bhat, 'fro')**2
X_error = norm(realX - tensoralg.kr(Ahat, Bhat), 'fro')**2
print(f'''Squared Errors:\n- Matrix A: {A_error}\n- Matrix B: {B_error}
- Matrix X: {X_error}''')

Squared Errors:
- Matrix A: 64.12906997763164
- Matrix B: 59.31911461019167
- Matrix X: 1.7749677053948895e-30


### Part 2: 
- Assuming 1000 Monte Carlo experiments, generate $\mathbf{X}_{0} = \mathbf{A}\diamond \mathbf{B} \in \mathbb{C}^{IJ×R}$, for randomly chosen $\mathbf{A} \in \mathbb{C}^{I×R}$ and $\mathbf{B} \in \mathbb{C}^{J×R}$, with $R = 4$, whose elements are drawn from a normal distribution. Let $\mathbf{X} = \mathbf{X}_{0} + \alpha \mathbf{V}$ be a noisy version of $\mathbf{X}_{0}$ where $\mathbf{V}$ is the additive noise term, whose elements are drawn from a normal distribution. The parameter $\alpha$ controls the power (variance) of the noise term, and is defined as a function of the signal to noise ratio (SNR), in dB, as follows:
  \begin{equation}
      SNR_{dB} = 10 \log_{10} \frac{||\mathbf{X}_{0}||^{2}_{F}}{||\alpha \mathbf{V}||^{2}_{F}}
  \end{equation}

- Assuming the SNR range $[0, 5, 10, 15, 20, 25, 30]$ dB, find the estimates $\hat{\mathbf{A}}$ and $\hat{\mathbf{B}}$ obtained with the LSKRF algorithm for the configurations $(I, J) = (10, 10)$ and $(I, J) = (30, 10)$. Let us define the normalized mean square error (NMSE) measure as follows:
  \begin{equation}
      \text{NMSE}(\mathbf{X}_{0}) = \frac{1}{1000} \sum^{1000}_{i = 1} \frac{||\hat{\mathbf{X}}_{0}(i) - \mathbf{X}_{0}(i)||^{2}_{F}}{||\mathbf{X}_{0}(i)||^{2}_{F}}
  \end{equation}
  where $\mathbf{X}_{0}(i)$ and $\hat{\mathbf{X}}_{0}(i)$ represent the original data matrix and the reconstructed one at the $i$-th experiment, respectively. For each SNR value and configuration, plot the NMSE vs. SNR curve. Discuss the obtained results.
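
Solving the SNR definition above for $\alpha$ gives $\alpha = ||\mathbf{X}_{0}||_{F} / (||\mathbf{V}||_{F} \, 10^{SNR_{dB}/20})$. The sketch below applies that expression; the function name `add_noise` and the use of circular complex Gaussian noise are assumptions of this example, and the noise generation in `ta_simulations.py` may differ.

```python
import numpy as np

def add_noise(X0, snr_db, rng):
    """Return X0 + alpha * V with alpha chosen to meet the requested SNR (dB).

    Hypothetical helper; V is assumed to be complex Gaussian noise.
    """
    V = (rng.standard_normal(X0.shape)
         + 1j * rng.standard_normal(X0.shape)) / np.sqrt(2)
    # From SNR_dB = 10 log10(||X0||_F^2 / ||alpha V||_F^2):
    alpha = np.linalg.norm(X0) / (np.linalg.norm(V) * 10 ** (snr_db / 20))
    return X0 + alpha * V

rng = np.random.default_rng(1)
X0 = rng.standard_normal((100, 4)) + 1j * rng.standard_normal((100, 4))
X = add_noise(X0, 15.0, rng)

# Measure the SNR actually obtained from the noisy matrix.
measured = 10 * np.log10(np.linalg.norm(X0) ** 2 / np.linalg.norm(X - X0) ** 2)
print('Measured SNR (dB):', measured)
```

Because $\alpha$ is computed from the realized norms of $\mathbf{X}_{0}$ and $\mathbf{V}$, the requested SNR is met in every realization rather than only on average.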

### Solution:

- Monte Carlo Simulation

The simulation results are generated in the cell below. For each of the 1000 realizations, the matrix $\mathbf{X}_{0}(i)$ is generated and then, for each SNR value, Gaussian noise is added to it to form the input of the LSKRF algorithm. The estimated matrices $\hat{\mathbf{A}}$ and $\hat{\mathbf{B}}$ are then used to build the estimate $\hat{\mathbf{X}}_{0}(i)$. Lastly, the normalized squared error between the estimated matrix and the original one is stored, so that the NMSE can be averaged over the realizations. This process is implemented in the function `run_simulation_lskr`.
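
The loop described above might look roughly like the sketch below. It is not the code of `run_simulation_lskr`: the names `khatri_rao`, `rank1_factors`, `complex_normal`, and `nmse_curve` are hypothetical, the noise is assumed to be complex Gaussian, and only 50 realizations are used so that the example runs quickly.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product (hypothetical helper)."""
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

def rank1_factors(X, I, J):
    """Column-wise rank-1 SVD factorization (sketch of the LSKRF idea)."""
    R = X.shape[1]
    A_hat = np.zeros((I, R), dtype=complex)
    B_hat = np.zeros((J, R), dtype=complex)
    for r in range(R):
        U, s, Vh = np.linalg.svd(X[:, r].reshape(I, J), full_matrices=False)
        A_hat[:, r] = np.sqrt(s[0]) * U[:, 0]
        B_hat[:, r] = np.sqrt(s[0]) * Vh[0, :]
    return A_hat, B_hat

def complex_normal(rng, shape):
    """Complex Gaussian samples (an assumption of this sketch)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def nmse_curve(snr_db_values, n_runs, I, J, R, seed=0):
    """Average normalized squared error of the reconstruction per SNR value."""
    rng = np.random.default_rng(seed)
    nmse = np.zeros(len(snr_db_values))
    for _ in range(n_runs):
        A = complex_normal(rng, (I, R))
        B = complex_normal(rng, (J, R))
        X0 = khatri_rao(A, B)
        for k, snr_db in enumerate(snr_db_values):
            V = complex_normal(rng, X0.shape)
            alpha = np.linalg.norm(X0) / (np.linalg.norm(V) * 10 ** (snr_db / 20))
            A_hat, B_hat = rank1_factors(X0 + alpha * V, I, J)
            X0_hat = khatri_rao(A_hat, B_hat)
            nmse[k] += np.linalg.norm(X0_hat - X0) ** 2 / np.linalg.norm(X0) ** 2
    return nmse / n_runs

snr_values = np.arange(0, 35, 5)
curve = nmse_curve(snr_values, 50, 10, 10, 4)  # 50 runs instead of 1000
print(curve)
```

The error is accumulated inside the SNR loop and divided by the number of runs at the end, which matches the NMSE definition given in the problem statement.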

In [4]:
# Number of columns R; Number of rows I, J:
ncol = 4; nrow_a = 10, 30; nrow_b = 10
# SNR values:
snr = np.arange(0, 35, 5)
# Monte Carlo Realizations:
mc_realizations = 1000
# Generating data for the two cases: I, J = 10, 10 and I, J = 30, 10
case1 = run_simulation_lskr(snr, mc_realizations, nrow_a[0], nrow_b, ncol)
case2 = run_simulation_lskr(snr, mc_realizations, nrow_a[1], nrow_b, ncol)

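The NMSE vs. SNR curves are presented in a log-scale plot below.

As the figure shows, the estimate becomes more accurate as the SNR increases, because the noise power decreases. In addition, for the configuration $(I, J, R) = (30, 10, 4)$, the performance of the LSKRF algorithm is slightly better than in the first case $(I = 10)$. A plausible explanation is that each column of $\mathbf{X}$ has $IJ$ noisy entries, while its rank-1 model has only about $I + J - 1$ free parameters, so the rank-1 approximation rejects a larger share of the noise when $I = 30$ (roughly $39/300$ of the noise power is retained, against $19/100$ when $I = 10$).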

In [5]:
plot_results(snr, case1, case2, 'I, J = 10, 10', 'I, J = 30, 10', 'LSKRF')