# About This Notebook

This notebook shows how to implement **Low-Rank Tensor Completion with Truncated Nuclear Norm minimization (LRTC-TNN)** on some real-world data sets. For an in-depth discussion of LRTC-TNN, please see our article [1].

<div class="alert alert-block alert-info">
<font color="black">
<b>[1]</b> Xinyu Chen, Jinming Yang, Lijun Sun (2020). <b>A Nonconvex Low-Rank Tensor Completion Model for Spatiotemporal Traffic Data Imputation</b>. arXiv.2003.10271. <a href="https://arxiv.org/abs/2003.10271" title="PDF"><b>[PDF]</b></a> 
</font>
</div>


## Quick Run

This notebook is publicly available for any use as part of our data imputation project. Please check out [**transdim - GitHub**](https://github.com/xinychen/transdim).


## Low-Rank Tensor Completion

We start by importing the necessary dependencies.

In [1]:
import numpy as np
from numpy.linalg import inv as inv

### Tensor Unfolding (`ten2mat`) and Matrix Folding (`mat2ten`)

The unfolding function below follows a Stack Overflow discussion on using NumPy `reshape` to unfold a third-order tensor. [[**link**](https://stackoverflow.com/questions/49970141/using-numpy-reshape-to-perform-3rd-rank-tensor-unfold-operation)]

In [2]:
def ten2mat(tensor, mode):
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')

In [3]:
X = np.array([[[1, 2, 3, 4], [3, 4, 5, 6]], 
              [[5, 6, 7, 8], [7, 8, 9, 10]], 
              [[9, 10, 11, 12], [11, 12, 13, 14]]])
print('tensor size:')
print(X.shape)
print('original tensor:')
print(X)
print()
print('(1) mode-1 tensor unfolding:')
print(ten2mat(X, 0))
print()
print('(2) mode-2 tensor unfolding:')
print(ten2mat(X, 1))
print()
print('(3) mode-3 tensor unfolding:')
print(ten2mat(X, 2))

tensor size:
(3, 2, 4)
original tensor:
[[[ 1  2  3  4]
  [ 3  4  5  6]]

 [[ 5  6  7  8]
  [ 7  8  9 10]]

 [[ 9 10 11 12]
  [11 12 13 14]]]

(1) mode-1 tensor unfolding:
[[ 1  3  2  4  3  5  4  6]
 [ 5  7  6  8  7  9  8 10]
 [ 9 11 10 12 11 13 12 14]]

(2) mode-2 tensor unfolding:
[[ 1  5  9  2  6 10  3  7 11  4  8 12]
 [ 3  7 11  4  8 12  5  9 13  6 10 14]]

(3) mode-3 tensor unfolding:
[[ 1  5  9  3  7 11]
 [ 2  6 10  4  8 12]
 [ 3  7 11  5  9 13]
 [ 4  8 12  6 10 14]]


In [4]:
def mat2ten(mat, tensor_size, mode):
    index = list()
    index.append(mode)
    for i in range(tensor_size.shape[0]):
        if i != mode:
            index.append(i)
    return np.moveaxis(np.reshape(mat, list(tensor_size[index]), order = 'F'), 0, mode)
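
Since `mat2ten` is meant to undo `ten2mat`, a quick round-trip check is a useful sanity test. The sketch below repeats the two definitions so that it runs on its own; the toy tensor built with `np.arange` is an assumption made for illustration. Note that `tensor_size` must be a NumPy array, because `mat2ten` indexes it with a list.

```python
import numpy as np

def ten2mat(tensor, mode):
    # Mode-`mode` unfolding, same definition as in the notebook.
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order='F')

def mat2ten(mat, tensor_size, mode):
    # Fold a mode-`mode` unfolding back into a tensor of shape `tensor_size`.
    index = [mode] + [i for i in range(tensor_size.shape[0]) if i != mode]
    return np.moveaxis(np.reshape(mat, list(tensor_size[index]), order='F'), 0, mode)

X = np.arange(24).reshape(3, 2, 4)   # toy tensor (illustrative only)
dim = np.array(X.shape)              # pass the size as an array, not a tuple

# Folding the mode-k unfolding should give back the original tensor for every mode.
round_trip_ok = all(np.array_equal(mat2ten(ten2mat(X, k), dim, k), X) for k in range(3))
print(round_trip_ok)
```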

### Singular Value Thresholding (SVT) for TNN

In [5]:
def svt_tnn(mat, alpha, rho, theta):
    """This is a Numpy dependent singular value thresholding (SVT) process."""
    u, s, v = np.linalg.svd(mat, full_matrices = 0)
    vec = s.copy()
    vec[theta :] = s[theta :] - alpha / rho
    vec[vec < 0] = 0
    return np.matmul(np.matmul(u, np.diag(vec)), v)

**Understanding these codes**:

- **`line 1`**: Necessary inputs including any input matrix $\boldsymbol{X}$, weight of Truncated Nuclear Norm (TNN) regularization $\alpha$, learning rate $\rho$, and positive integer number $\theta$ for nuclear norm truncation.

- **`line 2`**: Compute the Singular Value Decomposition (SVD) for any matrix $\boldsymbol{X}$ with `numpy.linalg.svd` (i.e., SVD function in `Numpy`'s linear algebra package).

- **`line 3-5`**: Keep the first $\theta$ singular values unchanged and shrink the remaining singular values $\sigma_{\theta+1},...$ with the following rule:

\begin{equation}
\sigma_{i}=\left[\sigma_{i}(\boldsymbol{X})-\frac{\alpha}{\rho}\right]_{+}.
\end{equation}

- **`line 6`**: Return the resulting matrix.
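
To make the shrinkage rule concrete, the sketch below applies the same three statements to a hand-picked vector of singular values. The values of `s`, `alpha`, `rho`, and `theta` are hypothetical and chosen only so that the arithmetic is easy to follow: with $\alpha/\rho=1$ and $\theta=1$, the first value is kept and the others are reduced by 1 and clipped at zero, giving `[5, 2, 1, 0]`.

```python
import numpy as np

s = np.array([5.0, 3.0, 2.0, 0.5])    # singular values in descending order (hypothetical)
alpha, rho, theta = 1.0, 1.0, 1        # hypothetical parameters, so alpha / rho = 1

vec = s.copy()
vec[theta:] = s[theta:] - alpha / rho  # shrink every value after the first `theta`
vec[vec < 0] = 0                       # clip negatives to zero, i.e. the [.]_+ operator
print(vec)
```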

**Potential alternative for this**:

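The version below is a more efficient way to implement SVT-TNN for strongly rectangular matrices: when one dimension is more than twice the other, it decomposes the smaller Gram matrix instead of the full matrix, and it discards singular values that do not exceed $\alpha/\rho$.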

In [6]:
def svt_tnn(mat, alpha, rho, theta):
    tau = alpha / rho
    [m, n] = mat.shape
    if 2 * m < n:
        u, s, v = np.linalg.svd(mat @ mat.T, full_matrices = 0)
        s = np.sqrt(s)
        idx = np.sum(s > tau)
        mid = np.zeros(idx)
        mid[:theta] = 1
        mid[theta:idx] = (s[theta:idx] - tau) / s[theta:idx]
        return (u[:, :idx] @ np.diag(mid)) @ (u[:, :idx].T @ mat)
    elif m > 2 * n:
        return svt_tnn(mat.T, alpha, rho, theta).T
    u, s, v = np.linalg.svd(mat, full_matrices = 0)
    idx = np.sum(s > tau)
    vec = s[:idx].copy()
    vec[theta:idx] = s[theta:idx] - tau
    return u[:, :idx] @ np.diag(vec) @ v[:idx, :]
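
The wide-matrix branch relies on the fact that the singular values of `mat` are the square roots of the singular values of `mat @ mat.T`, and that both share the same left singular vectors. The sketch below checks this numerically; the random test matrix and the variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
mat = rng.standard_normal((4, 20))   # wide matrix with 2 * m < n

# Thin SVD of the wide matrix itself.
u, s, vt = np.linalg.svd(mat, full_matrices=False)

# SVD of the small m-by-m Gram matrix.
u_gram, s_gram, _ = np.linalg.svd(mat @ mat.T, full_matrices=False)

# Square roots of the Gram singular values match the singular values of `mat`.
same_singular_values = np.allclose(np.sqrt(s_gram), s)

# Projecting onto the left singular vectors reproduces `mat` (it has full row rank here).
same_reconstruction = np.allclose(u_gram @ (u_gram.T @ mat), mat)
print(same_singular_values, same_reconstruction)
```

Working with the $m\times m$ Gram matrix keeps the decomposition small when $n\gg m$, at the price of squaring the condition number.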

<div class="alert alert-block alert-warning">
<ul>
<li><b><code>compute_mape</code>:</b> <font color="black">Compute the value of Mean Absolute Percentage Error (MAPE).</font></li>
<li><b><code>compute_rmse</code>:</b> <font color="black">Compute the value of Root Mean Square Error (RMSE).</font></li>
</ul>
</div>

> Note that $$\mathrm{MAPE}=\frac{1}{n} \sum_{i=1}^{n} \frac{\left|y_{i}-\hat{y}_{i}\right|}{y_{i}} \times 100, \quad\mathrm{RMSE}=\sqrt{\frac{1}{n} \sum_{i=1}^{n}\left(y_{i}-\hat{y}_{i}\right)^{2}},$$ where $n$ is the total number of estimated values, and $y_i$ and $\hat{y}_i$ are the actual value and its estimation, respectively. In the implementation below, `compute_mape` omits the factor of 100 and returns MAPE as a fraction.

In [7]:
def compute_rmse(var, var_hat):
    return np.sqrt(np.sum((var - var_hat) ** 2) / var.shape[0])

In [8]:
def compute_mape(var, var_hat):
    return np.sum(np.abs(var - var_hat) / var) / var.shape[0]
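
As a quick check of the two metrics, the sketch below evaluates them on three hand-picked values (hypothetical numbers, not taken from any data set). The absolute errors are 1, 2, and 0, so MAPE is $(0.1+0.1+0)/3\approx 0.0667$ and RMSE is $\sqrt{5/3}\approx 1.291$.

```python
import numpy as np

def compute_rmse(var, var_hat):
    return np.sqrt(np.sum((var - var_hat) ** 2) / var.shape[0])

def compute_mape(var, var_hat):
    return np.sum(np.abs(var - var_hat) / var) / var.shape[0]

y = np.array([10.0, 20.0, 40.0])       # actual values (hypothetical)
y_hat = np.array([11.0, 18.0, 40.0])   # estimates (hypothetical)

mape = compute_mape(y, y_hat)          # returned as a fraction, not a percentage
rmse = compute_rmse(y, y_hat)
print(mape, rmse)
```

Both functions expect one-dimensional inputs, which is what indexing with `pos_test` produces, and `compute_mape` assumes that the actual values are nonzero.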

### Define LRTC-TNN Function with `Numpy`

In [9]:
def LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter):
    """Low-Rank Tensor Completion with Truncated Nuclear Norm, LRTC-TNN."""
    
    dim = np.array(sparse_tensor.shape)
    pos_missing = np.where(sparse_tensor == 0)
    pos_test = np.where((dense_tensor != 0) & (sparse_tensor == 0))
    
    X = np.zeros(np.insert(dim, 0, len(dim))) # \boldsymbol{\mathcal{X}}
    T = np.zeros(np.insert(dim, 0, len(dim))) # \boldsymbol{\mathcal{T}}
    Z = sparse_tensor.copy()
    last_tensor = sparse_tensor.copy()
    snorm = np.sqrt(np.sum(sparse_tensor ** 2))
    it = 0
    while True:
        rho = min(rho * 1.05, 1e5)
        for k in range(len(dim)):
            X[k] = mat2ten(svt_tnn(ten2mat(Z - T[k] / rho, k), alpha[k], rho, int(np.ceil(theta * dim[k]))), dim, k)
        Z[pos_missing] = np.mean(X + T / rho, axis = 0)[pos_missing]
        T = T + rho * (X - np.broadcast_to(Z, np.insert(dim, 0, len(dim))))
        tensor_hat = np.einsum('k, kmnt -> mnt', alpha, X)
        tol = np.sqrt(np.sum((tensor_hat - last_tensor) ** 2)) / snorm
        last_tensor = tensor_hat.copy()
        it += 1
        if (it + 1) % 50 == 0:
            print('Iter: {}'.format(it + 1))
            print('RMSE: {:.6}'.format(compute_rmse(dense_tensor[pos_test], tensor_hat[pos_test])))
            print()
        if (tol < epsilon) or (it >= maxiter):
            break

    print('Imputation MAPE: {:.6}'.format(compute_mape(dense_tensor[pos_test], tensor_hat[pos_test])))
    print('Imputation RMSE: {:.6}'.format(compute_rmse(dense_tensor[pos_test], tensor_hat[pos_test])))
    print()
    
    return tensor_hat

**Understanding these codes**:

- **`X[k] = ...`** (inside the `for` loop): Update $\boldsymbol{\mathcal{X}}_{k}^{l+1},k=1,2,3$ by applying `svt_tnn` to the mode-$k$ unfolding of $\boldsymbol{\mathcal{Z}}^{l}-\frac{1}{\rho}\boldsymbol{\mathcal{T}}_{k}^{l}$ and folding the result back into a tensor with `mat2ten`.

- **`Z[pos_missing] = ...`**: Update $\boldsymbol{\mathcal{Z}}^{l+1}$ by keeping the observed entries and averaging over the three modes elsewhere:

\begin{equation}
\boldsymbol{\mathcal{Z}}^{l+1}=\mathcal{P}_{\Omega}(\boldsymbol{\mathcal{Y}})+\mathcal{P}_{\Omega}^{\perp}\left(\frac{1}{3}\sum_{k=1}^{3}\left(\boldsymbol{\mathcal{X}}_{k}^{l+1}+\frac{1}{\rho}\boldsymbol{\mathcal{T}}_{k}^{l}\right)\right).
\end{equation}

- **`T = ...`**: Update $\boldsymbol{\mathcal{T}}_{k}^{l+1},k=1,2,3$ by

\begin{equation}
\boldsymbol{\mathcal{T}}_{k}^{l+1}=\boldsymbol{\mathcal{T}}_{k}^{l}+\rho\left(\boldsymbol{\mathcal{X}}_{k}^{l+1}-\boldsymbol{\mathcal{Z}}^{l+1}\right).
\end{equation}
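
The array bookkeeping in the loop body of `LRTC` can be checked in isolation. The sketch below is not a full solver: it replaces the SVT outputs with random stand-ins (an assumption made for illustration, as are the toy sizes and the value of `rho`) and only verifies that observed entries of `Z` stay fixed, that `T` is updated for all three modes at once, and that the `einsum` call is a weighted average of the three `X[k]`.

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (3, 4, 5)                                     # toy tensor size (hypothetical)
dense = rng.uniform(1.0, 2.0, size=shape)             # toy ground truth, all entries nonzero
sparse = dense * (rng.uniform(size=shape) > 0.3)      # zeros mark the missing entries
pos_missing = np.where(sparse == 0)

rho = 0.1                                             # hypothetical value
alpha = np.ones(3) / 3
X = rng.standard_normal((3,) + shape)                 # stand-in for the three SVT outputs
T = np.zeros((3,) + shape)
Z = sparse.copy()

Z[pos_missing] = np.mean(X + T / rho, axis=0)[pos_missing]   # fill only the missing entries
T = T + rho * (X - np.broadcast_to(Z, X.shape))               # one update per mode
tensor_hat = np.einsum('k, kmnt -> mnt', alpha, X)            # weighted average of X[0..2]

observed = sparse != 0
print(np.array_equal(Z[observed], sparse[observed]), T.shape, tensor_hat.shape)
```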

## Data Organization

### 1) Matrix Structure

We consider a data set of $m$ discrete time series $\boldsymbol{y}_{i}\in\mathbb{R}^{f},i\in\left\{1,2,...,m\right\}$. The time series may have missing elements. We express the spatiotemporal data set as a matrix $Y\in\mathbb{R}^{m\times f}$ with $m$ rows (e.g., locations) and $f$ columns (e.g., discrete time intervals),

$$Y=\left[ \begin{array}{cccc} y_{11} & y_{12} & \cdots & y_{1f} \\ y_{21} & y_{22} & \cdots & y_{2f} \\ \vdots & \vdots & \ddots & \vdots \\ y_{m1} & y_{m2} & \cdots & y_{mf} \\ \end{array} \right]\in\mathbb{R}^{m\times f}.$$

### 2) Tensor Structure

We consider a data set of $m$ discrete time series $\boldsymbol{y}_{i}\in\mathbb{R}^{nf},i\in\left\{1,2,...,m\right\}$. The time series may have missing elements. We partition each time series into intervals of predefined length $f$. We express each partitioned time series as a matrix $Y_{i}$ with $n$ rows (e.g., days) and $f$ columns (e.g., discrete time intervals per day),

$$Y_{i}=\left[ \begin{array}{cccc} y_{11} & y_{12} & \cdots & y_{1f} \\ y_{21} & y_{22} & \cdots & y_{2f} \\ \vdots & \vdots & \ddots & \vdots \\ y_{n1} & y_{n2} & \cdots & y_{nf} \\ \end{array} \right]\in\mathbb{R}^{n\times f},i=1,2,...,m,$$

therefore, the resulting structure is a tensor $\mathcal{Y}\in\mathbb{R}^{m\times n\times f}$.
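
In NumPy, this partitioning amounts to a single reshape. The sketch below assumes the data arrive as an $m\times nf$ matrix with one time series per row; the sizes are hypothetical and the entries are just consecutive integers so that the layout is easy to read.

```python
import numpy as np

m, n, f = 2, 3, 4                            # hypothetical: series, days, intervals per day
Y = np.arange(m * n * f).reshape(m, n * f)   # row i holds the time series y_i of length n * f

# The default (C-order) reshape splits each row into n consecutive blocks of length f.
tensor = Y.reshape(m, n, f)
print(tensor[1])                             # the n-by-f matrix Y_i of the second series
```

Here `tensor[i, d, :]` holds the $f$ observations of series `i` on day `d`.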

## Missing Data Imputation

In the following, we apply the `LRTC` function defined above to missing data imputation on the following spatiotemporal multivariate time series data sets, each organized as a tensor:

- **Guangzhou data set**: [Guangzhou urban traffic speed data set](https://doi.org/10.5281/zenodo.1205228).
- **Birmingham data set**: [Birmingham parking data set](https://archive.ics.uci.edu/ml/datasets/Parking+Birmingham).
- **Hangzhou data set**: [Hangzhou metro passenger flow data set](https://doi.org/10.5281/zenodo.3145403).
- **Seattle data set**: [Seattle freeway traffic speed data set](https://github.com/zhiyongc/Seattle-Loop-Data).

The original data sets have been adapted for our experiments and are available in the `datasets` folder.

### Experiments on Guangzhou Data Set

In [10]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.2

# =============================================================================
### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
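
The rounding expression in the RM scenario turns each random value into a 0/1 indicator. Assuming the stored random tensor holds values drawn uniformly from $[0,1)$ (an assumption about the `.mat` file, not something this notebook verifies), an entry is kept when its random value exceeds `missing_rate`, so roughly a `missing_rate` share of the entries is masked. The sketch below reproduces this with a synthetic stand-in; the `toy_` names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
toy_random_tensor = rng.uniform(size=(50, 40, 50))   # synthetic stand-in for `random_tensor`
toy_missing_rate = 0.2

# Values in [0.3, 1.3) round to 0 or 1; an entry is 1 when its random value exceeds 0.2.
toy_binary_tensor = np.round(toy_random_tensor + 0.5 - toy_missing_rate)
observed_fraction = toy_binary_tensor.mean()
print(observed_fraction)                              # close to 1 - toy_missing_rate
```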

In [11]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.30
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.46701

Iter: 100
RMSE: 2.8773

Iter: 150
RMSE: 2.86962

Imputation MAPE: 0.0670164
Imputation RMSE: 2.88087

Running time: 43 seconds


In [12]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.4

# =============================================================================
### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [13]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.30
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.62032

Iter: 100
RMSE: 3.14578

Iter: 150
RMSE: 3.15347

Imputation MAPE: 0.0732347
Imputation RMSE: 3.16691

Running time: 40 seconds


In [14]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.5

# =============================================================================
### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [15]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.30
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.83294

Iter: 100
RMSE: 3.28317

Iter: 150
RMSE: 3.31253

Imputation MAPE: 0.076822
Imputation RMSE: 3.32346

Running time: 39 seconds


In [16]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.6

# =============================================================================
### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [17]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.30
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.49785

Iter: 100
RMSE: 3.43917

Iter: 150
RMSE: 3.49909

Imputation MAPE: 0.0811862
Imputation RMSE: 3.51235

Running time: 40 seconds


In [18]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.7

# =============================================================================
### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [19]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.30
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.34354

Iter: 100
RMSE: 3.5962

Iter: 150
RMSE: 3.699

Imputation MAPE: 0.0860955
Imputation RMSE: 3.71831

Running time: 41 seconds


In [20]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.2

# =============================================================================
### Non-random missing (NM) scenario:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
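
In the NM scenario, one random value per `(i1, i2)` pair decides whether the whole fiber along the last axis is observed or missing. The double loop can also be written with broadcasting. The sketch below compares the two forms on a synthetic stand-in for the random matrix; the `toy_` names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
toy_shape = (6, 5, 8)                                   # hypothetical tensor size
toy_random_matrix = rng.uniform(size=toy_shape[:2])     # synthetic stand-in for `random_matrix`
toy_missing_rate = 0.4

# Loop form, as in the notebook.
binary_loop = np.zeros(toy_shape)
for i1 in range(toy_shape[0]):
    for i2 in range(toy_shape[1]):
        binary_loop[i1, i2, :] = np.round(toy_random_matrix[i1, i2] + 0.5 - toy_missing_rate)

# Broadcast form: one 0/1 decision per (i1, i2), repeated along the last axis.
binary_vec = np.round(toy_random_matrix + 0.5 - toy_missing_rate)[:, :, None] * np.ones(toy_shape)
print(np.array_equal(binary_loop, binary_vec))
```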

In [21]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 4.86922

Iter: 100
RMSE: 3.96752

Iter: 150
RMSE: 3.97225

Imputation MAPE: 0.0936998
Imputation RMSE: 3.97219

Running time: 34 seconds


In [22]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']
random_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.4

# =============================================================================
### Non-random missing (NM) scenario:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [23]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 4.98029

Iter: 100
RMSE: 4.06503

Iter: 150
RMSE: 4.07531

Imputation MAPE: 0.0954183
Imputation RMSE: 4.07556

Running time: 36 seconds


In [24]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.5

# =============================================================================
### Non-random missing (NM) scenario:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [25]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.15228

Iter: 100
RMSE: 4.14767

Imputation MAPE: 0.0973779
Imputation RMSE: 4.16227

Running time: 36 seconds


In [26]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.6

# =============================================================================
### Non-random missing (NM) scenario:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [27]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.1033

Iter: 100
RMSE: 4.22383

Imputation MAPE: 0.0992274
Imputation RMSE: 4.24004

Running time: 42 seconds


In [28]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.7

# =============================================================================
### Non-random missing (NM) scenario:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [29]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.21747

Iter: 100
RMSE: 4.30683

Imputation MAPE: 0.102139
Imputation RMSE: 4.32667

Running time: 34 seconds


**Experiment results** of missing data imputation using LRTC-TNN:

|  scenario |`alpha` (vector input)|`rho`|`theta`|`maxiter`|       mape |      rmse |
|:----------|-----:|---------:|---------:|----------:|----------:|----------:|
|**0.2, RM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.30 | 200 | **0.0670** | **2.88**|
|**0.4, RM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.30 | 200 | **0.0732** | **3.17**|
|**0.2, NM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.05 | 200 | **0.0937** | **3.97**|
|**0.4, NM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.05 | 200 | **0.0954** | **4.08**|


### Experiments on Birmingham Data Set


In [30]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.1

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [31]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.15
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 18.0692

Iter: 100
RMSE: 13.0828

Imputation MAPE: 0.0420645
Imputation RMSE: 13.1064

Running time: 0 seconds


In [32]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.3

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [33]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.15
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 20.5441

Iter: 100
RMSE: 17.4561

Imputation MAPE: 0.0515019
Imputation RMSE: 17.4746

Running time: 0 seconds


In [34]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.5

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [35]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.15
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 25.6108

Iter: 100
RMSE: 22.6758

Imputation MAPE: 0.0651507
Imputation RMSE: 22.6791

Running time: 0 seconds


In [36]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.6

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [37]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.15
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 29.9069

Iter: 100
RMSE: 27.1879

Imputation MAPE: 0.076155
Imputation RMSE: 27.1492

Running time: 0 seconds


In [38]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.7

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [39]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.15
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 35.5653

Iter: 100
RMSE: 33.3886

Imputation MAPE: 0.0895481
Imputation RMSE: 33.4512

Running time: 0 seconds


In [40]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Birmingham-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.1

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [41]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 22.6221

Imputation MAPE: 0.0939775
Imputation RMSE: 23.2553

Running time: 0 seconds


In [42]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Birmingham-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.3

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [43]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 52.6786

Imputation MAPE: 0.1331
Imputation RMSE: 52.3164

Running time: 0 seconds


In [44]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Birmingham-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.5

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [45]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 85.8883

Imputation MAPE: 0.153229
Imputation RMSE: 86.9536

Running time: 0 seconds


In [46]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Birmingham-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.6

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [47]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 96.0289

Imputation MAPE: 0.180065
Imputation RMSE: 96.1844

Running time: 0 seconds


In [48]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Birmingham-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.7

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [49]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 147.425

Iter: 100
RMSE: 145.304

Imputation MAPE: 0.209472
Imputation RMSE: 145.293

Running time: 0 seconds


**Experiment results** of missing data imputation using LRTC-TNN:

|  scenario |`alpha` (vector input)|`rho` | `theta` |`maxiter`|       mape |      rmse |
|:----------|-----:|---------:|---------:|----------:|----------:|----------:|
|**0.1, RM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.15 | 200 | **0.0421** | **13.11**|
|**0.3, RM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.15 | 200 | **0.0515** | **17.47**|
|**0.1, NM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.05 | 200 | **0.0940** | **23.26**|
|**0.3, NM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.05 | 200 | **0.1331** | **52.32**|
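
The non-random missing (NM) cells above fill `binary_tensor` with a double loop, drawing one value per `(i1, i2)` pair and copying it along the third axis. As a minimal sketch, the same mask can likely be built with NumPy broadcasting. The helper names `nm_mask_loop` and `nm_mask_vectorized` and the synthetic `random_matrix` are assumptions made for illustration; they are not part of the notebook or the data sets.

```python
import numpy as np

def nm_mask_loop(random_matrix, shape, missing_rate):
    """Loop form used in the NM cells: one draw per (i1, i2) fiber."""
    binary_tensor = np.zeros(shape)
    for i1 in range(shape[0]):
        for i2 in range(shape[1]):
            binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
    return binary_tensor

def nm_mask_vectorized(random_matrix, shape, missing_rate):
    """Same mask via broadcasting: repeat the 2-D mask along the third axis."""
    fiber_mask = np.round(random_matrix + 0.5 - missing_rate)  # shape (n1, n2)
    return np.broadcast_to(fiber_mask[:, :, np.newaxis], shape).copy()

# Synthetic stand-in for random_matrix (hypothetical data, not the .mat file)
rng = np.random.default_rng(0)
shape = (5, 7, 12)
random_matrix = rng.random(shape[:2])

loop_mask = nm_mask_loop(random_matrix, shape, missing_rate=0.3)
vec_mask = nm_mask_vectorized(random_matrix, shape, missing_rate=0.3)
print(np.array_equal(loop_mask, vec_mask))
```

The `.copy()` turns the read-only broadcast view into an ordinary writable array, so the result can be used wherever the loop-built `binary_tensor` is used.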


### Experiments on Hangzhou Data Set

In [50]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.2

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [51]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.10
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 24.7402

Iter: 100
RMSE: 24.7786

Imputation MAPE: 0.180318
Imputation RMSE: 24.8976

Running time: 2 seconds


In [52]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.4

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [53]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.10
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 25.7935

Iter: 100
RMSE: 25.7774

Imputation MAPE: 0.188003
Imputation RMSE: 25.8979

Running time: 3 seconds


In [54]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.5

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [55]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.10
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 26.4952

Iter: 100
RMSE: 26.7525

Imputation MAPE: 0.1926
Imputation RMSE: 26.8564

Running time: 2 seconds


In [56]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.6

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [57]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.10
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 27.6098

Iter: 100
RMSE: 27.7448

Imputation MAPE: 0.195575
Imputation RMSE: 27.8357

Running time: 2 seconds


In [58]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.7

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [59]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.10
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 29.4194

Iter: 100
RMSE: 29.8024

Imputation MAPE: 0.203381
Imputation RMSE: 29.8976

Running time: 2 seconds


In [60]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Hangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.2

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [61]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.10
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 29.5311

Iter: 100
RMSE: 27.4519

Imputation MAPE: 0.197053
Imputation RMSE: 27.4161

Running time: 2 seconds


In [62]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Hangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.4

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [63]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.10
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 29.7898

Iter: 100
RMSE: 28.9947

Imputation MAPE: 0.204345
Imputation RMSE: 29.03

Running time: 2 seconds


In [64]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Hangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.5

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [65]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.10
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 31.9214

Iter: 100
RMSE: 30.7007

Imputation MAPE: 0.212233
Imputation RMSE: 30.6813

Running time: 2 seconds


In [66]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Hangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.6

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [67]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.10
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 38.5045

Iter: 100
RMSE: 37.6957

Imputation MAPE: 0.212195
Imputation RMSE: 37.6735

Running time: 2 seconds


In [68]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Hangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.7

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [69]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.10
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 40.522

Iter: 100
RMSE: 39.6418

Imputation MAPE: 0.212856
Imputation RMSE: 39.6974

Running time: 2 seconds


**Experiment results** of missing data imputation using LRTC-TNN:

|  scenario |`alpha` (vector input)|`rho`|`theta`|`maxiter`|       mape |      rmse |
|:----------|-----:|---------:|---------:|--------:|----------:|----------:|
|**0.2, RM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.10 | 200 | **0.1803** | **24.90**|
|**0.4, RM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.10 | 200 | **0.1880** | **25.90**|
|**0.2, NM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.10 | 200 | **0.1971** | **27.42**|
|**0.4, NM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.10 | 200 | **0.2043** | **29.03**|
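
The random missing (RM) cells above build the mask with `np.round(random_tensor + 0.5 - missing_rate)`. Assuming the entries of `random_tensor` are uniform draws in [0, 1), which is an assumption about the stored file, this expression keeps an entry (mask value 1) roughly when its draw exceeds `missing_rate`, so about a `missing_rate` fraction of entries is hidden. The sketch below checks this on synthetic data; the `rm_mask` helper and the synthetic tensor are hypothetical.

```python
import numpy as np

def rm_mask(random_tensor, missing_rate):
    """Mask expression used in the RM cells: 1 = observed, 0 = missing."""
    return np.round(random_tensor + 0.5 - missing_rate)

# Synthetic stand-in for random_tensor (assumed uniform on [0, 1))
rng = np.random.default_rng(42)
random_tensor = rng.random((20, 30, 40))

observed_missing = {}
for missing_rate in (0.2, 0.4, 0.7):
    binary_tensor = rm_mask(random_tensor, missing_rate)
    # Fraction of entries that the mask sets to zero
    observed_missing[missing_rate] = 1.0 - binary_tensor.mean()
    print(missing_rate, observed_missing[missing_rate])
```

Because the mask depends only on the stored random draws, rerunning a cell with the same `missing_rate` hides the same entries, which keeps results comparable across runs.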


### Experiments on Seattle Data Set
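
The Seattle cells reshape a sensor-by-time matrix into a tensor of size `(sensors, 28, 288)`. A plausible reading, stated here as an assumption, is 28 days with 288 five-minute intervals per day. The sketch below uses a toy matrix (not the real `mat.csv`) to show how NumPy's default row-major reshape maps column `t` of the matrix to position `(t // 288, t % 288)` in the tensor.

```python
import numpy as np

n_sensors, n_days, n_intervals = 3, 28, 288  # 3 sensors is a toy size

# Toy matrix whose entries encode their own column index
dense_mat = np.tile(np.arange(n_days * n_intervals), (n_sensors, 1)).astype(float)
dense_tensor = dense_mat.reshape([dense_mat.shape[0], n_days, n_intervals])

day, interval = 5, 100
column = day * n_intervals + interval
# The tensor entry at (sensor, day, interval) is the matrix entry at (sensor, column)
print(dense_tensor[1, day, interval] == dense_mat[1, column])
```

Reshaping back with `dense_tensor.reshape(n_sensors, -1)` recovers the original matrix, so the fold loses no information.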

In [70]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
RM_mat = pd.read_csv('../datasets/Seattle-data-set/RM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
RM_mat = RM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])
RM_tensor = RM_mat.reshape([RM_mat.shape[0], 28, 288])

missing_rate = 0.2

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(RM_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [71]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.30
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.91611

Iter: 100
RMSE: 3.01208

Iter: 150
RMSE: 3.04636

Imputation MAPE: 0.0464732
Imputation RMSE: 3.05844

Running time: 65 seconds


In [72]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
RM_mat = pd.read_csv('../datasets/Seattle-data-set/RM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
RM_mat = RM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])
RM_tensor = RM_mat.reshape([RM_mat.shape[0], 28, 288])

missing_rate = 0.4

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(RM_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [73]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.30
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.86054

Iter: 100
RMSE: 3.22163

Iter: 150
RMSE: 3.28947

Imputation MAPE: 0.0511641
Imputation RMSE: 3.30216

Running time: 63 seconds


In [74]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
RM_mat = pd.read_csv('../datasets/Seattle-data-set/RM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
RM_mat = RM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])
RM_tensor = RM_mat.reshape([RM_mat.shape[0], 28, 288])

missing_rate = 0.5

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(RM_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [75]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.30
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.85456

Iter: 100
RMSE: 3.35752

Iter: 150
RMSE: 3.45611

Imputation MAPE: 0.0543108
Imputation RMSE: 3.46931

Running time: 66 seconds


In [76]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
RM_mat = pd.read_csv('../datasets/Seattle-data-set/RM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
RM_mat = RM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])
RM_tensor = RM_mat.reshape([RM_mat.shape[0], 28, 288])

missing_rate = 0.6

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(RM_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [77]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.30
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.85415

Iter: 100
RMSE: 3.51862

Iter: 150
RMSE: 3.65423

Imputation MAPE: 0.0580412
Imputation RMSE: 3.66197

Running time: 57 seconds


In [78]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
RM_mat = pd.read_csv('../datasets/Seattle-data-set/RM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
RM_mat = RM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])
RM_tensor = RM_mat.reshape([RM_mat.shape[0], 28, 288])

missing_rate = 0.7

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(RM_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [79]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.30
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.95502

Iter: 100
RMSE: 3.80994

Iter: 150
RMSE: 4.02848

Imputation MAPE: 0.0653333
Imputation RMSE: 4.04162

Running time: 60 seconds


In [80]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
NM_mat = pd.read_csv('../datasets/Seattle-data-set/NM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
NM_mat = NM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])

missing_rate = 0.2

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros((dense_mat.shape[0], 28, 288))
for i1 in range(binary_tensor.shape[0]):
    for i2 in range(binary_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(NM_mat[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [81]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 4.89165

Iter: 100
RMSE: 4.17949

Iter: 150
RMSE: 4.18524

Imputation MAPE: 0.0692741
Imputation RMSE: 4.18538

Running time: 61 seconds


In [82]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
NM_mat = pd.read_csv('../datasets/Seattle-data-set/NM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
NM_mat = NM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])

missing_rate = 0.4

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros((dense_mat.shape[0], 28, 288))
for i1 in range(binary_tensor.shape[0]):
    for i2 in range(binary_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(NM_mat[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [83]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.24117

Iter: 100
RMSE: 4.4993

Iter: 150
RMSE: 4.50354

Imputation MAPE: 0.0759062
Imputation RMSE: 4.50354

Running time: 64 seconds


In [84]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
NM_mat = pd.read_csv('../datasets/Seattle-data-set/NM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
NM_mat = NM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])

missing_rate = 0.5

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros((dense_mat.shape[0], 28, 288))
for i1 in range(binary_tensor.shape[0]):
    for i2 in range(binary_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(NM_mat[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [85]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.46856

Iter: 100
RMSE: 4.72734

Iter: 150
RMSE: 4.73129

Imputation MAPE: 0.0811558
Imputation RMSE: 4.7314

Running time: 59 seconds


In [86]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
NM_mat = pd.read_csv('../datasets/Seattle-data-set/NM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
NM_mat = NM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])

missing_rate = 0.6

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros((dense_mat.shape[0], 28, 288))
for i1 in range(binary_tensor.shape[0]):
    for i2 in range(binary_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(NM_mat[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [87]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 5.75773

Iter: 100
RMSE: 4.99998

Imputation MAPE: 0.0861234
Imputation RMSE: 5.00438

Running time: 59 seconds


In [88]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
NM_mat = pd.read_csv('../datasets/Seattle-data-set/NM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
NM_mat = NM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])

missing_rate = 0.7

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros((dense_mat.shape[0], 28, 288))
for i1 in range(binary_tensor.shape[0]):
    for i2 in range(binary_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(NM_mat[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)

In [89]:
import time
start = time.time()
alpha = np.ones(3) / 3
rho = 1e-5
theta = 0.05
epsilon = 1e-4
maxiter = 200
LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 50
RMSE: 6.09623

Iter: 100
RMSE: 5.44157

Iter: 150
RMSE: 5.45282

Imputation MAPE: 0.0940823
Imputation RMSE: 5.45341

Running time: 76 seconds


**Experiment results** of missing data imputation using LRTC-TNN:

|  scenario |`alpha` (vector input)|`rho`|`theta`|`maxiter`|       mape |      rmse |
|:----------|-----:|---------:|---------:|---------:|----------:|----------:|
|**0.2, RM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.30 | 200 | **0.0465** | **3.06**|
|**0.4, RM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.30 | 200 | **0.0512** | **3.30**|
|**0.2, NM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.05 | 200 | **0.0693** | **4.19**|
|**0.4, NM**| $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$ | 0.00001 | 0.05 | 200 | **0.0759** | **4.50**|
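
The MAPE and RMSE values in these tables are imputation errors. As a hedged sketch, they can be computed on the entries that are nonzero in `dense_tensor` but zeroed out in `sparse_tensor`; this choice of evaluation positions is an assumption here, since the body of `LRTC` is not shown in this section. The helper names and the toy arrays are hypothetical.

```python
import numpy as np

def compute_mape(actual, predicted):
    """Mean absolute percentage error over a 1-D vector of held-out entries."""
    return np.sum(np.abs(actual - predicted) / actual) / actual.shape[0]

def compute_rmse(actual, predicted):
    """Root mean squared error over a 1-D vector of held-out entries."""
    return np.sqrt(np.sum((actual - predicted) ** 2) / actual.shape[0])

# Toy example: a zero in dense marks a truly missing value and is never scored
dense = np.array([[10.0, 20.0], [0.0, 40.0]])
sparse = np.array([[10.0, 0.0], [0.0, 0.0]])      # masked copy of dense
estimate = np.array([[10.0, 22.0], [5.0, 36.0]])  # hypothetical imputation

# Held-out positions: known in dense, hidden in sparse (assumed convention)
pos_test = np.where((dense != 0) & (sparse == 0))
mape = compute_mape(dense[pos_test], estimate[pos_test])
rmse = compute_rmse(dense[pos_test], estimate[pos_test])
print('Imputation MAPE: {:.6}'.format(mape))
print('Imputation RMSE: {:.6}'.format(rmse))
```

Excluding zeros in `dense` keeps the MAPE denominator nonzero and avoids scoring entries that have no ground truth.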
