## Low-Rank Autoregressive Tensor Completion Imputer (LATC-imputer)

This notebook shows how to implement the LATC imputer (with the nuclear norm) on three real-world data sets: PeMS traffic speed data, Guangzhou traffic speed data, and electricity data. To handle missing values in multivariate time series data, the method takes into account both low-rank structure and time series autoregression. For an in-depth discussion of LATC-imputer, please see [1].

<div class="alert alert-block alert-info">
<font color="black">
<b>[1]</b> Xinyu Chen, Jinming Yang, Lijun Sun (2020). <b>Low-Rank Autoregressive Tensor Completion for Multivariate Time Series Forecasting</b>. arXiv:2006.10436. <a href="https://arxiv.org/abs/2006.10436" title="PDF"><b>[PDF]</b></a> 
</font>
</div>


In [1]:
import numpy as np
from numpy.linalg import inv as inv

### Define LATC-imputer kernel

We start by introducing some necessary functions that rely on `NumPy`.

<div class="alert alert-block alert-warning">
<ul>
<li><b><code>ten2mat</code>:</b> <font color="black">Unfold tensor as matrix by specifying mode.</font></li>
<li><b><code>mat2ten</code>:</b> <font color="black">Fold matrix as tensor by specifying dimension (i.e., tensor size) and mode.</font></li>
<li><b><code>svt_tnn</code>:</b> <font color="black">Implement the process of Singular Value Thresholding (SVT), leaving the first <code>theta</code> singular values unshrunk.</font></li>
</ul>
</div>

In [2]:
def ten2mat(tensor, mode):
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')

def mat2ten(mat, dim, mode):
    index = list()
    index.append(mode)
    for i in range(dim.shape[0]):
        if i != mode:
            index.append(i)
    return np.moveaxis(np.reshape(mat, list(dim[index]), order = 'F'), 0, mode)
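
As a quick sanity check of the two helpers, the sketch below unfolds a toy tensor along each mode and folds it back. The tensor `toy` and its size are illustrative assumptions; the helper definitions are repeated only so that the snippet runs on its own.

```python
import numpy as np

def ten2mat(tensor, mode):
    # Unfold: bring `mode` to the front, then flatten the rest in Fortran order.
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order='F')

def mat2ten(mat, dim, mode):
    # Fold: inverse of ten2mat for a tensor of size `dim`.
    index = [mode] + [i for i in range(dim.shape[0]) if i != mode]
    return np.moveaxis(np.reshape(mat, list(dim[index]), order='F'), 0, mode)

toy = np.arange(24).reshape(2, 3, 4)   # hypothetical 2 x 3 x 4 tensor
dim = np.array(toy.shape)

unfolded = [ten2mat(toy, k) for k in range(3)]
shapes = [m.shape for m in unfolded]
roundtrip_ok = all(np.array_equal(mat2ten(unfolded[k], dim, k), toy) for k in range(3))
print(shapes, roundtrip_ok)
```

Each mode-`k` unfolding has `dim[k]` rows, and folding with the same `dim` and `mode` recovers the original tensor.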

In [3]:
def svt_tnn(mat, tau, theta):
    [m, n] = mat.shape
    if 2 * m < n:
        u, s, v = np.linalg.svd(mat @ mat.T, full_matrices = 0)
        s = np.sqrt(s)
        idx = np.sum(s > tau)
        mid = np.zeros(idx)
        mid[: theta] = 1
        mid[theta : idx] = (s[theta : idx] - tau) / s[theta : idx]
        return (u[:, : idx] @ np.diag(mid)) @ (u[:, : idx].T @ mat)
    elif m > 2 * n:
        return svt_tnn(mat.T, tau, theta).T
    u, s, v = np.linalg.svd(mat, full_matrices = 0)
    idx = np.sum(s > tau)
    vec = s[: idx].copy()
    vec[theta : idx] = s[theta : idx] - tau
    return u[:, : idx] @ np.diag(vec) @ v[: idx, :]
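
To see what the thresholding does, here is a minimal sketch of plain SVT, which should correspond to the `theta = 0` setting used in the experiments of this notebook. The function name `svt`, the random test matrix, and `tau = 1.0` are assumptions made for illustration; this is not the `svt_tnn` kernel itself.

```python
import numpy as np

def svt(mat, tau):
    """Plain singular value thresholding: shrink every singular value by tau."""
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # values below tau are set to zero
    return (u * s_shrunk) @ vt            # scale the columns of u, then recombine

rng = np.random.default_rng(0)
mat = rng.standard_normal((6, 5))         # hypothetical test matrix
tau = 1.0

s_before = np.linalg.svd(mat, compute_uv=False)
s_after = np.linalg.svd(svt(mat, tau), compute_uv=False)
print(np.round(s_before, 3))
print(np.round(s_after, 3))
```

The singular values of the output are those of the input reduced by `tau` and clipped at zero, which is how the nuclear norm penalty lowers the rank of each unfolding.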

<div class="alert alert-block alert-warning">
<ul>
<li><b><code>compute_mape</code>:</b> <font color="black">Compute the value of Mean Absolute Percentage Error (MAPE).</font></li>
<li><b><code>compute_rmse</code>:</b> <font color="black">Compute the value of Root Mean Square Error (RMSE).</font></li>
</ul>
</div>

> Note that $$\mathrm{MAPE}=\frac{1}{n} \sum_{i=1}^{n} \frac{\left|y_{i}-\hat{y}_{i}\right|}{y_{i}} \times 100, \quad\mathrm{RMSE}=\sqrt{\frac{1}{n} \sum_{i=1}^{n}\left(y_{i}-\hat{y}_{i}\right)^{2}},$$ where $n$ is the total number of estimated values, and $y_i$ and $\hat{y}_i$ are the actual value and its estimation, respectively. The code in this notebook omits the factor of 100 and reports MAPE as a fraction.

In [4]:
def compute_mape(var, var_hat):
    return np.sum(np.abs(var - var_hat) / var) / var.shape[0]

def compute_rmse(var, var_hat):
    return  np.sqrt(np.sum((var - var_hat) ** 2) / var.shape[0])
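
A small worked example of the two metrics, using made-up values `actual` and `estimate` (both are assumptions for illustration). The two functions are repeated so that the snippet is self-contained.

```python
import numpy as np

def compute_mape(var, var_hat):
    return np.sum(np.abs(var - var_hat) / var) / var.shape[0]

def compute_rmse(var, var_hat):
    return np.sqrt(np.sum((var - var_hat) ** 2) / var.shape[0])

actual = np.array([10.0, 20.0, 40.0])     # hypothetical ground truth
estimate = np.array([11.0, 18.0, 40.0])   # hypothetical imputed values

mape = compute_mape(actual, estimate)     # (0.1 + 0.1 + 0.0) / 3
rmse = compute_rmse(actual, estimate)     # sqrt((1 + 4 + 0) / 3)
print(mape, rmse)
```

Both functions expect one-dimensional arrays, which is what indexing a tensor with `pos_test` produces in the imputer.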

The main idea behind LATC-imputer is to approximate partially observed data with both low-rank structure and time series dynamics. The following `imputer` kernel includes some necessary inputs:

<div class="alert alert-block alert-warning">
<ul>
<li><b><code>dense_tensor</code>:</b> <font color="black">This is an input which has the ground truth for validation. If this input is not available, you could use <code>dense_tensor = sparse_tensor.copy()</code> instead.</font></li>
<li><b><code>sparse_tensor</code>:</b> <font color="black">This is a partially observed tensor which has many missing entries.</font></li>
<li><b><code>time_lags</code>:</b> <font color="black">Time lags, e.g., <code>time_lags = np.array([1, 2, 3])</code>. </font></li>
<li><b><code>alpha</code>:</b> <font color="black">Weights for tensors' nuclear norm, e.g., <code>alpha = np.ones(3) / 3</code>. </font></li>
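<li><b><code>rho0</code>:</b> <font color="black">Initial learning rate for ADMM, e.g., <code>rho0 = 0.0005</code>. Inside the kernel, the rate is multiplied by 1.05 at every iteration and capped at <code>1e5</code>. </font></li>
<li><b><code>lambda0</code>:</b> <font color="black">Weight for the time series regressor, e.g., <code>lambda0 = 5 * rho0</code>. If <code>lambda0 = 0</code>, then this imputer reduces to standard low-rank tensor completion (i.e., High-accuracy Low-Rank Tensor Completion, or HaLRTC).</font></li>
<li><b><code>theta</code>:</b> <font color="black">Number of leading singular values that <code>svt_tnn</code> leaves unshrunk, e.g., <code>theta = 0</code>. </font></li>
<li><b><code>epsilon</code>:</b> <font color="black">Stopping criterion on the relative change of the estimate between consecutive iterations, e.g., <code>epsilon = 0.001</code>. </font></li>
<li><b><code>maxiter</code>:</b> <font color="black">Maximum number of iterations, e.g., <code>maxiter = 50</code>. </font></li>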
</ul>
</div>


In [5]:
def imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho0, lambda0, theta, epsilon, maxiter):
    """Low-Rank Autoregressive Tensor Completion, LATC-imputer."""
    dim = np.array(sparse_tensor.shape)
    dim_time = int(np.prod(dim) / dim[0])  # number of time steps per series
    d = len(time_lags)
    max_lag = np.max(time_lags)
    sparse_mat = ten2mat(sparse_tensor, 0)
    pos_missing = np.where(sparse_mat == 0)
    pos_test = np.where((dense_tensor != 0) & (sparse_tensor == 0))
    
    X = np.zeros(np.insert(dim, 0, len(dim))) # \boldsymbol{\mathcal{X}}
    T = np.zeros(np.insert(dim, 0, len(dim))) # \boldsymbol{\mathcal{T}}
    Z = sparse_mat.copy()                     # \boldsymbol{Z}
    Z[pos_missing] = np.mean(sparse_mat[sparse_mat != 0])
    A = 0.001 * np.random.rand(dim[0], d)     # \boldsymbol{A}
    it = 0
    ind = np.zeros((d, dim_time - max_lag), dtype = np.int_)
    for i in range(d):
        ind[i, :] = np.arange(max_lag - time_lags[i], dim_time - time_lags[i])
    last_mat = sparse_mat.copy()
    snorm = np.linalg.norm(sparse_mat, 'fro')
    rho = rho0
    while True:
        rho = min(rho*1.05, 1e5)
        for k in range(len(dim)):
            X[k] = mat2ten(svt_tnn(ten2mat(mat2ten(Z, dim, 0) - T[k] / rho, k), alpha[k] / rho, theta), dim, k)
        tensor_hat = np.einsum('k, kmnt -> mnt', alpha, X)
        mat_hat = ten2mat(tensor_hat, 0)
        mat0 = np.zeros((dim[0], dim_time - max_lag))
        if lambda0 > 0:
            for m in range(dim[0]):
                Qm = mat_hat[m, ind].T
                A[m, :] = np.linalg.pinv(Qm) @ Z[m, max_lag :]
                mat0[m, :] = Qm @ A[m, :]
            mat1 = ten2mat(np.mean(rho * X + T, axis = 0), 0)
            Z[pos_missing] = np.append((mat1[:, : max_lag] / rho), (mat1[:, max_lag :] + lambda0 * mat0) 
                                       / (rho + lambda0), axis = 1)[pos_missing]
        else:
            Z[pos_missing] = (ten2mat(np.mean(X + T / rho, axis = 0), 0))[pos_missing]
        T = T + rho * (X - np.broadcast_to(mat2ten(Z, dim, 0), np.insert(dim, 0, len(dim))))
        tol = np.linalg.norm((mat_hat - last_mat), 'fro') / snorm
        last_mat = mat_hat.copy()
        it += 1
        if it % 100 == 0:
            print('Iter: {}'.format(it))
            print('Tolerance: {:.6}'.format(tol))
            print('MAPE: {:.6}'.format(compute_mape(dense_tensor[pos_test], tensor_hat[pos_test])))
            print('RMSE: {:.6}'.format(compute_rmse(dense_tensor[pos_test], tensor_hat[pos_test])))
            print()
        if (tol < epsilon) or (it >= maxiter):
            break

    print('Total iteration: {}'.format(it))
    print('Tolerance: {:.6}'.format(tol))
    print('Imputation MAPE: {:.6}'.format(compute_mape(dense_tensor[pos_test], tensor_hat[pos_test])))
    print('Imputation RMSE: {:.6}'.format(compute_rmse(dense_tensor[pos_test], tensor_hat[pos_test])))
    print()
    
    return tensor_hat
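
The autoregressive step inside `imputer` can be looked at in isolation. The sketch below builds the same lag index matrix `ind` for a single toy series and fits the coefficients with a pseudo-inverse, mirroring the per-row update of `A`. The series (a Fibonacci-like sequence) and the two lags are assumptions chosen so that the exact coefficients are known.

```python
import numpy as np

time_lags = np.array([1, 2])                           # hypothetical lags
max_lag = np.max(time_lags)
series = np.array([1.0, 1.0, 2.0, 3.0, 5.0, 8.0])      # z_t = z_{t-1} + z_{t-2}
dim_time = series.shape[0]

# Row i of `ind` holds the positions of the values lagged by time_lags[i].
ind = np.zeros((len(time_lags), dim_time - max_lag), dtype=np.int_)
for i in range(len(time_lags)):
    ind[i, :] = np.arange(max_lag - time_lags[i], dim_time - time_lags[i])

Q = series[ind].T                                      # (dim_time - max_lag) x d design matrix
coef = np.linalg.pinv(Q) @ series[max_lag:]            # least-squares AR coefficients
fitted = Q @ coef                                      # one-step autoregressive fit
print(ind)
print(np.round(coef, 6))
```

In the kernel, the fitted values play the role of `mat0`, which is blended with the low-rank estimate when the missing entries of `Z` are updated.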

### Guangzhou data

We generate **random missing (RM)** values on the Guangzhou traffic speed data set.

In [6]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.2

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_tensor = np.multiply(dense_tensor, binary_tensor)

dense_tensor = np.transpose(dense_tensor, [0, 2, 1])
sparse_tensor = np.transpose(sparse_tensor, [0, 2, 1])

del tensor, random_tensor,binary_tensor
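
The rounding expression used for the RM mask can be checked on synthetic input. This sketch assumes that the stored `random_tensor` holds values drawn uniformly from [0, 1); the array generated here is a stand-in, not the file shipped with the data set.

```python
import numpy as np

rng = np.random.default_rng(0)
random_tensor = rng.random((20, 50, 100))   # stand-in for the stored random tensor
missing_rate = 0.2

# Entries below missing_rate round to 0 (missing); the rest round to 1 (observed).
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
observed_fraction = binary_tensor.mean()
print(np.unique(binary_tensor), observed_fraction)
```

Under that assumption, roughly `1 - missing_rate` of the entries stay observed, and multiplying by the mask sets the rest to zero, which `imputer` treats as missing.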

We use `imputer` to fill in the missing entries and measure performance metrics on the ground truth.

In [7]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 142, 143, 144, 145, 146, 147])
alpha = np.ones(3) / 3
rho = 1e-4
lambda0 = 5 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 21
Tolerance: 9.92196e-05
Imputation MAPE: 0.0711961
Imputation RMSE: 2.96892

Running time: 18 seconds


In [8]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.4

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_tensor = np.multiply(dense_tensor, binary_tensor)

dense_tensor = np.transpose(dense_tensor, [0, 2, 1])
sparse_tensor = np.transpose(sparse_tensor, [0, 2, 1])

del tensor, random_tensor,binary_tensor

In [9]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 142, 143, 144, 145, 146, 147])
alpha = np.ones(3) / 3
rho = 1e-4
lambda0 = 5 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 24
Tolerance: 8.38559e-05
Imputation MAPE: 0.0782436
Imputation RMSE: 3.24305

Running time: 18 seconds


In [10]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.6

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_tensor = np.multiply(dense_tensor, binary_tensor)

dense_tensor = np.transpose(dense_tensor, [0, 2, 1])
sparse_tensor = np.transpose(sparse_tensor, [0, 2, 1])

del tensor, random_tensor,binary_tensor

In [11]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 142, 143, 144, 145, 146, 147])
alpha = np.ones(3) / 3
rho = 1e-4
lambda0 = 5 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 30
Tolerance: 9.04404e-05
Imputation MAPE: 0.088627
Imputation RMSE: 3.62113

Running time: 24 seconds


We generate **non-random missing (NM)** values on the Guangzhou traffic speed data set. Then, we conduct the imputation experiment.

In [12]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.2

### Non-random missing (NM) scenario:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
sparse_tensor = np.multiply(dense_tensor, binary_tensor)

dense_tensor = np.transpose(dense_tensor, [0, 2, 1])
sparse_tensor = np.transpose(sparse_tensor, [0, 2, 1])

del tensor, random_matrix, binary_tensor
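
In the NM scenario the mask is constant along the last axis, so each (first index, second index) pair is either fully observed or fully missing. The sketch below reproduces the double loop on toy sizes and compares it with a broadcast version; the sizes, the axis names, and the uniform `random_matrix` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sensors, n_days, n_intervals = 4, 5, 6             # hypothetical toy sizes
random_matrix = rng.random((n_sensors, n_days))      # stand-in for the stored matrix
missing_rate = 0.4

# Loop version, as in the cell above.
loop_mask = np.zeros((n_sensors, n_days, n_intervals))
for i1 in range(n_sensors):
    for i2 in range(n_days):
        loop_mask[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)

# Broadcast version: one decision per (sensor, day), repeated along the last axis.
day_mask = np.round(random_matrix + 0.5 - missing_rate)
vector_mask = np.broadcast_to(day_mask[:, :, None], loop_mask.shape)
print(np.array_equal(loop_mask, vector_mask))
```

Because whole slices are removed at once, the imputer cannot rely on neighboring values within the same slice, which is consistent with the higher errors reported for NM than for RM.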

In [13]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 142, 143, 144, 145, 146, 147])
alpha = np.ones(3) / 3
rho = 1e-4
lambda0 = 1 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 23
Tolerance: 7.63173e-05
Imputation MAPE: 0.104565
Imputation RMSE: 4.20724

Running time: 18 seconds


In [14]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.4

### Non-random missing (NM) scenario:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
sparse_tensor = np.multiply(dense_tensor, binary_tensor)

dense_tensor = np.transpose(dense_tensor, [0, 2, 1])
sparse_tensor = np.transpose(sparse_tensor, [0, 2, 1])

del tensor, random_matrix, binary_tensor

In [15]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 142, 143, 144, 145, 146, 147])
alpha = np.ones(3) / 3
rho = 1e-4
lambda0 = 1 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 32
Tolerance: 9.23915e-05
Imputation MAPE: 0.108882
Imputation RMSE: 4.37916

Running time: 27 seconds


In [16]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.6

### Non-random missing (NM) scenario:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
sparse_tensor = np.multiply(dense_tensor, binary_tensor)

dense_tensor = np.transpose(dense_tensor, [0, 2, 1])
sparse_tensor = np.transpose(sparse_tensor, [0, 2, 1])

del tensor, random_matrix, binary_tensor

In [17]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 142, 143, 144, 145, 146, 147])
alpha = np.ones(3) / 3
rho = 1e-4
lambda0 = 1 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 35
Tolerance: 8.88991e-05
Imputation MAPE: 0.118113
Imputation RMSE: 4.6925

Running time: 31 seconds


### PeMS data

In [18]:
dense_mat = np.load('../datasets/PeMS-data-set/pems.npy')
random_tensor = np.load('../datasets/PeMS-data-set/random_tensor.npy')

missing_rate = 0.2

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_mat = np.multiply(dense_mat, ten2mat(binary_tensor, 0))

sparse_tensor = mat2ten(sparse_mat, np.array(binary_tensor.shape), 0)
dense_tensor = mat2ten(dense_mat, np.array(binary_tensor.shape), 0)

del dense_mat, random_tensor, binary_tensor
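
For data stored as a matrix, `mat2ten` with `mode = 0` decides how the time axis is split into two dimensions. The sketch below checks, on toy sizes, that column `t` of the matrix lands at position `(t % n_per_day, t // n_per_day)` of the tensor. The names `n_series`, `n_per_day`, and `n_days` and their values are assumptions; for the cells here the last two would be 288 and 44.

```python
import numpy as np

def mat2ten(mat, dim, mode):
    # Fold a matrix into a tensor of size `dim` along `mode` (Fortran order).
    index = [mode] + [i for i in range(dim.shape[0]) if i != mode]
    return np.moveaxis(np.reshape(mat, list(dim[index]), order='F'), 0, mode)

n_series, n_per_day, n_days = 2, 4, 3                 # hypothetical toy sizes
mat = np.arange(n_series * n_per_day * n_days).reshape(n_series, -1)
tensor = mat2ten(mat, np.array([n_series, n_per_day, n_days]), 0)

# Column t of the matrix maps to interval t % n_per_day of day t // n_per_day.
checks = [
    np.array_equal(tensor[:, t % n_per_day, t // n_per_day], mat[:, t])
    for t in range(mat.shape[1])
]
print(tensor.shape, all(checks))
```

Consecutive columns therefore stay consecutive within a day, so a lag of `n_per_day` steps in the matrix corresponds to the same interval on the previous day.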

In [19]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 286, 287, 288, 289, 290, 291])
alpha = np.ones(3) / 3
rho = 1e-4
lambda0 = 5 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 26
Tolerance: 8.71907e-05
Imputation MAPE: 0.0336431
Imputation RMSE: 2.3156

Running time: 39 seconds


In [20]:
dense_mat = np.load('../datasets/PeMS-data-set/pems.npy')
random_tensor = np.load('../datasets/PeMS-data-set/random_tensor.npy')

missing_rate = 0.4

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_mat = np.multiply(dense_mat, ten2mat(binary_tensor, 0))

sparse_tensor = mat2ten(sparse_mat, np.array(binary_tensor.shape), 0)
dense_tensor = mat2ten(dense_mat, np.array(binary_tensor.shape), 0)

del dense_mat, random_tensor, binary_tensor

In [21]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 286, 287, 288, 289, 290, 291])
alpha = np.ones(3) / 3
rho = 1e-4
lambda0 = 5 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 28
Tolerance: 8.58061e-05
Imputation MAPE: 0.0413385
Imputation RMSE: 2.83571

Running time: 37 seconds


In [22]:
dense_mat = np.load('../datasets/PeMS-data-set/pems.npy')
random_tensor = np.load('../datasets/PeMS-data-set/random_tensor.npy')

missing_rate = 0.6

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_mat = np.multiply(dense_mat, ten2mat(binary_tensor, 0))

sparse_tensor = mat2ten(sparse_mat, np.array(binary_tensor.shape), 0)
dense_tensor = mat2ten(dense_mat, np.array(binary_tensor.shape), 0)

del dense_mat, random_tensor, binary_tensor

In [23]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 286, 287, 288, 289, 290, 291])
alpha = np.ones(3) / 3
rho = 1e-4
lambda0 = 5 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 30
Tolerance: 8.91685e-05
Imputation MAPE: 0.0536306
Imputation RMSE: 3.61833

Running time: 45 seconds


In [24]:
dense_mat = np.load('../datasets/PeMS-data-set/pems.npy')
random_matrix = np.load('../datasets/PeMS-data-set/random_matrix.npy')

missing_rate = 0.2

### Nonrandom missing (NM) scenario:
binary_tensor = np.zeros((dense_mat.shape[0], 288, 44))
for i1 in range(dense_mat.shape[0]):
    for i2 in range(44):
        binary_tensor[i1,:,i2] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
binary_mat = ten2mat(binary_tensor, 0)
sparse_mat = np.multiply(dense_mat, binary_mat)

sparse_tensor = mat2ten(sparse_mat, np.array(binary_tensor.shape), 0)
dense_tensor = mat2ten(dense_mat, np.array(binary_tensor.shape), 0)

del dense_mat, random_matrix, binary_tensor

In [25]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 286, 287, 288, 289, 290, 291])
alpha = np.ones(3) / 3
rho = 1e-4
lambda0 = 1 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 27
Tolerance: 2.47755e-05
Imputation MAPE: 0.0878944
Imputation RMSE: 5.65188

Running time: 43 seconds


In [26]:
dense_mat = np.load('../datasets/PeMS-data-set/pems.npy')
random_matrix = np.load('../datasets/PeMS-data-set/random_matrix.npy')

missing_rate = 0.4

### Nonrandom missing (NM) scenario:
binary_tensor = np.zeros((dense_mat.shape[0], 288, 44))
for i1 in range(dense_mat.shape[0]):
    for i2 in range(44):
        binary_tensor[i1,:,i2] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
binary_mat = ten2mat(binary_tensor, 0)
sparse_mat = np.multiply(dense_mat, binary_mat)

sparse_tensor = mat2ten(sparse_mat, np.array(binary_tensor.shape), 0)
dense_tensor = mat2ten(dense_mat, np.array(binary_tensor.shape), 0)

del dense_mat, random_matrix, binary_tensor

In [27]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 286, 287, 288, 289, 290, 291])
alpha = np.ones(3) / 3
rho = 1e-4
lambda0 = 1 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 29
Tolerance: 7.52688e-05
Imputation MAPE: 0.096959
Imputation RMSE: 6.11712

Running time: 43 seconds


In [28]:
dense_mat = np.load('../datasets/PeMS-data-set/pems.npy')
random_matrix = np.load('../datasets/PeMS-data-set/random_matrix.npy')

missing_rate = 0.6

### Nonrandom missing (NM) scenario:
binary_tensor = np.zeros((dense_mat.shape[0], 288, 44))
for i1 in range(dense_mat.shape[0]):
    for i2 in range(44):
        binary_tensor[i1,:,i2] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
binary_mat = ten2mat(binary_tensor, 0)
sparse_mat = np.multiply(dense_mat, binary_mat)

sparse_tensor = mat2ten(sparse_mat, np.array(binary_tensor.shape), 0)
dense_tensor = mat2ten(dense_mat, np.array(binary_tensor.shape), 0)

del dense_mat, random_matrix, binary_tensor

In [29]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 286, 287, 288, 289, 290, 291])
alpha = np.ones(3) / 3
rho = 1e-4
lambda0 = 1 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 33
Tolerance: 3.77374e-05
Imputation MAPE: 0.112362
Imputation RMSE: 6.83121

Running time: 45 seconds


### Electricity data

- **Random Missing (RM)**:

In [30]:
dense_mat = np.load('../datasets/Electricity-data-set/electricity35.npy')
random_tensor = np.load('../datasets/Electricity-data-set/random_tensor.npy')

missing_rate = 0.2

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_mat = np.multiply(dense_mat, ten2mat(binary_tensor, 0))

sparse_tensor = mat2ten(sparse_mat, np.array(binary_tensor.shape), 0)
dense_tensor = mat2ten(dense_mat, np.array(binary_tensor.shape), 0)

del dense_mat, random_tensor, binary_tensor

In [31]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 22, 23, 24, 25, 26, 27])
alpha = np.ones(3) / 3
rho = 1e-6
lambda0 = 5 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 67
Tolerance: 9.32503e-05
Imputation MAPE: 0.0979131
Imputation RMSE: 527.227

Running time: 11 seconds


In [32]:
dense_mat = np.load('../datasets/Electricity-data-set/electricity35.npy')
random_tensor = np.load('../datasets/Electricity-data-set/random_tensor.npy')

missing_rate = 0.4

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_mat = np.multiply(dense_mat, ten2mat(binary_tensor, 0))

sparse_tensor = mat2ten(sparse_mat, np.array(binary_tensor.shape), 0)
dense_tensor = mat2ten(dense_mat, np.array(binary_tensor.shape), 0)

del dense_mat, random_tensor, binary_tensor

In [33]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 22, 23, 24, 25, 26, 27])
alpha = np.ones(3) / 3
rho = 1e-6
lambda0 = 5 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 70
Tolerance: 9.78155e-05
Imputation MAPE: 0.106619
Imputation RMSE: 738.185

Running time: 12 seconds


In [34]:
dense_mat = np.load('../datasets/Electricity-data-set/electricity35.npy')
random_tensor = np.load('../datasets/Electricity-data-set/random_tensor.npy')

missing_rate = 0.6

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_mat = np.multiply(dense_mat, ten2mat(binary_tensor, 0))

sparse_tensor = mat2ten(sparse_mat, np.array(binary_tensor.shape), 0)
dense_tensor = mat2ten(dense_mat, np.array(binary_tensor.shape), 0)

del dense_mat, random_tensor, binary_tensor

In [35]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 22, 23, 24, 25, 26, 27])
alpha = np.ones(3) / 3
rho = 1e-6
lambda0 = 5 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 77
Tolerance: 9.99196e-05
Imputation MAPE: 0.1243
Imputation RMSE: 845.817

Running time: 16 seconds


- **Nonrandom Missing (NM)**:

In [36]:
dense_mat = np.load('../datasets/Electricity-data-set/electricity35.npy')
random_matrix = np.load('../datasets/Electricity-data-set/random_matrix.npy')

missing_rate = 0.2

### Nonrandom missing (NM) scenario:
binary_tensor = np.zeros((dense_mat.shape[0], 24, 35))
for i1 in range(dense_mat.shape[0]):
    for i2 in range(35):
        binary_tensor[i1,:,i2] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
binary_mat = ten2mat(binary_tensor, 0)
sparse_mat = np.multiply(dense_mat, binary_mat)

sparse_tensor = mat2ten(sparse_mat, np.array(binary_tensor.shape), 0)
dense_tensor = mat2ten(dense_mat, np.array(binary_tensor.shape), 0)

del dense_mat, random_matrix, binary_tensor

In [37]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 22, 23, 24, 25, 26, 27])
alpha = np.ones(3) / 3
rho = 1e-6
lambda0 = 5 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 67
Tolerance: 9.32324e-05
Imputation MAPE: 0.165459
Imputation RMSE: 801.571

Running time: 10 seconds


In [38]:
dense_mat = np.load('../datasets/Electricity-data-set/electricity35.npy')
random_matrix = np.load('../datasets/Electricity-data-set/random_matrix.npy')

missing_rate = 0.4

### Nonrandom missing (NM) scenario:
binary_tensor = np.zeros((dense_mat.shape[0], 24, 35))
for i1 in range(dense_mat.shape[0]):
    for i2 in range(35):
        binary_tensor[i1,:,i2] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
binary_mat = ten2mat(binary_tensor, 0)
sparse_mat = np.multiply(dense_mat, binary_mat)

sparse_tensor = mat2ten(sparse_mat, np.array(binary_tensor.shape), 0)
dense_tensor = mat2ten(dense_mat, np.array(binary_tensor.shape), 0)

del dense_mat, random_matrix, binary_tensor

In [39]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 22, 23, 24, 25, 26, 27])
alpha = np.ones(3) / 3
rho = 1e-6
lambda0 = 5 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 92
Tolerance: 9.71735e-05
Imputation MAPE: 0.155089
Imputation RMSE: 1467.23

Running time: 16 seconds


In [40]:
dense_mat = np.load('../datasets/Electricity-data-set/electricity35.npy')
random_matrix = np.load('../datasets/Electricity-data-set/random_matrix.npy')

missing_rate = 0.6

### Nonrandom missing (NM) scenario:
binary_tensor = np.zeros((dense_mat.shape[0], 24, 35))
for i1 in range(dense_mat.shape[0]):
    for i2 in range(35):
        binary_tensor[i1,:,i2] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
binary_mat = ten2mat(binary_tensor, 0)
sparse_mat = np.multiply(dense_mat, binary_mat)

sparse_tensor = mat2ten(sparse_mat, np.array(binary_tensor.shape), 0)
dense_tensor = mat2ten(dense_mat, np.array(binary_tensor.shape), 0)

del dense_mat, random_matrix, binary_tensor

In [41]:
import time
start = time.time()
time_lags = np.array([1, 2, 3, 4, 5, 6, 22, 23, 24, 25, 26, 27])
alpha = np.ones(3) / 3
rho = 1e-6
lambda0 = 5 * rho
theta = 0
epsilon = 1e-4
maxiter = 100
tensor_hat = imputer(dense_tensor, sparse_tensor, time_lags, alpha, rho, lambda0, theta, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 100
Tolerance: 0.000230474
MAPE: 0.174645
RMSE: 5671.92

Total iteration: 100
Tolerance: 0.000230474
Imputation MAPE: 0.174645
Imputation RMSE: 5671.92

Running time: 17 seconds


### License

<div class="alert alert-block alert-danger">
<b>This work is released under the MIT license.</b>
</div>