## Low-Rank Matrix Completion Imputer Based on Nuclear Norm  (LRMC-NN-imputer)

This notebook shows how to implement a LRMC-NN imputer on some real-world data sets (e.g., PeMS traffic speed data, Guangzhou traffic speed data). To overcome the problem of missing values within multivariate time series data, this method takes into account the low-rank structure of data matrix.

In [1]:
import numpy as np
from numpy.linalg import inv as inv

### Define LRMC-imputer kernel

We start by introducing the necessary function that relies on `Numpy`.

<div class="alert alert-block alert-warning">
<ul>
<li><b><code>svt</code>:</b> <font color="black">Implement the process of Singular Value Thresholding (SVT).</font></li>
</ul>
</div>

In [2]:
def svt(mat, tau):
    u, s, v = np.linalg.svd(mat, full_matrices = 0)
    vec = s - tau
    vec[vec < 0] = 0
    return np.matmul(np.matmul(u, np.diag(vec)), v)

<div class="alert alert-block alert-warning">
<ul>
<li><b><code>compute_mape</code>:</b> <font color="black">Compute the value of Mean Absolute Percentage Error (MAPE).</font></li>
<li><b><code>compute_rmse</code>:</b> <font color="black">Compute the value of Root Mean Square Error (RMSE).</font></li>
</ul>
</div>

> Note that $$\mathrm{MAPE}=\frac{1}{n} \sum_{i=1}^{n} \frac{\left|y_{i}-\hat{y}_{i}\right|}{y_{i}} \times 100, \quad\mathrm{RMSE}=\sqrt{\frac{1}{n} \sum_{i=1}^{n}\left(y_{i}-\hat{y}_{i}\right)^{2}},$$ where $n$ is the total number of estimated values, and $y_i$ and $\hat{y}_i$ are the actual value and its estimation, respectively.

In [3]:
def compute_mape(var, var_hat):
    return np.sum(np.abs(var - var_hat) / var) / var.shape[0]

def compute_rmse(var, var_hat):
    return  np.sqrt(np.sum((var - var_hat) ** 2) / var.shape[0])

The main idea behind LRMC-NN-imputer is to approximate partially observed data with matrix low-rank structure. The following `imputer` kernel includes some necessary inputs:

<div class="alert alert-block alert-warning">
<ul>
<li><b><code>sparse_mat</code>:</b> <font color="black">This is a partially observed matrix which has many missing entries.</font></li>
<li><b><code>dense_mat</code>:</b> <font color="black">This is an input which has the ground truth for validation. If this input is not available, you could use <code>dense_tensor = sparse_mat.copy()</code> instead.</font></li>
<li><b><code>rho</code>:</b> <font color="black">Learning rate for ADMM, e.g., <code>rho = 0.0005</code>. </font></li>
<li><b><code>epsilon</code>:</b> <font color="black">Stop criteria, e.g., <code>epsilon = 0.001</code>. </font></li>
<li><b><code>maxiter</code>:</b> <font color="black">Maximum iteration to stop algorithm, e.g., <code>maxiter = 50</code>. </font></li>
</ul>
</div>


In [4]:
def imputer(dense_mat, sparse_mat, rho, epsilon, maxiter):
    pos_train = np.where(sparse_mat != 0)
    pos_test = np.where((sparse_mat == 0) & (dense_mat != 0))
    binary_mat = sparse_mat.copy()
    binary_mat[pos_train] = 1
    snorm = np.linalg.norm(sparse_mat, 'fro')
    X = sparse_mat.copy()
    Z = sparse_mat.copy()
    T = sparse_mat.copy()
    last_X = np.ones_like(X) * np.inf
    for it in range(maxiter):
        Z = svt(X + T / rho, 1 / rho)
        X = Z - T / rho
        X[pos_train] = sparse_mat[pos_train]
        T = T - rho * (Z - X)
        tol = np.linalg.norm((X - last_X), 'fro') / snorm
        last_X = X.copy()
        if (it+1) % 200 == 0:
            print('Iter: {}'.format(it))
            print('Tolerance: {:.6}'.format(tol))
            print('MAPE: {:.6}'.format(compute_mape(dense_mat[pos_test], X[pos_test])))
            print('RMSE: {:.6}'.format(compute_rmse(dense_mat[pos_test], X[pos_test])))
            print()
        if (tol < epsilon):
            break

    print('Total iteration: {}'.format(it))
    print('Tolerance: {:.6}'.format(tol))
    print('Imputation MAPE: {:.6}'.format(compute_mape(dense_mat[pos_test], X[pos_test])))
    print('Imputation RMSE: {:.6}'.format(compute_rmse(dense_mat[pos_test], X[pos_test])))
    print()
    return X

### Guangzhou data

We generate **random missing (RM)** values on Guangzhou traffic speed data set.

In [5]:
np.random.seed(110)

In [6]:
import scipy.io

def ten2mat(tensor, mode):
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')

tensor = scipy.io.loadmat('../../Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../../Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.2

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_tensor = np.multiply(dense_tensor, binary_tensor)

dense_tensor = np.transpose(dense_tensor, [0, 2, 1])
sparse_tensor = np.transpose(sparse_tensor, [0, 2, 1])

dense_mat = ten2mat(dense_tensor, 0)
sparse_mat = ten2mat(sparse_tensor, 0)
print('Matrix shape:')
print(dense_mat.shape)

Matrix shape:
(214, 8784)


We use `imputer` to fill in the missing entries and measure performance metrics on the ground truth.

In [7]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 69
Tolerance: 8.56618e-05
Imputation MAPE: 0.0975444
Imputation RMSE: 4.02125

Running time: 15 seconds


In [8]:
import scipy.io

def ten2mat(tensor, mode):
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')

tensor = scipy.io.loadmat('../../Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../../Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.4

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_tensor = np.multiply(dense_tensor, binary_tensor)

dense_tensor = np.transpose(dense_tensor, [0, 2, 1])
sparse_tensor = np.transpose(sparse_tensor, [0, 2, 1])

dense_mat = ten2mat(dense_tensor, 0)
sparse_mat = ten2mat(sparse_tensor, 0)
print('Matrix shape:')
print(dense_mat.shape)

Matrix shape:
(214, 8784)


In [9]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 95
Tolerance: 8.08163e-05
Imputation MAPE: 0.100902
Imputation RMSE: 4.1457

Running time: 20 seconds


In [10]:
import scipy.io

def ten2mat(tensor, mode):
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')

tensor = scipy.io.loadmat('../../Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_tensor = scipy.io.loadmat('../../Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.6

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_tensor = np.multiply(dense_tensor, binary_tensor)

dense_tensor = np.transpose(dense_tensor, [0, 2, 1])
sparse_tensor = np.transpose(sparse_tensor, [0, 2, 1])

dense_mat = ten2mat(dense_tensor, 0)
sparse_mat = ten2mat(sparse_tensor, 0)
print('Matrix shape:')
print(dense_mat.shape)

Matrix shape:
(214, 8784)


In [11]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 131
Tolerance: 8.15136e-05
Imputation MAPE: 0.10673
Imputation RMSE: 4.34484

Running time: 28 seconds


We generate **non-random missing (NM)** values on Guangzhou traffic speed data set. Then, we conduct the imputation experiment.

In [12]:
import scipy.io

def ten2mat(tensor, mode):
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')

tensor = scipy.io.loadmat('../../Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../../Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.2

### Non-random missing (NM) scenario:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
sparse_tensor = np.multiply(dense_tensor, binary_tensor)

dense_tensor = np.transpose(dense_tensor, [0, 2, 1])
sparse_tensor = np.transpose(sparse_tensor, [0, 2, 1])

dense_mat = ten2mat(dense_tensor, 0)
sparse_mat = ten2mat(sparse_tensor, 0)
print('Matrix shape:')
print(dense_mat.shape)

del tensor, random_matrix, binary_tensor

Matrix shape:
(214, 8784)


In [13]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 66
Tolerance: 9.74267e-05
Imputation MAPE: 0.102178
Imputation RMSE: 4.17095

Running time: 15 seconds


In [14]:
import scipy.io

def ten2mat(tensor, mode):
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')

tensor = scipy.io.loadmat('../../Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../../Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.4

### Non-random missing (NM) scenario:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
sparse_tensor = np.multiply(dense_tensor, binary_tensor)

dense_tensor = np.transpose(dense_tensor, [0, 2, 1])
sparse_tensor = np.transpose(sparse_tensor, [0, 2, 1])

dense_mat = ten2mat(dense_tensor, 0)
sparse_mat = ten2mat(sparse_tensor, 0)
print('Matrix shape:')
print(dense_mat.shape)

del tensor, random_matrix, binary_tensor

Matrix shape:
(214, 8784)


In [15]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 103
Tolerance: 8.5583e-05
Imputation MAPE: 0.105649
Imputation RMSE: 4.3234

Running time: 23 seconds


In [16]:
import scipy.io

def ten2mat(tensor, mode):
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')

tensor = scipy.io.loadmat('../../Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../../Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']

missing_rate = 0.6

### Non-random missing (NM) scenario:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
sparse_tensor = np.multiply(dense_tensor, binary_tensor)

dense_tensor = np.transpose(dense_tensor, [0, 2, 1])
sparse_tensor = np.transpose(sparse_tensor, [0, 2, 1])

dense_mat = ten2mat(dense_tensor, 0)
sparse_mat = ten2mat(sparse_tensor, 0)
print('Matrix shape:')
print(dense_mat.shape)

del tensor, random_matrix, binary_tensor

Matrix shape:
(214, 8784)


In [17]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 165
Tolerance: 9.56238e-05
Imputation MAPE: 0.113354
Imputation RMSE: 4.61244

Running time: 36 seconds


### PeMS data

In [18]:
np.random.seed(122)

In [19]:
dense_mat = np.load('../../PeMS-data-set/pems.npy')
random_tensor = np.load('../../PeMS-data-set/random_tensor.npy')

missing_rate = 0.2

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_mat = np.multiply(dense_mat, ten2mat(binary_tensor, 0))

print('Matrix shape:')
print(dense_mat.shape)

del random_tensor, binary_tensor

Matrix shape:
(228, 12672)


In [20]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 84
Tolerance: 8.83005e-05
Imputation MAPE: 0.0672923
Imputation RMSE: 4.65687

Running time: 32 seconds


In [21]:
dense_mat = np.load('../../PeMS-data-set/pems.npy')
random_tensor = np.load('../../PeMS-data-set/random_tensor.npy')

missing_rate = 0.4

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_mat = np.multiply(dense_mat, ten2mat(binary_tensor, 0))

print('Matrix shape:')
print(dense_mat.shape)

del random_tensor, binary_tensor

Matrix shape:
(228, 12672)


In [22]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 90
Tolerance: 8.95425e-05
Imputation MAPE: 0.0744392
Imputation RMSE: 5.04453

Running time: 34 seconds


In [23]:
dense_mat = np.load('../../PeMS-data-set/pems.npy')
random_tensor = np.load('../../PeMS-data-set/random_tensor.npy')

missing_rate = 0.6

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_mat = np.multiply(dense_mat, ten2mat(binary_tensor, 0))

print('Matrix shape:')
print(dense_mat.shape)

del random_tensor, binary_tensor

Matrix shape:
(228, 12672)


In [24]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 114
Tolerance: 7.78685e-05
Imputation MAPE: 0.0844793
Imputation RMSE: 5.57011

Running time: 42 seconds


In [25]:
dense_mat = np.load('../../PeMS-data-set/pems.npy')
random_matrix = np.load('../../PeMS-data-set/random_matrix.npy')

missing_rate = 0.2

### Nonrandom missing (NM) scenario:
binary_tensor = np.zeros((dense_mat.shape[0], 288, 44))
for i1 in range(dense_mat.shape[0]):
    for i2 in range(44):
        binary_tensor[i1,:,i2] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
binary_mat = ten2mat(binary_tensor, 0)
sparse_mat = np.multiply(dense_mat, binary_mat)

print('Matrix shape:')
print(dense_mat.shape)

del random_matrix, binary_tensor

Matrix shape:
(228, 12672)


In [26]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 75
Tolerance: 9.9873e-05
Imputation MAPE: 0.0769798
Imputation RMSE: 5.22173

Running time: 29 seconds


In [27]:
dense_mat = np.load('../../PeMS-data-set/pems.npy')
random_matrix = np.load('../../PeMS-data-set/random_matrix.npy')

missing_rate = 0.4

### Nonrandom missing (NM) scenario:
binary_tensor = np.zeros((dense_mat.shape[0], 288, 44))
for i1 in range(dense_mat.shape[0]):
    for i2 in range(44):
        binary_tensor[i1,:,i2] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
binary_mat = ten2mat(binary_tensor, 0)
sparse_mat = np.multiply(dense_mat, binary_mat)

print('Matrix shape:')
print(dense_mat.shape)

del random_matrix, binary_tensor

Matrix shape:
(228, 12672)


In [28]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 99
Tolerance: 9.64282e-05
Imputation MAPE: 0.0840071
Imputation RMSE: 5.6179

Running time: 38 seconds


In [29]:
dense_mat = np.load('../../PeMS-data-set/pems.npy')
random_matrix = np.load('../../PeMS-data-set/random_matrix.npy')

missing_rate = 0.6

### Nonrandom missing (NM) scenario:
binary_tensor = np.zeros((dense_mat.shape[0], 288, 44))
for i1 in range(dense_mat.shape[0]):
    for i2 in range(44):
        binary_tensor[i1,:,i2] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
binary_mat = ten2mat(binary_tensor, 0)
sparse_mat = np.multiply(dense_mat, binary_mat)

print('Matrix shape:')
print(dense_mat.shape)

del random_matrix, binary_tensor

Matrix shape:
(228, 12672)


In [30]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Total iteration: 141
Tolerance: 9.15216e-05
Imputation MAPE: 0.0984319
Imputation RMSE: 6.30658

Running time: 52 seconds


### Electricity data

- **Random Missing (RM)**:

In [31]:
np.random.seed(400)

In [32]:
dense_mat = np.load('../../Electricity-data-set/electricity35.npy')
random_tensor = np.load('../../Electricity-data-set/random_tensor.npy')

missing_rate = 0.2

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_mat = np.multiply(dense_mat, ten2mat(binary_tensor, 0))

print('Matrix shape:')
print(dense_mat.shape)

del random_tensor, binary_tensor

Matrix shape:
(321, 840)


In [33]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 199
Tolerance: 0.000610205
MAPE: 0.110965
RMSE: 5082.8

Total iteration: 199
Tolerance: 0.000610205
Imputation MAPE: 0.110965
Imputation RMSE: 5082.8

Running time: 9 seconds


In [34]:
dense_mat = np.load('../../Electricity-data-set/electricity35.npy')
random_tensor = np.load('../../Electricity-data-set/random_tensor.npy')

missing_rate = 0.4

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_mat = np.multiply(dense_mat, ten2mat(binary_tensor, 0))

print('Matrix shape:')
print(dense_mat.shape)

del random_tensor, binary_tensor

Matrix shape:
(321, 840)


In [35]:
import time
start = time.time()
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 199
Tolerance: 0.000986468
MAPE: 0.125715
RMSE: 6190.59

Total iteration: 199
Tolerance: 0.000986468
Imputation MAPE: 0.125715
Imputation RMSE: 6190.59

Running time: 10 seconds


In [36]:
dense_mat = np.load('../../Electricity-data-set/electricity35.npy')
random_tensor = np.load('../../Electricity-data-set/random_tensor.npy')

missing_rate = 0.6

### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
sparse_mat = np.multiply(dense_mat, ten2mat(binary_tensor, 0))

print('Matrix shape:')
print(dense_mat.shape)

del random_tensor, binary_tensor

Matrix shape:
(321, 840)


In [37]:
import time
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 199
Tolerance: 0.00121444
MAPE: 0.150808
RMSE: 7315.67

Total iteration: 199
Tolerance: 0.00121444
Imputation MAPE: 0.150808
Imputation RMSE: 7315.67

Running time: 20 seconds


In [38]:
dense_mat = np.load('../../Electricity-data-set/electricity35.npy')
random_matrix = np.load('../../Electricity-data-set/random_matrix.npy')

missing_rate = 0.2

### Nonrandom missing (NM) scenario:
binary_tensor = np.zeros((dense_mat.shape[0], 24, 35))
for i1 in range(dense_mat.shape[0]):
    for i2 in range(35):
        binary_tensor[i1,:,i2] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
binary_mat = ten2mat(binary_tensor, 0)
sparse_mat = np.multiply(dense_mat, binary_mat)

print('Matrix shape:')
print(dense_mat.shape)

del random_matrix, binary_tensor

Matrix shape:
(321, 840)


In [39]:
import time
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 199
Tolerance: 0.000594105
MAPE: 0.174351
RMSE: 3865.44

Total iteration: 199
Tolerance: 0.000594105
Imputation MAPE: 0.174351
Imputation RMSE: 3865.44

Running time: 31 seconds


In [40]:
dense_mat = np.load('../../Electricity-data-set/electricity35.npy')
random_matrix = np.load('../../Electricity-data-set/random_matrix.npy')

missing_rate = 0.4

### Nonrandom missing (NM) scenario:
binary_tensor = np.zeros((dense_mat.shape[0], 24, 35))
for i1 in range(dense_mat.shape[0]):
    for i2 in range(35):
        binary_tensor[i1,:,i2] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
binary_mat = ten2mat(binary_tensor, 0)
sparse_mat = np.multiply(dense_mat, binary_mat)

print('Matrix shape:')
print(dense_mat.shape)

del random_matrix, binary_tensor

Matrix shape:
(321, 840)


In [41]:
import time
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 199
Tolerance: 0.0008677
MAPE: 0.16666
RMSE: 6089.24

Total iteration: 199
Tolerance: 0.0008677
Imputation MAPE: 0.16666
Imputation RMSE: 6089.24

Running time: 42 seconds


In [42]:
dense_mat = np.load('../../Electricity-data-set/electricity35.npy')
random_matrix = np.load('../../Electricity-data-set/random_matrix.npy')

missing_rate = 0.6

### Nonrandom missing (NM) scenario:
binary_tensor = np.zeros((dense_mat.shape[0], 24, 35))
for i1 in range(dense_mat.shape[0]):
    for i2 in range(35):
        binary_tensor[i1,:,i2] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
binary_mat = ten2mat(binary_tensor, 0)
sparse_mat = np.multiply(dense_mat, binary_mat)

print('Matrix shape:')
print(dense_mat.shape)

del random_matrix, binary_tensor

Matrix shape:
(321, 840)


In [43]:
import time
rho = 1e-4
epsilon = 1e-4
maxiter = 200
mat_hat = imputer(dense_mat, sparse_mat, rho, epsilon, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))

Iter: 199
Tolerance: 0.00110955
MAPE: 0.195524
RMSE: 7791.05

Total iteration: 199
Tolerance: 0.00110955
Imputation MAPE: 0.195524
Imputation RMSE: 7791.05

Running time: 52 seconds


### License

<div class="alert alert-block alert-danger">
<b>This work is released under the MIT license.</b>
</div>