In [1]:
import numpy as np
import sparse
import tensorly as tl

In this notebook we will see how to perform a symmetric CP decomposition on sparse tensors using the symmetric robust tensor power iteration.

Here, the input tensor is symmetric, and so is the decomposition, so the factors for each mode are the same.
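
To make "the factors for each mode are the same" concrete, here is a minimal dense NumPy sketch (not part of the TensorLy API). It builds a small third-order tensor from a single factor matrix and checks that the result does not change when its modes are permuted. The names `factor`, `symmetric_tensor` and the tiny sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
size, rank = 6, 2                      # tiny illustrative sizes (assumption)
factor = rng.random((size, rank))      # one factor matrix shared by all modes
weights = np.ones(rank)

# T[i, j, k] = sum_r weights[r] * factor[i, r] * factor[j, r] * factor[k, r]
symmetric_tensor = np.einsum('r,ir,jr,kr->ijk', weights, factor, factor, factor)

# A symmetric tensor is unchanged by any permutation of its modes.
permutations = [(0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]
is_symmetric = all(
    np.allclose(symmetric_tensor, symmetric_tensor.transpose(perm))
    for perm in permutations
)
print(is_symmetric)
```

Because one factor matrix describes every mode, a symmetric decomposition only needs to return the weights and a single factor matrix.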

In [2]:
size = 100
rank = 5

starting_factor = sparse.random(shape=(size, rank), density=0.3)
print(starting_factor)

weights = sparse.ones(rank)

<COO: shape=(100, 5), dtype=float64, nnz=150, fill_value=0.0>


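Now convert it to a tensor. As with the other sparse operations, it is important to use `kruskal_to_tensor` from the sparse backend so that the result stays in sparse format. For this toy example a dense 100 × 100 × 100 float64 tensor would only take about 8 MB, but at larger sizes a fully dense version can quickly exceed the available memory.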

In [3]:
from tensorly.contrib.sparse.kruskal_tensor import kruskal_to_tensor
tensor = kruskal_to_tensor((weights, [starting_factor]*3))
tensor

Format: coo
Data Type: float64
Shape: (100, 100, 100)
nnz: 144637
Density: 0.144637
Read-only: True
Size: 4.4M
Storage ratio: 0.6
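
The "Size" and "Storage ratio" entries above can be sanity-checked with a little arithmetic. The sketch below assumes that each stored nonzero costs one float64 value plus one int64 coordinate per mode; that layout is an assumption about the COO format, not something the output states.

```python
shape = (100, 100, 100)
nnz = 144637                # number of stored nonzeros, from the output above
bytes_per_value = 8         # float64
bytes_per_index = 8         # assumption: int64 coordinates, one per mode

# Dense storage: every entry is stored.
dense_bytes = shape[0] * shape[1] * shape[2] * bytes_per_value

# COO storage: one value plus len(shape) coordinates per nonzero.
coo_bytes = nnz * (bytes_per_value + len(shape) * bytes_per_index)

density = nnz / (shape[0] * shape[1] * shape[2])
storage_ratio = coo_bytes / dense_bytes

print(f"dense: {dense_bytes / 1e6:.1f} MB")
print(f"COO:   {coo_bytes / 2**20:.1f} MiB")
print(f"density: {density}, storage ratio: {storage_ratio:.1f}")
```

Under these assumptions the sparse format only pays off when the density is well below 1/4, since each nonzero costs four times as much as a dense entry.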


In [4]:
import time
%load_ext memory_profiler

We import the symmetric CP decomposition:

In [5]:
from tensorly.contrib.sparse.decomposition import symmetric_parafac_power_iteration as parafac_sparse
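
For intuition, the idea behind tensor power iteration can be sketched in plain NumPy on a tiny dense tensor. This is a simplified illustration, not TensorLy's implementation: the function name `symmetric_power_iteration`, the restart and iteration counts, and the toy tensor with orthogonal components are all assumptions made for the example.

```python
import numpy as np

def symmetric_power_iteration(tensor, n_restarts=10, n_iterations=30, seed=0):
    """Estimate one rank-1 symmetric component (eigenvalue, eigenvector)."""
    rng = np.random.default_rng(seed)
    best_value, best_vector = -np.inf, None
    for _ in range(n_restarts):
        vector = rng.standard_normal(tensor.shape[0])
        vector /= np.linalg.norm(vector)
        for _ in range(n_iterations):
            # Power update: contract the tensor with the vector on two modes.
            vector = np.einsum('ijk,j,k->i', tensor, vector, vector)
            vector /= np.linalg.norm(vector)
        # Score the candidate by contracting on all three modes.
        value = np.einsum('ijk,i,j,k->', tensor, vector, vector, vector)
        if value > best_value:
            best_value, best_vector = value, vector
    return best_value, best_vector

# Toy symmetric tensor with two orthogonal components of weight 3 and 1.
basis = np.eye(4)[:, :2]
true_weights = np.array([3.0, 1.0])
toy = np.einsum('r,ir,jr,kr->ijk', true_weights, basis, basis, basis)

recovered = []
residual = toy.copy()
for _ in range(2):
    value, vector = symmetric_power_iteration(residual)
    recovered.append(value)
    # Deflation: subtract the rank-1 component that was just found.
    residual = residual - value * np.einsum('i,j,k->ijk', vector, vector, vector)

print(sorted(recovered, reverse=True))
```

Each pass keeps the best candidate over several random restarts and then deflates the tensor, so later passes pick up the remaining components.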

In [6]:
%%memit
start_time = time.time()
sparse_kruskal = parafac_sparse(tensor, rank=2*rank, verbose=True)
end_time = time.time()
total_time = end_time - start_time
print('Took %d mins %d secs' % (divmod(total_time, 60)))

Best score of 10: <COO: shape=(), dtype=float64, nnz=1, fill_value=0.0>
Eingenvalue: <COO: shape=(), dtype=float64, nnz=1, fill_value=0.0>, explained: 0.6906871094733891
Best score of 10: <COO: shape=(), dtype=float64, nnz=1, fill_value=0.0>
Eingenvalue: <COO: shape=(), dtype=float64, nnz=1, fill_value=0.0>, explained: 0.7417982119451308
Best score of 10: <COO: shape=(), dtype=float64, nnz=1, fill_value=0.0>
Eingenvalue: <COO: shape=(), dtype=float64, nnz=1, fill_value=0.0>, explained: 0.7155941455024695
Best score of 10: <COO: shape=(), dtype=float64, nnz=1, fill_value=0.0>
Eingenvalue: <COO: shape=(), dtype=float64, nnz=1, fill_value=0.0>, explained: 0.7148363070325362
Best score of 10: <COO: shape=(), dtype=float64, nnz=1, fill_value=0.0>
Eingenvalue: <COO: shape=(), dtype=float64, nnz=1, fill_value=0.0>, explained: 0.44918586263031257
Best score of 10: <COO: shape=(), dtype=float64, nnz=1, fill_value=0.0>
Eingenvalue: <COO: shape=(), dtype=float64, nnz=1, fill_value=0.0>, explained

Let's look at the result:

In [7]:
sparse_kruskal

(<COO: shape=(10,), dtype=float64, nnz=10, fill_value=0.0>,
 <COO: shape=(100, 10), dtype=float64, nnz=910, fill_value=0.0>)

In [8]:
weights_sparse = sparse_kruskal[0]

In [9]:
factors_sparse = sparse_kruskal[1]

Because `factors_sparse` is sparse, we can reconstruct the full tensor from it without using too much memory. In general this will not be the case, but it is for our toy example. Let's do this to look at the relative error of the decomposition.

You can obtain the reconstruction as follows:

In [10]:
rec = kruskal_to_tensor((weights_sparse, [factors_sparse]*3))

In [11]:
tl.norm(tensor - rec)/tl.norm(tensor)

0.025484865905353097
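
The quantity computed above is a relative error: the norm of the difference divided by the norm of the original tensor. The dense NumPy sketch below spells this out on a tiny made-up array, assuming the default norm is the square root of the sum of squared entries. The names `reference` and `approximation` are illustrative.

```python
import numpy as np

reference = np.arange(8, dtype=float).reshape(2, 2, 2)
approximation = reference + 0.01       # every entry is off by 0.01

difference = reference - approximation
absolute_error = np.sqrt(np.sum(difference ** 2))
relative_error = absolute_error / np.sqrt(np.sum(reference ** 2))

print(absolute_error, relative_error)
```

Dividing by the norm of the reference makes the figure independent of the overall scale of the tensor, which is why it is easier to interpret than the absolute error.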