# Cellular sheaves on graphs 
## Learning the sheaf Laplacian through a minimum total variation approach

________________

### Generating a toy-case topology

In [1]:
import numpy as np 

In [2]:
# Let's generate a toy topology for our example

nodes = [i for i in range(7)]
edges = [
    (0,1),
    (0,3),
    (0,6),
    (1,2),
    (1,5),
    (2,4),
    (4,6),
    (5,6)
]

V = len(nodes)
E = len(edges)

d = 20                                          # Dimension of the node and edge stalks

# Restriction maps: F[e][v] sends the stalk of node v to the stalk of edge e
F = {
    e: {
        e[0]: np.random.randn(d, d),
        e[1]: np.random.randn(d, d),
    }
    for e in edges
}

# Sheaf representation 

# Coboundary map

B = np.zeros((d*E, d*V))

for i, (u, v) in enumerate(edges):
    # Row block i holds the maps of edge (u, v): +F_u on node u, -F_v on node v
    B[i*d:(i+1)*d, u*d:(u+1)*d] = F[(u, v)][u]
    B[i*d:(i+1)*d, v*d:(v+1)*d] = -F[(u, v)][v]

# Sheaf Laplacian

L_f = B.T @ B
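
The construction above determines the block structure of the sheaf Laplacian. The following is a minimal sketch on a hypothetical 3-node path (the sizes, the seed and the `block` helper are illustrative assumptions, not part of the experiment) that checks the expected properties: symmetry, positive semidefiniteness, diagonal blocks equal to the sum of $F^T F$ over incident edges, and off-diagonal blocks equal to $-F_u^T F_v$.

```python
import numpy as np

rng = np.random.default_rng(0)          # illustrative seed
d, V = 2, 3                             # tiny stalk dimension and node count
edges = [(0, 1), (1, 2)]                # hypothetical path graph

# Restriction maps, stored as in the cell above
F = {e: {e[0]: rng.standard_normal((d, d)),
         e[1]: rng.standard_normal((d, d))} for e in edges}

# Coboundary map and sheaf Laplacian
B = np.zeros((d * len(edges), d * V))
for i, (u, v) in enumerate(edges):
    B[i*d:(i+1)*d, u*d:(u+1)*d] = F[(u, v)][u]
    B[i*d:(i+1)*d, v*d:(v+1)*d] = -F[(u, v)][v]
L = B.T @ B

def block(M, a, b):
    """Return the d x d block of M indexed by nodes (a, b)."""
    return M[a*d:(a+1)*d, b*d:(b+1)*d]

# Expected blocks, written directly from the restriction maps
diag_1 = F[(0, 1)][1].T @ F[(0, 1)][1] + F[(1, 2)][1].T @ F[(1, 2)][1]
off_01 = -F[(0, 1)][0].T @ F[(0, 1)][1]

symmetric = np.allclose(L, L.T)
psd = np.linalg.eigvalsh(L).min() > -1e-10
diag_ok = np.allclose(block(L, 1, 1), diag_1)
off_ok = np.allclose(block(L, 0, 1), off_01)
nonadjacent_zero = np.allclose(block(L, 0, 2), 0)   # nodes 0 and 2 share no edge
```

Blocks indexed by non-adjacent nodes vanish, which is why the sparsity pattern of the Laplacian encodes the graph topology.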

### Generating a smooth signals dataset

*(from Hansen J., "Learning sheaf Laplacians from smooth signals")* 

To build a dataset of smooth signals, we first sample random Gaussian vectors on the nodes of the graph. Then we smooth them by filtering their expansion in the eigenvectors of the sheaf Laplacian $L_0$ (`L_f` in the code).

Let's first define a dataset of random Gaussian vectors.

In [3]:
N = 100
X = np.random.randn(V*d,N)

Now we'll use the Fourier domain induced by the Laplacian spectrum.

We'll consider a Tikhonov-inspired procedure. We first project the dataset onto the eigenvectors of the sheaf Laplacian: writing $U$ for the matrix whose columns are these eigenvectors, we have
\begin{equation}
    \hat{x} = U^T x
\end{equation}

Then, defining $h(\lambda) = \frac{1}{1 + 10\lambda}$ and $H(\Lambda) = \mathrm{diag}\{h(\lambda)\}_{\lambda}$, we now have

\begin{equation}
    \hat{y} = H(\Lambda) \hat{x}
\end{equation}

Finally, the dataset is projected back into the vertex domain:

\begin{equation}
    y = U H(\Lambda) \hat{x} = U H(\Lambda) U^T x
\end{equation}
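
The name "Tikhonov-inspired" can be made concrete with a short derivation. Assuming the regularization weight $10$ that appears in $h$, and the eigendecomposition $L_0 = U \Lambda U^T$ with $U$ orthogonal, the filtered signal is the minimizer of a Tikhonov-regularized problem:

\begin{equation}
    y = \arg\min_{z} \; \|z - x\|_2^2 + 10\, z^T L_0 z
\end{equation}

Setting the gradient $2(z - x) + 20 L_0 z$ to zero gives $(I + 10 L_0)\, y = x$, hence

\begin{equation}
    y = (I + 10 L_0)^{-1} x = U (I + 10 \Lambda)^{-1} U^T x = U H(\Lambda) U^T x
\end{equation}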

In [11]:
# L_f is symmetric, so eigh returns real eigenvalues and orthonormal eigenvectors
Lambda, U = np.linalg.eigh(L_f)
H = 1/(1 + 10*Lambda)

In [12]:
Y = U @ np.diag(H) @ U.T @ X

# Additive Gaussian noise; note that 10e-2 is a standard deviation of 0.1
Y += np.random.normal(0, 10e-2, size=Y.shape)

In [13]:
np.trace(X.T @ L_f @ X)

628073.0816488669

In [14]:
np.trace(Y.T @ L_f @ Y)

6415.094246945023
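
The two traces above are the total variations of the raw and of the smoothed signals. As a minimal sketch on a hypothetical 3-node sheaf (sizes, seed and variable names are illustrative assumptions), the total variation $\mathrm{tr}(X^T L X)$ equals $\|BX\|_F^2$, splits into one term per edge, and cannot increase under the noiseless filter $(I + 10L)^{-1}$:

```python
import numpy as np

rng = np.random.default_rng(1)          # illustrative seed
d, V, N = 2, 3, 5
edges = [(0, 1), (1, 2), (0, 2)]        # hypothetical triangle graph

# Coboundary map with random restriction maps
B = np.zeros((d * len(edges), d * V))
for i, (u, v) in enumerate(edges):
    B[i*d:(i+1)*d, u*d:(u+1)*d] = rng.standard_normal((d, d))
    B[i*d:(i+1)*d, v*d:(v+1)*d] = -rng.standard_normal((d, d))
L = B.T @ B

X = rng.standard_normal((d * V, N))

# Total variation, computed in two equivalent ways
tv_trace = np.trace(X.T @ L @ X)
tv_norm = np.linalg.norm(B @ X) ** 2    # squared Frobenius norm

# Contribution of each edge: squared norm of its row block applied to X
per_edge = [np.linalg.norm(B[i*d:(i+1)*d] @ X) ** 2 for i in range(len(edges))]

# Noiseless Tikhonov-style smoothing: solve (I + 10 L) Y = X
Y = np.linalg.solve(np.eye(d * V) + 10 * L, X)
tv_smooth = np.trace(Y.T @ L @ Y)
```

The per-edge terms are what make it possible to rank candidate edges by the energy they contribute.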

_________________________

### A first test

Let's try our centralized procedure on the toy topology.

In [15]:
from controller import learning
from itertools import combinations

edge_blocks = learning(7, d, Y, 50, 0.5, 10, 30)

100%|██████████| 30/30 [02:19<00:00,  4.66s/it]


Let's now compute the energy associated with each candidate edge:

In [16]:
all_edges = list(combinations(range(V), 2))

energies = {e: 0 for e in all_edges}

for e in all_edges:
    BB = edge_blocks[e]
    u, v = e

    # Alternative (disabled): restrict the signals to the endpoints of e first
    # X_ = np.zeros_like(X)
    # X_[u*d:(u+1)*d,:] = X[u*d:(u+1)*d,:]
    # X_[v*d:(v+1)*d,:] = X[v*d:(v+1)*d,:]
    # energies[e] = np.linalg.norm(BB @ X_)

    energies[e] = np.linalg.norm(BB @ X)

We'll keep the $E_0$ edges with the lowest energy, where $E_0$ is the number of edges of the true graph (`E` in the code):

In [17]:
retrieved = sorted(energies.items(), key=lambda x:x[1])[:E]

Let's reconstruct the sheaf Laplacian:

In [18]:
L_ = np.zeros((d*V, d*V))

# Sum the learned blocks of the retained edges
for edge, _ in retrieved:
    L_ += edge_blocks[edge]
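
The loop above sums one $dV \times dV$ matrix per retained edge. The following is a minimal sketch of why such a sum yields a sheaf Laplacian, under the assumption (not confirmed by the `controller.learning` code, which is not shown here) that each block is the edge's own contribution $B_e^T B_e$. The helper `edge_coboundary` and the dictionary `blocks` are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(2)          # illustrative seed
d, V = 2, 4
edges = [(0, 1), (1, 2), (2, 3)]        # hypothetical path graph

def edge_coboundary(u, v, F_u, F_v):
    """Row block of the coboundary map for edge (u, v), padded to d*V columns."""
    B_e = np.zeros((d, d * V))
    B_e[:, u*d:(u+1)*d] = F_u
    B_e[:, v*d:(v+1)*d] = -F_v
    return B_e

rows = [
    edge_coboundary(u, v, rng.standard_normal((d, d)), rng.standard_normal((d, d)))
    for u, v in edges
]

# Hypothetical stand-in for edge_blocks: one dV x dV Laplacian term per edge
blocks = {e: B_e.T @ B_e for e, B_e in zip(edges, rows)}

L_sum = sum(blocks.values())            # edge-by-edge reconstruction
B_full = np.vstack(rows)                # stacked coboundary map
L_full = B_full.T @ B_full              # Laplacian built in one shot
```

Under this assumption, dropping an edge from the sum removes exactly its contribution, so the quality of the reconstruction depends on which edges are retained.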

In [19]:
# The metric chosen by Hansen for the evaluation is the average entry-wise Euclidean distance,
# computed here as the Frobenius norm of the difference divided by the number of entries

np.sqrt(np.sum((L_f - L_)**2)) / L_f.size

0.03817704643882571
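
For reuse, the expression above can be wrapped in a small helper. This is a sketch only: the name `avg_entrywise_distance` is a hypothetical choice, and it simply reproduces the formula used in the cell.

```python
import numpy as np

def avg_entrywise_distance(A, B):
    """Frobenius norm of A - B divided by the number of entries of A."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    return np.linalg.norm(A - B) / A.size

# Tiny worked case: four entries, each off by 1, so sqrt(4) / 4
A = np.zeros((2, 2))
B = np.ones((2, 2))
dist = avg_entrywise_distance(A, B)
print(dist)  # 0.5
```

Because the norm is divided by the number of entries rather than by its square root, the value shrinks as the matrix grows, so it is most meaningful when comparing runs with the same `d` and `V`.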

Let's measure the precision of our procedure, i.e. the fraction of retrieved edges that belong to the true graph:

In [20]:
len({e for e, _ in retrieved} & set(edges)) / E

0.5

In [21]:
edges

[(0, 1), (0, 3), (0, 6), (1, 2), (1, 5), (2, 4), (4, 6), (5, 6)]

In [22]:
retrieved

[((1, 2), 1.613458856152648),
 ((2, 5), 1.647640810999168),
 ((2, 6), 1.6601743618745313),
 ((1, 6), 1.660928897593217),
 ((1, 5), 1.668288869573076),
 ((5, 6), 1.6983577918296178),
 ((4, 6), 1.73089602101169),
 ((0, 2), 1.736452999389246)]
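
Comparing the two lists printed above by hand is error-prone. A small sketch using plain Python sets (the variable names are illustrative, the edge lists are copied from the outputs above) makes the hits and misses explicit:

```python
# Edge lists copied from the notebook outputs
true_edges = {(0, 1), (0, 3), (0, 6), (1, 2), (1, 5), (2, 4), (4, 6), (5, 6)}
retrieved_edges = [(1, 2), (2, 5), (2, 6), (1, 6), (1, 5), (5, 6), (4, 6), (0, 2)]

hits = sorted(set(retrieved_edges) & true_edges)      # correctly retrieved edges
misses = sorted(set(retrieved_edges) - true_edges)    # retrieved but not in the graph
precision = len(hits) / len(retrieved_edges)

print(hits)       # [(1, 2), (1, 5), (4, 6), (5, 6)]
print(misses)     # [(0, 2), (1, 6), (2, 5), (2, 6)]
print(precision)  # 0.5
```

Since the number of retrieved edges equals the number of true edges, recall coincides with precision here.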