# Handle a large design matrix A in network integration

This notebook proposes a solution based mainly on [`scipy.sparse.linalg.lsmr`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lsmr.html).

In [None]:
import dask.array as da
import matplotlib.pyplot as plt
import numpy as np
import scipy.sparse
import scipy.sparse.linalg
import xarray as xr

from depsi.network import form_network

## Setup Experiments

Here we simulate a network STM and form a network from it.

To generate sensible arc ambiguities, we also simulate the ambiguities of the network STM and derive the arc ambiguities from them.

In [None]:
N_points = 3000
N_time = 50

In [None]:
# Generate an STM with points and arcs
# Simulate ambiguities directly
# Phase and h2ph are simulated to form arcs but not used
stm = xr.Dataset(
    coords={
        "space": (["space"], np.arange(N_points)),
        "time": (["time"], np.arange(N_time)),
        "x": (["space"], np.random.uniform(0, 100, N_points)),
        "y": (["space"], np.random.uniform(0, 100, N_points)),
    },
    data_vars={
        "phase": (["space", "time"], da.random.uniform(0, 1, (N_points, N_time))),
        "h2ph": (["space", "time"], da.random.uniform(0, 1, (N_points, N_time))),
        "ambiguity": (["space", "time"], np.random.choice([-1, 0, 1], (N_points, N_time), p=[0.02, 0.96, 0.02])),
    },
)

stm

In [None]:
# Form the network from the STM
# This takes ~30s for 3000 points
arcs = form_network(
    stm,
    key_phase = "phase",
    key_h2ph = "h2ph",
    key_Btemp = "time",
    key_xlabel = "x",
    key_ylabel = "y",
    max_length = 3.0
)

arcs

In [None]:
# Arcs start- and end-points
N_arcs = arcs.sizes["space"]
xindex = np.arange(N_arcs)
yidx_start = arcs["source"].values
yidx_end = arcs["target"].values

# Create y as the ambiguity difference between the end point and the start point of each arc
# Since y is created directly from the STM, it is supposed to be error free
y = stm["ambiguity"].isel(space=yidx_end).values - stm["ambiguity"].isel(space=yidx_start).values
y_noisy = y + np.random.choice([-1, 0, 1], y.shape, p=[0.002, 0.996, 0.002])  # Add a small amount of noise to y

In [None]:
# Visualize the ambiguities of the first 50 arcs
fig, ax = plt.subplots(figsize=(10, 5))
ax.imshow(y[:50])
ax.set_aspect("auto")

In [None]:
# Create the design matrix A as a sparse array: -1 at the start point and +1 at the end point of each arc
end = scipy.sparse.coo_array((np.ones_like(xindex), (xindex, yidx_end)), shape=(N_arcs, N_points))
start = scipy.sparse.coo_array((-np.ones_like(xindex), (xindex, yidx_start)), shape=(N_arcs, N_points))
A_sparse = start + end
A_sparse

In [None]:
# # Only when number of arcs is small
# # Verify the sparse matrix is equivalent to its dense counterpart
# A = np.zeros((N_arcs, N_points), dtype=int)
# A[xindex, yidx_end] = 1
# A[xindex, yidx_start] = -1

# assert np.allclose(A_sparse.todense(), A), "Problem in setting up sparse matrix"
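The construction of the sparse design matrix can be checked on a network small enough to build densely. The sketch below is a minimal illustration under assumptions: the 4-point, 5-arc network and the names `source`, `target`, `A_toy` and `A_dense` are hypothetical and not part of the notebook data.

```python
import numpy as np
import scipy.sparse

# Hypothetical toy network: 4 points, 5 arcs (source -> target)
source = np.array([0, 0, 1, 2, 1])
target = np.array([1, 2, 2, 3, 3])
n_arcs, n_points = source.size, 4
rows = np.arange(n_arcs)

# Build A in one call: each row holds -1 at the source and +1 at the target
data = np.concatenate([-np.ones(n_arcs), np.ones(n_arcs)])
A_toy = scipy.sparse.coo_array(
    (data, (np.concatenate([rows, rows]), np.concatenate([source, target]))),
    shape=(n_arcs, n_points),
).tocsr()

# Dense reference, only feasible because the toy network is tiny
A_dense = np.zeros((n_arcs, n_points))
A_dense[rows, target] = 1
A_dense[rows, source] = -1

same = np.allclose(A_toy.toarray(), A_dense)
row_sums = np.asarray(A_toy.sum(axis=1)).ravel()  # every row sums to zero
print(same, row_sums)
```

Each arc contributes exactly two non-zero entries, so the number of stored values grows with the number of arcs rather than with the number of arcs times the number of points.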

In [None]:
# Design a vectorized function to solve each column of y separately
# Essentially, this loops over time, which is acceptable because the time dimension is not large
@np.vectorize(signature="(i)->(j)")
def lsmr(y, tol=1.e-10):
    x, *_ = scipy.sparse.linalg.lsmr(A_sparse, y, atol=tol, btol=tol)
    return x

In [None]:
%%time
x = lsmr(y.T).T  # the double transpose should be fixed

In [None]:
%%time
x_noisy = lsmr(y_noisy.T).T  # the double transpose should be fixed
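One possible way to avoid the double transpose is an explicit loop over the time columns. This is only a sketch under assumptions: `lsmr_columns` is a hypothetical helper, and the toy network and random ambiguities below stand in for the notebook data.

```python
import numpy as np
import scipy.sparse
import scipy.sparse.linalg

# Hypothetical toy system: 4 points, 5 arcs, 3 epochs
source = np.array([0, 0, 1, 2, 1])
target = np.array([1, 2, 2, 3, 3])
rows = np.arange(5)
A_toy = scipy.sparse.coo_array(
    (np.r_[-np.ones(5), np.ones(5)], (np.r_[rows, rows], np.r_[source, target])),
    shape=(5, 4),
).tocsr()

rng = np.random.default_rng(0)
x_true = rng.integers(-1, 2, size=(4, 3)).astype(float)
y_toy = A_toy @ x_true  # shape (n_arcs, n_time), error free


def lsmr_columns(A, y, tol=1e-10):
    """Solve A x = y[:, k] for every column k and stack the solutions."""
    cols = [
        scipy.sparse.linalg.lsmr(A, y[:, k], atol=tol, btol=tol)[0]
        for k in range(y.shape[1])
    ]
    return np.stack(cols, axis=1)  # shape (n_points, n_time)


x_toy = lsmr_columns(A_toy, y_toy)
residual = A_toy @ x_toy - y_toy
print(x_toy.shape, np.max(np.abs(residual)))
```

The loop does the same work as the vectorized wrapper, since `np.vectorize` also calls the solver once per column, but it keeps `y` in its natural `(arcs, time)` layout.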

In [None]:
# All elements in e are supposed to be close to zero, since y is error free
e = A_sparse @ x - y
print(np.max(np.abs(e)))
np.allclose(e, 0.0, atol=1.e-6)
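A residual close to zero does not mean that `x` reproduces the simulated point ambiguities exactly. Every row of A sums to zero, so adding a constant to all points leaves `A @ x` unchanged, and LSMR returns the minimum-norm member of that family. The sketch below illustrates this on a hypothetical 4-point network; all names and values in it are assumptions made for the example.

```python
import numpy as np
import scipy.sparse.linalg

# Hypothetical toy network: 4 points, 5 arcs (connected)
source = np.array([0, 0, 1, 2, 1])
target = np.array([1, 2, 2, 3, 3])
A_toy = np.zeros((5, 4))
A_toy[np.arange(5), target] = 1
A_toy[np.arange(5), source] = -1

x_true = np.array([1.0, 0.0, -1.0, 1.0])
y_toy = A_toy @ x_true  # error-free arc differences

x_est = scipy.sparse.linalg.lsmr(A_toy, y_toy, atol=1e-10, btol=1e-10)[0]

residual = A_toy @ x_est - y_toy  # close to zero
offset = x_true - x_est           # the same constant for every point
print(np.max(np.abs(residual)), offset)
```

To compare `x` with the simulated ambiguities, one option is to fix the offset first, for example by subtracting the value at a chosen reference point from both.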

In [None]:
# The elements in e_noisy are not expected to be zero, since y_noisy contains errors
e_noisy = A_sparse @ x_noisy - y_noisy
print(np.max(np.abs(e_noisy)))
print(e_noisy)

## When y is not equally weighted
Functional model:
```math
y = Ax
```

Stochastic model:
```math
D\{y\} = Q_{yy} 
```
where the diagonal elements of $Q_{yy}$ are variances and the off-diagonal elements are covariances.

Given the Cholesky decomposition of $Q_{yy}$:
```math
Q_{yy} = L L^T
```

To use the `scipy.sparse.linalg.lsmr` function with $Q_{yy}$, we can transform the functional model to:
```math
L^{-1} y = L^{-1} A x
```

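Solving the above linear system in the least-squares sense is equivalent to solving the original functional model weighted by the covariance matrix $Q_{yy}$, because the transformed observations $L^{-1} y$ have an identity covariance matrix.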
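The equivalence can be checked numerically on a small problem. The sketch below is an illustration under assumptions: the toy network, the random covariance matrix and all variable names are hypothetical, and the reference solution uses the pseudo-inverse of the weighted normal equations because A is rank deficient.

```python
import numpy as np
import scipy.linalg
import scipy.sparse.linalg

rng = np.random.default_rng(1)

# Hypothetical toy network: 4 points, 5 arcs
source = np.array([0, 0, 1, 2, 1])
target = np.array([1, 2, 2, 3, 3])
A_dense = np.zeros((5, 4))
A_dense[np.arange(5), target] = 1
A_dense[np.arange(5), source] = -1

# Hypothetical full covariance matrix, symmetric positive definite by construction
B = rng.normal(size=(5, 5))
Qyy = B @ B.T + 5 * np.eye(5)
y_toy = rng.normal(size=5)

# Whitening: apply L^{-1} through triangular solves instead of forming the inverse
L = np.linalg.cholesky(Qyy)
A_white = scipy.linalg.solve_triangular(L, A_dense, lower=True)
y_white = scipy.linalg.solve_triangular(L, y_toy, lower=True)
x_lsmr = scipy.sparse.linalg.lsmr(A_white, y_white, atol=1e-10, btol=1e-10)[0]

# Reference: minimum-norm solution of the weighted normal equations
Qinv = np.linalg.inv(Qyy)
x_ref = np.linalg.pinv(A_dense.T @ Qinv @ A_dense) @ A_dense.T @ Qinv @ y_toy

print(np.max(np.abs(x_lsmr - x_ref)))
```

Using triangular solves avoids computing $L^{-1}$ explicitly, which matters once $Q_{yy}$ is no longer diagonal.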

### General computation of $L^{-1}$

Denote the Cholesky decomposition operation by $F_{chole}$:
```math
F_{chole} (Q_{yy}) = L
```

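Since:
```math
Q_{yy}^{-1} = (L L^T)^{-1} = (L^{-1})^{T} L^{-1}
```

$L^{-1}$ is one matrix $W$ that satisfies $W^T W = Q_{yy}^{-1}$, and any such $W$ gives the same weighted least-squares solution when it replaces $L^{-1}$ in the model above. Another one follows directly from the Cholesky decomposition of $Q_{yy}^{-1}$:
```math
Q_{yy}^{-1} = M M^T, \quad M = F_{chole} (Q_{yy}^{-1})
```

Then for any $Q_{yy}^{-1}$ we can get a whitening matrix as:
```math
W = (F_{chole} (Q_{yy}^{-1}))^{T}
```

Note that $(L^{-1})^T$ is upper triangular while a Cholesky factor is lower triangular, so $W$ generally differs from $L^{-1}$ even though both whiten $y$.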
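A small numerical check of the two routes to a whitening matrix, using plain NumPy. The 4x4 covariance matrix is a hypothetical example, and `W` is a name introduced here for the transposed Cholesky factor of $Q_{yy}^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 4x4 covariance matrix, symmetric positive definite by construction
B = rng.normal(size=(4, 4))
Qyy = B @ B.T + 4 * np.eye(4)
Qyy_inv = np.linalg.inv(Qyy)

# Route 1: factor Qyy, then invert the triangular factor
L = np.linalg.cholesky(Qyy)        # Qyy = L @ L.T
L_inv = np.linalg.inv(L)

# Route 2: factor Qyy^{-1} directly and transpose the factor
W = np.linalg.cholesky(Qyy_inv).T  # Qyy_inv = W.T @ W

ok_route1 = np.allclose(L_inv.T @ L_inv, Qyy_inv)
ok_route2 = np.allclose(W.T @ W, Qyy_inv)
print(ok_route1, ok_route2)
```

Both matrices reproduce $Q_{yy}^{-1}$ when multiplied by their own transpose from the left, so either can be applied to both $y$ and $A$ before calling `lsmr`.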

### In case observations in $y$ are independent

In many cases the observations in $y$ are independent, so $Q_{yy}$ is diagonal. Therefore we can easily compute $L^{-1}$ as:
```math
L^{-1} = diag(Q_{yy}^{-1})^{1/2}
```

In other words, $L^{-1}$ is a diagonal matrix whose elements are the square roots of the inverses of the diagonal elements of $Q_{yy}$.

### When $y$ is independent

In [None]:
# For now, we consider an uncorrelated y
# This means Qyy is a diagonal matrix
Qyy_diag = np.random.uniform(0.8, 0.99, N_arcs) # diagonal components
L_inv = scipy.sparse.diags(1 / np.sqrt(Qyy_diag), 0, shape=(N_arcs, N_arcs))
L_inv

In [None]:
# Whiten the design matrix once and reuse it for every epoch
A_white = L_inv @ A_sparse

@np.vectorize(signature="(i)->(j)")
def lsmr(y):
    x, *_ = scipy.sparse.linalg.lsmr(A_white, y)
    return x

In [None]:
%%time
x = lsmr((L_inv @ y).T).T  # the double transpose should be fixed

### When $y$ is not independent (Doesn't work yet)

Below is a quick example with dask. However, it turned out to be hard to create a case where $Q_{yy}^{-1}$ is Cholesky decomposable.

In [None]:
# Qyy_diag = da.random.uniform(0.8, 0.99, N_arcs) # diagonal components
# Qyy = da.diag(Qyy_diag)
# Qyy_inv_off_diag = da.random.uniform(0.01,0.1, (N_arcs, N_arcs))  # off-diagonal components
# Qyy_inv = da.linalg.inv(Qyy) + Qyy_inv_off_diag
# L_inv = da.linalg.cholesky(Qyy_inv)
# L_inv

In [None]:
# @np.vectorize(signature="(i)->(j)")
# def lsmr(y):
#     x, *_ = scipy.sparse.linalg.lsmr(L_inv@A_sparse, y)
#     return x

In [None]:
# %%time
# x = lsmr((L_inv @ y).T).T  # the double transpose should be fixed
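Adding unstructured random off-diagonal values, as in the commented-out cell above, guarantees neither symmetry nor positive definiteness. One possible way to get a decomposable test matrix is to add a term of the form `B @ B.T` to the diagonal variances. The sketch below uses NumPy instead of dask and a hypothetical size of 6 observations; it is an assumption about how such a test case could be built, not a tested replacement for the dask cell.

```python
import numpy as np
import scipy.linalg

rng = np.random.default_rng(3)
n_obs = 6  # hypothetical small number of arcs

# Diagonal part, mirroring the variances used for the independent case
Qyy_diag = rng.uniform(0.8, 0.99, n_obs)

# Correlated part: B @ B.T is symmetric positive semi-definite for any B
B = rng.uniform(0.01, 0.1, (n_obs, 2))
Qyy = np.diag(Qyy_diag) + B @ B.T

L = np.linalg.cholesky(Qyy)  # succeeds because Qyy is positive definite

# Whiten an observation vector with a triangular solve, i.e. compute L^{-1} y
y_toy = rng.normal(size=n_obs)
y_white = scipy.linalg.solve_triangular(L, y_toy, lower=True)
print(np.linalg.eigvalsh(Qyy).min())
```

Factoring $Q_{yy}$ itself and applying triangular solves also sidesteps the explicit inverse that the dask version computes.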