# Pseudo-code discussion towards v2

Authors: **Gorka Zamora-López** and **Matthieu Gilson**

---------------------

The goal of this notebook is to try out, propose and discuss what typical workflows should look like in the object-oriented version of *SiReNetA* (v2). The notebook is based on the workflow described in the first tutorial [notebook](https://github.com/mb-BCA/SiReNetA_Tutorials/blob/master/Tutorial_Notebooks/1_GettingStarted.ipynb).

See also further discussions in the [TODO_Future.md](https://github.com/mb-BCA/SiReNetA/blob/v1_dev/TODO_Future.md) file.

----------------------

#### Philosophy for user interfacing

We want the library to prevent some obvious user errors and to keep crucial parameters from being altered accidentally. BUT, in general, we want the workflows to require users to know what they are doing at every step. In some situations it would be easier to have the code do some calculations internally or set some parameters for the user. That could sometimes be safer and lead to shorter workflows, but it is also black-box-like, and we don't want that. The user has to be aware of the steps needed to calculate something and request them explicitly, one by one. So, the code will only calculate what the user asks for, at every step.

----------------------

In [1]:
# Python standard & third-party library imports
from timeit import default_timer as timer

import matplotlib.pyplot as plt
import numpy as np

import sireneta as sna
print( 'SiReNetA:', sna.__version__ )

SiReNetA: 1.0.0.dev1


In [1]:
# Define plotting options to control visualization
%matplotlib inline

# Load the options from a local file
from plot_specs import *

We consider the network determined by the following binary matrix.

In [None]:
# Load the sample graph to study
net = np.loadtxt('../Data/Testnet_N8.txt', dtype=int)
# Number of nodes
N = len(net)

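Convergence of the leaky-cascade model depends on the leakage time-constant: the model converges only when $\tau < 1 \,/\, \lambda_{max}$, where $\lambda_{max}$ is the largest eigenvalue of the connectivity matrix *A*. We therefore find $\lambda_{max}$ and the critical value $\tau_{max} = 1 \,/\, \lambda_{max}$.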

In [None]:
# Find the largest eigenvalue of the connectivity matrix A
evs = np.linalg.eigvals(net)
evmax = evs.real.max()
# Calculate the largest possible tau
taumax = 1.0 / evmax

print( f'Spectral radius:\t{evmax:2.5f}' )
print( f'Largest possible tau:\t{taumax:2.5f}' )
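As a quick sanity check of this bound, the sketch below assumes that the leaky-cascade dynamics follow $\dot{x} = -x/\tau + A x$ (an assumption about the model, which this notebook does not spell out). It uses a small hypothetical 3-node graph rather than the sample network. The helper `is_stable()` is illustrative only and not part of *SiReNetA*.

```python
import numpy as np

# Hypothetical 3-node undirected path graph (NOT the Testnet_N8 sample)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

# Largest eigenvalue and critical time constant, as in the cell above
lam_max = np.linalg.eigvals(A).real.max()
tau_max = 1.0 / lam_max

def is_stable(A, tau):
    """True if all eigenvalues of the assumed Jacobian J = -I/tau + A
    have a negative real part, i.e. if the responses decay."""
    jac = -np.eye(len(A)) / tau + A
    return bool(np.linalg.eigvals(jac).real.max() < 0)

stable_below = is_stable(A, 0.8 * tau_max)   # tau below the critical value
stable_above = is_stable(A, 1.2 * tau_max)   # tau above the critical value
print(stable_below, stable_above)
```

Under this assumption, any choice such as `0.8 * taumax` stays on the convergent side, which is why the workflows below set `tau` as a fraction of `taumax`.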

<br>

## 1. Initialize the main container object (instance)

Object `Rmats` must be initialized with three mandatory parameters:

* `con` : The connectivity matrix.
* `orientation` : specify whether `con` is based on graph analysis convention ($A_{ij}=1$ means $i \to j$) or the dynamical systems convention ($A_{ij}=1$ means $j \to i$). Accepted values: 'ij' or 'ji'.
* `model` : identifier for the canonical model. Accepted values are strings, representative of the canonical model.

These three parameters shall be **_immutable_**. They cannot be changed after object creation. `con` will be a connectivity matrix, either loaded from data or generated with some graph library.
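One possible way to enforce this, shown only as a sketch: keep the three values in private attributes, expose them through properties without setters, and store `con` as a read-only copy. The class name `PairWiseRespSketch` and its internals are hypothetical and do not describe the eventual v2 implementation.

```python
import numpy as np

class PairWiseRespSketch:
    """Hypothetical stand-in for the container class (illustration only)."""

    _ORIENTATIONS = ('ij', 'ji')

    def __init__(self, con, orientation, model):
        if orientation not in self._ORIENTATIONS:
            raise ValueError(f"orientation must be one of {self._ORIENTATIONS}")
        # Keep a private, read-only copy so that neither the caller's array
        # nor in-place writes can alter the stored connectivity.
        con = np.array(con, dtype=float)   # np.array copies by default
        con.setflags(write=False)
        self._con = con
        self._orientation = orientation
        self._model = model

    # Properties without setters: assigning to them raises AttributeError
    @property
    def con(self):
        return self._con

    @property
    def orientation(self):
        return self._orientation

    @property
    def model(self):
        return self._model


rm = PairWiseRespSketch(np.eye(3), 'ij', 'LeakyCascade')

try:
    rm.model = 'RandomWalk'      # attempt to replace the model
    reassigned = True
except AttributeError:
    reassigned = False

try:
    rm.con[0, 0] = 5.0           # attempt to edit the matrix in place
    modified = True
except ValueError:
    modified = False

print(reassigned, modified)
```

Both attempts fail loudly, so a user who wants a different network or model has to create a new instance.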


TODO : Maybe, a better name for the instance creation function (class name) `PairWiseResp()`.

In [None]:
# Rmats must be initialized with a connectivity matrix, its orientation and a canonical model.
# These are immutable and cannot be changed after creation.
# (Further optional parameters may follow the three mandatory ones.)
Rmats = sna.PairWiseResp(con=net, orientation='ij', model='LeakyCascade')

# Is this maybe more correct, as for functions?
Rmats = sna.PairWiseResp(net, 'ij', 'LeakyCascade')


After this, `Rmats` should also *know* whether `con` is directed or undirected and the number of nodes *N*. E.g.:

``` python
>>> Rmats.N
8
>>> Rmats.directed
False

```

These should be evaluated internally, at time of instance generation. 
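A minimal sketch of how both attributes could be derived from `con` at initialization. The helper name `describe_network()` is hypothetical, and treating any matrix that differs from its transpose as directed is an assumption of this sketch.

```python
import numpy as np

def describe_network(con):
    """Return (N, directed) for a square connectivity matrix.

    `directed` is True when `con` differs from its transpose, which also
    covers weighted asymmetric matrices.
    """
    con = np.asarray(con)
    if con.ndim != 2 or con.shape[0] != con.shape[1]:
        raise ValueError("con must be a square 2D array")
    n_nodes = con.shape[0]
    directed = not np.allclose(con, con.T)
    return n_nodes, directed

sym = np.array([[0, 1], [1, 0]])    # undirected pair
asym = np.array([[0, 1], [0, 0]])   # single directed link
print(describe_network(sym))
print(describe_network(asym))
```

The same check could also validate a `directed` flag if users were asked to supply one.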

> **DIRECTED / SYMMETRIC** : Or shall we request the user to specify if the network is directed (weighted asymmetric) as a fourth mandatory argument? Even if we do so, at instance initialization we should verify that info. It is important because later, the node-wise responses will return one array for undirected / weighted-symmetric `con` and two arrays (input and output responses) for directed / weighted-asymmetric `con`.

## 2. Enter (needed) parameters and compute the pair-wise responses

We would do this by first populating the `Rmats` object with all the parameters needed for the "simulation" (feeding them as attributes) and then calculating the pair-wise responses.

1. Feed the different parameters (attributes) to `Rmats`, e.g.:
    - `Rmats.tfinal = 10`
    - `Rmats.dt = 0.01`
    - `Rmats.tau = 0.8 * taumax`
    - ...
2. Calculate the pair-wise responses by calling the method `Rmats.Calc_Resp()`.

I think this is the typical workflow for, e.g. TVB, right? In script or nb, this should look like:

In [None]:
# Define the "simulation" parameters
# Set the temporal resolution
Rmats.tfinal = 10
Rmats.dt = 0.01

# Set the leakage time-constants τ, proportional to taumax
# NOTE: Actually, `taumax` could also be an attribute of Rmats, (for the Leaky-Cascade)
Rmats.tau = 0.8 * taumax

# Define the stimulation amplitude to every node
# Example of stimulus on one node
Rmats.S0 = np.zeros(N)
Rmats.S0[0] = 1.0

# Example of all nodes receive same stimulus
Rmats.S0 = 1.0

# Decide which output version we want for the leaky-cascade model
Rmats.case = 'regressed'
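Since `S0` may be given either as a scalar or as one value per node, the object will need to normalise it at some point. A minimal sketch of such a step follows; the helper `as_stimulus_vector()` is hypothetical, not an existing *SiReNetA* function.

```python
import numpy as np

def as_stimulus_vector(S0, N):
    """Return S0 as a float array of shape (N,).

    Accepts a scalar (same stimulus on every node) or a length-N sequence.
    """
    S0 = np.asarray(S0, dtype=float)
    if S0.ndim == 0:
        return np.full(N, float(S0))
    if S0.shape != (N,):
        raise ValueError(f"S0 must be a scalar or have shape ({N},), got {S0.shape}")
    return S0.copy()

# Same stimulus on all nodes
uniform = as_stimulus_vector(1.0, 4)

# Stimulus on the first node only
single = np.zeros(4)
single[0] = 1.0
first_only = as_stimulus_vector(single, 4)

print(uniform, first_only)
```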

In [None]:
# Finally, calculate the temporal evolution of the pair-wise responses R(t)
Rmats.Calc_Resp() 
# ... this will take time ...


# And, print some feedback
print( Rmats.tfinal, Rmats.dt, Rmats.nsteps )
print( Rmats.tpoints )
print( len(Rmats.tpoints), Rmats.nsteps )


In this way, all the parameters needed for the "simulation" are already stored as attributes of `Rmats` before calling the `Calc_Resp()` method. These are: `S0`, `tfinal`, `dt` and `tau`. Parameter `case` is only available in the 'LeakyCascade' and the 'ContDiffusion' canonical models.

Alternatively, could we have a hybrid possibility such that the parameters could be entered either as attributes before calling the `Calc_Resp()` method, or as arguments to `Calc_Resp()`? Imagine the following situation, in which we define a few parameters beforehand but then override `dt` and add `case` at the method call. For example:

In [None]:
# Define the "simulation" parameters
# Set the temporal resolution
Rmats.tfinal = 10
Rmats.dt = 0.1

# Set the leakage time-constants τ, proportional to taumax
Rmats.tau = 0.8 * taumax

# Define the stimulation amplitude to every node
Rmats.S0 = 1.0

# Finally, calculate the temporal evolution of the pair-wise responses R(t)
Rmats.Calc_Resp(dt=0.01, case='regressed') 

Is this possible, safe and reasonable? Would allowing this make the code more complicated?
If this were an option, the parameters introduced here "by hand" would override any previous value `Rmats` had for these attributes, such that after running `Calc_Resp()` we would obtain:

``` python
>>> Rmats.dt
0.01
>>> Rmats.case
'regressed'
```
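To get a feeling for how much complexity the hybrid option would add, here is a minimal sketch. The class `RespSketch`, the `_SIM_PARAMS` tuple and the decision to require every parameter (including `case`) are assumptions made for illustration, not the planned implementation.

```python
class RespSketch:
    """Hypothetical container illustrating the hybrid option."""

    # Parameters that may be set as attributes or passed at call time
    _SIM_PARAMS = ('tfinal', 'dt', 'tau', 'S0', 'case')

    def __init__(self):
        for name in self._SIM_PARAMS:
            setattr(self, name, None)

    def Calc_Resp(self, **overrides):
        # 1) Reject unknown keywords instead of silently ignoring them
        unknown = set(overrides) - set(self._SIM_PARAMS)
        if unknown:
            raise TypeError(f"unknown parameter(s): {sorted(unknown)}")
        # 2) Call-time values overwrite the stored attributes
        for name, value in overrides.items():
            setattr(self, name, value)
        # 3) Refuse to run while any parameter is still missing
        missing = [n for n in self._SIM_PARAMS if getattr(self, n) is None]
        if missing:
            raise ValueError(f"missing parameter(s): {missing}")
        # ... the actual calculation of R(t) would go here ...


Rm = RespSketch()
Rm.tfinal, Rm.dt, Rm.tau, Rm.S0 = 10, 0.1, 0.5, 1.0
Rm.Calc_Resp(dt=0.01, case='regressed')
print(Rm.dt, Rm.case)
```

In this sketch the override logic is a handful of lines, and the object always reflects the parameters actually used for the last calculation.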

### Output of `Calc_Resp()` function

The main result of the `Calc_Resp()` method is the set of response matrices over time. These are stored in a 3D numpy array `data` of shape (`nsteps`, N, N), with `nsteps` = `tfinal`/`dt` + 1. The "+1" is because the first time step is *t=0*. All subsequent results are obtained from `Rmats.data`, either by slicing the 3D array, by projecting it along different axes, or by estimating metrics out of it.
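A small sketch of how the time grid and the container could be set up. It also shows why the number of steps is better obtained by rounding than by floor division on floats. The variable names mirror the attributes used in this notebook, but the code is only illustrative.

```python
import numpy as np

tfinal, dt, N = 10, 0.01, 8

# Floor division on floats can drop a step, because 0.01 is not exactly
# representable in binary floating point
naive_steps = tfinal // dt            # 999.0 instead of 1000.0

# Rounding the ratio is more robust
nsteps = int(round(tfinal / dt)) + 1  # "+1" for the sample at t = 0
tpoints = np.linspace(0.0, tfinal, nsteps)

# Pre-allocated container for R(t): one N x N matrix per time point
data = np.zeros((nsteps, N, N))
print(naive_steps, nsteps, data.shape)
```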

### Plot some results

For example, visualize the response matrices at selected time points.

In [None]:
# Plot a few response curves. 
# The responses of i = 2, ..., N to the stimulus at j = 1.

plt.figure(figsize=(6.4,3))
for i in range(1,N):
    plt.plot(Rmats.tpoints, Rmats.data[:,i,0], label=f'1 $\\to$ {i+1}')
plt.xlabel( 'Time (a.u.)' )
plt.ylabel( 'Pair-wise Response' )
plt.legend()

plt.tight_layout()

In [None]:
# Visualise the pair-wise response matrices at times t = 0.1, 0.2, 0.5, 1.0, 2.0, 5.0
maxresp = Rmats.data.max()

tidxlist = [10,20,50,100,200,500]
plt.figure(figsize=[12,6])
for i, tidx in enumerate(tidxlist):
    t = Rmats.tpoints[tidx]
    plt.subplot(2,3,i+1)
    plt.title( rf'$\mathcal{{R}}(t)$ matrix at t={t:1.1f}' )
    plt.imshow(Rmats.data[tidx], cmap=new_Reds)
    
    plt.clim(0,maxresp)
    plt.colorbar()
    plt.xticks(np.arange(N), np.arange(N)+1)
    plt.yticks(np.arange(N), np.arange(N)+1)
    plt.xlabel( 'source node' )
    plt.ylabel( 'target node' )

plt.tight_layout()

<br>

## 3. Slicing and projecting `Rmats.data`

Once the main tensor $\mathcal{R}_{ij}(t)$ has been calculated, the rest of the analysis consists of extracting information from it in the form of metrics. We will also extract two types of projections from this tensor, to which some of the same methods apply.

1. The **global response** $R(t)$ is the projection of $\mathcal{R}_{ij}(t)$ along both "spatial" axes, i *and* j. Therefore, the result is a 1D ndarray of length `nsteps`. This should be calculated using method (or function, see code options below) `GlobalResponse()`. The output `Rglob` is another object which inherits `tfinal`, `dt`, `tpoints`, `nsteps` from `Rmats`. And `Rglob.data` will be the 1D ndarray (time-series).

2. The **node-wise responses** $r_i(t)$ are the projection of $\mathcal{R}_{ij}(t)$ along one of the spatial axes (i or j). This should be calculated with a method (or function, see code options below) called `NodeResponses()`. The result, `r_nodes`, is another object which inherits `tfinal`, `dt`, `tpoints`, `nsteps` and `labels` (of the nodes) from `Rmats`. The numerical result `r_nodes.data` is a 2D ndarray of shape (`nsteps`,`N`).

In the next cells, some pseudo-code of how we would call these two:

In [None]:
# Compute the global network response
## `Rglob` is an object which inherits `tpoints`, `dt`, etc.
Rglob = Rmats.GlobalResponse()  # or better ...
Rglob = sna.GlobalResponse(Rmats)   # MG: I WOULD GO FOR THIS OPTION, WITH CHECK ON OBJECT Rmats THAT METHOD MAKES SENSE

# Print some feedback
print( Rglob.data.shape )  # `Rglob.data` is a 1D array (time-series)
print( Rglob.tfinal )
print( Rglob.nsteps )

# Plot the global response curve, R(t)
plt.figure()
plt.plot(Rglob.tpoints, Rglob.data)
plt.xlabel( 'Time (a.u.)' )


In [None]:
# Compute the node-wise responses
## `r_nodes` is an object which inherits the temporal attributes of Rmats + Rmats.labels  
# If network `con` is directed, we get two results (input and output properties)
if Rmats.directed:
    inr_nodes, outr_nodes = Rmats.NodeResponses(selfresp=True)  # or preferred ...
    inr_nodes, outr_nodes = sna.NodeResponses(Rmats, selfresp=True)
# If network `con` is undirected, we only get one result
else:
    r_nodes = Rmats.NodeResponses(selfresp=True)   # or preferred ...
    r_nodes = sna.NodeResponses(Rmats, selfresp=True)

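# Print some feedback (for a directed `con`, inspect the input responses)
rn = inr_nodes if Rmats.directed else r_nodes
print( rn.data.shape )   # (nsteps, N)
print( rn.tfinal )

# Plot all the node responses, one curve per node
plt.figure()
for i in range(N):
    plt.plot(rn.tpoints, rn.data[:,i])
plt.xlabel( 'Time (a.u.)' )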
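Numerically, the two projections of this section reduce to sums over the node axes of the tensor. The sketch below uses a random stand-in for `Rmats.data` and assumes that entry `[t, i, j]` holds the response of node *i* to a stimulus at node *j*, as in the plots above. The function names and the interpretation of `selfresp=False` (ignore the diagonal) are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
nsteps, N = 5, 3
# Stand-in for Rmats.data, shape (nsteps, N, N)
data = rng.random((nsteps, N, N))

def global_response(data):
    """Sum over both node axes -> shape (nsteps,)."""
    return data.sum(axis=(1, 2))

def node_responses(data, selfresp=True):
    """Return (input, output) node responses, each of shape (nsteps, N)."""
    if not selfresp:
        # Assumed meaning of selfresp=False: discard the self-responses R_ii(t)
        data = data.copy()
        idx = np.arange(data.shape[1])
        data[:, idx, idx] = 0.0
    in_resp = data.sum(axis=2)    # sum over sources j, indexed by target i
    out_resp = data.sum(axis=1)   # sum over targets i, indexed by source j
    return in_resp, out_resp

R_glob = global_response(data)
in_r, out_r = node_responses(data)
print(R_glob.shape, in_r.shape, out_r.shape)
```

For an undirected (symmetric) network the two node-wise arrays coincide, which is why a single `r_nodes` result suffices in that case.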


## 4. Extracting further metrics

So far, we have computed three "*time-series*" objects, `Rmats`, `r_nodes` and `Rglob`, of dimensions 3D (time,N,N), 2D (time,N) and 1D (time,) respectively. From here, the idea is to extract more metrics out of them, for example the total responses and the time-to-peak. In the current (v1) version, these are computed by `sna.AreaUnderCurve()` and `sna.Time2Peak()`. At the moment, these two functions accept 3D, 2D or 1D arrays as input and perform the same calculations on them. As a result, the output of both functions is a 2D matrix (N,N), a 1D array (N,) or a scalar number, respectively.

Ideally, we would like the input/output to work in the same way as it does now, such that we can still call for example:

- `totR = sna.AreaUnderCurve(Rmats)` is a 2D matrix, ndarray of shape (N,N),
- `totr_nodes = sna.AreaUnderCurve(r_nodes)` is a 1D ndarray of shape (N,), and
- `totRglob = sna.AreaUnderCurve(Rglob)` is a scalar number.

Results `totR`, `totr_nodes` and `totRglob` are no longer objects. In principle, they don't need to carry metadata. The calculation of `AreaUnderCurve()` and `Time2Peak()` also requires `dt`, which will be read directly from the input objects. In v1 of the library, we need to pass `dt` manually, and that sucks.

So, here is some possible pseudo-code:

In [None]:
# Compute the time-to-peak between all pairs of nodes
ttpR = Rmats.Time2Peak()  # Or preferred ...
ttpR = sna.Time2Peak(Rmats)

# Visualize the ttp matrix
plt.figure()
plt.title( 'Time-to-peak distance' )
plt.imshow(ttpR)
plt.colorbar()


# Compute the total node responses
totinr_nodes = inr_nodes.AreaUnderCurve()  # or preferred ...
totinr_nodes = sna.AreaUnderCurve(inr_nodes)

totoutr_nodes = outr_nodes.AreaUnderCurve()   # or preferred ...
totoutr_nodes = sna.AreaUnderCurve(outr_nodes)

print( totinr_nodes.shape )   # plain 1D array of shape (N,), no longer an object

# Plot the relation between the input / output node responses
maxr = max(totinr_nodes.max(), totoutr_nodes.max())

plt.figure()
plt.scatter(totinr_nodes, totoutr_nodes)
plt.plot((0,maxr), (0,maxr), ls='--', color='gray')  # reference diagonal
plt.xlabel( 'Total IN-response' )
plt.ylabel( 'Total OUT-response' )



In [None]:
# Calculate some further results
totRglob = Rglob.AreaUnderCurve()  # or better ...
totRglob = sna.AreaUnderCurve(Rglob)

# MG: WE CAN CHECK THE dt OF THE OBJECT Rmats (AND FOR E.G. DISCRETE CASCADE SET dt=1)
def AreaUnderCurve(rmat, kind):  # with kind = 'global', 'in', 'out' or 'conn'

    if isinstance(rmat, (sna.DiscreteCascade, sna.ContinuousCascade)):
        raise ValueError("AreaUnderCurve for diverging response doesn't make sense...") # or just warning???
    if isinstance(rmat, sna.RandomWalk):
        raise ValueError("AreaUnderCurve for saturating response doesn't make sense...") # or just warning???
    if isinstance(rmat, sna.LeakyCascade):
        if kind == 'global':
            return rmat.data.sum(axis=(0,1,2))
        elif kind == 'in':
            return rmat.data.sum(axis=(0,2))
        elif kind == 'out':
            return rmat.data.sum(axis=(0,1))
        elif kind == 'conn':
            return rmat.data.sum(axis=0)
        else:
            raise ValueError('unknown kind')

gr = sna.GlobalResponse(Rmats)
plt.plot(gr.tpoints, gr.data)


# OR WE CAN HAVE A METHOD THAT IS INHERITED BUT BLOCKED FOR SOME CLASSES

class RMAT:
    def __init__(self):
        self.data = None

    def AreaUnderCurve(self, kind):  # with kind = 'global', 'in', 'out' or 'conn'
        if kind == 'global':
            return self.data.sum(axis=(0,1,2))
        elif kind == 'in':
            return self.data.sum(axis=(0,2))
        elif kind == 'out':
            return self.data.sum(axis=(0,1))
        elif kind == 'conn':
            return self.data.sum(axis=0)
        else:
            raise ValueError('unknown kind')

class DiscreteCascade(RMAT):

    def AreaUnderCurve(self, *args, **kwargs):
        raise ValueError("doesn't make sense!!!")

#####


ttp_Rglob = Rglob.Time2Peak()      # Already knows `dt`, inherited it 
ttp_Rglob = sna.Time2Peak(Rglob)   # ?

