# Pseudo-code discussion towards v2

Authors: **Gorka Zamora-López** and **Matthieu Gilson**

---------------------

The goal of this notebook is to try out, propose and discuss what typical workflows should look like in the object-oriented version of *SiReNetA* (v2). The notebook is based on the workflow described in the first tutorial [notebook](https://github.com/mb-BCA/SiReNetA_Tutorials/blob/master/Tutorial_Notebooks/1_GettingStarted.ipynb).

See also further discussions in the [TODO_Future.md](https://github.com/mb-BCA/SiReNetA/blob/v1_dev/TODO_Future.md) file.

----------------------

In [None]:
# Python standard & third-party library imports
from timeit import default_timer as timer

import matplotlib.pyplot as plt
import numpy as np

import sireneta as sna
print( 'SiReNetA:', sna.__version__ )

In [None]:
# Define plotting options to control visualization
%matplotlib inline

# Load the options from a local file
from plot_specs import *

We consider the network determined by the following binary matrix.

In [None]:
# Load the sample graph to study
net = np.loadtxt('../Data/Testnet_N8.txt', dtype=int)
# Number of nodes
N = len(net)

The leaky-cascade model converges only when the leakage time-constant satisfies $\tau \leq 1 \,/\, \lambda_{max}$, where $\lambda_{max}$ is the largest eigenvalue of the connectivity matrix *A*. Find $\lambda_{max}$ and the corresponding critical value $\tau_{max} = 1 \,/\, \lambda_{max}$.

In [None]:
# Find the largest eigenvalue of the connectivity matrix A
evs = np.linalg.eigvals(net)
evmax = evs.real.max()
# Calculate the largest possible tau
taumax = 1.0 / evmax

print( f'Spectral radius:\t{evmax:2.5f}' )
print( f'Largest possible tau:\t{taumax:2.5f}' )

<br>

## 1. Initialize the main container object (instance)

Object `Rmats` must be initialized with two mandatory parameters:

* `con` : The connectivity matrix.
* `canmod` : identifier for the canonical model. Probably a string.

These two parameters shall be **_immutable_**: they cannot be changed after the object is created. `con` will be a connectivity matrix, either loaded from data or generated with some graph library.


> WARNING ! `con` is likely to be generated by a graph analysis software, or loaded from data as 

**Gorka**: I don't think we need to initialize `Rmats` with more parameters.

TODO : Probably, find a better name for the creation function `PairWiseResp()`.

In [None]:
# Rmats must be initialized with a connectivity matrix and a canonical model.
# These are immutable and cannot be changed. Ever !
Rmats = sna.PairWiseResp(con=net.T, canmod='LeakyCascade')


After this, `Rmats` should also *know* whether `con` is directed or undirected and the number of nodes *N*. E.g.:

``` python
>>> Rmats.N
8
>>> Rmats.directed
False

```
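A minimal sketch of how such a container could keep `con` and `canmod` immutable while deriving `N` and `directed` from the matrix. The class name `FrozenResponses` is hypothetical and this is not the SiReNetA API. It assumes that "undirected" simply means a symmetric connectivity matrix. It relies only on properties without setters and on the `writeable` flag of NumPy arrays.

```python
import numpy as np


class FrozenResponses:
    """Hypothetical stand-in for the container; not the SiReNetA API."""

    def __init__(self, con, canmod):
        # Store a private, read-only copy so that later changes to the
        # caller's array cannot leak into the object.
        con = np.array(con, dtype=float)
        if con.ndim != 2 or con.shape[0] != con.shape[1]:
            raise ValueError("`con` must be a square 2D array")
        con.setflags(write=False)
        self._con = con
        self._canmod = str(canmod)

    # Properties without a setter: reading works, assignment raises.
    @property
    def con(self):
        return self._con

    @property
    def canmod(self):
        return self._canmod

    @property
    def N(self):
        return self._con.shape[0]

    @property
    def directed(self):
        # Assumption: "undirected" means a symmetric connectivity matrix.
        return not np.array_equal(self._con, self._con.T)


ring = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])  # toy directed 3-cycle
demo = FrozenResponses(con=ring, canmod='LeakyCascade')
print(demo.N, demo.directed)

try:
    demo.canmod = 'other'      # rejected: the property has no setter
except AttributeError as err:
    print('AttributeError:', err)

try:
    demo.con[0, 0] = 5         # rejected: the array is flagged read-only
except ValueError as err:
    print('ValueError:', err)
```

Note that this protects against accidents only: a determined user can still reach `demo._con`, since Python has no truly private attributes.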

## 2. Prepare the conditions and compute the pair-wise responses

For now, I see two ways in which the necessary metadata could be fed into the `Rmats` object before computing the pair-wise responses.

----------------------

### Option 1: Feed `Rmats` with parameters and then call the calculation of $\mathcal{R}_{ij}(t)$.
In this option, the `Rmats` object is first populated with all the parameters needed for the "simulation" (feed the necessary attributes) and then the pair-wise responses are calculated.

1. Feed the different parameters (attributes) to `Rmats`, e.g.:
    - `Rmats.tmax = 10`
    - `Rmats.tau = 0.8 * taumax`
    - ...
2. Calculate the pairwise responses `Rmats.Calc_Resp()`

I think this is quite a traditional way, although not my favourite choice (Gorka) because it is not flexible in case the "simulation" needs to be run again, e.g. because the `tmax` was not long enough and responses didn't converge enough. See *Option 2* and the discussion below.

In [None]:
# Define the "simulation" parameters
# Set the temporal resolution
Rmats.tmax = 10
Rmats.timestep = 0.01

# Set the leakage time-constants τ, proportional to taumax
# NOTE: Actually, `taumax` should also be an attribute of Rmats, right?
Rmats.tau = 0.8 * taumax

# Define the stimulation amplitude to every node
# Example of stimulus on one node
Rmats.S0 = np.zeros(N)
Rmats.S0[0] = 1.0

# Example of all nodes receive same stimulus
Rmats.S0 = 1.0

In [None]:
# Calculate temporal evolution of the pair-wise responses R(t)
Rmats.Calc_Resp(case='regressed') 


# Print some feedback
print( Rmats.tmax, Rmats.timestep, Rmats.nsteps )
print( Rmats.tpoints )
print( len(Rmats.tpoints) )


In this option, all the parameters needed for the "simulation" have been already stored as attributes of `Rmats`, before calling the `Calc_Resp()` function.
These are: `S0`, `tmax`, `timestep` and `tau`. Parameter `case` is rather optional and only available in two of the five canonical models. `case` could be stored before calling the function, or added as an attribute to `Rmats` at function call.


----------------------

### Option 2: Define the parameters externally, and then call the calculation of $\mathcal{R}_{ij}(t)$.
In this option, the "simulation" parameters are defined in a more traditional way (non-OO). The parameters are defined and explicitly entered into the `Calc_Resp()` function. It is only when `Calc_Resp()` is called, that the attributes of `Rmats` are populated or updated.

1. Define the different parameters as ordinary variables, e.g.:
    - `tfinal = 10`
    - `dt = 0.01`
    - `tau = 0.8 * taumax`
    - `stimvec = np.ones(N)`
2. Calculate the pairwise responses `Rmats.Calc_Resp(S0=stimvec, tau=tau, tmax=tfinal, timestep=dt, case='regressed')`.


In [None]:
# Define the "simulation" parameters
# Set the temporal resolution
tfinal = 10
dt = 0.01

# Set the leakage time-constants τ, proportional to taumax
tau = 0.8 * taumax

# Define the stimulation amplitude to every node
stimvec = 1.0


In [None]:
# Calculate temporal evolution of the pair-wise responses R(t)
## Here, all necessary "simulation" parameters (`S0`, `tmax`, `timestep` and `tau`)
## shall be manually entered. `case` is an optional parameter.
Rmats.Calc_Resp(S0=stimvec, tau=tau, tmax=tfinal, timestep=dt, case='regressed')


# Print some feedback
print( Rmats.tmax, Rmats.timestep, Rmats.nsteps )
print( Rmats.tpoints )
print( len(Rmats.tpoints) )

In this option, the crucial parameters needed for the "simulation" (`S0`, `tmax`, `timestep` and `tau`) must be passed to `Calc_Resp()` explicitly. When the function is called, it fills in or updates those attributes of the `Rmats` object.

Q : Actually, is there a way in Python to prevent a user from externally updating the attributes `S0`, `tmax`, `timestep` and `tau`, so that these can **only** be updated from inside the `Calc_Resp()` function ? That would make things pretty safe.
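One possible answer, as a sketch rather than a decision: store the values in underscore-prefixed attributes, expose them through properties without setters, and let `Calc_Resp()` be the only method that writes them. The class name `GuardedResponses` and the placeholder calculation are hypothetical. Python cannot fully forbid access to `_tmax`, so this guards against accidental edits, not deliberate ones.

```python
import numpy as np


class GuardedResponses:
    """Hypothetical sketch; names mirror the discussion, not the real API."""

    def __init__(self):
        self._S0 = None
        self._tau = None
        self._tmax = None
        self._timestep = None
        self.data = None

    # Read-only views of the metadata: no setters are defined.
    @property
    def S0(self):
        return self._S0

    @property
    def tau(self):
        return self._tau

    @property
    def tmax(self):
        return self._tmax

    @property
    def timestep(self):
        return self._timestep

    def Calc_Resp(self, S0, tau, tmax, timestep):
        # The only place where the metadata is written, so it always
        # matches the parameters used for the stored result.
        self._S0, self._tau = S0, tau
        self._tmax, self._timestep = tmax, timestep
        nt = int(round(tmax / timestep)) + 1
        self.data = np.zeros((nt, 1, 1))  # placeholder for the real calculation


sim = GuardedResponses()
sim.Calc_Resp(S0=1.0, tau=0.5, tmax=1.0, timestep=0.1)
print(sim.tmax, sim.timestep, sim.data.shape)

try:
    sim.tmax = 10              # rejected: `tmax` can only be read
except AttributeError as err:
    print('AttributeError:', err)
```

With this layout the stale-metadata scenario cannot arise by accident, because changing `tmax` requires calling `Calc_Resp()` again, which also recomputes `data`.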

----------------------

**Gorka** : Why do I prefer this option? I think it is safer, because the crucial metadata cannot be changed accidentally by the user or by anything else. For example, the attributes `tmax` and `timestep` are not modified directly by the user, but are only updated when calling the `Calc_Resp()` function.

Imagine we compute the responses for tmax = 10 and timestep = 0.1. We plot the result and realize that this temporal scale is too large. We want to recalculate the responses (without having to initialize a new instance) for tmax = 1.0 and timestep = 0.001.
In Option-1 that would mean manually editing the attributes `Rmats.tmax=1.0` and `Rmats.timestep=0.001`. Then, we have to deliberately call `Rmats.Calc_Resp()` again in order to update the results. But what if the user, after changing the `tmax` and `timestep` attributes, forgets to call `Calc_Resp()` ? In that case, the metadata of `Rmats` no longer matches the metadata used to compute the stored responses, which are still saved in an array of shape (10//0.1+1,N,N), instead of the (1.0//0.001+1,N,N) now expected. This situation might not be very common when scripting but … I can easily see things like this happening when working on Jupyter Notebooks.

In Option-2, the attributes are only allowed to change when `Calc_Resp()` is called, and only this function has permission to change them.


### Output of `Calc_Resp()` function

The main result of the `Calc_Resp()` function (method) is the set of response matrices over time. These are stored in a 3D numpy array `data` of shape (tmax//dt+1, N, N). The "+1" is because the first time step is *t=0*. All subsequent results are obtained from `Rmats.data`, either by slicing the 3D array, by projecting it along different axes, or by estimating metrics from it.
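A small caveat on the shape above, with a sketch of one way around it. Floor division on floats can undercount, because step sizes such as 0.1 are not exactly representable in binary (e.g. `10 // 0.1` gives `99.0`). Rounding the ratio is more robust when `tmax` is meant to be a multiple of `dt`. The helper name `n_timepoints` is hypothetical.

```python
import numpy as np


def n_timepoints(tmax, dt):
    """Number of samples from t=0 to t=tmax inclusive (hypothetical helper)."""
    return int(round(tmax / dt)) + 1


tmax, dt = 10, 0.01
nt = n_timepoints(tmax, dt)
tpoints = dt * np.arange(nt)     # 0, dt, 2*dt, ..., tmax
print(nt, tpoints[0], tpoints[-1])

# Compare with plain floor division for a coarser step
print(10 // 0.1 + 1, n_timepoints(10, 0.1))
```

Building `tpoints` from an integer range multiplied by `dt`, as done here, also avoids the accumulated rounding error of repeatedly adding `dt`.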

### Plot some results

For example, visualize the response matrices at selected time points.

In [None]:
# Visualise the pair-wise response matrices at times t = 0.1, 0.2, 0.5, 1.0, 2.0, 5.0
maxresp = Rmats.data.max()

tidxlist = [10,20,50,100,200,500]
plt.figure(figsize=[12,6])
for i, tidx in enumerate(tidxlist):
    t = Rmats.tpoints[tidx]
    plt.subplot(2,3,i+1)
    plt.title( rf'$\mathcal{{R}}(t)$ matrix at t={t:1.1f}' )
    plt.imshow(Rmats.data[tidx], cmap=new_Reds)
    
    plt.clim(0,maxresp)
    plt.colorbar()
    plt.xticks(np.arange(N), np.arange(N)+1)
    plt.yticks(np.arange(N), np.arange(N)+1)
    plt.xlabel( 'source node' )
    plt.ylabel( 'target node' )

plt.tight_layout()

In [None]:
# Plot a few response curves. 
# The responses of i = 2, ..., N to the stimulus at j = 1.

plt.figure(figsize=(6.4,3))
for i in range(1,N):
    plt.plot(Rmats.tpoints, Rmats.data[:,i,0], label=f'1 $\\to$ {i+1}')
plt.xlabel( 'Time (a.u.)' )
plt.ylabel( 'Pair-wise Response' )
plt.legend()

plt.tight_layout()

<br>

## 3. Slicing and projecting `Rmats.data`. Extracting metrics.

I have a major doubt here. Besides the tensor containing $\mathcal{R}_{ij}(t)$, the other two time-series "arrays" are the global response $R(t)$ and the node-wise responses $r_i(t)$. `Rglob` is a 1D ndarray of length `nsteps`, while `rnodes` is a 2D ndarray of shape (`nsteps`,`N`).

The question is whether `Rglob` and `rnodes` should be considered as:
1. two independent objects which inherit the temporal attributes from `Rmats`, and the methods (e.g., `Time2Peak()` and `AreaUnderCurve()`) or 
2. attributes of `Rmats`. Q: Can attributes be objects themselves, that share selected attributes?  

GORKA : Personally, I have a slight preference for the first possibility because that would allow me to keep the workflows closer to what I am used to. But I don't know which would be easier and more consistent to code internally, given that functions like `AreaUnderCurve()` and `Time2Peak()` apply to all the 3D, 2D and 1D time-series tensors.
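To get a feeling for the internal side, here is a rough sketch of a single container whose metrics act along the time axis, so the same code serves 1D, 2D and 3D data. The class name `TimeSeries` is hypothetical, and the rectangle rule used for the area is an assumption, not necessarily what *SiReNetA* does. An instance like this could equally be returned as an independent object or held as an attribute of `Rmats`, since attributes can be objects themselves.

```python
import numpy as np


class TimeSeries:
    """Hypothetical container: `data` with time on axis 0, plus its time grid."""

    def __init__(self, data, tpoints):
        self.data = np.asarray(data, dtype=float)
        self.tpoints = np.asarray(tpoints, dtype=float)
        self.dt = self.tpoints[1] - self.tpoints[0]

    @property
    def shape(self):
        # Forwarding `shape` makes `obj.shape` legal on the container itself.
        return self.data.shape

    @property
    def nsteps(self):
        return self.data.shape[0]

    def AreaUnderCurve(self):
        # Rectangle rule along the time axis (an assumption; other rules work too).
        return self.dt * self.data.sum(axis=0)

    def Time2Peak(self):
        # Time at which each curve reaches its maximum.
        return self.tpoints[self.data.argmax(axis=0)]


tpoints = np.arange(0.0, 5.0, 0.5)
curve = tpoints * np.exp(-tpoints)          # toy response curve
glob = TimeSeries(curve, tpoints)                                        # 1D
nodes = TimeSeries(np.stack([curve, 2 * curve], axis=1), tpoints)        # 2D
pairs = TimeSeries(curve[:, None, None] * np.ones((1, 3, 3)), tpoints)   # 3D

for ts in (glob, nodes, pairs):
    print(ts.shape, np.shape(ts.AreaUnderCurve()), np.shape(ts.Time2Peak()))
```

In each case the time axis is reduced away, so the 3D tensor yields an N x N matrix, the 2D array a vector of length N, and the 1D array a scalar.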

If we consider them **as independent objects**, we would for example do as follows. These function calls (even in the form of methods) create and return a new object instance, not of the same class as `Rmats` but of a similar one.

In [None]:
# Compute the global network response
## `Rglob` is an object which, at time of creation inherits `tpoints` and `dt`
Rglob = Rmats.GlobalResponse()  # or
Rglob = sna.GlobalResponse(Rmats)
print( Rglob.data.shape )    # `data` is the 1D array
print( Rglob.shape )         # Would that be legal ?
print( Rglob.nsteps )

# Plot the global response curve, R(t)
plt.figure()
plt.plot(Rglob.tpoints, Rglob.data)
plt.xlabel( 'Time (a.u.)' )


# Calculate some further results
totRglob = Rglob.AreaUnderCurve()  # Already knows `dt`, inherited it 
totRglob = sna.AreaUnderCurve(Rglob)  # ?

ttp_Rglob = Rglob.Time2Peak()      # Already knows `dt`, inherited it 
ttp_Rglob = sna.Time2Peak(Rglob)   # ?


In [None]:
# Compute the node-wise responses
## `inr_nodes` and `outr_nodes` are objects which, at time of creation, inherit `tpoints` and `dt`
if Rmats.directed:
    inr_nodes, outr_nodes = Rmats.NodeResponses(selfresp=True)
else: 
    r_nodes = Rmats.NodeResponses(selfresp=True)
    # Alias, so that the rest of this cell also runs in the undirected case
    inr_nodes = outr_nodes = r_nodes

# Plot all the node responses
plt.figure()
for i in range(N):
    plt.plot(inr_nodes.tpoints, inr_nodes.data[:,i])
plt.xlabel( 'Time (a.u.)' )


# Compute the total node responses
totinr_nodes = inr_nodes.AreaUnderCurve()
totoutr_nodes = outr_nodes.AreaUnderCurve()

print( totinr_nodes.data.shape )   # `data` is now a 1D array of length N
print( totinr_nodes.shape )        # Would this be legal ?


# Calculate time-to-peak for the node responses
## Ok, here ttp_inr and ttp_outr are 1D arrays of length N.
## As just arrays, they lose any metadata associated to the nodes, e.g., `labels`.
ttp_inr = inr_nodes.Time2Peak()
ttp_outr = outr_nodes.Time2Peak()

print( ttp_inr, ttp_outr )



If we consider them **as attributes of the `Rmats` object**, then we may just need to call the methods, without assignment. The first time they are called, they do the calculation and create the corresponding array, saved within the `Rmats` container. The second time we call these methods, `Rmats` should already know the data is there and simply read it.
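The "compute once, then read" behaviour could be sketched with `functools.cached_property` from the standard library (Python 3.8+). The class name `CachingResponses`, the attribute name `globalresp` and the call counter are hypothetical, and the global response is assumed here to be the sum over all node pairs.

```python
from functools import cached_property

import numpy as np


class CachingResponses:
    """Hypothetical sketch of 'compute once, then read'; not the SiReNetA API."""

    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)   # shape (nsteps, N, N)
        self.ncalls = 0                               # only to show the caching

    @cached_property
    def globalresp(self):
        # Assumption: the global response is the sum over all node pairs.
        self.ncalls += 1
        return self.data.sum(axis=(1, 2))


resp = CachingResponses(np.ones((5, 3, 3)))
first = resp.globalresp     # computed and stored on first access
second = resp.globalresp    # read back, no recomputation
print(first.shape, resp.ncalls, first is second)
```

One design consequence: if `data` is recomputed, the cached result has to be discarded as well (e.g. with `del resp.globalresp`), otherwise it goes stale in the same way as the metadata discussed for Option 1.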

In [None]:
# Compute the global network response
## No new object is created here: the result is stored inside `Rmats`
Rmats.GlobalResponse()

print( Rmats.GlobalResponse().shape )  # `Rmats.GlobalResponse()` now returns a 1D array.
print( Rmats.nsteps )


# Plot the global response curve, R(t)
plt.figure()
plt.plot(Rmats.tpoints, Rmats.GlobalResponse())
plt.xlabel( 'Time (a.u.)' )


# Calculate some further results
## `totGlobalResponse()` is a method of `Rmats` which, internally, 
## calls `AreaUnderCurve()` and applies it to the data saved in `Rmats.GlobalResponse()`
Rmats.totGlobalResponse()   
## `totTime2Peak()` is a method of `Rmats` which, internally, 
## calls `Time2Peak()` and applies it to the data saved in `Rmats.GlobalResponse()`
Rmats.totTime2Peak()


These last calls start to look pretty ugly. It is probably going to be worse for the node responses.

In [None]:
# Compute the node-wise responses
Rmats.NodeResponses(selfresp=True)

## How to deal in this case with the possibility of having both 
## the input and the output node responses? For example ... ?
print( Rmats.NodeResponses()['in'].shape )  # ??
print( Rmats.NodeResponses()['out'].shape )  # ??


# Plot all the node responses
plt.figure()
for i in range(N):
    plt.plot( Rmats.tpoints, Rmats.NodeResponses()['in'][:,i] )  # Jeez, that looks ugly!!
plt.xlabel( 'Time (a.u.)' )


# Compute the total node responses
Rmats.totNodeResponses()

print( Rmats.totNodeResponses()['in'].shape )
print( Rmats.totNodeResponses()['out'].shape )


# Calculate time-to-peak for the node responses
Rmats.nodeTime2Peak()

print( Rmats.nodeTime2Peak()['in'], Rmats.nodeTime2Peak()['out'] )


Ok ... all of this looks pretty ugly and complicated. It seems that generating new independent object instances when e.g., `AreaUnderCurve()` or `Time2Peak()` are called is the better option, even though it might waste some memory because each object has to carry some redundant metadata.