# Network Neuroscience

##### Authors: Mauricio Barahona and Robert Peach

##### Motivation


The field of neuroscience is simultaneously blessed and cursed with a rapid expansion in the size, scope and complexity of neural data drawn from multiple levels of spatial and temporal organisation. Large portions of this data are relational, describing the interconnections between many individual elements of neurobiological systems. Examples include, but are not limited to:

- Protein interaction networks
- Genetic regulatory networks
- Synaptic connections
- Dynamical patterns of neural signalling
- Interactions between brain networks and environment

The data are not only multi-scale but also span different domains of biology and different data types, which poses challenges for analysis. At the intersection of (i) the development of empirical methods for mapping and recording neurobiological data and (ii) the theoretical and computational advances in data analysis and modeling of brain networks, an emerging body of research falls under the umbrella of *network neuroscience*.
We can ask how ideas and tools from network science may change and advance the types of questions that we can ask and the hypotheses that we can test in neuroscience.


##### What to expect?
There is no doubt that many of you in this room are more experienced with network neuroscience than us. That being said, we wanted to touch on a couple of topics for those who aren't!
1. Network analysis of the connectome
2. Topological analysis




In [None]:
import glob
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
import networkx as nx
import pickle

## Functional connectivity analysis

Functional neuroimaging techniques are used widely in cognitive neuroscience to investigate aspects of functional specialization and functional integration in the human brain. To build a network, we must first choose our nodes and then use the data to define the links between them.
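As a minimal sketch of the second step (defining links from data), assume each node is a brain region with a single time series; the links can then be taken as pairwise Pearson correlations. The time series below are synthetic, and the sizes `n_regions` and `n_timepoints` are placeholders rather than properties of the dataset used in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 5 regions (nodes), 200 time points each
n_regions, n_timepoints = 5, 200
timeseries = rng.standard_normal((n_regions, n_timepoints))

# Make region 1 mostly follow region 0 so that one link is clearly strong
timeseries[1] = 0.8 * timeseries[0] + 0.2 * timeseries[1]

# Links: Pearson correlation between every pair of regional time series
connectivity = np.corrcoef(timeseries)

print(connectivity.shape)        # one row and one column per region
print(connectivity[0, 1])        # the strong link we built in by hand
```

The result is a symmetric matrix with ones on the diagonal, which is the same kind of object as the averaged connectivity matrix loaded later on.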

![title](images/network_neuroscience.png)
<p style="text-align:center">Connectome pipeline.</p> 

Here we will perform a short analysis of some functional connectome data. The resting-state fMRI (rs-fMRI) matrices used here (i.e., based on correlation values of time series) were obtained from the UCLA multimodal connectivity database (1000_Functional_Connectomes dataset http://fcon_1000.projects.nitrc.org/fcpClassic/FcpTable.html).




Let's load the connectivity matrix that I have already averaged across the full dataset!

In [None]:
folder = './data/1000_functional_connectomes/'

matrix = np.genfromtxt(folder + "connectome_average.csv",delimiter=',')

The code for running over the full dataset of individual connectomes is available below! Don't run those cells if you don't have the data :)

When working with fMRI brain network data, it is useful to generate some plots (e.g., the heatmaps for matrix visualisation, and distribution plots of edge weights) to facilitate data comprehension and flag potential artefacts.


In [None]:
matrix_diagnan = matrix.copy()
np.fill_diagonal(matrix_diagnan,np.nan) # remove diagonal

# plotting heatmap
plt.figure(figsize=(20,16))
sns.heatmap(matrix_diagnan, cmap='coolwarm', cbar=True, square=False, mask=None)

In brain networks, we expect mostly weak edges and a smaller proportion of strong ones. When the weights are log10-transformed, we expect their probability density to look roughly Gaussian.

In [None]:
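fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Absolute off-diagonal weights (the NaN diagonal is dropped)
weights = np.abs(matrix_diagnan.flatten())
weights = weights[~np.isnan(weights)]

# Distribution of absolute raw weights (histplot replaces the deprecated distplot)
rawdist = sns.histplot(weights, stat='density', ax=axes[0])
rawdist.set(xlabel='Absolute correlation values', ylabel='Density')

# Probability density of log10 of the absolute weights
# (log10 is undefined for negative correlations; zeros are excluded because log10(0) = -inf)
log10dist = sns.histplot(np.log10(weights[weights > 0]), stat='density', ax=axes[1])
log10dist.set(xlabel='log10(weights)')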

Not quite Gaussian on the right... but hey ho.

Let's take the absolute value (turning negative correlations into positive ones) and then sparsify (remove low-correlation edges). Then we can generate a networkx graph object from our correlation matrix.

In [None]:
# take absolute 
matrix = abs(matrix)

# Create sparser graphs for visualisation and easier analysis
matrix_filtered = matrix.copy()
matrix_filtered[matrix_filtered<=0.4] = 0

In [None]:
# Creating a graph object
G = nx.from_numpy_array(matrix_filtered)  # from_numpy_matrix was removed in networkx 3.0

# Removing self-loops
G.remove_edges_from(list(nx.selfloop_edges(G)))
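The cut-off of 0.4 is a choice, and the resulting graph can change a lot with it. The sketch below is only an illustration of that sensitivity: it uses a made-up random symmetric matrix (`toy_matrix`, not the connectome) and counts how many edges survive at a few hypothetical thresholds.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(1)

# Hypothetical symmetric matrix of absolute "correlations" in [0, 1)
n = 30
upper = np.triu(rng.random((n, n)), k=1)
toy_matrix = upper + upper.T  # zero diagonal, so no self-loops are created

densities = {}
for threshold in (0.2, 0.4, 0.6, 0.8):
    filtered = toy_matrix.copy()
    filtered[filtered <= threshold] = 0      # same rule as the thresholding cell
    G_toy = nx.from_numpy_array(filtered)    # zero entries produce no edge
    densities[threshold] = nx.density(G_toy)
    print(threshold, G_toy.number_of_edges(), round(densities[threshold], 2))
```

A higher threshold always gives a sparser graph, so it is worth checking that conclusions do not hinge on one particular value.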

Load additional data and information.

In [None]:
# lets get the xyz coordinates of each region

path_pos = folder + '/HCP_positions.txt'
positions = pd.read_csv(path_pos, header=None, sep=r'\s+')  # whitespace-delimited; delim_whitespace is deprecated in recent pandas

# defining coordinates
pos = {}
for node in G.nodes:
    pos[node] = np.array([positions.loc[node,0],positions.loc[node,1]])
    
# lets create a simple plot
nx.draw(G,pos)


This visualisation isn't very nice, nor is it informative. Let's try something 3D with a nice trace of the brain instead...

The below cell has some functions to help us plot. We don't need to understand this part in detail.

In [None]:
from plotting_brain import plot_brain_network

In [None]:
path_brainobj = folder +  '/brain.obj'
path_pos = folder + '/HCP_positions.txt'

fig, node_size = plot_brain_network(G, path_pos, path_brainobj)

## Pose estimation and topology




Quantifying behavior is crucial for many applications in neuroscience. Videography provides easy methods for observing and recording animal behavior in diverse settings. Extracting particular aspects of a behavior for further analysis isn't easy, but tools such as DeepLabCut (DLC) allow us to track body parts. For example, here we are interested in examining motor disorders.

![title](images/pose_networks.gif)
<p style="text-align:center">Figure: Pose estimation of humans from mediapipe.</p> 

However, given the large data sets for tracking body parts, how do we then analyse the data? Of course, this depends on the experiment at hand, but here we explore the topology and geometry of the movements.

#### Persistent homology

To analyse this data, we will touch on an area closely linked to networks (and their extensions into higher order networks). 
Persistent homology is a method for computing topological features of a space at different spatial resolutions. With it, we can track homology cycles across simplicial complexes (higher order networks), and determine whether there were homology classes that "persisted" for a long time. The basic idea is summarized in the illustration below.

![title](images/persistent_homology.jpeg)
Figure: Topological data analysis. (A) Illustration of simplexes. (B) Representation of simplexes/cliques of different order being formed in the system (e.g., in the brain) across the filtration process. (C) Barcode respective to panel B, representing the filtration across distances (i.e., the inverse of weights in a correlation matrix). Line A represents cycle A in B. H0-2 indicates the homology groups. (H0 = connected components, H1 = one-dimensional holes, H2 = 2-dimensional holes). (D) Circular projection of how the system (e.g., the brain) would be connected. (E) Persistence diagram (or Birth/Death plot) obtained from real resting-state fMRI brain data. In this plot, it is also possible to identify a phase transition between H1 and H2.[1]

[1] Centeno, Eduarda Gervini Zampieri, et al. "A hands-on tutorial on network and topological neuroscience." Brain Structure and Function 227.3 (2022): 741-762.
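To make the filtration idea concrete, here is a minimal sketch restricted to H0 (connected components). As the distance threshold grows, edges appear and components merge; each merge marks the death of one component. The function `h0_persistence` and the toy distance matrix are illustrative only. This is not how giotto-tda computes persistence, and higher dimensions (H1, H2) need a dedicated library.

```python
import itertools
import numpy as np

def h0_persistence(distance_matrix):
    """Return (birth, death) pairs of connected components in a Vietoris-Rips filtration."""
    n = len(distance_matrix)
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the representative of i's component
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges enter the filtration in order of increasing distance
    edges = sorted(
        (distance_matrix[i][j], i, j) for i, j in itertools.combinations(range(n), 2)
    )
    pairs = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i           # two components merge...
            pairs.append((0.0, float(dist)))  # ...so one of them dies at this distance
    pairs.append((0.0, float("inf")))         # the last component never dies
    return pairs

# Toy example: two tight clusters {0, 1} and {2, 3} that are far apart
toy_distances = np.array([
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
])
print(h0_persistence(toy_distances))
# [(0.0, 0.1), (0.0, 0.2), (0.0, 0.8), (0.0, inf)]
```

The two short bars correspond to points joining within each cluster, while the bar that dies at 0.8 reflects the two clusters staying separate for most of the filtration.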



##### Data

Here, we are using some data kindly donated by Alex Grotemeyer (unpublished so please don't share further than this room!). The data is of rats running on a treadmill - see example image below. We have both ventral and lateral views taken in simultaneous videos.



<img src="images/rat_gait.png" alt="drawing" width="400"/>
<p style="text-align:center">Figure: Ventral view of rat on treadmill task. The markers have been predicted using DLC.</p> 

We are going to use ideas from topological data analysis to construct our network and then analyse it with persistent homology!


Let's load the cleaned and processed data that I generated earlier!

In [None]:
with open('./data/gait/gait_data.pickle', 'rb') as handle:
    all_data, labels = pickle.load(handle)

The next set of functions are simply to load and clean the DLC data which are stored in CSV files. You don't need to run these, but I am leaving them here for your future interest/usage.

Now the fun begins! We have all our DLC files with the most interesting limbs tracked over time. We can now compute correlation matrices between all the limbs in both the x and y directions of the video.

In [None]:
import scipy.stats as st
import itertools

data = all_data[0]
markers = list(data.columns.get_level_values(level=0).unique())
marker_pairs = list(itertools.combinations(markers, 2))


graphs = []
for data in all_data:
    correlation_matrix = pd.DataFrame(data=0,columns=markers,index=markers)

    # computing correlation as mean of x and y correlations - there are better ways to do this...
    correlation_matrix_x =  data.loc[:,data.columns.get_level_values(1)=='x'].corr()
    correlation_matrix_y =  data.loc[:,data.columns.get_level_values(1)=='y'].corr()
    correlation_matrix = np.dstack([correlation_matrix_x,correlation_matrix_y]).mean(axis=2)
        
    # converting correlation to a distance measure
    distance_matrix = 1 - abs(correlation_matrix)    
    distance_matrix[np.isnan(distance_matrix)]=1
    
    graphs.append(distance_matrix)
    
    


Now we are going to use a package called giotto-tda to help us compute the persistent homology!

In [None]:
from gtda.homology import VietorisRipsPersistence

# Track connected components, loops, and voids
homology_dimensions = [0, 1, 2]

# Collapse edges to speed up H2 persistence calculation!
persistence = VietorisRipsPersistence(
    metric="precomputed",
    homology_dimensions=homology_dimensions,
    n_jobs=6,
    collapse_edges=True,
)

# fit persistence diagram
persistence_diagrams = persistence.fit_transform(graphs)

In [None]:
from gtda.plotting import plot_diagram

plot_diagram(persistence_diagrams[10])

Although persistence diagrams are useful descriptors of the data, they cannot be used directly for machine learning applications. This is because different persistence diagrams may have different numbers of points, and basic operations like the addition and multiplication of diagrams are not well-defined.

To overcome these limitations, a variety of proposals have been made to “vectorize” persistence diagrams via embeddings or kernels which are well-suited for machine learning. Here, we use the persistence entropy function in giotto-tda, which measures the entropy of points in a persistence diagram.

In [None]:
from gtda.diagrams import PersistenceEntropy

# define persistence entropy object
persistence_entropy = PersistenceEntropy()

# calculate topological feature matrix
X = persistence_entropy.fit_transform(persistence_diagrams)

In [None]:
from gtda.plotting import plot_point_cloud

# lets visualise our rats. Each point is a rat!
plot_point_cloud(X)

Let's now see if there is a statistical difference between the pre-OP and 6-week groups.

In [None]:
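import scipy.stats as st

# split the feature matrix by recording week
x = X[(labels.week=='pre OP')]
y = X[(labels.week=='6 weeks')]

# paired t-test, one test per column of X (H0, H1, H2);
# this assumes x and y contain the same animals in the same order
tvalues, pvalues = st.ttest_rel(x,y)

print(pvalues)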

We notice that there is significance when we consider H2 homology! Can we interpret this? (maybe, with great difficulty).

Persistence entropies (one per homology dimension, i.e. one per column of our feature matrix) are calculated as the (base 2) Shannon entropies of the collections of differences d - b (“lifetimes”), normalized by the sum of all such differences. A larger entropy in H2 means that there are more H2 holes and/or that their lifetimes are more evenly spread, rather than dominated by a few long-lived ones. We observe a larger entropy in the lifetimes of H2 holes in pre-OP rats relative to 6 weeks. Potentially this could be a training effect?
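The definition above can be sketched in a few lines of NumPy. The function `persistence_entropy` and the two toy diagrams below are made up for illustration; giotto-tda's `PersistenceEntropy` may differ in details such as how empty diagrams are handled.

```python
import numpy as np

def persistence_entropy(diagram):
    """Base-2 Shannon entropy of normalised lifetimes for (birth, death) pairs."""
    diagram = np.asarray(diagram, dtype=float)
    lifetimes = diagram[:, 1] - diagram[:, 0]
    lifetimes = lifetimes[lifetimes > 0]   # zero-length bars carry no weight
    p = lifetimes / lifetimes.sum()        # normalise so the weights sum to 1
    return float(-(p * np.log2(p)).sum())

# Toy diagrams (made-up values): four equal lifetimes vs. one dominant lifetime
equal_lifetimes = [(0.0, 0.5), (0.1, 0.6), (0.2, 0.7), (0.3, 0.8)]
one_dominant = [(0.0, 0.97), (0.1, 0.11), (0.2, 0.21), (0.3, 0.31)]

entropy_equal = persistence_entropy(equal_lifetimes)
entropy_dominant = persistence_entropy(one_dominant)
print(entropy_equal, entropy_dominant)
```

With the same number of bars, the entropy is highest when all lifetimes are equal (log2 of the number of bars) and drops when a single bar dominates.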

Thankfully we are at the end of the workshop and I won't need to spend any further time interpreting this horror!