# How-to: core neurons are crossroads of cortical dynamics 

Analysis code to reproduce all panels in figures 1 and 2 of the paper by Guarino, Filipchuk, Destexhe (2022)   
preprint link: https://www.biorxiv.org/content/10.1101/2022.05.24.493230v2

All the code is hosted in a GitHub [repository](https://github.com/dguarino/Guarino-Filipchuk-Destexhe) (with a Zenodo DOI persistent identifier [here](https://zenodo.org)) and can be executed interactively here [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dguarino/Guarino-Filipchuk-Destexhe/HEAD?urlpath=Lab).  
The repository also contains a copy of the required data files from the [MICrONS project phase1](https://www.microns-explorer.org/phase1) (freely available on the project website), to ease the setup on Binder. 

This notebook performs loading and selection of the MICrONS data, structural and dynamical analyses, and plots the results as in the paper panels.

We divided the analysis code into:
- `imports_functions.py` : performs the imports and definition of various helper functions.
- `structural_analysis.py` : creates a graph from the connectivity matrix and computes several graph measures (using [igraph](https://igraph.org)).
- `dynamical_analysis.py` : performs the same population event analysis as in [Filipchuk et al. 2022](https://www.biorxiv.org/content/10.1101/2021.08.31.458322v2) and then also extracts the core neurons of the events.


In [1]:
from builtins import execfile

execfile('imports_functions.py')
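
`execfile` was a Python 2 builtin and is not part of the Python 3 `builtins` module, so the import above may fail depending on the environment. A minimal stand-in is sketched below, assuming only that the scripts are meant to run in the notebook's global namespace. The helper name `run_script` and the throwaway demo script are hypothetical and not part of the repository.

```python
import os
import tempfile

def run_script(path, namespace):
    """Execute the Python file at `path` inside `namespace`, like Python 2's execfile."""
    with open(path, "r") as handle:
        source = handle.read()
    # compile() with the real filename keeps tracebacks pointing at the script
    exec(compile(source, path, "exec"), namespace)

# Toy demonstration with a throwaway script (not one of the repository files)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_script.py")
    with open(demo_path, "w") as handle:
        handle.write("answer = 6 * 7\n")
    demo_namespace = {}
    run_script(demo_path, demo_namespace)
```

In the notebook this would be called as `run_script('imports_functions.py', globals())`, so that the names defined by the script become available to the following cells.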

## Loading curated data from MICrONS project phase 1

The following code for data loading and selection is taken from   
https://github.com/AllenInstitute/MicronsBinder/blob/master/notebooks/intro/MostSynapsesInAndOut.ipynb   
https://github.com/AllenInstitute/MicronsBinder/blob/master/notebooks/vignette_analysis/function/structure_function_analysis.ipynb

`Neuron.pkl` contains the `segment_id` for each pyramidal neuron in the EM volume.    
`Soma.pkl` contains the soma position for all the cells in the EM volume.   
`calcium_trace.pkl` contains the calcium imaging traces (including deconvolved spikes).    
`soma_subgraph_synapses_spines_v185.csv` contains the list of synapses with root pre-/post-synaptic somas.

**CAUTION: The cell below might take some time to load the data.**

In [2]:
with open("MICrONS_data/Neuron.pkl", 'rb') as handle:
    Neuron = pickle.load(handle)
with open("MICrONS_data/Soma.pkl", 'rb') as handle:
    Soma = pickle.load(handle)
if os.path.exists("MICrONS_data/calcium_trace.pkl"):
    calcium_trace = pd.read_pickle("MICrONS_data/calcium_trace.pkl")
# print(calcium_trace)

syn_spines_df = pd.read_csv('MICrONS_data/soma_subgraph_synapses_spines_v185.csv')
# id, pre_root_id, post_root_id, cleft_vx, spine_vol_um3

syn_df = pd.read_csv('MICrONS_data/pni_synapses_v185.csv')

Get the IDs and number of recorded pyramidal neurons

In [3]:
pyc_list = Neuron["segment_id"]
n_pyc = pyc_list.shape[0]

Set the folder to which all results will be saved, and the frame duration (from the MICrONS docs).

In [4]:
exp_path = os.getcwd()
frame_duration = 0.0674 # sec, 14.8313 frames per second

#### Accessing 2-photon Calcium imaging data subset

We read only the Ca-imaging data of the cells for which the EM reconstruction is also available.   

##### CAUTION: next cell can take some time to load all calcium imaging data!

In [5]:
print("Pyramidal neurons recorded with 2-photon Calcium imaging: ",len(calcium_trace))
ophys_cell_ids = list(calcium_trace.keys())
n_frames = len(calcium_trace[ophys_cell_ids[0]]['spike'])
start_time = 200*frame_duration # 200 frames of blank screen
stop_time = (200+n_frames)*frame_duration
time = np.arange(start_time,stop_time,frame_duration)

decs = []
for ocell_id in ophys_cell_ids:
    decs.append(calcium_trace[ocell_id]["spike"]) # deconvolved Ca spiketrains

fig, ax = plt.subplots()
ax.plot(range(n_frames), decs[0])
fig.savefig(exp_path+'/results/deconvolved_Ca_spikes0.png', dpi=300, transparent=True)
plt.close()
fig.clf()
spiketrains = []
for decst in decs:
    spiketrains.append( time[:][np.nonzero(decst)] )

print("... producing spike rasterplot")
fig = plt.figure(figsize=[12.8,4.8])
for row,train in enumerate(spiketrains):
    plt.scatter( train, [row]*len(train), marker='o', edgecolors='none', s=1, c='k' )
plt.ylabel("cell IDs")
plt.xlabel("time (s)")
fig.savefig(exp_path+'/results/rasterplot.png', transparent=False, dpi=800)
plt.close()
fig.clear()
fig.clf()

Pyramidal neurons recorded with 2-photon Calcium imaging:  112
... producing spike rasterplot
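
The conversion from deconvolved traces to spike times used in the cell above can be sketched on toy data. This sketch builds the time axis by multiplying frame indexes by the frame duration, which guarantees exactly one time point per frame (`np.arange` with a floating-point step can occasionally return one extra element). The toy trace and the names `toy_trace`, `toy_time` and `toy_spiketrain` are assumptions made for illustration.

```python
import numpy as np

frame_duration = 0.0674   # s, as set earlier in the notebook
n_blank_frames = 200      # blank-screen frames preceding the recording

# Toy deconvolved trace: non-zero entries mark frames with inferred spikes
toy_trace = np.array([0.0, 0.3, 0.0, 0.0, 1.2, 0.0, 0.7])

# One time stamp per frame, offset by the blank-screen period
toy_time = (n_blank_frames + np.arange(len(toy_trace))) * frame_duration

# Spike times are the time stamps of the frames with non-zero deconvolved signal
toy_spiketrain = toy_time[np.nonzero(toy_trace)[0]]
```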


#### Create the cell indexes from the list of IDs

In [6]:
ophys_cell_indexes = range(len(ophys_cell_ids))

#### Get soma center locations

They are provided in voxel coordinates, with a voxel size of 4 × 4 × 40 nm.
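
Since the voxels are anisotropic, distances between somas should be computed after converting the coordinates to physical units. A minimal sketch, assuming the 4, 4, 40 nm voxel size given above (the coordinates are made up):

```python
import numpy as np

voxel_size_nm = np.array([4.0, 4.0, 40.0])  # x, y, z voxel size in nm

# Two made-up soma centers in voxel coordinates
soma_vx = np.array([[1000.0, 2000.0, 100.0],
                    [1500.0, 2000.0, 150.0]])

soma_um = soma_vx * voxel_size_nm / 1000.0   # nm -> µm
distance_um = np.linalg.norm(soma_um[0] - soma_um[1])
```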

In [7]:
pyc_soma_loc = np.zeros((n_pyc, 3))
for i in range(n_pyc):
    seg_id = pyc_list[i]
    pyc_soma_loc[i,:] = get_soma_loc(Soma, seg_id)

Join cell indexes with their position

In [8]:
pyc_ca_soma_loc = np.zeros((len(ophys_cell_indexes), 3))
for i in ophys_cell_indexes:
    seg_id = ophys_cell_ids[i]
    idx = np.where(pyc_list==seg_id)[0][0]
    pyc_ca_soma_loc[i,:] = pyc_soma_loc[idx,:]

#### Select only the synapses of the 2p recorded neurons

Take only the synapses and spines whose pre- and post-synaptic root IDs both belong to 2-photon-recorded pyramidal neurons:

In [9]:
pyc_ca_syn_df = syn_df.query('(pre_root_id in @ophys_cell_ids) and (post_root_id in @ophys_cell_ids)')
pyc_ca_syn_spines_df = syn_spines_df.query('(pre_root_id in @ophys_cell_ids) and (post_root_id in @ophys_cell_ids)')

Also take all the post-synaptic spines on the imaged neurons, including those whose inputs come from non-imaged neurons, or even from neurons whose somas are not present in the EM volume:

In [10]:
postsyn_spines_df = syn_spines_df.query('post_root_id in @ophys_cell_ids')
print(postsyn_spines_df.shape)

(1669, 17)


## Structural Analysis

First, we build an adjacency matrix of the 2p/EM-imaged neurons:

In [11]:
adjacency_matrix = np.zeros((len(ophys_cell_indexes), len(ophys_cell_indexes)))
for i in ophys_cell_indexes:
    root_id = ophys_cell_ids[i]
    root_id_postsyn_list = pyc_ca_syn_df[pyc_ca_syn_df['pre_root_id'] == root_id]['post_root_id'].tolist()
    # print(root_id_postsyn_list)
    for ps in root_id_postsyn_list:
        if ps in ophys_cell_ids:
            # ips = np.argwhere(ophys_cell_ids==ps)[0][0]
            ips = ophys_cell_ids.index(ps)
            # print(ps, ips)
            adjacency_matrix[i][ips]=1
np.save(exp_path+'/results/adjacency_matrix.npy', adjacency_matrix)
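
The same binary adjacency matrix can also be built without nested loops. The sketch below uses toy stand-ins for `ophys_cell_ids` and `pyc_ca_syn_df` (the IDs are made up) and assumes the synapse table has already been restricted to imaged cells, as done above, so that every ID maps to an index.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for ophys_cell_ids and pyc_ca_syn_df (made-up IDs)
toy_ids = [101, 102, 103]
toy_syn = pd.DataFrame({"pre_root_id":  [101, 101, 103, 101],
                        "post_root_id": [102, 103, 101, 102]})

# Map each cell ID to its row/column index
id_to_index = {cid: i for i, cid in enumerate(toy_ids)}
pre_idx = toy_syn["pre_root_id"].map(id_to_index).to_numpy()
post_idx = toy_syn["post_root_id"].map(id_to_index).to_numpy()

# Binary adjacency: entry [i, j] is 1 if cell i makes at least one synapse onto cell j
toy_adjacency = np.zeros((len(toy_ids), len(toy_ids)))
toy_adjacency[pre_idx, post_idx] = 1
```

Multiple synapses between the same pair of cells collapse into a single entry, as in the loop version.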

We then compute several global, purely structural measures.    
This includes **panel 2B** (with inset).

In [12]:
global_degree_counts = []
global_degree_distribution = []
global_structural_betweeness = []
global_structural_motifs = []
global_structural_motifsratio = []
global_structural_motifsurrogates = []

execfile('structural_analysis.py')

global_structural_betweeness.append(betweenness_centrality)
global_degree_counts.append(degree_counts)
global_degree_distribution.append(degrees)
global_structural_motifs.append(motifs)
global_structural_motifsurrogates.append(surrogate_motifs)
global_structural_motifsratio.append(motifsratio)

... adjacency matrix
... loaded
    number of vertices: 112
... Network nodes degrees
... Degree distributions
... Betweenness centrality
... Motifs


  motifsratio = motifs/surrogate_motifs


## Dynamical Analysis

Here we first detect population events, then quantify them, and finally extract their core neurons.   
This analysis extends (from step 5 on) that performed by Filipchuk et al. 2022:
1. Compute population instantaneous firing rate (bin)

2. Establish significance threshold for population events   
    2.1 compute Inter-Spike Intervals (ISI) of the original spiketrains   
    2.2 reshuffle the ISIs 100 times   
    2.3 compute the population instantaneous firing rate for each surrogate time-binned rasterplot   

3. Find population events   
    3.1 smoothed firing rate   
    3.2 the instantaneous threshold is the 99th percentile of the surrogate population instantaneous firing rates   
    3.3 peaks of the smoothed firing rate that rise above the threshold mark population events   
    3.4 the minima before and after a peak are taken as start and end times of the population event   
    
4. Find clusters of events   
    4.1 produce a cell id signature vector of each population event   
    4.2 perform clustering linkage by complete cross-correlation of event vectors   
    4.3 produce surrogate clusters to establish a cluster significance threshold     
    4.4 find the event reproducibility within each cluster (cluster events cross-correlation)   

5. Find core neurons   
    5.1 take all neurons participating in a cluster of events   
    5.2 use the 99% of the cluster event reproducibility as significance threshold   
    5.3 if the occurrence frequency of a neuron exceeds the threshold, the neuron is taken as core   
    5.4 remove core neurons that fire unspecifically, both within and outside their cluster   
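
Steps 1 to 3.2 of the list above can be sketched on synthetic data as follows. This is only an illustration under stated assumptions, not the code of `dynamical_analysis.py`: the function names, the toy spike trains, the bin size and the per-bin reading of the 99% threshold are all choices made here, and smoothing and peak detection are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def population_rate(spiketrains, bin_edges):
    """Number of spikes per time bin, summed over all cells."""
    counts = np.zeros(len(bin_edges) - 1)
    for train in spiketrains:
        counts += np.histogram(train, bins=bin_edges)[0]
    return counts

def isi_shuffled(train, rng):
    """Surrogate train with the same inter-spike intervals in random order."""
    if len(train) < 2:
        return np.array(train, dtype=float)
    isi = np.diff(train)
    return train[0] + np.concatenate(([0.0], np.cumsum(rng.permutation(isi))))

# Toy data: 20 cells firing at random times over 100 s (not the MICrONS recordings)
duration, bin_size = 100.0, 0.5
toy_trains = [np.sort(rng.uniform(0, duration, size=30)) for _ in range(20)]
bin_edges = np.arange(0.0, duration + bin_size, bin_size)

# Step 1: observed population rate
observed_rate = population_rate(toy_trains, bin_edges)

# Steps 2.1-2.3: population rate of each ISI-shuffled surrogate raster
surrogate_rates = np.array([
    population_rate([isi_shuffled(t, rng) for t in toy_trains], bin_edges)
    for _ in range(100)
])

# Step 3.2: per-bin threshold, here the 99th percentile across surrogates
threshold = np.percentile(surrogate_rates, 99, axis=0)
candidate_bins = np.nonzero(observed_rate > threshold)[0]
```

Shuffling the ISIs keeps each cell's firing statistics while destroying the timing relations between cells, which is what makes the surrogates a reasonable null model for coordinated population events.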
    
### All panels of Figure 1

All panels of Figure 1 are produced in the next cell by the file `dynamical_analysis.py`.

In [13]:
global_structural_motif_cores = {k: 0 for k in range(16)}
global_structural_motif_others = {k: 0 for k in range(16)}
global_events_sec = []
global_events_duration = []
global_cluster_number = []
global_cluster_selfsimilarity = []

execfile('dynamical_analysis.py')

global_events_sec.append(events_sec)
global_events_duration.extend(events_durations_f)
global_cluster_number.append(nclusters)
global_cluster_selfsimilarity.extend(reproducibility_list)

... firing statistics
    population firing: 1.23±1.14 sp/frame
    smoothing
... generating surrogates to establish population event threshold
    cells firing rate: 0.01±0.10 sp/s
    event size threshold (mean): 3.2139256168072126
... find population events in the trial
... signatures of population events
    number of events: 226
    number of events per sec: 0.1228247519048706
    events duration: 0.674±0.258
    events size: 8.000±3.956
... Similarity of events matrix
... clustering
    linkage
    surrogate events signatures for clustering threshold
    cluster reproducibility threshold: 0.24984204332868037
    cluster size threshold: 2
    #clusters: 91
    below size threshold: 5
    sorting events signatures by cluster
... finding cluster cores
    refining cluster cores
    gathering cores from all clusters ...
    # cores: 88
    # non-cores: 24
    plotting single events rasterplots ...
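
The core selection of step 5 can be illustrated on a toy cluster. In this sketch each event is a binary participation vector over cells, and a cell is taken as core when the fraction of events in which it fires reaches a threshold. The matrix and the threshold value are made up: in the notebook the threshold is derived from the reproducibility of the events in the cluster, and the refinement of step 5.4 is not shown.

```python
import numpy as np

# Toy cluster: 5 events (rows) x 6 cells (columns); 1 = the cell fired in the event
cluster_signatures = np.array([
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
])

# Fraction of the cluster's events in which each cell participates
occurrence_frequency = cluster_signatures.mean(axis=0)

# Arbitrary threshold for illustration; the notebook derives it from the
# reproducibility of the events in the cluster
core_threshold = 0.75
toy_core_indexes = np.nonzero(occurrence_frequency >= core_threshold)[0]
```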



---
## Mixing structural and dynamical analysis results to characterize cores

Here, we collect the evidence that challenges the hypothesis that core neurons are strongly connected.   
We tested two fundamental attractor-driven assumptions:   
- synapses between cores are more efficient than those between other neurons   
- circuits made by cores involve more recursive connections toward cores

The real question we set out to answer is: **What does "strong connections" mean?**   

### Spine volumes (panel 2A, first four boxes)

We can take the volume of a post-synaptic spine as a proxy for its functional efficacy.   
For each set of reproducible cluster cores, we collect the volumes of their post-synaptic spines.   

In [15]:
print("... post-synaptic spine volume of core and other synapses (within each cluster)")
core2core_spine_vol = [] # µm3
core2other_spine_vol = []
other2core_spine_vol = []
other2other_spine_vol = []
set_ids = set(ophys_cell_ids)
for dyn_core_ids in clusters_cores:
    dyn_other_ids = set_ids.symmetric_difference(dyn_core_ids)
    # searching
    # id, pre_root_id, post_root_id, cleft_vx, spine_vol_um3
    core2core_synapse_df = pyc_ca_syn_spines_df.query('(pre_root_id in @dyn_core_ids) and (post_root_id in @dyn_core_ids)')
    if not core2core_synapse_df.empty:
        core2core_spine_vol.extend( core2core_synapse_df['spine_vol_um3'].tolist() )
    core2other_synapse_df = pyc_ca_syn_spines_df.query('(pre_root_id in @dyn_core_ids) and (post_root_id in @dyn_other_ids)')
    if not core2other_synapse_df.empty:
        core2other_spine_vol.extend( core2other_synapse_df['spine_vol_um3'].tolist() )
    other2core_synapse_df = pyc_ca_syn_spines_df.query('(pre_root_id in @dyn_other_ids) and (post_root_id in @dyn_core_ids)')
    if not other2core_synapse_df.empty:
        other2core_spine_vol.extend( other2core_synapse_df['spine_vol_um3'].tolist() )
    other2other_synapse_df = pyc_ca_syn_spines_df.query('(pre_root_id in @dyn_other_ids) and (post_root_id in @dyn_other_ids)')
    if not other2other_synapse_df.empty:
        other2other_spine_vol.extend( other2other_synapse_df['spine_vol_um3'].tolist() )

# description
print("    {:d} core2core spines, volume: {:1.3f}±{:1.2f} µm3".format(len(core2core_spine_vol), np.mean(core2core_spine_vol),np.std(core2core_spine_vol)) )
print("    "+str(stats.describe(core2core_spine_vol)) )
print("    {:d} core2other spines, volume: {:1.3f}±{:1.2f} µm3".format(len(core2other_spine_vol), np.mean(core2other_spine_vol),np.std(core2other_spine_vol)) )
print("    "+str(stats.describe(core2other_spine_vol)) )
print("    {:d} other2core spines, volume: {:1.3f}±{:1.2f} µm3".format(len(other2core_spine_vol), np.mean(other2core_spine_vol),np.std(other2core_spine_vol)) )
print("    "+str(stats.describe(other2core_spine_vol)) )
print("    {:d} other2other spines, volume: {:1.3f}±{:1.2f} µm3".format(len(other2other_spine_vol), np.mean(other2other_spine_vol),np.std(other2other_spine_vol)) )
print("    "+str(stats.describe(other2other_spine_vol)) )

# significance
kwstat,pval = stats.kruskal(core2core_spine_vol, other2other_spine_vol)
print("    core-core vs other-other spine size Kruskal-Wallis test results:",kwstat,pval)
if len(core2core_spine_vol)>0 and len(other2other_spine_vol)>0:
    d,_ = stats.ks_2samp(core2core_spine_vol, other2other_spine_vol) # non-parametric measure of effect size [0,1]
    print('    Kolmogorov-Smirnov Effect Size: %.3f' % d)
kwstat,pval = stats.kruskal(core2core_spine_vol, core2other_spine_vol)
print("    core-core vs core-other spine size Kruskal-Wallis test results:",kwstat,pval)
if len(core2core_spine_vol)>0 and len(core2other_spine_vol)>0:
    d,_ = stats.ks_2samp(core2core_spine_vol, core2other_spine_vol) # non-parametric measure of effect size [0,1]
    print('    Kolmogorov-Smirnov Effect Size: %.3f' % d)
kwstat,pval = stats.kruskal(core2core_spine_vol, other2core_spine_vol)
print("    core-core vs other-core spine size Kruskal-Wallis test results:",kwstat,pval)
if len(core2core_spine_vol)>0 and len(other2core_spine_vol)>0:
    d,_ = stats.ks_2samp(core2core_spine_vol, other2core_spine_vol) # non-parametric measure of effect size [0,1]
    print('    Kolmogorov-Smirnov Effect Size: %.3f' % d)

# all spine volumes by type
fig, ax = plt.subplots()
xs = np.random.normal(1, 0.04, len(core2core_spine_vol))
plt.scatter(xs, core2core_spine_vol, edgecolor='forestgreen', facecolor=('#228B224d'))
xs = np.random.normal(2, 0.04, len(core2other_spine_vol))
plt.scatter(xs, core2other_spine_vol, edgecolor='forestgreen', facecolor=('#228B224d'))
xs = np.random.normal(3, 0.04, len(other2core_spine_vol))
plt.scatter(xs, other2core_spine_vol, edgecolor='silver', facecolor=('#D3D3D34d'))
xs = np.random.normal(4, 0.04, len(other2other_spine_vol))
plt.scatter(xs, other2other_spine_vol, edgecolor='silver', facecolor=('#D3D3D34d'))
vp = ax.violinplot([core2core_spine_vol,core2other_spine_vol,other2core_spine_vol,other2other_spine_vol], widths=0.3, showextrema=False, showmedians=True)
for pc in vp['bodies']:
    pc.set_edgecolor('black')
for pc in vp['bodies'][0:2]:
    pc.set_facecolor('#228B224d')
for pc in vp['bodies'][2:]:
    pc.set_facecolor('#D3D3D34d')
vp['cmedians'].set_color('orange')
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.ylabel('Spine Volume (µm^3)')
plt.xticks([1, 2, 3, 4], ["core-core\n(n={:d})".format(len(core2core_spine_vol)), "core-other\n(n={:d})".format(len(core2other_spine_vol)),"other-core\n(n={:d})".format(len(other2core_spine_vol)),"other-other\n(n={:d})".format(len(other2other_spine_vol))])
fig.savefig(exp_path+'/results/global_cores_others_spine_vol.svg', transparent=True)
fig.savefig(exp_path+'/results/global_cores_others_spine_vol.png', transparent=True, dpi=1200)
plt.close()
fig.clf()

... post-synaptic spine volume of core and other synapses (within each cluster)
    2609 core2core spines, volume: 0.079±0.08 µm3
    DescribeResult(nobs=2609, minmax=(0.0069091717153331, 0.4958473865189629), mean=0.0788826974027215, variance=0.007030806570867594, skewness=2.2538533062951736, kurtosis=5.092917354192226)
    13068 core2other spines, volume: 0.079±0.08 µm3
    DescribeResult(nobs=13068, minmax=(0.0069091717153331, 0.5923887058109223), mean=0.07874288803329178, variance=0.006845741873721701, skewness=2.393753914341554, kurtosis=6.417018099468153)
    14216 other2core spines, volume: 0.078±0.08 µm3
    DescribeResult(nobs=14216, minmax=(0.0069091717153331, 0.5923887058109223), mean=0.07759392306985267, variance=0.006462010221758793, skewness=2.3519409185640088, kurtosis=6.09507362313156)
    105657 other2other spines, volume: 0.080±0.08 µm3
    DescribeResult(nobs=105657, minmax=(0.0069091717153331, 0.5923887058109223), mean=0.08014096184459575, variance=0.0071181490931690
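
The pairwise comparisons in the cell above all follow the same pattern, which can be wrapped in a small helper. The sketch below checks that both groups are non-empty before calling either test. The helper name `compare_groups` is hypothetical and the samples are synthetic, not the MICrONS spine volumes.

```python
import numpy as np
from scipy import stats

def compare_groups(a, b):
    """Kruskal-Wallis H test plus the KS statistic as a [0, 1] effect size.

    Returns None when either group is empty, since neither test is defined.
    """
    if len(a) == 0 or len(b) == 0:
        return None
    h_stat, p_value = stats.kruskal(a, b)
    ks_distance, _ = stats.ks_2samp(a, b)
    return h_stat, p_value, ks_distance

rng = np.random.default_rng(1)
# Synthetic log-normal "spine volumes" (not the MICrONS values)
group_a = rng.lognormal(mean=-2.8, sigma=0.8, size=500)
group_b = rng.lognormal(mean=-2.8, sigma=0.8, size=5000)
result = compare_groups(group_a, group_b)
```

Reporting the KS distance next to the p-value is useful with groups this large, where even very small differences can reach statistical significance.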

### Non-Ca-imaged and outside EM volume inputs (panel 2A, last two boxes)

Core responses could be driven by sources that were not imaged or that lie outside the EM volume. How can we rule this out (or at least reduce our lack of knowledge)?   
We can ask: *Are there more or stronger spines made by non-imaged neurons (either local or far) on cores or others?*   
We have this information since we know the cell ID of all somas in the volume. We can take the spines having presynaptic ID different from the known Ca-imaged IDs or different from the somas within the EM volume.
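
The selection described above amounts to an anti-join on the presynaptic ID. A minimal pandas sketch with made-up IDs and volumes follows. The names `imaged_ids`, `core_ids` and `toy_spines` are illustrative stand-ins for `ophys_cell_ids`, one entry of `clusters_cores` and `postsyn_spines_df`.

```python
import pandas as pd

# Toy stand-ins (made-up IDs): imaged cells, one cluster's cores, and a spine table
imaged_ids = {1, 2, 3, 4}
core_ids = {1, 2}
other_ids = imaged_ids - core_ids
toy_spines = pd.DataFrame({
    "pre_root_id":   [9, 8, 1, 7, 9],
    "post_root_id":  [1, 3, 2, 2, 4],
    "spine_vol_um3": [0.05, 0.10, 0.08, 0.02, 0.20],
})

# "Far" inputs: the presynaptic ID is not one of the Ca-imaged cells
from_far = ~toy_spines["pre_root_id"].isin(imaged_ids)
far2core = toy_spines[from_far & toy_spines["post_root_id"].isin(core_ids)]
far2other = toy_spines[from_far & toy_spines["post_root_id"].isin(other_ids)]
```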

In [16]:
print("... postsynaptic spines on cores or others from sources non-imaged or without soma in the volume")
far2core_spine_vol = [] # µm3
far2other_spine_vol = []
set_ids = set(ophys_cell_ids)
for dyn_core_ids in clusters_cores:
    dyn_other_ids = set_ids.symmetric_difference(dyn_core_ids)
    # searching
    # id, pre_root_id, post_root_id, cleft_vx, spine_vol_um3
    far2core_synapse_df = postsyn_spines_df.query('(pre_root_id not in @set_ids) and (post_root_id in @dyn_core_ids)')
    if not far2core_synapse_df.empty:
        far2core_spine_vol.extend( far2core_synapse_df['spine_vol_um3'].tolist() )
    far2other_synapse_df = postsyn_spines_df.query('(pre_root_id not in @set_ids) and (post_root_id in @dyn_other_ids)')
    if not far2other_synapse_df.empty:
        far2other_spine_vol.extend( far2other_synapse_df['spine_vol_um3'].tolist() )
        
# description
print("    {:d} far2core spines, volume: {:1.3f}±{:1.2f} µm3".format(len(far2core_spine_vol), np.mean(far2core_spine_vol),np.std(far2core_spine_vol)) )
print("    "+str(stats.describe(far2core_spine_vol)) )
print("    {:d} far2other spines, volume: {:1.3f}±{:1.2f} µm3".format(len(far2other_spine_vol), np.mean(far2other_spine_vol),np.std(far2other_spine_vol)) )
print("    "+str(stats.describe(far2other_spine_vol)) )

# significance
kwstat,pval = stats.kruskal(far2core_spine_vol, far2other_spine_vol)
print("    far-core vs far-other spine size Kruskal-Wallis test results:",kwstat,pval)
if len(far2core_spine_vol)>0 and len(far2other_spine_vol)>0:
    d,_ = stats.ks_2samp(far2core_spine_vol, far2other_spine_vol) # non-parametric measure of effect size [0,1]
    print('    Kolmogorov-Smirnov Effect Size: %.3f' % d)

# all spine volumes by type
fig, ax = plt.subplots()
xs = np.random.normal(1, 0.04, len(far2core_spine_vol))
plt.scatter(xs, far2core_spine_vol, edgecolor='forestgreen', facecolor=('#228B224d'))
xs = np.random.normal(2, 0.04, len(far2other_spine_vol))
plt.scatter(xs, far2other_spine_vol, edgecolor='silver', facecolor=('#D3D3D34d'))
vp = ax.violinplot([far2core_spine_vol,far2other_spine_vol], widths=0.3, showextrema=False, showmedians=True)
for pc in vp['bodies']:
    pc.set_edgecolor('black')
for pc,cb in zip(vp['bodies'],['#228B224d','#D3D3D34d']):
    pc.set_facecolor(cb)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.ylabel('Spine Volume (µm^3)')
plt.xticks([1, 2], ["far-core\n(n={:d})".format(len(far2core_spine_vol)), "far-other\n(n={:d})".format(len(far2other_spine_vol))])
fig.savefig(exp_path+'/results/global_far_cores_others_spine_vol.svg', transparent=True)
fig.savefig(exp_path+'/results/global_far_cores_others_spine_vol.png', transparent=True, dpi=1200)
plt.close()
fig.clf()

... postsynaptic spines on cores or others from sources non-imaged or without soma in the volume
    4693 far2core spines, volume: 0.082±0.08 µm3
    DescribeResult(nobs=4693, minmax=(0.0045489328169941, 0.5061542490551949), mean=0.08184275456291715, variance=0.00678000711236265, skewness=1.9177180498107906, kurtosis=3.6625919947332726)
    35475 far2other spines, volume: 0.081±0.08 µm3
    DescribeResult(nobs=35475, minmax=(0.0045489328169941, 0.5061542490551949), mean=0.08052572966146575, variance=0.007194358612202637, skewness=2.127805468310405, kurtosis=4.738962873722818)
    far-core vs far-other spine size Kruskal-Wallis test results: 4.165532770873525 0.04125443665776142
    Kolmogorov-Smirnov Effect Size: 0.025


### 3-Motif connectivity of cores and others (panel 2C)

In [17]:
# For each set of reproducible cluster cores we count their connectivity motifs.
set_indexes = set(ophys_cell_indexes)
for dyn_core in clusters_cores:
    dyn_core_indexes = set([ophys_cell_ids.index(strid) for strid in dyn_core])
    dyn_other_indexes = set_indexes.symmetric_difference(dyn_core_indexes)
    for mclass, mlist in motif_vertices.items():
        for mtriplet in mlist:
            intersection_cores = len(list(dyn_core_indexes.intersection(mtriplet)))
            intersection_others = len(list(dyn_other_indexes.intersection(mtriplet)))
            global_structural_motif_cores[mclass] += intersection_cores
            global_structural_motif_others[mclass] += intersection_others

fig = plt.figure()
plt.bar(global_structural_motif_cores.keys(), global_structural_motif_cores.values(), color='forestgreen')
plt.ylabel('cores occurrences')
plt.yscale('log')
plt.ylim([0.7,plt.ylim()[1]])
plt.xlabel('motifs types')
fig.savefig(exp_path+'/results/global_motifs_cores.svg', transparent=True)
plt.close()
fig.clear()
fig.clf()
fig = plt.figure()
plt.bar(global_structural_motif_others.keys(), global_structural_motif_others.values(), color='silver')
plt.ylabel('non-cores occurrences')
plt.yscale('log')
plt.ylim([0.7,plt.ylim()[1]])
plt.xlabel('motifs types')
fig.savefig(exp_path+'/results/global_motifs_others.svg', transparent=True)
plt.close()
fig.clear()
fig.clf()
print("... saved mutual connectivity of cores and others")

... saved mutual connectivity of cores and others
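
The counting performed in the cell above can be reproduced on a toy motif table. Note what is being counted: each motif triplet contributes the number of its vertices that are cores (or others), so a triplet containing two cores adds 2 to the core count of its motif class. The motif classes and vertex indexes below are made up.

```python
# Toy motif table: motif class -> list of vertex triplets (made-up indexes)
toy_motif_vertices = {
    2: [(0, 1, 2), (1, 2, 3)],
    7: [(0, 1, 4)],
}
toy_core_indexes = {0, 1}
toy_other_indexes = {2, 3, 4}

cores_per_class = {}
others_per_class = {}
for motif_class, triplets in toy_motif_vertices.items():
    # Each triplet contributes the number of its vertices that are cores (or others)
    cores_per_class[motif_class] = sum(len(toy_core_indexes & set(t)) for t in triplets)
    others_per_class[motif_class] = sum(len(toy_other_indexes & set(t)) for t in triplets)
```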


In [18]:
# dgraph is already defined from the structural_analysis included file
print("    graph diameter (#vertices):", dgraph.diameter(directed=True, unconn=True, weights=None))
print("    graph average path length (#vertices):", dgraph.average_path_length(directed=True, unconn=True))

    graph diameter (#vertices): 7
    graph average path length (#vertices): 2.5824324324324324


In [19]:
dgraph.vs["ophys_cell_id"] = ophys_cell_ids
is_id_core = np.array( [0] * len(ophys_cell_ids) )
is_id_core[core_indexes] = 1
dgraph.vs["is_core"] = is_id_core.tolist()
is_syn_core = np.array( [0] * len(pyc_ca_syn_df) )
for cid in [item for sublist in clusters_cores for item in sublist]:
    is_syn_core[pyc_ca_syn_df['pre_root_id'] == cid] = 1
dgraph.es["is_core"] = is_syn_core.tolist()
color_dict = {0: "gray", 1: "green"}
ig.plot(dgraph, exp_path+'/results/all_ring.svg', layout=dgraph.layout("circle"),
        edge_curved=0.2,
        edge_color=[color_dict[is_core] for is_core in dgraph.es["is_core"]],
        edge_width=0.5,
        edge_arrow_size=0.1,
        vertex_size=5,
        vertex_color=[color_dict[is_core] for is_core in dgraph.vs["is_core"]],
        vertex_frame_color=[color_dict[is_core] for is_core in dgraph.vs["is_core"]],
        margin=50)
print('... assortativity')
# assortativity is the preference of a network's nodes to attach to others that are similar in some way
print("    overall:", dgraph.assortativity_nominal("is_core", directed=True) )
# cores degree distro vs others degree distro
# biological networks typically show negative assortativity, or disassortative mixing, or disassortativity, as high degree nodes tend to attach to low degree nodes.
print("    assortativity degree:", dgraph.assortativity_degree(directed=True) )
# the proportion of mutual connections in a directed graph.
print("    reciprocity:", dgraph.reciprocity(ignore_loops=True, mode='default') )

... assortativity
    overall: -0.041273885350318486
    assortativity degree: -0.08993903571766572
    reciprocity: 0.17627118644067796
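
To make the "overall" value above easier to interpret, the nominal assortativity coefficient can be computed by hand from the mixing matrix of edge end-point labels, following Newman's definition: the fraction of same-label edges minus the fraction expected by chance, normalized so that 1 means perfect assortative mixing. The sketch below is a plain NumPy illustration on toy graphs, not igraph's implementation, and the function name is hypothetical.

```python
import numpy as np

def nominal_assortativity(edges, labels):
    """Newman's assortativity coefficient for a categorical vertex label.

    edges  : list of (source, target) vertex indexes of a directed graph
    labels : category of each vertex (for example 0 = other, 1 = core)
    """
    categories = sorted(set(labels))
    position = {c: k for k, c in enumerate(categories)}
    mixing = np.zeros((len(categories), len(categories)))
    for source, target in edges:
        mixing[position[labels[source]], position[labels[target]]] += 1
    mixing /= mixing.sum()                    # fraction of edges per label pair
    a = mixing.sum(axis=1)                    # label share at edge sources
    b = mixing.sum(axis=0)                    # label share at edge targets
    expected = np.sum(a * b)                  # same-label share expected by chance
    return (np.trace(mixing) - expected) / (1.0 - expected)

toy_labels = [1, 1, 0, 0]
same_label_edges = [(0, 1), (1, 0), (2, 3), (3, 2)]   # only core-core and other-other
cross_label_edges = [(0, 2), (2, 0), (1, 3), (3, 1)]  # only core-other and other-core
r_same = nominal_assortativity(same_label_edges, toy_labels)
r_cross = nominal_assortativity(cross_label_edges, toy_labels)
```

A value close to zero, as printed above, indicates that cores connect to cores about as often as expected by chance.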


### Degree centrality of cores and others (panel 2D)

In [20]:
print('... degree centrality')
degree_centrality_cores = dgraph.degree(core_indexes, mode='all', loops=True)
degree_centrality_others = dgraph.degree(other_indexes, mode='all', loops=True)
# description
print("    cores: "+str(stats.describe(degree_centrality_cores)) )
print("    others: "+str(stats.describe(degree_centrality_others)) )
# significance
print("    Welch t test:  %.3f p= %.3f" % stats.ttest_ind(degree_centrality_cores, degree_centrality_others, equal_var=False))
d,_ = stats.ks_2samp(degree_centrality_cores, degree_centrality_others) # non-parametric measure of effect size [0,1]
print('    Kolmogorov-Smirnov Effect Size: %.3f' % d)

fig, ax = plt.subplots()
xs = np.random.normal(1, 0.04, len(degree_centrality_cores))
plt.scatter(xs, degree_centrality_cores, alpha=0.3, c='forestgreen')
xs = np.random.normal(2, 0.04, len(degree_centrality_others))
plt.scatter(xs, degree_centrality_others, alpha=0.3, c='silver')
vp = ax.violinplot([degree_centrality_cores,degree_centrality_others], widths=0.3, showextrema=False, showmedians=True)
for pc in vp['bodies']:
    pc.set_edgecolor('black')
for pc,cb in zip(vp['bodies'],['#228B224d','#D3D3D34d']):
    pc.set_facecolor(cb)
vp['cmedians'].set_color('orange')
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.ylabel('Degree')
plt.xticks([1, 2], ["core\n(n={:d})".format(len(degree_centrality_cores)), "other\n(n={:d})".format(len(degree_centrality_others))])
fig.savefig(exp_path+'/results/global_cores_others_degree.svg', transparent=True)
plt.close()
fig.clf()

... degree centrality
    cores: DescribeResult(nobs=88, minmax=(0, 45), mean=6.1022727272727275, variance=88.71355799373042, skewness=2.1119993597130113, kurtosis=4.1256883192165175)
    others: DescribeResult(nobs=24, minmax=(0, 15), mean=3.0416666666666665, variance=21.432971014492754, skewness=2.0273508499589137, kurtosis=2.516557870224295)
    Welch t test:  2.220 p= 0.029
    Kolmogorov-Smirnov Effect Size: 0.216
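
With `mode='all'`, the degree of a vertex in a directed graph is the sum of its incoming and outgoing connections. The same quantity can be read directly from an adjacency matrix, as in this sketch on a made-up 4-cell network:

```python
import numpy as np

# Toy directed adjacency matrix: entry [i, j] = 1 means i -> j
toy_adjacency = np.array([[0, 1, 1, 0],
                          [1, 0, 0, 0],
                          [0, 0, 0, 1],
                          [0, 0, 0, 0]])

out_degree = toy_adjacency.sum(axis=1)  # connections made by each cell
in_degree = toy_adjacency.sum(axis=0)   # connections received by each cell
total_degree = in_degree + out_degree   # what mode='all' counts in a directed graph
```

A reciprocal pair, such as cells 0 and 1 here, contributes two to the total degree of each partner.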


### Betweenness of cores and others (panel 2E)

In [21]:
print('... betweenness')
cores_betweenness = betweenness_centrality[core_indexes]
others_betweenness = betweenness_centrality[other_indexes]
cores_betweenness[cores_betweenness<0.0001] = 0.0001 # for later stats and plotting
others_betweenness[others_betweenness<0.0001] = 0.0001
print("    cores: "+str(stats.describe(cores_betweenness)) )
print("    others: "+str(stats.describe(others_betweenness)) )
# significance
print("    Welch t test:  %.3f p= %.3f" % stats.ttest_ind(cores_betweenness, others_betweenness, equal_var=False))
d,_ = stats.ks_2samp(cores_betweenness, others_betweenness) # non-parametric measure of effect size [0,1]
print('    Kolmogorov-Smirnov Effect Size: %.3f' % d)

fig, ax = plt.subplots()
xs = np.random.normal(1, 0.04, len(cores_betweenness))
plt.scatter(xs, cores_betweenness, alpha=0.3, c='forestgreen')
xs = np.random.normal(2, 0.04, len(others_betweenness))
plt.scatter(xs, others_betweenness, alpha=0.3, c='silver')
vp = ax.violinplot([cores_betweenness,others_betweenness], widths=0.3, showextrema=False, showmedians=True)
for pc in vp['bodies']:
    pc.set_edgecolor('black')
for pc,cb in zip(vp['bodies'],['#228B224d','#D3D3D34d']):
    pc.set_facecolor(cb)
vp['cmedians'].set_color('orange')
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.ylabel('Betweenness')
# plt.yscale('log')
plt.xticks([1, 2], ["core\n(n={:d})".format(len(cores_betweenness)), "other\n(n={:d})".format(len(others_betweenness))])
fig.savefig(exp_path+'/results/global_cores_others_betweenness.svg', transparent=True)
plt.close()
fig.clf()

... betweenness
    cores: DescribeResult(nobs=88, minmax=(0.0001, 617.3584776334777), mean=38.31566688803619, variance=11380.497190231206, skewness=3.768333135018123, kurtosis=15.106092348481077)
    others: DescribeResult(nobs=24, minmax=(0.0001, 82.43214285714285), mean=5.8845839105339115, variance=409.60474942908024, skewness=3.1726288362792947, kurtosis=8.410869197953964)
    Welch t test:  2.680 p= 0.009
    Kolmogorov-Smirnov Effect Size: 0.178


### Hub scores of cores and others (panel 2F)

Are the cores also hubs of the network?

In [22]:
print("... hub score")
# what is the overlap of cores and hubs?
# Hub
hub_scores = np.array(dgraph.hub_score(weights=None, scale=True, return_eigenvalue=False))
hub_scores_cores = hub_scores[core_indexes]
hub_scores_others = hub_scores[other_indexes]
print("    hub cores: "+str(stats.describe(hub_scores_cores)) )
print("    hub others: "+str(stats.describe(hub_scores_others)) )
# significance
print("    Welch t test:  %.3f p= %.3f" % stats.ttest_ind(hub_scores_cores, hub_scores_others, equal_var=False))
d,_ = stats.ks_2samp(hub_scores_cores, hub_scores_others) # non-parametric measure of effect size [0,1]
print('    Kolmogorov-Smirnov Effect Size: %.3f' % d)
# all hub scores by type
fig, ax = plt.subplots()
xs = np.random.normal(1, 0.04, len(hub_scores_cores))
plt.scatter(xs, hub_scores_cores, alpha=0.3, c='forestgreen')
xs = np.random.normal(2, 0.04, len(hub_scores_others))
plt.scatter(xs, hub_scores_others, alpha=0.3, c='silver')
vp = ax.violinplot([hub_scores_cores,hub_scores_others], widths=0.3, showextrema=False, showmedians=True)
for pc in vp['bodies']:
    pc.set_edgecolor('black')
for pc,cb in zip(vp['bodies'],['#228B224d','#D3D3D34d']):
    pc.set_facecolor(cb)
vp['cmedians'].set_color('orange')
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.ylabel('Hub score')
plt.xticks([1, 2], ["core\n(n={:d})".format(len(hub_scores_cores)), "other\n(n={:d})".format(len(hub_scores_others))])
fig.savefig(exp_path+'/results/global_cores_others_hub_score.svg', transparent=True)
plt.close()
fig.clf()

... hub score
    hub cores: DescribeResult(nobs=88, minmax=(0.0, 1.0), mean=0.16632219693795944, variance=0.05505671254257365, skewness=1.727342180182276, kurtosis=1.9688198150126173)
    hub others: DescribeResult(nobs=24, minmax=(0.0, 0.6583271768882885), mean=0.08454815571881226, variance=0.018204968226009473, skewness=3.3686065750336445, kurtosis=11.919146282777985)
    Welch t test:  2.198 p= 0.032
    Kolmogorov-Smirnov Effect Size: 0.223
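
For readers unfamiliar with hub scores: in Kleinberg's formulation, a vertex is a good hub if it points to good authorities, and a good authority if it is pointed to by good hubs. The hub scores are then the principal eigenvector of the adjacency matrix times its transpose. The power-iteration sketch below illustrates the idea on a toy graph. It is not igraph's implementation, and the function name is hypothetical.

```python
import numpy as np

def hub_scores_power_iteration(adjacency, n_iter=200):
    """Kleinberg hub scores: principal eigenvector of A @ A.T, scaled to a maximum of 1."""
    adjacency = np.asarray(adjacency, dtype=float)
    hubs = np.ones(adjacency.shape[0])
    for _ in range(n_iter):
        authorities = adjacency.T @ hubs   # good authorities are pointed to by good hubs
        hubs = adjacency @ authorities     # good hubs point to good authorities
        if hubs.max() == 0:                # graph without edges
            break
        hubs = hubs / hubs.max()
    return hubs

# Toy graph: vertex 0 projects to 1, 2 and 3; vertex 1 projects to 2
toy_adjacency = np.array([[0, 1, 1, 1],
                          [0, 0, 1, 0],
                          [0, 0, 0, 0],
                          [0, 0, 0, 0]])
toy_hubs = hub_scores_power_iteration(toy_adjacency)
```

Vertices without outgoing connections get a hub score of zero, which explains the minimum of 0.0 in both groups above.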


### Flow of cores or others (panel 2G)

So far, we used structural (graph) measures of neurons selected by looking at their activity.   
So, in a sense, we already crossed structural and dynamical information about the network.    
However, we could push this even further.   
To understand how core centrality could affect population events, we considered the flow, that is, the number and identity of the connections that must be cut to interrupt the circuit between the first and the last firing neuron of each population event (e.g. the subgraphs made by the neurons active in the events depicted in Fig. 1E). 

In [None]:
print('... flow between beginning and end of event cells')
# Flow
# Returns all the cuts between the source and target vertices in a directed graph.
# This function lists all edge-cuts between a source and a target vertex. Every cut is listed exactly once.
core_edges = []
other_edges = []
for sts,stscol in zip(source_target_cidx,source_target_color):
    cuts = dgraph.all_st_cuts(source=sts[0], target=sts[1])
    for cut in cuts:
        for edge in cut.es:
            source_vertex_id = edge.source
            target_vertex_id = edge.target
            if source_vertex_id in core_indexes:
                core_edges.append(source_vertex_id)
            elif target_vertex_id in core_indexes:
                core_edges.append(target_vertex_id)
            else:
                other_edges.append(source_vertex_id)
                other_edges.append(target_vertex_id)
# total number of entries per group
# (each cut edge touching no core contributes both of its endpoints to the others)
cores_edges_count = len(core_edges)
others_edges_count = len(other_edges)
print("    cores in the edges removed to stop the flow:",cores_edges_count)
print("    others in the edges removed to stop the flow:",others_edges_count)

# print(core_edges)
x = np.array(["cores", "others"])
y = np.array([cores_edges_count, others_edges_count])
fig, ax = plt.subplots()
plt.bar(x, y, color=['forestgreen','silver'])
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.ylabel('Count of cutting-flow edges')
plt.xticks([0, 1], ["core\n(n={:d})".format(cores_edges_count), "other\n(n={:d})".format(others_edges_count)])
fig.savefig(exp_path+'/results/global_cores_others_flow.svg', transparent=True)
plt.close()
fig.clf()

... flow between beginning and end of event cells
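
The counting scheme of the cell above can be tried on a toy graph. The sketch below is a plain-Python stand-in, not igraph's algorithm: it enumerates minimal edge cuts by brute force, which is only practical for a handful of edges. The helper names (`reachable`, `minimal_st_cuts`), the toy edges and the choice of vertex 1 as a "core" are all assumptions made for illustration.

```python
from itertools import combinations

def reachable(edges, source, target):
    """Return True if target can be reached from source along directed edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def minimal_st_cuts(edges, source, target):
    """Brute-force list of the minimal edge sets whose removal separates source from target."""
    cuts = []
    for size in range(1, len(edges) + 1):
        for removed in combinations(edges, size):
            removed_set = set(removed)
            # a superset of an already found cut is not minimal
            if any(cut <= removed_set for cut in cuts):
                continue
            remaining = [e for e in edges if e not in removed_set]
            if not reachable(remaining, source, target):
                cuts.append(removed_set)
    return cuts

# toy directed graph: 0 plays the first firing neuron, 3 the last one
toy_edges = [(0, 1), (0, 2), (1, 3), (2, 3), (1, 2)]
toy_cores = {1}  # hypothetical core vertex

cuts = minimal_st_cuts(toy_edges, 0, 3)

# same attribution rule as in the cell above
core_hits, other_hits = [], []
for cut in cuts:
    for src, tgt in cut:
        if src in toy_cores:
            core_hits.append(src)
        elif tgt in toy_cores:
            core_hits.append(tgt)
        else:
            other_hits.extend([src, tgt])

print("cuts:", len(cuts))
print("core entries:", len(core_hits), "other entries:", len(other_hits))
```

Note that an edge touching a core adds one entry to the core list, while an edge touching no core adds two entries (source and target) to the other list, so the two bars are not counted in the same unit.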


---
## Supplementary figure 3
   
To keep cores within the attractor framework, core activity could be sustained by indirect synaptic feedback, through highly connected secondary paths.   
If the attractor idea holds, one would expect core neurons to have shorter paths or cycles than the others. 

### Shortest paths of cores and others (panel S3A)

In [23]:
print('... number of shortest paths between cores')
core_shortestpaths = []
for coreidx in core_indexes:
    othercores = list(core_indexes)
    othercores.remove(coreidx)
    shrtpth = dgraph.get_shortest_paths(coreidx, to=othercores, weights=None, mode='out', output='vpath')
    for strp in shrtpth:
        # len(strp) counts the vertices on the path; an unreachable target gives an empty path (0)
        core_shortestpaths.append(len(strp))
other_shortestpaths = []
for otheridx in other_indexes:
    otherothers = list(other_indexes)
    otherothers.remove(otheridx)
    shrtpth = dgraph.get_shortest_paths(otheridx, to=otherothers, weights=None, mode='out', output='vpath')
    for strp in shrtpth:
        other_shortestpaths.append(len(strp))
print("    cores shortest paths: "+str(stats.describe(core_shortestpaths)) )
print("    others shortest paths: "+str(stats.describe(other_shortestpaths)) )
print("    equal variances? "+str(stats.levene(core_shortestpaths, other_shortestpaths)) )
# significance
print("    Welch t test:  %.3f p= %.3f" % stats.ttest_ind(core_shortestpaths, other_shortestpaths, equal_var=False))
d,_ = stats.ks_2samp(core_shortestpaths, other_shortestpaths) # non-parametric measure of effect size [0,1]
print('    Kolmogorov-Smirnov Effect Size: %.3f' % d)
fig, ax = plt.subplots()
xs = np.random.normal(1, 0.04, len(core_shortestpaths))
plt.scatter(xs, core_shortestpaths, alpha=0.3, c='forestgreen')
xs = np.random.normal(2, 0.04, len(other_shortestpaths))
plt.scatter(xs, other_shortestpaths, alpha=0.3, c='silver')
vp = ax.violinplot([core_shortestpaths,other_shortestpaths], widths=0.3, showextrema=False, showmedians=True)
for pc in vp['bodies']:
    pc.set_edgecolor('black')
for pc,cb in zip(vp['bodies'],['#228B224d','#D3D3D34d']):
    pc.set_facecolor(cb)
vp['cmedians'].set_color('orange')
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.ylabel('Shortest path length')
plt.xticks([1, 2], ["core\n(n={:d})".format(len(core_shortestpaths)), "other\n(n={:d})".format(len(other_shortestpaths))])
fig.savefig(exp_path+'/results/global_cores_others_shortestpath.svg', transparent=True)
plt.close()
fig.clf()

... number of shortest paths between cores
    cores shortest paths: DescribeResult(nobs=7656, minmax=(0, 8), mean=0.6973615464994776, variance=2.2021929070201556, skewness=1.9183980841905637, kurtosis=2.3952491144903725)
    others shortest paths: DescribeResult(nobs=552, minmax=(0, 6), mean=0.4329710144927536, variance=1.5236296325521452, skewness=2.6288663230231437, kurtosis=5.331450467096188)
    equal variances? LeveneResult(statistic=16.688595530457878, pvalue=4.446402238140857e-05)
    Welch t test:  4.789 p= 0.000
    Kolmogorov-Smirnov Effect Size: 0.083
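
When reading these numbers, keep in mind that with `output='vpath'` each path is a list of vertex ids, so `len(strp)` counts vertices rather than edges, and the minimum of 0 in the output above suggests that targets that cannot be reached come back as empty paths. The following plain-Python breadth-first search is a minimal sketch of that convention on a toy graph; `shortest_vertex_path` and `toy_adj` are illustrative names, not part of the notebook or of igraph.

```python
from collections import deque

def shortest_vertex_path(adj, source, target):
    """Breadth-first search; returns the vertices from source to target,
    or an empty list when target cannot be reached."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # walk back through the predecessors to rebuild the path
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return []

# toy directed graph: 0 -> 1 -> 2, and 3 -> 0 (so 3 is not reachable from 0)
toy_adj = {0: [1], 1: [2], 2: [], 3: [0]}
reached = shortest_vertex_path(toy_adj, 0, 2)
unreached = shortest_vertex_path(toy_adj, 0, 3)
print(reached, "vertices:", len(reached), "edges:", len(reached) - 1)
print(unreached, "vertices:", len(unreached))
```

Under this convention, a low mean can reflect many unreachable pairs rather than short routes, so it may be worth separating the zeros from the positive lengths before comparing the two groups.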




### Cycles between cores or others (panel S3B)

Cycles are built by starting from a core (or other) neuron and growing paths through its neighbors, one step at a time up to a maximum length; a cycle is found when the last vertex of a path is the starting one.

In [1]:
print('... cycles')
# breadth first search of paths and unique cycles
def get_cycles(adj, paths, maxlen):
    # tracking the actual path length:
    maxlen -= 1
    nxt_paths = []
    # iterating over all paths:
    for path in paths['paths']:
        # iterating neighbors of the last vertex in the path:
        for nxt in adj[path[-1]]:
            # attaching the next vertex to the path:
            nxt_path = path + [nxt]
            if path[0] == nxt and min(path) == nxt:
                # the next vertex is the starting vertex, we found a cycle
                # we keep the cycle only if the starting vertex has the
                # lowest vertex id, to avoid having the same cycles
                # more than once
                paths['cycles'].append(nxt_path)
                # if you don't need the starting vertex
                # included at the end:
                # paths['cycles'].append(path)
            elif nxt not in path:
                # keep the path only if we don't create
                # an internal cycle in the path
                nxt_paths.append(nxt_path)
    # paths grown by one step:
    paths['paths'] = nxt_paths
    if maxlen == 0:
        # the final return when maximum search length reached
        return paths
    else:
        # recursive return, to grow paths further
        return get_cycles(adj, paths, maxlen)
# Comparison of core based cycles vs other based cycles
maxlen = 10 # the maximum length to limit computation time
# creating an adjacency list
# note: v.neighbors() uses igraph's default mode ('all'), so edge direction is ignored here
adj = [[n.index for n in v.neighbors()] for v in dgraph.vs]
# recursive search of cycles
# for each core vertex as candidate starting point
core_cycles = []
for start in core_indexes:
    core_cycles += get_cycles(adj,{'paths': [[start]], 'cycles': []}, maxlen)['cycles']
print("    # core-based cycles:", len(core_cycles) )
# count the length of loops involving 1 core
core_cycles_lens = [len(cycle) for cycle in core_cycles]
print("    core-based cycles length: "+str(stats.describe(core_cycles_lens)) )

other_cycles = []
for start in other_indexes:
    other_cycles += get_cycles(adj,{'paths': [[start]], 'cycles': []}, maxlen)['cycles']
print("    # other-based cycles:", len(other_cycles) )
# count the length of loops involving 1 other
other_cycles_lens = [len(cycle) for cycle in other_cycles]
print("    other-based cycles length: "+str(stats.describe(other_cycles_lens)) )

d,_ = stats.ks_2samp(core_cycles_lens, other_cycles_lens) # non-parametric measure of effect size [0,1]
print('    Kolmogorov-Smirnov Effect Size: %.3f' % d)
# all cycles by type
fig, ax = plt.subplots()
xs = np.random.normal(1, 0.04, len(core_cycles_lens))
plt.scatter(xs, core_cycles_lens, alpha=0.3, c='forestgreen')
xs = np.random.normal(2, 0.04, len(other_cycles_lens))
plt.scatter(xs, other_cycles_lens, alpha=0.3, c='silver')
bp = ax.boxplot([core_cycles_lens,other_cycles_lens], notch=0, sym='', showcaps=False, zorder=10)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.ylabel('Cycles length')
plt.xticks([1, 2], ["core\n(n={:d})".format(len(core_cycles_lens)), "other\n(n={:d})".format(len(other_cycles_lens))])
fig.savefig(exp_path+'/results/global_cores_others_cyclelens.png', transparent=True, dpi=1500)
plt.close()
fig.clf()

... cycles


NameError: name 'dgraph' is not defined
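
The breadth-first cycle search described in this section can be tried without the MICrONS graph. The sketch below is an iterative restatement of the same idea on a toy directed adjacency list; `find_cycles` and `toy_adj` are illustrative names and are not part of the notebook.

```python
def find_cycles(adj, start, maxlen):
    """Grow paths from `start` one vertex at a time and collect closed cycles.

    A cycle is kept only when `start` is its lowest vertex id, so that the
    same cycle is not reported once per vertex it contains.
    """
    paths, cycles = [[start]], []
    for _ in range(maxlen):
        grown = []
        for path in paths:
            for nxt in adj[path[-1]]:
                if nxt == start and min(path) == start:
                    # the path came back to its starting vertex
                    cycles.append(path + [nxt])
                elif nxt not in path:
                    # extend only if no vertex would be visited twice
                    grown.append(path + [nxt])
        paths = grown
    return cycles

# toy directed graph: 0 -> 1 -> 2 -> 0 and 1 -> 2 -> 3 -> 1
toy_adj = [[1], [2], [0, 3], [1]]
all_cycles = []
for start in range(len(toy_adj)):
    all_cycles += find_cycles(toy_adj, start, maxlen=4)

print(all_cycles)
print([len(cycle) for cycle in all_cycles])
```

Each cycle is stored with the starting vertex repeated at the end, so `len(cycle)` is the number of edges plus one. Because of the lowest-id rule, a cycle is reported only from the one vertex with the smallest index, which matters when the starting points are restricted to a subset such as the cores.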