# Null distribution of agreement between LSNs

Here, agreement matrices are calculated for data-driven community partitions. An agreement matrix is an $N_{roi}\times N_{roi}$ matrix calculated for a set of community partitions (the set may also contain a single partition). An element $D_{ij}$ of the agreement matrix indicates how many times nodes $i$ and $j$ were placed within the same community. If the input set consists of a single partition, agreement is either 0 (the nodes are not part of the same community) or 1 (both nodes share the same community). The agreement matrix can be summarized by calculating block means over reference communities. This procedure reduces the size of the agreement matrix from $N_{roi}\times N_{roi}$ to $N_{networks}\times N_{networks}$, where $N_{networks}$ is the number of reference communities.
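The counting rule described above can be illustrated with a minimal sketch. The helper name `agreement_matrix` is hypothetical and is not part of `dn_utils`. Setting the diagonal to zero is an assumption made here for illustration; the pipeline used in this notebook may treat self-pairs differently.

```python
import numpy as np


def agreement_matrix(partitions):
    """Count how often each pair of nodes shares a community.

    `partitions` is a (n_partitions, n_nodes) array of community labels;
    a single 1-D partition is also accepted. Hypothetical helper.
    """
    partitions = np.atleast_2d(partitions)
    n_nodes = partitions.shape[1]
    d = np.zeros((n_nodes, n_nodes), dtype=int)
    for labels in partitions:
        # 1 where nodes i and j carry the same label in this partition
        d += (labels[:, None] == labels[None, :]).astype(int)
    # Assumption: self-agreement is not of interest, so zero the diagonal
    np.fill_diagonal(d, 0)
    return d


# Single partition: entries can only be 0 or 1
single = agreement_matrix(np.array([0, 0, 1, 1]))

# Two partitions: entries count co-assignments, so they range from 0 to 2
several = agreement_matrix(np.array([[0, 0, 1, 1],
                                     [0, 1, 1, 1]]))
```

With a single partition the result is a binary matrix, while with several partitions each entry lies between 0 and the number of partitions.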

The block-mean agreement matrix `d_networks`, called LSN agreement, indicates for every pair of reference communities **the probability that two randomly selected ROIs (one from the first community and one from the second community) are placed within the same data-driven community**. Diagonal elements of `d_networks` reflect how stable a given community is, with a value of 1 indicating that any two ROIs from that reference community are part of the same data-driven community. In other words, the data-driven community structure contains a community that includes at least all ROIs from the reference community. Off-diagonal elements of `d_networks` reflect the tendency of two reference communities to be placed within the same data-driven community. High agreement values indicate increased communication or similarity between communities.
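As a rough illustration of the block-mean summary for a single partition, consider the sketch below. The function `block_mean_agreement` is a hypothetical stand-in and should not be read as the implementation of `agreement_networks` from `dn_utils`. It assumes that self-pairs are excluded from diagonal blocks, which is one possible convention.

```python
import numpy as np


def block_mean_agreement(labels, reference, reference_ids):
    """Average single-partition agreement over pairs of reference communities.

    labels        : data-driven community label of each ROI
    reference     : reference community id of each ROI
    reference_ids : ids defining the row and column order of the output
    Hypothetical helper; self-pairs are excluded (an assumption).
    """
    labels = np.asarray(labels)
    reference = np.asarray(reference)
    reference_ids = list(reference_ids)

    # Binary agreement: 1 if two ROIs share a data-driven community
    same = (labels[:, None] == labels[None, :]).astype(float)
    off_diag = ~np.eye(len(labels), dtype=bool)

    n = len(reference_ids)
    out = np.full((n, n), np.nan)
    for a, id_a in enumerate(reference_ids):
        for b, id_b in enumerate(reference_ids):
            block = np.ix_(reference == id_a, reference == id_b)
            valid = off_diag[block]
            if valid.any():
                out[a, b] = same[block][valid].mean()
    return out


# Toy example: reference community 0 stays together, community 1 is split
reference = np.array([0, 0, 0, 1, 1])
labels = np.array([5, 5, 5, 5, 7])
d_net = block_mean_agreement(labels, reference, [0, 1])
```

In this toy example the first diagonal element equals 1 because all ROIs of reference community 0 share one data-driven community, whereas the second diagonal element equals 0 because the two ROIs of reference community 1 were separated.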

To test whether LSN agreement differs between task conditions, a Monte Carlo strategy for significance testing can be employed. This strategy requires creating a null distribution of the test statistic of interest. For example, ...

In [None]:
import json
from os.path import join

import numpy as np
import pandas as pd
from dn_utils.networks import agreement_networks
from dn_utils.path import path
from tqdm.notebook import tqdm

### Settings

In [None]:
atlas = "combined_roi"

n_nulls = 10_000
gamma_range = np.arange(0.25, 3.5, 0.25)

In [None]:
# Load correlation matrices and metadata
path_corrmats = join(path["bsc"], "corrmats")
with open(join(path_corrmats, atlas, "corrmats_aggregated.json"), "r") as f:
    corrmats_meta = json.load(f)

# Load ROI information
df_roi = pd.read_csv(
    join(path_corrmats, atlas, "roi_table_filtered.csv"), index_col=0)
df_roi = df_roi.reset_index()

n_subjects = len(corrmats_meta["dim1"])
n_conditions = len(corrmats_meta["dim2"])
n_perr_sign = len(corrmats_meta["dim3"])
n_rois = len(corrmats_meta["dim4"])
n_nets = len(df_roi["netName"].unique())

In [None]:
network_names = df_roi["netName"].unique() 
network_mapping = {net: i for i, net in enumerate(network_names)}

networks = np.array(df_roi["netName"].map(network_mapping))
networks_unique = list(network_mapping.values())
print("Reference communities:\n", networks)

### Randomize networks

In [None]:
for gamma in gamma_range:
    print(f"γ = {gamma}")

    gamma_str = str(float(gamma)).replace('.', '_')
    path_graph = join(path_corrmats, atlas, "unthr", f"gamma_{gamma_str}")

    # Load graph metrics
    m_aggregated = np.load(join(path_graph, "m_aggregated.npy"))
    q_aggregated = np.load(join(path_graph, "q_aggregated.npy"))
    n_subjects, n_conditions, n_perr_sign, n_roi = m_aggregated.shape

    d = np.zeros((n_subjects, n_conditions, n_perr_sign, n_roi, n_roi))
    d_networks = np.zeros((n_subjects, n_conditions, n_perr_sign, n_nets, n_nets))
    d_networks_null = np.zeros((n_nulls, ) + d_networks.shape)

    network_names = list(df_roi["netName"].unique())

    for sub_idx in tqdm(range(n_subjects)):
        for con_idx in range(n_conditions):
            for perr_sign_idx in range(n_perr_sign):

                m = m_aggregated[sub_idx, con_idx, perr_sign_idx]
                
                # Agreement averaged over LSN pairs
                d_networks[sub_idx, con_idx, perr_sign_idx] = \
                    agreement_networks(m, networks, networks_unique)

                # Monte Carlo null distribution of averaged agreement.
                # Permute a copy so that `m_aggregated` is not modified in place.
                for rep in range(n_nulls):
                    m_null = np.random.permutation(m)
                    d_networks_null[rep, sub_idx, con_idx, perr_sign_idx] = \
                        agreement_networks(m_null, networks, networks_unique)
                    
    np.save(join(path_graph, "d_networks_null.npy"), d_networks_null)
    np.save(join(path_graph, "d_networks.npy"), d_networks)
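
One way the saved null distribution could be used afterwards is to compute an empirical p-value for each observed agreement value. The sketch below is illustrative only: the helper `empirical_p_value` is hypothetical, the test is one-sided (observed agreement higher than chance), and the add-one correction is an assumption rather than the procedure necessarily used in this analysis. The toy arrays stand in for `d_networks_null` and `d_networks`.

```python
import numpy as np


def empirical_p_value(observed, null, axis=0):
    """One-sided Monte Carlo p-value with add-one correction.

    observed : array of observed statistics
    null     : array of null statistics with an extra `axis` of repetitions
    Hypothetical helper, not part of dn_utils.
    """
    null = np.asarray(null)
    n_nulls = null.shape[axis]
    # Count null draws at least as large as the observed value
    exceed = (null >= np.expand_dims(observed, axis)).sum(axis=axis)
    return (exceed + 1) / (n_nulls + 1)


# Toy stand-ins: 4 null repetitions for 2 statistics
null = np.array([[0.1, 0.9],
                 [0.2, 0.8],
                 [0.3, 0.7],
                 [0.4, 0.6]])
observed = np.array([0.35, 0.65])
p = empirical_p_value(observed, null)
```

Because the repetitions are stored along the first axis of `d_networks_null`, the same call with `axis=0` would return one p-value per subject, condition, and pair of reference communities.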