#### **Creating more custom plots, figures, and graphs.**

This notebook generates some of the figures found in Gagnon et al. 2024. Most of them were created to highlight specific results that were not already produced during the actual analysis. *Please note that this notebook does not perform any analysis; it simply reuses previously computed results and visualizes them.*

**Here is a list of the custom figures that will be created with the following cells:**

1. Radar plot showing the mean and standard deviation of the cognitive and behavioral values for all studies combined, plus one radar plot for each individual study.
1. Graph network with the subjects coming from the BANDA and GESTE studies labelled.

In [1]:
# Imports
import os

import networkx as nx
import numpy as np
import pandas as pd
from scipy.stats import f_oneway
from sklearn.preprocessing import MinMaxScaler

from neurostatx.clustering.viz import radar_plot
from neurostatx.network.utils import fetch_attributes_df, fetch_edge_data
from neurostatx.network.viz import visualize_network

In [2]:
# Setting up relevant paths.
repository_path = "/Users/anthonygagnon/code/Article-s-Code/" # CHANGE THIS
abcd_base_path = "/Volumes/T7/CCPM/ABCD/Release_5.1/abcd-data-release-5.1/" # CHANGE THIS
geste_base_dir = "/Volumes/T7/CCPM/GESTE/" # CHANGE THIS
banda_dir = '/Volumes/T7/CCPM/BANDA/BANDARelease1.1/' # CHANGE THIS
output_folder = "/Volumes/T7/CCPM/RESULTS_JUNE_24/" # CHANGE THIS
data_dir = f"{output_folder}/fuzzyclustering/"
output_dir = f"{output_folder}/viz/"

# Create output directory if it does not exist.
os.makedirs(output_dir, exist_ok=True)

In [3]:
# Load up the graph network file.
G = nx.read_gml(f"{data_dir}/GraphNetwork.gml")

In [4]:
# Fetch the attributes from the graph network.
attributes_df = fetch_attributes_df(G, attributes='')

# Fetch the edge data. 
edge_df = fetch_edge_data(G)



#### **Generate a radar plot for the global population (all 3 studies).**

In [5]:
# Assertion that index from the attributes df is the same as the edge df.
assert np.all(attributes_df.index == edge_df.index), "Mismatch in index between attributes and edge data."

In [6]:
# Rescale each variable to the [0, 5] range with min-max scaling. Using a loop to avoid hardcoding.
vars = ['Internalization', 'Externalization', 'Stress', 'VA', 'EFPS', 'MEM']

for var in vars:
    attributes_df.loc[:, var] = MinMaxScaler((0, 5)).fit_transform(attributes_df[[var]])
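
Min-max scaling to the (0, 5) range maps each column linearly so that its minimum becomes 0 and its maximum becomes 5. Below is a minimal NumPy sketch of that formula on made-up values. The `rescale` helper is hypothetical and is not part of this notebook or of scikit-learn.

```python
import numpy as np


def rescale(values, low=0.0, high=5.0):
    """Min-max rescale a 1-D array to [low, high] (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # Constant column: map every value to the lower bound.
        return np.full_like(values, low)
    return (values - values.min()) / span * (high - low) + low


# Made-up scores, for illustration only.
scores = np.array([10.0, 15.0, 20.0, 30.0])
scaled = rescale(scores)
print(scaled)
```

Because the transformation is linear, the ordering of subjects and the relative distances between them are preserved within each variable; only the axis range changes, which keeps all variables comparable on the same radar plot.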

In [7]:
membership = np.argmax(edge_df.values, axis=1)
radar_plot(attributes_df.loc[:, vars], membership, title='FCM Clustering',
           output=f"{output_dir}/RadarPlotCombined.png")
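
The `membership` vector is a hard assignment: each subject receives the index of the column holding its largest value. Below is a small sketch on a made-up matrix, assuming (as the code above suggests) that rows are subjects and columns are profiles.

```python
import numpy as np

# Made-up fuzzy membership values: 4 subjects (rows) x 3 profiles (columns).
memberships = np.array([
    [0.70, 0.20, 0.10],
    [0.10, 0.60, 0.30],
    [0.25, 0.25, 0.50],
    [0.40, 0.40, 0.20],  # tie between profiles 0 and 1
])

# Index of the largest value in each row; ties resolve to the first column.
hard_labels = np.argmax(memberships, axis=1)
print(hard_labels)

# The winning membership value indicates how clear-cut each assignment is.
confidence = memberships.max(axis=1)
print(confidence)
```

Note that the resulting labels are zero-based, and that a subject with nearly equal membership in two profiles is still forced into a single one.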

In [8]:
# Generating radar plot for each dataset.
attributes_df.loc[:, 'profiles'] = membership

abcd_df = attributes_df[attributes_df.cohort == 1]
banda_df = attributes_df[attributes_df.cohort == 2]
geste_df = attributes_df[attributes_df.cohort == 3]

radar_plot(abcd_df.loc[:, vars], abcd_df.profiles, title='ABCD Clustering',
           output=f"{output_dir}/RadarPlotABCD.png")
radar_plot(banda_df.loc[:, vars], banda_df.profiles, title='BANDA Clustering',
           output=f"{output_dir}/RadarPlotBANDA.png")
radar_plot(geste_df.loc[:, vars], geste_df.profiles, title='GESTE Clustering',
           output=f"{output_dir}/RadarPlotGESTE.png")

#### **Exporting results from the one-way ANOVA between profiles on the raw cognitive and behavioral variables.**

When generating the radar plot, a one-way ANOVA is computed to determine the statistical difference between profiles for each raw variable. However, the results are not exported in tabular format but only appended to the radar plot. The next cells compute the ANOVA and export the results in a table. The exported table includes results from the combined and individual studies.
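
As a reminder of what the one-way ANOVA measures, the sketch below computes the F statistic by hand on made-up values (the p-value is left out). The `f_statistic` helper is hypothetical and only illustrates the textbook formula. The sketch also checks that applying the same linear rescaling to every group leaves F unchanged, so a min-max scaled variable yields the same statistic as the unscaled one.

```python
import numpy as np


def f_statistic(groups):
    """One-way ANOVA F statistic from a list of 1-D samples (illustrative)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    k, n = len(groups), all_values.size
    # Variability of the group means around the grand mean.
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean.
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up values for three profiles.
profiles = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_raw = f_statistic(profiles)

# Apply the same linear rescaling to every group.
f_scaled = f_statistic([2.5 * np.asarray(g) - 1.0 for g in profiles])
print(f_raw, f_scaled)
```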

In [25]:
# Computing the ANOVA for the combined dataset.
anova_combined = []
for var in vars:
    f, p = f_oneway(*[attributes_df.loc[attributes_df.profiles == i, var] for i in np.unique(attributes_df.profiles)])
    anova_combined.append([var, f, p])


In [26]:
# Computing the ANOVA for each dataset.
anova_abcd = []
anova_banda = []
anova_geste = []

for var in vars:
    f, p = f_oneway(*[abcd_df.loc[abcd_df.profiles == i, var] for i in np.unique(abcd_df.profiles)])
    anova_abcd.append([var, f, p])

    f, p = f_oneway(*[banda_df.loc[banda_df.profiles == i, var] for i in np.unique(banda_df.profiles)])
    anova_banda.append([var, f, p])

    f, p = f_oneway(*[geste_df.loc[geste_df.profiles == i, var] for i in np.unique(geste_df.profiles)])
    anova_geste.append([var, f, p])

In [27]:
# Merging into a single Pandas DataFrame.
anova_combined_df = pd.DataFrame(anova_combined, columns=['Variable', 'F_comb', 'p_comb'])
anova_combined_df.set_index('Variable', inplace=True)
anova_abcd_df = pd.DataFrame(anova_abcd, columns=['Variable', 'F_abcd', 'p_abcd'])
anova_abcd_df.set_index('Variable', inplace=True)
anova_banda_df = pd.DataFrame(anova_banda, columns=['Variable', 'F_banda', 'p_banda'])
anova_banda_df.set_index('Variable', inplace=True)
anova_geste_df = pd.DataFrame(anova_geste, columns=['Variable', 'F_geste', 'p_geste'])
anova_geste_df.set_index('Variable', inplace=True)

# Merging the DataFrames.
anova_df = pd.concat([anova_combined_df, anova_abcd_df, anova_banda_df, anova_geste_df], axis=1)
anova_df.to_excel(f"{output_dir}/ANOVA_results.xlsx")
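
The per-dataset loops above repeat the same steps, so they could be folded into a single helper. The sketch below is one possible refactoring, not the code used for the article: `anova_table` is a hypothetical function, and the data are randomly generated for illustration.

```python
import numpy as np
import pandas as pd
from scipy.stats import f_oneway


def anova_table(df, variables, suffix, group_col="profiles"):
    """Return one row per variable with F and p columns tagged by `suffix`."""
    rows = []
    for var in variables:
        # One sample per profile present in this DataFrame.
        samples = [grp[var].to_numpy() for _, grp in df.groupby(group_col)]
        f_val, p_val = f_oneway(*samples)
        rows.append({"Variable": var, f"F_{suffix}": f_val, f"p_{suffix}": p_val})
    return pd.DataFrame(rows).set_index("Variable")


# Made-up data: two variables, three profiles, two cohorts.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "VA": rng.normal(size=60),
    "MEM": rng.normal(size=60),
    "profiles": np.tile([0, 1, 2], 20),
    "cohort": np.repeat([1, 2], 30),
})

tables = [anova_table(demo, ["VA", "MEM"], "comb")]
for code, name in {1: "abcd", 2: "banda"}.items():
    tables.append(anova_table(demo[demo.cohort == code], ["VA", "MEM"], name))

# Side-by-side table, one block of F/p columns per dataset.
summary = pd.concat(tables, axis=1)
print(summary.round(3))
```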


#### **Generate a Graph Network file with GESTE and BANDA data labelled.**

The next cells generate a graph network figure highlighting the subjects coming from the BANDA and GESTE studies. The main purpose of this figure is to evaluate the distribution of both studies across the graph network. Since both studies are projected onto the ABCD clustering results, we want to ensure that their subjects cover the majority of the graph, and not only specific regions.

In [47]:
# Generate colormap for the different cohorts.
label = attributes_df['cohort'].values

nodes_cmap = []
for i in label:
    if i == 1:
        nodes_cmap.append("darkgrey")
    elif i == 2:
        nodes_cmap.append("red")
    else:
        nodes_cmap.append("orange")

# Create node alpha list.
nodes_alpha = []
for i in nodes_cmap:
    if i == "darkgrey":
        nodes_alpha.append(0.3)
    else:
        nodes_alpha.append(1)
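
The same color and transparency assignment can be written with a lookup table, which keeps the cohort-to-color mapping in one place. Below is a minimal sketch on made-up cohort codes, using the coding from the earlier cells (1 = ABCD, 2 = BANDA, anything else = GESTE). The constant names are hypothetical.

```python
# Hypothetical lookup table mirroring the colors used above.
COHORT_COLORS = {1: "darkgrey", 2: "red"}
DEFAULT_COLOR = "orange"   # any other cohort code (GESTE in this notebook)
FADED_COLOR = "darkgrey"   # the cohort drawn semi-transparent (ABCD)

# Made-up cohort codes, for illustration only.
cohorts = [1, 2, 3, 1, 3]

colors = [COHORT_COLORS.get(code, DEFAULT_COLOR) for code in cohorts]
# ABCD nodes are faded so that BANDA and GESTE subjects stand out.
alphas = [0.3 if color == FADED_COLOR else 1 for color in colors]
print(colors)
print(alphas)
```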

In [48]:
# Visualize the network.
visualize_network(G, output=f'{output_dir}/NetworkCohort.png',
                  weight='membership',
                  subject_node_color=nodes_cmap,
                  subject_alpha=nodes_alpha,
                  colormap='bone',
                  title='Cohort labelling.',
                  legend_title='Membership Values')   