Skip to content
Hang Yang edited this page Sep 4, 2023 · 9 revisions

Welcome to the Connectional_Hierarchy wiki!

Abstract

The human cerebral cortex is connected by intricate inter-areal structural wiring and functional communication at the macroscale. The cortical hierarchy from primary sensorimotor to higher-order association areas is a unifying organizational principle across various neurobiological properties; however, previous studies have not clarified whether the connections between cortical regions also exhibit a hierarchical pattern. Here, we identify a connectional hierarchy indexed by inter-individual variability of functional connectivity edges, which continuously progresses along a hierarchical gradient from within-network connections to between-network edges connecting sensorimotor and association networks. We found that this connectional hierarchy of variability aligns with both hemodynamic and electromagnetic connectivity strength and is constrained by structural connectivity strength. Moreover, the patterning of connectional hierarchy is related to inter-regional similarity in transcriptional and neurotransmitter receptor profiles. Using the Neurosynth cognitive atlas and cortical vulnerability maps in 13 brain disorders, we found that the connectional hierarchy of variability is associated with similarity networks of cognitive relevance and that of disorder vulnerability. Finally, we found that the prominence of the hierarchical gradient of connectivity variability declines during youth. Our results revealed a novel hierarchal organizational principle at the connectional level that links multiscale human connectomes to individual variability in connectivity.

Run the codes according to the order

step_01_individual_fc_variability

This folder contains codes to extract the parcel-level BOLD signal based on Schaefer-400 and Cammoun-033 atlas, calculate the functional connectivity, and estimate the individual variability of functional connectivity. These codes are almost identical for HCP-D and HCP (or HCP-YA), except for differences in some parameters, such as the session list. We estimated the inter-individual variability and intra-individual variability based on a linear fixed-effects (LME) model, and more details are described in FC variability.

(1) step_01_separate_dtseries_left_right.m

This script is used to separate the dtseries.nii into L.func.gii and R.func.gii, and exclude the medial wall. The HCP minimal preprocessed fMRI data were post-processed using xcp-d (https://xcp-d.readthedocs.io/). Post-processed steps include nuisance regression, filtering and smooth. Post-processed data *_space-fsLR_den-91k_desc-residual_smooth_bold.dtseries.nii for each subject was used in the following analyses.

(2) step_02_extract_parcel_signals_schaefer400.m

This script is used to extract the mean BOLD signal data using Schaefer-400 atlas. The session list is different between HCP-D and HCP. For HCP-D, Four rest runs and five task runs were concatenated and separated into eight runs to estimate inter-subject variability and intra-subject variability, respectively. For HCP, four rest runs were concatenated and separated into twelve runs to estimate inter-subject variability and intra-subject variability, respectively.

(2) step_02_extract_parcel_signals_cammoun033.m

The same as step_02_extract_parcel_signals_schaefer400.m, but used the Cammoun-033 atlas.

(3) step_03_calculate_fc_schaefer400.m

This script is used to calculate the functional connectivity. The functional connectivity was calculated as the Pearson correlation between timeseries from region pairs. Schaefer-400 atlas was used, generating a 400×400 FC matrix for each subject, and there are 79,800 unique edges (399×400/2) in each matrix. Particularly, time points with FD > 0.3 were excluded from the correlation analysis. Then, the group-level FC matrix was obtained by averaging the FC matrices across participants and runs in each dataset.

(3) step_03_calculate_fc_cammoun033.m

The same as step_03_calculate_fc_schaefer400.m, but used the Cammoun-033 atlas, generating a 68×68 FC matrix for each subject. There are 2,278 unique edges (67×68/2) in each matrix.

(4) step_04_prepare_lme_data_schaefer400.m

This script is used to prepare data for fc variability calculation based on the LME model in R. The data is saved as a 2D matrix, dim: (session × subject) × edge. Taking the HCP-D dataset as an example, there are 415 subjects, each subject has 8 runs (as described in step_03_calculate_fc_schaefer400.m), and each run generates a FC matrix with 79,800 unique edges. Therefore, the dimension of the data matrix is 3,320×79,800. As the data matrix is too large, we divided it into 100 parts, each part is 3,320×798. This script also generates the subID and seesion used in the LME model.

(4) step_04_prepare_lme_data_cammoun033.m

The same as step04_prepare_lme_data_schaefer400.m, but used the Cammoun-033 atlas, and we didn't split the data matrix (3,320×2,278) into 100 parts.

(5) step_05_lme_schaefer400.R

This script is used to estimate the inter-variability and intra-variability of functional connectivity using a linear mixed-effects (LME) model.

(5) step_05_lme_cammoun033.R

The same as step_02_extract_parcel_signals_schaefer400.m, but used the Cammoun-033 atlas.

(6) step_06_get_inter_individual_fc_variability.m

The script is used to get the inter-individual FC variability matrix for both Schaefer-400 and Cammoun-033 atlas. Particularly, we normalized the inter-individual variability matrix by dividing the corresponding intra-individual variability. Particularly, the limbic network was excluded from subsequent analysis due to a substantial signal loss, especially in the orbitofrontal and ventral temporal cortex.


step_02_connectional_hierarchy_of_inter_individual_fc_variability

This folder contains codes to generate figures and results of Figure 1. Individual variability of edge-wise functional connectivity reveals connectional hierarchy in the human brain. In this study, we used inter-individual FC variability as the index of connectional hierarchy.

(1) step_01_connectional_hierarchy_of_fc_variability.m

We plotted the inter-individual FC variability matrix at the nodal-level and network-level for both the HCP-D and HCP datasets. The similarity between the two datasets was measured by Spearman rank correlation. We grouped all edges into three types, including within-sensorimotor (S-S), within-association (A-A), and between sensorimotor-association (S-A), and compared their mean differences in FC variability using a permutation test (10,000 iterations).

(2) step_02_density_plot_nodal_level.py

Density plot for the similarity between FC variability of HCP-D and HCP at the nodal level.

(3) step_03_scatterplot_network_level.R

Scatter plot for the similarity between FC variability of HCP-D and HCP at the network level.

(4) step_04_barplot_connectional_hierarchy.R

Bar plot for the connectional hierarchy represented by within-network to between-network FC variability for each functional network (e.g., DM-DM, DM-FP, DM-VA, DM-DA, DM-SM, and DM-VS).

(5) step_05_barplot_S_A_axis.R

Bar plot for the FC variability values within-sensorimotor (S-S), within-association (A-A), and between sensorimotor-association (S-A).


step_03_bold_meg_functional_network

This folder contains codes to generate figures and results of Figure 2. Hemodynamic and electrophysiological functional connectome basis of connectional hierarchy. The group-averaged BOLD functional connectivity for each dataset was generated in step_03_calculate_fc_schaefer400.m. The MEG functional connectivity was estimated at six frequency bands, from delta (2 to 4 Hz), theta (5 to 7 Hz), alpha (8 to 12 Hz), beta (13 to 29 Hz), low-gamma (30 to 59 Hz) and high-gamma (60 to 90 Hz).

(1) step_01_bold_functional_connectivity.m

We plotted the BOLD functional connectivity matrix for both the HCP-D and HCP datasets, and calculated their similarity with FC variability using Spearman rank correlation.

(2) step_02_density_plot_bold_fc.py

Bar plot for the similarity between FC variability and BOLD FC for both HCP-D and HCP at the nodal level.

(3) step_03_meg_connectivity.m

We plotted the MEG functional connectivity matrix at the six frequency bands, and calculated their similarity with FC variability using Spearman rank correlation.

(4) step_04_barplot_fc_variability_meg_connectivity_corr.R

Bar plot for the similarity between FC variability and MEG connectivity for both HCP-D and HCP at the nodal level across the six frequency bands.


step_04_white_matter_structural_network

This folder contains codes to generate figures and results of Figure 3. Structural connectome basis of connectional hierarchy in FC variability. The probabilistic tractography was performed using MRtrix3, with 40 million streamlines, length range from 30mm to 250mm, FOD power = 0.33.

(1) step_01_structural_connectivity.m

We plotted the structural connectivity (SC) matrix for both the HCP-D and HCP datasets, and calculated their similarity with FC variability using Spearman rank correlation. The connection strength of each edge was log-transformed. Particularly, we defined a sc_mask in our study by setting the weight of the edge to be 0 for all participants if the connection strength of this edge was 0 in at least one participant. Then, we calculated the Spearman rank correlation of connection strength between SC network and individual variability network across all edges with non-zero streamline counts.

(2) step_02_density_plot.py

Density plot for the similarity between FC variability and Structural connectivity for both HCP-D and HCP at the nodal level.


step_05_correlated_gene_expression_network

This folder contains codes to generate figures and results of Figure 4. Transcriptional similarity of gene expression underlying connectional hierarchy. The gene expression data were obtained from the Allen Human Brain Atlas (AHBA, https://human.brain-map.org), which is a public transcriptional atlas contributed by DNA microarrays sampled from six postmortem human brains (5 males, aged 24-57). The microarray data were available in the left hemisphere for four donors, and two donors containing tissue samples from both hemispheres.

(1) step_01_get_gene_expressions.py

We used abagen toolbox (https://abagen.readthedocs.io/) to process the gene expression data, and got the regional gene expression data (400×15,633, 15,633 is the number of retained genes) based on Schaefer-400 atlas. The main function is presented below, and there are several key parameters:

expression, report = abagen.get_expression_data(schaefer['maps'], lr_mirror="bidirectional", missing="interpolate",sample_norm='srs', gene_norm='srs',return_donors=False, return_report=True)
  • lr_mirror="bidirectional" , as only two donors containing tissue samples from both hemispheres, to increase spatial coverage, tissue samples were mirrored bilaterally across the left and right hemispheres.
  • missing="interpolate" , If a brain region was not assigned a tissue sample, every voxel in the region was mapped to the nearest tissue sample from the donor in order to generate a dense, interpolated expression map. The average of these expression values was taken across all voxels in the region, weighted by the distance between each voxel and the sample mapped to it, in order to obtain an estimate of the parcellated expression values for the missing region.
  • sample_norm='srs' , gene_norm='srs' , we used scaled robust sigmoid to normalize microarray expression data.

(2) step_02_transcriptional_similarity.m

First, we constrained our analyses to 2,256 brain-specific genes (listed here) according to previous studies. Then, we constructed the correlated gene expression network (400×400) by computing the pairwise Pearson correlation between regional transcriptional profiles (1×2,256), and a higher inter-regional transcriptional similarity indicates that the two regions are modulated by similar genes. We plotted the correlated gene expression network at both nodal-level and network-level, and calculated their similarity with FC variability using Spearman rank correlation at the nodal level.

(3) step_03_density_plot.py

Density plot for the similarity between FC variability and Transcriptional similarity for both HCP-D and HCP at the nodal level.


step_06_receptor_similarity_network

This folder contains codes to generate figures and results of Figure 5. Network of neurotransmitter receptors and transporters expression shapes connectional hierarchy. In brief, the the PET images for 19 different neurotransmitter receptors and transporters were acquired from 1,238 healthy participants (718 males) across multiple studies. These receptors and transporters belong to nine different neurotransmitter systems, including dopamine, norepinephrine, serotonin, acetylcholine, glutamate, GABA, histamine, cannabinoid and opioid. The region×receptor matrix has been extracted by (Hansen et al., 2022) and is available at (https://github.com/netneurolab/hansen_receptors).

(1) step_01_receptor_similarity.m

First, we normalized the regional receptor expression data (400×19) using z-score. Then, we obtained the receptor similarity network (400×400) by computing the pairwise correlation between receptor density profiles (1×19). We plotted the receptor similarity network at both nodal-level and network-level, and calculated their similarity with FC variability using Spearman rank correlation at the nodal level.

(2) step_02_density_plot.py

Density plot for the similarity between FC variability and Receptor similarity for both HCP-D and HCP at the nodal level.


step_07_cognitive_similarity_network

This folder contains codes to generate figures and results of Figure 6. Implications of connectional hierarchy in cognitions. In brief, we acquired the activation probability maps for 123 cognitive terms (listed here) and parcellated these maps using the Schaefer-400 atlas. Then, we calculated the network of cognitive similarity and compared it with FC variability.

(1) step_01_get_neurosynth_activation_probability_maps.py

We used Neurosynth toolbox (https://github.com/neurosynth/neurosynth) to get the activation probability maps for the 123 cognitive terms, and there is a detailed tutorial provided by Neurosynth. The activation probability maps were stored in files named *association-test_z.nii.gz in the output of Neurosynth.

Particularly, there are several prerequisite files for the analysis, including the data-neurosynth_version-7_coordinates.tsv.gz, data-neurosynth_version-7_metadata.tsv.gz, data-neurosynth_version-7_vocab-terms_source-abstract_type-tfidf_features.npz, data-neurosynth_version-7_vocab-terms_vocabulary.txt. We downloaded these files from https://github.com/neurosynth/neurosynth-data, and stored here. The current dataset is version 0.7, released July, 2018. Files from this version have version-7 in the filenames.

  • Files ending in coordinates.tsv.gz contain the coordinates for different versions of the Neurosynth database. They contain one row for each coordinate. These files are tab-delimited and compressed, and they can be loaded with pandas.read_table().
  • Files ending in metadata.tsv.gz contain the metadata for different versions of the Neurosynth database. Each study (ID) is stored on its own line in the file. These IDs are in the same order as the id column of the associated coordinates.tsv.gz file, but the rows will differ because the coordinates file will contain multiple rows per study. They are also in the same order as the rows in any features.npz files for the same version. These files are tab-delimited and compressed, and they can be loaded with pandas.read_table().
  • Files ending in features.npz contain feature values for different types of "vocabularies". These files are stored as compressed, sparse matrices in order to reduce file size. Each file, once loaded and reconstructed into a dense matrix, contains one row per study and one column per label. Associated labels are stored in the files ending in vocabulary.txt and study IDs can be extracted from the associated metadata.tsv.gz file.
  • Files ending in vocabulary.txt contain a list of the labels associated with the accompanying features.npz file. Each label is stored on its own line in the file. These files match the columns of the associated features.npz file.
from neurosynth import Dataset
from neurosynth import meta
import numpy as np
import pandas as pd
from scipy import sparse
import os

file_path = '.../Connectional_Hierarchy/step_07_cognitive_similarity_network/'
# Create a new Dataset instance
dataset = Dataset(file_path + 'cognitive_atlas/database.txt')

# Reconstruct the feature data into a spreadsheet-like format
feature_data_sparse = sparse.load_npz(file_path + 'cognitive_atlas/data-neurosynth_version-7_vocab-terms_source-abstract_type-tfidf_features.npz')
feature_data = feature_data_sparse.todense()
metadata_df = pd.read_table(file_path + 'cognitive_atlas/data-neurosynth_version-7_metadata.tsv.gz')
ids = metadata_df['id'].tolist()
feature_names = np.genfromtxt(file_path + 'cognitive_atlas/data-neurosynth_version-7_vocab-terms_vocabulary.txt', dtype=str, delimiter='/t').tolist()
feature_df = pd.DataFrame(index=ids, columns=feature_names, data=feature_data)

# Add features to dataset
dataset.add_features(feature_df,duplicates = 'ignore')
dataset.save(file_path + 'cognitive_atlas/dataset.pkl')
# dataset = Dataset.load('dataset.pkl') # We can skip the above steps and load the dataset directly next time

base_dir = file_path + 'cognitive_atlas/images'
out_dir = os.path.abspath(file_path + 'cognitive_atlas')
os.makedirs(out_dir, exist_ok=True)

# Load the 123 cognitive terms we used in this work
cognitive_atlas = pd.read_csv(file_path + 'cognitive_atlas/cognitive_atlas.csv') # 123 cognitive terms
selected_term = cognitive_atlas['cognitive_terms'].to_list()

for term in selected_term:
    # Run a meta-analysis based on the selected term
    ids = dataset.get_studies(features=term)
    ma = meta.MetaAnalysis(dataset, ids)
    output_dir = os.path.join(base_dir,term)
    os.makedirs(output_dir)
    ma.save_results(output_dir, term)

(2) step_02_cognitive_similarity.m

First, we parcellated and concatenated the regional activation probability maps from 123 cognitive terms using the Schaefer-400 atlas (400×123). We next generated the network of cognitive relevance by computing the Pearson correlation of regional activation probability between each two brain regions across the 123 cognitive terms.

(3) step_03_density_plot.py

Density plot for the similarity between FC variability and Cognitive similarity for both HCP-D and HCP at the nodal level.


step_08_disorder_similarity_network

This folder contains codes to generate figures and results of Figure 7. Implications of connectional hierarchy on brain disorders. The enigma data used in this study can be found here. The statistical results of cortical abnormality maps

(1) step_01_get_enigma_data.m

This script is used to load the enigma data used in this study. The cortical abnormality maps from the 13 disorders were first concatenated (68×13), and then reordered fromDesikan-68 to Cammoun-033. Particularly,

  • For 22q11.2 deletion syndrome, attention-deficit/hyperactivity disorder, autism spectrum disorder, epilepsy (idiopathic generalized, right temporal lobe, and left temporal lobe), depression, obsessive-compulsive disorder, schizophrenia, and bipolar disorder, their cortical thickness and surface area data were obtained from the Enhancing Neuroimaging Genetics through Meta-Analysis (ENIGMA) consortium using the ENIGMA toolbox (https://github.com/MICA-MNI/ENIGMA).
  • The cortical atrophy data of obesity was obtained from the Supplementary Table 14 from (Opel et al., 2021).
  • The cortical atrophy data of schizotypy was obtained from the Supplementary Table 17 and 18 from (Kirschner et al., 2022).
  • The cortical atrophy data of Parkinson’s disease was obtained from the Supplementary Table 2a and 2b from (Laansma et al., 2021).

(2) step_02_disorder_similarity.m

We generated the network of disorder similarity (68×68) by computing the Pearson correlation between the abnormality profiles (1×13) of the two cortical regions. We next compared the disorder similarity network and individual variability network across all edges using Spearman rank correlation, to evaluate the implications of connectional hierarchy in brain disorders. Considering the inaccurate assignment from the DK-68 atlas to Yeo’s 7 networks, we did not analyze the network-averaged results here.

(3) step_03_density_plot.py

Density plot for the similarity between FC variability and Disorder similarity for both HCP-D and HCP at the nodal level.


step_09_developments_of_connectional_hierarchy

This folder contains codes to generate figures and results of Figure 8. Development of connectional hierarchy in youth.

(1) step_01_get_fc_14_age_groups.m

To explore the maturation of connectional hierarchy in youth, we re-grouped the HCP-D participants (8~21 years old) into 14 age groups with a one-year gap, and saved the individual FC matrix for all subjects in each age group, respectively.

(2) step_02_lme_14_age_groups.R

The LME model was used to estimate the FC variability for each age group.

(3) step_03_estimate_inter_edge_heterogeneity.m

We evaluated connectional hierarchy with the inter-edge heterogeneity of individual variability of between-network edges, to summarize the degree of connectional hierarchy. Specifically, we extracted 6 within-network and 15 between-network variability values from the network-level inter-individual variability matrix (6×6). The degree of connection hierarchy was defined as the inter-edge heterogeneity of individual variability, which we quantified as the median absolute deviation across these 21 variability values. With this approach, we estimated inter-edge heterogeneity of individual variability for each of the 14 age groups and evaluated the developmental changes in connectional hierarchy from 8 to 21 years old.

(4) step_04_gam_inter_edge_heterogeneity_age.R

To model both linear and nonlinear developmental effects, we used a generalized additive model (GAM) with penalized splines, which estimates nonlinearities using restricted maximum likelihood (REML) and penalizes nonlinearity in order to avoid over-fitting the data. The mean age across participants was calculated for each age group and used in the GAM. The mean age was modeled using a penalized spline, while including sex and in-scanner head motion as model covariates, as follows: $$\text {Inter-edge heterogeneity} \sim s(\text {Age, } k=4)+\text {Sex}+\text {Motion}$$ where $s()$ is the spline basis function, and $k$ is the basis dimension for smooths. As the analyses were performed at the group level, the sex in the GAM is represented by the male ratio of participants for a particular age group, and the motion was calculated by averaging the mean FD across all individuals and all runs within each age group.

To demonstrate the association between inter-edge heterogeneity and age more intuitively, the p value of $s(\text {Age})$ was transformed to a Z value, and a partial correlation was also performed with sex and motion controlled.


FC variability

In this study, we estimate the inter-individual variability and intra-individual variability of each FC edge (69,751 unique edges in total, limbic network excluded) using a linear mixed-effects (LME) model for both HCP-YA and HCP-D datasets. The LME model was implemented with a R package Rex (https://github.com/TingsterX/Reliability_Explorer). The LME model can capture both fixed and random effects, which assumes that both the observed response variable and the residual term follow a normal distribution with zero mean and specific variance. Specifically, for each FC edge, the LME model can be written as follows: $$FC_{i t}=\mu_{0}+\lambda_{i}+\alpha_{t}+\epsilon_{i t}, \text { where } \lambda_{i} \sim N\left(0, \sigma_{\lambda}^{2}\right), \epsilon_{t} \sim N\left(0, \sigma_{\epsilon}^{2}\right) \text {.}$$ Here, $i$ identifies the subject, t indicates the session, $\lambda_{i}$ and $\alpha_{t}$ measures random effect and fixed effect, respectively, and $\epsilon_{i t}$ is the residual term. The observed individual variation $\sigma_{FC}^{2}$ between $FC_{i t}$ can be decomposed into real inter-individual variation $\sigma_{b}^{2}$ across participants and the intra-individual variations $\sigma_{w}^{2}$ across sessions (captured by the residual variation $\sigma_{\epsilon}^{2}$). In this work, we used $\sigma_{b}^{2}$ and $\sigma_{w}^{2}$ to represent inter- and intra-individual variabilities for one specific FC edge. We further normalized the inter-individual variability of each FC edge by dividing the intra-individual variability with equation of $\frac{\sigma_{b}^{2}}{\sigma_{w}^{2}}$, which was defined as the FC variability and used in our following analyses.

Null models

To evaluate the statistical significance of the correlation between to networks, we generated null models for networks similarly as did in spatial permutation testing for the comparison of cortical properties. Specifically, we generated the null networks by randomly rearranging the order of brain regions (nodes) in the matrix. Taking the matrix estimated by Schaefer-400 parcellation as an example, the original order of node IDs in both columns and rows of the matrix is from 1 to 400. We shuffled the order of node IDs of both the columns and rows, so that the same node ID could represent different brain regions across different randomized networks. This null network preserves the mean and variance of the matrix, it also ensures that the regional profile (the column of the matrix, 400×1) included all the 400 cortical regions.

With this approach, we constructed the null distribution by generating 50,000 randomized individual variability networks and calculated their Spearman rank correlation with other networks, including hemodynamic and electromagnetic functional connectivity, structural network, correlated gene expression network, correlated neurotransmitter receptor expression network, the network of cognitive relevance and the network of disorder vulnerability. We compared the correlation value obtained by the empirical individual variability network with the ones acquired with the null networks to determine the significance level. The P value of this permutation testing was determined as the proportion of permutations that showed a higher correlation than the actual correlation from the real data. As we did 12 analyses (i.e., 6 for MEG data and 6 for other modalities) for each dataset, we performed a Bonferroni correction to correct for multiple comparisons (i.e., 24 in total), and the Bonferroni corrected P values (PBonf) were reported.

Notes

The scripts for supplementary analyses will be uploaded in the near future...

Back to the top.