In [1]:
import pandas as pd
import numpy as np
from pathlib import Path

# Methodological replication of "Brain-wide functional connectivity patterns support general cognitive ability and mediate effects of socioeconomic status in youth"
https://pmc.ncbi.nlm.nih.gov/articles/PMC8575890/

The original study used ABCD Release 2.0.1. We could not obtain this exact release, so a direct replication of the study is not possible. Our analysis uses the most recent release available to us. Notable differences between the releases include the initial number of subjects and the number of cohorts.
We therefore perform a methodological replication: we test whether the method generalizes across data versions and cohorts.

## Question 1:
When we apply the authors' exact method to a different iteration of the ABCD dataset, do we get a similar result?

### Data Source and Sample:

* Original Paper's Data: ABCD Release 2.0.1
* New Sample Size: 13274 (total number of observations), 11535 (with g_score), 7674 (baseline cohort), 5600 (followup cohort)

In [2]:
# Define paths
version_B_dir = Path("data/processed/version_B")

# Load features
features_B_baseline = pd.read_csv(version_B_dir / "features_baseline.csv")
features_B_followup = pd.read_csv(version_B_dir / "features_followup.csv")

# Load connectomes
conn_B_baseline = np.load(version_B_dir / "connectomes_baseline.npy")
conn_B_followup = np.load(version_B_dir / "connectomes_followup.npy")

# Load subjects (separate for baseline/followup)
subs_B_baseline = pd.read_csv(version_B_dir / "subjects_baseline.csv")["Subject"].tolist()
subs_B_followup = pd.read_csv(version_B_dir / "subjects_followup.csv")["Subject"].tolist()

print("Version B")
print("Baseline features:", features_B_baseline.shape)
print("Followup features:", features_B_followup.shape)
print("Baseline connectomes:", conn_B_baseline.shape)
print("Followup connectomes:", conn_B_followup.shape)
print("Baseline subjects:", len(subs_B_baseline))
print("Followup subjects:", len(subs_B_followup))


Version B
Baseline features: (4321, 13)
Followup features: (2432, 13)
Baseline connectomes: (4321, 87153)
Followup connectomes: (2432, 87153)
Baseline subjects: 4321
Followup subjects: 2432
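
The second dimension of the connectome arrays, 87153, equals the number of unique off-diagonal entries of a symmetric 418 × 418 matrix (418 · 417 / 2), which matches the `matrix_size=418` used later in this notebook. As a minimal sketch, assuming each row stores the upper triangle in the row-major order of `np.triu_indices` (the actual edge ordering of the `.npy` files is not documented here), a vector could be expanded back into a full matrix as follows. `vector_to_matrix` is a hypothetical helper, not part of `src`.

```python
import numpy as np

def vector_to_matrix(vec, matrix_size=418, diag_value=0.0):
    """Expand a vectorized upper triangle into a symmetric matrix.

    Assumption: edges are stored in row-major upper-triangle order
    (np.triu_indices with k=1); the real files may use another order.
    """
    n_edges = matrix_size * (matrix_size - 1) // 2
    vec = np.asarray(vec, dtype=float)
    if vec.shape != (n_edges,):
        raise ValueError(f"expected {n_edges} edges, got shape {vec.shape}")
    # Start from a matrix filled with the diagonal value, then overwrite edges
    mat = np.full((matrix_size, matrix_size), diag_value, dtype=float)
    rows, cols = np.triu_indices(matrix_size, k=1)
    mat[rows, cols] = vec
    mat[cols, rows] = vec  # mirror to the lower triangle
    return mat

# Toy check on a 4 x 4 matrix (6 unique edges)
toy = vector_to_matrix(np.arange(1.0, 7.0), matrix_size=4)
n_edges_418 = 418 * 417 // 2
```

Checking that `n_edges_418` equals `conn_B_baseline.shape[1]` is a cheap guard against loading connectomes built with a different parcellation.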


#### Demographic characteristics of subjects included in neuroimaging analysis (baseline cohort):

| Characteristic                               | Count (%)    |
|----------------------------------------------|--------------|
| Age, mean (s.d.)                             | 9.98 (0.62)  |
| Female                                       | 2217 (51.3)  |
| Race/ethnicity: White                        | 3071 (71.1)  |
| Race/ethnicity: Black                        | 529 (12.2)   |
| Race/ethnicity: Asian                        | 90 (2.1)     |
| Race/ethnicity: Other/Mixed                  | 631 (14.6)   |
| Hispanic: No                                 | 3591 (83.1)  |
| Hispanic: Yes                                | 730 (16.9)   |
| Household income: $< 50k$                    | 1158 (26.8)  |
| Household income: $\geq 50k$ and $< 100k$    | 1244 (28.8)  |
| Household income: $\geq 100k$                | 1919 (44.4)  |

#### Demographic characteristics of subjects included in neuroimaging analysis (followup cohort):

| Characteristic                               | Count (%)    |
|----------------------------------------------|--------------|
| Age, mean (s.d.)                             | 11.92 (0.65) |
| Female                                       | 1357 (49.2)  |
| Race/ethnicity: White                        | 1923 (69.78) |
| Race/ethnicity: Black                        | 310 (11.25)  |
| Race/ethnicity: Asian                        | 63 (2.29)    |
| Race/ethnicity: Other/Mixed                  | 460 (16.7)   |
| Hispanic: No                                 | 2234 (81.1)  |
| Hispanic: Yes                                | 522 (18.9)   |
| Household income: $< 50k$                    | 696 (25.2)   |
| Household income: $\geq 50k$ and $< 100k$    | 790 (28.7)   |
| Household income: $\geq 100k$                | 1270 (46.1)  |

In [1]:
#features_B_baseline.head()

### Prepare features we want to include in a model without network information

In [3]:
from src.features.extraction import extract_article_features, extract_ses_features, combine_data

In [4]:
X_baseline_B, y_baseline_B = extract_article_features(features_B_baseline)
X_followup_B, y_followup_B = extract_article_features(features_B_followup)

print(X_baseline_B.shape, y_baseline_B.shape)
print(X_followup_B.shape, y_followup_B.shape)


(4321, 9) (4321,)
(2432, 9) (2432,)


In [5]:
# Convert NumPy array to DataFrame with placeholder column names
conn_B_baseline_df = pd.DataFrame(
    conn_B_baseline,
    columns=[f"conn_{i}" for i in range(conn_B_baseline.shape[1])]
)

# Add Subject column
conn_B_baseline_df.insert(0, "Subject", subs_B_baseline)

In [6]:
conn_B_followup_df = pd.DataFrame(
    conn_B_followup,
    columns=[f"conn_{i}" for i in range(conn_B_followup.shape[1])]
)

# Add Subject column
conn_B_followup_df.insert(0, "Subject", subs_B_followup)

In [7]:
from src.evaluation.cross_validation import run_cross_validation, print_cross_val_results
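
The progress bars in the following cells report 20 folds, each labelled with a held-out site, which suggests a leave-one-site-out scheme. The internals of `run_cross_validation` are not shown in this notebook, so the code below is only a minimal sketch of that idea under stated assumptions: PCA is fitted on the training sites only (to avoid leakage into the held-out site), the top `num_pc` component scores are appended to the tabular features, and an ordinary least-squares model is scored on the held-out site. All names (`leave_one_site_out`, the synthetic data) are hypothetical.

```python
import numpy as np

def leave_one_site_out(X, y, sites, X_network=None, num_pc=2):
    """Return {site: Pearson r} for a least-squares model, one fold per site."""
    scores = {}
    for site in np.unique(sites):
        test = sites == site
        train = ~test
        X_tr, X_te = X[train], X[test]
        if X_network is not None:
            # Fit PCA (via SVD) on training subjects only, then project both folds
            mean = X_network[train].mean(axis=0)
            _, _, vt = np.linalg.svd(X_network[train] - mean, full_matrices=False)
            comps = vt[:num_pc].T
            X_tr = np.hstack([X_tr, (X_network[train] - mean) @ comps])
            X_te = np.hstack([X_te, (X_network[test] - mean) @ comps])
        # Ordinary least squares with an intercept column
        A_tr = np.hstack([np.ones((X_tr.shape[0], 1)), X_tr])
        A_te = np.hstack([np.ones((X_te.shape[0], 1)), X_te])
        beta, *_ = np.linalg.lstsq(A_tr, y[train], rcond=None)
        scores[site] = float(np.corrcoef(A_te @ beta, y[test])[0, 1])
    return scores

# Synthetic data: 3 sites, outcome driven by one covariate and one network factor
rng = np.random.default_rng(0)
n = 300
sites = np.repeat(np.array(["site01", "site02", "site03"]), n // 3)
X = rng.normal(size=(n, 2))
latent = rng.normal(size=n)
X_network = np.outer(latent, rng.normal(size=50)) + 0.1 * rng.normal(size=(n, 50))
y = X[:, 0] + latent + 0.5 * rng.normal(size=n)

r_without = leave_one_site_out(X, y, sites)
r_with = leave_one_site_out(X, y, sites, X_network=X_network, num_pc=2)
```

In this toy setup the network carries information that the covariates lack, so `r_with` exceeds `r_without` at every held-out site; the same comparison is what the `use_network=False` and `use_network=True` runs in this notebook are meant to show.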

In [10]:
results_no_network, models_no_network = run_cross_validation(X_baseline_B, y_baseline_B, use_network=False)

Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 271.95it/s]


Site: site02 | Pearson r = 0.401 (p=0.000) | Partial η² = 0.097

Site: site03 | Pearson r = 0.345 (p=0.000) | Partial η² = 0.068

Site: site04 | Pearson r = 0.337 (p=0.000) | Partial η² = 0.100

Site: site05 | Pearson r = 0.498 (p=0.000) | Partial η² = 0.237

Site: site06 | Pearson r = 0.317 (p=0.000) | Partial η² = 0.051

Site: site07 | Pearson r = 0.443 (p=0.000) | Partial η² = 0.188

Site: site08 | Pearson r = 0.314 (p=0.000) | Partial η² = -0.351

Site: site09 | Pearson r = 0.439 (p=0.000) | Partial η² = -0.024

Site: site10 | Pearson r = 0.393 (p=0.000) | Partial η² = 0.151

Site: site11 | Pearson r = 0.483 (p=0.000) | Partial η² = 0.164

Site: site12 | Pearson r = 0.258 (p=0.024) | Partial η² = -0.118

Site: site13 | Pearson r = 0.414 (p=0.000) | Partial η² = 0.168

Site: site14 | Pearson r = 0.398 (p=0.000) | Partial η² = 0.151

Site: site15 | Pearson r = 0.507 (p=0.000) | Partial η² = 0.166

Site: site16 | Pearson r = 0.441 (p=0.000) | Partial η² = 0.194

Site: site17 | Pearso




In [11]:
print_cross_val_results(results_no_network)

Cross-validated Performance Metrics:
Average Pearson's r: 0.404
Partial eta squared: 0.092
R²: 0.092
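
Several sites above show a negative "Partial η²" together with a clearly positive Pearson r (for example site08). A proportion of variance estimated in-sample cannot be negative, so these values are consistent with an out-of-sample coefficient of determination, 1 - SSE/SST, which penalizes offset and scale errors that Pearson's r ignores. The summary reports identical values for partial η² and R², which fits this reading, but the definition used by `print_cross_val_results` is not shown here, so this is an assumption. The sketch below, with hypothetical helper names and toy numbers, shows how the two metrics can diverge.

```python
import numpy as np

def pearson_r(y_true, y_pred):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def out_of_sample_r2(y_true, y_pred):
    """1 - SSE/SST, with SST taken around the mean of the held-out targets."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sse = np.sum((y_true - y_pred) ** 2)
    sst = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - sse / sst)

y_true = np.array([-1.0, 0.0, 1.0, 2.0])
y_pred_good = y_true * 0.8            # right ordering, roughly right scale
y_pred_shifted = y_true * 0.8 + 2.0   # same ordering, but a constant offset

r_good = pearson_r(y_true, y_pred_good)
r2_good = out_of_sample_r2(y_true, y_pred_good)
r_shift = pearson_r(y_true, y_pred_shifted)
r2_shift = out_of_sample_r2(y_true, y_pred_shifted)
```

Both predictions correlate perfectly with the target, yet only the unshifted one has a positive R². A site whose scores are systematically offset from the training sites could therefore show a negative value even when the ranking of subjects is predicted well.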


In [12]:
results_network, models_network = run_cross_validation(X_baseline_B, y_baseline_B, use_network=True, X_network=conn_B_baseline_df, num_pc=100)

Cross-validation progress:   5%|▌         | 1/20 [00:49<15:32, 49.09s/it]


Site: site02 | Pearson r = 0.410 (p=0.000) | Partial η² = 0.126


Cross-validation progress:  10%|█         | 2/20 [01:32<13:40, 45.60s/it]


Site: site03 | Pearson r = 0.465 (p=0.000) | Partial η² = 0.211


Cross-validation progress:  15%|█▌        | 3/20 [02:11<12:03, 42.58s/it]


Site: site04 | Pearson r = 0.418 (p=0.000) | Partial η² = 0.161


Cross-validation progress:  20%|██        | 4/20 [02:45<10:31, 39.47s/it]


Site: site05 | Pearson r = 0.550 (p=0.000) | Partial η² = 0.298


Cross-validation progress:  25%|██▌       | 5/20 [03:17<09:08, 36.54s/it]


Site: site06 | Pearson r = 0.419 (p=0.000) | Partial η² = 0.168


Cross-validation progress:  30%|███       | 6/20 [03:44<07:45, 33.24s/it]


Site: site07 | Pearson r = 0.564 (p=0.000) | Partial η² = 0.306


Cross-validation progress:  35%|███▌      | 7/20 [04:09<06:37, 30.56s/it]


Site: site08 | Pearson r = 0.398 (p=0.000) | Partial η² = -0.049


Cross-validation progress:  40%|████      | 8/20 [04:34<05:46, 28.87s/it]


Site: site09 | Pearson r = 0.539 (p=0.000) | Partial η² = 0.183


Cross-validation progress:  45%|████▌     | 9/20 [04:58<04:59, 27.23s/it]


Site: site10 | Pearson r = 0.398 (p=0.000) | Partial η² = 0.144


Cross-validation progress:  50%|█████     | 10/20 [05:21<04:19, 25.92s/it]


Site: site11 | Pearson r = 0.535 (p=0.000) | Partial η² = 0.259


Cross-validation progress:  55%|█████▌    | 11/20 [05:43<03:44, 24.94s/it]


Site: site12 | Pearson r = 0.372 (p=0.001) | Partial η² = -0.044


Cross-validation progress:  60%|██████    | 12/20 [06:08<03:19, 24.89s/it]


Site: site13 | Pearson r = 0.482 (p=0.000) | Partial η² = 0.230


Cross-validation progress:  65%|██████▌   | 13/20 [06:33<02:54, 24.99s/it]


Site: site14 | Pearson r = 0.522 (p=0.000) | Partial η² = 0.272


Cross-validation progress:  70%|███████   | 14/20 [06:59<02:32, 25.37s/it]


Site: site15 | Pearson r = 0.595 (p=0.000) | Partial η² = 0.327


Cross-validation progress:  75%|███████▌  | 15/20 [07:22<02:02, 24.56s/it]


Site: site16 | Pearson r = 0.500 (p=0.000) | Partial η² = 0.247


Cross-validation progress:  80%|████████  | 16/20 [07:45<01:35, 23.99s/it]


Site: site17 | Pearson r = 0.332 (p=0.000) | Partial η² = -0.007


Cross-validation progress:  85%|████████▌ | 17/20 [08:09<01:11, 23.96s/it]


Site: site18 | Pearson r = 0.366 (p=0.000) | Partial η² = 0.107


Cross-validation progress:  90%|█████████ | 18/20 [08:31<00:47, 23.60s/it]


Site: site19 | Pearson r = 0.513 (p=0.000) | Partial η² = 0.242


Cross-validation progress:  95%|█████████▌| 19/20 [08:55<00:23, 23.66s/it]


Site: site20 | Pearson r = 0.534 (p=0.000) | Partial η² = 0.229


Cross-validation progress: 100%|██████████| 20/20 [09:20<00:00, 28.04s/it]


Site: site21 | Pearson r = 0.554 (p=0.000) | Partial η² = 0.290





In [13]:
print_cross_val_results(results_network)

Cross-validated Performance Metrics:
Average Pearson's r: 0.473
Partial eta squared: 0.185
R²: 0.185


In [None]:
results_no_network, models_no_network = run_cross_validation(X_followup_B, y_followup_B, use_network=False)
print_cross_val_results(results_no_network)

In [None]:
results_network, models_network = run_cross_validation(X_followup_B, y_followup_B, use_network=True, X_network=conn_B_followup_df, num_pc=100)

In [None]:
print_cross_val_results(results_network)

### Summary table:
| Description                    | Pearson's correlation | Partial $\eta^2$ | Coefficient of Determination |
|--------------------------------|-----------------------|------------------|------------------------------|
| Baseline: Basic features + PCs | 0.473                 | 0.185            | 0.185                        |
| Baseline: Basic features  only | 0.404                 | 0.092            | 0.092                        |
| Followup: Basic features + PCs | 0.457                 | 0.152            | 0.152                        |
| Followup: Basic features only  | 0.325                 | 0.024            | 0.024                        |





The original finding is robust. Although we worked with a different data release, the results on the baseline data are almost identical to those reported in the article. For the followup data the results are weaker, but we still observe an improvement over the basic-features-only model. Thus, the brain-cognition relationship captured by the model generalizes across similar cohorts.

## Answer 1:
Yes, we get similar results for the baseline cohort. Moreover, the followup results also show a clear improvement when PCs are added to the basic-features-only model.

## Question 2: What is the effect of SES features?

In [9]:
ses_baseline_B = extract_ses_features(features_B_baseline)
ses_followup_B = extract_ses_features(features_B_followup)

In [10]:
main_baseline_B = combine_data(ses_baseline_B, X_baseline_B)
main_followup_B = combine_data(ses_followup_B, X_followup_B)

Dropping duplicate columns: ['site_id_l', 'Subject']
Dropping duplicate columns: ['site_id_l', 'Subject']


In [27]:
results_no_network, models_no_network = run_cross_validation(main_baseline_B, y_baseline_B, use_network=False)
print_cross_val_results(results_no_network)

Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 350.83it/s]


Site: site02 | Pearson r = 0.460 (p=0.000) | Partial η² = 0.102

Site: site03 | Pearson r = 0.423 (p=0.000) | Partial η² = 0.150

Site: site04 | Pearson r = 0.501 (p=0.000) | Partial η² = 0.250

Site: site05 | Pearson r = 0.584 (p=0.000) | Partial η² = 0.337

Site: site06 | Pearson r = 0.458 (p=0.000) | Partial η² = 0.194

Site: site07 | Pearson r = 0.650 (p=0.000) | Partial η² = 0.393

Site: site08 | Pearson r = 0.470 (p=0.000) | Partial η² = 0.080

Site: site09 | Pearson r = 0.449 (p=0.000) | Partial η² = 0.091

Site: site10 | Pearson r = 0.461 (p=0.000) | Partial η² = 0.206

Site: site11 | Pearson r = 0.575 (p=0.000) | Partial η² = 0.300

Site: site12 | Pearson r = 0.415 (p=0.000) | Partial η² = 0.100

Site: site13 | Pearson r = 0.567 (p=0.000) | Partial η² = 0.316

Site: site14 | Pearson r = 0.489 (p=0.000) | Partial η² = 0.234

Site: site15 | Pearson r = 0.625 (p=0.000) | Partial η² = 0.378

Site: site16 | Pearson r = 0.509 (p=0.000) | Partial η² = 0.253

Site: site17 | Pearson r




In [28]:
results_network, models_network = run_cross_validation(main_baseline_B, y_baseline_B, use_network=True,
                                                       X_network=conn_B_baseline_df, num_pc=100)
print_cross_val_results(results_network)

Cross-validation progress:   5%|▌         | 1/20 [00:39<12:33, 39.64s/it]


Site: site02 | Pearson r = 0.462 (p=0.000) | Partial η² = 0.156


Cross-validation progress:  10%|█         | 2/20 [01:04<09:18, 31.01s/it]


Site: site03 | Pearson r = 0.490 (p=0.000) | Partial η² = 0.233


Cross-validation progress:  15%|█▌        | 3/20 [01:25<07:31, 26.57s/it]


Site: site04 | Pearson r = 0.529 (p=0.000) | Partial η² = 0.277


Cross-validation progress:  20%|██        | 4/20 [01:46<06:28, 24.26s/it]


Site: site05 | Pearson r = 0.605 (p=0.000) | Partial η² = 0.364


Cross-validation progress:  25%|██▌       | 5/20 [02:07<05:44, 22.94s/it]


Site: site06 | Pearson r = 0.496 (p=0.000) | Partial η² = 0.244


Cross-validation progress:  30%|███       | 6/20 [02:28<05:13, 22.40s/it]


Site: site07 | Pearson r = 0.676 (p=0.000) | Partial η² = 0.435


Cross-validation progress:  35%|███▌      | 7/20 [02:49<04:44, 21.88s/it]


Site: site08 | Pearson r = 0.498 (p=0.000) | Partial η² = 0.166


Cross-validation progress:  40%|████      | 8/20 [03:08<04:13, 21.12s/it]


Site: site09 | Pearson r = 0.523 (p=0.000) | Partial η² = 0.216


Cross-validation progress:  45%|████▌     | 9/20 [03:27<03:44, 20.39s/it]


Site: site10 | Pearson r = 0.469 (p=0.000) | Partial η² = 0.216


Cross-validation progress:  50%|█████     | 10/20 [03:46<03:18, 19.86s/it]


Site: site11 | Pearson r = 0.596 (p=0.000) | Partial η² = 0.337


Cross-validation progress:  55%|█████▌    | 11/20 [04:09<03:07, 20.81s/it]


Site: site12 | Pearson r = 0.423 (p=0.000) | Partial η² = 0.081


Cross-validation progress:  60%|██████    | 12/20 [04:32<02:52, 21.53s/it]


Site: site13 | Pearson r = 0.593 (p=0.000) | Partial η² = 0.346


Cross-validation progress:  65%|██████▌   | 13/20 [04:52<02:27, 21.00s/it]


Site: site14 | Pearson r = 0.559 (p=0.000) | Partial η² = 0.308


Cross-validation progress:  70%|███████   | 14/20 [05:15<02:10, 21.76s/it]


Site: site15 | Pearson r = 0.638 (p=0.000) | Partial η² = 0.406


Cross-validation progress:  75%|███████▌  | 15/20 [05:32<01:41, 20.34s/it]


Site: site16 | Pearson r = 0.549 (p=0.000) | Partial η² = 0.292


Cross-validation progress:  80%|████████  | 16/20 [05:53<01:21, 20.45s/it]


Site: site17 | Pearson r = 0.399 (p=0.000) | Partial η² = 0.117


Cross-validation progress:  85%|████████▌ | 17/20 [06:14<01:02, 20.74s/it]


Site: site18 | Pearson r = 0.456 (p=0.000) | Partial η² = 0.197


Cross-validation progress:  90%|█████████ | 18/20 [06:35<00:41, 20.79s/it]


Site: site19 | Pearson r = 0.573 (p=0.000) | Partial η² = 0.315


Cross-validation progress:  95%|█████████▌| 19/20 [07:03<00:22, 22.97s/it]


Site: site20 | Pearson r = 0.492 (p=0.000) | Partial η² = 0.188


Cross-validation progress: 100%|██████████| 20/20 [07:29<00:00, 22.50s/it]


Site: site21 | Pearson r = 0.577 (p=0.000) | Partial η² = 0.324
Cross-validated Performance Metrics:
Average Pearson's r: 0.530
Partial eta squared: 0.261
R²: 0.261





### Summary table 2 a:
| Description                          | Pearson's correlation | Partial $\eta^2$ | Coefficient of Determination |
|--------------------------------------|-----------------------|------------------|------------------------------|
| Baseline: Basic features + PCs       | 0.473                 | 0.185            | 0.185                        |
| Baseline: Basic features only        | 0.404                 | 0.092            | 0.092                        |
| Baseline: SES only                   | 0.375                 | 0.114            | 0.114                        |
| Baseline: SES + PCs                  | 0.437                 | 0.174            | 0.174                        |
| Baseline: Basic features + SES       | 0.502                 | 0.223            | 0.223                        |
| Baseline: Basic features + SES + PCs | 0.530                 | 0.261            | 0.261                        |



In [29]:
results_no_network, models_no_network = run_cross_validation(main_followup_B, y_followup_B, use_network=False)
print_cross_val_results(results_no_network)

Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 382.23it/s]


Site: site02 | Pearson r = 0.393 (p=0.000) | Partial η² = 0.145

Site: site03 | Pearson r = 0.357 (p=0.000) | Partial η² = 0.117

Site: site04 | Pearson r = 0.449 (p=0.000) | Partial η² = 0.160

Site: site05 | Pearson r = 0.548 (p=0.000) | Partial η² = 0.278

Site: site06 | Pearson r = 0.284 (p=0.004) | Partial η² = 0.008

Site: site07 | Pearson r = 0.557 (p=0.000) | Partial η² = 0.279

Site: site08 | Pearson r = 0.342 (p=0.002) | Partial η² = 0.012

Site: site09 | Pearson r = 0.441 (p=0.001) | Partial η² = 0.107

Site: site10 | Pearson r = 0.520 (p=0.000) | Partial η² = 0.259

Site: site11 | Pearson r = 0.547 (p=0.000) | Partial η² = 0.296

Site: site12 | Pearson r = 0.491 (p=0.000) | Partial η² = 0.240

Site: site13 | Pearson r = 0.455 (p=0.000) | Partial η² = 0.199

Site: site14 | Pearson r = 0.482 (p=0.000) | Partial η² = 0.232

Site: site15 | Pearson r = 0.735 (p=0.000) | Partial η² = 0.470

Site: site16 | Pearson r = 0.373 (p=0.000) | Partial η² = 0.132

Site: site17 | Pearson r




In [30]:
results_network, models_network = run_cross_validation(main_followup_B, y_followup_B, use_network=True,
                                                       X_network=conn_B_followup_df, num_pc=100)
print_cross_val_results(results_network)

Cross-validation progress:   5%|▌         | 1/20 [00:23<07:34, 23.91s/it]


Site: site02 | Pearson r = 0.461 (p=0.000) | Partial η² = 0.205


Cross-validation progress:  10%|█         | 2/20 [00:36<05:06, 17.03s/it]


Site: site03 | Pearson r = 0.439 (p=0.000) | Partial η² = 0.185


Cross-validation progress:  15%|█▌        | 3/20 [00:53<04:52, 17.19s/it]


Site: site04 | Pearson r = 0.503 (p=0.000) | Partial η² = 0.182


Cross-validation progress:  20%|██        | 4/20 [01:16<05:14, 19.66s/it]


Site: site05 | Pearson r = 0.622 (p=0.000) | Partial η² = 0.355


Cross-validation progress:  25%|██▌       | 5/20 [01:29<04:18, 17.24s/it]


Site: site06 | Pearson r = 0.392 (p=0.000) | Partial η² = 0.108


Cross-validation progress:  30%|███       | 6/20 [01:43<03:44, 16.04s/it]


Site: site07 | Pearson r = 0.594 (p=0.000) | Partial η² = 0.310


Cross-validation progress:  35%|███▌      | 7/20 [01:57<03:17, 15.18s/it]


Site: site08 | Pearson r = 0.474 (p=0.000) | Partial η² = 0.163


Cross-validation progress:  40%|████      | 8/20 [02:08<02:49, 14.14s/it]


Site: site09 | Pearson r = 0.536 (p=0.000) | Partial η² = 0.162


Cross-validation progress:  45%|████▌     | 9/20 [02:18<02:21, 12.86s/it]


Site: site10 | Pearson r = 0.533 (p=0.000) | Partial η² = 0.282


Cross-validation progress:  50%|█████     | 10/20 [02:31<02:06, 12.64s/it]


Site: site11 | Pearson r = 0.547 (p=0.000) | Partial η² = 0.227


Cross-validation progress:  55%|█████▌    | 11/20 [02:43<01:53, 12.59s/it]


Site: site12 | Pearson r = 0.596 (p=0.000) | Partial η² = 0.354


Cross-validation progress:  60%|██████    | 12/20 [02:53<01:35, 11.90s/it]


Site: site13 | Pearson r = 0.504 (p=0.000) | Partial η² = 0.244


Cross-validation progress:  65%|██████▌   | 13/20 [03:04<01:19, 11.40s/it]


Site: site14 | Pearson r = 0.564 (p=0.000) | Partial η² = 0.317


Cross-validation progress:  70%|███████   | 14/20 [03:16<01:10, 11.76s/it]


Site: site15 | Pearson r = 0.736 (p=0.000) | Partial η² = 0.502


Cross-validation progress:  75%|███████▌  | 15/20 [03:26<00:56, 11.21s/it]


Site: site16 | Pearson r = 0.443 (p=0.000) | Partial η² = 0.180


Cross-validation progress:  80%|████████  | 16/20 [03:36<00:43, 10.82s/it]


Site: site17 | Pearson r = 0.454 (p=0.000) | Partial η² = 0.073


Cross-validation progress:  85%|████████▌ | 17/20 [03:49<00:34, 11.43s/it]


Site: site18 | Pearson r = 0.381 (p=0.000) | Partial η² = 0.128


Cross-validation progress:  90%|█████████ | 18/20 [04:02<00:23, 11.98s/it]


Site: site19 | Pearson r = 0.368 (p=0.001) | Partial η² = 0.103


Cross-validation progress:  95%|█████████▌| 19/20 [04:13<00:11, 11.57s/it]


Site: site20 | Pearson r = 0.603 (p=0.000) | Partial η² = 0.339


Cross-validation progress: 100%|██████████| 20/20 [04:25<00:00, 13.27s/it]


Site: site21 | Pearson r = 0.630 (p=0.000) | Partial η² = 0.389
Cross-validated Performance Metrics:
Average Pearson's r: 0.519
Partial eta squared: 0.241
R²: 0.241





### Summary table 2 b:
| Description                          | Pearson's correlation | Partial $\eta^2$ | Coefficient of Determination |
|--------------------------------------|-----------------------|------------------|------------------------------|
| Followup: Basic features + PCs       | 0.457                 | 0.152            | 0.152                        |
| Followup: Basic features only        | 0.325                 | 0.024            | 0.024                        |
| Followup: SES only                   | 0.405                 | 0.137            | 0.137                        |
| Followup: SES + PCs                  | 0.487                 | 0.197            | 0.197                        |
| Followup: Basic features + SES       | 0.463                 | 0.188            | 0.188                        |
| Followup: Basic features + SES + PCs | 0.519                 | 0.241            | 0.241                        |


## Answer 2:
Yes, the SES features improve the quality of the models.
Both summary tables (2 a and 2 b) support this finding:

* SES alone explains more variance than basic features alone (partial $\eta^2$ of 0.114 vs 0.092 at baseline and 0.137 vs 0.024 at followup), although at baseline its Pearson's correlation is lower (0.375 vs 0.404).
* Adding SES to basic features provides a larger boost in explained variance than adding PCs does (0.223 vs 0.185 at baseline, 0.188 vs 0.152 at followup).
* The best model in both cohorts is the full model that includes basic features, the new SES features and the PCs.
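
The comparisons behind this answer are simple differences between rows of summary tables 2 a and 2 b. The sketch below recomputes them from the partial $\eta^2$ values copied from those tables; the dictionary keys and the `increments` helper are hypothetical names chosen for this example.

```python
# Partial eta squared values copied from summary tables 2 a and 2 b
eta2 = {
    "baseline": {"basic": 0.092, "ses": 0.114, "basic+pcs": 0.185,
                 "basic+ses": 0.223, "basic+ses+pcs": 0.261},
    "followup": {"basic": 0.024, "ses": 0.137, "basic+pcs": 0.152,
                 "basic+ses": 0.188, "basic+ses+pcs": 0.241},
}

def increments(table):
    """Gain in explained variance from each added feature block."""
    return {
        "ses_over_basic": round(table["basic+ses"] - table["basic"], 3),
        "pcs_over_basic": round(table["basic+pcs"] - table["basic"], 3),
        "pcs_over_basic_ses": round(table["basic+ses+pcs"] - table["basic+ses"], 3),
    }

gains = {cohort: increments(table) for cohort, table in eta2.items()}
for cohort, gain in gains.items():
    print(cohort, gain)
```

With these numbers, the PCs add 0.038 (baseline) and 0.053 (followup) on top of basic features plus SES, which is less than they add to basic features alone (0.093 and 0.128). This pattern would be expected if SES and the connectivity PCs share part of their predictive information, though the table alone cannot establish that.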

## Question 3:
Is it possible to move towards network summary statistics without loss of explained variation?

Several questions need to be answered. First, we need to understand whether network summary statistics increase the amount of explained variance and, if so, which ones. We need to compare the following:
* How much variance can network stats explain alone?
* Does adding network stats to the original data help?
* Do network stats provide new information that wasn't captured by the PCs?
* Can they replace PCs?

In [8]:
regions = pd.read_csv('../data/raw/gordon_sub_cere_parcels.csv')

In [9]:
from src.network.statistics import compute_one_statistic

## Strength:
This function calculates the following statistics for each network:
* Strength metrics (avg_positive_strength, avg_negative_strength, avg_total_strength) follow the definitions of node strength in weighted networks [Barrat et al., 2004; Rubinov & Sporns, 2010]. The positive/negative separation follows the signed-network convention [Rubinov, 2011].
* Within- and between-system means (within_positive_mean, etc.) are standard summaries of functional connectivity [Power et al., 2011].
* Segregation metrics:
    * segregation_ratio: within-system positive mean / between-system positive mean (used in several FC studies).
    * segregation_index: (within-system positive mean - between-system positive mean) / within-system positive mean, as defined by Chan et al. (2014, PNAS).
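
As a rough illustration of the definitions above, the sketch below computes an average positive strength and the two segregation metrics for a single connectivity matrix. It is not the implementation in `src.network.statistics`: the function names, the `labels` argument (one system label per node), and the choice to set negative weights to zero before averaging over node pairs are assumptions made for this example.

```python
import numpy as np

def avg_positive_strength(mat):
    """Mean over nodes of the summed positive edge weights (diagonal ignored)."""
    w = np.where(mat > 0, mat, 0.0)  # new array, so the input is not modified
    np.fill_diagonal(w, 0.0)
    return float(w.sum(axis=1).mean())

def segregation(mat, labels):
    """Return (segregation_ratio, segregation_index) from positive weights.

    Assumption: negative weights are set to zero before averaging.
    """
    labels = np.asarray(labels)
    w = np.where(mat > 0, mat, 0.0)
    rows, cols = np.triu_indices(len(labels), k=1)  # count each pair once
    same = labels[rows] == labels[cols]
    within = w[rows, cols][same].mean()
    between = w[rows, cols][~same].mean()
    return float(within / between), float((within - between) / within)

# Toy network: two systems ("A" and "B") of two nodes each
labels = ["A", "A", "B", "B"]
mat = np.array([
    [0.0, 0.8, 0.2, -0.1],
    [0.8, 0.0, 0.1, 0.2],
    [0.2, 0.1, 0.0, 0.6],
    [-0.1, 0.2, 0.6, 0.0],
])
strength = avg_positive_strength(mat)
ratio, index = segregation(mat, labels)
```

In the toy network the within-system mean is 0.7 and the between-system mean is 0.125, giving a ratio of 5.6 and an index of about 0.82. Both metrics grow as systems become more segregated, but the index is bounded above by 1 while the ratio is not.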



In [20]:
avg_pos_strength_baseline_B = compute_one_statistic(df_features=main_baseline_B,
                          df_network=conn_B_baseline_df,
                          regions=regions,
                          stat_name="avg_positive_strength",
                          mode="all",
                          node_order=None,
                          matrix_size=418,
                          diag_value=0.0)

100%|██████████| 4321/4321 [00:35<00:00, 122.78it/s]


In [21]:
baseline_B_plus = combine_data(main_baseline_B , avg_pos_strength_baseline_B)
results_no_network, models_no_network = run_cross_validation(baseline_B_plus , y_baseline_B, use_network=False)
print_cross_val_results(results_no_network)

Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 126.89it/s]


Site: site02 | Pearson r = 0.479 (p=0.000) | Partial η² = 0.132

Site: site03 | Pearson r = 0.436 (p=0.000) | Partial η² = 0.166

Site: site04 | Pearson r = 0.497 (p=0.000) | Partial η² = 0.246

Site: site05 | Pearson r = 0.585 (p=0.000) | Partial η² = 0.339

Site: site06 | Pearson r = 0.451 (p=0.000) | Partial η² = 0.187

Site: site07 | Pearson r = 0.650 (p=0.000) | Partial η² = 0.395

Site: site08 | Pearson r = 0.472 (p=0.000) | Partial η² = 0.101

Site: site09 | Pearson r = 0.431 (p=0.000) | Partial η² = 0.069

Site: site10 | Pearson r = 0.469 (p=0.000) | Partial η² = 0.214

Site: site11 | Pearson r = 0.575 (p=0.000) | Partial η² = 0.302

Site: site12 | Pearson r = 0.422 (p=0.000) | Partial η² = 0.098

Site: site13 | Pearson r = 0.564 (p=0.000) | Partial η² = 0.307

Site: site14 | Pearson r = 0.479 (p=0.000) | Partial η² = 0.224

Site: site15 | Pearson r = 0.618 (p=0.000) | Partial η² = 0.375

Site: site16 | Pearson r = 0.520 (p=0.000) | Partial η² = 0.260

Site: site17 | Pearson r




In [22]:
avg_pos_strength_followup_B = compute_one_statistic(df_features=main_followup_B,
                          df_network=conn_B_followup_df,
                          regions=regions,
                          stat_name="avg_positive_strength",
                          mode="all",
                          node_order=None,
                          matrix_size=418,
                          diag_value=0.0)

100%|██████████| 2756/2756 [00:22<00:00, 124.65it/s]


In [24]:
followup_B_plus = combine_data(main_followup_B, avg_pos_strength_followup_B)
results_no_network, models_no_network = run_cross_validation(followup_B_plus.fillna(0.0), y_followup_B, use_network=False)
print_cross_val_results(results_no_network)

Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 279.29it/s]


Site: site02 | Pearson r = 0.400 (p=0.000) | Partial η² = 0.155

Site: site03 | Pearson r = 0.376 (p=0.000) | Partial η² = 0.132

Site: site04 | Pearson r = 0.442 (p=0.000) | Partial η² = 0.127

Site: site05 | Pearson r = 0.572 (p=0.000) | Partial η² = 0.306

Site: site06 | Pearson r = 0.294 (p=0.003) | Partial η² = 0.018

Site: site07 | Pearson r = 0.532 (p=0.000) | Partial η² = 0.256

Site: site08 | Pearson r = 0.364 (p=0.001) | Partial η² = 0.040

Site: site09 | Pearson r = 0.373 (p=0.006) | Partial η² = 0.031

Site: site10 | Pearson r = 0.521 (p=0.000) | Partial η² = 0.260

Site: site11 | Pearson r = 0.517 (p=0.000) | Partial η² = 0.212

Site: site12 | Pearson r = 0.505 (p=0.000) | Partial η² = 0.253

Site: site13 | Pearson r = 0.455 (p=0.000) | Partial η² = 0.200

Site: site14 | Pearson r = 0.484 (p=0.000) | Partial η² = 0.233

Site: site15 | Pearson r = 0.740 (p=0.000) | Partial η² = 0.489

Site: site16 | Pearson r = 0.369 (p=0.000) | Partial η² = 0.128

Site: site17 | Pearson r
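
The progress bar for the followup statistic reports 2756 rows, whereas the followup feature table loaded at the top of the notebook has 2432 rows, and the combined table is passed through `fillna(0.0)` before cross-validation. Filling with zeros can hide rows that failed to match on `Subject`. The sketch below shows one way to inspect the overlap before combining tables; `check_subject_alignment` is a hypothetical helper, not part of `src`, and the IDs are toy values.

```python
from collections import Counter

def check_subject_alignment(left_ids, right_ids):
    """Summarize how two lists of subject IDs overlap before a merge."""
    left, right = set(left_ids), set(right_ids)
    return {
        "matched": len(left & right),
        "only_left": sorted(left - right),
        "only_right": sorted(right - left),
        "duplicated_left": sorted(s for s, n in Counter(left_ids).items() if n > 1),
        "duplicated_right": sorted(s for s, n in Counter(right_ids).items() if n > 1),
    }

# Toy IDs; in the notebook these could come from df["Subject"].tolist()
features_ids = ["sub-01", "sub-02", "sub-03"]
stats_ids = ["sub-02", "sub-03", "sub-04", "sub-04"]
report = check_subject_alignment(features_ids, stats_ids)
print(report)
```

If `only_left`, `only_right`, or either duplicate list is non-empty, it may be safer to restrict both tables to the matched subjects than to fill the resulting gaps with zeros.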




In [25]:
results_network, models_network = run_cross_validation(main_baseline_B, y_baseline_B, use_network=True,
                                                       X_network=conn_B_baseline_df, num_pc=15)
print_cross_val_results(results_network)

Cross-validation progress:   5%|▌         | 1/20 [00:24<07:51, 24.81s/it]


Site: site02 | Pearson r = 0.445 (p=0.000) | Partial η² = 0.111


Cross-validation progress:  10%|█         | 2/20 [00:44<06:33, 21.85s/it]


Site: site03 | Pearson r = 0.440 (p=0.000) | Partial η² = 0.176


Cross-validation progress:  15%|█▌        | 3/20 [01:01<05:34, 19.70s/it]


Site: site04 | Pearson r = 0.500 (p=0.000) | Partial η² = 0.250


Cross-validation progress:  20%|██        | 4/20 [01:19<05:00, 18.76s/it]


Site: site05 | Pearson r = 0.584 (p=0.000) | Partial η² = 0.339


Cross-validation progress:  25%|██▌       | 5/20 [01:35<04:30, 18.02s/it]


Site: site06 | Pearson r = 0.466 (p=0.000) | Partial η² = 0.206


Cross-validation progress:  30%|███       | 6/20 [01:51<04:02, 17.29s/it]


Site: site07 | Pearson r = 0.665 (p=0.000) | Partial η² = 0.413


Cross-validation progress:  35%|███▌      | 7/20 [02:07<03:40, 16.99s/it]


Site: site08 | Pearson r = 0.500 (p=0.000) | Partial η² = 0.149


Cross-validation progress:  40%|████      | 8/20 [02:24<03:21, 16.79s/it]


Site: site09 | Pearson r = 0.468 (p=0.000) | Partial η² = 0.140


Cross-validation progress:  45%|████▌     | 9/20 [02:37<02:51, 15.59s/it]


Site: site10 | Pearson r = 0.459 (p=0.000) | Partial η² = 0.207


Cross-validation progress:  50%|█████     | 10/20 [02:50<02:29, 14.92s/it]


Site: site11 | Pearson r = 0.577 (p=0.000) | Partial η² = 0.310


Cross-validation progress:  55%|█████▌    | 11/20 [03:03<02:09, 14.34s/it]


Site: site12 | Pearson r = 0.402 (p=0.000) | Partial η² = 0.068


Cross-validation progress:  60%|██████    | 12/20 [03:13<01:44, 13.00s/it]


Site: site13 | Pearson r = 0.576 (p=0.000) | Partial η² = 0.322


Cross-validation progress:  65%|██████▌   | 13/20 [03:26<01:30, 12.94s/it]


Site: site14 | Pearson r = 0.540 (p=0.000) | Partial η² = 0.286


Cross-validation progress:  70%|███████   | 14/20 [03:39<01:18, 13.01s/it]


Site: site15 | Pearson r = 0.621 (p=0.000) | Partial η² = 0.384


Cross-validation progress:  75%|███████▌  | 15/20 [03:49<00:59, 11.92s/it]


Site: site16 | Pearson r = 0.530 (p=0.000) | Partial η² = 0.272


Cross-validation progress:  80%|████████  | 16/20 [04:02<00:49, 12.28s/it]


Site: site17 | Pearson r = 0.381 (p=0.000) | Partial η² = 0.116


Cross-validation progress:  85%|████████▌ | 17/20 [04:15<00:37, 12.63s/it]


Site: site18 | Pearson r = 0.439 (p=0.000) | Partial η² = 0.177


Cross-validation progress:  90%|█████████ | 18/20 [04:28<00:25, 12.74s/it]


Site: site19 | Pearson r = 0.594 (p=0.000) | Partial η² = 0.335


Cross-validation progress:  95%|█████████▌| 19/20 [04:41<00:12, 12.65s/it]


Site: site20 | Pearson r = 0.486 (p=0.000) | Partial η² = 0.180


Cross-validation progress: 100%|██████████| 20/20 [04:53<00:00, 14.66s/it]


Site: site21 | Pearson r = 0.562 (p=0.000) | Partial η² = 0.303
Cross-validated Performance Metrics:
Average Pearson's r: 0.512
Partial eta squared: 0.237
R²: 0.237





In [26]:
results_network, models_network = run_cross_validation(main_followup_B, y_followup_B, use_network=True,
                                                       X_network=conn_B_followup_df, num_pc=15)
print_cross_val_results(results_network)

Cross-validation progress:   5%|▌         | 1/20 [00:16<05:22, 16.99s/it]


Site: site02 | Pearson r = 0.423 (p=0.000) | Partial η² = 0.170


Cross-validation progress:  10%|█         | 2/20 [00:23<03:14, 10.82s/it]


Site: site03 | Pearson r = 0.396 (p=0.000) | Partial η² = 0.153


Cross-validation progress:  15%|█▌        | 3/20 [00:29<02:29,  8.79s/it]


Site: site04 | Pearson r = 0.459 (p=0.000) | Partial η² = 0.150


Cross-validation progress:  20%|██        | 4/20 [00:36<02:08,  8.01s/it]


Site: site05 | Pearson r = 0.574 (p=0.000) | Partial η² = 0.305


Cross-validation progress:  25%|██▌       | 5/20 [00:46<02:08,  8.56s/it]


Site: site06 | Pearson r = 0.317 (p=0.001) | Partial η² = 0.047


Cross-validation progress:  30%|███       | 6/20 [00:52<01:50,  7.90s/it]


Site: site07 | Pearson r = 0.572 (p=0.000) | Partial η² = 0.297


Cross-validation progress:  35%|███▌      | 7/20 [00:59<01:36,  7.42s/it]


Site: site08 | Pearson r = 0.380 (p=0.001) | Partial η² = 0.052


Cross-validation progress:  40%|████      | 8/20 [01:05<01:25,  7.10s/it]


Site: site09 | Pearson r = 0.402 (p=0.003) | Partial η² = 0.059


Cross-validation progress:  45%|████▌     | 9/20 [01:11<01:14,  6.76s/it]


Site: site10 | Pearson r = 0.519 (p=0.000) | Partial η² = 0.262


Cross-validation progress:  50%|█████     | 10/20 [01:18<01:06,  6.67s/it]


Site: site11 | Pearson r = 0.567 (p=0.000) | Partial η² = 0.297


Cross-validation progress:  55%|█████▌    | 11/20 [01:24<00:59,  6.60s/it]


Site: site12 | Pearson r = 0.517 (p=0.000) | Partial η² = 0.265


Cross-validation progress:  60%|██████    | 12/20 [01:31<00:52,  6.54s/it]


Site: site13 | Pearson r = 0.464 (p=0.000) | Partial η² = 0.209


Cross-validation progress:  65%|██████▌   | 13/20 [01:38<00:48,  6.97s/it]


Site: site14 | Pearson r = 0.505 (p=0.000) | Partial η² = 0.254


Cross-validation progress:  70%|███████   | 14/20 [01:45<00:41,  6.87s/it]


Site: site15 | Pearson r = 0.741 (p=0.000) | Partial η² = 0.504


Cross-validation progress:  75%|███████▌  | 15/20 [01:51<00:32,  6.56s/it]


Site: site16 | Pearson r = 0.369 (p=0.000) | Partial η² = 0.122


Cross-validation progress:  80%|████████  | 16/20 [01:59<00:28,  7.11s/it]


Site: site17 | Pearson r = 0.361 (p=0.000) | Partial η² = -0.072


Cross-validation progress:  85%|████████▌ | 17/20 [02:06<00:20,  6.94s/it]


Site: site18 | Pearson r = 0.390 (p=0.000) | Partial η² = 0.143


Cross-validation progress:  90%|█████████ | 18/20 [02:12<00:13,  6.78s/it]


Site: site19 | Pearson r = 0.338 (p=0.002) | Partial η² = 0.063


Cross-validation progress:  95%|█████████▌| 19/20 [02:18<00:06,  6.56s/it]


Site: site20 | Pearson r = 0.584 (p=0.000) | Partial η² = 0.308


Cross-validation progress: 100%|██████████| 20/20 [02:25<00:00,  7.25s/it]


Site: site21 | Pearson r = 0.592 (p=0.000) | Partial η² = 0.345
Cross-validated Performance Metrics:
Average Pearson's r: 0.473
Partial eta squared: 0.197
R²: 0.197
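
The per-site lines above are consistent with a leave-one-site-out scheme: 20 folds, each labelled by the held-out site. The value printed as Partial η² matches the printed R² and is occasionally negative (for example -0.072 at site17), so the sketch below assumes it behaves like an out-of-sample R². This is a minimal NumPy illustration under those assumptions, not the notebook's `run_cross_validation`. The function `leave_one_site_out`, the synthetic data, and the site labels `s1` to `s3` are all hypothetical.

```python
import numpy as np

def leave_one_site_out(X, y, sites, num_pc):
    """Hold out one site at a time; fit PCA + OLS on the remaining sites."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    sites = np.asarray(sites)
    per_site = {}
    for site in np.unique(sites):
        test = sites == site
        train = ~test
        # Principal axes are estimated from the training sites only,
        # so nothing about the held-out site leaks into the projection.
        mu = X[train].mean(axis=0)
        _, _, vt = np.linalg.svd(X[train] - mu, full_matrices=False)
        comps = vt[:num_pc].T
        A_train = np.column_stack([np.ones(train.sum()), (X[train] - mu) @ comps])
        A_test = np.column_stack([np.ones(test.sum()), (X[test] - mu) @ comps])
        beta, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        pred = A_test @ beta
        r = np.corrcoef(pred, y[test])[0, 1]
        # Out-of-sample R^2: negative when the predictions do worse than
        # the held-out site's own mean, even if r is positive.
        ss_res = np.sum((y[test] - pred) ** 2)
        ss_tot = np.sum((y[test] - y[test].mean()) ** 2)
        per_site[site.item()] = {"r": float(r), "r2": float(1.0 - ss_res / ss_tot)}
    return per_site

# Synthetic example: 3 sites, 20 features driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(150, 3))
X_demo = latent @ rng.normal(size=(3, 20)) + 0.01 * rng.normal(size=(150, 20))
y_demo = latent @ np.array([1.0, -0.5, 0.25]) + 0.05 * rng.normal(size=150)
sites_demo = np.repeat(["s1", "s2", "s3"], 50)

cv_results = leave_one_site_out(X_demo, y_demo, sites_demo, num_pc=3)
for site, metrics in cv_results.items():
    print(f"Site: {site} | Pearson r = {metrics['r']:.3f} | R2 = {metrics['r2']:.3f}")
```

Within a fold, this R² can never exceed the squared Pearson r, which would explain how a site can show a clearly positive correlation alongside a near-zero or negative second metric.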





#### Positive strength (one value for each region, 15 values total)
| **Case**                | **PC Condition** | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-------------------------|------------------|-------------------------|--------------------------------------|
| B\_baseline + statistic |                  | 0.503                   | 0.224                                |
| B\_baseline             | none             | 0.502                   | 0.223                                |
| B\_baseline             | 15 PCs           | 0.512                   | 0.237                                |
| B\_baseline             | 100 PCs          | 0.530                   | 0.261                                |
| B\_followup + statistic |                  | 0.460                   | 0.183                                |
| B\_followup             | none             | 0.463                   | 0.188                                |
| B\_followup             | 15 PCs           | 0.473                   | 0.197                                |
| B\_followup             | 100 PCs          | 0.519                   | 0.241                                |
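
To make the "one value for each region" feature concrete, the sketch below sums each node's positive, negative, or all edge weights and then averages those node strengths within each region. It is only an illustration of the general idea, not the notebook's `compute_one_statistic`, whose implementation is not shown here. The function `region_strength`, the 4-node toy matrix, and the region labels `A` and `B` are hypothetical.

```python
import numpy as np

def region_strength(conn, labels, sign="positive"):
    """Average node strength per region for one subject's connectome.

    conn   : (n, n) symmetric connectivity matrix
    labels : length-n sequence giving each node's region
    sign   : "positive", "negative", or "total"
    """
    conn = np.array(conn, dtype=float)   # copy, so the caller's matrix is untouched
    np.fill_diagonal(conn, 0.0)          # ignore self-connections
    if sign == "positive":
        weights = np.where(conn > 0, conn, 0.0)
    elif sign == "negative":
        weights = np.where(conn < 0, conn, 0.0)
    elif sign == "total":
        weights = conn
    else:
        raise ValueError(f"unknown sign: {sign!r}")
    node_strength = weights.sum(axis=1)  # one strength value per node
    labels = np.asarray(labels)
    return {r.item(): float(node_strength[labels == r].mean())
            for r in np.unique(labels)}

# Toy connectome: 4 nodes split into 2 regions.
toy_conn = [[0.0, 0.5, -0.2, 0.1],
            [0.5, 0.0, 0.3, -0.4],
            [-0.2, 0.3, 0.0, 0.6],
            [0.1, -0.4, 0.6, 0.0]]
toy_labels = ["A", "A", "B", "B"]

pos = region_strength(toy_conn, toy_labels, "positive")
neg = region_strength(toy_conn, toy_labels, "negative")
tot = region_strength(toy_conn, toy_labels, "total")
print(pos, neg, tot)
```

With 15 regions this yields 15 extra columns per subject, which is the shape of feature the "+ statistic" rows add to the main features.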




In [27]:
avg_neg_strength_baseline_B = compute_one_statistic(df_features=main_baseline_B,
                                                    df_network=conn_B_baseline_df,
                                                    regions=regions,
                                                    stat_name="avg_negative_strength",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)

100%|██████████| 4321/4321 [00:36<00:00, 118.46it/s]


In [28]:
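# Append the per-region average negative strength to the baseline features,
# then rerun the cross-validation without connectome PCs
baseline_B_plus = combine_data(main_baseline_B, avg_neg_strength_baseline_B)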
results_no_network, models_no_network = run_cross_validation(baseline_B_plus, y_baseline_B, use_network=False)
print_cross_val_results(results_no_network)

Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 231.57it/s]


Site: site02 | Pearson r = 0.457 (p=0.000) | Partial η² = 0.110

Site: site03 | Pearson r = 0.427 (p=0.000) | Partial η² = 0.152

Site: site04 | Pearson r = 0.495 (p=0.000) | Partial η² = 0.245

Site: site05 | Pearson r = 0.580 (p=0.000) | Partial η² = 0.334

Site: site06 | Pearson r = 0.456 (p=0.000) | Partial η² = 0.190

Site: site07 | Pearson r = 0.643 (p=0.000) | Partial η² = 0.387

Site: site08 | Pearson r = 0.468 (p=0.000) | Partial η² = 0.086

Site: site09 | Pearson r = 0.442 (p=0.000) | Partial η² = 0.096

Site: site10 | Pearson r = 0.467 (p=0.000) | Partial η² = 0.211

Site: site11 | Pearson r = 0.574 (p=0.000) | Partial η² = 0.305

Site: site12 | Pearson r = 0.417 (p=0.000) | Partial η² = 0.095

Site: site13 | Pearson r = 0.563 (p=0.000) | Partial η² = 0.307

Site: site14 | Pearson r = 0.481 (p=0.000) | Partial η² = 0.226

Site: site15 | Pearson r = 0.625 (p=0.000) | Partial η² = 0.383

Site: site16 | Pearson r = 0.516 (p=0.000) | Partial η² = 0.257

Site: site17 | Pearson r




In [29]:
avg_neg_strength_followup_B = compute_one_statistic(df_features=main_followup_B,
                                                    df_network=conn_B_followup_df,
                                                    regions=regions,
                                                    stat_name="avg_negative_strength",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)


100%|██████████| 2756/2756 [00:19<00:00, 138.42it/s]


In [30]:
followup_B_plus = combine_data(main_followup_B, avg_neg_strength_followup_B)
# Missing values in the combined follow-up features are replaced with 0.0 before fitting
results_no_network, models_no_network = run_cross_validation(followup_B_plus.fillna(0.0), y_followup_B,
                                                             use_network=False)
print_cross_val_results(results_no_network)

Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 228.78it/s]


Site: site02 | Pearson r = 0.419 (p=0.000) | Partial η² = 0.170

Site: site03 | Pearson r = 0.371 (p=0.000) | Partial η² = 0.128

Site: site04 | Pearson r = 0.452 (p=0.000) | Partial η² = 0.155

Site: site05 | Pearson r = 0.557 (p=0.000) | Partial η² = 0.288

Site: site06 | Pearson r = 0.304 (p=0.002) | Partial η² = 0.030

Site: site07 | Pearson r = 0.557 (p=0.000) | Partial η² = 0.279

Site: site08 | Pearson r = 0.351 (p=0.002) | Partial η² = -0.000

Site: site09 | Pearson r = 0.436 (p=0.001) | Partial η² = 0.105

Site: site10 | Pearson r = 0.518 (p=0.000) | Partial η² = 0.258

Site: site11 | Pearson r = 0.544 (p=0.000) | Partial η² = 0.287

Site: site12 | Pearson r = 0.489 (p=0.000) | Partial η² = 0.237

Site: site13 | Pearson r = 0.440 (p=0.000) | Partial η² = 0.182

Site: site14 | Pearson r = 0.458 (p=0.000) | Partial η² = 0.209

Site: site15 | Pearson r = 0.735 (p=0.000) | Partial η² = 0.484

Site: site16 | Pearson r = 0.392 (p=0.000) | Partial η² = 0.145

Site: site17 | Pearson 




#### Negative strength (one value for each region, 15 values total)
| **Case**                | **PC Condition** | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-------------------------|------------------|-------------------------|--------------------------------------|
| B\_baseline + statistic |                  | 0.500                   | 0.222                                |
| B\_baseline             | none             | 0.502                   | 0.223                                |
| B\_baseline             | 15 PCs           | 0.512                   | 0.237                                |
| B\_baseline             | 100 PCs          | 0.530                   | 0.261                                |
| B\_followup + statistic |                  | 0.464                   | 0.187                                |
| B\_followup             | none             | 0.463                   | 0.188                                |
| B\_followup             | 15 PCs           | 0.473                   | 0.197                                |
| B\_followup             | 100 PCs          | 0.519                   | 0.241                                |

In [31]:
avg_total_strength_baseline_B = compute_one_statistic(df_features=main_baseline_B,
                                                      df_network=conn_B_baseline_df,
                                                      regions=regions,
                                                      stat_name="avg_total_strength",
                                                      mode="all",
                                                      node_order=None,
                                                      matrix_size=418,
                                                      diag_value=0.0)
baseline_B_plus = combine_data(main_baseline_B, avg_total_strength_baseline_B)
results_no_network, models_no_network = run_cross_validation(baseline_B_plus, y_baseline_B, use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 4321/4321 [00:38<00:00, 113.48it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 212.39it/s]


Site: site02 | Pearson r = 0.474 (p=0.000) | Partial η² = 0.124

Site: site03 | Pearson r = 0.430 (p=0.000) | Partial η² = 0.153

Site: site04 | Pearson r = 0.502 (p=0.000) | Partial η² = 0.250

Site: site05 | Pearson r = 0.587 (p=0.000) | Partial η² = 0.341

Site: site06 | Pearson r = 0.454 (p=0.000) | Partial η² = 0.191

Site: site07 | Pearson r = 0.650 (p=0.000) | Partial η² = 0.394

Site: site08 | Pearson r = 0.472 (p=0.000) | Partial η² = 0.090

Site: site09 | Pearson r = 0.427 (p=0.000) | Partial η² = 0.071

Site: site10 | Pearson r = 0.468 (p=0.000) | Partial η² = 0.212

Site: site11 | Pearson r = 0.573 (p=0.000) | Partial η² = 0.300

Site: site12 | Pearson r = 0.431 (p=0.000) | Partial η² = 0.115

Site: site13 | Pearson r = 0.569 (p=0.000) | Partial η² = 0.317

Site: site14 | Pearson r = 0.476 (p=0.000) | Partial η² = 0.222

Site: site15 | Pearson r = 0.620 (p=0.000) | Partial η² = 0.375

Site: site16 | Pearson r = 0.518 (p=0.000) | Partial η² = 0.260

Site: site17 | Pearson r




In [32]:
avg_total_strength_followup_B = compute_one_statistic(df_features=main_followup_B,
                                                      df_network=conn_B_followup_df,
                                                      regions=regions,
                                                      stat_name="avg_total_strength",
                                                      mode="all",
                                                      node_order=None,
                                                      matrix_size=418,
                                                      diag_value=0.0)

followup_B_plus = combine_data(main_followup_B, avg_total_strength_followup_B)
results_no_network, models_no_network = run_cross_validation(followup_B_plus.fillna(0.0), y_followup_B,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2756/2756 [00:20<00:00, 134.64it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 300.52it/s]


Site: site02 | Pearson r = 0.422 (p=0.000) | Partial η² = 0.173

Site: site03 | Pearson r = 0.379 (p=0.000) | Partial η² = 0.134

Site: site04 | Pearson r = 0.449 (p=0.000) | Partial η² = 0.139

Site: site05 | Pearson r = 0.572 (p=0.000) | Partial η² = 0.308

Site: site06 | Pearson r = 0.310 (p=0.002) | Partial η² = 0.037

Site: site07 | Pearson r = 0.538 (p=0.000) | Partial η² = 0.261

Site: site08 | Pearson r = 0.363 (p=0.001) | Partial η² = 0.028

Site: site09 | Pearson r = 0.395 (p=0.003) | Partial η² = 0.064

Site: site10 | Pearson r = 0.510 (p=0.000) | Partial η² = 0.249

Site: site11 | Pearson r = 0.548 (p=0.000) | Partial η² = 0.270

Site: site12 | Pearson r = 0.496 (p=0.000) | Partial η² = 0.243

Site: site13 | Pearson r = 0.446 (p=0.000) | Partial η² = 0.191

Site: site14 | Pearson r = 0.472 (p=0.000) | Partial η² = 0.220

Site: site15 | Pearson r = 0.745 (p=0.000) | Partial η² = 0.500

Site: site16 | Pearson r = 0.378 (p=0.000) | Partial η² = 0.134

Site: site17 | Pearson r




#### Total strength (one value for each region, 15 values total)
| **Case**                | **PC Condition** | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-------------------------|------------------|-------------------------|--------------------------------------|
| B\_baseline + statistic |                  | 0.502                   | 0.223                                |
| B\_baseline             | none             | 0.502                   | 0.223                                |
| B\_baseline             | 15 PCs           | 0.512                   | 0.237                                |
| B\_baseline             | 100 PCs          | 0.530                   | 0.261                                |
| B\_followup + statistic |                  | 0.464                   | 0.190                                |
| B\_followup             | none             | 0.463                   | 0.188                                |
| B\_followup             | 15 PCs           | 0.473                   | 0.197                                |
| B\_followup             | 100 PCs          | 0.519                   | 0.241                                |

In [33]:
within_pos_mean_baseline_B = compute_one_statistic(df_features=main_baseline_B,
                                                   df_network=conn_B_baseline_df,
                                                   regions=regions,
                                                   stat_name="within_positive_mean",
                                                   mode="all",
                                                   node_order=None,
                                                   matrix_size=418,
                                                   diag_value=0.0)


100%|██████████| 4321/4321 [00:37<00:00, 116.10it/s]


In [35]:
baseline_B_plus = combine_data(main_baseline_B, within_pos_mean_baseline_B)
results_no_network, models_no_network = run_cross_validation(baseline_B_plus, y_baseline_B, use_network=False)
print_cross_val_results(results_no_network)

Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 211.69it/s]


Site: site02 | Pearson r = 0.461 (p=0.000) | Partial η² = 0.082

Site: site03 | Pearson r = 0.434 (p=0.000) | Partial η² = 0.166

Site: site04 | Pearson r = 0.495 (p=0.000) | Partial η² = 0.243

Site: site05 | Pearson r = 0.581 (p=0.000) | Partial η² = 0.335

Site: site06 | Pearson r = 0.458 (p=0.000) | Partial η² = 0.195

Site: site07 | Pearson r = 0.644 (p=0.000) | Partial η² = 0.391

Site: site08 | Pearson r = 0.462 (p=0.000) | Partial η² = 0.073

Site: site09 | Pearson r = 0.442 (p=0.000) | Partial η² = 0.094

Site: site10 | Pearson r = 0.469 (p=0.000) | Partial η² = 0.215

Site: site11 | Pearson r = 0.588 (p=0.000) | Partial η² = 0.305

Site: site12 | Pearson r = 0.427 (p=0.000) | Partial η² = 0.108

Site: site13 | Pearson r = 0.553 (p=0.000) | Partial η² = 0.298

Site: site14 | Pearson r = 0.481 (p=0.000) | Partial η² = 0.226

Site: site15 | Pearson r = 0.631 (p=0.000) | Partial η² = 0.388

Site: site16 | Pearson r = 0.521 (p=0.000) | Partial η² = 0.265

Site: site17 | Pearson r




In [36]:
within_pos_mean_followup_B = compute_one_statistic(df_features=main_followup_B,
                                                   df_network=conn_B_followup_df,
                                                   regions=regions,
                                                   stat_name="within_positive_mean",
                                                   mode="all",
                                                   node_order=None,
                                                   matrix_size=418,
                                                   diag_value=0.0)

followup_B_plus = combine_data(main_followup_B, within_pos_mean_followup_B)
results_no_network, models_no_network = run_cross_validation(followup_B_plus.fillna(0.0), y_followup_B,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2756/2756 [00:21<00:00, 128.57it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 339.80it/s]


Site: site02 | Pearson r = 0.403 (p=0.000) | Partial η² = 0.148

Site: site03 | Pearson r = 0.358 (p=0.000) | Partial η² = 0.116

Site: site04 | Pearson r = 0.436 (p=0.000) | Partial η² = 0.132

Site: site05 | Pearson r = 0.558 (p=0.000) | Partial η² = 0.287

Site: site06 | Pearson r = 0.301 (p=0.002) | Partial η² = 0.029

Site: site07 | Pearson r = 0.544 (p=0.000) | Partial η² = 0.274

Site: site08 | Pearson r = 0.354 (p=0.001) | Partial η² = 0.022

Site: site09 | Pearson r = 0.394 (p=0.003) | Partial η² = 0.064

Site: site10 | Pearson r = 0.514 (p=0.000) | Partial η² = 0.248

Site: site11 | Pearson r = 0.545 (p=0.000) | Partial η² = 0.293

Site: site12 | Pearson r = 0.512 (p=0.000) | Partial η² = 0.262

Site: site13 | Pearson r = 0.454 (p=0.000) | Partial η² = 0.196

Site: site14 | Pearson r = 0.495 (p=0.000) | Partial η² = 0.244

Site: site15 | Pearson r = 0.718 (p=0.000) | Partial η² = 0.452

Site: site16 | Pearson r = 0.349 (p=0.000) | Partial η² = 0.110

Site: site17 | Pearson r




#### Within positive connectivity (one value for each region, 15 values total)
| **Case**                | **PC Condition** | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-------------------------|------------------|-------------------------|--------------------------------------|
| B\_baseline + statistic |                  | 0.503                   | 0.221                                |
| B\_baseline             | none             | 0.502                   | 0.223                                |
| B\_baseline             | 15 PCs           | 0.512                   | 0.237                                |
| B\_baseline             | 100 PCs          | 0.530                   | 0.261                                |
| B\_followup + statistic |                  | 0.462                   | 0.184                                |
| B\_followup             | none             | 0.463                   | 0.188                                |
| B\_followup             | 15 PCs           | 0.473                   | 0.197                                |
| B\_followup             | 100 PCs          | 0.519                   | 0.241                                |
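
The within and between statistics separate a region's internal edges from the edges that leave it. The sketch below shows one plausible reading for the positive case: for each region, average the positive edge weights among its own nodes (within) and the positive edge weights linking its nodes to all other nodes (between). This is an assumption about the definition, not the notebook's `compute_one_statistic`. The function names, the toy matrix, and the labels `A` and `B` are hypothetical.

```python
import numpy as np

def _positive_mean(values):
    """Mean of the strictly positive entries, or NaN when there are none."""
    positive = values[values > 0]
    return float(positive.mean()) if positive.size else float("nan")

def within_between_positive_mean(conn, labels):
    """Per-region mean positive connectivity, within and between regions."""
    conn = np.asarray(conn, dtype=float)
    labels = np.asarray(labels)
    off_diag = ~np.eye(conn.shape[0], dtype=bool)   # drop self-connections
    stats = {}
    for r in np.unique(labels):
        in_r = labels == r
        within = in_r[:, None] & in_r[None, :] & off_diag
        between = in_r[:, None] & ~in_r[None, :]
        stats[r.item()] = {
            "within": _positive_mean(conn[within]),
            "between": _positive_mean(conn[between]),
        }
    return stats

# Toy connectome: 4 nodes split into 2 regions.
wb_conn = [[0.0, 0.5, -0.2, 0.1],
           [0.5, 0.0, 0.3, -0.4],
           [-0.2, 0.3, 0.0, 0.6],
           [0.1, -0.4, 0.6, 0.0]]
wb_labels = ["A", "A", "B", "B"]

wb_stats = within_between_positive_mean(wb_conn, wb_labels)
print(wb_stats)
```

Returning NaN for a region with no positive edges is a choice made for this sketch. A call such as `.fillna(0.0)`, as used in the follow-up cells, would turn any such NaN into 0 before model fitting.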

In [37]:
between_pos_mean_baseline_B = compute_one_statistic(df_features=main_baseline_B,
                                                    df_network=conn_B_baseline_df,
                                                    regions=regions,
                                                    stat_name="between_positive_mean",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)
baseline_B_plus = combine_data(main_baseline_B, between_pos_mean_baseline_B)
results_no_network, models_no_network = run_cross_validation(baseline_B_plus, y_baseline_B, use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 4321/4321 [00:36<00:00, 118.50it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 206.71it/s]


Site: site02 | Pearson r = 0.478 (p=0.000) | Partial η² = 0.137

Site: site03 | Pearson r = 0.437 (p=0.000) | Partial η² = 0.166

Site: site04 | Pearson r = 0.497 (p=0.000) | Partial η² = 0.247

Site: site05 | Pearson r = 0.582 (p=0.000) | Partial η² = 0.335

Site: site06 | Pearson r = 0.459 (p=0.000) | Partial η² = 0.194

Site: site07 | Pearson r = 0.648 (p=0.000) | Partial η² = 0.392

Site: site08 | Pearson r = 0.475 (p=0.000) | Partial η² = 0.108

Site: site09 | Pearson r = 0.428 (p=0.000) | Partial η² = 0.060

Site: site10 | Pearson r = 0.467 (p=0.000) | Partial η² = 0.212

Site: site11 | Pearson r = 0.573 (p=0.000) | Partial η² = 0.303

Site: site12 | Pearson r = 0.427 (p=0.000) | Partial η² = 0.102

Site: site13 | Pearson r = 0.563 (p=0.000) | Partial η² = 0.303

Site: site14 | Pearson r = 0.481 (p=0.000) | Partial η² = 0.224

Site: site15 | Pearson r = 0.618 (p=0.000) | Partial η² = 0.375

Site: site16 | Pearson r = 0.520 (p=0.000) | Partial η² = 0.260

Site: site17 | Pearson r




In [38]:
between_pos_mean_followup_B = compute_one_statistic(df_features=main_followup_B,
                                                    df_network=conn_B_followup_df,
                                                    regions=regions,
                                                    stat_name="between_positive_mean",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)

followup_B_plus = combine_data(main_followup_B, between_pos_mean_followup_B)
results_no_network, models_no_network = run_cross_validation(followup_B_plus.fillna(0.0), y_followup_B,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2756/2756 [00:21<00:00, 127.42it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 253.31it/s]


Site: site02 | Pearson r = 0.403 (p=0.000) | Partial η² = 0.158

Site: site03 | Pearson r = 0.380 (p=0.000) | Partial η² = 0.136

Site: site04 | Pearson r = 0.449 (p=0.000) | Partial η² = 0.133

Site: site05 | Pearson r = 0.574 (p=0.000) | Partial η² = 0.309

Site: site06 | Pearson r = 0.295 (p=0.003) | Partial η² = 0.021

Site: site07 | Pearson r = 0.537 (p=0.000) | Partial η² = 0.256

Site: site08 | Pearson r = 0.369 (p=0.001) | Partial η² = 0.045

Site: site09 | Pearson r = 0.380 (p=0.005) | Partial η² = 0.035

Site: site10 | Pearson r = 0.516 (p=0.000) | Partial η² = 0.257

Site: site11 | Pearson r = 0.509 (p=0.000) | Partial η² = 0.186

Site: site12 | Pearson r = 0.505 (p=0.000) | Partial η² = 0.253

Site: site13 | Pearson r = 0.457 (p=0.000) | Partial η² = 0.203

Site: site14 | Pearson r = 0.474 (p=0.000) | Partial η² = 0.223

Site: site15 | Pearson r = 0.736 (p=0.000) | Partial η² = 0.493

Site: site16 | Pearson r = 0.383 (p=0.000) | Partial η² = 0.140

Site: site17 | Pearson r




#### Between positive connectivity (one value for each region, 15 values total)
| **Case**                | **PC Condition** | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-------------------------|------------------|-------------------------|--------------------------------------|
| B\_baseline + statistic |                  | 0.503                   | 0.224                                |
| B\_baseline             | none             | 0.502                   | 0.223                                |
| B\_baseline             | 15 PCs           | 0.512                   | 0.237                                |
| B\_baseline             | 100 PCs          | 0.530                   | 0.261                                |
| B\_followup + statistic |                  | 0.461                   | 0.184                                |
| B\_followup             | none             | 0.463                   | 0.188                                |
| B\_followup             | 15 PCs           | 0.473                   | 0.197                                |
| B\_followup             | 100 PCs          | 0.519                   | 0.241                                |

In [40]:
within_neg_mean_baseline_B = compute_one_statistic(df_features=main_baseline_B,
                                                   df_network=conn_B_baseline_df,
                                                   regions=regions,
                                                   stat_name="within_negative_mean",
                                                   mode="all",
                                                   node_order=None,
                                                   matrix_size=418,
                                                   diag_value=0.0)
baseline_B_plus = combine_data(main_baseline_B, within_neg_mean_baseline_B)
results_no_network, models_no_network = run_cross_validation(baseline_B_plus, y_baseline_B, use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 4321/4321 [00:37<00:00, 116.19it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 202.54it/s]


Site: site02 | Pearson r = 0.451 (p=0.000) | Partial η² = 0.087

Site: site03 | Pearson r = 0.431 (p=0.000) | Partial η² = 0.156

Site: site04 | Pearson r = 0.498 (p=0.000) | Partial η² = 0.246

Site: site05 | Pearson r = 0.576 (p=0.000) | Partial η² = 0.328

Site: site06 | Pearson r = 0.440 (p=0.000) | Partial η² = 0.177

Site: site07 | Pearson r = 0.645 (p=0.000) | Partial η² = 0.388

Site: site08 | Pearson r = 0.475 (p=0.000) | Partial η² = 0.073

Site: site09 | Pearson r = 0.454 (p=0.000) | Partial η² = 0.104

Site: site10 | Pearson r = 0.449 (p=0.000) | Partial η² = 0.190

Site: site11 | Pearson r = 0.578 (p=0.000) | Partial η² = 0.305

Site: site12 | Pearson r = 0.415 (p=0.000) | Partial η² = 0.103

Site: site13 | Pearson r = 0.570 (p=0.000) | Partial η² = 0.319

Site: site14 | Pearson r = 0.476 (p=0.000) | Partial η² = 0.220

Site: site15 | Pearson r = 0.625 (p=0.000) | Partial η² = 0.382

Site: site16 | Pearson r = 0.510 (p=0.000) | Partial η² = 0.254

Site: site17 | Pearson r




In [41]:
within_neg_mean_followup_B = compute_one_statistic(df_features=main_followup_B,
                                                   df_network=conn_B_followup_df,
                                                   regions=regions,
                                                   stat_name="within_negative_mean",
                                                   mode="all",
                                                   node_order=None,
                                                   matrix_size=418,
                                                   diag_value=0.0)

followup_B_plus = combine_data(main_followup_B, within_neg_mean_followup_B)
results_no_network, models_no_network = run_cross_validation(followup_B_plus.fillna(0.0), y_followup_B,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2756/2756 [00:19<00:00, 140.35it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 323.71it/s]


Site: site02 | Pearson r = 0.415 (p=0.000) | Partial η² = 0.166

Site: site03 | Pearson r = 0.376 (p=0.000) | Partial η² = 0.133

Site: site04 | Pearson r = 0.427 (p=0.000) | Partial η² = 0.132

Site: site05 | Pearson r = 0.566 (p=0.000) | Partial η² = 0.296

Site: site06 | Pearson r = 0.272 (p=0.006) | Partial η² = -0.024

Site: site07 | Pearson r = 0.550 (p=0.000) | Partial η² = 0.265

Site: site08 | Pearson r = 0.347 (p=0.002) | Partial η² = 0.010

Site: site09 | Pearson r = 0.447 (p=0.001) | Partial η² = 0.104

Site: site10 | Pearson r = 0.524 (p=0.000) | Partial η² = 0.264

Site: site11 | Pearson r = 0.547 (p=0.000) | Partial η² = 0.290

Site: site12 | Pearson r = 0.479 (p=0.000) | Partial η² = 0.227

Site: site13 | Pearson r = 0.440 (p=0.000) | Partial η² = 0.182

Site: site14 | Pearson r = 0.462 (p=0.000) | Partial η² = 0.213

Site: site15 | Pearson r = 0.746 (p=0.000) | Partial η² = 0.489

Site: site16 | Pearson r = 0.376 (p=0.000) | Partial η² = 0.133

Site: site17 | Pearson 




#### Within negative connectivity (one value for each region, 15 values total)
| **Case**                | **PC Condition** | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-------------------------|------------------|-------------------------|--------------------------------------|
| B\_baseline + statistic |                  | 0.501                   | 0.220                                |
| B\_baseline             | none             | 0.502                   | 0.223                                |
| B\_baseline             | 15 PCs           | 0.512                   | 0.237                                |
| B\_baseline             | 100 PCs          | 0.530                   | 0.261                                |
| B\_followup + statistic |                  | 0.464                   | 0.186                                |
| B\_followup             | none             | 0.463                   | 0.188                                |
| B\_followup             | 15 PCs           | 0.473                   | 0.197                                |
| B\_followup             | 100 PCs          | 0.519                   | 0.241                                |

In [42]:
between_neg_mean_baseline_B = compute_one_statistic(df_features=main_baseline_B,
                                                    df_network=conn_B_baseline_df,
                                                    regions=regions,
                                                    stat_name="between_negative_mean",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)
baseline_B_plus = combine_data(main_baseline_B, between_neg_mean_baseline_B)
results_no_network, models_no_network = run_cross_validation(baseline_B_plus, y_baseline_B, use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 4321/4321 [00:35<00:00, 121.45it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 238.12it/s]


Site: site02 | Pearson r = 0.456 (p=0.000) | Partial η² = 0.109

Site: site03 | Pearson r = 0.427 (p=0.000) | Partial η² = 0.152

Site: site04 | Pearson r = 0.495 (p=0.000) | Partial η² = 0.245

Site: site05 | Pearson r = 0.579 (p=0.000) | Partial η² = 0.332

Site: site06 | Pearson r = 0.459 (p=0.000) | Partial η² = 0.193

Site: site07 | Pearson r = 0.643 (p=0.000) | Partial η² = 0.387

Site: site08 | Pearson r = 0.468 (p=0.000) | Partial η² = 0.086

Site: site09 | Pearson r = 0.443 (p=0.000) | Partial η² = 0.098

Site: site10 | Pearson r = 0.467 (p=0.000) | Partial η² = 0.211

Site: site11 | Pearson r = 0.574 (p=0.000) | Partial η² = 0.305

Site: site12 | Pearson r = 0.414 (p=0.000) | Partial η² = 0.091

Site: site13 | Pearson r = 0.563 (p=0.000) | Partial η² = 0.307

Site: site14 | Pearson r = 0.480 (p=0.000) | Partial η² = 0.225

Site: site15 | Pearson r = 0.626 (p=0.000) | Partial η² = 0.383

Site: site16 | Pearson r = 0.516 (p=0.000) | Partial η² = 0.258

Site: site17 | Pearson r




In [43]:
between_neg_mean_followup_B = compute_one_statistic(df_features=main_followup_B,
                                                    df_network=conn_B_followup_df,
                                                    regions=regions,
                                                    stat_name="between_negative_mean",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)

followup_B_plus = combine_data(main_followup_B, between_neg_mean_followup_B)
results_no_network, models_no_network = run_cross_validation(followup_B_plus.fillna(0.0), y_followup_B,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2756/2756 [00:20<00:00, 137.71it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 263.03it/s]


Site: site02 | Pearson r = 0.417 (p=0.000) | Partial η² = 0.168

Site: site03 | Pearson r = 0.371 (p=0.000) | Partial η² = 0.127

Site: site04 | Pearson r = 0.452 (p=0.000) | Partial η² = 0.155

Site: site05 | Pearson r = 0.556 (p=0.000) | Partial η² = 0.288

Site: site06 | Pearson r = 0.308 (p=0.002) | Partial η² = 0.035

Site: site07 | Pearson r = 0.558 (p=0.000) | Partial η² = 0.282

Site: site08 | Pearson r = 0.351 (p=0.002) | Partial η² = -0.001

Site: site09 | Pearson r = 0.437 (p=0.001) | Partial η² = 0.106

Site: site10 | Pearson r = 0.517 (p=0.000) | Partial η² = 0.257

Site: site11 | Pearson r = 0.544 (p=0.000) | Partial η² = 0.287

Site: site12 | Pearson r = 0.489 (p=0.000) | Partial η² = 0.237

Site: site13 | Pearson r = 0.440 (p=0.000) | Partial η² = 0.182

Site: site14 | Pearson r = 0.458 (p=0.000) | Partial η² = 0.210

Site: site15 | Pearson r = 0.737 (p=0.000) | Partial η² = 0.485

Site: site16 | Pearson r = 0.393 (p=0.000) | Partial η² = 0.145

Site: site17 | Pearson 




#### Between negative connectivity (one value for each region, 15 values total)
| **Case**                | **PC Condition** | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-------------------------|------------------|-------------------------|--------------------------------------|
| B\_baseline + statistic |                  | 0.500                   | 0.222                                |
| B\_baseline             | none             | 0.502                   | 0.223                                |
| B\_baseline             | 15 PCs           | 0.512                   | 0.237                                |
| B\_baseline             | 100 PCs          | 0.530                   | 0.261                                |
| B\_followup + statistic |                  | 0.464                   | 0.187                                |
| B\_followup             | none             | 0.463                   | 0.188                                |
| B\_followup             | 15 PCs           | 0.473                   | 0.197                                |
| B\_followup             | 100 PCs          | 0.519                   | 0.241                                |

In [44]:
segregation_ratio_baseline_B = compute_one_statistic(df_features=main_baseline_B,
                                                    df_network=conn_B_baseline_df,
                                                    regions=regions,
                                                    stat_name="segregation_ratio",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)
baseline_B_plus = combine_data(main_baseline_B, segregation_ratio_baseline_B)
results_no_network, models_no_network = run_cross_validation(baseline_B_plus, y_baseline_B, use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 4321/4321 [00:38<00:00, 111.90it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 230.25it/s]


Site: site02 | Pearson r = 0.459 (p=0.000) | Partial η² = 0.106

Site: site03 | Pearson r = 0.452 (p=0.000) | Partial η² = 0.181

Site: site04 | Pearson r = 0.496 (p=0.000) | Partial η² = 0.244

Site: site05 | Pearson r = 0.582 (p=0.000) | Partial η² = 0.336

Site: site06 | Pearson r = 0.482 (p=0.000) | Partial η² = 0.221

Site: site07 | Pearson r = 0.653 (p=0.000) | Partial η² = 0.401

Site: site08 | Pearson r = 0.464 (p=0.000) | Partial η² = 0.081

Site: site09 | Pearson r = 0.438 (p=0.000) | Partial η² = 0.098

Site: site10 | Pearson r = 0.476 (p=0.000) | Partial η² = 0.220

Site: site11 | Pearson r = 0.585 (p=0.000) | Partial η² = 0.299

Site: site12 | Pearson r = 0.420 (p=0.000) | Partial η² = 0.097

Site: site13 | Pearson r = 0.555 (p=0.000) | Partial η² = 0.303

Site: site14 | Pearson r = 0.507 (p=0.000) | Partial η² = 0.253

Site: site15 | Pearson r = 0.635 (p=0.000) | Partial η² = 0.393

Site: site16 | Pearson r = 0.528 (p=0.000) | Partial η² = 0.270

Site: site17 | Pearson r




In [45]:
segregation_ratio_followup_B = compute_one_statistic(df_features=main_followup_B,
                                                    df_network=conn_B_followup_df,
                                                    regions=regions,
                                                    stat_name="segregation_ratio",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)

followup_B_plus = combine_data(main_followup_B, segregation_ratio_followup_B)
results_no_network, models_no_network = run_cross_validation(followup_B_plus.fillna(0.0), y_followup_B,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2756/2756 [00:21<00:00, 130.34it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 289.93it/s]


Site: site02 | Pearson r = 0.391 (p=0.000) | Partial η² = 0.145

Site: site03 | Pearson r = 0.361 (p=0.000) | Partial η² = 0.117

Site: site04 | Pearson r = 0.436 (p=0.000) | Partial η² = 0.141

Site: site05 | Pearson r = 0.562 (p=0.000) | Partial η² = 0.290

Site: site06 | Pearson r = 0.320 (p=0.001) | Partial η² = 0.052

Site: site07 | Pearson r = 0.573 (p=0.000) | Partial η² = 0.300

Site: site08 | Pearson r = 0.373 (p=0.001) | Partial η² = 0.032

Site: site09 | Pearson r = 0.409 (p=0.002) | Partial η² = 0.081

Site: site10 | Pearson r = 0.513 (p=0.000) | Partial η² = 0.250

Site: site11 | Pearson r = 0.554 (p=0.000) | Partial η² = 0.303

Site: site12 | Pearson r = 0.504 (p=0.000) | Partial η² = 0.253

Site: site13 | Pearson r = 0.447 (p=0.000) | Partial η² = 0.189

Site: site14 | Pearson r = 0.487 (p=0.000) | Partial η² = 0.236

Site: site15 | Pearson r = 0.706 (p=0.000) | Partial η² = 0.452

Site: site16 | Pearson r = 0.384 (p=0.000) | Partial η² = 0.141

Site: site17 | Pearson r




#### Segregation ratio (one value for each region, 15 values total)
| **Case**                | **PC Condition** | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-------------------------|------------------|-------------------------|--------------------------------------|
| B\_baseline + statistic |                  | 0.507                   | 0.226                                |
| B\_baseline             | none             | 0.502                   | 0.223                                |
| B\_baseline             | 15 PCs           | 0.512                   | 0.237                                |
| B\_baseline             | 100 PCs          | 0.530                   | 0.261                                |
| B\_followup + statistic |                  | 0.466                   | 0.189                                |
| B\_followup             | none             | 0.463                   | 0.188                                |
| B\_followup             | 15 PCs           | 0.473                   | 0.197                                |
| B\_followup             | 100 PCs          | 0.519                   | 0.241                                |

In [46]:
segregation_index_baseline_B = compute_one_statistic(df_features=main_baseline_B,
                                                    df_network=conn_B_baseline_df,
                                                    regions=regions,
                                                    stat_name="segregation_index",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)
baseline_B_plus = combine_data(main_baseline_B, segregation_index_baseline_B)
results_no_network, models_no_network = run_cross_validation(baseline_B_plus, y_baseline_B, use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 4321/4321 [00:38<00:00, 113.19it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 231.91it/s]


Site: site02 | Pearson r = 0.459 (p=0.000) | Partial η² = 0.100

Site: site03 | Pearson r = 0.442 (p=0.000) | Partial η² = 0.170

Site: site04 | Pearson r = 0.496 (p=0.000) | Partial η² = 0.244

Site: site05 | Pearson r = 0.580 (p=0.000) | Partial η² = 0.334

Site: site06 | Pearson r = 0.489 (p=0.000) | Partial η² = 0.227

Site: site07 | Pearson r = 0.654 (p=0.000) | Partial η² = 0.399

Site: site08 | Pearson r = 0.461 (p=0.000) | Partial η² = 0.077

Site: site09 | Pearson r = 0.433 (p=0.000) | Partial η² = 0.092

Site: site10 | Pearson r = 0.472 (p=0.000) | Partial η² = 0.215

Site: site11 | Pearson r = 0.581 (p=0.000) | Partial η² = 0.296

Site: site12 | Pearson r = 0.436 (p=0.000) | Partial η² = 0.117

Site: site13 | Pearson r = 0.562 (p=0.000) | Partial η² = 0.310

Site: site14 | Pearson r = 0.498 (p=0.000) | Partial η² = 0.245

Site: site15 | Pearson r = 0.637 (p=0.000) | Partial η² = 0.394

Site: site16 | Pearson r = 0.530 (p=0.000) | Partial η² = 0.273

Site: site17 | Pearson r




In [47]:
segregation_index_followup_B = compute_one_statistic(df_features=main_followup_B,
                                                    df_network=conn_B_followup_df,
                                                    regions=regions,
                                                    stat_name="segregation_index",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)

followup_B_plus = combine_data(main_followup_B, segregation_index_followup_B)
results_no_network, models_no_network = run_cross_validation(followup_B_plus.fillna(0.0), y_followup_B,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2756/2756 [00:20<00:00, 132.86it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 20/20 [00:00<00:00, 316.06it/s]


Site: site02 | Pearson r = 0.377 (p=0.000) | Partial η² = 0.134

Site: site03 | Pearson r = 0.358 (p=0.000) | Partial η² = 0.115

Site: site04 | Pearson r = 0.436 (p=0.000) | Partial η² = 0.141

Site: site05 | Pearson r = 0.558 (p=0.000) | Partial η² = 0.289

Site: site06 | Pearson r = 0.309 (p=0.002) | Partial η² = 0.040

Site: site07 | Pearson r = 0.585 (p=0.000) | Partial η² = 0.305

Site: site08 | Pearson r = 0.346 (p=0.002) | Partial η² = 0.006

Site: site09 | Pearson r = 0.398 (p=0.003) | Partial η² = 0.073

Site: site10 | Pearson r = 0.511 (p=0.000) | Partial η² = 0.246

Site: site11 | Pearson r = 0.566 (p=0.000) | Partial η² = 0.311

Site: site12 | Pearson r = 0.508 (p=0.000) | Partial η² = 0.257

Site: site13 | Pearson r = 0.433 (p=0.000) | Partial η² = 0.173

Site: site14 | Pearson r = 0.464 (p=0.000) | Partial η² = 0.215

Site: site15 | Pearson r = 0.742 (p=0.000) | Partial η² = 0.501

Site: site16 | Pearson r = 0.389 (p=0.000) | Partial η² = 0.146

Site: site17 | Pearson r




#### Segregation index (one value for each region, 15 values total)
| **Case**                | **PC Condition** | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-------------------------|------------------|-------------------------|--------------------------------------|
| B\_baseline + statistic |                  | 0.507                   | 0.227                                |
| B\_baseline             | none             | 0.502                   | 0.223                                |
| B\_baseline             | 15 PCs           | 0.512                   | 0.237                                |
| B\_baseline             | 100 PCs          | 0.530                   | 0.261                                |
| B\_followup + statistic |                  | 0.464                   | 0.188                                |
| B\_followup             | none             | 0.463                   | 0.188                                |
| B\_followup             | 15 PCs           | 0.473                   | 0.197                                |
| B\_followup             | 100 PCs          | 0.519                   | 0.241                                |
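The implementation of `compute_one_statistic` is not shown in this notebook, so the exact definition behind `stat_name="segregation_index"` is unknown here. As a rough sketch only, one common definition of system segregation is (within − between) / within, computed per network on positive edge weights. The function name `segregation_index`, the `labels` argument, and the toy matrix below are all hypothetical.

```python
import numpy as np

def segregation_index(conn, labels):
    """Per-network (within - between) / within on positive edges (assumed definition)."""
    conn = np.asarray(conn, dtype=float)
    labels = np.asarray(labels)
    pos = np.where(conn > 0, conn, 0.0)          # negative weights are set to zero
    off_diag = ~np.eye(len(labels), dtype=bool)  # exclude self-connections
    out = {}
    for net in np.unique(labels):
        members = labels == net
        within_mask = np.outer(members, members) & off_diag
        between_mask = np.outer(members, ~members)
        within = pos[within_mask].mean() if within_mask.any() else np.nan
        between = pos[between_mask].mean() if between_mask.any() else np.nan
        # Undefined (NaN) when the network has no positive within-network weight
        out[net.item()] = (within - between) / within if within else np.nan
    return out

# Toy 4-node example with two hypothetical networks "A" and "B"
toy_conn = np.array([
    [0.0, 0.8, 0.2, 0.2],
    [0.8, 0.0, 0.2, 0.2],
    [0.2, 0.2, 0.0, 0.6],
    [0.2, 0.2, 0.6, 0.0],
])
toy_result = segregation_index(toy_conn, ["A", "A", "B", "B"])
print(toy_result)
```

With 15 networks, a function of this shape yields 15 values per subject, which matches the "one value for each region, 15 values total" layout described in the heading.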

#### Option B preprocessed baseline data: summary table
| **Model Description**             | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-----------------------------------|-------------------------|--------------------------------------|
| Full features                     | 0.502                   | 0.223                                |
| Full features + 15 PCs            | 0.512                   | 0.237                                |
| Full features + 100 PCs           | 0.530                   | 0.261                                |
| Full features + pos strength      | 0.503                   | 0.224                                |
| Full features + neg strength      | 0.500                   | 0.222                                |
| Full features + total strength    | 0.502                   | 0.223                                |
| Full features + within positive   | 0.503                   | 0.221                                |
| Full features + between positive  | 0.503                   | 0.224                                |
| Full features + within negative   | 0.501                   | 0.220                                |
| Full features + between negative  | 0.500                   | 0.222                                |
| Full features + segregation ratio | 0.507                   | 0.226                                |
| Full features + segregation index | 0.507                   | 0.227                                |


#### Option B preprocessed followup data: summary table
| **Model Description**             | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-----------------------------------|-------------------------|--------------------------------------|
| Full features                     | 0.463                   | 0.188                                |
| Full features + 15 PCs            | 0.473                   | 0.197                                |
| Full features + 100 PCs           | 0.519                   | 0.241                                |
| Full features + pos strength      | 0.460                   | 0.183                                |
| Full features + neg strength      | 0.464                   | 0.187                                |
| Full features + total strength    | 0.464                   | 0.190                                |
| Full features + within positive   | 0.462                   | 0.184                                |
| Full features + between positive  | 0.461                   | 0.184                                |
| Full features + within negative   | 0.464                   | 0.186                                |
| Full features + between negative  | 0.464                   | 0.187                                |
| Full features + segregation ratio | 0.466                   | 0.189                                |
| Full features + segregation index | 0.464                   | 0.188                                |
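To make the two summary tables above easier to compare, the following sketch computes each model's gain over the "Full features" row. The values are copied from the tables; the helper name `gains` is hypothetical and not part of the notebook's own code.

```python
def gains(full, variants):
    """Return {name: (delta_r, delta_eta2)} relative to the full-features model."""
    r0, eta0 = full
    return {name: (round(r - r0, 3), round(eta - eta0, 3))
            for name, (r, eta) in variants.items()}

baseline = gains((0.502, 0.223), {
    "15 PCs": (0.512, 0.237), "100 PCs": (0.530, 0.261),
    "pos strength": (0.503, 0.224), "neg strength": (0.500, 0.222),
    "total strength": (0.502, 0.223), "within positive": (0.503, 0.221),
    "between positive": (0.503, 0.224), "within negative": (0.501, 0.220),
    "between negative": (0.500, 0.222), "segregation ratio": (0.507, 0.226),
    "segregation index": (0.507, 0.227),
})
followup = gains((0.463, 0.188), {
    "15 PCs": (0.473, 0.197), "100 PCs": (0.519, 0.241),
    "pos strength": (0.460, 0.183), "neg strength": (0.464, 0.187),
    "total strength": (0.464, 0.190), "within positive": (0.462, 0.184),
    "between positive": (0.461, 0.184), "within negative": (0.464, 0.186),
    "between negative": (0.464, 0.187), "segregation ratio": (0.466, 0.189),
    "segregation index": (0.464, 0.188),
})

for label, table in (("baseline", baseline), ("followup", followup)):
    for name, (d_r, d_eta) in table.items():
        print(f"{label:9s} {name:18s} delta r = {d_r:+.3f}  delta eta2 = {d_eta:+.3f}")
```

On these numbers, adding 100 PCs raises the average Pearson's r by 0.028 (baseline) and 0.056 (followup), while no single summary statistic raises it by more than 0.005.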




## Option A data preprocessing:

In [10]:
# Define paths
version_A_dir = Path("data/processed/version_A")

# Load features
features_A_baseline = pd.read_csv(version_A_dir / "features_baseline.csv")
features_A_followup = pd.read_csv(version_A_dir / "features_followup.csv")

# Load connectomes
conn_A_baseline = np.load(version_A_dir / "connectomes_baseline.npy")
conn_A_followup = np.load(version_A_dir / "connectomes_followup.npy")

# Load subjects (Version A reads the same subjects.csv for both baseline and followup)
subs_A_baseline = pd.read_csv(version_A_dir / "subjects.csv")["Subject"].tolist()
subs_A_followup = pd.read_csv(version_A_dir / "subjects.csv")["Subject"].tolist()

print("Version A")
print("Baseline features:", features_A_baseline.shape)
print("Followup features:", features_A_followup.shape)
print("Baseline connectomes:", conn_A_baseline.shape)
print("Followup connectomes:", conn_A_followup.shape)
print("Baseline subjects:", len(subs_A_baseline))
print("Followup subjects:", len(subs_A_followup))

Version A
Baseline features: (2409, 13)
Followup features: (2409, 13)
Baseline connectomes: (2409, 87153)
Followup connectomes: (2409, 87153)
Baseline subjects: 2409
Followup subjects: 2409
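
The connectome arrays have 87153 columns, which equals 418 × 417 / 2, the number of unique off-diagonal entries of a symmetric 418 × 418 matrix. This matches the `matrix_size=418` and `diag_value=0.0` arguments passed to `compute_one_statistic` elsewhere in this notebook. The sketch below shows how such a vector could be expanded back into a full matrix. The helper name `vector_to_matrix` is hypothetical, and the row-major upper-triangle ordering is an assumption: the ordering actually used when the `.npy` files were written is not shown here.

```python
import numpy as np

def vector_to_matrix(vec, matrix_size=418, diag_value=0.0):
    """Rebuild a symmetric connectivity matrix from its upper-triangle vector."""
    vec = np.asarray(vec, dtype=float)
    expected = matrix_size * (matrix_size - 1) // 2
    if vec.shape[0] != expected:
        raise ValueError(f"expected {expected} edges, got {vec.shape[0]}")
    # Start with diag_value everywhere; off-diagonal entries are overwritten below
    mat = np.full((matrix_size, matrix_size), diag_value, dtype=float)
    rows, cols = np.triu_indices(matrix_size, k=1)  # row-major order (assumed)
    mat[rows, cols] = vec
    mat[cols, rows] = vec  # mirror to keep the matrix symmetric
    return mat

n_edges = 418 * 417 // 2
demo = vector_to_matrix(np.arange(n_edges, dtype=float))
print(n_edges, demo.shape)
```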


In [11]:
X_baseline_A, y_baseline_A = extract_article_features(features_A_baseline)
X_followup_A, y_followup_A = extract_article_features(features_A_followup)

print(X_baseline_A.shape, y_baseline_A.shape)
print(X_followup_A.shape, y_followup_A.shape)

(2409, 9) (2409,)
(2409, 9) (2409,)


In [12]:
# Convert NumPy array to DataFrame with placeholder column names
conn_A_baseline_df = pd.DataFrame(
    conn_A_baseline,
    columns=[f"conn_{i}" for i in range(conn_A_baseline.shape[1])]
)

# Add Subject column
conn_A_baseline_df.insert(0, "Subject", subs_A_baseline)

conn_A_followup_df = pd.DataFrame(
    conn_A_followup,
    columns=[f"conn_{i}" for i in range(conn_A_followup.shape[1])]
)

# Add Subject column
conn_A_followup_df.insert(0, "Subject", subs_A_followup)


In [13]:
results_no_network, models_no_network = run_cross_validation(X_baseline_A, y_baseline_A, use_network=False)
print_cross_val_results(results_no_network)

Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 262.28it/s]


Site: site02 | Pearson r = 0.251 (p=0.002) | Partial η² = 0.017

Site: site03 | Pearson r = 0.212 (p=0.009) | Partial η² = -0.000

Site: site04 | Pearson r = 0.344 (p=0.000) | Partial η² = 0.094

Site: site05 | Pearson r = 0.454 (p=0.000) | Partial η² = 0.192

Site: site06 | Pearson r = 0.414 (p=0.000) | Partial η² = 0.139

Site: site08 | Pearson r = 0.293 (p=0.025) | Partial η² = -0.592

Site: site09 | Pearson r = 0.276 (p=0.028) | Partial η² = -0.217

Site: site10 | Pearson r = 0.320 (p=0.000) | Partial η² = 0.077

Site: site11 | Pearson r = 0.393 (p=0.001) | Partial η² = 0.144

Site: site12 | Pearson r = 0.353 (p=0.010) | Partial η² = 0.064

Site: site13 | Pearson r = 0.298 (p=0.000) | Partial η² = 0.069

Site: site14 | Pearson r = 0.363 (p=0.000) | Partial η² = 0.124

Site: site15 | Pearson r = 0.621 (p=0.000) | Partial η² = 0.217

Site: site16 | Pearson r = 0.413 (p=0.000) | Partial η² = 0.168

Site: site17 | Pearson r = 0.296 (p=0.003) | Partial η² = -0.117

Site: site18 | Pears




In [14]:
results_no_network, models_no_network = run_cross_validation(X_followup_A, y_followup_A, use_network=False)
print_cross_val_results(results_no_network)

Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 312.72it/s]


Site: site02 | Pearson r = 0.218 (p=0.008) | Partial η² = 0.027

Site: site03 | Pearson r = 0.184 (p=0.024) | Partial η² = 0.008

Site: site04 | Pearson r = 0.267 (p=0.000) | Partial η² = 0.017

Site: site05 | Pearson r = 0.489 (p=0.000) | Partial η² = 0.176

Site: site06 | Pearson r = 0.260 (p=0.001) | Partial η² = 0.045

Site: site08 | Pearson r = 0.154 (p=0.231) | Partial η² = -0.629

Site: site09 | Pearson r = 0.202 (p=0.115) | Partial η² = -0.245

Site: site10 | Pearson r = 0.339 (p=0.000) | Partial η² = 0.112

Site: site11 | Pearson r = 0.435 (p=0.000) | Partial η² = 0.179

Site: site12 | Pearson r = 0.333 (p=0.013) | Partial η² = 0.091

Site: site13 | Pearson r = 0.272 (p=0.001) | Partial η² = 0.050

Site: site14 | Pearson r = 0.273 (p=0.000) | Partial η² = 0.069

Site: site15 | Pearson r = 0.593 (p=0.000) | Partial η² = 0.055

Site: site16 | Pearson r = 0.340 (p=0.000) | Partial η² = 0.113

Site: site17 | Pearson r = 0.191 (p=0.065) | Partial η² = -0.268

Site: site18 | Pearso




In [15]:
ses_baseline_A = extract_ses_features(features_A_baseline)
ses_followup_A = extract_ses_features(features_A_followup)

In [16]:
main_baseline_A = combine_data(ses_baseline_A, X_baseline_A)
main_followup_A = combine_data(ses_followup_A, X_followup_A)

Dropping duplicate columns: ['site_id_l', 'Subject']
Dropping duplicate columns: ['site_id_l', 'Subject']


In [17]:
results_no_network, models_no_network = run_cross_validation(main_baseline_A, y_baseline_A, use_network=False)
print_cross_val_results(results_no_network)

Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 391.63it/s]


Site: site02 | Pearson r = 0.385 (p=0.000) | Partial η² = 0.096

Site: site03 | Pearson r = 0.329 (p=0.000) | Partial η² = 0.083

Site: site04 | Pearson r = 0.485 (p=0.000) | Partial η² = 0.234

Site: site05 | Pearson r = 0.530 (p=0.000) | Partial η² = 0.274

Site: site06 | Pearson r = 0.408 (p=0.000) | Partial η² = 0.160

Site: site08 | Pearson r = 0.388 (p=0.002) | Partial η² = -0.131

Site: site09 | Pearson r = 0.462 (p=0.000) | Partial η² = 0.071

Site: site10 | Pearson r = 0.415 (p=0.000) | Partial η² = 0.165

Site: site11 | Pearson r = 0.508 (p=0.000) | Partial η² = 0.241

Site: site12 | Pearson r = 0.397 (p=0.003) | Partial η² = 0.121

Site: site13 | Pearson r = 0.472 (p=0.000) | Partial η² = 0.211

Site: site14 | Pearson r = 0.507 (p=0.000) | Partial η² = 0.244

Site: site15 | Pearson r = 0.662 (p=0.000) | Partial η² = 0.397

Site: site16 | Pearson r = 0.476 (p=0.000) | Partial η² = 0.225

Site: site17 | Pearson r = 0.435 (p=0.000) | Partial η² = 0.117

Site: site18 | Pearson 




In [18]:
results_no_network, models_no_network = run_cross_validation(main_followup_A, y_followup_A, use_network=False)
print_cross_val_results(results_no_network)

Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 404.48it/s]


Site: site02 | Pearson r = 0.307 (p=0.000) | Partial η² = 0.060

Site: site03 | Pearson r = 0.415 (p=0.000) | Partial η² = 0.169

Site: site04 | Pearson r = 0.388 (p=0.000) | Partial η² = 0.134

Site: site05 | Pearson r = 0.556 (p=0.000) | Partial η² = 0.266

Site: site06 | Pearson r = 0.288 (p=0.000) | Partial η² = 0.027

Site: site08 | Pearson r = 0.358 (p=0.004) | Partial η² = -0.046

Site: site09 | Pearson r = 0.335 (p=0.008) | Partial η² = -0.035

Site: site10 | Pearson r = 0.383 (p=0.000) | Partial η² = 0.104

Site: site11 | Pearson r = 0.584 (p=0.000) | Partial η² = 0.336

Site: site12 | Pearson r = 0.500 (p=0.000) | Partial η² = 0.249

Site: site13 | Pearson r = 0.490 (p=0.000) | Partial η² = 0.237

Site: site14 | Pearson r = 0.431 (p=0.000) | Partial η² = 0.183

Site: site15 | Pearson r = 0.714 (p=0.000) | Partial η² = 0.353

Site: site16 | Pearson r = 0.390 (p=0.000) | Partial η² = 0.151

Site: site17 | Pearson r = 0.384 (p=0.000) | Partial η² = 0.014

Site: site18 | Pearson




In [19]:
results_network, models_network = run_cross_validation(main_baseline_A, y_baseline_A, use_network=True,
                                                       X_network=conn_A_baseline_df, num_pc=15)
print_cross_val_results(results_network)

Cross-validation progress: 100%|██████████| 19/19 [02:04<00:00,  6.54s/it]


Site: site02 | Pearson r = 0.392 (p=0.000) | Partial η² = 0.114

Site: site03 | Pearson r = 0.361 (p=0.000) | Partial η² = 0.110

Site: site04 | Pearson r = 0.485 (p=0.000) | Partial η² = 0.233

Site: site05 | Pearson r = 0.528 (p=0.000) | Partial η² = 0.274

Site: site06 | Pearson r = 0.438 (p=0.000) | Partial η² = 0.189

Site: site08 | Pearson r = 0.433 (p=0.001) | Partial η² = -0.068

Site: site09 | Pearson r = 0.468 (p=0.000) | Partial η² = 0.072

Site: site10 | Pearson r = 0.419 (p=0.000) | Partial η² = 0.170

Site: site11 | Pearson r = 0.507 (p=0.000) | Partial η² = 0.245

Site: site12 | Pearson r = 0.385 (p=0.004) | Partial η² = 0.087

Site: site13 | Pearson r = 0.481 (p=0.000) | Partial η² = 0.224

Site: site14 | Pearson r = 0.513 (p=0.000) | Partial η² = 0.254

Site: site15 | Pearson r = 0.645 (p=0.000) | Partial η² = 0.396

Site: site16 | Pearson r = 0.477 (p=0.000) | Partial η² = 0.227

Site: site17 | Pearson r = 0.379 (p=0.000) | Partial η² = 0.066

Site: site18 | Pearson r = 0.445 (p=0.000) | Partial η² = 0.192

Site: site19 | Pearson r = 0.445 (p=0.000) | Partial η² = 0.188

Site: site20 | Pearson r = 0.451 (p=0.000) | Partial η² = 0.148

Site: site21 | Pearson r = 0.455 (p=0.000) | Partial η² = 0.192
Cross-validated Performance Metrics:
Average Pearson's r: 0.458
Partial eta squared: 0.174
R²: 0.174
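
In the output above, the summary prints the same value for "Partial eta squared" and "R²", and some sites (for example site08) show a negative value alongside a clearly positive Pearson r. How `run_cross_validation` computes these metrics is not shown, so the following is only a sketch under the assumption that the reported quantity behaves like an out-of-sample R² (1 − SSE/SST). It illustrates why such a metric can be negative even when predictions correlate with the target: correlation ignores offset and scale errors, while R² penalizes them. The function names and toy data are hypothetical.

```python
import numpy as np

def pearson_r(y_true, y_pred):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def out_of_sample_r2(y_true, y_pred):
    """1 - SSE/SST on held-out data; negative when worse than predicting the mean."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sse = np.sum((y_true - y_pred) ** 2)
    sst = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - sse / sst)

# Toy held-out site: predictions track the target perfectly but are shifted and shrunk
y_true = np.array([-1.0, 0.0, 1.0, 2.0])
y_pred = 0.5 * y_true + 2.0

r_demo = pearson_r(y_true, y_pred)
r2_demo = out_of_sample_r2(y_true, y_pred)
print(f"Pearson r = {r_demo:.3f}, out-of-sample R2 = {r2_demo:.3f}")
```

If the assumption holds, a negative per-site value would point to a site whose score distribution differs in mean or spread from the training sites, rather than to a failure to rank subjects.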





In [20]:
results_network, models_network = run_cross_validation(main_baseline_A, y_baseline_A, use_network=True,
                                                       X_network=conn_A_baseline_df, num_pc=100)
print_cross_val_results(results_network)

Cross-validation progress: 100%|██████████| 19/19 [03:41<00:00, 11.64s/it]


Site: site02 | Pearson r = 0.467 (p=0.000) | Partial η² = 0.196

Site: site03 | Pearson r = 0.456 (p=0.000) | Partial η² = 0.203

Site: site04 | Pearson r = 0.514 (p=0.000) | Partial η² = 0.256

Site: site05 | Pearson r = 0.518 (p=0.000) | Partial η² = 0.268

Site: site06 | Pearson r = 0.467 (p=0.000) | Partial η² = 0.216

Site: site08 | Pearson r = 0.436 (p=0.001) | Partial η² = -0.084

Site: site09 | Pearson r = 0.481 (p=0.000) | Partial η² = 0.088

Site: site10 | Pearson r = 0.462 (p=0.000) | Partial η² = 0.203

Site: site11 | Pearson r = 0.543 (p=0.000) | Partial η² = 0.290

Site: site12 | Pearson r = 0.394 (p=0.003) | Partial η² = 0.077

Site: site13 | Pearson r = 0.490 (p=0.000) | Partial η² = 0.233

Site: site14 | Pearson r = 0.488 (p=0.000) | Partial η² = 0.237

Site: site15 | Pearson r = 0.663 (p=0.000) | Partial η² = 0.415

Site: site16 | Pearson r = 0.522 (p=0.000) | Partial η² = 0.272

Site: site17 | Pearson r = 0.394 (p=0.000) | Partial η² = 0.077

Site: site18 | Pearson r = 0.476 (p=0.000) | Partial η² = 0.211

Site: site19 | Pearson r = 0.443 (p=0.000) | Partial η² = 0.191

Site: site20 | Pearson r = 0.420 (p=0.000) | Partial η² = 0.121

Site: site21 | Pearson r = 0.462 (p=0.000) | Partial η² = 0.197
Cross-validated Performance Metrics:
Average Pearson's r: 0.479
Partial eta squared: 0.193
R²: 0.193
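
The runs above pass `num_pc=15` and `num_pc=100` to `run_cross_validation`, whose internals are not shown in this notebook. As a sketch of one leakage-free way to combine principal components with leave-one-site-out cross-validation, the components can be fitted on the training sites only and then used to project the held-out site. The helpers `fit_pca` and `project`, the random data, and the three site labels are all hypothetical; this uses plain NumPy SVD rather than whatever the notebook's own function does.

```python
import numpy as np

def fit_pca(X_train, num_pc):
    """Fit PCA on training data only; return the training mean and top components."""
    mean = X_train.mean(axis=0)
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[:num_pc]

def project(X, mean, components):
    """Project data onto previously fitted components using the training mean."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 200))                         # toy "connectome" features
sites = np.repeat(["site02", "site03", "site04"], 20)  # toy site labels

for held_out in np.unique(sites):
    train = sites != held_out
    mean, comps = fit_pca(X[train], num_pc=15)
    pcs_train = project(X[train], mean, comps)   # used to fit the model
    pcs_test = project(X[~train], mean, comps)   # held-out site never influences the PCs
    print(held_out, pcs_train.shape, pcs_test.shape)
```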





In [21]:
results_network, models_network = run_cross_validation(main_followup_A, y_followup_A, use_network=True,
                                                       X_network=conn_A_followup_df, num_pc=15)
print_cross_val_results(results_network)

Cross-validation progress: 100%|██████████| 19/19 [02:26<00:00,  7.74s/it]


Site: site02 | Pearson r = 0.345 (p=0.000) | Partial η² = 0.094

Site: site03 | Pearson r = 0.441 (p=0.000) | Partial η² = 0.194

Site: site04 | Pearson r = 0.413 (p=0.000) | Partial η² = 0.149

Site: site05 | Pearson r = 0.559 (p=0.000) | Partial η² = 0.276

Site: site06 | Pearson r = 0.352 (p=0.000) | Partial η² = 0.098

Site: site08 | Pearson r = 0.395 (p=0.001) | Partial η² = -0.010

Site: site09 | Pearson r = 0.360 (p=0.004) | Partial η² = -0.020

Site: site10 | Pearson r = 0.389 (p=0.000) | Partial η² = 0.120

Site: site11 | Pearson r = 0.613 (p=0.000) | Partial η² = 0.374

Site: site12 | Pearson r = 0.519 (p=0.000) | Partial η² = 0.269

Site: site13 | Pearson r = 0.501 (p=0.000) | Partial η² = 0.249

Site: site14 | Pearson r = 0.468 (p=0.000) | Partial η² = 0.211

Site: site15 | Pearson r = 0.718 (p=0.000) | Partial η² = 0.395

Site: site16 | Pearson r = 0.373 (p=0.000) | Partial η² = 0.135

Site: site17 | Pearson r = 0.375 (p=0.000) | Partial η² = 0.002

Site: site18 | Pearson r = 0.330 (p=0.002) | Partial η² = 0.069

Site: site19 | Pearson r = 0.508 (p=0.000) | Partial η² = 0.249

Site: site20 | Pearson r = 0.517 (p=0.000) | Partial η² = 0.266

Site: site21 | Pearson r = 0.514 (p=0.000) | Partial η² = 0.260
Cross-validated Performance Metrics:
Average Pearson's r: 0.457
Partial eta squared: 0.178
R²: 0.178
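
Note on the metric column: some held-out sites (site08 and site09 in the output above) combine a clearly positive Pearson r with a negative value under "Partial η²". A proportion of variance computed as a ratio of sums of squares cannot be negative, and the summary prints the same number for partial η² and R². One plausible reading, which is an assumption because `print_cross_val_results` is not shown in this section, is that the column holds an out-of-sample R². The sketch below uses synthetic data and hypothetical helper names to show how r can stay high while R² turns negative when a site's predictions carry a constant offset.

```python
import numpy as np


def pearson_r(y_true, y_pred):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def out_of_sample_r2(y_true, y_pred):
    """1 - SS_res / SS_tot; negative when predictions do worse than the test-set mean."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic "site": predictions track the target but are shifted by a constant.
rng = np.random.default_rng(0)
y_true = rng.normal(size=200)
y_pred_good = y_true + rng.normal(scale=0.5, size=200)
y_pred_shifted = y_pred_good + 1.5  # hypothetical site-level offset

r_good = pearson_r(y_true, y_pred_good)
r2_good = out_of_sample_r2(y_true, y_pred_good)
r_shifted = pearson_r(y_true, y_pred_shifted)
r2_shifted = out_of_sample_r2(y_true, y_pred_shifted)

print(f"no offset:   r = {r_good:.3f}, R2 = {r2_good:.3f}")
print(f"with offset: r = {r_shifted:.3f}, R2 = {r2_shifted:.3f}")
```

Pearson r ignores shifts and rescaling of the predictions, whereas R² penalises them, so the two columns can disagree for sites whose score distribution differs from the training sites.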





In [22]:
# Follow-up cohort (version A): network model with num_pc=100, for comparison with num_pc=15.
results_network, models_network = run_cross_validation(main_followup_A, y_followup_A, use_network=True,
                                                       X_network=conn_A_followup_df, num_pc=100)
print_cross_val_results(results_network)

Cross-validation progress: 100%|██████████| 19/19 [03:14<00:00, 10.24s/it]

Site: site02 | Pearson r = 0.376 (p=0.000) | Partial η² = 0.116
Site: site03 | Pearson r = 0.474 (p=0.000) | Partial η² = 0.224
Site: site04 | Pearson r = 0.460 (p=0.000) | Partial η² = 0.173
Site: site05 | Pearson r = 0.583 (p=0.000) | Partial η² = 0.309
Site: site06 | Pearson r = 0.373 (p=0.000) | Partial η² = 0.090
Site: site08 | Pearson r = 0.463 (p=0.000) | Partial η² = 0.068
Site: site09 | Pearson r = 0.392 (p=0.002) | Partial η² = 0.001
Site: site10 | Pearson r = 0.393 (p=0.000) | Partial η² = 0.126
Site: site11 | Pearson r = 0.577 (p=0.000) | Partial η² = 0.310
Site: site12 | Pearson r = 0.596 (p=0.000) | Partial η² = 0.352
Site: site13 | Pearson r = 0.510 (p=0.000) | Partial η² = 0.256
Site: site14 | Pearson r = 0.543 (p=0.000) | Partial η² = 0.281
Site: site15 | Pearson r = 0.643 (p=0.000) | Partial η² = 0.339
Site: site16 | Pearson r = 0.420 (p=0.000) | Partial η² = 0.166
Site: site17 | Pearson r = 0.470 (p=0.000) | Partial η² = 0.142
Site: site18 | Pearson r = 0.322 (p=0.002) | Partial η² = 0.039
Site: site19 | Pearson r = 0.531 (p=0.000) | Partial η² = 0.270
Site: site20 | Pearson r = 0.546 (p=0.000) | Partial η² = 0.295
Site: site21 | Pearson r = 0.587 (p=0.000) | Partial η² = 0.328
Cross-validated Performance Metrics:
Average Pearson's r: 0.487
Partial eta squared: 0.204
R²: 0.204
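
The two runs above differ only in `num_pc`, and each loops over 19 held-out sites. `run_cross_validation` itself is not shown in this section, so the following is only a minimal sketch of what such a procedure could look like, assuming leave-one-site-out folds, a PCA fitted on the training sites only, and ordinary least squares on the retained components. The function name, the synthetic data, and the use of plain NumPy are all assumptions made for illustration.

```python
import numpy as np


def leave_one_site_out_pca(X, y, sites, num_pc):
    """Hypothetical sketch: PCA + least squares, refit with each site held out."""
    per_site_r = {}
    for site in np.unique(sites):
        test = sites == site
        train = ~test
        # Fit the PCA on training subjects only so no test information leaks in.
        mean = X[train].mean(axis=0)
        _, _, vt = np.linalg.svd(X[train] - mean, full_matrices=False)
        components = vt[:num_pc]
        z_train = (X[train] - mean) @ components.T
        z_test = (X[test] - mean) @ components.T
        # Ordinary least squares with an intercept column.
        design = np.column_stack([np.ones(z_train.shape[0]), z_train])
        beta, *_ = np.linalg.lstsq(design, y[train], rcond=None)
        y_pred = np.column_stack([np.ones(z_test.shape[0]), z_test]) @ beta
        per_site_r[str(site)] = float(np.corrcoef(y[test], y_pred)[0, 1])
    return per_site_r


# Synthetic stand-in for connectome features: 5 latent factors drive both X and y.
rng = np.random.default_rng(0)
n_subjects, n_edges, n_latent = 300, 50, 5
latent = rng.normal(size=(n_subjects, n_latent))
X = latent @ rng.normal(size=(n_latent, n_edges)) + 0.5 * rng.normal(size=(n_subjects, n_edges))
y = latent @ rng.normal(size=n_latent) + 0.5 * rng.normal(size=n_subjects)
sites = np.repeat(np.array(["site02", "site03", "site04"]), 100)

per_site_r = leave_one_site_out_pca(X, y, sites, num_pc=15)
mean_r = float(np.mean(list(per_site_r.values())))
print(per_site_r, mean_r)
```

Fitting the projection inside each fold is the design choice that matters here: a PCA computed once on all subjects would let the held-out site influence the components.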





In [23]:
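# Baseline cohort: average positive connection strength from the 418-node connectomes,
# added to the baseline features and cross-validated with use_network=False.
avg_pos_strength_baseline_A = compute_one_statistic(df_features=main_baseline_A,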
                                                    df_network=conn_A_baseline_df,
                                                    regions=regions,
                                                    stat_name="avg_positive_strength",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)
baseline_A_plus = combine_data(main_baseline_A, avg_pos_strength_baseline_A)
results_no_network, models_no_network = run_cross_validation(baseline_A_plus, y_baseline_A, use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2409/2409 [00:18<00:00, 131.03it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 330.98it/s]


Site: site02 | Pearson r = 0.401 (p=0.000) | Partial η² = 0.114

Site: site03 | Pearson r = 0.318 (p=0.000) | Partial η² = 0.068

Site: site04 | Pearson r = 0.483 (p=0.000) | Partial η² = 0.233

Site: site05 | Pearson r = 0.536 (p=0.000) | Partial η² = 0.281

Site: site06 | Pearson r = 0.400 (p=0.000) | Partial η² = 0.154

Site: site08 | Pearson r = 0.393 (p=0.002) | Partial η² = -0.119

Site: site09 | Pearson r = 0.480 (p=0.000) | Partial η² = 0.063

Site: site10 | Pearson r = 0.411 (p=0.000) | Partial η² = 0.159

Site: site11 | Pearson r = 0.512 (p=0.000) | Partial η² = 0.246

Site: site12 | Pearson r = 0.396 (p=0.003) | Partial η² = 0.109

Site: site13 | Pearson r = 0.464 (p=0.000) | Partial η² = 0.202

Site: site14 | Pearson r = 0.472 (p=0.000) | Partial η² = 0.216

Site: site15 | Pearson r = 0.653 (p=0.000) | Partial η² = 0.393

Site: site16 | Pearson r = 0.481 (p=0.000) | Partial η² = 0.230

Site: site17 | Pearson r = 0.407 (p=0.000) | Partial η² = 0.089

Site: site18 | Pearson 
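
The `stat_name` values used in these cells ("avg_positive_strength", "avg_negative_strength", "avg_total_strength") are computed by `compute_one_statistic`, whose body is not part of this section. As a rough sketch under stated assumptions, node strength is taken here as the sum of edge weights attached to a node, positive and negative strength sum only the positive or only the negative weights, and total strength is their signed sum. These definitions and the function name `average_strengths` are hypothetical and may differ from the notebook's helper.

```python
import numpy as np


def average_strengths(conn, diag_value=0.0):
    """Mean over nodes of positive, negative, and total strength (assumed definitions)."""
    mat = np.array(conn, dtype=float)  # copy so the caller's matrix is untouched
    np.fill_diagonal(mat, diag_value)  # drop self-connections, as diag_value=0.0 suggests
    pos_strength = np.where(mat > 0, mat, 0.0).sum(axis=1)
    neg_strength = np.where(mat < 0, mat, 0.0).sum(axis=1)
    return {
        "avg_positive_strength": float(pos_strength.mean()),
        "avg_negative_strength": float(neg_strength.mean()),
        "avg_total_strength": float((pos_strength + neg_strength).mean()),
    }


# Tiny symmetric 3-node example standing in for a 418 x 418 connectome.
conn = np.array([
    [1.0, 0.5, -0.2],
    [0.5, 1.0, 0.1],
    [-0.2, 0.1, 1.0],
])
stats = average_strengths(conn)
print(stats)
```

Each statistic reduces a whole connectome to a single number per subject, which is why these cells run with `use_network=False` and finish in well under a second per fold.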




In [24]:
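# Follow-up cohort: same statistic; NaNs in the combined table are zero-filled before cross-validation.
avg_pos_strength_followup_A = compute_one_statistic(df_features=main_followup_A,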
                                                    df_network=conn_A_followup_df,
                                                    regions=regions,
                                                    stat_name="avg_positive_strength",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)
followup_A_plus = combine_data(main_followup_A, avg_pos_strength_followup_A)
results_no_network, models_no_network = run_cross_validation(followup_A_plus.fillna(0.0), y_followup_A, use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2409/2409 [00:18<00:00, 129.25it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 295.43it/s]


Site: site02 | Pearson r = 0.329 (p=0.000) | Partial η² = 0.078

Site: site03 | Pearson r = 0.392 (p=0.000) | Partial η² = 0.151

Site: site04 | Pearson r = 0.383 (p=0.000) | Partial η² = 0.121

Site: site05 | Pearson r = 0.580 (p=0.000) | Partial η² = 0.291

Site: site06 | Pearson r = 0.301 (p=0.000) | Partial η² = 0.045

Site: site08 | Pearson r = 0.369 (p=0.003) | Partial η² = -0.037

Site: site09 | Pearson r = 0.302 (p=0.017) | Partial η² = -0.050

Site: site10 | Pearson r = 0.383 (p=0.000) | Partial η² = 0.098

Site: site11 | Pearson r = 0.571 (p=0.000) | Partial η² = 0.325

Site: site12 | Pearson r = 0.511 (p=0.000) | Partial η² = 0.261

Site: site13 | Pearson r = 0.509 (p=0.000) | Partial η² = 0.259

Site: site14 | Pearson r = 0.454 (p=0.000) | Partial η² = 0.205

Site: site15 | Pearson r = 0.695 (p=0.000) | Partial η² = 0.356

Site: site16 | Pearson r = 0.387 (p=0.000) | Partial η² = 0.146

Site: site17 | Pearson r = 0.376 (p=0.000) | Partial η² = 0.031

Site: site18 | Pearson




In [25]:
# Baseline cohort: average negative connection strength as the extra feature.
avg_neg_strength_baseline_A = compute_one_statistic(df_features=main_baseline_A,
                                                    df_network=conn_A_baseline_df,
                                                    regions=regions,
                                                    stat_name="avg_negative_strength",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)
baseline_A_plus = combine_data(main_baseline_A, avg_neg_strength_baseline_A)
results_no_network, models_no_network = run_cross_validation(baseline_A_plus, y_baseline_A, use_network=False)
print_cross_val_results(results_no_network)


100%|██████████| 2409/2409 [00:18<00:00, 130.49it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 287.94it/s]


Site: site02 | Pearson r = 0.398 (p=0.000) | Partial η² = 0.116

Site: site03 | Pearson r = 0.351 (p=0.000) | Partial η² = 0.099

Site: site04 | Pearson r = 0.483 (p=0.000) | Partial η² = 0.230

Site: site05 | Pearson r = 0.527 (p=0.000) | Partial η² = 0.274

Site: site06 | Pearson r = 0.418 (p=0.000) | Partial η² = 0.170

Site: site08 | Pearson r = 0.390 (p=0.002) | Partial η² = -0.134

Site: site09 | Pearson r = 0.441 (p=0.000) | Partial η² = 0.050

Site: site10 | Pearson r = 0.433 (p=0.000) | Partial η² = 0.181

Site: site11 | Pearson r = 0.476 (p=0.000) | Partial η² = 0.205

Site: site12 | Pearson r = 0.391 (p=0.004) | Partial η² = 0.118

Site: site13 | Pearson r = 0.466 (p=0.000) | Partial η² = 0.201

Site: site14 | Pearson r = 0.482 (p=0.000) | Partial η² = 0.227

Site: site15 | Pearson r = 0.644 (p=0.000) | Partial η² = 0.380

Site: site16 | Pearson r = 0.496 (p=0.000) | Partial η² = 0.243

Site: site17 | Pearson r = 0.450 (p=0.000) | Partial η² = 0.124

Site: site18 | Pearson 




In [26]:
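# Follow-up cohort: average negative connection strength; NaNs are zero-filled before cross-validation.
avg_neg_strength_followup_A = compute_one_statistic(df_features=main_followup_A,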
                                                    df_network=conn_A_followup_df,
                                                    regions=regions,
                                                    stat_name="avg_negative_strength",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)

followup_A_plus = combine_data(main_followup_A, avg_neg_strength_followup_A)
results_no_network, models_no_network = run_cross_validation(followup_A_plus.fillna(0.0), y_followup_A,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2409/2409 [00:19<00:00, 121.41it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 212.43it/s]


Site: site02 | Pearson r = 0.311 (p=0.000) | Partial η² = 0.065

Site: site03 | Pearson r = 0.415 (p=0.000) | Partial η² = 0.166

Site: site04 | Pearson r = 0.393 (p=0.000) | Partial η² = 0.137

Site: site05 | Pearson r = 0.565 (p=0.000) | Partial η² = 0.275

Site: site06 | Pearson r = 0.308 (p=0.000) | Partial η² = 0.042

Site: site08 | Pearson r = 0.381 (p=0.002) | Partial η² = -0.073

Site: site09 | Pearson r = 0.351 (p=0.005) | Partial η² = -0.016

Site: site10 | Pearson r = 0.379 (p=0.000) | Partial η² = 0.092

Site: site11 | Pearson r = 0.575 (p=0.000) | Partial η² = 0.330

Site: site12 | Pearson r = 0.526 (p=0.000) | Partial η² = 0.276

Site: site13 | Pearson r = 0.482 (p=0.000) | Partial η² = 0.224

Site: site14 | Pearson r = 0.430 (p=0.000) | Partial η² = 0.179

Site: site15 | Pearson r = 0.703 (p=0.000) | Partial η² = 0.361

Site: site16 | Pearson r = 0.387 (p=0.000) | Partial η² = 0.149

Site: site17 | Pearson r = 0.392 (p=0.000) | Partial η² = 0.016

Site: site18 | Pearson




In [27]:
# Baseline cohort: average total connection strength as the extra feature.
avg_total_strength_baseline_A = compute_one_statistic(df_features=main_baseline_A,
                                                      df_network=conn_A_baseline_df,
                                                      regions=regions,
                                                      stat_name="avg_total_strength",
                                                      mode="all",
                                                      node_order=None,
                                                      matrix_size=418,
                                                      diag_value=0.0)
baseline_A_plus = combine_data(main_baseline_A, avg_total_strength_baseline_A)
results_no_network, models_no_network = run_cross_validation(baseline_A_plus, y_baseline_A, use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2409/2409 [00:18<00:00, 127.55it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 295.36it/s]


Site: site02 | Pearson r = 0.405 (p=0.000) | Partial η² = 0.117

Site: site03 | Pearson r = 0.334 (p=0.000) | Partial η² = 0.079

Site: site04 | Pearson r = 0.486 (p=0.000) | Partial η² = 0.236

Site: site05 | Pearson r = 0.537 (p=0.000) | Partial η² = 0.283

Site: site06 | Pearson r = 0.399 (p=0.000) | Partial η² = 0.155

Site: site08 | Pearson r = 0.392 (p=0.002) | Partial η² = -0.138

Site: site09 | Pearson r = 0.479 (p=0.000) | Partial η² = 0.070

Site: site10 | Pearson r = 0.417 (p=0.000) | Partial η² = 0.168

Site: site11 | Pearson r = 0.496 (p=0.000) | Partial η² = 0.230

Site: site12 | Pearson r = 0.390 (p=0.004) | Partial η² = 0.109

Site: site13 | Pearson r = 0.463 (p=0.000) | Partial η² = 0.201

Site: site14 | Pearson r = 0.479 (p=0.000) | Partial η² = 0.225

Site: site15 | Pearson r = 0.650 (p=0.000) | Partial η² = 0.388

Site: site16 | Pearson r = 0.494 (p=0.000) | Partial η² = 0.241

Site: site17 | Pearson r = 0.421 (p=0.000) | Partial η² = 0.090

Site: site18 | Pearson 




In [28]:
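# Follow-up cohort: average total connection strength; NaNs are zero-filled before cross-validation.
avg_total_strength_followup_A = compute_one_statistic(df_features=main_followup_A,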
                                                      df_network=conn_A_followup_df,
                                                      regions=regions,
                                                      stat_name="avg_total_strength",
                                                      mode="all",
                                                      node_order=None,
                                                      matrix_size=418,
                                                      diag_value=0.0)

followup_A_plus = combine_data(main_followup_A, avg_total_strength_followup_A)
results_no_network, models_no_network = run_cross_validation(followup_A_plus.fillna(0.0), y_followup_A,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2409/2409 [00:18<00:00, 132.01it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 269.21it/s]


Site: site02 | Pearson r = 0.339 (p=0.000) | Partial η² = 0.087

Site: site03 | Pearson r = 0.402 (p=0.000) | Partial η² = 0.158

Site: site04 | Pearson r = 0.391 (p=0.000) | Partial η² = 0.129

Site: site05 | Pearson r = 0.585 (p=0.000) | Partial η² = 0.302

Site: site06 | Pearson r = 0.307 (p=0.000) | Partial η² = 0.047

Site: site08 | Pearson r = 0.371 (p=0.003) | Partial η² = -0.053

Site: site09 | Pearson r = 0.318 (p=0.012) | Partial η² = -0.028

Site: site10 | Pearson r = 0.360 (p=0.000) | Partial η² = 0.069

Site: site11 | Pearson r = 0.578 (p=0.000) | Partial η² = 0.334

Site: site12 | Pearson r = 0.538 (p=0.000) | Partial η² = 0.289

Site: site13 | Pearson r = 0.497 (p=0.000) | Partial η² = 0.246

Site: site14 | Pearson r = 0.448 (p=0.000) | Partial η² = 0.200

Site: site15 | Pearson r = 0.692 (p=0.000) | Partial η² = 0.358

Site: site16 | Pearson r = 0.389 (p=0.000) | Partial η² = 0.149

Site: site17 | Pearson r = 0.382 (p=0.000) | Partial η² = 0.041

Site: site18 | Pearson




In [30]:
# Baseline cohort: use average negative and average positive strength together as extra features.
combined_strength_baseline_A = combine_data(avg_neg_strength_baseline_A, avg_pos_strength_baseline_A)
baseline_A_plus = combine_data(main_baseline_A, combined_strength_baseline_A)
results_no_network, models_no_network = run_cross_validation(baseline_A_plus.fillna(0.0), y_baseline_A,
                                                             use_network=False)
print_cross_val_results(results_no_network)

Dropping duplicate columns: ['Subject', 'site_id_l']
Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 187.01it/s]


Site: site02 | Pearson r = 0.407 (p=0.000) | Partial η² = 0.123

Site: site03 | Pearson r = 0.340 (p=0.000) | Partial η² = 0.087

Site: site04 | Pearson r = 0.479 (p=0.000) | Partial η² = 0.228

Site: site05 | Pearson r = 0.523 (p=0.000) | Partial η² = 0.268

Site: site06 | Pearson r = 0.406 (p=0.000) | Partial η² = 0.160

Site: site08 | Pearson r = 0.388 (p=0.002) | Partial η² = -0.117

Site: site09 | Pearson r = 0.467 (p=0.000) | Partial η² = 0.057

Site: site10 | Pearson r = 0.426 (p=0.000) | Partial η² = 0.174

Site: site11 | Pearson r = 0.476 (p=0.000) | Partial η² = 0.199

Site: site12 | Pearson r = 0.391 (p=0.004) | Partial η² = 0.115

Site: site13 | Pearson r = 0.458 (p=0.000) | Partial η² = 0.192

Site: site14 | Pearson r = 0.464 (p=0.000) | Partial η² = 0.210

Site: site15 | Pearson r = 0.638 (p=0.000) | Partial η² = 0.375

Site: site16 | Pearson r = 0.489 (p=0.000) | Partial η² = 0.238

Site: site17 | Pearson r = 0.420 (p=0.000) | Partial η² = 0.075

Site: site18 | Pearson 




In [31]:
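# Follow-up cohort: use average negative and average positive strength together as extra features.
combined_strength_followup_A = combine_data(avg_neg_strength_followup_A, avg_pos_strength_followup_A)
followup_A_plus = combine_data(main_followup_A, combined_strength_followup_A)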
results_no_network, models_no_network = run_cross_validation(followup_A_plus.fillna(0.0), y_followup_A,
                                                             use_network=False)
print_cross_val_results(results_no_network)

Dropping duplicate columns: ['Subject', 'site_id_l']
Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 154.47it/s]


Site: site02 | Pearson r = 0.319 (p=0.000) | Partial η² = 0.068

Site: site03 | Pearson r = 0.392 (p=0.000) | Partial η² = 0.148

Site: site04 | Pearson r = 0.384 (p=0.000) | Partial η² = 0.123

Site: site05 | Pearson r = 0.590 (p=0.000) | Partial η² = 0.303

Site: site06 | Pearson r = 0.311 (p=0.000) | Partial η² = 0.050

Site: site08 | Pearson r = 0.378 (p=0.002) | Partial η² = -0.056

Site: site09 | Pearson r = 0.314 (p=0.013) | Partial η² = -0.042

Site: site10 | Pearson r = 0.351 (p=0.000) | Partial η² = 0.053

Site: site11 | Pearson r = 0.566 (p=0.000) | Partial η² = 0.318

Site: site12 | Pearson r = 0.528 (p=0.000) | Partial η² = 0.279

Site: site13 | Pearson r = 0.497 (p=0.000) | Partial η² = 0.245

Site: site14 | Pearson r = 0.450 (p=0.000) | Partial η² = 0.199

Site: site15 | Pearson r = 0.690 (p=0.000) | Partial η² = 0.361

Site: site16 | Pearson r = 0.377 (p=0.000) | Partial η² = 0.137

Site: site17 | Pearson r = 0.395 (p=0.000) | Partial η² = 0.042

Site: site18 | Pearson




In [32]:
# Baseline cohort: mean positive connectivity within networks as the extra feature.
within_pos_mean_baseline_A = compute_one_statistic(df_features=main_baseline_A,
                                                   df_network=conn_A_baseline_df,
                                                   regions=regions,
                                                   stat_name="within_positive_mean",
                                                   mode="all",
                                                   node_order=None,
                                                   matrix_size=418,
                                                   diag_value=0.0)

baseline_A_plus = combine_data(main_baseline_A, within_pos_mean_baseline_A)
results_no_network, models_no_network = run_cross_validation(baseline_A_plus, y_baseline_A, use_network=False)
print_cross_val_results(results_no_network)


100%|██████████| 2409/2409 [00:19<00:00, 121.78it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 289.90it/s]


Site: site02 | Pearson r = 0.411 (p=0.000) | Partial η² = 0.109

Site: site03 | Pearson r = 0.328 (p=0.000) | Partial η² = 0.081

Site: site04 | Pearson r = 0.476 (p=0.000) | Partial η² = 0.226

Site: site05 | Pearson r = 0.494 (p=0.000) | Partial η² = 0.234

Site: site06 | Pearson r = 0.410 (p=0.000) | Partial η² = 0.164

Site: site08 | Pearson r = 0.409 (p=0.001) | Partial η² = -0.101

Site: site09 | Pearson r = 0.484 (p=0.000) | Partial η² = 0.100

Site: site10 | Pearson r = 0.430 (p=0.000) | Partial η² = 0.179

Site: site11 | Pearson r = 0.496 (p=0.000) | Partial η² = 0.222

Site: site12 | Pearson r = 0.424 (p=0.002) | Partial η² = 0.149

Site: site13 | Pearson r = 0.483 (p=0.000) | Partial η² = 0.222

Site: site14 | Pearson r = 0.490 (p=0.000) | Partial η² = 0.230

Site: site15 | Pearson r = 0.672 (p=0.000) | Partial η² = 0.408

Site: site16 | Pearson r = 0.477 (p=0.000) | Partial η² = 0.226

Site: site17 | Pearson r = 0.404 (p=0.000) | Partial η² = 0.052

Site: site18 | Pearson 




In [33]:
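# Follow-up cohort: mean positive connectivity within networks; NaNs are zero-filled before cross-validation.
within_pos_mean_followup_A = compute_one_statistic(df_features=main_followup_A,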
                                                   df_network=conn_A_followup_df,
                                                   regions=regions,
                                                   stat_name="within_positive_mean",
                                                   mode="all",
                                                   node_order=None,
                                                   matrix_size=418,
                                                   diag_value=0.0)

followup_A_plus = combine_data(main_followup_A, within_pos_mean_followup_A)
results_no_network, models_no_network = run_cross_validation(followup_A_plus.fillna(0.0), y_followup_A,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2409/2409 [00:19<00:00, 124.52it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 294.76it/s]


Site: site02 | Pearson r = 0.303 (p=0.000) | Partial η² = 0.048

Site: site03 | Pearson r = 0.403 (p=0.000) | Partial η² = 0.158

Site: site04 | Pearson r = 0.374 (p=0.000) | Partial η² = 0.117

Site: site05 | Pearson r = 0.535 (p=0.000) | Partial η² = 0.242

Site: site06 | Pearson r = 0.284 (p=0.000) | Partial η² = 0.024

Site: site08 | Pearson r = 0.372 (p=0.003) | Partial η² = -0.061

Site: site09 | Pearson r = 0.330 (p=0.009) | Partial η² = -0.020

Site: site10 | Pearson r = 0.388 (p=0.000) | Partial η² = 0.106

Site: site11 | Pearson r = 0.563 (p=0.000) | Partial η² = 0.315

Site: site12 | Pearson r = 0.511 (p=0.000) | Partial η² = 0.260

Site: site13 | Pearson r = 0.482 (p=0.000) | Partial η² = 0.221

Site: site14 | Pearson r = 0.462 (p=0.000) | Partial η² = 0.208

Site: site15 | Pearson r = 0.731 (p=0.000) | Partial η² = 0.389

Site: site16 | Pearson r = 0.355 (p=0.000) | Partial η² = 0.124

Site: site17 | Pearson r = 0.386 (p=0.000) | Partial η² = -0.003

Site: site18 | Pearso




In [35]:
# Baseline cohort: mean positive connectivity between networks as the extra feature.
between_pos_mean_baseline_A = compute_one_statistic(df_features=main_baseline_A,
                                                    df_network=conn_A_baseline_df,
                                                    regions=regions,
                                                    stat_name="between_positive_mean",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)
baseline_A_plus = combine_data(main_baseline_A, between_pos_mean_baseline_A)
results_no_network, models_no_network = run_cross_validation(baseline_A_plus, y_baseline_A, use_network=False)
print_cross_val_results(results_no_network)


100%|██████████| 2409/2409 [00:20<00:00, 120.19it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 273.49it/s]


Site: site02 | Pearson r = 0.400 (p=0.000) | Partial η² = 0.116

Site: site03 | Pearson r = 0.328 (p=0.000) | Partial η² = 0.075

Site: site04 | Pearson r = 0.483 (p=0.000) | Partial η² = 0.232

Site: site05 | Pearson r = 0.529 (p=0.000) | Partial η² = 0.275

Site: site06 | Pearson r = 0.401 (p=0.000) | Partial η² = 0.155

Site: site08 | Pearson r = 0.397 (p=0.002) | Partial η² = -0.123

Site: site09 | Pearson r = 0.480 (p=0.000) | Partial η² = 0.067

Site: site10 | Pearson r = 0.410 (p=0.000) | Partial η² = 0.160

Site: site11 | Pearson r = 0.506 (p=0.000) | Partial η² = 0.242

Site: site12 | Pearson r = 0.388 (p=0.004) | Partial η² = 0.102

Site: site13 | Pearson r = 0.461 (p=0.000) | Partial η² = 0.198

Site: site14 | Pearson r = 0.471 (p=0.000) | Partial η² = 0.217

Site: site15 | Pearson r = 0.646 (p=0.000) | Partial η² = 0.387

Site: site16 | Pearson r = 0.488 (p=0.000) | Partial η² = 0.236

Site: site17 | Pearson r = 0.422 (p=0.000) | Partial η² = 0.111

Site: site18 | Pearson 




In [36]:
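# Follow-up cohort: mean positive connectivity between networks; NaNs are zero-filled before cross-validation.
between_pos_mean_followup_A = compute_one_statistic(df_features=main_followup_A,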
                                                    df_network=conn_A_followup_df,
                                                    regions=regions,
                                                    stat_name="between_positive_mean",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)

followup_A_plus = combine_data(main_followup_A, between_pos_mean_followup_A)
results_no_network, models_no_network = run_cross_validation(followup_A_plus.fillna(0.0), y_followup_A,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2409/2409 [00:19<00:00, 121.73it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 284.03it/s]


Site: site02 | Pearson r = 0.337 (p=0.000) | Partial η² = 0.084

Site: site03 | Pearson r = 0.397 (p=0.000) | Partial η² = 0.156

Site: site04 | Pearson r = 0.390 (p=0.000) | Partial η² = 0.127

Site: site05 | Pearson r = 0.580 (p=0.000) | Partial η² = 0.296

Site: site06 | Pearson r = 0.298 (p=0.000) | Partial η² = 0.041

Site: site08 | Pearson r = 0.377 (p=0.002) | Partial η² = -0.031

Site: site09 | Pearson r = 0.304 (p=0.016) | Partial η² = -0.044

Site: site10 | Pearson r = 0.363 (p=0.000) | Partial η² = 0.071

Site: site11 | Pearson r = 0.562 (p=0.000) | Partial η² = 0.311

Site: site12 | Pearson r = 0.516 (p=0.000) | Partial η² = 0.266

Site: site13 | Pearson r = 0.499 (p=0.000) | Partial η² = 0.249

Site: site14 | Pearson r = 0.457 (p=0.000) | Partial η² = 0.208

Site: site15 | Pearson r = 0.691 (p=0.000) | Partial η² = 0.359

Site: site16 | Pearson r = 0.392 (p=0.000) | Partial η² = 0.150

Site: site17 | Pearson r = 0.371 (p=0.000) | Partial η² = 0.033

Site: site18 | Pearson
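
The "within_positive_mean" and "between_positive_mean" statistics separate edges by whether both endpoints belong to the same network. How `compute_one_statistic` uses `regions` for this is not visible here, so the sketch below is an assumption: it takes one network label per node, counts each undirected edge once, and averages only the positive weights. The function name and the toy labels are hypothetical.

```python
import numpy as np


def within_between_positive_mean(conn, labels):
    """Mean positive edge weight within and between networks (assumed definition)."""
    mat = np.asarray(conn, dtype=float)
    labels = np.asarray(labels)
    same_network = labels[:, None] == labels[None, :]
    upper = np.triu(np.ones(mat.shape, dtype=bool), k=1)  # each edge once, no diagonal
    positive = mat > 0
    within = mat[upper & same_network & positive]
    between = mat[upper & ~same_network & positive]
    # Return NaN when a subject has no positive edge of the requested kind.
    within_mean = float(within.mean()) if within.size else float("nan")
    between_mean = float(between.mean()) if between.size else float("nan")
    return within_mean, between_mean


# Four nodes in two toy networks, standing in for the 418-node parcellation.
conn = np.array([
    [0.0, 0.6, 0.2, -0.1],
    [0.6, 0.0, -0.3, 0.4],
    [0.2, -0.3, 0.0, 0.8],
    [-0.1, 0.4, 0.8, 0.0],
])
labels = ["A", "A", "B", "B"]
within_mean, between_mean = within_between_positive_mean(conn, labels)
print(within_mean, between_mean)
```

Returning NaN for an empty edge set is one way missing values could arise in the combined feature table, which would be consistent with the `fillna(0.0)` calls in the follow-up cells, although the notebook does not state where its NaNs come from.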




In [37]:
# Baseline cohort: mean negative connectivity within networks as the extra feature.
within_neg_mean_baseline_A = compute_one_statistic(df_features=main_baseline_A,
                                                   df_network=conn_A_baseline_df,
                                                   regions=regions,
                                                   stat_name="within_negative_mean",
                                                   mode="all",
                                                   node_order=None,
                                                   matrix_size=418,
                                                   diag_value=0.0)
baseline_A_plus = combine_data(main_baseline_A, within_neg_mean_baseline_A)
results_no_network, models_no_network = run_cross_validation(baseline_A_plus, y_baseline_A, use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2409/2409 [00:20<00:00, 117.61it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 245.72it/s]


Site: site02 | Pearson r = 0.380 (p=0.000) | Partial η² = 0.084

Site: site03 | Pearson r = 0.353 (p=0.000) | Partial η² = 0.103

Site: site04 | Pearson r = 0.481 (p=0.000) | Partial η² = 0.230

Site: site05 | Pearson r = 0.533 (p=0.000) | Partial η² = 0.276

Site: site06 | Pearson r = 0.389 (p=0.000) | Partial η² = 0.142

Site: site08 | Pearson r = 0.393 (p=0.002) | Partial η² = -0.152

Site: site09 | Pearson r = 0.462 (p=0.000) | Partial η² = 0.082

Site: site10 | Pearson r = 0.408 (p=0.000) | Partial η² = 0.158

Site: site11 | Pearson r = 0.494 (p=0.000) | Partial η² = 0.223

Site: site12 | Pearson r = 0.400 (p=0.003) | Partial η² = 0.127

Site: site13 | Pearson r = 0.465 (p=0.000) | Partial η² = 0.203

Site: site14 | Pearson r = 0.498 (p=0.000) | Partial η² = 0.238

Site: site15 | Pearson r = 0.665 (p=0.000) | Partial η² = 0.404

Site: site16 | Pearson r = 0.476 (p=0.000) | Partial η² = 0.225

Site: site17 | Pearson r = 0.416 (p=0.000) | Partial η² = 0.099

Site: site18 | Pearson 




In [38]:
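# Follow-up cohort: mean negative connectivity within networks; NaNs are zero-filled before cross-validation.
within_neg_mean_followup_A = compute_one_statistic(df_features=main_followup_A,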
                                                   df_network=conn_A_followup_df,
                                                   regions=regions,
                                                   stat_name="within_negative_mean",
                                                   mode="all",
                                                   node_order=None,
                                                   matrix_size=418,
                                                   diag_value=0.0)

followup_A_plus = combine_data(main_followup_A, within_neg_mean_followup_A)
results_no_network, models_no_network = run_cross_validation(followup_A_plus.fillna(0.0), y_followup_A,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2409/2409 [00:20<00:00, 120.22it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 294.89it/s]


Site: site02 | Pearson r = 0.313 (p=0.000) | Partial η² = 0.066

Site: site03 | Pearson r = 0.424 (p=0.000) | Partial η² = 0.175

Site: site04 | Pearson r = 0.374 (p=0.000) | Partial η² = 0.119

Site: site05 | Pearson r = 0.565 (p=0.000) | Partial η² = 0.280

Site: site06 | Pearson r = 0.278 (p=0.000) | Partial η² = 0.012

Site: site08 | Pearson r = 0.372 (p=0.003) | Partial η² = -0.049

Site: site09 | Pearson r = 0.331 (p=0.009) | Partial η² = -0.040

Site: site10 | Pearson r = 0.383 (p=0.000) | Partial η² = 0.101

Site: site11 | Pearson r = 0.568 (p=0.000) | Partial η² = 0.322

Site: site12 | Pearson r = 0.490 (p=0.000) | Partial η² = 0.237

Site: site13 | Pearson r = 0.458 (p=0.000) | Partial η² = 0.205

Site: site14 | Pearson r = 0.410 (p=0.000) | Partial η² = 0.164

Site: site15 | Pearson r = 0.733 (p=0.000) | Partial η² = 0.373

Site: site16 | Pearson r = 0.399 (p=0.000) | Partial η² = 0.158

Site: site17 | Pearson r = 0.398 (p=0.000) | Partial η² = 0.025

Site: site18 | Pearson
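
The per-site lines printed by `print_cross_val_results` can be collapsed into one number per metric. The sketch below parses the first three complete lines of the output above and takes an unweighted mean across sites. This is an assumption: the notebook's own aggregation is not shown in this part, and a weighted average would give different values. `parse_site_results` and `LINE_RE` are names introduced here for illustration; truncated lines such as the `site18` one are skipped.

```python
import re
from statistics import mean

# First three complete lines of the follow-up "within negative" output,
# plus the truncated final line to show that it is ignored.
log_text = """
Site: site02 | Pearson r = 0.313 (p=0.000) | Partial η² = 0.066
Site: site03 | Pearson r = 0.424 (p=0.000) | Partial η² = 0.175
Site: site04 | Pearson r = 0.374 (p=0.000) | Partial η² = 0.119
Site: site18 | Pearson
"""

# Captures the site label, Pearson r, and partial eta squared from one line.
LINE_RE = re.compile(
    r"Site:\s*(?P<site>\S+)\s*\|\s*Pearson r = (?P<r>-?\d+\.\d+)"
    r".*?Partial η² = (?P<eta>-?\d+\.\d+)"
)


def parse_site_results(text):
    """Return a list of (site, r, partial_eta2) for every complete line."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:  # incomplete lines do not match and are dropped
            rows.append((match["site"], float(match["r"]), float(match["eta"])))
    return rows


rows = parse_site_results(log_text)
mean_r = mean(r for _, r, _ in rows)        # unweighted mean across sites (assumption)
mean_eta = mean(eta for _, _, eta in rows)  # same for partial eta squared
print(f"{len(rows)} sites | mean r = {mean_r:.3f} | mean partial η² = {mean_eta:.3f}")
```

Applied to all complete site lines of a run, this gives one row of a summary table, provided the table really uses unweighted site means.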




In [39]:
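# Baseline cohort: add the "between_negative_mean" network statistic to the full
# feature set and rerun cross-validation.
between_neg_mean_baseline_A = compute_one_statistic(df_features=main_baseline_A,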
                                                    df_network=conn_A_baseline_df,
                                                    regions=regions,
                                                    stat_name="between_negative_mean",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)
baseline_A_plus = combine_data(main_baseline_A, between_neg_mean_baseline_A)
results_no_network, models_no_network = run_cross_validation(baseline_A_plus, y_baseline_A, use_network=False)
print_cross_val_results(results_no_network)


100%|██████████| 2409/2409 [00:19<00:00, 120.47it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 275.23it/s]


Site: site02 | Pearson r = 0.397 (p=0.000) | Partial η² = 0.114

Site: site03 | Pearson r = 0.351 (p=0.000) | Partial η² = 0.098

Site: site04 | Pearson r = 0.483 (p=0.000) | Partial η² = 0.231

Site: site05 | Pearson r = 0.524 (p=0.000) | Partial η² = 0.270

Site: site06 | Pearson r = 0.418 (p=0.000) | Partial η² = 0.170

Site: site08 | Pearson r = 0.390 (p=0.002) | Partial η² = -0.134

Site: site09 | Pearson r = 0.438 (p=0.000) | Partial η² = 0.047

Site: site10 | Pearson r = 0.432 (p=0.000) | Partial η² = 0.181

Site: site11 | Pearson r = 0.474 (p=0.000) | Partial η² = 0.203

Site: site12 | Pearson r = 0.393 (p=0.004) | Partial η² = 0.121

Site: site13 | Pearson r = 0.469 (p=0.000) | Partial η² = 0.204

Site: site14 | Pearson r = 0.479 (p=0.000) | Partial η² = 0.225

Site: site15 | Pearson r = 0.642 (p=0.000) | Partial η² = 0.378

Site: site16 | Pearson r = 0.496 (p=0.000) | Partial η² = 0.243

Site: site17 | Pearson r = 0.452 (p=0.000) | Partial η² = 0.129

Site: site18 | Pearson 




In [40]:
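# Follow-up cohort: add the "between_negative_mean" network statistic to the full
# feature set and rerun cross-validation (NaNs are filled with 0.0 before fitting).
between_neg_mean_followup_A = compute_one_statistic(df_features=main_followup_A,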
                                                    df_network=conn_A_followup_df,
                                                    regions=regions,
                                                    stat_name="between_negative_mean",
                                                    mode="all",
                                                    node_order=None,
                                                    matrix_size=418,
                                                    diag_value=0.0)

followup_A_plus = combine_data(main_followup_A, between_neg_mean_followup_A)
results_no_network, models_no_network = run_cross_validation(followup_A_plus.fillna(0.0), y_followup_A,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2409/2409 [00:20<00:00, 118.72it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 270.32it/s]


Site: site02 | Pearson r = 0.312 (p=0.000) | Partial η² = 0.066

Site: site03 | Pearson r = 0.416 (p=0.000) | Partial η² = 0.167

Site: site04 | Pearson r = 0.394 (p=0.000) | Partial η² = 0.138

Site: site05 | Pearson r = 0.564 (p=0.000) | Partial η² = 0.274

Site: site06 | Pearson r = 0.311 (p=0.000) | Partial η² = 0.045

Site: site08 | Pearson r = 0.382 (p=0.002) | Partial η² = -0.073

Site: site09 | Pearson r = 0.349 (p=0.005) | Partial η² = -0.017

Site: site10 | Pearson r = 0.379 (p=0.000) | Partial η² = 0.093

Site: site11 | Pearson r = 0.575 (p=0.000) | Partial η² = 0.329

Site: site12 | Pearson r = 0.524 (p=0.000) | Partial η² = 0.274

Site: site13 | Pearson r = 0.485 (p=0.000) | Partial η² = 0.226

Site: site14 | Pearson r = 0.430 (p=0.000) | Partial η² = 0.179

Site: site15 | Pearson r = 0.702 (p=0.000) | Partial η² = 0.361

Site: site16 | Pearson r = 0.387 (p=0.000) | Partial η² = 0.149

Site: site17 | Pearson r = 0.390 (p=0.000) | Partial η² = 0.012

Site: site18 | Pearson




In [41]:
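# Baseline cohort: add the "segregation_ratio" network statistic to the full
# feature set and rerun cross-validation.
segregation_ratio_baseline_A = compute_one_statistic(df_features=main_baseline_A,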
                                                     df_network=conn_A_baseline_df,
                                                     regions=regions,
                                                     stat_name="segregation_ratio",
                                                     mode="all",
                                                     node_order=None,
                                                     matrix_size=418,
                                                     diag_value=0.0)
baseline_A_plus = combine_data(main_baseline_A, segregation_ratio_baseline_A)
results_no_network, models_no_network = run_cross_validation(baseline_A_plus, y_baseline_A, use_network=False)
print_cross_val_results(results_no_network)


100%|██████████| 2409/2409 [00:25<00:00, 94.07it/s] 


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 134.58it/s]


Site: site02 | Pearson r = 0.410 (p=0.000) | Partial η² = 0.117

Site: site03 | Pearson r = 0.344 (p=0.000) | Partial η² = 0.092

Site: site04 | Pearson r = 0.474 (p=0.000) | Partial η² = 0.223

Site: site05 | Pearson r = 0.511 (p=0.000) | Partial η² = 0.257

Site: site06 | Pearson r = 0.413 (p=0.000) | Partial η² = 0.169

Site: site08 | Pearson r = 0.405 (p=0.001) | Partial η² = -0.127

Site: site09 | Pearson r = 0.455 (p=0.000) | Partial η² = 0.088

Site: site10 | Pearson r = 0.445 (p=0.000) | Partial η² = 0.196

Site: site11 | Pearson r = 0.506 (p=0.000) | Partial η² = 0.238

Site: site12 | Pearson r = 0.404 (p=0.003) | Partial η² = 0.125

Site: site13 | Pearson r = 0.479 (p=0.000) | Partial η² = 0.216

Site: site14 | Pearson r = 0.506 (p=0.000) | Partial η² = 0.246

Site: site15 | Pearson r = 0.684 (p=0.000) | Partial η² = 0.420

Site: site16 | Pearson r = 0.501 (p=0.000) | Partial η² = 0.246

Site: site17 | Pearson r = 0.423 (p=0.000) | Partial η² = 0.073

Site: site18 | Pearson 




In [42]:
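# Follow-up cohort: add the "segregation_ratio" network statistic to the full
# feature set and rerun cross-validation (NaNs are filled with 0.0 before fitting).
segregation_ratio_followup_A = compute_one_statistic(df_features=main_followup_A,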
                                                     df_network=conn_A_followup_df,
                                                     regions=regions,
                                                     stat_name="segregation_ratio",
                                                     mode="all",
                                                     node_order=None,
                                                     matrix_size=418,
                                                     diag_value=0.0)

followup_A_plus = combine_data(main_followup_A, segregation_ratio_followup_A)
results_no_network, models_no_network = run_cross_validation(followup_A_plus.fillna(0.0), y_followup_A,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2409/2409 [00:20<00:00, 116.54it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 253.56it/s]


Site: site02 | Pearson r = 0.319 (p=0.000) | Partial η² = 0.074

Site: site03 | Pearson r = 0.424 (p=0.000) | Partial η² = 0.175

Site: site04 | Pearson r = 0.380 (p=0.000) | Partial η² = 0.122

Site: site05 | Pearson r = 0.525 (p=0.000) | Partial η² = 0.235

Site: site06 | Pearson r = 0.308 (p=0.000) | Partial η² = 0.053

Site: site08 | Pearson r = 0.384 (p=0.002) | Partial η² = -0.068

Site: site09 | Pearson r = 0.353 (p=0.005) | Partial η² = 0.011

Site: site10 | Pearson r = 0.372 (p=0.000) | Partial η² = 0.080

Site: site11 | Pearson r = 0.577 (p=0.000) | Partial η² = 0.330

Site: site12 | Pearson r = 0.521 (p=0.000) | Partial η² = 0.269

Site: site13 | Pearson r = 0.476 (p=0.000) | Partial η² = 0.215

Site: site14 | Pearson r = 0.475 (p=0.000) | Partial η² = 0.221

Site: site15 | Pearson r = 0.709 (p=0.000) | Partial η² = 0.388

Site: site16 | Pearson r = 0.384 (p=0.000) | Partial η² = 0.147

Site: site17 | Pearson r = 0.383 (p=0.000) | Partial η² = -0.020

Site: site18 | Pearson




In [43]:
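# Baseline cohort: add the "segregation_index" network statistic to the full
# feature set and rerun cross-validation.
segregation_index_baseline_A = compute_one_statistic(df_features=main_baseline_A,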
                                                     df_network=conn_A_baseline_df,
                                                     regions=regions,
                                                     stat_name="segregation_index",
                                                     mode="all",
                                                     node_order=None,
                                                     matrix_size=418,
                                                     diag_value=0.0)
baseline_A_plus = combine_data(main_baseline_A, segregation_index_baseline_A)
results_no_network, models_no_network = run_cross_validation(baseline_A_plus, y_baseline_A, use_network=False)
print_cross_val_results(results_no_network)


100%|██████████| 2409/2409 [00:19<00:00, 124.71it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 277.26it/s]


Site: site02 | Pearson r = 0.405 (p=0.000) | Partial η² = 0.112

Site: site03 | Pearson r = 0.342 (p=0.000) | Partial η² = 0.088

Site: site04 | Pearson r = 0.473 (p=0.000) | Partial η² = 0.223

Site: site05 | Pearson r = 0.517 (p=0.000) | Partial η² = 0.263

Site: site06 | Pearson r = 0.416 (p=0.000) | Partial η² = 0.171

Site: site08 | Pearson r = 0.403 (p=0.002) | Partial η² = -0.125

Site: site09 | Pearson r = 0.483 (p=0.000) | Partial η² = 0.096

Site: site10 | Pearson r = 0.443 (p=0.000) | Partial η² = 0.194

Site: site11 | Pearson r = 0.488 (p=0.000) | Partial η² = 0.215

Site: site12 | Pearson r = 0.400 (p=0.003) | Partial η² = 0.116

Site: site13 | Pearson r = 0.481 (p=0.000) | Partial η² = 0.217

Site: site14 | Pearson r = 0.497 (p=0.000) | Partial η² = 0.238

Site: site15 | Pearson r = 0.689 (p=0.000) | Partial η² = 0.430

Site: site16 | Pearson r = 0.494 (p=0.000) | Partial η² = 0.242

Site: site17 | Pearson r = 0.397 (p=0.000) | Partial η² = 0.048

Site: site18 | Pearson 




In [44]:
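# Follow-up cohort: add the "segregation_index" network statistic to the full
# feature set and rerun cross-validation (NaNs are filled with 0.0 before fitting).
segregation_index_followup_A = compute_one_statistic(df_features=main_followup_A,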
                                                     df_network=conn_A_followup_df,
                                                     regions=regions,
                                                     stat_name="segregation_index",
                                                     mode="all",
                                                     node_order=None,
                                                     matrix_size=418,
                                                     diag_value=0.0)

followup_A_plus = combine_data(main_followup_A, segregation_index_followup_A)
results_no_network, models_no_network = run_cross_validation(followup_A_plus.fillna(0.0), y_followup_A,
                                                             use_network=False)
print_cross_val_results(results_no_network)

100%|██████████| 2409/2409 [00:20<00:00, 117.25it/s]


Dropping duplicate columns: ['Subject', 'site_id_l']


Cross-validation progress: 100%|██████████| 19/19 [00:00<00:00, 283.03it/s]


Site: site02 | Pearson r = 0.310 (p=0.000) | Partial η² = 0.068

Site: site03 | Pearson r = 0.423 (p=0.000) | Partial η² = 0.174

Site: site04 | Pearson r = 0.380 (p=0.000) | Partial η² = 0.122

Site: site05 | Pearson r = 0.522 (p=0.000) | Partial η² = 0.235

Site: site06 | Pearson r = 0.295 (p=0.000) | Partial η² = 0.037

Site: site08 | Pearson r = 0.349 (p=0.005) | Partial η² = -0.117

Site: site09 | Pearson r = 0.328 (p=0.009) | Partial η² = -0.016

Site: site10 | Pearson r = 0.370 (p=0.000) | Partial η² = 0.076

Site: site11 | Pearson r = 0.590 (p=0.000) | Partial η² = 0.348

Site: site12 | Pearson r = 0.524 (p=0.000) | Partial η² = 0.274

Site: site13 | Pearson r = 0.458 (p=0.000) | Partial η² = 0.193

Site: site14 | Pearson r = 0.458 (p=0.000) | Partial η² = 0.206

Site: site15 | Pearson r = 0.747 (p=0.000) | Partial η² = 0.428

Site: site16 | Pearson r = 0.380 (p=0.000) | Partial η² = 0.144

Site: site17 | Pearson r = 0.353 (p=0.000) | Partial η² = -0.048

Site: site18 | Pearso
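
The cells above repeat one pattern and vary only `stat_name` and the cohort. A minimal sketch of that pattern as a loop is given below. The three pipeline functions are passed in as arguments because their definitions live earlier in the notebook; `run_statistic_sweep` and the `fake_*` stand-ins are hypothetical names introduced here so the sketch runs on its own. The sketch omits the `.fillna(0.0)` step that the follow-up cells apply before cross-validation.

```python
def run_statistic_sweep(stat_names, cohorts, regions,
                        compute_one_statistic, combine_data, run_cross_validation):
    """Fit 'full features + one network statistic' for every (cohort, statistic)
    pair and return the cross-validation results keyed by that pair."""
    results = {}
    for cohort_name, (features, network, target) in cohorts.items():
        for stat_name in stat_names:
            # Same keyword arguments as the individual cells above.
            stat = compute_one_statistic(
                df_features=features, df_network=network, regions=regions,
                stat_name=stat_name, mode="all", node_order=None,
                matrix_size=418, diag_value=0.0,
            )
            combined = combine_data(features, stat)
            cv_results, _models = run_cross_validation(combined, target,
                                                       use_network=False)
            results[(cohort_name, stat_name)] = cv_results
    return results


# Hypothetical stand-ins that only mimic the call signatures used above.
def fake_compute(**kwargs):
    return {"stat_name": kwargs["stat_name"]}


def fake_combine(features, stat):
    return {"features": features, **stat}


def fake_cv(data, target, use_network):
    return {"stat_name": data["stat_name"], "n": len(target)}, None


# Placeholder cohort tuples: (features, connectome table, target).
cohorts = {
    "baseline": ("main_baseline_A", "conn_A_baseline_df", [0.1, 0.2]),
    "followup": ("main_followup_A", "conn_A_followup_df", [0.3]),
}
stat_names = ["between_negative_mean", "segregation_ratio", "segregation_index"]

sweep = run_statistic_sweep(stat_names, cohorts, regions=None,
                            compute_one_statistic=fake_compute,
                            combine_data=fake_combine,
                            run_cross_validation=fake_cv)
print(sorted(sweep))
```

In the notebook, the stand-ins would be replaced by the real `compute_one_statistic`, `combine_data`, and `run_cross_validation`, and the cohort tuples by the actual data frames and targets.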




#### Option A preprocessed baseline data summary table:
| **Model Description**             | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-----------------------------------|-------------------------|--------------------------------------|
| Full features                     | 0.455                   | 0.170                                |
| Full features + 15 PCs            | 0.458                   | 0.174                                |
| Full features + 100 PCs           | 0.479                   | 0.193                                |
| Full features + pos strength      | 0.451                   | 0.165                                |
| Full features + neg strength      | 0.450                   | 0.166                                |
| Full features + total strength    | 0.451                   | 0.164                                |
| Full features + within positive   | 0.457                   | 0.169                                |
| Full features + between positive  | 0.451                   | 0.166                                |
| Full features + within negative   | 0.450                   | 0.164                                |
| Full features + between negative  | 0.450                   | 0.166                                |
| Full features + segregation ratio | 0.458                   | 0.171                                |
| Full features + segregation index | 0.457                   | 0.170                                |
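
To read the baseline table as gains over the "Full features" reference, the differences can be computed directly from the reported averages. The sketch below uses only the numbers in the table; the variable names are introduced here for illustration, and no uncertainty is attached because the table reports none.

```python
# Rows copied from the baseline summary table:
# model -> (average Pearson's r, average cross-validated partial eta squared)
baseline_table = {
    "Full features": (0.455, 0.170),
    "Full features + 15 PCs": (0.458, 0.174),
    "Full features + 100 PCs": (0.479, 0.193),
    "Full features + pos strength": (0.451, 0.165),
    "Full features + neg strength": (0.450, 0.166),
    "Full features + total strength": (0.451, 0.164),
    "Full features + within positive": (0.457, 0.169),
    "Full features + between positive": (0.451, 0.166),
    "Full features + within negative": (0.450, 0.164),
    "Full features + between negative": (0.450, 0.166),
    "Full features + segregation ratio": (0.458, 0.171),
    "Full features + segregation index": (0.457, 0.170),
}

ref_r, ref_eta = baseline_table["Full features"]

# Difference of each augmented model from the full-features reference.
gains = {
    model: (round(r - ref_r, 3), round(eta - ref_eta, 3))
    for model, (r, eta) in baseline_table.items()
    if model != "Full features"
}

# Print models sorted by gain in partial eta squared, largest first.
for model, (d_r, d_eta) in sorted(gains.items(), key=lambda kv: kv[1][1], reverse=True):
    print(f"{model:<36} Δr = {d_r:+.3f}   Δη² = {d_eta:+.3f}")
```

On these numbers the 100-PC model is the only one whose gain exceeds 0.02 on either metric; every single-statistic model stays within 0.006 of the reference.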


#### Option A preprocessed follow-up data summary table:
| **Model Description**             | **Average Pearson's r** | **Cross-validated Partial η² (avg)** |
|-----------------------------------|-------------------------|--------------------------------------|
| Full features                     | 0.441                   | 0.161                                |
| Full features + 15 PCs            | 0.457                   | 0.178                                |
| Full features + 100 PCs           | 0.487                   | 0.204                                |
| Full features + pos strength      | 0.440                   | 0.163                                |
| Full features + neg strength      | 0.444                   | 0.160                                |
| Full features + total strength    | 0.444                   | 0.166                                |
| Full features + within positive   | 0.441                   | 0.158                                |
| Full features + between positive  | 0.440                   | 0.163                                |
| Full features + within negative   | 0.438                   | 0.156                                |
| Full features + between negative  | 0.444                   | 0.160                                |
| Full features + segregation ratio | 0.446                   | 0.163                                |
| Full features + segregation index | 0.441                   | 0.158                                |
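
One way to compare the two summary tables is to ask which additions raise the average partial η² over "Full features" in both cohorts. The sketch below does this with the values reported in the baseline and follow-up tables; the names are introduced here for illustration. The tables give no uncertainty estimates, so gains of a few thousandths should be read as descriptive only.

```python
# Average cross-validated partial eta squared from the two tables:
# model -> (baseline, follow-up)
eta2 = {
    "Full features": (0.170, 0.161),
    "Full features + 15 PCs": (0.174, 0.178),
    "Full features + 100 PCs": (0.193, 0.204),
    "Full features + pos strength": (0.165, 0.163),
    "Full features + neg strength": (0.166, 0.160),
    "Full features + total strength": (0.164, 0.166),
    "Full features + within positive": (0.169, 0.158),
    "Full features + between positive": (0.166, 0.163),
    "Full features + within negative": (0.164, 0.156),
    "Full features + between negative": (0.166, 0.160),
    "Full features + segregation ratio": (0.171, 0.163),
    "Full features + segregation index": (0.170, 0.158),
}

ref_base, ref_follow = eta2["Full features"]

# Keep models whose rounded gain is strictly positive in both cohorts.
consistent = [
    model
    for model, (base, follow) in eta2.items()
    if model != "Full features"
    and round(base - ref_base, 3) > 0
    and round(follow - ref_follow, 3) > 0
]

for model in consistent:
    base, follow = eta2[model]
    print(f"{model:<36} baseline Δη² = {base - ref_base:+.3f}   "
          f"follow-up Δη² = {follow - ref_follow:+.3f}")
```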