Brain disorder diagnosis notebook rework #13

zaRizk7 · 2025-06-28T02:11:45Z

This PR introduces an almost complete rework of the brain disorder diagnosis tutorial notebook and reorganizes how the book's resources (e.g., logo and icon) are stored.

Notebook

Restructures the notebook completely to follow the agreed structure.
Adds an introduction to the problem and to multi-site data with ABIDE, and explains why integrating the data remains a challenge, particularly for ML.
Adds a problem formulation and solution similar to those in Kunda et al. (2022).
Adds an objective that follows the produced flowchart, which describes the modeling pipeline and the PyKale (or other library) module associated with each step.
Adds base.yml files for both lpgo and skf evaluation.
Includes small config subsets that can be modified at runtime.
Adds quizzes for several sections.
Adds visualizations of functional connectivity and phenotypic data distributions (before and after preprocessing).
Adds an illustration comparing lpgo and skf.
Includes the skf experiment result for comparison with the default lpgo run used for the experiment, to highlight the improvements when applying MIDA.
Adds extra tasks for exploration.
Includes quiz solutions at the end.
Includes a tmi2022.yml file to replicate the param grid found in the source code of Kunda et al. (2022).
Adds estimated Colab runtimes for package installation and training.
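
For readers unfamiliar with the two evaluation schemes compared in the notebook, here is a minimal standard-library sketch. It assumes that lpgo stands for leave-p-groups-out (whole sites held out for testing) and skf for stratified k-fold (class ratios preserved in each fold); the function names and toy data are hypothetical and are not the notebook's or PyKale's implementation.

```python
from collections import defaultdict
from itertools import combinations


def leave_p_groups_out(groups, p=1):
    """Yield (train_idx, test_idx) with every combination of p groups held out."""
    unique_groups = sorted(set(groups))
    for held_out in combinations(unique_groups, p):
        test = [i for i, g in enumerate(groups) if g in held_out]
        train = [i for i, g in enumerate(groups) if g not in held_out]
        yield train, test


def stratified_k_fold(labels, k=2):
    """Yield (train_idx, test_idx) with labels spread evenly across k folds."""
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_label.values():
        # Deal each class round-robin so every fold gets a similar class ratio.
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for j, fold in enumerate(folds) if j != f for i in fold)
        yield train, test


# Toy data (assumed): six subjects from three sites, binary diagnosis labels.
sites = ["NYU", "NYU", "UM", "UM", "USM", "USM"]
labels = [0, 1, 0, 1, 0, 1]

lpgo_splits = list(leave_p_groups_out(sites, p=1))
skf_splits = list(stratified_k_fold(labels, k=2))
```

In the lpgo splits no site appears in both the training and test indices, so the model is always tested on unseen sites; in the skf splits subjects from the same site can land on both sides, which is typically the easier setting.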

Helper Functions

Includes visualization.py to help visualize phenotypic data distributions and FCs.
Adds parse_param_grid to parsing.py to work around the yacs limitation on dynamically defining param_grid.
Adds top_k_sites to load_data in data.py to sample the k largest sites.
Adds top_k_sites to config.py.
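
As a rough illustration of what sampling the k largest sites could look like, the sketch below keeps only subjects from the k sites with the most subjects. The helper name select_top_k_sites and the toy site labels are assumptions made for this example; the actual load_data in data.py may work differently.

```python
from collections import Counter


def select_top_k_sites(sites, k):
    """Return indices of subjects belonging to the k sites with the most subjects."""
    counts = Counter(sites)
    # most_common(k) returns the k most frequent sites; ties keep first-seen order.
    top_sites = {site for site, _ in counts.most_common(k)}
    return [i for i, site in enumerate(sites) if site in top_sites]


# Toy site labels (assumed), one entry per subject.
sites = ["A", "B", "A", "C", "B", "A", "D"]
kept = select_top_k_sites(sites, k=2)
```

Here sites A and B are the two largest, so only the subjects at positions 0, 1, 2, 4, and 5 are kept.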

Book Project Structure Reorganization

Adds the favicon.ico icon.
Reorganizes the logo and icon by moving them to resources.
Includes sphinx_exercise to help generate RST-style exercises.
Includes torch-geometric and sphinx_exercise in requirements.py.
Updates the nilearn version.

Note

Once merged, the branch specified in the notebook needs to be changed so that it clones the main branch.
The figures will only render properly in jupyter-book; the Colab runtime will not render them because it does not support RST rendering.

Copilot

Pull Request Overview

This PR introduces a comprehensive rework of the brain disorder diagnosis tutorial notebook along with several updates in helper functions, configuration, and experiment setup files.

Restructures the diagnostic notebook and adds new visualization and helper functions.
Updates preprocessing logic (including a change in handedness mapping) and extends parameter parsing.
Revises YAML experiment configuration files and project config for improved resource management.

Reviewed Changes

Copilot reviewed 10 out of 17 changed files in this pull request and generated 2 comments.

File	Description
tutorials/brain-disorder-diagnosis/visualization.py	Adds functions to plot phenotypic distributions and functional connectivity matrices.
tutorials/brain-disorder-diagnosis/preprocess.py	Updates phenotypic data processing, including a change in handedness mapping and an option to one-hot encode categories.
tutorials/brain-disorder-diagnosis/parsing.py	Introduces the parse_param_grid helper function for dynamic hyperparameter grid construction.
tutorials/brain-disorder-diagnosis/experiments/skf/tmi2022.yml	Defines configuration for skf experiments including cross-validation and parameter grid settings.
tutorials/brain-disorder-diagnosis/experiments/skf/base.yml	Provides baseline settings for skf experiments including dataset and trainer configurations.
tutorials/brain-disorder-diagnosis/experiments/lpgo/tmi2022.yml	Sets up configuration for lpgo experiments with updated parameter grid values.
tutorials/brain-disorder-diagnosis/experiments/lpgo/base.yml	Updates baseline experiment settings for the lpgo pipeline, including an increased NUM_SEARCH_ITER value.
tutorials/brain-disorder-diagnosis/data.py	Adds support for selecting the top K sites when loading data to filter subjects by site frequency.
tutorials/brain-disorder-diagnosis/config.py	Renames and adds configuration keys (e.g., DATA_DIR, TOP_K_SITES) for improved resource handling.
_config.yml	Updates project configuration with new logo and favicon paths along with additional Sphinx extensions.

Comments suppressed due to low confidence (2)

tutorials/brain-disorder-diagnosis/preprocess.py:36

The mapping for handedness has changed from 'LEFT' to 'RIGHT' for both '-9999' and np.nan. Verify that this change aligns with the intended diagnostic criteria.

        "-9999": "RIGHT",
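
To make the behavior under review concrete, here is a small sketch of a handedness normalizer in which both the "-9999" sentinel and NaN resolve to "RIGHT". Only the "-9999" to "RIGHT" entry comes from the diff line above; the other keys, the function name normalize_handedness, and the fallback for unknown codes are assumptions for illustration and may not match preprocess.py.

```python
import math

# Hypothetical mapping: only the "-9999" -> "RIGHT" entry is taken from the diff;
# the remaining keys are assumed for illustration.
HANDEDNESS_MAP = {
    "R": "RIGHT",
    "L": "LEFT",
    "-9999": "RIGHT",
}


def normalize_handedness(value, default="RIGHT"):
    """Map a raw handedness code to a category, using default when it is missing."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return default
    # Unknown codes also fall back to the default (a design choice of this sketch).
    return HANDEDNESS_MAP.get(str(value).strip(), default)
```

Treating missing values as the majority category is one common imputation choice; whether it is appropriate here is exactly what the review comment asks the author to verify.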

tutorials/brain-disorder-diagnosis/experiments/lpgo/base.yml:10

[nitpick] Increasing NUM_SEARCH_ITER to 100 may significantly impact computation time; confirm that this value is optimal for the intended performance requirements.

  NUM_SEARCH_ITER: 100

Copilot · 2025-06-28T02:13:23Z

tutorials/brain-disorder-diagnosis/experiments/skf/tmi2022.yml

+    - [domain_adapter__num_components, [50, 150, 300]]
+    - [domain_adapter__mu, [0.25, 0.5, 0.75, 1.0]]
+    - [domain_adapter__ignore_y, [True]]
+    - [domain_adapter__augment, [pre, post, null]]


Consider quoting the values 'pre', 'post', and 'null' in the parameter grid to ensure they are parsed as strings in YAML.

Suggested change

-    - [domain_adapter__augment, [pre, post, null]]
+    - [domain_adapter__augment, ['pre', 'post', 'null']]

zaRizk7 · 2025-06-28

Using quoted 'null' might cause yacs to parse it as a string, whereas the intended value is null (None).
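
The list-of-pairs layout in the diff above can be converted into a dictionary-style parameter grid along the following lines. This is only a sketch: the name parse_param_grid comes from the PR, but the body below is an assumed implementation, not the one in parsing.py. The pairs variable stands in for what a YAML loader would return, where unquoted null becomes Python's None and a quoted 'null' would remain the string "null".

```python
from itertools import product


def parse_param_grid(pairs):
    """Convert a list of [name, values] pairs into a {name: values} dict."""
    grid = {}
    for name, values in pairs:
        if name in grid:
            raise ValueError(f"duplicate parameter name: {name}")
        grid[name] = list(values)
    return grid


# Assumed loader output for the four entries shown in the diff:
# unquoted null is loaded as None, whereas 'null' would stay a string.
pairs = [
    ["domain_adapter__num_components", [50, 150, 300]],
    ["domain_adapter__mu", [0.25, 0.5, 0.75, 1.0]],
    ["domain_adapter__ignore_y", [True]],
    ["domain_adapter__augment", ["pre", "post", None]],
]

param_grid = parse_param_grid(pairs)

# Number of distinct settings these four entries span (3 * 4 * 1 * 3).
n_combinations = len(list(product(*param_grid.values())))
```

Keeping None in the augment list lets downstream code test for "no augmentation" with an identity check, which would not work if the value arrived as the string "null".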

Copilot · 2025-06-28T02:13:23Z

tutorials/brain-disorder-diagnosis/experiments/lpgo/tmi2022.yml

+    - [domain_adapter__num_components, [50, 150, 300]]
+    - [domain_adapter__mu, [0.25, 0.5, 0.75, 1.0]]
+    - [domain_adapter__ignore_y, [True]]
+    - [domain_adapter__augment, [pre, post, null]]


Consider quoting the values 'pre', 'post', and 'null' in the parameter grid to ensure they are parsed as strings in YAML.

Suggested change

-    - [domain_adapter__augment, [pre, post, null]]
+    - [domain_adapter__augment, ['pre', 'post', 'null']]

zaRizk7 · 2025-06-28

Using quoted 'null' might cause yacs to parse it as a string, whereas the intended value is null (None).

zaRizk7 added 30 commits June 5, 2025 00:15

cast site to numpy

dcf93a4

add num_solver_iter and rename extension

f0d92aa

update notebook objectives and trainer imports

7047c23

update base exp yaml

092bf9d

use skf by default

448c359

add handle for google colab runtime

1e32ba2

update output

cf9e370

reduce preprocess_phenotypic_data functionality and use polars to replace pandas

bff672b

use polars to replace pandas

72161f0

add manifest and load_data function to fetch data from gdrive

11da26e

update default cfg and base exp yml

d22141d

update notebook contents

b5b5923

add polars and gdown to req

5bbcc39

change nilearn req

237691d

remove param_validation

afb3b0a

add handle to prioritize site-packages for colab

fb65616

use single core only

8567ddd

update pre_dispatch config

ac4a5e5

add --user to handle site-packages

7a175ba

use default n_jobs

fe52be1

fallback to pandas

bebe42f

update config and base yml

79f0336

remove polars

d93784d

use tangent-pearson by default

1ad406b

remove fc cfg

544fb50

reduce search iter

63d78c4

update notebook with new cfg

d41007a

Merge branch 'main' into brain-decoding

dba1953

revert to use param_validation for load_data

df96558

fix pydoc typo

8688472

zaRizk7 added 24 commits June 26, 2025 12:16

Merge branch 'main' of https://github.com/pykale/embc-mmai25 into brain-decoding

cef1494

include top_k_site args, parse_param_grid function, select coef from best model, and pyg wheel during pip install

e6b4b7c

include flowchart

234a74d

update interpretation with new base config

17d7b7e

include sphinx-exercise as requirements to format exercise question

9e65f6c

correct mapping for missing handedness

2cf6674

add one_hot_encode as optional arguments for preprocess_phenotype_data

cfd95f9

improve error message clarity

84f3948

add visualization code for phenotype distribution and fc and figure for comparing eval splits

5267704

include rst and sphinx style captioning and exercise format

030db01

reorganize config yml

1ed4fdb

add additional param_grid for tmi2022

aa4cc8e

reorganize resources like logo and add pykale icon

393cca2

include abide logo for notebook intro

090551f

add hue grouping for phenotype distribution and upper triangular matrix for fc visualization

ee0a647

replace path to data_dir to standardize with load_data args

ebd3b96

update logo directory

3f9d9d2

complete reorganization of notebook structure and add extra contents including quizzes, extra tasks, and more

d9dc5b3

use latest nilearn version for pip

08ed488

updates indexing for selecting subjects for visualizing fc

760b240

add description about compile_results

15bccdf

use top-10 sites for base config

9ec41b6

updates notebook content

3dd3b92

- update runtime output with top-10 sites
- update the plot format and description
- add estimated runtime for long running cells
- update interpretation and answer

adds hue order

db83128

zaRizk7 requested a review from Copilot June 28, 2025 02:12

Copilot AI reviewed Jun 28, 2025

swap the abide logo order

54f522f

shuo-zhou approved these changes Jun 28, 2025

shuo-zhou merged commit 7b95e71 into main Jun 28, 2025
1 check passed

shuo-zhou deleted the brain-decoding branch June 28, 2025 20:41
