Add Heart tutorial #5

Mdnaimulislam · 2025-06-19T11:07:36Z

Adds a tutorial for cardiac hemodynamics assessment using PyKale with multimodal low-cost CXR and ECG modalities.

The notebook is structured as follows:

Introduction and Objective
Setup (Warning Suppression, Required Packages, and Helper Functions)
Pretrain Data Loading and Preprocessing
Model Definition
Pretraining
Finetune Data Loading and Preprocessing
Finetuning and Evaluation
Multimodal Interpretation
Adds a description for each section.

Copilot

Pull Request Overview

Adds a new tutorial for cardiac hemodynamics assessment using multimodal CXR and ECG data, including utilities, configs, and an interpretation script.

Introduces remap_model_parameters.py to align pretrained checkpoint keys with renamed parameters
Adds separate pretraining and finetuning configuration modules plus corresponding experiment YAMLs
Implements interpret.py for integrated‐gradients attribution on ECG and CXR, and updates the tutorial Table of Contents

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.

File	Description
tutorials/cardiac-hemodynamics-assesment/remap_model_parameters.py	Utility to remap pretrained model parameter names
tutorials/cardiac-hemodynamics-assesment/pretraining_config.py	Default configuration for pretraining
tutorials/cardiac-hemodynamics-assesment/interpret.py	Script for multimodal ECG+CXR attribution
tutorials/cardiac-hemodynamics-assesment/finetune_config.py	Default configuration for finetuning
tutorials/cardiac-hemodynamics-assesment/experiments/pretraining_base.yml	Base YAML for pretraining experiments
tutorials/cardiac-hemodynamics-assesment/experiments/finetune_base.yml	Base YAML for finetuning experiments
_toc.yml	Added cardiac-hemodynamics-assesment notebook entry

Comments suppressed due to low confidence (2)

_toc.yml:26

The folder name 'cardiac-hemodynamics-assesment' is misspelled; consider renaming it to 'cardiac-hemodynamics-assessment' for consistency.

  - file: tutorials/cardiac-hemodynamics-assesment/notebook

tutorials/cardiac-hemodynamics-assesment/remap_model_parameters.py:4

Add unit tests for remap_state_dict_keys to verify that each mapping rule correctly renames all expected keys.

def remap_state_dict_keys(state_dict):
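
A minimal sketch of what such a test could look like. The actual mapping rules in remap_model_parameters.py are not shown in this thread, so the prefixes in HYPOTHETICAL_RULES, the extra `rules` argument, and the body of the stand-in function are all assumptions made for illustration only.

```python
import unittest

# Hypothetical rename rules (old prefix -> new prefix). The real rules live in
# remap_model_parameters.py and are not shown in this thread.
HYPOTHETICAL_RULES = {
    "image_encoder.": "cxr_encoder.",
    "signal_encoder.": "ecg_encoder.",
}


def remap_state_dict_keys(state_dict, rules=HYPOTHETICAL_RULES):
    """Stand-in for the function under review: rename keys by prefix."""
    remapped = {}
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in rules.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        remapped[new_key] = value
    return remapped


class TestRemapStateDictKeys(unittest.TestCase):
    def test_each_rule_renames_its_keys(self):
        # One sub-test per rule, so a failing rule is reported by name.
        for old_prefix, new_prefix in HYPOTHETICAL_RULES.items():
            with self.subTest(rule=old_prefix):
                result = remap_state_dict_keys({old_prefix + "weight": 1})
                self.assertEqual(result, {new_prefix + "weight": 1})

    def test_unmatched_keys_are_untouched(self):
        state = {"classifier.bias": 0}
        self.assertEqual(remap_state_dict_keys(state), state)
```

Iterating over the rule table keeps the test in step with the rules: adding a rule automatically adds a check for it.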

tutorials/cardiac-hemodynamics-assesment/remap_model_parameters.py

Copilot · 2025-06-23T08:21:45Z

tutorials/cardiac-hemodynamics-assesment/interpret.py

+    # --- Prediction ---
+    last_fold_model.eval()
+    with torch.no_grad():
+        logits = last_fold_model(xray_image, ecg_waveform)


The model forward pass uses the raw ecg_waveform, but IntegratedGradients is applied to ecg_smoothed_tensor. This mismatch can produce incorrect attributions; use the same input tensor in both the prediction and attribution steps.

Suggested change

-        logits = last_fold_model(xray_image, ecg_waveform)
+        logits = last_fold_model(xray_image, ecg_smoothed_tensor)

This forward pass is for prediction only; the smoothing is applied to improve the visualization. Rejecting this Copilot suggestion.
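
The separation described in this reply can be sketched as follows. This is only an illustration: the smoothing method used in interpret.py is not shown in the thread, so the moving average, the window size, and the synthetic signal are assumptions.

```python
import numpy as np


def moving_average(signal, window=5):
    """Smooth a 1-D signal with a centred moving average (same length out)."""
    kernel = np.ones(window) / window
    return np.convolve(signal, kernel, mode="same")


# Synthetic noisy trace standing in for one ECG lead (not real data).
rng = np.random.default_rng(0)
raw_ecg = np.sin(np.linspace(0, 4 * np.pi, 200)) + 0.2 * rng.standard_normal(200)

model_input = raw_ecg                    # unchanged signal, used for prediction
display_trace = moving_average(raw_ecg)  # smoothed copy, used only for plotting
```

The model input is never replaced by the smoothed copy; only the plotted trace changes.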

Copilot · 2025-06-23T08:21:45Z

tutorials/cardiac-hemodynamics-assesment/interpret.py

+    full_time = np.arange(0, full_length) / sampling_rate / lead_number
+    important_indices_full = np.where(
+        norm_attributions_ecg[:full_length] >= ecg_threshold
+    )[0]
+
+    zoom_start = int(zoom_range[0] * 6000)
+    zoom_end = int(zoom_range[1] * 6000)
+    zoom_time = np.arange(zoom_start, zoom_end) / sampling_rate / lead_number
+    segment_ecg_waveform = ecg_waveform_np[zoom_start:zoom_end]
+    segment_attributions = norm_attributions_ecg[zoom_start:zoom_end]
+    important_indices_zoom = np.where(segment_attributions >= ecg_threshold)[0]
+    zoom_start_sec = zoom_start / sampling_rate / lead_number
+    zoom_end_sec = zoom_end / sampling_rate / lead_number


Dividing time indices by lead_number skews the time axis. Time should be computed as np.arange(full_length) / sampling_rate without dividing by the number of leads.

Suggested change

-    full_time = np.arange(0, full_length) / sampling_rate / lead_number
+    full_time = np.arange(0, full_length) / sampling_rate
     important_indices_full = np.where(
         norm_attributions_ecg[:full_length] >= ecg_threshold
     )[0]

     zoom_start = int(zoom_range[0] * 6000)
     zoom_end = int(zoom_range[1] * 6000)
-    zoom_time = np.arange(zoom_start, zoom_end) / sampling_rate / lead_number
+    zoom_time = np.arange(zoom_start, zoom_end) / sampling_rate
     segment_ecg_waveform = ecg_waveform_np[zoom_start:zoom_end]
     segment_attributions = norm_attributions_ecg[zoom_start:zoom_end]
     important_indices_zoom = np.where(segment_attributions >= ecg_threshold)[0]
-    zoom_start_sec = zoom_start / sampling_rate / lead_number
-    zoom_end_sec = zoom_end / sampling_rate / lead_number
+    zoom_start_sec = zoom_start / sampling_rate
+    zoom_end_sec = zoom_end / sampling_rate

Rejecting this suggestion as well: we divide by the number of leads to get the values in seconds.
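
One way to read this reply is that the ECG is stored as a single flattened array with the leads interleaved, so that one second of recording spans sampling_rate × lead_number array entries. The sketch below works through that reading. The 500 Hz sampling rate, the 10 s duration, and the interleaved layout are assumptions that the thread does not confirm; only the 12 leads come from the YAML under review.

```python
import numpy as np

sampling_rate = 500  # Hz per lead (assumed, not stated in the thread)
num_leads = 12       # matches LEAD_NUMBER: 12 in finetune_base.yml
duration_s = 10      # recording length in seconds (assumed)

# Assumed layout: all leads interleaved into one flat 1-D array.
flat_length = sampling_rate * num_leads * duration_s
idx = np.arange(flat_length)

time_with_leads = idx / sampling_rate / num_leads  # author's formula
time_without_leads = idx / sampling_rate           # Copilot's formula

# Flat entries per second of recording under this layout.
samples_per_second = sampling_rate * num_leads
```

Under these assumptions samples_per_second is 6000, the same constant that the reviewed code uses to convert zoom_range to indices, and flat index 6000 maps to 1.0 s with the author's formula but to 12.0 s without the division by the number of leads. If the array instead held a single lead, Copilot's formula would be the correct one.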

Copilot · 2025-06-23T08:21:45Z

tutorials/cardiac-hemodynamics-assesment/interpret.py

+    # Gather all batches (as in your code)
+    batches = list(last_val_loader)
+    all_xray_images, all_ecg_waveforms, all_labels = [
+        torch.cat(items) for items in zip(*batches)
+    ]
+
+    # --- Select Sample ---
+    xray_image = (
+        all_xray_images[sample_idx]
+        .unsqueeze(0)
+        .to(next(last_fold_model.parameters()).device)
+    )
+    ecg_waveform = (
+        all_ecg_waveforms[sample_idx]
+        .unsqueeze(0)
+        .to(next(last_fold_model.parameters()).device)
+    )
+    label = all_labels[sample_idx].item()


Concatenating all validation batches into memory may be expensive for large datasets. Consider indexing directly into the DataLoader's dataset or loading only the required sample to reduce memory usage.

Suggested change

-    # Gather all batches (as in your code)
-    batches = list(last_val_loader)
-    all_xray_images, all_ecg_waveforms, all_labels = [
-        torch.cat(items) for items in zip(*batches)
-    ]
-
-    # --- Select Sample ---
-    xray_image = (
-        all_xray_images[sample_idx]
-        .unsqueeze(0)
-        .to(next(last_fold_model.parameters()).device)
-    )
-    ecg_waveform = (
-        all_ecg_waveforms[sample_idx]
-        .unsqueeze(0)
-        .to(next(last_fold_model.parameters()).device)
-    )
-    label = all_labels[sample_idx].item()
+    # --- Select Sample ---
+    xray_image, ecg_waveform, label = last_val_loader.dataset[sample_idx]
+    xray_image = (
+        xray_image.unsqueeze(0)
+        .to(next(last_fold_model.parameters()).device)
+    )
+    ecg_waveform = (
+        ecg_waveform.unsqueeze(0)
+        .to(next(last_fold_model.parameters()).device)
+    )
+    label = label.item()

No need for now. Rejecting it.
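
For reference, the trade-off Copilot raised can be illustrated without PyTorch. The class below is a hypothetical stand-in for a map-style dataset (anything with `__len__` and `__getitem__`) that counts how many samples it loads; none of these names come from the PR.

```python
class CountingDataset:
    """Map-style dataset stand-in that counts how many samples were loaded."""

    def __init__(self, size):
        self.size = size
        self.loads = 0

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        if not 0 <= index < self.size:
            raise IndexError(index)
        self.loads += 1
        # Placeholder (xray, ecg, label) triple; real data would be tensors.
        return (f"xray_{index}", f"ecg_{index}", index % 2)


def pick_by_gathering(dataset, sample_idx):
    """Load every sample first, then index (mirrors list(loader) + cat)."""
    everything = [dataset[i] for i in range(len(dataset))]
    return everything[sample_idx]


def pick_directly(dataset, sample_idx):
    """Load only the requested sample."""
    return dataset[sample_idx]
```

Both functions return the same sample, but the first loads every item and the second loads one. With a real DataLoader the two only agree when the loader does not shuffle, since shuffling changes the order in which batches are gathered.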

shuo-zhou

Please address my and Copilot's comments.

shuo-zhou · 2025-06-22T23:07:20Z

tutorials/cardiac-hemodynamics-assesment/interpret.py

+from scipy.ndimage import binary_dilation
+
+
+def multimodal_ecg_cxr_attribution(


Add a docstring. Consider how to integrate this function into kale in the future.

shuo-zhou · 2025-06-22T23:12:14Z

tutorials/cardiac-hemodynamics-assesment/pretraining_config.py

Can finetune_config.py and pretraining_config.py be merged into one? You can keep the two files as they are now and seek feedback from other team members and Pete/Kelly later.

In my opinion, keeping them separate is better. Pretraining and finetuning are separate operations, and separate files make it easier for participants to distinguish the arguments for each.

shuo-zhou · 2025-06-22T23:13:28Z

tutorials/cardiac-hemodynamics-assesment/experiments/finetune_base.yml

+  EPOCHS: 10
+  LR: 0.001
+  HIDDEN_DIM: 128
+  NUM_CLASSES: 2


These entries are not needed if they do not differ from the default values in *_config.py. Please check all entries.

Keeping LR and EPOCHS so that participants can vary these parameters and see the difference in performance.
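
The check requested above could be scripted along the following lines. This is a sketch using plain dictionaries: the override values are the four entries from the YAML excerpt, while the default values, the `keep` argument, and the function name are hypothetical, since the contents of finetune_config.py are not shown in this thread.

```python
def redundant_overrides(defaults, overrides, keep=()):
    """Return override keys that repeat the default value and are not kept on purpose."""
    return sorted(
        key
        for key, value in overrides.items()
        if key in defaults and defaults[key] == value and key not in keep
    )


# Hypothetical defaults standing in for the values in finetune_config.py.
defaults = {"EPOCHS": 10, "LR": 0.001, "HIDDEN_DIM": 128, "NUM_CLASSES": 2}

# The four entries quoted from finetune_base.yml in the review comment.
overrides = {"EPOCHS": 10, "LR": 0.001, "HIDDEN_DIM": 128, "NUM_CLASSES": 2}

# EPOCHS and LR stay in the YAML deliberately as tutorial knobs.
to_remove = redundant_overrides(defaults, overrides, keep=("EPOCHS", "LR"))
```

With these assumed defaults, to_remove lists HIDDEN_DIM and NUM_CLASSES, leaving only the two parameters that participants are meant to experiment with.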

shuo-zhou · 2025-06-22T23:14:26Z

tutorials/cardiac-hemodynamics-assesment/notebook.ipynb

Fix the sys.stdout AttributeError and remove all empty cells

shuo-zhou · 2025-06-23T08:24:32Z

tutorials/cardiac-hemodynamics-assesment/experiments/finetune_base.yml

+  ZOOM_RANGE: [3, 3.5]
+  ECG_THRESHOLD: 0.7
+  CXR_THRESHOLD: 0.7
+  LEAD_NUMBER: 12


What is the difference between NUM_LEADS and LEAD_NUMBER? Would LEAD_INDEX be a better name?

Removed LEAD_NUMBER.

wenruifan

LGTM

Mdnaimulislam added 3 commits June 19, 2025 11:34

Added heart tutorial

d9d6ec6

Refine

0eee90b

pre-commit fix

1928b7c

Mdnaimulislam requested a review from shuo-zhou June 19, 2025 11:07

Mdnaimulislam self-assigned this Jun 19, 2025

Mdnaimulislam added the enhancement New feature or request label Jun 19, 2025

pre-commit fix

400412c

shuo-zhou requested a review from Copilot June 23, 2025 08:19

Copilot AI reviewed Jun 23, 2025

shuo-zhou requested changes Jun 23, 2025

Mdnaimulislam added 2 commits June 24, 2025 11:16

Merge branch 'main' into heart-tutorial

59782e4

# Conflicts: # _toc.yml

Fixed minor issues

40bdc1d

Mdnaimulislam enabled auto-merge (squash) June 24, 2025 10:52

Mdnaimulislam requested review from shuo-zhou and wenruifan June 24, 2025 10:52

wenruifan approved these changes Jun 24, 2025

shuo-zhou approved these changes Jun 24, 2025

Mdnaimulislam merged commit 8af0b79 into main Jun 24, 2025
1 check passed

4 participants

Mdnaimulislam commented Jun 19, 2025 •

edited

Loading