This project has been split into two separate stages so you only have to run the heavy data preprocessing once!
Stage 1 (preprocessing) converts the raw CSV signals into images, applies edge detection, and builds the PyTorch Geometric graph datasets.
- Upload this code: Upload the `GraphECGNet_Kaggle` folder as a Kaggle Dataset (e.g., `graphecgnet-code`).
- Create Notebook 1: Create a new Kaggle Notebook.
- Attach Datasets: Attach your code dataset and the raw CSV dataset (`mondejar/mitbih-database`).
- Run Preprocessing:
# Cell 1: Copy code to writable directory
import shutil, sys
import os
# If Kaggle nested your folder, copy from the inner folder
source_dir = '/kaggle/input/graphecgnet-code/GraphECGNet_Kaggle'
# If it's not nested, uncomment the line below:
# source_dir = '/kaggle/input/graphecgnet-code'
shutil.copytree(source_dir, '/kaggle/working/GraphECGNet_Kaggle', dirs_exist_ok=True)
sys.path.insert(0, '/kaggle/working/GraphECGNet_Kaggle')
# Cell 2: Run the Preprocessing Pipeline
!python /kaggle/working/GraphECGNet_Kaggle/run_preprocessing.py \
--data_path /kaggle/input/mondejar-mitbih-database  # Adjust to actual dataset path

- Save Output: When it finishes, you will see a `/kaggle/working/GraphData` folder. Click "Save Version" -> "Save & Run All (Commit)". Then go to the notebook output and create a New Dataset from the `GraphData` output (e.g., name it `graphecgnet-preprocessed-data`).
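The nested-versus-flat choice in Cell 1 can be made automatically instead of by commenting lines in and out. The following is a minimal sketch, not part of the repository: `resolve_source_dir` is a hypothetical helper name, and it assumes the only two layouts are the ones described in the Cell 1 comments.

```python
from pathlib import Path


def resolve_source_dir(dataset_root, folder_name="GraphECGNet_Kaggle"):
    """Return the nested code folder if Kaggle kept it, else the dataset root.

    Hypothetical helper: covers only the two layouts mentioned in Cell 1.
    """
    root = Path(dataset_root)
    nested = root / folder_name
    # Prefer the inner folder when the upload was nested one level deep.
    return nested if nested.is_dir() else root


# Example use in Cell 1 (paths as in the steps above):
# import shutil, sys
# source_dir = resolve_source_dir('/kaggle/input/graphecgnet-code')
# shutil.copytree(source_dir, '/kaggle/working/GraphECGNet_Kaggle', dirs_exist_ok=True)
# sys.path.insert(0, '/kaggle/working/GraphECGNet_Kaggle')
```

The same helper would work unchanged in Cell 1 of the training notebook, since both notebooks copy from the same code dataset.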
Stage 2 (training) skips all preprocessing and trains the GNN directly on your saved graphs. You can restart this notebook as often as you like to change the model architecture, the number of epochs, and so on.
- Create Notebook 2: Create a new notebook for training.
- Attach Datasets: Attach your code dataset (`graphecgnet-code`) AND your newly created preprocessed data dataset (`graphecgnet-preprocessed-data`).
- Run Training:
# Cell 1: Copy code to writable directory
import shutil, sys
import os
# If Kaggle nested your folder (e.g. input/dataset/GraphECGNet_Kaggle), copy from the inner folder
source_dir = '/kaggle/input/graphecgnet-code/GraphECGNet_Kaggle'
# If it's not nested, uncomment the line below:
# source_dir = '/kaggle/input/graphecgnet-code'
shutil.copytree(source_dir, '/kaggle/working/GraphECGNet_Kaggle', dirs_exist_ok=True)
sys.path.insert(0, '/kaggle/working/GraphECGNet_Kaggle')
# Cell 2: Run the Training Pipeline
!python /kaggle/working/GraphECGNet_Kaggle/run_training.py \
--graph_data /kaggle/input/graphecgnet-preprocessed-data/GraphData \
--epochs 100

When you want to change the model architecture (for example adding layers, changing channels, or switching from GCN to GAT), you only need to:
- Update `models.py` or `main.py` on your computer.
- Update the `graphecgnet-code` dataset on Kaggle with the new files.
- Restart Notebook 2 (Training) and run it. You never have to wait for the images to generate again.
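Because Notebook 2 is restarted often, a quick pre-flight check that the preprocessed dataset is attached where `--graph_data` points can save a failed run. This is a rough sketch only: `check_graph_data` is a hypothetical helper, and since the internal layout of `GraphData` is not described here, it checks only that the folder exists and is not empty.

```python
from pathlib import Path


def check_graph_data(graph_dir):
    """Return the sorted entry names in graph_dir; raise if missing or empty.

    Hypothetical helper: it makes no assumption about what the entries are.
    """
    path = Path(graph_dir)
    if not path.is_dir():
        raise FileNotFoundError(
            f"{path} not found; is the preprocessed dataset attached?"
        )
    entries = sorted(p.name for p in path.iterdir())
    if not entries:
        raise FileNotFoundError(f"{path} is empty; re-check the Stage 1 output")
    return entries


# Example use before Cell 2 of the training notebook:
# print(check_graph_data('/kaggle/input/graphecgnet-preprocessed-data/GraphData'))
```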
If you want to run things manually instead of using `run_preprocessing.py` and `run_training.py`:
- `signal2image.py`: `--data_path` (Input CSVs), `--output_path` (Output Images)
- `edge_transformation.py`: `--source_base` (Input Images), `--dest_base` (Output Edges)
- `Graph_construction.py`: `--edge_base` (Input Edges), `--output_base` (Output Graphs), `--dataset_name`
- `main.py`: `--root` (Input Graph Data), `--epochs`, `--batch_size`, `--lr`, `--layer_name`
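The four manual steps chain together, with each script's output folder feeding the next script's input. The sketch below shows one way to wire them up using only the flags listed above. The helper names, the intermediate folder names (`Images`, `Edges`, `GraphData`), and the example dataset name are assumptions for illustration, as is the idea that `--root` should point at the `--output_base` folder.

```python
import subprocess
import sys
from pathlib import Path


def build_manual_commands(code_dir, csv_dir, work_dir, dataset_name, epochs=100):
    """Build the argv lists for the four scripts, in pipeline order."""
    code = Path(code_dir)
    work = Path(work_dir)
    images = work / "Images"     # assumed intermediate folder name
    edges = work / "Edges"       # assumed intermediate folder name
    graphs = work / "GraphData"  # assumed to match the Stage 1 output name
    py = sys.executable
    return [
        [py, str(code / "signal2image.py"),
         "--data_path", str(csv_dir), "--output_path", str(images)],
        [py, str(code / "edge_transformation.py"),
         "--source_base", str(images), "--dest_base", str(edges)],
        [py, str(code / "Graph_construction.py"),
         "--edge_base", str(edges), "--output_base", str(graphs),
         "--dataset_name", dataset_name],
        [py, str(code / "main.py"),
         "--root", str(graphs), "--epochs", str(epochs)],
    ]


def run_manual_pipeline(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Example use (the dataset name is a placeholder):
# cmds = build_manual_commands('/kaggle/working/GraphECGNet_Kaggle',
#                              '/kaggle/input/mondejar-mitbih-database',
#                              '/kaggle/working', 'mitbih', epochs=100)
# run_manual_pipeline(cmds)
```

Building the commands separately from running them makes it easy to print and inspect them first, or to run only the first three when you want to regenerate the graphs without training.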