# **Inference and evaluation of Graph Neural Network cell tracking using Google Colab**
#### Muhamed Tafech, Sayel Elwan, Rudy Harricks, Reira Ikenaga

This notebook provides an inference and evaluation pipeline that adapts the GNN (graph neural network) cell tracking software developed by Tal Ben-Haim and Tammy Riklin-Raviv [(GitHub)](https://github.com/talbenha/cell-tracker-gnn).
This iteration trains a GNN model on a provided Cell Tracking Challenge dataset.

This framework is designed for 2D+Time datasets sourced from the [Cell Tracking Challenge website](https://celltrackingchallenge.net/2d-datasets/).

Before use, set the Google Colab runtime type to T4 GPU. GPU acceleration is required for the metric learning and feature extraction steps that produce the tracking model.
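A quick way to confirm the runtime before starting the heavier cells is to look for the NVIDIA driver tool. The sketch below is an illustration and not part of the original pipeline: `gpu_runtime_available` is a hypothetical helper name, and it only checks that `nvidia-smi` exists and exits cleanly.

```python
import shutil
import subprocess

def gpu_runtime_available() -> bool:
    """Return True if `nvidia-smi` is on PATH and exits with status 0."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run([exe], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0

if not gpu_runtime_available():
    print("No GPU detected: switch the runtime type to T4 GPU before continuing.")
```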

This method also requires the user to mount their Google Drive, which holds the following input and output directories:

Input directories:

*   ```PROJECT/data/``` contains the cell training data

Key output directories:

*  ```PROJECT/gnn_out/trained_models/``` stores the trained models

*  ```PROJECT/gnn_out/inference_results/dataset_name/01_RES_inference``` contains the key data for the 3D visualisation script

*  ```PROJECT/gnn_out/results/dataset_name/TR_RES``` stores the tracking results used for CTC evaluation


```
PROJECT/
├── (INPUT) data/
│   └── dataset_name
└── (OUTPUT) gnn_out/
    ├── features/
    │   ├── basic_features
    │   └── advanced_features
    ├── inference_results/
    │   └── dataset_name/
    │       ├── [01/02]_CSV
    │       ├── [01/02]_RES_inference
    │       └── RES_inference
    ├── trained_models/
    │   ├── metric_models
    │   └── pytorch_models
    └── results/
        └── dataset_name/
            └── TR_RES
```


# 1. Setting up the Konda environment

To begin, update the ```DATASET_NAME``` variable to the name of the dataset used for training. This name matches the folder downloaded from the [Cell Tracking Challenge website](https://celltrackingchallenge.net/2d-datasets/).

In [None]:
import os
ROOT = "/content/drive/MyDrive/PROJECT"
DATASET_NAME = "Fluo_C2DL_Huh7" # Update to correspond to data being used
DATA_FOLDER = os.path.join(ROOT, "data", DATASET_NAME)
OUT_FOLDER = os.path.join(ROOT, "gnn_out")

Firstly, let's install all dependencies and create the working environment. [Konda](https://github.com/tamnguyenvan/konda) makes maintaining required packages in the Colab runtime easy.

In [None]:
! git clone https://github.com/talbenha/cell-tracker-gnn.git
! pip install konda

import os
import konda
from google.colab import drive

os.chdir('./cell-tracker-gnn')
drive.mount('/content/drive')

konda.install()

!konda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
!konda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r

Cloning into 'cell-tracker-gnn'...
remote: Enumerating objects: 504, done.
remote: Counting objects: 100% (181/181), done.
remote: Compressing objects: 100% (61/61), done.
remote: Total 504 (delta 149), reused 120 (delta 120), pack-reused 323 (from 1)
Receiving objects: 100% (504/504), 38.01 MiB | 16.68 MiB/s, done.
Resolving deltas: 100% (263/263), done.
Collecting konda
  Downloading konda-0.1.0-py3-none-any.whl.metadata (3.7 kB)
Downloading konda-0.1.0-py3-none-any.whl (7.3 kB)
Installing collected packages: konda
Successfully installed konda-0.1.0
Mounted at /content/drive
Downloading Miniconda installer...
Installing Miniconda to /usr/local...
✅ Miniconda installed successfully!
Run '!conda --version' to check if conda is working.

📋 Usage examples:
  konda create -n my_env python=3.11 -y
  konda activate my_env
accepted Terms of Service for https://repo.anaconda.com/pkgs/main
accepted Terms of Service for https://repo.anaconda.com/pkgs/r


Now to create the Konda virtual environment - **(this step can take around 10 mins)**.

In [None]:
!konda create --name cell-tracking-challenge python=3.8 --file requirements-conda.txt

In [None]:
!konda activate cell-tracking-challenge
!konda run "pip install --upgrade pip==23.0.0"
!konda run "pip install -r requirements.txt"
!konda run "pip install record-keeper PyYAML==5.4.1 pandas networkx"

✅ Set active environment to: cell-tracking-challenge
Activating conda environment: cell-tracking-challenge
Collecting pip==23.0.0
  Downloading pip-23.0-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 21.2.4
    Uninstalling pip-21.2.4:
      Successfully uninstalled pip-21.2.4
Successfully installed pip-23.0
Collecting hydra-core==1.1.0.dev5
  Downloading hydra_core-1.1.0.dev5-py3-none-any.whl (136 kB)
Collecting hydra-colorlog==1.0.0
  Downloading hydra_colorlog-1.0.0-py3-none-any.whl (3.6 kB)
Collecting hydra-optuna-sweeper==0.9.0rc2
  Downloading hydra_optuna_sweeper-0.9.0rc2-py3-none-any.whl (5.9 kB)
Collecting wandb>=0.10.20
  Downloading wandb-0.22.2-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.9 M

# 2. Training the graph neural network (GNN)

## 2.1 Feature extraction
First, check that the mounted Google Drive contains the required input and output directories.

In [None]:
def ensure_dir(path, message=None):
    if not os.path.isdir(path):
        if message:
            print(message)
        os.makedirs(path, exist_ok=True)

ensure_dir(DATA_FOLDER, "Data folder not present in drive. Creating now.")
ensure_dir(os.path.join(OUT_FOLDER, "trained_models", "metric_models", DATASET_NAME),
           "No metric models output folder. Creating now.")
ensure_dir(os.path.join(OUT_FOLDER, "trained_models", "pytorch_models", DATASET_NAME),
           "No pytorch_models folder. Creating now.")
ensure_dir(os.path.join(OUT_FOLDER, "features"),
           "Features folder not created yet. Creating now.")
ensure_dir(os.path.join(OUT_FOLDER, "inference_results", DATASET_NAME, "RES_inference"),
           "Results folder not found. Creating now.")
ensure_dir(os.path.join(OUT_FOLDER, "results", DATASET_NAME),
           "Results folder not found. Creating now.")

Now for basic feature extraction.

In [None]:
!konda run "python run_feat_extract.py \
params.input_images=\"{DATA_FOLDER}\" \
params.input_masks=\"{DATA_FOLDER}\" \
params.input_seg=\"{DATA_FOLDER}\" \
params.output_csv=\"{OUT_FOLDER}/features/basic_features\" \
params.sequences=['01','02'] \
params.seg_dir='_GT/TRA' \
params.basic=True"

[2025-10-14 07:02:18,432][numexpr.utils][INFO] - NumExpr defaulting to 2 threads.
files were saved to : /content/drive/MyDrive/PROJECT/gnn_out/features/basic_features/Fluo_C2DL_Huh7/01/csv
files were saved to : /content/drive/MyDrive/PROJECT/gnn_out/features/basic_features/Fluo_C2DL_Huh7/02/csv
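Feature extraction reads each sequence folder (`01`, `02`) and its `_GT/TRA` annotations from `DATA_FOLDER`, so a missing or misnamed folder is a likely cause if this step fails. The sketch below is one way to check that layout up front. `missing_ctc_dirs` is a hypothetical helper, not part of cell-tracker-gnn, and its defaults simply mirror the `params.sequences` and `params.seg_dir` values used above.

```python
import os
import tempfile

def missing_ctc_dirs(data_folder, sequences=("01", "02"), seg_dir="_GT/TRA"):
    """List the sequence and annotation folders that are absent from data_folder."""
    expected = []
    for seq in sequences:
        expected.append(os.path.join(data_folder, seq))            # raw images, e.g. 01
        expected.append(os.path.join(data_folder, seq + seg_dir))  # annotations, e.g. 01_GT/TRA
    return [path for path in expected if not os.path.isdir(path)]

# Demo on a throwaway folder that only contains sequence 01.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "01"))
    os.makedirs(os.path.join(demo_root, "01_GT", "TRA"))
    demo_missing = [os.path.relpath(p, demo_root) for p in missing_ctc_dirs(demo_root)]

print(demo_missing)
```

In this notebook the call would be `missing_ctc_dirs(DATA_FOLDER)`; an empty list means both sequences and their tracking annotations are in place.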


## 2.2 Metric learning training
This step can take time and requires Colab GPU-acceleration.

**⚠ [Note for development team]** Training epochs are capped at 1 for testing to avoid depleting compute units. For a real experiment, set this parameter to the required number of epochs.

In [None]:
%%bash
sed -i 's/num_epochs: 1000/num_epochs: 1/' "configs/metric_learning/config_2D.yaml"
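`sed` exits successfully even when its pattern matches nothing, so if the config no longer contains `num_epochs: 1000` the cap is silently skipped. The sketch below is one way to make the edit fail loudly. `set_yaml_scalar` is a hypothetical helper, not part of cell-tracker-gnn, and it assumes the key sits on a single `key: value` line.

```python
import re

def set_yaml_scalar(text, key, value):
    """Replace the value on every `key: <value>` line in text; raise if the key is absent."""
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}:[ \t]*)\S+", flags=re.MULTILINE)
    # A function is used as the replacement so backslashes in value are never interpreted.
    new_text, count = pattern.subn(lambda m: f"{m.group(1)}{value}", text)
    if count == 0:
        raise KeyError(f"{key!r} not found; nothing was changed")
    return new_text

# Demo on an in-memory config fragment (illustrative values only).
demo_config = "batch_size: 32\nnum_epochs: 1000\n"
demo_result = set_yaml_scalar(demo_config, "num_epochs", 1)
print(demo_result)
```

To apply it to the real file, read `configs/metric_learning/config_2D.yaml`, pass the text through `set_yaml_scalar(text, "num_epochs", 1)`, and write the result back.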

In [None]:
!konda run "python run_train_metric_learning.py \
dataset.kwargs.data_dir_img=\"{DATA_FOLDER}\" \
dataset.kwargs.data_dir_mask=\"{DATA_FOLDER}\" \
dataset.kwargs.dir_csv=\"{OUT_FOLDER}/features/basic_features/{DATASET_NAME}\" \
dataset.kwargs.subdir_mask='GT/TRA'"


[2025-10-14 07:05:48,602][numexpr.utils][INFO] - NumExpr defaulting to 2 threads.
[2025-10-14 07:05:49,433][matplotlib.font_manager][INFO] - generated new fontManager
start time: 2025-10-14/07-05-49
cuda
Config dictionary
{'model_name': 'resnet18',
 'embedding_dim': 128,
 'normalized_feat': True,
 'dataset_dict_keys': ['val'],
 'batch_size': 32,
 'num_epochs': 1,
 'num_workers': 16,
 'lr_trunk': 1e-05,
 'lr_embedder': 0.0001,
 'weight_decay': 0.0001,
 'loss_function': 'MultiSimilarityLoss',
 'loss_distance': 'CosineSimilarity',
 'loss_margin': 0.25,
 'loss_gamma': 128,
 'epsilon_miner': 0.1,
 'shorter': True,
 'avg_of_avgs': True,
 'k': 'max_bin_count',
 'm_samples': 4,
 'test_interval': 1,
 'patience': 50}
data_config dictionary
{'pad_value': 0,
 'norm_value': 0,
 'normalize_type': 'MinMaxCell',
 'train_val_test_split': [80, 20, 0],
 'data_dir_img': '/content/drive/MyDrive/PROJECT/data/Fluo_C2DL_Huh7',
 'data_dir_mask': '/content/drive/MyDrive/PROJECT/data/Fluo_C2DL_Huh7',
 'subdir_ma

The training run saves its output in a folder named after the date and time. Copy that folder to a fixed location on Drive for easier reference.


In [None]:
command = f"""locate=$(find /content/cell-tracker-gnn/outputs -type f -name "all_params.pth" -printf "%h\\n" | head -1) && \
cp -r "$locate"/. "{OUT_FOLDER}/trained_models/metric_models/{DATASET_NAME}/" """

!{command}
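`find ... | head -1` returns the first match in directory traversal order, which is not necessarily the latest run once `outputs` holds more than one. A minimal sketch of picking the most recently modified file instead is shown below. `newest_parent_of` is a hypothetical helper, and the demo folders and timestamps are made up for illustration.

```python
import os
import tempfile
from pathlib import Path

def newest_parent_of(root, filename):
    """Return the folder holding the most recently modified `filename` under root, or None."""
    matches = list(Path(root).rglob(filename))
    if not matches:
        return None
    return max(matches, key=lambda p: p.stat().st_mtime).parent

# Demo with two fake run folders whose modification times are set explicitly.
with tempfile.TemporaryDirectory() as demo_root:
    for run_name, mtime in (("run_old", 1_000_000), ("run_new", 2_000_000)):
        run_dir = Path(demo_root, run_name)
        run_dir.mkdir()
        weights = run_dir / "all_params.pth"
        weights.touch()
        os.utime(weights, (mtime, mtime))
    demo_newest = newest_parent_of(demo_root, "all_params.pth").name

print(demo_newest)  # run_new
```

The returned folder could then be copied to Drive with `shutil.copytree(src, dst, dirs_exist_ok=True)`, which needs Python 3.8 or later.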

## 2.3 Advanced feature extraction



In [None]:
!konda run "python run_feat_extract.py \
params.input_images=\"{DATA_FOLDER}\" \
params.input_masks=\"{DATA_FOLDER}\" \
params.input_seg=\"{DATA_FOLDER}\" \
params.output_csv=\"{OUT_FOLDER}/features/advanced_features\" \
params.sequences=['01','02'] \
params.seg_dir='_GT/TRA' \
params.basic=False \
params.input_model=\"{OUT_FOLDER}/trained_models/metric_models/{DATASET_NAME}/all_params.pth\""

[2025-10-14 07:07:43,760][numexpr.utils][INFO] - NumExpr defaulting to 2 threads.
Using resnet18 model architecture.
files were saved to : /content/drive/MyDrive/PROJECT/gnn_out/features/advanced_features/Fluo_C2DL_Huh7/01/csv
Using resnet18 model architecture.
files were saved to : /content/drive/MyDrive/PROJECT/gnn_out/features/advanced_features/Fluo_C2DL_Huh7/02/csv


## 2.4 Cell tracking training

**⚠ [Note for development team]** Training epochs are capped at 1 for testing to avoid depleting compute units. For a real experiment, set this parameter to the required number of epochs.

In [None]:
!sed -i 's/max_epochs: 500/max_epochs: 1/' "configs/trainer/default_trainer.yaml"

In [None]:
!konda run "python run.py \
datamodule.dataset_params.main_path=\"{OUT_FOLDER}/features/advanced_features/{DATASET_NAME}\" \
datamodule.dataset_params.exp_name='2D_SIM' \
datamodule.dataset_params.drop_feat=[]"


⚙ CONFIG
├── trainer
│   └── _target_: pytorch_lightning.Trainer
│       gpus: 1
│       min_epochs: 1
│       max_epochs: 1
│       weights_summary: top

In [None]:
command = f"""
dire=$(find /content/cell-tracker-gnn/logs/runs -type f -name "last.ckpt" -printf "%h\\n" | head -1) && \
cp -r "$dire"/../. "{OUT_FOLDER}/trained_models/pytorch_models/{DATASET_NAME}/"
"""

!{command}

# 3. Inference

In [None]:
os.environ['PYTHONPATH'] = '/content/cell-tracker-gnn/'

!konda run "python src/inference/preprocess_seq2graph_clean.py -cs 20 -ii \"{DATA_FOLDER}/01\" -iseg \"{DATA_FOLDER}/01_GT/TRA\" -im \"{OUT_FOLDER}/trained_models/metric_models/{DATASET_NAME}/all_params.pth\" -oc \"{OUT_FOLDER}/inference_results/{DATASET_NAME}/01_CSV\""
!konda run "python src/inference/inference_clean.py -mp \"{OUT_FOLDER}/trained_models/pytorch_models/{DATASET_NAME}/checkpoints/last.ckpt\" -ns \"01\" -oc \"{OUT_FOLDER}/inference_results/{DATASET_NAME}/\""
!konda run "python src/inference/postprocess_clean.py -modality \"2D\" -iseg \"{DATA_FOLDER}/01_GT/TRA\" -oi \"{OUT_FOLDER}/inference_results/{DATASET_NAME}/01_RES_inference/\""


number of detected changes: 0
0
global_delta_row: 13
global_delta_col: 11
We don't assign new region of interest - use the old one
Using resnet18 model architecture.
files were saved to : $/content/drive/MyDrive/PROJECT/gnn_out/inference_results/Fluo_C2DL_Huh7/01_CSV/csv
load model from: /content/drive/MyDrive/PROJECT/gnn_out/trained_models/pytorch_models/Fluo_C2DL_Huh7/checkpoints/last.ckpt
"directed":        True
"loss_weights":    65
"lr":              0.001
"lr_sch_module":   {'target': None, 'monitor': 'val/acc', 'kwargs': {'verbose': True, 'mode': 'min'}}
"model_params":    {'target': 'CellTrack_Model', 'kwargs': {'hand_NodeEncoder_dic': {'input_dim': 13, 'fc_dims': [64, 16]}, 'learned_NodeEncoder_dic': {'input_dim': 128, 'fc_dims': [64, 16]}, 'intialize_EdgeEncoder_dic': {'input_dim': 239, 'fc_dims': [128, 64]}, 'message_passing': {'target': 'CellTrack_GNN', 'kwargs': {'in_channels': 32, 'hidden_channels': 32, 'in_edge_channels': 64, 'hidden_edge_channels_conv': 16, 'hidden_edge

Clean up and move some files which the repo misplaces:

In [None]:
!mv "{DATA_FOLDER}/01_GT/TR_RES/" "{OUT_FOLDER}/results/{DATASET_NAME}/"

mv: cannot stat '/content/drive/MyDrive/PROJECT/gnn_out/01_SEG_RES': No such file or directory
rmdir: failed to remove '/content/drive/MyDrive/PROJECT/gnn_out/01_RES': No such file or directory
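`mv` reports `cannot stat` when the source folder is absent, as in the output above, and the notebook carries on regardless. The sketch below shows one way to do the same move from Python with an explicit existence check. `move_results` is a hypothetical helper, and the demo file name is a placeholder rather than a real tracking result.

```python
import shutil
import tempfile
from pathlib import Path

def move_results(src, dst_parent):
    """Move folder src into dst_parent; return the new path, or None if src is missing."""
    src = Path(src)
    if not src.is_dir():
        print(f"Nothing to move: {src} does not exist")
        return None
    dst_parent = Path(dst_parent)
    dst_parent.mkdir(parents=True, exist_ok=True)
    # shutil.move places src inside dst_parent because dst_parent is an existing directory.
    return Path(shutil.move(str(src), str(dst_parent)))

# Demo: the first call moves the folder, the second finds nothing left to move.
with tempfile.TemporaryDirectory() as demo_root:
    source = Path(demo_root, "01_GT", "TR_RES")
    source.mkdir(parents=True)
    (source / "example.txt").write_text("placeholder\n")
    moved = move_results(source, Path(demo_root, "results"))
    demo_moved_ok = (moved / "example.txt").is_file() and not source.exists()
    demo_second_try = move_results(source, Path(demo_root, "results"))
```

In this notebook the equivalent call would be `move_results(os.path.join(DATA_FOLDER, "01_GT", "TR_RES"), os.path.join(OUT_FOLDER, "results", DATASET_NAME))`. Note that `shutil.move` raises an error if a `TR_RES` folder already exists at the destination.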


# 4. Evaluation

In [None]:
!pip install py-ctcmetrics
!ctc_evaluate --gt "{DATA_FOLDER}/01_GT" --res "{OUT_FOLDER}/results/{DATASET_NAME}/TR_RES/"

Collecting py-ctcmetrics
  Downloading py_ctcmetrics-1.3.2-py3-none-any.whl.metadata (581 bytes)
Collecting numpy (from py-ctcmetrics)
  Downloading numpy-2.3.3-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (62 kB)
Collecting scikit-learn (from py-ctcmetrics)
  Downloading scikit_learn-1.7.2-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (11 kB)
Collecting scipy (from py-ctcmetrics)
  Downloading scipy-1.16.2-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (62 kB)
Collecting tifffile (from py-ctcmetrics)
  Downloading tifffile-2025.10.4-py3-none-any.whl.metadata (30 kB)
Collecting imagecodecs (from py-ctcmetrics)
  Downloading imagecodecs-2025.8.2-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (20 kB)
Collecting pandas (from py-ctcmetrics)
  Downloading pandas-2.3.3-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (91 kB)
Collecting python-dateutil>=2.8.2 (from pandas->py-ctcmetric