## Demo: reading a JSON datalist and building the training dataset

## File Description

This Jupyter Notebook shows how to build a training dataset for medical image segmentation with the MONAI library. It reads a JSON datalist, loads and preprocesses the training data, and creates a data loader for training a neural network.

## Cells Description

- **Cell 2**: Imports necessary modules from the MONAI library for data loading and preprocessing.

- **Cell 3**: Defines the path to the JSON file containing the training data.

- **Cell 4+5**: Loads the training data from the JSON file using MONAI's `load_decathlon_datalist()` function and displays the loaded training files.

- **Cell 6**: Defines the transformations applied to the training data: loading images, ensuring channel-first format, scaling intensity, random cropping, and random 90-degree rotation.

- **Cell 7**: Creates a cache dataset using the loaded training files and the defined transformations.

- **Cell 8**: Creates a data loader for the training dataset, specifying the batch size, shuffle option, and number of workers.

- **Cell 9+10**: Displays the shape of the first batch of training data and the items in the training dataset.

- **Cell 11**: Shows an example command to launch the ASCHOPLEX tool for further processing.

The file provides a step-by-step guide on how to load, preprocess, and prepare medical image data for training a neural network using the MONAI library.
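
As a rough illustration of the kind of datalist that `load_decathlon_datalist()` reads, the sketch below writes and re-reads a minimal JSON file using only the standard library. The `training` key and the `fold`, `image`, and `label` fields mirror the output shown in the notebook; the file names (`case_01.nii.gz`, `case_02.nii.gz`) and the temporary directory are hypothetical placeholders, not part of the original dataset.

```python
import json
import os
import tempfile

# Hypothetical entries; a real datalist points at actual NIfTI volumes.
example_datalist = {
    "training": [
        {"fold": 0, "image": "images/case_01.nii.gz", "label": "labels/case_01.nii.gz"},
        {"fold": 0, "image": "images/case_02.nii.gz", "label": "labels/case_02.nii.gz"},
    ]
}

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "train.json")

    # Write the datalist to disk as JSON.
    with open(json_path, "w") as f:
        json.dump(example_datalist, f, indent=2)

    # Read it back and select the section that data_list_key would name.
    with open(json_path) as f:
        loaded = json.load(f)
    training_entries = loaded["training"]

print(len(training_entries), sorted(training_entries[0].keys()))
```

With MONAI installed, a file of this shape would be passed as `data_list_file_path` together with `data_list_key='training'`, as the notebook does.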


In [1]:
import monai
from monai.data import load_decathlon_datalist, list_data_collate, CacheDataset, DataLoader
from monai.transforms import (
    Activations,
    EnsureChannelFirstd,
    AsDiscrete,
    Compose,
    LoadImaged,
    RandCropByPosNegLabeld,
    RandRotate90d,
    ScaleIntensityd,
)

In [2]:
datalist = "/var/data/student_home/lia/phuse_thesis_2024/monai_segmentation/monai_training/JSON_dir/train.json"

In [3]:
train_files = load_decathlon_datalist(data_list_file_path=datalist, is_segmentation=True, data_list_key='training')

In [4]:
train_files

[{'fold': 0,
  'image': '/var/data/MONAI_Choroid_Plexus/dataset_monai_train_from_scratch/26_ChP.nii.gz',
  'label': '/var/data/MONAI_Choroid_Plexus/dataset_monai_train_from_scratch/labels/final/26_ChP.nii.gz'},
 {'fold': 0,
  'image': '/var/data/MONAI_Choroid_Plexus/dataset_monai_train_from_scratch/22_ChP.nii.gz',
  'label': '/var/data/MONAI_Choroid_Plexus/dataset_monai_train_from_scratch/labels/final/22_ChP.nii.gz'},
 {'fold': 0,
  'image': '/var/data/MONAI_Choroid_Plexus/dataset_monai_train_from_scratch/19_ChP.nii.gz',
  'label': '/var/data/MONAI_Choroid_Plexus/dataset_monai_train_from_scratch/labels/final/19_ChP.nii.gz'},
 {'fold': 0,
  'image': '/var/data/MONAI_Choroid_Plexus/dataset_monai_train_from_scratch/3_ChP.nii.gz',
  'label': '/var/data/MONAI_Choroid_Plexus/dataset_monai_train_from_scratch/labels/final/3_ChP.nii.gz'},
 {'fold': 0,
  'image': '/var/data/MONAI_Choroid_Plexus/dataset_monai_train_from_scratch/2_ChP.nii.gz',
  'label': '/var/data/MONAI_Choroid_Plexus/dataset_mon

In [16]:
# define transforms for image and segmentation

train_transforms = Compose(
    [
        LoadImaged(keys=["image", "label"]),
        EnsureChannelFirstd(keys=["image", "label"]),
        ScaleIntensityd(keys="image"),
        RandCropByPosNegLabeld(
            keys=["image", "label"],
            label_key="label",
            spatial_size=[180, 180, 180],
            pos=1,
            neg=1,
            num_samples=8,  # each volume yields num_samples patches, so the effective batch size is batch_size * num_samples
        ),
        RandRotate90d(keys=["image", "label"], prob=0.5, spatial_axes=[0, 2]),
    ]
)

# define dataset, data loader
check_ds = CacheDataset(data=train_files, transform=train_transforms)
# use batch_size=2 to load images; RandCropByPosNegLabeld (num_samples=8) then yields 2 x 8 = 16 patches per batch for network training
check_loader = DataLoader(check_ds, batch_size=2, num_workers=4, collate_fn=list_data_collate)
check_data = monai.utils.misc.first(check_loader)  # first() returns the first item of the iterable, here the first batch
print(check_data["image"].shape, check_data["label"].shape)


Loading dataset:   0%|          | 0/29 [00:00<?, ?it/s]

Loading dataset: 100%|██████████| 29/29 [00:44<00:00,  1.54s/it]


(16, 1, 180, 180, 180) (16, 1, 180, 180, 180)
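
The printed shape follows from simple arithmetic: each loaded volume is cropped into `num_samples` patches, and `list_data_collate` flattens those per-volume lists into a single batch. A minimal sketch of that calculation, using the values from the cell above (the single channel is an assumption read off the printed shape):

```python
batch_size = 2                  # DataLoader batch_size from the cell above
num_samples = 8                 # RandCropByPosNegLabeld num_samples from the cell above
spatial_size = (180, 180, 180)  # crop size from the cell above
channels = 1                    # assumed: single-channel images, as in the printed shape

# Every volume in the batch contributes num_samples patches.
effective_batch = batch_size * num_samples
expected_shape = (effective_batch, channels, *spatial_size)
print(expected_shape)  # (16, 1, 180, 180, 180)
```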


In [9]:

# cache_num=6 limits caching to 6 items even though cache_rate=1.0
train_dataset = CacheDataset(
    data=train_files, transform=train_transforms, cache_num=6, cache_rate=1.0, num_workers=2)
train_loader = DataLoader(train_dataset, batch_size=4, shuffle=True, num_workers=2, collate_fn=list_data_collate)



Loading dataset: 100%|██████████| 6/6 [00:06<00:00,  1.13s/it]
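
The progress bar stops at 6 rather than 29 because, per the MONAI documentation, `CacheDataset` caches the minimum of `cache_num`, `data_length * cache_rate`, and `data_length`. The helper below, `n_cached`, is a hypothetical stand-in that reproduces only this rule; it is not MONAI code.

```python
import sys


def n_cached(data_length, cache_num=sys.maxsize, cache_rate=1.0):
    """Number of items cached under the documented min(...) rule (illustrative only)."""
    return min(cache_num, int(data_length * cache_rate), data_length)


# 29 training items, as reported by the first progress bar in the notebook.
print(n_cached(29))               # 29
print(n_cached(29, cache_num=6))  # 6
```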


In [10]:
t_check_data = monai.utils.misc.first(train_loader)  # first() returns the first item of the iterable, here the first batch
print(t_check_data["image"].shape, t_check_data["label"].shape)


(16, 1, 180, 180, 180) (16, 1, 180, 180, 180)


In [15]:
t_check_data.items()

dict_items([('fold', tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])), ('image', tensor([[[[[0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           ...,
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000]],

          [[0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           ...,
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
           [0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000]],

          [[0.0000, 0.0000, 0.0000,  ..., 0.0000, 0.0000, 0.0000],
   

In [11]:
train_dataset

<monai.data.dataset.CacheDataset at 0x7f35e19453f0>

In [None]:
python ASCHOPLEX/launching_tool.py --dataroot '/var/data/MONAI_Choroid_Plexus/dataset_aschoplex' --work_dir '/var/data/student_home/lia/ASCHOPLEX' --finetune 'yes' --prediction 'yes' 
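
If the tool is to be launched from Python rather than typed into a shell, one option is to build the same command as an argument list. This is only a sketch: the flags and paths are copied from the command above, while running it through `subprocess` is an assumption about how you might want to invoke it.

```python
import shlex

# Arguments copied from the command above; adjust the paths to your setup.
cmd = [
    "python", "ASCHOPLEX/launching_tool.py",
    "--dataroot", "/var/data/MONAI_Choroid_Plexus/dataset_aschoplex",
    "--work_dir", "/var/data/student_home/lia/ASCHOPLEX",
    "--finetune", "yes",
    "--prediction", "yes",
]

# Show the command as a single shell-safe string without running it.
print(shlex.join(cmd))

# To actually launch the tool (assumption):
# import subprocess; subprocess.run(cmd, check=True)
```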