Copyright (c) MONAI Consortium  
Licensed under the Apache License, Version 2.0 (the "License");  
you may not use this file except in compliance with the License.  
You may obtain a copy of the License at  
&nbsp;&nbsp;&nbsp;&nbsp;http://www.apache.org/licenses/LICENSE-2.0  
Unless required by applicable law or agreed to in writing, software  
distributed under the License is distributed on an "AS IS" BASIS,  
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  
See the License for the specific language governing permissions and  
limitations under the License.

# Lazy resampling benchmark

In this notebook, we use a 3D spleen segmentation task to benchmark lazy resampling. The results have two main parts:
- Time spent in each transform, and total data preparation time, in lazy and non-lazy mode.
- An end-to-end training time comparison.

## Setup environment

In [16]:
!python -c "import monai" || pip install -q "monai-weekly[nibabel, tqdm, gdown]"

## Setup imports

In [None]:
import os
import time
import torch
import tempfile
import monai
import monai.transforms as mt
from monai.utils import WorkflowProfiler
from monai.apps import download_and_extract

from monai.config import print_config

print_config()

## Setup data directory
You can specify a directory with the `MONAI_DATA_DIRECTORY` environment variable.  
This allows you to save results and reuse downloads.  
If it is not specified, a temporary directory will be used.

In [2]:
directory = os.environ.get("MONAI_DATA_DIRECTORY")
root_dir = tempfile.mkdtemp() if directory is None else directory
print(root_dir)

/workspace/Data


## Download dataset and prepare the utilities

This section downloads and extracts the dataset.

The dataset comes from http://medicaldecathlon.com/.

`TraceObjectsOnly` is used to narrow the scope of the trace to top-level transforms only.

In [3]:
class TraceObjectsOnly:
    def __init__(self, objects):
        self.objects = objects

    def __call__(self, frame):
        self_obj = frame.f_locals.get("self", None)
        return frame.f_code.co_name == "__call__" and self_obj in self.objects


resource = "https://msd-for-monai.s3-us-west-2.amazonaws.com/Task09_Spleen.tar"
md5 = "410d4a301da4e5b2f6f86ec3ddba524e"

compressed_file = os.path.join(root_dir, "Task09_Spleen.tar")
data_dir = os.path.join(root_dir, "Task09_Spleen")
if not os.path.exists(data_dir):
    download_and_extract(resource, compressed_file, root_dir, md5)
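
To make the filtering behaviour of `TraceObjectsOnly` concrete, the following standalone sketch applies the same predicate to frames obtained with `inspect.currentframe()`. The `Recorder` class and its `other` method are hypothetical stand-ins for transforms, introduced only for illustration; how `WorkflowProfiler` invokes the filter internally is not shown here.

```python
import inspect


class TraceObjectsOnly:
    """Same predicate as in the cell above, repeated so this sketch is self-contained."""

    def __init__(self, objects):
        self.objects = objects

    def __call__(self, frame):
        self_obj = frame.f_locals.get("self", None)
        return frame.f_code.co_name == "__call__" and self_obj in self.objects


class Recorder:
    """Hypothetical stand-in for a transform; it passes its own frame to the filter."""

    def __call__(self, trace_filter):
        # The current frame belongs to `__call__`, with `self` in its locals.
        return trace_filter(inspect.currentframe())

    def other(self, trace_filter):
        # Same object, but the frame's code name is "other", not "__call__".
        return trace_filter(inspect.currentframe())


tracked = Recorder()
untracked = Recorder()
trace_filter = TraceObjectsOnly([tracked])

print(tracked(trace_filter))        # accepted: `__call__` on a listed object
print(untracked(trace_filter))      # rejected: object is not in the list
print(tracked.other(trace_filter))  # rejected: frame is not a `__call__`
```

Only the first call satisfies both conditions, which is why the profiling tables later in the notebook contain one row per top-level transform rather than one row per nested function call.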

## Transform Profiling Comparison

`transform_list` defines the transform chain:
- `LoadImaged` loads the BraTS MRI images and labels from NIfTI files.
- `EnsureChannelFirstd` ensures that the loaded data has a channel-first shape.
- `EnsureTyped` ensures that the data is a PyTorch tensor or NumPy array.
- `ConvertToMultiChannelBasedOnBratsClassesd` converts the labels to multiple channels based on the BraTS classes.
- `Orientationd` unifies the data orientation based on the affine matrix.
- `Spacingd` resamples the data to the voxel spacing `pixdim=(1., 1., 1.)` based on the affine matrix.
- `RandSpatialCropd` crops the image and label to a `[224, 224, 144]` region centered at a random position.
- `RandFlipd` randomly reverses the order of elements along the given spatial axis.
- `NormalizeIntensityd` normalizes the input using its computed mean and standard deviation.
- `RandScaleIntensityd` randomly scales the intensity of the input image.
- `RandShiftIntensityd` randomly shifts the intensity by a randomly picked offset.

In [4]:
transform_list = [
    # load 4 Nifti images and stack them together
    mt.LoadImaged(keys=["image", "label"]),
    mt.EnsureChannelFirstd(keys="image"),
    mt.EnsureTyped(keys=["image", "label"]),
    mt.ConvertToMultiChannelBasedOnBratsClassesd(keys="label"),
    mt.Orientationd(keys=["image", "label"], axcodes="RAS"),
    mt.Spacingd(
        keys=["image", "label"],
        pixdim=(1.0, 1.0, 1.0),
        mode=("bilinear", "nearest"),
    ),
    mt.RandSpatialCropd(keys=["image", "label"], roi_size=[224, 224, 144], random_size=False),
    mt.RandFlipd(keys=["image", "label"], prob=1.0, spatial_axis=0),
    mt.RandFlipd(keys=["image", "label"], prob=1.0, spatial_axis=1),
    mt.RandFlipd(keys=["image", "label"], prob=1.0, spatial_axis=2),
    mt.NormalizeIntensityd(keys="image", nonzero=True, channel_wise=True),
    mt.RandScaleIntensityd(keys="image", factors=0.1, prob=1.0),
    mt.RandShiftIntensityd(keys="image", offsets=0.1, prob=1.0),
]

monai.transforms.io.dictionary LoadImaged.__init__:image_only: Current default value of argument `image_only=False` has been deprecated since version 1.1. It will be changed to `image_only=True` in version 1.3.
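
The chain above contains several consecutive spatial transforms (orientation, spacing, crop, and three flips). The idea behind lazy resampling is that such operations can be recorded as affine matrices and combined, so the voxel data is resampled once instead of once per transform. The sketch below illustrates that idea with plain NumPy for the three flips only. It is a conceptual illustration under that assumption, not MONAI's implementation, and `flip_affine` is a hypothetical helper.

```python
import numpy as np


def flip_affine(axis, size):
    """4x4 homogeneous matrix mapping an output voxel index to its source index for a flip."""
    m = np.eye(4)
    m[axis, axis] = -1.0
    m[axis, 3] = size - 1
    return m


shape = (4, 5, 6)
volume = np.arange(np.prod(shape)).reshape(shape)

# Eager evaluation: every flip makes its own pass over the data.
eager = volume
for axis in range(3):
    eager = np.flip(eager, axis=axis)

# Lazy-style evaluation: only the small matrices are combined up front.
combined = np.eye(4)
for axis in range(3):
    combined = flip_affine(axis, shape[axis]) @ combined

# One gather at the end applies all three flips to the data at once.
idx = np.indices(shape).reshape(3, -1)
homogeneous = np.vstack([idx, np.ones((1, idx.shape[1]))])
src = np.rint(combined @ homogeneous)[:3].astype(int)
lazy = volume[src[0], src[1], src[2]].reshape(shape)

print(np.array_equal(eager, lazy))
```

Both paths produce the same array, but the lazy-style path reads the volume only once. For transforms that interpolate, such as `Spacingd`, a single combined resampling also avoids repeated interpolation of the same data.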


### The preprocessing pipeline evaluated non-lazily

In [6]:
monai.utils.set_determinism(24)

train_transform = mt.Compose(transform_list)
train_ds = monai.apps.DecathlonDataset(
    root_dir=root_dir,
    task="Task01_BrainTumour",
    transform=train_transform,
    section="training",
    download=True,
    cache_rate=0.0,
)
data_loader = monai.data.DataLoader(train_ds, batch_size=1, shuffle=True)

with WorkflowProfiler(TraceObjectsOnly(transform_list)) as wp:
    time_start = time.time()
    for _item in data_loader:
        pass
    time_process = time.time() - time_start

print(f"non-lazy preprocessing time: {time_process}")
wp.get_times_summary_pd()

non-lazy preprocessing time: 327.38880467414856


| Transform call | Count | Total Time (s) | Avg (s) | Std (s) | Min (s) | Max (s) |
|---|---|---|---|---|---|---|
| LoadImaged.\_\_call\_\_ | 388 | 214.60006 | 0.553093 | 0.014112 | 0.519402 | 0.600372 |
| NormalizeIntensityd.\_\_call\_\_ | 388 | 51.334329 | 0.132305 | 0.026704 | 0.096193 | 0.192629 |
| RandFlipd.\_\_call\_\_ | 1164 | 21.008026 | 0.018048 | 0.002203 | 0.016785 | 0.069709 |
| ConvertToMultiChannelBasedOnBratsClassesd.\_\_call\_\_ | 388 | 5.235506 | 0.013494 | 0.001316 | 0.012556 | 0.027214 |
| Spacingd.\_\_call\_\_ | 388 | 4.345852 | 0.011201 | 0.0063 | 0.009089 | 0.115761 |
| RandShiftIntensityd.\_\_call\_\_ | 388 | 4.018844 | 0.010358 | 0.006234 | 0.009085 | 0.057106 |
| RandScaleIntensityd.\_\_call\_\_ | 388 | 3.786071 | 0.009758 | 0.003357 | 0.008987 | 0.049616 |
| Orientationd.\_\_call\_\_ | 388 | 1.406409 | 0.003625 | 0.010838 | 0.001987 | 0.120482 |
| RandSpatialCropd.\_\_call\_\_ | 388 | 0.611028 | 0.001575 | 0.001485 | 0.00126 | 0.030637 |
| EnsureChannelFirstd.\_\_call\_\_ | 388 | 0.083966 | 0.000216 | 3.1e-05 | 0.000188 | 0.000482 |


### The preprocessing pipeline evaluated lazily

In [7]:
monai.utils.set_determinism(24)

overrides = {
    "image": {"mode": "bilinear", "padding_mode": "border", "dtype": torch.float32},
    "label": {"mode": 0, "padding_mode": "nearest", "dtype": torch.uint8},
}
train_transform = mt.Compose(transform_list, lazy=True, overrides=overrides)
train_ds = monai.apps.DecathlonDataset(
    root_dir=root_dir,
    task="Task01_BrainTumour",
    transform=train_transform,
    section="training",
    download=True,
    cache_rate=0.0,
)
data_loader = monai.data.DataLoader(train_ds, batch_size=1, shuffle=True)
with WorkflowProfiler(TraceObjectsOnly(transform_list)) as wp:
    time_start = time.time()
    for _item_lazy in data_loader:
        pass
    time_process = time.time() - time_start

print(f"lazy preprocessing time: {time_process}")
wp.get_times_summary_pd()

lazy preprocessing time: 301.8494665622711


| Transform call | Count | Total Time (s) | Avg (s) | Std (s) | Min (s) | Max (s) |
|---|---|---|---|---|---|---|
| LoadImaged.\_\_call\_\_ | 388 | 210.230422 | 0.541831 | 0.011048 | 0.509415 | 0.5891 |
| NormalizeIntensityd.\_\_call\_\_ | 388 | 51.072924 | 0.131631 | 0.027062 | 0.099262 | 0.195547 |
| RandShiftIntensityd.\_\_call\_\_ | 388 | 5.196085 | 0.013392 | 0.008437 | 0.008982 | 0.037699 |
| ConvertToMultiChannelBasedOnBratsClassesd.\_\_call\_\_ | 388 | 5.149661 | 0.013272 | 0.001081 | 0.0125 | 0.02383 |
| RandScaleIntensityd.\_\_call\_\_ | 388 | 3.762584 | 0.009697 | 0.000706 | 0.009134 | 0.022499 |
| Orientationd.\_\_call\_\_ | 388 | 1.879595 | 0.004844 | 0.006223 | 0.001772 | 0.023112 |
| Spacingd.\_\_call\_\_ | 388 | 1.359067 | 0.003503 | 0.00104 | 0.00304 | 0.020699 |
| RandFlipd.\_\_call\_\_ | 1164 | 0.802183 | 0.000689 | 0.000109 | 0.000561 | 0.002725 |
| RandSpatialCropd.\_\_call\_\_ | 388 | 0.275304 | 0.00071 | 0.000167 | 0.00057 | 0.00366 |
| EnsureChannelFirstd.\_\_call\_\_ | 388 | 0.084098 | 0.000217 | 2.4e-05 | 0.000187 | 0.000323 |
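
To read the two profiling runs side by side, the short script below computes the wall-clock saving and the per-transform ratios. All numbers are copied from the outputs above. The dictionary layout and variable names are illustrative choices, and only the spatial transforms plus the two largest rows are included.

```python
# Wall-clock preprocessing times printed by the two runs above (seconds).
non_lazy_total = 327.38880467414856
lazy_total = 301.8494665622711

# "Total Time (s)" per transform, copied from the two tables: (non-lazy, lazy).
totals = {
    "LoadImaged": (214.60006, 210.230422),
    "NormalizeIntensityd": (51.334329, 51.072924),
    "RandFlipd": (21.008026, 0.802183),
    "Spacingd": (4.345852, 1.359067),
    "Orientationd": (1.406409, 1.879595),
    "RandSpatialCropd": (0.611028, 0.275304),
}

saved = non_lazy_total - lazy_total
saved_pct = 100.0 * saved / non_lazy_total
print(f"wall-clock saved: {saved:.1f} s ({saved_pct:.1f}%)")

# Ratio above 1 means the transform's own call was faster in lazy mode.
ratios = {name: eager / lazy for name, (eager, lazy) in totals.items()}
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    eager, lazy = totals[name]
    print(f"{name:<22} {eager:10.3f} -> {lazy:10.3f} s  ({ratio:.2f}x)")

load_share = 100.0 * totals["LoadImaged"][0] / non_lazy_total
print(f"LoadImaged share of non-lazy wall-clock time: {load_share:.1f}%")
```

On these numbers, the `RandFlipd` calls become roughly 26 times cheaper and the `Spacingd` calls roughly 3 times cheaper in lazy mode, while the overall saving is about 8%, because `LoadImaged` and `NormalizeIntensityd` dominate both runs and are essentially unchanged.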


## End-to-end workflow Profiling Comparison

The complete workflow is in https://github.com/Project-MONAI/tutorials/blob/main/3d_segmentation/spleen_segmentation_3d.ipynb. For this benchmark, it is modified in two places:
- `transform_list` is used in place of `train_transforms` and `val_transforms`.
- A regular `Dataset` is used instead of `CacheDataset` for both training and validation.

### Total time and every epoch time comparison
![lazy_benchmark_total_epoch_time_comparison](../figures/lazy_benchmark_total_epoch_time_comparison.png)

### Performance comparison
The end-to-end pipeline was benchmarked on an A100 80G GPU.

Lazy mode training: best_metric: Dice 0.7970 at epoch: 65 total time: 41296.6880s

Non-lazy mode training: best_metric: Dice 0.7955 at epoch: 68 total time: 42887.6809s
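
The difference between the two runs can be worked out directly from the figures reported above. The snippet below is plain arithmetic on those two result lines; the dictionary keys are illustrative names.

```python
# Figures copied from the two result lines above.
lazy = {"best_dice": 0.7970, "best_epoch": 65, "total_s": 41296.6880}
non_lazy = {"best_dice": 0.7955, "best_epoch": 68, "total_s": 42887.6809}

saved_s = non_lazy["total_s"] - lazy["total_s"]
saved_pct = 100.0 * saved_s / non_lazy["total_s"]
dice_diff = lazy["best_dice"] - non_lazy["best_dice"]

print(f"lazy total:     {lazy['total_s'] / 3600:.2f} h")
print(f"non-lazy total: {non_lazy['total_s'] / 3600:.2f} h")
print(f"time saved:     {saved_s:.1f} s ({saved_s / 60:.1f} min, {saved_pct:.1f}%)")
print(f"best Dice difference (lazy - non-lazy): {dice_diff:+.4f}")
```

In this single pair of runs, lazy mode finished about 3.7% sooner with a best Dice score within 0.002 of the non-lazy run. Since each mode was run once, the difference in Dice should not be read as a systematic effect.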