# Data loading pipeline examples

This notebook illustrates reading Nifti files and compares the speed of different loading and preprocessing methods.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Project-MONAI/tutorials/blob/master/acceleration/transform_speed.ipynb)

## Setup environment

In [1]:
%pip install -q "monai[nibabel]"

Note: you may need to restart the kernel to use updated packages.


In [None]:
# FIXME: installs MONAI from source; remove this cell once v0.4.0 is released
%pip install -q "git+https://github.com/Project-MONAI/MONAI#egg=MONAI"

## Setup imports

In [1]:
# Copyright 2020 MONAI Consortium
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#     http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import glob
import os
import shutil
import tempfile

import nibabel as nib
import numpy as np
import torch

try:
    torch.multiprocessing.set_start_method("spawn")
except RuntimeError:
    pass


from monai.config import print_config
from monai.data import ArrayDataset, create_test_image_3d
from monai.transforms import (
    AddChannel,
    Compose,
    LoadNifti,
    RandAffine,
    RandSpatialCrop,
    Rotate,
    ScaleIntensity,
    ToTensor,
)
from monai.utils import first

print_config()

MONAI version: 0.2.0
Python version: 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31)  [GCC 7.3.0]
Numpy version: 1.18.1
Pytorch version: 1.6.0

Optional dependencies:
Pytorch Ignite version: 0.3.0
Nibabel version: 3.1.1
scikit-image version: 0.15.0
Pillow version: 7.2.0
Tensorboard version: 2.1.0

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies



## Setup data directory

You can specify a directory with the `MONAI_DATA_DIRECTORY` environment variable.  
This allows you to save results and reuse downloads.  
If not specified, a temporary directory will be used.

In [2]:
directory = os.environ.get("MONAI_DATA_DIRECTORY")
root_dir = tempfile.mkdtemp() if directory is None else directory
print(root_dir)

/workspace/data/medical


### 0. Preparing input data (Nifti images)

Create five test image/segmentation pairs as Nifti files. Each file holds a 3D single-channel volume of (256, 256, 256) voxels.

In [4]:
for i in range(5):
    im, seg = create_test_image_3d(256, 256, 256)

    n = nib.Nifti1Image(im, np.eye(4))
    nib.save(n, os.path.join(root_dir, f"im{i}.nii.gz"))

    n = nib.Nifti1Image(seg, np.eye(4))
    nib.save(n, os.path.join(root_dir, f"seg{i}.nii.gz"))

In [5]:
# prepare list of image names and segmentation names
images = sorted(glob.glob(os.path.join(root_dir, "im*.nii.gz")))
segs = sorted(glob.glob(os.path.join(root_dir, "seg*.nii.gz")))

### 1. Test image loading with minimal preprocessing

In [6]:
imtrans = Compose([LoadNifti(image_only=True), AddChannel(), ToTensor()])

segtrans = Compose([LoadNifti(image_only=True), AddChannel(), ToTensor()])

ds = ArrayDataset(images, imtrans, segs, segtrans)
loader = torch.utils.data.DataLoader(ds, batch_size=3, num_workers=8)

im, seg = first(loader)
print(im.shape, seg.shape)

torch.Size([3, 1, 256, 256, 256]) torch.Size([3, 1, 256, 256, 256])


In [7]:
%timeit data = next(iter(loader))

4.98 s ± 394 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
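
`%timeit data = next(iter(loader))` builds a fresh iterator on every loop, so each measurement includes starting the worker processes as well as loading one batch. Outside a notebook, the same measurement could be sketched with `time.perf_counter`. The helper name `time_first_batch` and its parameters are illustrative assumptions, not part of MONAI or PyTorch, and a plain list stands in for the `DataLoader` so the sketch runs on its own:

```python
import statistics
import time


def time_first_batch(make_iterator, repeats=7):
    """Time `next(make_iterator())`, including iterator construction."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        next(make_iterator())  # new iterator on every run, as in the %timeit cell
        durations.append(time.perf_counter() - start)
    mean = statistics.mean(durations)
    std = statistics.pstdev(durations)  # population standard deviation
    return mean, std


# Stand-in for a DataLoader (assumption: any iterable of batches works here)
batches = [[0, 1, 2], [3, 4]]
mean, std = time_first_batch(lambda: iter(batches), repeats=3)
print(f"{mean:.6f} s ± {std:.6f} s per loop")
```

With the objects defined in this notebook, the call would be `time_first_batch(lambda: iter(loader))`.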


### 2. Test image-patch loading with CPU multi-processing

- rotate the (256, 256, 256)-voxel volume in the plane axes=(1, 2)
- extract random (64, 64, 64) patches
- implemented in MONAI using `scipy.ndimage.rotate`
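
The steps listed above can be sketched outside MONAI with SciPy and NumPy. This is a minimal illustration under stated assumptions, not MONAI's implementation: the helper names are hypothetical, the volumes are shrunk to (32, 32, 32) so the sketch runs quickly, and the choice of nearest-neighbour interpolation for the label is made for this sketch only. Note that `scipy.ndimage.rotate` takes its angle in degrees.

```python
import numpy as np
from scipy import ndimage

rng = np.random.RandomState(0)

# Small stand-ins for the (256, 256, 256) image and segmentation, channel first
image = rng.rand(1, 32, 32, 32).astype(np.float32)
label = (image > 0.5).astype(np.uint8)


def rotate_in_plane(volume, degrees, order):
    """Rotate a channel-first volume in the plane of axes (1, 2)."""
    return ndimage.rotate(volume, degrees, axes=(1, 2), reshape=False, order=order)


def random_crop_start(spatial_shape, patch_size, rng):
    """Pick the corner of a random patch that fits inside the volume."""
    return tuple(rng.randint(0, s - p + 1) for s, p in zip(spatial_shape, patch_size))


def crop(volume, start, patch_size):
    """Cut a patch from a channel-first volume, keeping the channel axis."""
    slices = tuple(slice(b, b + p) for b, p in zip(start, patch_size))
    return volume[(slice(None),) + slices]


patch_size = (8, 8, 8)
# One shared corner so the image patch and the label patch stay aligned
start = random_crop_start(image.shape[1:], patch_size, rng)

# order=1 (linear) for the image; order=0 (nearest) keeps label values intact
image_patch = crop(rotate_in_plane(image, 45.0, order=1), start, patch_size)
label_patch = crop(rotate_in_plane(label, 45.0, order=0), start, patch_size)
print(image_patch.shape, label_patch.shape)
```

Drawing the crop corner once and reusing it for both arrays is what keeps each image patch paired with the matching region of its segmentation.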

In [8]:
images = sorted(glob.glob(os.path.join(root_dir, "im*.nii.gz")))
segs = sorted(glob.glob(os.path.join(root_dir, "seg*.nii.gz")))

imtrans = Compose(
    [
        LoadNifti(image_only=True),
        ScaleIntensity(),
        AddChannel(),
        Rotate(angle=np.pi / 4),
        RandSpatialCrop((64, 64, 64), random_size=False),
        ToTensor(),
    ]
)

segtrans = Compose(
    [
        LoadNifti(image_only=True),
        AddChannel(),
        Rotate(angle=np.pi / 4),
        RandSpatialCrop((64, 64, 64), random_size=False),
        ToTensor(),
    ]
)

ds = ArrayDataset(images, imtrans, segs, segtrans)
loader = torch.utils.data.DataLoader(
    ds, batch_size=3, num_workers=8, pin_memory=torch.cuda.is_available()
)

im, seg = first(loader)
print(im.shape, seg.shape)

torch.Size([3, 1, 64, 64, 64]) torch.Size([3, 1, 64, 64, 64])


In [9]:
%timeit -n 3 data = next(iter(loader))

18.6 s ± 1.82 s per loop (mean ± std. dev. of 7 runs, 3 loops each)


(The above results were measured on an Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz.)

### 3. Test image-patch loading with preprocessing on GPU

- randomly rotate the (256, 256, 256)-voxel volume in the plane axes=(1, 2)
- extract random (64, 64, 64) patches
- implemented in MONAI using native PyTorch resampling
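
To show what resampling a rotated patch with native PyTorch operations can look like, here is a minimal sketch built on `torch.nn.functional.affine_grid` and `grid_sample`. It is an illustration under stated assumptions, not MONAI's `RandAffine` implementation: `sample_patch` is a hypothetical helper, the volume is shrunk to (32, 32, 32), and the random ranges are chosen for this sketch. The code falls back to the CPU when no GPU is available.

```python
import math

import torch
import torch.nn.functional as F

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Small stand-in for one (256, 256, 256) volume: (batch, channel, D, H, W)
volume = torch.rand(1, 1, 32, 32, 32, device=device)
patch_size = (8, 8, 8)


def sample_patch(volume, angle, shift, patch_size, mode):
    """Resample one rotated, shifted patch from `volume`.

    angle: rotation in radians, in the plane of the last two spatial axes
    shift: patch centre in normalised [-1, 1] coordinates, ordered (x, y, z)
    """
    # Fraction of the volume covered by the patch, in (x, y, z) = (W, H, D) order
    scale = [p / s for p, s in zip(reversed(patch_size), reversed(volume.shape[2:]))]
    cos, sin = math.cos(angle), math.sin(angle)
    # Rotation times scaling, with the translation in the last column
    theta = torch.tensor(
        [
            [cos * scale[0], -sin * scale[1], 0.0, shift[0]],
            [sin * scale[0], cos * scale[1], 0.0, shift[1]],
            [0.0, 0.0, scale[2], shift[2]],
        ],
        dtype=volume.dtype,
        device=volume.device,
    ).unsqueeze(0)
    out_size = (volume.shape[0], volume.shape[1]) + tuple(patch_size)
    grid = F.affine_grid(theta, out_size, align_corners=False)
    return F.grid_sample(volume, grid, mode=mode, align_corners=False)


# Random parameters (illustrative ranges, not the notebook's settings)
angle = (torch.rand(1).item() * 2 - 1) * math.pi / 4
shift = ((torch.rand(3) * 2 - 1) * 0.5).tolist()
image_patch = sample_patch(volume, angle, shift, patch_size, mode="bilinear")
print(image_patch.shape, image_patch.device)
```

Because the output grid has only as many points as the patch, rotation and cropping happen in a single resampling pass and the full rotated volume is never materialised. Passing `mode="nearest"` with the same `angle` and `shift` would give the matching segmentation patch.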

In [10]:
images = sorted(glob.glob(os.path.join(root_dir, "im*.nii.gz")))
segs = sorted(glob.glob(os.path.join(root_dir, "seg*.nii.gz")))

# same parameters, but different interpolation modes for image and segmentation
rand_affine_img = RandAffine(
    prob=1.0,
    rotate_range=np.pi / 4,
    translate_range=(96, 96, 96),
    spatial_size=(64, 64, 64),
    mode="bilinear",
    as_tensor_output=True,
    device=torch.device("cuda:0"),
)
rand_affine_seg = RandAffine(
    prob=1.0,
    rotate_range=np.pi / 4,
    translate_range=(96, 96, 96),
    spatial_size=(64, 64, 64),
    mode="nearest",
    as_tensor_output=True,
    device=torch.device("cuda:0"),
)

imtrans = Compose(
    [LoadNifti(image_only=True), ScaleIntensity(), AddChannel(), rand_affine_img, ToTensor()]
)

segtrans = Compose([LoadNifti(image_only=True), AddChannel(), rand_affine_seg, ToTensor()])

ds = ArrayDataset(images, imtrans, segs, segtrans)
loader = torch.utils.data.DataLoader(ds, batch_size=3, num_workers=0)

im, seg = first(loader)

print(im.shape, seg.shape)

torch.Size([3, 1, 64, 64, 64]) torch.Size([3, 1, 64, 64, 64])


In [11]:
%timeit -n 3 data = next(iter(loader))

4.94 s ± 1.54 s per loop (mean ± std. dev. of 7 runs, 3 loops each)


In [12]:
print(torch.cuda.get_device_name(0))
print(torch.cuda.memory_summary(0, abbreviated=True))

Quadro GV100
|                  PyTorch CUDA memory summary, device ID 0                 |
|---------------------------------------------------------------------------|
|            CUDA OOMs: 0            |        cudaMalloc retries: 0         |
|        Metric         | Cur Usage  | Peak Usage | Tot Alloc  | Tot Freed  |
|---------------------------------------------------------------------------|
| Allocated memory      |    6144 KB |   94208 KB |   13026 MB |   13020 MB |
|---------------------------------------------------------------------------|
| Active memory         |    6144 KB |   94208 KB |   13026 MB |   13020 MB |
|---------------------------------------------------------------------------|
| GPU reserved memory   |  159744 KB |  159744 KB |  159744 KB |       0 B  |
|---------------------------------------------------------------------------|
| Non-releasable memory |   14336 KB |   77823 KB |   11789 MB |   11775 MB |
|---------------------------------------------------------------------------|

## Cleanup data directory

Remove the directory if a temporary one was used.

In [13]:
if directory is None:
    shutil.rmtree(root_dir)