### Basic Example of WSIDataloader class usage

This notebook shows how to use the `WSIDataloader` class to efficiently load randomly selected patches from a list of WSIs using multiple CPU workers, and to apply data augmentation to these patches using GPU acceleration.

To define the `patch_generator` function, we chose to use the [`TIAToolbox`](https://github.com/TissueImageAnalytics/tiatoolbox) library, but any similar implementation using `openslide` as a backend should work.

This notebook was executed on a CentOS machine with 32 GB of RAM, a Core i9-13900 and an RTX 4090. We strongly recommend storing the WSIs on an SSD for fast random access.

In the cell below, we define two things:
1. The `CustomSlidingWindowPatchExtractor` class: inherits from TIAToolbox's `SlidingWindowPatchExtractor` and only adds a `__len__(self)` method, so that the number of extracted patches can be queried with `len()`.

2. The `get_patches(wsi_path)` function: takes the path to a WSI as input and returns an iterator over all the WSI patches selected by the given parameters. The `mask` used here is the simplest `TIAToolbox` Otsu mask implementation. The other parameters make the iterator extract all non-overlapping patches of size 224x224 at 20x objective magnification.


In [1]:
from pathlib import Path

from tiatoolbox.tools.patchextraction import SlidingWindowPatchExtractor
from tiatoolbox.wsicore.wsireader import WSIReader
import torchvision.transforms.v2 as transforms
from torch.nn import Sequential
from tqdm import tqdm

from wsiloader import WSIDataloader


class CustomSlidingWindowPatchExtractor(SlidingWindowPatchExtractor):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def __len__(self):
        # Number of patch locations kept after masking (0 if none were computed)
        return self.locations_df.shape[0] if self.locations_df is not None else 0


def get_patches(wsi_path: str):
    wsi = WSIReader.open(input_img=wsi_path)

    # This mask can and should be adapted to your data
    # Here we use a simple Otsu mask
    mask = wsi.tissue_mask(resolution=1.25, units="power")

    # All of these parameters can and should be adapted to your specific needs
    patch_size = 224  # Square patch of size 224x224
    resolution = 20   # 20x magnification objective
    overlap    = 0.0  # No overlap between extracted patches

    patches = CustomSlidingWindowPatchExtractor(
        input_img=wsi,
        patch_size=(patch_size,)*2,
        stride=(patch_size - int(overlap*patch_size),)*2,
        resolution=resolution,
        units="power",
        input_mask=mask,
        min_mask_ratio=0.3,
        within_bound=True,
    )

    return patches
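
The stride passed to the extractor is derived from the `overlap` fraction. As a small standalone sketch (the helper name `stride_from_overlap` is ours and is not part of TIAToolbox or `wsiloader`), the same expression gives the following strides for a 224-pixel patch:

```python
def stride_from_overlap(patch_size: int, overlap: float) -> int:
    """Return the sliding-window stride in pixels for a fractional overlap."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Same expression as in get_patches: the window advances by the part
    # of the patch that is not shared with the previous patch.
    return patch_size - int(overlap * patch_size)


for overlap in (0.0, 0.25, 0.5):
    print(overlap, stride_from_overlap(224, overlap))
```

With `overlap = 0.0` the stride equals the patch size, which is why the patches in this notebook do not overlap.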



In the cell below, we create a `WSIDataloader` instance. For this example, we used a subset of 27 WSIs (39 GB total) from the normal training set of the [Camelyon16](https://camelyon16.grand-challenge.org/Data/) dataset. We also define a set of strong augmentations typically seen in contrastive learning frameworks. Note that instead of the usual `transforms.Compose`, we use `torch.nn.Sequential` so that the transforms can be executed on the available CUDA device. The `WSIDataloader` class loads the data using multiple `torch.utils.data.DataLoader` CPU workers and then applies the transforms on the GPU.

In [2]:
# List WSI samples from the normal train set of the Camelyon16 dataset
wsi_paths = list(Path("/data/camelyon_samples").glob("*.tif"))

augmentations = Sequential(
    transforms.RandomResizedCrop(224, scale=(0.2, 1.), antialias=True),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.RandomApply([transforms.GaussianBlur(3, [.1, 2.])], p=0.5),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.Normalize(mean=[0.6684, 0.5115, 0.6791],
                         std=[0.2521, 0.2875, 0.2100])
)
augmentations.to("cuda")

# PyTorch DataLoader parameters:
dataloader_params = dict(
    batch_size=128,
    shuffle=True,
    drop_last=True,
    num_workers=8,
)

loader = WSIDataloader(
    wsi_paths=wsi_paths,
    patch_generator=get_patches,
    transforms=augmentations,
    transforms_device="cuda",
    **dataloader_params,
)
print(f"Number of batches: {len(loader)}")


Building WSI index: 100%|█████████████████████████████| 26/26 [00:26<00:00,  1.00s/it, [8612 > 146,414 indexed patches]]

Number of batches: 1143
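
The reported number of batches is consistent with the 146,414 indexed patches shown in the progress bar, assuming `len(loader)` follows the usual PyTorch convention for `drop_last` (we have not checked the `WSIDataloader` source for this). A quick sanity check, with a hypothetical helper `num_batches`:

```python
def num_batches(num_patches: int, batch_size: int, drop_last: bool) -> int:
    """Batches per pass, following the usual DataLoader length convention."""
    if drop_last:
        # The final incomplete batch is discarded.
        return num_patches // batch_size
    # Ceiling division: the final incomplete batch is kept.
    return -(-num_patches // batch_size)


print(num_batches(146_414, 128, drop_last=True))   # 1143
print(num_batches(146_414, 128, drop_last=False))  # 1144
```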





We can now iterate through the loader the same way as with a `torch.utils.data.DataLoader`; for every batch, patches are randomly sampled across all WSIs.

In [3]:
for batch in tqdm(loader):
    pass

100%|██████████| 1143/1143 [02:23<00:00,  7.98it/s]
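
To illustrate what sampling across all WSIs can look like, here is a minimal sketch of one common approach: a flat global patch index that is mapped back to a slide and a patch within that slide. This is an assumption for illustration only and is not taken from the `WSIDataloader` implementation; the patch counts and the `locate` helper are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate

# Hypothetical patch counts for three slides (not taken from the notebook).
patches_per_wsi = [5, 3, 4]
# Cumulative offsets: slide i owns global indices [offsets[i], offsets[i + 1]).
offsets = [0, *accumulate(patches_per_wsi)]


def locate(global_index: int) -> tuple[int, int]:
    """Map a global patch index to (slide index, patch index within slide)."""
    if not 0 <= global_index < offsets[-1]:
        raise IndexError(global_index)
    wsi_idx = bisect_right(offsets, global_index) - 1
    return wsi_idx, global_index - offsets[wsi_idx]


# Shuffling the global indices mixes patches from all slides in each batch.
rng = random.Random(0)
indices = list(range(offsets[-1]))
rng.shuffle(indices)
batch = [locate(i) for i in indices[:4]]
print(batch)
```

Because the shuffle operates on the global index rather than per slide, a single batch can contain patches from several WSIs.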


The cell below demonstrates the benefit of applying the transforms sequentially on the GPU rather than on the CPU in the DataLoader workers. When the `transforms_device` parameter is set to `cpu`, the transforms are passed down to the dataset and applied in parallel by the DataLoader workers, following the default DataLoader behaviour. In this simple example, running the transforms on the GPU reduces the time for a full pass over the loader from 3:35 to 2:23.

In [4]:
loader = WSIDataloader(
    wsi_paths=wsi_paths,
    patch_generator=get_patches,
    transforms=augmentations.to("cpu"),
    transforms_device="cpu",
    **dataloader_params,
)
print(f"Number of batches: {len(loader)}")

for batch in tqdm(loader):
    pass

Building WSI index: 100%|█████████████████████████████| 26/26 [00:26<00:00,  1.00s/it, [8612 > 146,414 indexed patches]]


Number of batches: 1143


100%|██████████| 1143/1143 [03:35<00:00,  5.31it/s]
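
The two timings reported by the progress bars can be converted into throughput figures. The sketch below only uses numbers shown in this notebook (1143 batches of 128 patches, 02:23 on GPU, 03:35 on CPU); the helper `to_seconds` is ours.

```python
def to_seconds(mm_ss: str) -> int:
    """Convert a tqdm-style 'MM:SS' duration into seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


num_batches, batch_size = 1143, 128
gpu_s = to_seconds("02:23")  # transforms applied on the GPU
cpu_s = to_seconds("03:35")  # transforms applied in the CPU workers

gpu_rate = num_batches * batch_size / gpu_s  # patches per second
cpu_rate = num_batches * batch_size / cpu_s
speedup = cpu_s / gpu_s

print(f"GPU transforms: {gpu_rate:.0f} patches/s")
print(f"CPU transforms: {cpu_rate:.0f} patches/s")
print(f"Speedup: {speedup:.2f}x")
```

On this machine, that corresponds to roughly a 1.5x speedup for a full pass over the loader; the figure will vary with the hardware, the number of workers and the chosen augmentations.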
