### Multiple Magnification loop example

Recent papers report that training models on multiple magnifications can substantially improve model performance, for example by mixing patches extracted at both 20x and 10x magnification instead of using only one or the other.

This notebook shows how to use the `WSIDataloader` class to mix magnifications among the patches extracted for the dataset.

To define the `patch_generator` function, we chose to use the [`TIAToolbox`](https://github.com/TissueImageAnalytics/tiatoolbox) library, but any other similar implementation using `openslide` as a backend should work.

This notebook was executed on a CentOS machine with 32 GB of RAM, a Core i9 13900 and an RTX 4090. We strongly recommend storing the WSIs on an SSD for fast random access.

In the cell below, we define two things:
1. The `CustomSlidingWindowPatchExtractor` class: inherits from TIAToolbox's `SlidingWindowPatchExtractor` and only adds a `__len__(self)` method, so that the extractor can be used as a sized iterable and its number of patches is available through `len()`.

2. The `get_patches(wsi_path)` function: takes the path to a WSI as input and returns an iterable over all the WSI patches as defined by the given parameters. The `mask` used here is the simplest `TIAToolbox` Otsu mask implementation. Note that the resolution parameter is not fixed: it is sampled at random from three options (20x, 10x and 5x) each time the function is called.


In [1]:
from pathlib import Path

import numpy as np
from tiatoolbox.tools.patchextraction import SlidingWindowPatchExtractor
from tiatoolbox.wsicore.wsireader import WSIReader
import torchvision.transforms.v2 as transforms
from torch.nn import Sequential
from tqdm import tqdm

from wsiloader import WSIDataloader


class CustomSlidingWindowPatchExtractor(SlidingWindowPatchExtractor):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def __len__(self):
        # Number of patch locations found by the extractor (0 if none were computed)
        return self.locations_df.shape[0] if self.locations_df is not None else 0


def get_patches(wsi_path: str):
    wsi = WSIReader.open(input_img=wsi_path)

    # This mask can and should be adapted to your data
    # Here we use a simple Otsu mask
    mask = wsi.tissue_mask(resolution=1.25, units="power")

    # All of these parameters can and should be adapted to your specific needs
    patch_size = 224  # Square patch of size 224x224
    resolution = np.random.choice([20, 10, 5])  # Randomly choose a magnification objective among 20x, 10x and 5x
    overlap = 0.0  # No overlap between extracted patches

    patches = CustomSlidingWindowPatchExtractor(
        input_img=wsi,
        patch_size=(patch_size,)*2,
        stride=(patch_size - int(overlap*patch_size),)*2,
        resolution=resolution,
        units="power",
        input_mask=mask,
        min_mask_ratio=0.3,
        within_bound=True,
    )

    return patches
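
The `stride` argument above is derived from `patch_size` and `overlap`. As a minimal sketch of that arithmetic, the hypothetical helper below (`stride_from_overlap` is not part of TIAToolbox or `wsiloader`) reproduces the same expression for a few overlap values:

```python
def stride_from_overlap(patch_size: int, overlap: float) -> int:
    """Return the step between patches for a given fractional overlap.

    Mirrors the expression used in get_patches:
    patch_size - int(overlap * patch_size)
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return patch_size - int(overlap * patch_size)


for overlap in (0.0, 0.25, 0.5):
    # 0.0 -> 224 (no overlap), 0.25 -> 168, 0.5 -> 112
    print(overlap, stride_from_overlap(224, overlap))
```

With `overlap = 0.0`, as in the cell above, the stride equals the patch size, so neighbouring patches touch without overlapping.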



In the cell below, we create a `WSIDataloader` instance. For this example, we used a subset of 27 WSIs (39 GB in total) from the normal train set of the [Camelyon16](https://camelyon16.grand-challenge.org/Data/) dataset. We also define a set of strong augmentations of the kind typically seen in contrastive learning frameworks. Note that instead of the usual `transforms.Compose`, we use `torch.nn.Sequential` so that the transforms can run on the available CUDA device. The `WSIDataloader` class loads the data using multiple `torch.utils.data.DataLoader` CPU workers and then applies the transforms on the GPU.

In [5]:
# List WSI samples from the normal train set of the Camelyon16 dataset
wsi_paths = list(Path("/home/travail/data/camelyon").glob("*.tif"))

augmentations = Sequential(
    transforms.RandomResizedCrop(224, scale=(0.2, 1.), antialias=True),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.RandomApply([transforms.GaussianBlur(3, [.1, 2.])], p=0.5),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.Normalize(mean=[0.6684, 0.5115, 0.6791],
                         std=[0.2521, 0.2875, 0.2100])
)
augmentations.to("cuda")

# PyTorch DataLoader parameters:
dataloader_params = dict(
    batch_size=128,
    shuffle=True,
    drop_last=True,
    num_workers=8,
)

loader = WSIDataloader(
    wsi_paths=wsi_paths,
    patch_generator=get_patches,
    transforms=augmentations,
    transforms_device="cuda",
    **dataloader_params,
)
print(f"Number of batches: {len(loader)}")


Building WSI index: 100%|██████████████████████████████| 26/26 [00:25<00:00,  1.03it/s, [8612 > 72,940 indexed patches]]

Number of batches: 569





Note the number of indexed patches above: 72,940.
These patches are a mix of patches extracted at 20x, 10x and 5x magnification objectives.
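
The total depends strongly on which magnification each WSI happened to draw, because halving the magnification roughly quarters the number of patches a slide yields. The sketch below illustrates this under loud assumptions: a hypothetical 89,600 x 89,600 pixel slide scanned at 40x, non-overlapping 224-pixel patches, and no tissue mask (so real counts will be lower). `approx_patch_count` is an illustrative helper, not part of `wsiloader` or TIAToolbox.

```python
def approx_patch_count(width_px: int, height_px: int,
                       base_power: float, target_power: float,
                       patch_size: int = 224) -> int:
    """Rough count of non-overlapping patches at a target magnification.

    Assumes the slide dimensions are given at base_power and ignores
    tissue masking, so this is an upper bound on what would be indexed.
    """
    scale = target_power / base_power
    cols = int(width_px * scale) // patch_size
    rows = int(height_px * scale) // patch_size
    return cols * rows


# Hypothetical slide: 89,600 x 89,600 pixels at 40x
for power in (20, 10, 5):
    # 20x -> 40000, 10x -> 10000, 5x -> 2500
    print(f"{power}x: {approx_patch_count(89_600, 89_600, 40, power)}")
```

Each step down in magnification divides the count by about four, so a different random draw per WSI can change the size of the index considerably.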

For now, each WSI is tied to the magnification that was sampled when the index was built, but we can rebuild the patch index, and resample the magnifications, using the `WSIDataloader.reset_index()` method:

In [7]:
loader.reset_index()

Building WSI index: 100%|██████████████████████████████| 26/26 [00:31<00:00,  1.22s/it, [2055 > 47,325 indexed patches]]


This time, 47,325 patches have been indexed, meaning that different magnification objectives were sampled for the WSIs. In a training loop, the `reset_index()` method could be called after every epoch to resample patches and apply different magnification objectives.
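
One way to wire this into training is sketched below. It assumes that the loader can be iterated over to yield batches (not shown in this notebook) and that it exposes `reset_index()` as above. `run_epochs` and `train_step` are hypothetical names introduced for illustration only.

```python
from typing import Any, Callable


def run_epochs(loader: Any, num_epochs: int,
               train_step: Callable[[Any], None]) -> int:
    """Iterate over the loader for num_epochs, re-indexing between epochs.

    `loader` is assumed to be iterable and to provide reset_index().
    Returns the total number of batches processed.
    """
    steps = 0
    for epoch in range(num_epochs):
        for batch in loader:
            train_step(batch)
            steps += 1
        # Resample magnifications for the next epoch; skipped after the
        # last epoch because rebuilding the index is not free.
        if epoch < num_epochs - 1:
            loader.reset_index()
    return steps


# Usage sketch (my_train_step is a placeholder for your own training code):
# run_epochs(loader, num_epochs=10, train_step=my_train_step)
```

Since rebuilding the index took around 25 to 30 seconds for the slides used here, skipping the call after the final epoch avoids one unnecessary pass; calling it less often than every epoch is another option if indexing becomes a bottleneck.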