Skip to content


Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time

Whole-slide classification pipeline — end-to-end

This repository will give an overview on how to use streaming to train whole slides to single labels. Streaming is an implementation of convolutions using tiling and gradient checkpointing to save memory.

alt text

Papers until now about this method (please consider citing when using this code):

  • Application on prostate data, paper: H. Pinckaers, W. Bulten, J. Van der Laak and G. Litjens, "Detection of prostate cancer in whole-slide images through end-to-end training with image-level labels," in IEEE Transactions on Medical Imaging, doi: 10.1109/TMI.2021.3066295 - Open Access.

  • Methods paper: H. Pinckaers, B. van Ginneken and G. Litjens, "Streaming convolutional neural networks for end-to-end learning with multi-megapixel images," in IEEE Transactions on Pattern Analysis and Machine Intelligence, doi: 10.1109/TPAMI.2020.3019563 - older preprint

Other resources:



  • Install libvips
  • See requirements.txt, install via pip install -r requirements.txt
  • Pytorch (1.6+) for mixedprecision support
  • Make sure the repo is in your $PYTHONPATH

Hardware requirements:

  • GPU with 11 GB memory (smaller could work with smaller tile-sizes)
  • Preferably 32+ GB RAM (go for less workers when you have less memory available)

Windows users

  • Please see issues #2 and #3 for help with building the cpp extensions.


For now, only the ResNet-34 implementation is checked. Other networks could be implemented (please make an issue, I can help).

Input sizes

Recommended image sizes (microscopy magnification):

  • 4096x4096 for spacing 4.0 (2.5x)
  • 8192x8192 for spacing 2.0 (5x)
  • 16384x16384 for spacing 1.0 (10x)


0. Prepare train.csv and val.csv

For this pipeline you will need two csv files: train and val.csv. The syntax is easy:


1. Prepare data

python streaming/ \
    --csv='' \
    --slide-dir='' \
    --filetype='tif' \
    --save-dir='' \

2. Train network!

python streaming/ \
    --name=test-name \
    --train_csv='train.csv' \
    --val_csv='val.csv' \
    --data_dir='/local/data' \
    --save_dir='/home/user/models' \
    --lr=2e-4 \
    --num_workers=1 \

3. Options

There are quite some options (disable boolean options by prepending with no_, so e.g., no_mixedprecision):

Required options Description
name: str The name of the current experiment, used for saving checkpoints.
num_classes: int The number of classes in the task.
train_csv: str The filenames (without extension) and labels of train set.
val_csv: str The filenames (without extension) and labels of validation or test set.
data_dir: str The directory where the images reside.
save_dir: str Where to save the checkpoints.
Optional options
filetype: str default: '.jpg'. The file-extension of the images.
Train options
lr: float default: 1e-4 . Learning rate.
batch_size: int default: 16. Effective mini-batch size.
pretrained: bool default: True. Whether to use ImageNet weights.
image_size: int default: 16384. Effective input size of the network.
tile_size: int default: 5120. The input/tile size of the streaming-part of the network.
epochs: int default: 50. How many epochs to train.
multilabel: bool default: False.
regression: bool default: False.
Validation options
validation: bool default: True. Whether to run on validation set.
validation_interval: int default: 1. How many times to run on validation set, after n train epochs.
epoch_multiply: int default: 1. This will increase the size of one train epoch by reusing train images.
Increase speed
mixedprecision: bool default: True. Paper is trained with full precision, but this is faster.
variable_input_shapes: bool default: False. When the images vary a lot with size, this helps with speed.
normalize_on_gpu: bool default: True. Helps with RAM usage of dataloaders.
num_workers: int default: 2. Number of dataloader workers.
convert_to_vips: bool default: False.
Model options
resnet: bool default: True. Only resnet is tested so far.
mobilenet: bool default: False. Experimental.
train_all_layers: bool default: False. Whether to finetune whole network, or only last block.
Save and logging options
resuming: bool default: True. Will restart from the last checkpoint with same experiment-name.
resume_name: str default: ''. Restart from another experiment with this name.
resume_epoch: int default: -1. Restart from specific epoch.
save: bool default: True. Save checkpoints.
progressbar: bool default: True. Show the progressbar.
Evaluation options
weight_averaging: bool default: False. Average weights over 5 epochs around picked epoch.
only_eval: bool default: False. Only do one evaluation epoch.
Obscure train options
gather_batch_on_one_gpu: bool default: False.
accumulate_batch: int default: -1. Do not touch, is calculated automatically.
weighted_sampler: bool default: False. Oversample minority class, only works in binary tasks.
train_set_size: int default: -1. Sometimes you want to test on smaller train-set you can limit number here.
train_streaming_layers: bool default: True. Whether to backpropagate the streaming-part of network.


Use streaming to train whole-slides images with single image-level labels, by reducing GPU memory requirements with 99%.







No releases published


No packages published