We are witnessing the emergence of many new feature extractors trained using self-supervised learning on large pathology datasets. This repository aims to provide a comprehensive list of these models, alongside key information about them.
I aim to update this list as new models are released, but please submit a pull request / issue for any models I have missed!
Name | Group | Weights | Released | SSL | WSIs | Tiles | Patients | Batch size | Iterations | Architecture | Parameters | Embed dim | Input size | Dataset | Links |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CTransPath | Sichuan University / Tencent AI Lab | ✅ | Dec 2021* | SRCL | 32K | 16M | | | | Swin-Transformer | | 768 | 224 | TCGA, PAIP | |
RetCCL | Sichuan University / Tencent AI Lab | ✅ | Dec 2021* | CCL | 32K | 16M | | | | ResNet-50 | | 2048 | 224 | TCGA, PAIP | |
REMEDIS | Google Research | ✅ | May 2022* | SimCLR/BiT | 29K | 50M | 11K cases | 4096 | 1.2M | ResNet-50 | | 2048 | 224 | TCGA | |
HIPT | Mahmood Lab | ✅ | Jun 2022* | DINOv1 | 11K | 100M | | 256 | 400K | ViT-S | | 384 | 256 | TCGA | |
Lunit-DINO | Lunit | ✅ | Dec 2022* | DINOv1 | 21K | | | | | ViT-S | | 384 | 224 | TCGA | |
Lunit-{BT,MoCoV2,SwAV} | Lunit | ✅ | Dec 2022* | {BT,MoCoV2,SwAV} | 21K | | | | | ResNet-50 | | 2048 | 224 | TCGA | |
Phikon | Owkin | ✅ | Jul 2023* | iBOT | 6.1K | 43M | 5.6K | 1440 | 155K | ViT-B | 86M | 768 | 224 | TCGA | |
CONCH | Mahmood Lab | ✅ | Jul 2023* | iBOT & vision-language pretraining | 21K | 16M | | 1024 | 80 epochs | ViT-B | 86M | 768 | 224 | proprietary | |
UNI | Mahmood Lab | ✅ | Aug 2023* | DINOv2 | 100K | 100M | | | | ViT-L | | 1024 | 224 | proprietary (Mass-100K) | |
Virchow | Paige / Microsoft | ✅ | Sep 2023* | DINOv2 | 1.5M | | 120K | | | ViT-H | 632M | 2560 | 224 | proprietary (from MSKCC) | |
Campanella et al. (MAE) | Thomas Fuchs Lab | ❌ | Oct 2023* | MAE | 420K | 3.3B | 77K | 1080 | 1.3K INE | ViT-L | 303M | | 224 | proprietary (MSHS) | |
Campanella et al. (DINO) | Thomas Fuchs Lab | ❌ | Oct 2023* | DINOv1 | 420K | 3.3B | 77K | 1440 | 2.5K INE | ViT-L | 303M | | 224 | proprietary (MSHS) | |
Path Foundation | Google | ✅ | Oct 2023* | SimCLR, MSN | 6K | 60M | | 1024 | | ViT-S | | 384 | 224 | TCGA | |
PathoDuet | Shanghai Jiao Tong University | ✅ | Dec 2023* | inspired by MoCoV3 | 11K | 13M | | 2048 | 100 epochs | ViT-B | | 4096 | 224 | TCGA | |
RudolfV | Aignostics | ❌ | Jan 2024* | DINOv2 | 100K | 750M | 36K | | | ViT-L | | | 224 | proprietary (from EU & US), TCGA | |
kaiko | kaiko.ai | ✅ | Mar 2024* | DINOv2 | 29K | 260M** | | 512 | 200 INE | ViT-L | | 1024 | 224 | TCGA | |
PLUTO | PathAI | ❌ | May 2024* | DINOv2 (+ MAE and Fourier loss) | 160K | 200M | | | | FlexiViT-S | 22M | | 224 | proprietary (PathAI) | |
BEPH | Shanghai Jiao Tong University | ✅ | May 2024* | BEiTv2 | 12K | 12M | | 1024 | | ViT-B | 193M | 1024 | 224 | TCGA | |
Prov-GigaPath | Microsoft / Providence | ✅ | May 2024* | DINOv2 | 170K | 1.4B | 30K | 384 | | ViT | | 1536 | 224 | proprietary (Providence) | |
Hibou-B | HistAI | ✅ | Jun 2024* | DINOv2 | 1.1M | 510M | 310K cases | 1024 | 500K | ViT-B | 86M | 768 | 224 | proprietary | |
Hibou-L | HistAI | ✅ | Jun 2024* | DINOv2 | 1.1M | 1.2B | 310K cases | 1024 | 1.2M | ViT-L | 304M | 1024 | 224 | proprietary | |
H-optimus-0 | Bioptimus | ✅ | Jul 2024* | DINOv2/iBOT | 500K (across 4,000 clinics) | >100M | 200K | | | ViT-G with 4 registers | 1.1B | 1536 | 224 | proprietary | |
mSTAR | Smart Lab | ❌ | Jul 2024* | mSTAR (multimodal) | 10K | | 10K | | | ViT-L | | | 224 | TCGA | |
Virchow 2 | Paige / Microsoft | ✅ | Aug 2024* | DINOv2 (+ ECT and KDE) | 3.1M | 2B | 230K | 4096 | | ViT-H with 4 registers | 632M | 3584 | 224 | proprietary (from MSKCC and international sites) | |
Virchow 2G | Paige / Microsoft | ❌ | Aug 2024* | DINOv2 (+ ECT and KDE) | 3.1M | 2B | 230K | 3072 | | ViT-G with 8 registers | 1.9B | 3584 | 224 | proprietary (from MSKCC and international sites) | |
Phikon-v2 | Owkin | ✅ | Sep 2024* | DINOv2 | 58.4K | 456M | | 4096 | 250K | ViT-L | 307M | 1024 | 224 | PANCAN-XL (TCGA, CPTAC, GTEx, proprietary) | |
Notes:
- Models trained on >100K slides may be considered foundation models and are marked in bold
- # of WSIs, tiles, and patients are reported to 2 significant figures
- INE = ImageNet epochs
- Order is chronological
- Some of these feature extractors have been evaluated in a benchmarking study for whole-slide classification.
- ** = inferred from other numbers provided in the paper
The table below lists models that produce slide-level or patient-level embeddings without supervision.
Name | Group | Weights | Released | SSL | WSIs | Patients | Batch size | Iterations | Architecture | Parameters | Embed dim | Patch size | Dataset | Links |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
GigaSSL | CBIO | ✅ | Dec 2022* | SimCLR | 12K | | | 1K epochs | ResNet-18 | | 256 | 256 | TCGA | |
PRISM | Paige / Microsoft | ✅ | May 2024* | contrastive (with language) | 590K (190K text reports) | 190K | 64 (x4) | 75K (10 epochs) | Perceiver + BioGPT | | 1280 | 224 | proprietary | |
Prov-GigaPath | Microsoft / Providence | ✅ | May 2024* | DINOv2 | 170K | 30K | | | LongNet | 86M | 1536 | 224 | proprietary (Providence) | |
MADELEINE | Mahmood Lab | ✅ | Aug 2024* | contrastive (InfoNCE & OT) | 16K | 2K | 120 | 90 epochs | multi-head attention MIL | | 512 | 256 | ACROBAT, BWH Kidney (proprietary) | |
CHIEF | Yu Lab | ✅ | Sep 2024* | | | | | | | | | | | |
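What distinguishes the slide-level models above is how they pool a bag of tile embeddings into a single slide vector. One common family is gated attention-based MIL (Ilse et al., 2018), the lineage behind attention-MIL aggregators like the one listed for MADELEINE. A generic sketch of that pooling — illustrative only, not the released code of any model in the table:

```python
import torch
from torch import nn

class GatedAttentionPool(nn.Module):
    """Pool N tile embeddings into one slide embedding via gated attention.

    Generic attention-MIL pooling; dimensions are hypothetical and chosen
    to match a 1024-d tile extractor (e.g. a ViT-L from the first table).
    """

    def __init__(self, embed_dim: int = 1024, hidden_dim: int = 256):
        super().__init__()
        self.v = nn.Sequential(nn.Linear(embed_dim, hidden_dim), nn.Tanh())
        self.u = nn.Sequential(nn.Linear(embed_dim, hidden_dim), nn.Sigmoid())
        self.w = nn.Linear(hidden_dim, 1)

    def forward(self, tiles: torch.Tensor) -> torch.Tensor:
        # tiles: (N, embed_dim) -> one attention score per tile: (N, 1)
        scores = self.w(self.v(tiles) * self.u(tiles))
        weights = torch.softmax(scores, dim=0)  # weights sum to 1 over tiles
        return (weights * tiles).sum(dim=0)     # weighted mean: (embed_dim,)

# 500 fake tile embeddings standing in for one slide's extracted features
tile_embeddings = torch.randn(500, 1024)
pool = GatedAttentionPool(embed_dim=1024)
with torch.inference_mode():
    slide_embedding = pool(tile_embeddings)
print(slide_embedding.shape)  # torch.Size([1024])
```

The attention weights also give a per-tile relevance map, which is why this family of aggregators is popular for interpretability in pathology.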