georg-wolflein/pathology-foundation-models

# Pathology Feature Extractors and Foundation Models

We are witnessing the emergence of many new feature extractors trained using self-supervised learning (SSL) on large pathology datasets. This repository aims to provide a comprehensive list of these models, alongside key information about them.

I aim to update this list as new models are released, but please submit a pull request / issue for any models I have missed!

## Patch-level models

| Name | Group | Weights | Released | SSL | WSIs | Tiles | Patients | Batch size | Iterations | Architecture | Parameters | Embed dim | Input size | Dataset | Links |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CTransPath | Sichuan University / Tencent AI Lab | | Dec 2021* | SRCL | 32K | 16M | | | | Swin-Transformer | | 768 | 224 | TCGA, PAIP | |
| RetCCL | Sichuan University / Tencent AI Lab | | Dec 2021* | CCL | 32K | 16M | | | | ResNet-50 | | 2048 | 224 | TCGA, PAIP | |
| REMEDIS | Google Research | | May 2022* | SimCLR/BiT | 29K | 50M | 11K cases | 4096 | 1.2M | ResNet-50 | | 2048 | 224 | TCGA | |
| HIPT | Mahmood Lab | | Jun 2022* | DINOv1 | 11K | 100M | | 256 | 400K | ViT-S | | 384 | 256 | TCGA | |
| Lunit-DINO | Lunit | | Dec 2022* | DINOv1 | 21K | | | | | ViT-S | | 384 | 224 | TCGA | |
| Lunit-{BT,MoCoV2,SwAV} | Lunit | | Dec 2022* | {BT,MoCoV2,SwAV} | 21K | | | | | ResNet-50 | | 2048 | 224 | TCGA | |
| Phikon | Owkin | | Jul 2023* | iBOT | 6.1K | 43M | 5.6K | 1440 | 155K | ViT-B | 86M | 768 | 224 | TCGA | |
| CONCH | Mahmood Lab | | Jul 2023* | iBOT & vision-language pretraining | 21K | 16M | | 1024 | 80 epochs | ViT-B | 86M | 768 | 224 | proprietary | |
| **UNI** | Mahmood Lab | | Aug 2023* | DINOv2 | 100K | 100M | | | | ViT-L | | 1024 | 224 | proprietary (Mass-100K) | |
| **Virchow** | Paige / Microsoft | | Sep 2023* | DINOv2 | 1.5M | | 120K | | | ViT-H | 632M | 2560 | 224 | proprietary (from MSKCC) | |
| **Campanella et al. (MAE)** | Thomas Fuchs Lab | | Oct 2023* | MAE | 420K | 3.3B | 77K | 1080 | 1.3K INE | ViT-L | 303M | | 224 | proprietary (MSHS) | |
| **Campanella et al. (DINO)** | Thomas Fuchs Lab | | Oct 2023* | DINOv1 | 420K | 3.3B | 77K | 1440 | 2.5K INE | ViT-L | 303M | | 224 | proprietary (MSHS) | |
| Path Foundation | Google | | Oct 2023* | SimCLR, MSN | 6K | 60M | | 1024 | | ViT-S | | 384 | 224 | TCGA | |
| PathoDuet | Shanghai Jiao Tong University | | Dec 2023* | inspired by MoCoV3 | 11K | 13M | | 2048 | 100 epochs | ViT-B | | 4096 | 224 | TCGA | |
| **RudolfV** | Aignostics | | Jan 2024* | DINOv2 | 100K | 750M | 36K | | | ViT-L | | | 224 | proprietary (from EU & US), TCGA | |
| kaiko | kaiko.ai | | Mar 2024* | DINOv2 | 29K | 260M\*\* | | 512 | 200 INE | ViT-L | | 1024 | 224 | TCGA | |
| **PLUTO** | PathAI | | May 2024* | DINOv2 (+ MAE and Fourier loss) | 160K | 200M | | | | FlexiViT-S | 22M | | 224 | proprietary (PathAI) | |
| BEPH | Shanghai Jiao Tong University | | May 2024* | BEiTv2 | 12K | 12M | | 1024 | | ViT-B | 193M | 1024 | 224 | TCGA | |
| **Prov-GigaPath** | Microsoft / Providence | | May 2024* | DINOv2 | 170K | 1.4B | 30K | 384 | | ViT | | 1536 | 224 | proprietary (Providence) | |
| **Hibou-B** | HistAI | | Jun 2024* | DINOv2 | 1.1M | 510M | 310K cases | 1024 | 500K | ViT-B | 86M | 768 | 224 | proprietary | |
| **Hibou-L** | HistAI | | Jun 2024* | DINOv2 | 1.1M | 1.2B | 310K cases | 1024 | 1.2M | ViT-L | 304M | 1024 | 224 | proprietary | |
| **H-optimus-0** | Bioptimus | | Jul 2024* | DINOv2/iBOT | 500K (across 4,000 clinics) | >100M | | | 200K | ViT-G with 4 registers | 1.1B | 1536 | 224 | proprietary | |
| mSTAR | Smart Lab | | Jul 2024* | mSTAR (multimodal) | 10K | | 10K | | | ViT-L | | | 224 | TCGA | |
| **Virchow 2** | Paige / Microsoft | | Aug 2024* | DINOv2 (+ ECT and KDE) | 3.1M | 2B | 230K | 4096 | | ViT-H with 4 registers | 632M | 3584 | 224 | proprietary (from MSKCC and international sites) | |
| **Virchow 2G** | Paige / Microsoft | | Aug 2024* | DINOv2 (+ ECT and KDE) | 3.1M | 2B | 230K | 3072 | | ViT-G with 8 registers | 1.9B | 3584 | 224 | proprietary (from MSKCC and international sites) | |
| Phikon-v2 | Owkin | | Sep 2024* | DINOv2 | 58.4K | 456M | | 4096 | 250K | ViT-L | 307M | 1024 | 224 | PANCAN-XL (TCGA, CPTAC, GTEx, proprietary) | |

Notes:

- Models trained on >100K slides may be considered foundation models and are marked in **bold**.
- Numbers of WSIs, tiles, and patients are reported to 2 significant figures.
- INE = ImageNet epochs.
- Rows are ordered chronologically.
- Some of these feature extractors have been evaluated in a benchmarking study for whole-slide classification.
- \*\* means the value was inferred from other numbers provided in the paper.
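Since some papers report training budgets in optimiser steps and others in ImageNet epochs, the conversion is simply iterations × batch size / |ImageNet-1k|. A minimal sketch (function names are my own; it assumes the standard ImageNet-1k training-set size of 1,281,167 images):

```python
# Convert between optimiser iterations and "ImageNet epochs" (INE),
# using ImageNet-1k's 1,281,167 training images as the reference size.
IMAGENET_SIZE = 1_281_167

def iterations_to_ine(iterations: int, batch_size: int) -> float:
    """ImageNet-equivalent epochs seen during training."""
    return iterations * batch_size / IMAGENET_SIZE

def ine_to_iterations(ine: float, batch_size: int) -> int:
    """Approximate iteration count for a given INE budget."""
    return round(ine * IMAGENET_SIZE / batch_size)

# e.g. kaiko's reported 200 INE at batch size 512 corresponds to
# roughly 500K optimiser steps:
print(ine_to_iterations(200, 512))  # ~500K
```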

## Slide-level / patient-level models

This table includes models that produce slide-level or patient-level embeddings without supervision.

| Name | Group | Weights | Released | SSL | WSIs | Patients | Batch size | Iterations | Architecture | Parameters | Embed dim | Patch size | Dataset | Links |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GigaSSL | CBIO | | Dec 2022* | SimCLR | 12K | | | 1K epochs | ResNet-18 | | 256 | 256 | TCGA | |
| **PRISM** | Paige / Microsoft | | May 2024* | contrastive (with language) | 590K (190K text reports) | 190K | 64 (x4) | 75K (10 epochs) | Perceiver + BioGPT | | 1280 | 224 | proprietary | |
| **Prov-GigaPath** | Microsoft / Providence | | May 2024* | DINOv2 | 170K | 30K | | | LongNet | 86M | 1536 | 224 | proprietary (Providence) | |
| MADELEINE | Mahmood Lab | | Aug 2024* | contrastive (InfoNCE & OT) | 16K | 2K | 120 | 90 epochs | multi-head attention MIL | | 512 | 256 | ACROBAT, BWH Kidney (proprietary) | |
| CHIEF | Yu Lab | | Sep 2024* | | | | | | | | | | | |
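Models in this table aggregate a bag of patch embeddings into a single slide-level vector; MADELEINE, for instance, uses multi-head attention MIL. As a rough illustration of the general idea only (not any specific model's implementation), here is a dependency-free single-head attention-MIL pooling sketch; `attention_pool`, `V`, and `w` are hypothetical names:

```python
import math
import random

def attention_pool(patch_embs, V, w):
    """Attention-based MIL pooling: score each patch embedding with a
    small gated network, softmax-normalise the scores over the bag, and
    return the attention-weighted sum as the slide embedding."""
    def score(h):
        # hidden = tanh(V @ h); score = w . hidden
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        return sum(wk * u for wk, u in zip(w, hidden))

    scores = [score(h) for h in patch_embs]
    m = max(scores)                          # stabilise the softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]         # attention weights sum to 1
    dim = len(patch_embs[0])
    return [sum(a * h[d] for a, h in zip(attn, patch_embs)) for d in range(dim)]

# Toy example: 3 patches with 4-dim embeddings, hidden size 2.
random.seed(0)
patches = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
V = [[random.gauss(0, 1) for _ in range(4)] for _ in range(2)]
w = [random.gauss(0, 1) for _ in range(2)]
slide_emb = attention_pool(patches, V, w)
print(len(slide_emb))  # 4: slide embedding keeps the patch embedding dim
```

In practice the scoring network is learned jointly with a downstream objective; this sketch only shows the pooling mechanics.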