The official code for "Efficient Multi-Stage Self-Supervised Learning for Pathology Image Analysis via Masking".
conda create -n RHP
conda activate RHP
pip install -r requirements.txt
All data used by the code is stored in LMDB format to avoid file fragmentation. Given a folder_path containing images, convert it to an LMDB file at lmdb_path as follows:
from util.dataset import folder2lmdb
folder2lmdb(folder_path, lmdb_path)
imagenet_pretrained_ckpt is the ImageNet pre-trained model, which can be obtained using timm.
python train.py --db_path lmdb_path --ckpt imagenet_pretrained_ckpt --mask_ratio 0.75 --bs your_batch_size
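stage1_pretrained_ckpt is the checkpoint obtained from stage 1 training. Stage 2 disables masking (--mask_ratio 0) and uses a quarter of the stage 1 batch size, so replace int(your_batch_size*0.25) with the computed integer.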
python train.py --db_path lmdb_path --ckpt stage1_pretrained_ckpt --mask_ratio 0 --bs int(your_batch_size*0.25)
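Because int(your_batch_size*0.25) is a Python expression rather than a value the shell can evaluate, it may help to generate both stage commands from one place. The following is a minimal sketch, not part of the repository: the helper name build_stage_commands and the example paths are hypothetical, and only the flags shown in the commands above (--db_path, --ckpt, --mask_ratio, --bs) are used.

```python
import shlex


def build_stage_commands(db_path, imagenet_ckpt, stage1_ckpt, batch_size):
    """Return the stage 1 and stage 2 training commands as shell strings.

    Hypothetical helper: it only assembles the command lines shown in
    this README and does not call train.py itself.
    """
    # Stage 2 uses a quarter of the stage 1 batch size, rounded down.
    stage2_batch_size = int(batch_size * 0.25)
    if stage2_batch_size < 1:
        raise ValueError("batch_size must be at least 4 so that stage 2 gets a batch size >= 1")

    # Stage 1: start from the ImageNet checkpoint and mask 75% of the input.
    stage1 = [
        "python", "train.py",
        "--db_path", db_path,
        "--ckpt", imagenet_ckpt,
        "--mask_ratio", "0.75",
        "--bs", str(batch_size),
    ]
    # Stage 2: start from the stage 1 checkpoint with masking disabled.
    stage2 = [
        "python", "train.py",
        "--db_path", db_path,
        "--ckpt", stage1_ckpt,
        "--mask_ratio", "0",
        "--bs", str(stage2_batch_size),
    ]
    # shlex.join quotes arguments that contain spaces or shell metacharacters.
    return shlex.join(stage1), shlex.join(stage2)


# Example with placeholder paths (assumptions, not files shipped with the repo).
stage1_cmd, stage2_cmd = build_stage_commands(
    "data.lmdb", "imagenet.pth", "stage1.pth", batch_size=256
)
print(stage1_cmd)
print(stage2_cmd)
```

The printed strings can be pasted into a shell or passed to a job scheduler. Keeping the batch-size arithmetic in one function avoids the two stages drifting apart when the stage 1 batch size changes.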
| Dataset | Download Link |
|---|---|
| TCGA | Link |
| CPTAC | Link |
| PatchCamelyon | Link |
| BreakHis | Link |
| ColorectalHistologyMNIST | Link |
| NCT-CRC-HE | Link |
| CRC-TP | Link |
| MoNuSeg | Link |
| GlaS | Link |
| Camelyon16 | Link |
This repository builds on the BYOL repository and the MAE repository.
