SAMora: Enhancing SAM through Hierarchical Self-Supervised
Pre-Training for Medical Images
Shuhang Chen, Hangjie Yuan, Yunqiu Xu, Pengwei Liu, Tao Feng, Zeying Huang, and Yi Yang
ICCV 2025
paper | code
- Overview
- Features
- Project Structure
- Installation
- Data Preparation
- Checkpoints
- Configs
- Training
- Evaluation
- Acknowledgements
- Citation
When adapting SAM to medical image segmentation, two major challenges typically arise:
- A significant domain gap between medical images and natural images
- Limited annotations but abundant unlabeled data
The core idea of SAMora is to first learn LoRA experts from three hierarchical levels—image-level, patch-level, and pixel-level—using large-scale unlabeled medical images, then perform hierarchical fusion within encoder blocks through Hierarchical LoRA Attention (HL-Attn), and finally complete efficient finetuning with limited labeled data.
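The fusion step can be sketched as a small attention-weighted combination of the three expert outputs inside one encoder block. The sketch below is illustrative only: the function names, the per-expert scalar scores, and the fusion rule are assumptions, not the repo's actual HL-Attn implementation in segment_anything/modeling/hl_attn.py.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a small list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hl_attn_fuse(expert_outputs, scores):
    """Fuse per-level LoRA expert outputs with attention weights.

    expert_outputs: equal-length feature vectors, one per expert
    (image-level, patch-level, pixel-level).
    scores: one relevance score per expert (learned in the real model).
    Returns the attention-weighted sum of the expert outputs.
    """
    weights = softmax(scores)
    fused = [0.0] * len(expert_outputs[0])
    for w, out in zip(weights, expert_outputs):
        for i, v in enumerate(out):
            fused[i] += w * v
    return fused

# Hypothetical expert outputs for one encoder block.
image_out = [1.0, 0.0]
patch_out = [0.0, 1.0]
pixel_out = [1.0, 1.0]
# Equal scores -> uniform weights, so fused is the mean of the experts.
fused = hl_attn_fuse([image_out, patch_out, pixel_out], scores=[0.0, 0.0, 0.0])
```

With equal scores each expert gets weight 1/3; in the real model the scores differ per block, letting each block emphasize the hierarchy level most useful to it.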
- Supports three-level experts:
- image-level
- patch-level
- pixel-level
- Supports block-level HL-Attn fusion
- Supports two-stage training
- Supports YAML-based configuration
- Supports a unified interface for single-decoder and dual-decoder models
- Compatible with original SAM checkpoints
- Supports exporting stage-1 expert checkpoints and stage-2 finetuned checkpoints
```
.
├── train.py
├── trainer.py
├── test.py
├── utils.py
├── samora_lora_sam.py
├── samora_lora_hsam.py
├── configs/
│   ├── stage1_image.yaml
│   ├── stage1_patch.yaml
│   ├── stage1_pixel.yaml
│   ├── stage2_samora.yaml
│   ├── stage2_hsamora.yaml
│   ├── test_samora.yaml
│   └── test_hsamora.yaml
├── datasets/
│   ├── dataset_synapse.py
│   └── dataset_unlabeled.py
├── ssl/
│   ├── losses.py
│   ├── projector.py
│   ├── denoise_decoder.py
│   ├── teacher_simclr.py
│   └── teacher_mae.py
└── segment_anything/
    ├── build_sam.py
    └── modeling/
        ├── __init__.py
        ├── sam.py
        ├── image_encoder.py
        └── hl_attn.py
```
```
git clone <your-repo-url>
cd <your-repo-name>
```

Python 3.9+ with CUDA is recommended.

```
conda create -n samora python=3.10 -y
conda activate samora
pip install torch torchvision
pip install tensorboardX numpy pillow h5py pyyaml tqdm
pip install -r requirements.txt
```

This project assumes two types of data: unlabeled data for stage-1 pre-training and labeled data for stage-2 finetuning.
The unlabeled data is used to pretrain the three experts:
- image-level
- patch-level
- pixel-level
You can specify unlabeled data directories in the config file via:

```yaml
unlabeled_roots:
  - /path/to/unlabeled/amos22
  - /path/to/unlabeled/lits
  - /path/to/unlabeled/kits
```

datasets/dataset_unlabeled.py currently supports the following formats:
`.npy`, `.npz`, `.h5`, `.hdf5`, `.png`, `.jpg`, `.jpeg`, `.tif`, `.tiff`
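The extension handling can be pictured as a small dispatch table. The grouping of extensions into reader kinds below is an assumption for illustration, not code taken from datasets/dataset_unlabeled.py.

```python
from pathlib import Path

# Map each supported extension to the kind of reader it needs.
# The grouping is an illustrative assumption, not the repo's code.
_READERS = {
    ".npy": "numpy", ".npz": "numpy",
    ".h5": "hdf5", ".hdf5": "hdf5",
    ".png": "image", ".jpg": "image", ".jpeg": "image",
    ".tif": "image", ".tiff": "image",
}

def reader_for(path):
    """Return which reader handles `path`; raise for unknown formats."""
    suffix = Path(path).suffix.lower()
    try:
        return _READERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported unlabeled-data format: {suffix!r}")
```

For example, `reader_for("/data/amos22/case_0001.npz")` routes to the NumPy reader, while an unknown extension fails loudly rather than being silently skipped.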
The current default setting follows Synapse:
- train: `../data/Synapse/train_npz`
- test: `../data/Synapse/test_vol_h5`
If you use other datasets, you need to extend the dataset loader and configs accordingly.
Please prepare a base SAM checkpoint first, for example:

```
checkpoints/sam_vit_b_01ec64.pth
```
We recommend saving the stage-1 expert checkpoints as:

```
checkpoints/stage1/
├── image_expert_epoch_29.pth
├── patch_expert_epoch_29.pth
└── pixel_expert_epoch_29.pth
```
Stage-2 finetuned checkpoints can be organized as, for example:

```
checkpoints/
├── stage2_samora/
│   └── epoch_19.pth
└── stage2_hsamora/
    └── epoch_29.pth
```
We recommend placing all training and evaluation settings under configs/.
Training configs:
- configs/stage1_image.yaml
- configs/stage1_patch.yaml
- configs/stage1_pixel.yaml
- configs/stage2_samora.yaml
- configs/stage2_hsamora.yaml

Evaluation configs:
- configs/test_samora.yaml
- configs/test_hsamora.yaml
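A stage-2 config might look like the following sketch. Only `variant`, `module`, and `vit_name` are entries mentioned in this README; every other key (`ckpt`, `root_path`, `max_epochs`, `batch_size`, `base_lr`) is an assumed placeholder, not a verbatim copy of configs/stage2_samora.yaml.

```yaml
# Hypothetical sketch of a stage-2 config; fields beyond
# variant/module/vit_name are illustrative assumptions.
variant: samora
module: samora_lora_sam
vit_name: samora_vit_b
ckpt: checkpoints/sam_vit_b_01ec64.pth
root_path: ../data/Synapse/train_npz
max_epochs: 20
batch_size: 8
base_lr: 0.0001
```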
Stage 1 (expert pre-training):

```
python train.py --config configs/stage1_image.yaml
python train.py --config configs/stage1_patch.yaml
python train.py --config configs/stage1_pixel.yaml
```

Stage 2 (SAMora finetuning):

```
python train.py --config configs/stage2_samora.yaml
```

Key config entries:

```yaml
variant: samora
module: samora_lora_sam
vit_name: samora_vit_b
```

Stage 2 (H-SAMora finetuning):

```
python train.py --config configs/stage2_hsamora.yaml
```

Key config entries:

```yaml
variant: hsamora
module: samora_lora_hsam
vit_name: hsamora_vit_b
```
```
python test.py --config configs/test_samora.yaml
python test.py --config configs/test_hsamora.yaml
```

This project is implemented with reference to or built upon the following works:
- Segment Anything Model (SAM)
- SAMed
- H-SAM
- SAMora
We also thank the open-source community for the released codebases and implementation ideas.
```
@inproceedings{samora,
  title={SAMora: Enhancing SAM through Hierarchical Self-Supervised Pre-Training for Medical Images},
  author={Chen, Shuhang and Yuan, Hangjie and Liu, Pengwei and Gu, Hanxue and Feng, Tao and Ni, Dong},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={21209--21219},
  year={2025}
}
```