SwinMM: Masked Multi-view with Swin Transformers for 3D Medical Image Segmentation

Update

08/13/2023:

News:

SwinMM has been included in MONAI's research contributions.

What is SwinMM?

Masked Multi-view with Swin Transformers, dubbed SwinMM, is the first comprehensive multi-view pipeline for self-supervised medical image analysis. SwinMM yields competitive performance, significantly lower training costs, and higher data efficiency compared to recent state-of-the-art models. SwinMM consists of two key components.

Pretrain

In the pre-training stage, we introduce a masked multi-view encoder that is trained on masked multi-view observations with a diverse set of proxy tasks at the same time. These tasks include image reconstruction, rotation prediction, contrastive learning, and a mutual learning paradigm that leverages the hidden multi-view information in 3D medical data by maximizing the consistency between predictions from different views.
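The mutual learning idea above, maximizing consistency between predictions from different views, can be illustrated with a small sketch. The README does not state which consistency measure is used, so the symmetric KL divergence below is an assumption, and the function names (`softmax`, `kl_divergence`, `consistency_loss`) are hypothetical. It uses plain Python lists in place of tensors and assumes the two views have already been aligned to the same orientation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(logits_view_a, logits_view_b):
    """Mean symmetric KL between per-voxel class distributions of two views.

    Each argument is a list of per-voxel logit lists. Symmetric KL is an
    assumed choice here, not necessarily the loss used by SwinMM.
    """
    total = 0.0
    for la, lb in zip(logits_view_a, logits_view_b):
        pa, pb = softmax(la), softmax(lb)
        total += 0.5 * (kl_divergence(pa, pb) + kl_divergence(pb, pa))
    return total / len(logits_view_a)
```

Minimizing this quantity pushes the two views toward the same prediction: it is zero when both views agree exactly and grows as they diverge.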

Finetune

In the fine-tuning stage, a cross-view decoder is developed to aggregate the multi-view information using a novel cross-view attention block.
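To give a rough sense of how attention can pass information between views, here is a generic scaled dot-product cross-attention sketch in which the queries come from one view and the keys and values come from another. This is not the cross-view attention block from the repository: it omits learned projections, multiple heads, and normalization, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_view_attention(queries, keys, values):
    """Attend from tokens of one view to tokens of another view.

    queries: list of d-dimensional token vectors from view A.
    keys, values: equal-length lists of token vectors from view B.
    Returns one output vector per query token.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this view-A token to every view-B token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted average of the view-B value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_v)]
        outputs.append(out)
    return outputs
```

Each output token is a weighted mix of the other view's features, which is one simple way a decoder could aggregate multi-view information.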

Pre-trained Models

We present two checkpoints here:

Here is the sample testing result on WORD

Installation

Please check INSTALL.md for installation instructions.

Evaluation

Testing can be done with the following script. Set pretrained_dir and pretrained_model_name to the directory and file name of the checkpoint you want to test, and set data_dir and json_list to match your dataset.

cd WORD
python test_parrallel.py --pretrained_dir ./runs/multiview_101616/ \
	--pretrained_model_name model.pt \
	--distributed \
	--data_dir ./dataset/dataset12_WORD/ \
	--json_list dataset12_WORD.json
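If you test several checkpoints or datasets, it may help to assemble the command programmatically. The helper below is a hypothetical convenience, not part of the repository. It only uses the script name and flags shown in the command above and returns an argument list that could be passed to `subprocess.run` from inside the WORD directory.

```python
def build_test_command(pretrained_dir, pretrained_model_name, data_dir,
                       json_list, distributed=True):
    """Build the argument list for the evaluation script.

    Hypothetical helper: it mirrors the flags shown in this README and
    does not check that the paths exist.
    """
    cmd = [
        "python", "test_parrallel.py",
        "--pretrained_dir", str(pretrained_dir),
        "--pretrained_model_name", pretrained_model_name,
    ]
    if distributed:
        cmd.append("--distributed")
    cmd += ["--data_dir", str(data_dir), "--json_list", json_list]
    return cmd
```

For example, `build_test_command("./runs/multiview_101616/", "model.pt", "./dataset/dataset12_WORD/", "dataset12_WORD.json")` reproduces the arguments of the command above.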

Training

Please check TRAINING.md for training instructions.

Acknowledgment

This work is partially supported by the Google Cloud Research Credits program. This repo is based on SwinUNETR, MONAI, and bagua.

Citation

If you find this repository helpful, please consider citing:

@inproceedings{wang2023SwinMM,
  title     = {SwinMM: Masked Multi-view with Swin Transformers for 3D Medical Image Segmentation},
  author    = {Wang, Yiqing and Li, Zihan and Mei, Jieru and Wei, Zihao and Liu, Li and Wang, Chen and Sang, Shengtian and Yuille, Alan and Xie, Cihang and Zhou, Yuyin},
  booktitle = {MICCAI},
  year      = {2023}
}
