
POPAR: Patch Order Prediction and Appearance Recovery for Self-supervised Medical Image Analysis

This repository provides a PyTorch implementation of POPAR: Patch Order Prediction and Appearance Recovery for Self-supervised Medical Image Analysis.

We propose POPAR (Patch Order Prediction and Appearance Recovery), a novel vision-transformer-based self-supervised learning framework for chest X-ray images. POPAR leverages the benefits of vision transformers and the unique properties of medical imaging to simultaneously learn patch-wise high-level contextual features, by correcting shuffled patch orders, and fine-grained features, by recovering patch appearance.
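To make the two pretext tasks concrete, below is a minimal PyTorch sketch of the idea, not the repository's actual implementation: the encoder interface, the noise-based appearance perturbation, and all module and parameter names (`POPARSketch`, `order_head`, `appearance_head`, etc.) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class POPARSketch(nn.Module):
    """Illustrative sketch of the two POPAR objectives (assumptions, not the repo code)."""

    def __init__(self, encoder, num_patches=196, embed_dim=768, patch_dim=16 * 16 * 3):
        super().__init__()
        self.encoder = encoder                                  # assumed: maps (B, N, patch_dim) -> (B, N, embed_dim)
        self.order_head = nn.Linear(embed_dim, num_patches)     # patch order prediction (position classification)
        self.appearance_head = nn.Linear(embed_dim, patch_dim)  # patch appearance recovery (pixel regression)

    def forward(self, patches):
        # patches: (B, N, patch_dim) -- the image already split into N flattened patches
        B, N, D = patches.shape

        # Shuffle the patch order and lightly distort appearance (a stand-in perturbation).
        perm = torch.stack([torch.randperm(N) for _ in range(B)]).to(patches.device)
        shuffled = torch.gather(patches, 1, perm.unsqueeze(-1).expand(B, N, D))
        distorted = shuffled + 0.1 * torch.randn_like(shuffled)

        tokens = self.encoder(distorted)                        # one token per input patch

        # Task 1: classify each token's original position in the image grid.
        order_loss = F.cross_entropy(
            self.order_head(tokens).reshape(B * N, N), perm.reshape(B * N))

        # Task 2: regress each input patch's clean (undistorted) appearance.
        appearance_loss = F.mse_loss(self.appearance_head(tokens), shuffled)

        return order_loss + appearance_loss


# Tiny stand-in encoder for demonstration only (a real ViT/Swin backbone would be used instead).
encoder = nn.Sequential(
    nn.Linear(16 * 16 * 3, 768),
    nn.TransformerEncoder(
        nn.TransformerEncoderLayer(768, nhead=8, batch_first=True), num_layers=2),
)
model = POPARSketch(encoder)
loss = model(torch.randn(2, 196, 16 * 16 * 3))
```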

Publication

POPAR: Patch Order Prediction and Appearance Recovery for Self-supervised Medical Image Analysis
Jiaxuan Pang1, Fatemeh Haghighi1, DongAo Ma1, Nahid Ul Islam1, Mohammad Reza Hosseinzadeh Taher1, Michael B. Gotway2, Jianming Liang1
1 Arizona State University, 2 Mayo Clinic
Published in: Domain Adaptation and Representation Transfer (DART), 2022.

Paper | Supplementary material | Code | [Poster] | [Slides] | Presentation ([YouTube])

Major results from our work

  1. POPAR consistently outperforms all publicly available state-of-the-art transformer-based self-supervised ImageNet-pretrained models.

  2. Our downgraded POPAR-1 and POPAR-3 outperform, or achieve on-par performance with, all publicly available state-of-the-art transformer-based self-supervised ImageNet-pretrained models on most target tasks.

  3. POPAR with a Swin-base backbone, even in its downgraded versions, yields significantly better or on-par performance compared with three self-supervised learning methods built on a ResNet-50 backbone in all target tasks.

  4. POPAR models outperform SimMIM in all target tasks across ViT-base and Swin-base backbones.

  5. POPAR models outperform fully supervised models pretrained on the ImageNet and ChestX-ray14 datasets across architectures.


Available implementation

Requirements

Models

Our pretrained ViT and Swin Transformer models can be downloaded as follows:

| Backbone | Input Resolution (Shuffled Patches) | AUC on ChestX-ray14 | AUC on CheXpert | AUC on ShenZhen | ACC on RSNA Pneumonia | Model |
|---|---|---|---|---|---|---|
| POPAR-3 ViT-B | 224x224 (196) | 79.58±0.13 | 87.86±0.17 | 93.87±0.63 | 73.17±0.46 | download |
| POPAR Swin-B | 448x448 (196) | 81.81±0.10 | 88.34±0.50 | 97.33±0.74 | 74.19±0.37 | download |
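
A hedged sketch of loading one of the released checkpoints for downstream fine-tuning is shown below. The checkpoint filename, the key names ("state_dict"/"model"), and the timm architecture name are assumptions; adjust them to the downloaded file and the backbone you use.

```python
import timm
import torch

# Hypothetical filename for the POPAR-3 ViT-B checkpoint from the table above.
ckpt = torch.load("popar_vit_b_224.pth", map_location="cpu")
# Unwrap a nested state dict if present (assumes the checkpoint is a dict).
state_dict = ckpt.get("state_dict", ckpt.get("model", ckpt))

# ViT-B/16 at 224x224 to match the POPAR-3 ViT-B row; num_classes=14 for ChestX-ray14.
model = timm.create_model("vit_base_patch16_224", pretrained=False, num_classes=14)
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("missing keys:", len(missing), "unexpected keys:", len(unexpected))
```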

Acknowledgement

This research has been supported in part by ASU and Mayo Clinic through a Seed Grant and an Innovation Grant, and in part by the NIH under Award Number R01HL128785. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. This work has utilized the GPUs provided in part by ASU Research Computing and in part by the Extreme Science and Engineering Discovery Environment (XSEDE) funded by the National Science Foundation (NSF) under grant numbers ACI-1548562, ACI-1928147, and ACI-2005632. The content of this paper is covered by patents pending.

License

Released under the ASU GitHub Project License.
