prithuls/MV-Swin-T

MV-Swin-T

We present MV-Swin-T, a novel transformer-based multi-view network built on the Swin Transformer architecture for mammographic image classification and designed to exploit information shared across views. Our contributions include:

  • A novel multi-view network based entirely on the transformer architecture, capitalizing on the benefits of transformer operations for enhanced performance.
  • A novel "Multi-headed Dynamic Attention Block (MDA)" with fixed and shifted window features that enables self-view and cross-view information fusion from the CC and MLO views of the same breast.
  • An approach to the challenge of effectively combining data from multiple views or modalities, especially when the images are not aligned.
  • Results on the publicly available CBIS-DDSM and VinDr-Mammo datasets.
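
To make the idea of window-based self and cross-view fusion more concrete, here is a minimal NumPy sketch. It is not the MDA block from this repository or the paper: it only illustrates the general technique of scaled dot-product attention restricted to local windows, where queries come from one view and keys/values come from either the same view (self-attention) or the other view (cross-view attention). The function names, the absence of learned projections and multiple heads, and the omission of the attention mask that Swin uses for shifted windows are all simplifying assumptions.

```python
import numpy as np

def window_partition(x, window):
    """Split an (H, W, C) feature map into (num_windows, window*window, C) token groups."""
    H, W, C = x.shape
    x = x.reshape(H // window, window, W // window, window, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window * window, C)

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def window_attention(query_view, context_view, window, shift=0):
    """Attend from query_view tokens to context_view tokens inside each window.

    Passing the same array twice gives self-attention; passing two different
    views (e.g. CC and MLO features) gives cross-view attention.
    Hypothetical helper: no learned Q/K/V projections, single head, no mask.
    """
    if shift:
        # Cyclic shift so that window boundaries fall in different places.
        query_view = np.roll(query_view, (-shift, -shift), axis=(0, 1))
        context_view = np.roll(context_view, (-shift, -shift), axis=(0, 1))
    q = window_partition(query_view, window)
    kv = window_partition(context_view, window)
    scores = q @ kv.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    # Output stays in the windowed layout: (num_windows, tokens_per_window, C).
    return weights @ kv, weights

# Toy feature maps standing in for the two views (random values, not real data).
rng = np.random.default_rng(0)
cc = rng.standard_normal((8, 8, 16))
mlo = rng.standard_normal((8, 8, 16))

self_out, _ = window_attention(cc, cc, window=4)                  # self-attention
cross_out, weights = window_attention(cc, mlo, window=4)          # fixed windows
shifted_out, _ = window_attention(cc, mlo, window=4, shift=2)     # shifted windows
print(cross_out.shape)
```

Because attention is computed per window, each CC token only looks at MLO tokens in the corresponding window; alternating fixed and shifted windows lets information cross window boundaries over successive layers.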

Results

image

image

Figures

cross_attention

window_attention

Citation

If you find this work useful, please consider citing our paper:

@article{sarker2024mv,
  title={MV-Swin-T: Mammogram Classification with Multi-view Swin Transformer},
  author={Sarker, Sushmita and Sarker, Prithul and Bebis, George and Tavakkoli, Alireza},
  journal={arXiv preprint arXiv:2402.16298},
  year={2024}
}