On-Edge Deployment of Vision Transformers for Medical Diagnostics Using the Kvasir-Capsule Dataset

Dara Varam, Lujain Khalil, and Dr. Tamer Shanableh

This repository contains the code needed to build an Android application for on-device model deployment.

This is the official Flutter implementation of our paper, "On-Edge Deployment of Vision Transformers for Medical Diagnostics Using the Kvasir-Capsule Dataset", published in MDPI Applied Sciences on 10 September 2024.

Details of our Keras implementation can be accessed here: https://github.com/DaraVaram/Lightweight-ViTs-for-Medical-Diagnostics.

User Interface

Home screen and Drawer

Classification scenarios at image uploads

(a) An image is uploaded and classified correctly by the model. (b) An image is uploaded and classified incorrectly by the model. The user can choose the correct class from a drop-down menu.

Report screen

BibTeX Citation

MDPI paper citation

@Article{app14188115,
AUTHOR = {Varam, Dara and Khalil, Lujain and Shanableh, Tamer},
TITLE = {On-Edge Deployment of Vision Transformers for Medical Diagnostics Using the Kvasir-Capsule Dataset},
JOURNAL = {Applied Sciences},
VOLUME = {14},
YEAR = {2024},
NUMBER = {18},
ARTICLE-NUMBER = {8115},
URL = {https://www.mdpi.com/2076-3417/14/18/8115},
ISSN = {2076-3417},
ABSTRACT = {This paper aims to explore the possibility of utilizing vision transformers (ViTs) for on-edge medical diagnostics by experimenting with the Kvasir-Capsule image classification dataset, a large-scale image dataset of gastrointestinal diseases. Quantization techniques made available through TensorFlow Lite (TFLite), including post-training float-16 (F16) quantization and quantization-aware training (QAT), are applied to achieve reductions in model size, without compromising performance. The seven ViT models selected for this study are EfficientFormerV2S2, EfficientViT_B0, EfficientViT_M4, MobileViT_V2_050, MobileViT_V2_100, MobileViT_V2_175, and RepViT_M11. Three metrics are considered when analyzing a model: (i) F1-score, (ii) model size, and (iii) performance-to-size ratio, where performance is the F1-score and size is the model size in megabytes (MB). In terms of F1-score, we show that MobileViT_V2_175 with F16 quantization outperforms all other models with an F1-score of 0.9534. On the other hand, MobileViT_V2_050 trained using QAT was scaled down to a model size of 1.70 MB, making it the smallest model amongst the variations this paper examined. MobileViT_V2_050 also achieved the highest performance-to-size ratio of 41.25. Despite preferring smaller models for latency and memory concerns, medical diagnostics cannot afford poor-performing models. We conclude that MobileViT_V2_175 with F16 quantization is our best-performing model, with a small size of 27.47 MB, providing a benchmark for lightweight models on the Kvasir-Capsule dataset.},
DOI = {10.3390/app14188115}
}
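
The performance-to-size ratio described in the abstract above can be illustrated with a short calculation. This is a minimal sketch, not the paper's evaluation code: the function name `performance_to_size` is hypothetical, and it assumes the F1-score is expressed in percent, which is inferred from the reported ratio of 41.25 (a fractional F1-score divided by a size of at least 1.70 MB could not exceed 1). The paper's exact definition may differ.

```python
def performance_to_size(f1_percent: float, size_mb: float) -> float:
    """Return the F1-score (assumed to be in percent) divided by model size in MB."""
    if size_mb <= 0:
        raise ValueError("size_mb must be positive")
    return f1_percent / size_mb


# Figures quoted in the abstract for MobileViT_V2_175 with F16 quantization:
# F1-score 0.9534 (95.34 on the assumed percent scale), model size 27.47 MB.
ratio = performance_to_size(f1_percent=95.34, size_mb=27.47)
print(f"performance-to-size ratio: {ratio:.2f}")
```

Under this definition a small model with a moderate F1-score can score far higher than a large, accurate one, which is why the abstract reports the ratio alongside the raw F1-score rather than in place of it.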

The full Kvasir-Capsule dataset is available here: https://datasets.simula.no/kvasir-capsule/.

@article{Smedsrud2021,
  title = {{Kvasir-Capsule, a video capsule endoscopy dataset}},
  author = {
    Smedsrud, Pia H and Thambawita, Vajira and Hicks, Steven A and
    Gjestang, Henrik and Nedrejord, Oda Olsen and N{\ae}ss, Espen and
    Borgli, Hanna and Jha, Debesh and Berstad, Tor Jan Derek and
    Eskeland, Sigrun L and Lux, Mathias and Espeland, H{\aa}vard and
    Petlund, Andreas and Nguyen, Duc Tien Dang and Garcia-Ceja, Enrique and
    Johansen, Dag and Schmidt, Peter T and Toth, Ervin and
    Hammer, Hugo L and de Lange, Thomas and Riegler, Michael A and
    Halvorsen, P{\aa}l
  },
  doi = {10.1038/s41597-021-00920-z},
  journal = {Scientific Data},
  number = {1},
  pages = {142},
  volume = {8},
  year = {2021}
}

